| author | Simon Marlow <simonmar@microsoft.com> | 2006-04-07 02:05:11 +0000 |
|---|---|---|
| committer | Simon Marlow <simonmar@microsoft.com> | 2006-04-07 02:05:11 +0000 |
| commit | 0065d5ab628975892cea1ec7303f968c3338cbe1 (patch) | |
| tree | 8e2afe0ab48ee33cf95009809d67c9649573ef92 | /compiler |
| parent | 28a464a75e14cece5db40f2765a29348273ff2d2 (diff) | |
| download | haskell-0065d5ab628975892cea1ec7303f968c3338cbe1.tar.gz | |
Reorganisation of the source tree
Most of the other users of the fptools build system have migrated to
Cabal, and with the move to darcs we can now flatten the source tree
without losing history, so here goes.
The main change is that the ghc/ subdir is gone, and most of what it
contained is now at the top level. The build system now makes no
pretense at being multi-project; it is just the GHC build system.
No doubt this will break many things, and there will be a period of
instability while we fix the dependencies. A straightforward build
should work, but I haven't yet fixed binary/source distributions.
Changes to the Building Guide will follow, too.
Diffstat (limited to 'compiler')
388 files changed, 164159 insertions, 0 deletions
diff --git a/compiler/DEPEND-NOTES b/compiler/DEPEND-NOTES new file mode 100644 index 0000000000..f2ba244315 --- /dev/null +++ b/compiler/DEPEND-NOTES @@ -0,0 +1,4 @@ +Module dependency information is now given in the GHC commentary + + ghc/docs/comm/genesis/modules.html + diff --git a/compiler/DLL-NOTES b/compiler/DLL-NOTES new file mode 100644 index 0000000000..c710b14251 --- /dev/null +++ b/compiler/DLL-NOTES @@ -0,0 +1,58 @@ + The DLL story + ------------- + +*** + +This file is intended to be a focal point for notes on how DLLs work. Please +add cross-references to source and other docs, especially when you don't +find something here that you need. + +*** + + +Introduction +------------ + +On Windows, DLLs are synonymous with packages (since 4.07; this change +simplified a rather horrible mess). Hence whenever a module is to be +compiled to go in a DLL, it must be compiled with -package-name dll-name. +Typically, failing to do this gives Windows error message boxes of the form +"The instruction at address <x> tried to read memory at address <x>". + + +Dependencies +------------ + +Because references in DLLs must be fully resolved when the DLL is compiled +(except for references to other DLLs), it is not possible for DLLs to call +the main program. This means that the parts of the RTS and standard package +which call the main program cannot be compiled into the relevant DLLs, and +must instead be compiled as standalone object files and linked in to each +executable. This gives the following picture of dependencies within a program: + + ___________ ___________ + | |------>| | GHC-land | Application-land +DLL-land | HSrts.dll | | HSstd.dll | | + |___________|<------|___________| | + | ^ | +-----------------|-------------------|-------------------| + _____v_____ _____|______ | +.o-land | | | | | + | Main.o | | PrelMain.o |----------------------- + |___________| |____________| | | + | | ______v______ + | | | | + ------------------------------------------>| Main.o | + | |_____________| + +(The application's dependencies are not shown.) + + +Bits of the compiler that deal with DLLs +---------------------------------------- + +basicTypes/Module.lhs is the most important place, as it deals with which +modules identifiers are in. + +basicTypes/name.lhs, other bits of basicTypes/, nativeGen/, codeGen/, +abcCSyn/, and even profiling/ have other references. diff --git a/compiler/HsVersions.h b/compiler/HsVersions.h new file mode 100644 index 0000000000..dd80922e0b --- /dev/null +++ b/compiler/HsVersions.h @@ -0,0 +1,108 @@ +#ifndef HSVERSIONS_H +#define HSVERSIONS_H + +#if 0 + +IMPORTANT! If you put extra tabs/spaces in these macro definitions, +you will screw up the layout where they are used in case expressions! + +(This is cpp-dependent, of course) + +#endif + +/* Useful in the headers that we share with the RTS */ +#define COMPILING_GHC 1 + +/* Pull in all the platform defines for this build (foo_TARGET_ARCH etc.) */ +#include "ghc_boot_platform.h" + +/* Pull in the autoconf defines (HAVE_FOO), but don't include + * ghcconfig.h, because that will include ghcplatform.h which has the + * wrong platform settings for the compiler (it has the platform + * settings for the target plat instead). */ +#include "../includes/ghcautoconf.h" + +#if __GLASGOW_HASKELL__ >= 504 + +#define CONCURRENT Control.Concurrent +#define EXCEPTION Control.Exception + /* If you want Control.Exception.try, get it as Panic.try, which + deals with the shift from 'tryAllIO' to 'try'. 
*/ +#define DYNAMIC Data.Dynamic +#define GLAEXTS GHC.Exts +#define DATA_BITS Data.Bits +#define DATA_INT Data.Int +#define DATA_WORD Data.Word +#define UNSAFE_IO System.IO.Unsafe +#define TRACE Debug.Trace +#define DATA_IOREF Data.IORef +#define FIX_IO System.IO +#define MONAD_ST Control.Monad.ST +#define ST_ARRAY Data.Array.ST + +#else + +#define CONCURRENT Concurrent +#define EXCEPTION Exception +#define DYNAMIC Dynamic +#define GLAEXTS GlaExts +#define DATA_BITS Bits +#define DATA_INT Int +#define DATA_WORD Word +#define UNSAFE_IO IOExts +#define TRACE IOExts +#define DATA_IOREF IOExts +#define FIX_IO IOExts +#define MONAD_ST ST +#define ST_ARRAY ST + +#endif + +#ifdef __GLASGOW_HASKELL__ +#define GLOBAL_VAR(name,value,ty) \ +name = Util.global (value) :: IORef (ty); \ +{-# NOINLINE name #-} +#endif + +#if __GLASGOW_HASKELL__ >= 620 +#define UNBOX_FIELD !! +#else +#define UNBOX_FIELD ! +#endif + +#define COMMA , + +#ifdef DEBUG +#define ASSERT(e) if (not (e)) then (assertPanic __FILE__ __LINE__) else +#define ASSERT2(e,msg) if (not (e)) then (assertPprPanic __FILE__ __LINE__ (msg)) else +#define WARN( e, msg ) (warnPprTrace (e) __FILE__ __LINE__ (msg)) +#define ASSERTM(mbool) do { bool <- mbool; ASSERT(bool) return () } +#define ASSERTM2(mbool,msg) do { bool <- mbool; ASSERT2(bool,msg) return () } +#else +#define ASSERT(e) if False then error "ASSERT" else +#define ASSERT2(e,msg) if False then error "ASSERT2" else +#define ASSERTM(e) +#define ASSERTM2(e,msg) +#define WARN(e,msg) if False then error "WARN" else +#endif + +-- This #ifndef lets us switch off the "import FastString" +-- when compiling FastString itself +#ifndef COMPILING_FAST_STRING +-- +import qualified FastString as FS +#endif + +#define SLIT(x) (FS.mkLitString# (x#)) +#define FSLIT(x) (FS.mkFastString# (x#)) + +-- Useful for declaring arguments to be strict +#define STRICT1(f) f a b c | a `seq` False = undefined +#define STRICT2(f) f a b | a `seq` b `seq` False = undefined +#define STRICT3(f) f a b c | a `seq` b `seq` c `seq` False = undefined +#define STRICT4(f) f a b c d | a `seq` b `seq` c `seq` d `seq` False = undefined +#define STRICT5(f) f a b c d e | a `seq` b `seq` c `seq` d `seq` e `seq` False = undefined +#define STRICT6(f) f a b c d e f | a `seq` b `seq` c `seq` d `seq` e `seq` f `seq` False = undefined + +#endif /* HsVersions.h */ + diff --git a/compiler/Makefile b/compiler/Makefile new file mode 100644 index 0000000000..a7ed0355ff --- /dev/null +++ b/compiler/Makefile @@ -0,0 +1,833 @@ +# ----------------------------------------------------------------------------- +# Main compiler Makefile + +# Targets: +# +# all builds stage1 compiler +# +# boot stage=N generate build dirs and dependencies for stage N. +# NB. Must be done before 'make stageN'. +# NB. Cannot 'make boot stage=2' until stage1 has +# been built (similarly for stage3). +# +# stage1 (or stage=1) builds stage1 compiler +# stage2 (or stage=2) builds stage2 compiler +# stage3 (or stage=3) builds stage3 compiler +# + +TOP = .. + +# Use GHC for compiling C bits (NB. 
must be before boilerplate include) +# +UseGhcForCc = YES + +include $(TOP)/mk/boilerplate.mk + +#----------------------------------------------------------------------------- +# Counting source code lines + +USER_SRCS = $(filter-out $(DERIVED_SRCS),$(SRCS)) +count : + ./count_lines $(USER_SRCS) + +#----------------------------------------------------------------------------- +# Building ghc different ways (default is just `normal' sequential) + +WAYS=$(GhcCompilerWays) + +# ----------------------------------------------------------------------------- +# Bootstrapping + +# The stage1/stage2/stage3 business is quite delicate. Here's how it works: +# +# - the variable $(stage) holds the current stage number. To build a +# particular stage, you say 'make stage=N' where N is 1, 2, or 3. +# N defaults to 1. +# +# - for stage N, object files and .hi files are placed inside +# the directory stageN, in subdirectories as per the sources. +# +# - .hi-boot files are *linked* into the stageN tree, because in GHC 5.05+ +# the .hi-boot file must reside in the same place as the .hi file. +# +# - we use explicit -o and -ohi options to direct the output from C & +# Haskell compilations. +# +# - we generate a different .depend file for each build. They need to be +# different, because each stage might include different files: stage1 +# might not include GHCi, for example. For each stage, a normal .depend +# file is generated, and then post-processed to add the correct stageN/ +# prefix to each object and .hi filename. The resulting .depend file +# is named .depend-$(stage). See the end of this Makefile for details. +# +# - normal implicit rules don't work any more, because they're of the form +# +# %.o : %.hs +# +# whereas we really need +# +# stageN/%.o : %.hs +# +# so suffix.mk now defines the appropriate suffix rules when +# $(odir) is set to a non-empty value. Here we set $(odir) to +# stage1, stage2, or stage3. +# +# There are other plausible designs that might work, but each has different +# problems: +# +# - using -odir and -hidir: GHC <= 4.08 doesn't support -hidir, and +# anyway -odir puts all the objects in one directory (strips off the +# subdirectory part), which eventually forces us to use VPATH to find +# the sources. I have a really bad feeling about VPATH. +# +# - invoke make in the stageN subdirectory. This probably requires VPATH +# too. +# +# - create a link tree. The problem with requiring link trees is that +# Windows doesn't support symbolic links. + +ifeq "$(stage)" "" +stage=1 +endif + +.DUMMY: stage_dir +stage_dirs : + $(MKDIRHIER) stage$(stage) + for i in $(ALL_DIRS); do \ + $(MKDIRHIER) stage$(stage)/$$i; \ + done + +ifeq "$(stage) $(ghc_ge_603)" "1 YES" +UsingHsBoot = YES +else +ifneq "$(findstring $(stage), 2 3)" "" +UsingHsBoot = YES +else +UsingHsBoot = NO +endif +endif + +boot :: stage_dirs +# On Windows, we can't use symbolic links for the -hi-boot files +# because GHC itself is a Mingw program and does not understand +# symbolic links. So we have to copy the files instead of link them. +# That means that if you modify a .hi-boot file in Windows, you +# have to to say 'make boot' again. +# +# PS: 'ln -s foo baz' takes 'foo' relative to the path to 'baz' +# whereas 'cp foo baz' treats the two paths independently. +# Hence the "../.." 
in the ln command line +ifeq "$(UsingHsBoot)" "NO" +ifeq "$(HOSTPLATFORM)" "i386-unknown-mingw32" + for i in */*hi-boot*; do \ + cp -u -f $$i stage$(stage)/$$i; \ + done +else + for i in */*hi-boot*; do \ + $(LN_S) -f ../../$$i stage$(stage)/$$i || true ; \ + done +endif +endif + +ifeq "$(stage)" "1" +HC=$(GHC) +endif + +ifeq "$(stage)" "2" +HC=$(GHC_STAGE1) +endif + +ifeq "$(stage)" "3" +HC=$(GHC_STAGE2) +endif + +stage1 :: + $(MAKE) stage=1 + +stage2 :: + $(MAKE) stage=2 + +stage3 :: + $(MAKE) stage=3 + +odir=stage$(stage) + +SRC_HC_OPTS += $(patsubst %, -i$(odir)/%, $(ALL_DIRS)) + +HS_OBJS = $(patsubst %, $(odir)/%, $(addsuffix .$(way_)o,$(basename $(HS_SRCS)))) +C_OBJS = $(patsubst %, $(odir)/%, $(addsuffix .$(way_)o,$(basename $(C_SRCS)))) + +# Our standard cleaning rules don't know that we're doing our output +# into $(odir), so we have to augment CLEAN_FILES appropriateliy. + +CLEAN_FILES += $(odir)/*/*.hi $(odir)/*/*.hi-boot $(odir)/*/*.o-boot + +ifeq "$(UsingHsBoot)" "YES" +CLEAN_FILES += $(odir)/*/*.hi-boot $(odir)/*/*.o-boot +endif + +ifeq "$(stage)" "1" +mostlyclean clean distclean maintainer-clean :: + $(MAKE) $@ stage=2 + $(MAKE) $@ stage=3 +endif + +# ----------------------------------------------------------------------------- +# Set HS_PROG + +# Note: there have been reports of people running up against the ARG_MAX limit +# when linking ghc with all its constituent object files. The likely source of +# the problem is that the environment is a bit too big, so a workaround could +# be to do `env PATH=$(PATH) make ghc' to minimise the environment. (or the +# equivalent of `env' if it doesn't exist locally). +# +ifneq "$(way)" "dll" +ifneq "$(TARGETPLATFORM)" "i386-unknown-mingw32" +HS_PROG=$(odir)/ghc$(_way)-$(ProjectVersion) +else +HS_PROG=$(odir)/ghc$(_way) +endif +else +HS_PROG=$(odir)/ghc-$(ProjectVersion) +endif + +# ----------------------------------------------------------------------------- +# Create compiler configuration +# +# The 'echo' commands simply spit the values of various make variables +# into Config.hs, whence they can be compiled and used by GHC itself + +CONFIG_HS = main/Config.hs +boot :: $(CONFIG_HS) + +$(CONFIG_HS) : $(FPTOOLS_TOP)/mk/config.mk Makefile + @$(RM) -f $(CONFIG_HS) + @echo "Creating $(CONFIG_HS) ... 
" + @echo "module Config where" >>$(CONFIG_HS) + @echo "cProjectName = \"$(ProjectName)\"" >> $(CONFIG_HS) + @echo "cProjectVersion = \"$(ProjectVersion)\"" >> $(CONFIG_HS) + @echo "cProjectVersionInt = \"$(ProjectVersionInt)\"" >> $(CONFIG_HS) + @echo "cProjectPatchLevel = \"$(ProjectPatchLevel)\"" >> $(CONFIG_HS) + @echo "cBooterVersion = \"$(GhcVersion)\"" >> $(CONFIG_HS) + @echo "cHscIfaceFileVersion = \"$(HscIfaceFileVersion)\"" >> $(CONFIG_HS) + @echo "cGhcWithNativeCodeGen = \"$(GhcWithNativeCodeGen)\"" >> $(CONFIG_HS) + @echo "cGhcUnregisterised = \"$(GhcUnregisterised)\"" >> $(CONFIG_HS) + @echo "cLeadingUnderscore = \"$(LeadingUnderscore)\"" >> $(CONFIG_HS) + @echo "cRAWCPP_FLAGS = \"$(RAWCPP_FLAGS)\"" >> $(CONFIG_HS) + @echo "cGCC = \"$(WhatGccIsCalled)\"" >> $(CONFIG_HS) + @echo "cMKDLL = \"$(BLD_DLL)\"" >> $(CONFIG_HS) + @echo "cLdIsGNULd = \"$(LdIsGNULd)\"" >> $(CONFIG_HS) + @echo "cLD_X = \"$(LD_X)\"" >> $(CONFIG_HS) + @echo "cPROJECT_DIR = \"$(PROJECT_DIR)\"" >> $(CONFIG_HS) + @echo "cGHC_DRIVER_DIR_REL = \"$(GHC_DRIVER_DIR_REL)\"" >> $(CONFIG_HS) + @echo "cGHC_TOUCHY_PGM = \"$(GHC_TOUCHY_PGM)\"" >> $(CONFIG_HS) + @echo "cGHC_TOUCHY_DIR_REL = \"$(GHC_TOUCHY_DIR_REL)\"" >> $(CONFIG_HS) + @echo "cGHC_UNLIT_PGM = \"$(GHC_UNLIT_PGM)\"" >> $(CONFIG_HS) + @echo "cGHC_UNLIT_DIR_REL = \"$(GHC_UNLIT_DIR_REL)\"" >> $(CONFIG_HS) + @echo "cGHC_MANGLER_PGM = \"$(GHC_MANGLER_PGM)\"" >> $(CONFIG_HS) + @echo "cGHC_MANGLER_DIR_REL = \"$(GHC_MANGLER_DIR_REL)\"" >> $(CONFIG_HS) + @echo "cGHC_SPLIT_PGM = \"$(GHC_SPLIT_PGM)\"" >> $(CONFIG_HS) + @echo "cGHC_SPLIT_DIR_REL = \"$(GHC_SPLIT_DIR_REL)\"" >> $(CONFIG_HS) + @echo "cGHC_SYSMAN_PGM = \"$(GHC_SYSMAN)\"" >> $(CONFIG_HS) + @echo "cGHC_SYSMAN_DIR_REL = \"$(GHC_SYSMAN_DIR)\"" >> $(CONFIG_HS) + @echo "cGHC_CP = \"$(GHC_CP)\"" >> $(CONFIG_HS) + @echo "cGHC_PERL = \"$(GHC_PERL)\"" >> $(CONFIG_HS) +ifeq ($(GhcWithIlx),YES) + @echo "cILX2IL = \"$(ILX2IL)\"" >> $(CONFIG_HS) + @echo "cILASM = \"$(ILASM)\"" >> $(CONFIG_HS) +endif + @echo "cEnableWin32DLLs = \"$(EnableWin32DLLs)\"" >> $(CONFIG_HS) + @echo "cCONTEXT_DIFF = \"$(CONTEXT_DIFF)\"" >> $(CONFIG_HS) + @echo "cUSER_WAY_NAMES = \"$(USER_WAY_NAMES)\"" >> $(CONFIG_HS) + @echo "cUSER_WAY_OPTS = \"$(USER_WAY_OPTS)\"" >> $(CONFIG_HS) + @echo "cDEFAULT_TMPDIR = \"$(DEFAULT_TMPDIR)\"" >> $(CONFIG_HS) + @echo done. + +CLEAN_FILES += $(CONFIG_HS) + +# ----------------------------------------------------------------------------- +# Create platform includes + +# Here we generate a little header file containing CPP symbols that GHC +# uses to determine which platform it is building on/for. The platforms +# can differ between stage1 and stage2 if we're cross-compiling, so we +# need one of these header files per stage. + +PLATFORM_H = ghc_boot_platform.h + +stage1/$(PLATFORM_H) : stage_dirs $(FPTOOLS_TOP)/mk/config.mk Makefile + @echo "Creating $@..." 
+ @$(RM) $@ + @echo "#ifndef __PLATFORM_H__" >$@ + @echo "#define __PLATFORM_H__" >>$@ + @echo >> $@ + @echo "#define BuildPlatform_NAME \"$(BUILDPLATFORM)\"" >> $@ + @echo "#define HostPlatform_NAME \"$(HOSTPLATFORM)\"" >> $@ + @echo "#define TargetPlatform_NAME \"$(TARGETPLATFORM)\"" >> $@ + @echo >> $@ + @echo "#define $(BuildPlatform_CPP)_BUILD 1" >> $@ + @echo "#define $(HostPlatform_CPP)_HOST 1" >> $@ + @echo "#define $(TargetPlatform_CPP)_TARGET 1" >> $@ + @echo >> $@ + @echo "#define $(BuildArch_CPP)_BUILD_ARCH 1" >> $@ + @echo "#define $(HostArch_CPP)_HOST_ARCH 1" >> $@ + @echo "#define $(TargetArch_CPP)_TARGET_ARCH 1" >> $@ + @echo "#define BUILD_ARCH \"$(BuildArch_CPP)\"" >> $@ + @echo "#define HOST_ARCH \"$(HostArch_CPP)\"" >> $@ + @echo "#define TARGET_ARCH \"$(TargetArch_CPP)\"" >> $@ + @echo >> $@ + @echo "#define $(BuildOS_CPP)_BUILD_OS 1" >> $@ + @echo "#define $(HostOS_CPP)_HOST_OS 1" >> $@ + @echo "#define $(TargetOS_CPP)_TARGET_OS 1" >> $@ + @echo "#define BUILD_OS \"$(BuildOS_CPP)\"" >> $@ + @echo "#define HOST_OS \"$(HostOS_CPP)\"" >> $@ + @echo "#define TARGET_OS \"$(TargetOS_CPP)\"" >> $@ +ifeq "$(HostOS_CPP)" "irix" + @echo "#ifndef $(IRIX_MAJOR)_TARGET_OS " >> $@ + @echo "#define $(IRIX_MAJOR)_TARGET_OS 1" >> $@ + @echo "#endif " >> $@ +endif + @echo >> $@ + @echo "#define $(BuildVendor_CPP)_BUILD_VENDOR 1" >> $@ + @echo "#define $(HostVendor_CPP)_HOST_VENDOR 1" >> $@ + @echo "#define $(TargetVendor_CPP)_TARGET_VENDOR 1" >> $@ + @echo "#define BUILD_VENDOR \"$(BuildVendor_CPP)\"" >> $@ + @echo "#define HOST_VENDOR \"$(HostVendor_CPP)\"" >> $@ + @echo "#define TARGET_VENDOR \"$(TargetVendor_CPP)\"" >> $@ + @echo >> $@ + @echo "#endif /* __PLATFORM_H__ */" >> $@ + @echo "Done." + +# For stage2 and above, the BUILD platform is the HOST of stage1, and +# the HOST platform is the TARGET of stage1. The TARGET remains the same +# (stage1 is the cross-compiler, not stage2). +stage2/$(PLATFORM_H) : stage_dirs $(FPTOOLS_TOP)/mk/config.mk Makefile + @echo "Creating $@..." 
+ @$(RM) $@ + @echo "#ifndef __PLATFORM_H__" >$@ + @echo "#define __PLATFORM_H__" >>$@ + @echo >> $@ + @echo "#define BuildPlatform_NAME \"$(HOSTPLATFORM)\"" >> $@ + @echo "#define HostPlatform_NAME \"$(TARGETPLATFORM)\"" >> $@ + @echo "#define TargetPlatform_NAME \"$(TARGETPLATFORM)\"" >> $@ + @echo >> $@ + @echo "#define $(HostPlatform_CPP)_BUILD 1" >> $@ + @echo "#define $(TargetPlatform_CPP)_HOST 1" >> $@ + @echo "#define $(TargetPlatform_CPP)_TARGET 1" >> $@ + @echo >> $@ + @echo "#define $(HostArch_CPP)_BUILD_ARCH 1" >> $@ + @echo "#define $(TargetArch_CPP)_HOST_ARCH 1" >> $@ + @echo "#define $(TargetArch_CPP)_TARGET_ARCH 1" >> $@ + @echo "#define BUILD_ARCH \"$(HostArch_CPP)\"" >> $@ + @echo "#define HOST_ARCH \"$(TargetArch_CPP)\"" >> $@ + @echo "#define TARGET_ARCH \"$(TargetArch_CPP)\"" >> $@ + @echo >> $@ + @echo "#define $(HostOS_CPP)_BUILD_OS 1" >> $@ + @echo "#define $(TargetOS_CPP)_HOST_OS 1" >> $@ + @echo "#define $(TargetOS_CPP)_TARGET_OS 1" >> $@ + @echo "#define BUILD_OS \"$(HostOS_CPP)\"" >> $@ + @echo "#define HOST_OS \"$(TargetOS_CPP)\"" >> $@ + @echo "#define TARGET_OS \"$(TargetOS_CPP)\"" >> $@ +ifeq "$(HostOS_CPP)" "irix" + @echo "#ifndef $(IRIX_MAJOR)_TARGET_OS " >> $@ + @echo "#define $(IRIX_MAJOR)_TARGET_OS 1" >> $@ + @echo "#endif " >> $@ +endif + @echo >> $@ + @echo "#define $(HostVendor_CPP)_BUILD_VENDOR 1" >> $@ + @echo "#define $(TargetVendor_CPP)_HOST_VENDOR 1" >> $@ + @echo "#define $(TargetVendor_CPP)_TARGET_VENDOR 1" >> $@ + @echo "#define BUILD_VENDOR \"$(HostVendor_CPP)\"" >> $@ + @echo "#define HOST_VENDOR \"$(TargetVendor_CPP)\"" >> $@ + @echo "#define TARGET_VENDOR \"$(TargetVendor_CPP)\"" >> $@ + @echo >> $@ + @echo "#endif /* __PLATFORM_H__ */" >> $@ + @echo "Done." + +stage3/$(PLATFORM_H) : stage_dirs stage2/$(PLATFORM_H) + $(CP) stage2/$(PLATFORM_H) stage3/$(PLATFORM_H) + +STAGE_PLATFORM_H = stage$(stage)/$(PLATFORM_H) + +boot :: $(STAGE_PLATFORM_H) + +SRC_HC_OPTS += -Istage$(stage) + +# ----------------------------------------------------------------------------- +# Set SRCS etc. +# +# First figure out ALL_DIRS, the source sub-directories + +ALL_DIRS = \ + utils basicTypes types hsSyn prelude rename typecheck deSugar coreSyn \ + specialise simplCore stranal stgSyn simplStg codeGen main \ + profiling parser cprAnalysis ndpFlatten iface cmm + +# Make sure we include Config.hs even if it doesn't exist yet... +ALL_SRCS += $(CONFIG_HS) + +# HsGeneric.hs is not used just now +EXCLUDED_SRCS += hsSyn/HsGeneric.hs + +ifeq ($(GhcWithNativeCodeGen),YES) +ALL_DIRS += nativeGen +else +SRC_HC_OPTS += -DOMIT_NATIVE_CODEGEN +endif + +ifeq ($(GhcWithIlx),YES) +ALL_DIRS += ilxGen +SRC_HC_OPTS += -DILX +endif + +ifeq ($(GhcWithJavaGen),YES) +ALL_DIRS += javaGen +SRC_HC_OPTS += -DJAVA +endif + +ifeq "$(BootingFromHc)" "YES" +# HC files are always from a self-booted compiler +bootstrapped = YES +else +ifneq "$(findstring $(stage), 2 3)" "" +bootstrapped = YES +else +bootstrapped = $(shell if (test $(GhcCanonVersion) -eq $(ProjectVersionInt) -a $(GhcPatchLevel) -eq $(ProjectPatchLevel)); then echo YES; else echo NO; fi) +endif +endif + +# ----------------------------------------------------------------------------- +# Building a compiler with interpreter support +# +# The interpreter, GHCi interface, and Template Haskell are only +# enabled when we are bootstrapping with the same version of GHC, and +# the interpreter is supported on this platform. 
+ +ifeq "$(GhcWithInterpreter) $(bootstrapped)" "YES YES" + +# Yes, include the interepreter, readline, and Template Haskell extensions +SRC_HC_OPTS += -DGHCI -package template-haskell +PKG_DEPENDS += template-haskell + +# Use threaded RTS with GHCi, so threads don't get blocked at the prompt. +SRC_HC_OPTS += -threaded + +ALL_DIRS += ghci + +# If we are going to use dynamic libraries instead of .o files for ghci, +# we will need to always retain CAFs in the compiler. +# ghci/keepCAFsForGHCi contains a GNU C __attribute__((constructor)) +# function which sets the keepCAFs flag for the RTS before any Haskell +# code is run. +ifeq "$(GhcBuildDylibs)" "YES" +else +EXCLUDED_SRCS += ghci/keepCAFsForGHCi.c +endif + +# Enable readline if either: +# - we're building stage 1 and $(GhcHasReadline)="YES" +# - we're building stage 2/3, and we have built the readline package +# +# But we don't enable readline on Windows, because readline is fairly +# broken there. +# +ifneq "$(TARGETPLATFORM)" "i386-unknown-mingw32" +ifeq "$(stage)" "1" +ifeq "$(GhcHasReadline)" "YES" +SRC_HC_OPTS += -package readline -DUSE_READLINE +PKG_DEPENDS += readline +endif +else +ifeq "$(GhcLibsWithReadline)" "YES" +SRC_HC_OPTS += -package readline -DUSE_READLINE +PKG_DEPENDS += readline +endif +endif # stage=1 +endif # not windows + +else + +# No interpreter, so exclude Template Haskell modules +EXCLUDED_SRCS += deSugar/DsMeta.hs typecheck/TcSplice.lhs hsSyn/Convert.lhs + +endif # bootstrapped with interpreter + +# ----------------------------------------------- +# mkdependC stuff +# +# Big Fudge to get around inherent problem that Makefile setup +# has got with 'mkdependC'. +# +SRC_MKDEPENDC_OPTS += -D__GLASGOW_HASKELL__=$(ProjectVersionInt) + +# XXX not really correct, hschooks.c actually gets include files like +# RtsFlags.c from the installed GHC, but we can't tell mkdependC about that. +SRC_MKDEPENDC_OPTS += -I$(GHC_INCLUDE_DIR) + +# ----------------------------------------------------------------------------- +# Haskell compilations + +SRC_HC_OPTS += \ + -cpp -fglasgow-exts -fno-generics -Rghc-timing \ + -I. -IcodeGen -InativeGen -Iparser + +# Omitted: -I$(GHC_INCLUDE_DIR) +# We should have -I$(GHC_INCLUDE_DIR) in SRC_HC_OPTS, +# to avoid the use of an explicit path in GHC source files +# (include "../includes/config.h" +# But alas GHC 4.08 (and others for all I know) uses this very +# same include path when compiling the .hc files it generates. +# Disaster! Then the hc file sees the GHC 5.02 (or whatever) +# include files. For the moment we've reverted to using +# an explicit path in the .hs sources +# +# For the benefit of <5.00 compilers, do include GHC_INCLUDE_DIR +# when generating dependencies. (=> it gets passed onto mkdependHS, +# which needs it). +SRC_MKDEPENDHS_OPTS += -I$(GHC_INCLUDE_DIR) + +# We need System.Posix (or Posix when ghc < 6.2) +ifneq "$(TARGETPLATFORM)" "i386-unknown-mingw32" +ifeq "$(bootstrapped) $(ghc_ge_601)" "NO NO" +SRC_HC_OPTS += -package posix +else +SRC_HC_OPTS += -package unix +PKG_DEPENDS += unix +endif +endif + +# We use the Cabal package in stages 2/3 only; in stage 1 we're using +# the libcompat library which provides the Cabal modules. +ifneq "$(stage)" "1" +SRC_HC_OPTS += -package Cabal +PKG_DEPENDS += Cabal +endif + +ifeq "$(ghc_ge_603)" "YES" +# Ignore lang, to avoid potential clash with the Generics module if +# lang happens to be a dependency of some exposed package in the local +# GHC installation (eg. wxHaskell did this around 6.4). 
+SRC_HC_OPTS += -ignore-package lang +endif + +SRC_CC_OPTS += -Iparser -I. -O +SRC_HC_OPTS += -recomp $(GhcHcOpts) $(GhcStage$(stage)HcOpts) +SRC_HC_OPTS += -H16M + +ifeq "$(BootingFromHc)" "YES" +SRC_CC_OPTS += -D__GLASGOW_HASKELL__=$(ProjectVersionInt) +endif + +# Special flags for particular modules +# The standard suffix rule for compiling a Haskell file +# adds these flags to the command line + +# There used to be a -no-recomp flag on PrimOp, but why? +# It's an expensive module to recompile! +prelude/PrimOp_HC_OPTS = -H80m + + +main/ParsePkgConf_HC_OPTS += -fno-warn-incomplete-patterns +parser/Parser_HC_OPTS += -fno-warn-incomplete-patterns + +ifeq "$(ghc_ge_603)" "NO" +# Use -fvia-C since the NCG can't handle the narrow16Int# (and intToInt16#?) +# primops on all platforms. +parser/Parser_HC_OPTS += -fvia-C +# because the NCG can't handle the 64-bit math in here +prelude/PrelRules_HC_OPTS += -fvia-C +# ByteCodeItbls uses primops that the NCG doesn't support. +ghci/ByteCodeItbls_HC_OPTS += -fvia-C +ghci/ByteCodeLink_HC_OPTS += -fvia-C -monly-3-regs +endif + +# Careful optimisation of the parser: we don't want to throw everything +# at it, because that takes too long and doesn't buy much, but we do want +# to inline certain key external functions, so we instruct GHC not to +# throw away inlinings as it would normally do in -Onot mode: +parser/Parser_HC_OPTS += -Onot -fno-ignore-interface-pragmas + +ifeq "$(HOSTPLATFORM)" "hppa1.1-hp-hpux9" +rename/RnMonad_HC_OPTS = -O2 -O2-for-C +endif + +utils/Digraph_HC_OPTS = -fglasgow-exts + +basicTypes/SrcLoc_HC_OPTS = -funbox-strict-fields + +ifeq "$(bootstrapped)" "YES" +utils/Binary_HC_OPTS = -funbox-strict-fields +endif + +# We always optimise some low-level modules, otherwise performance of +# a non-optimised compiler is severely affected. +main/BinIface_HC_OPTS += -O +utils/Binary_HC_OPTS += -O +utils/FastMutInt_HC_OPTS += -O +utils/Encoding_HC_OPTS += -O +utils/StringBuffer_HC_OPTS += -O +utils/FastString_HC_OPTS += -O + +# ---- Profiling ---- +#simplCore/Simplify_HC_OPTS = -auto-all +#simplCore/SimplEnv_HC_OPTS = -auto-all +#simplCore/SimplUtils_HC_OPTS = -auto-all + +# CSE interacts badly with top-level IORefs (reportedly in DriverState and +# DriverMkDepend), causing some of them to be commoned up. We have a fix for +# this in 5.00+, but earlier versions of the compiler will need CSE turned off. +# To be on the safe side, we disable CSE in *all* modules with top-level IORefs. 
+ghci/InteractiveUI_HC_OPTS = -fno-cse +main/CmdLineOpts_HC_OPTS = -fno-cse +main/DriverMkDepend_HC_OPTS = -fno-cse +main/DriverPipeline_HC_OPTS = -fno-cse +main/Finder_HC_OPTS = -fno-cse +main/SysTools_HC_OPTS = -fno-cse +main/StaticFlags_HC_OPTS = -fno-cse + +# The #include is vital for the via-C route, else the C +# compiler doesn't realise that the stcall foreign imports are indeed +# stdcall, and doesn't generate the Foo@8 name for them +ifeq "$(HOSTPLATFORM)" "i386-unknown-mingw32" +main/SysTools_HC_OPTS += '-\#include <windows.h>' '-\#include <process.h>' +endif + +parser/Lexer_HC_OPTS += -funbox-strict-fields + +# ghc_strlen percolates through so many modules that it is easier to get its +# prototype via a global option instead of a myriad of per-file OPTIONS +SRC_HC_OPTS += '-\#include "hschooks.h"' + +# ---------------------------------------------------------------------------- +# Generate supporting stuff for prelude/PrimOp.lhs +# from prelude/primops.txt + +PRIMOP_BITS=primop-data-decl.hs-incl \ + primop-tag.hs-incl \ + primop-list.hs-incl \ + primop-has-side-effects.hs-incl \ + primop-out-of-line.hs-incl \ + primop-commutable.hs-incl \ + primop-needs-wrapper.hs-incl \ + primop-can-fail.hs-incl \ + primop-strictness.hs-incl \ + primop-primop-info.hs-incl + +CLEAN_FILES += prelude/primops.txt +CLEAN_FILES += $(PRIMOP_BITS) + +SRC_CPP_OPTS += -I$(GHC_INCLUDE_DIR) +SRC_CPP_OPTS += ${GhcCppOpts} + +ifneq "$(BootingFromHc)" "YES" +prelude/PrimOp.lhs $(odir)/prelude/PrimOp.o: $(PRIMOP_BITS) +endif + +ifneq "$(BootingFromHc)" "YES" +depend :: $(PRIMOP_BITS) +endif + +primop-data-decl.hs-incl: prelude/primops.txt + $(GENPRIMOP) --data-decl < $< > $@ +primop-tag.hs-incl: prelude/primops.txt + $(GENPRIMOP) --primop-tag < $< > $@ +primop-list.hs-incl: prelude/primops.txt + $(GENPRIMOP) --primop-list < $< > $@ +primop-has-side-effects.hs-incl: prelude/primops.txt + $(GENPRIMOP) --has-side-effects < $< > $@ +primop-out-of-line.hs-incl: prelude/primops.txt + $(GENPRIMOP) --out-of-line < $< > $@ +primop-commutable.hs-incl: prelude/primops.txt + $(GENPRIMOP) --commutable < $< > $@ +primop-needs-wrapper.hs-incl: prelude/primops.txt + $(GENPRIMOP) --needs-wrapper < $< > $@ +primop-can-fail.hs-incl: prelude/primops.txt + $(GENPRIMOP) --can-fail < $< > $@ +primop-strictness.hs-incl: prelude/primops.txt + $(GENPRIMOP) --strictness < $< > $@ +primop-primop-info.hs-incl: prelude/primops.txt + $(GENPRIMOP) --primop-primop-info < $< > $@ + +# Usages aren't used any more; but the generator +# can still generate them if we want them back +primop-usage.hs-incl: prelude/primops.txt + $(GENPRIMOP) --usage < $< > $@ + + +#----------------------------------------------------------------------------- +# Linking + +# Include libghccompat in stage1 only. In stage2 onwards, all these +# libraries will be available from the main libraries. + +ifeq "$(stage)" "1" +include $(GHC_COMPAT_DIR)/compat.mk +endif + +SRC_LD_OPTS += -no-link-chk + +# ----------------------------------------------------------------------------- +# create ghc-inplace, a convenient way to run ghc from the build tree... 
+ +all :: $(odir)/ghc-inplace ghc-inplace + +$(odir)/ghc-inplace : $(HS_PROG) + @$(RM) $@ + echo '#!/bin/sh' >>$@ + echo exec $(GHC_COMPILER_DIR_ABS)/$(HS_PROG) '-B$(subst \,\\,$(FPTOOLS_TOP_ABS_PLATFORM))' '"$$@"' >>$@ + chmod 755 $@ + +ghc-inplace : stage1/ghc-inplace + $(LN_S) -f $< $@ + +ifeq "$(stage)" "1" +CLEAN_FILES += ghc-inplace +endif + +CLEAN_FILES += $(odir)/ghc-inplace + +#----------------------------------------------------------------------------- +# install + +# We don't want ghc treated as an ordinary executable, +# but put it together with the libraries. +# Also don't want any interface files installed + +DESTDIR = $(INSTALL_LIBRARY_DIR_GHC) + +ifneq "$(HOSTPLATFORM)" "i386-unknown-mingw32" +INSTALL_LIBEXECS += $(HS_PROG) +else +INSTALL_PROGS += $(HS_PROG) +endif + +# ---------------------------------------------------------------------------- +# profiling. + +# rename/RnBinds_HC_OPTS += -auto-all +# rename/RnEnv_HC_OPTS += -auto-all +# rename/RnExpr_HC_OPTS += -auto-all +# rename/RnHiFiles_HC_OPTS += -auto-all +# rename/RnHsSyn_HC_OPTS += -auto-all +# rename/Rename_HC_OPTS += -auto-all +# rename/RnIfaces_HC_OPTS += -auto-all +# rename/RnNames_HC_OPTS += -auto-all +# rename/RnSource_HC_OPTS += -auto-all +# rename/RnTypes_HC_OPTS += -auto-all +# +# typecheck/Inst_HC_OPTS += -auto-all +# typecheck/TcBinds_HC_OPTS += -auto-all +# typecheck/TcClassDcl_HC_OPTS += -auto-all +# typecheck/TcDefaults_HC_OPTS += -auto-all +# typecheck/TcDeriv_HC_OPTS += -auto-all +# typecheck/TcEnv_HC_OPTS += -auto-all +# typecheck/TcExpr_HC_OPTS += -auto-all +# typecheck/TcForeign_HC_OPTS += -auto-all +# typecheck/TcGenDeriv_HC_OPTS += -auto-all +# typecheck/TcHsSyn_HC_OPTS += -auto-all +# typecheck/TcIfaceSig_HC_OPTS += -auto-all +# typecheck/TcInstDcls_HC_OPTS += -auto-all +# typecheck/TcMatches_HC_OPTS += -auto-all +# typecheck/TcMonoType_HC_OPTS += -auto-all +# typecheck/TcMType_HC_OPTS += -auto-all +# typecheck/TcPat_HC_OPTS += -auto-all +# typecheck/TcRnDriver_HC_OPTS += -auto-all +# #typecheck/TcRnMonad_HC_OPTS += -auto-all +# #typecheck/TcRnTypes_HC_OPTS += -auto-all +# typecheck/TcRules_HC_OPTS += -auto-all +# typecheck/TcSimplify_HC_OPTS += -auto-all +# typecheck/TcSplice_HC_OPTS += -auto-all +# typecheck/TcTyClsDecls_HC_OPTS += -auto-all +# typecheck/TcTyDecls_HC_OPTS += -auto-all +# typecheck/TcType_HC_OPTS += -auto-all +# typecheck/TcUnify_HC_OPTS += -auto-all + +coreSyn/CorePrep_HC_OPTS += -auto-all +# parser/Parser_HC_OPTS += -fasm + +#----------------------------------------------------------------------------- +# Building the GHC package + +# The GHC package is made from the stage 2 build. Fortunately the +# package build system framework more or less does the right thing for +# us here. + +ifeq "$(stage)" "2" +PACKAGE = ghc +HIERARCHICAL_LIB = NO +VERSION = $(ProjectVersion) +PKG_DEPENDS += base haskell98 +PACKAGE_CPP_OPTS += -DPKG_DEPENDS='$(PKG_DEPENDS)' + +# Omit Main from the library, the client will want to plug their own Main in +LIBOBJS = $(filter-out $(odir)/main/Main.o $(odir)/parser/hschooks.o, $(OBJS)) + +# disable splitting: it won't really help with GHC, and the specialised +# build system for compiler/ isn't set up to handle it. +SplitObjs = NO + +# the package build system likes to set WAYS=$(GhcLibWays), but we don't +# really want to build the whole of GHC multiple ways... if you do, +# set GhcCompilerWays instead. 
+GhcLibWays = $(GhcCompilerWays) + +# override $(GhcLibHcOpts): we want GhcStage2HcOpts to take precedence +GhcLibHcOpts = + +# override default definition of HS_IFACES so we can add $(odir) +HS_IFACES = $(addsuffix .$(way_)hi,$(basename $(HS_OBJS))) + +# Haddock can't handle recursive modules currently, so we disable it for now. +NO_HADDOCK_DOCS = YES +endif + +#----------------------------------------------------------------------------- +# clean + +MAINTAINER_CLEAN_FILES += parser/Parser.info main/ParsePkgConf.info + +#----------------------------------------------------------------------------- +# Include target-rule boilerplate + +# Don't use the default MKDEPENDHS stuff... we'll do our own, below +MKDEPENDHS_SRCS = +MKDEPENDC_SRCS = + +# Make doesn't work this out for itself, it seems +parser/Parser.y : parser/Parser.y.pp +EXTRA_SRCS += parser/Parser.y + + +#----------------------------------------------------------------------------- +# Source files for tags file generation +# +# We want to excluded derived sources, because they won't be in the source +# tree, which is where we are going to move the TAGS file to.a + +TAGS_HS_SRCS = parser/Parser.y.pp $(filter-out $(DERIVED_SRCS) main/Config.hs parser/Parser.y, $(sort $(SRCS))) + + +include $(TOP)/mk/target.mk + +# ----------------------------------------------------------------------------- +# Dependencies + +MKDEPENDHS_HC_OPTS = $(patsubst -i$(odir)/%, -i%, $(HC_OPTS)) + +MKDEPENDHS=$(HC) + +# Must do this *after* including target.mk, because $(HS_SRCS) isn't set yet. +depend :: $(STAGE_PLATFORM_H) $(HS_SRCS) $(C_SRCS) + touch .depend-BASE +ifneq "$(BootingFromHc)" "YES" + $(MKDEPENDHS) -M -optdep-f -optdep.depend-BASE $(foreach way,$(WAYS),-optdep-s -optdep$(way)) $(foreach obj,$(MKDEPENDHS_OBJ_SUFFICES),-osuf $(obj)) $(MKDEPENDHS_OPTS) $(filter-out -split-objs, $(MKDEPENDHS_HC_OPTS)) $(HS_SRCS) +endif + $(MKDEPENDC) -f .depend-BASE $(MKDEPENDC_OPTS) $(foreach way,$(WAYS),-s $(way)) -- $(CC_OPTS) -- $(C_SRCS) + $(PERL) -pe 'binmode(stdin); binmode(stdout); s@^(\S*\.o)@stage$(stage)/$$1@g; s@(\S*\.hi)@stage$(stage)/$$1@g; s@^.*/lib/compat.*$$@@g;' <.depend-BASE >.depend-$(stage) +# The binmode stuff tells perl not to add stupid ^M's to the output +# +# The /lib/compat replacement is to workaround a bug in the +# -optdep--exclude-module flag in GHC 6.4. It is not required for any +# other version of GHC, but doesn't do any harm. + +-include .depend-$(stage) diff --git a/compiler/NOTES b/compiler/NOTES new file mode 100644 index 0000000000..8c62750008 --- /dev/null +++ b/compiler/NOTES @@ -0,0 +1,171 @@ + +------------------------- +*** unexpected failure for jtod_circint(opt) + + + New back end thoughts + +----------------------------------------------------------------------------- +Codegen notes + +* jumps to ImpossibleBranch should be removed. + +* Profiling: + - when updating a closure with an indirection to a function, + we should make a permanent indirection. + + - check that we're bumping the scc count appropriately + +* check perf & binary sizes against the HEAD + +----------------------------------------------------------------------------- +C backend notes + +* use STGCALL macros for foreign calls (doesn't look like volatile regs + are handled properly at the mo). + +----------------------------------------------------------------------------- +Cmm parser notes + +* switches + +* need to cater for unexported procedures/info tables? + +* We should be able to get rid of entry labels, use info labels only. 
+ - we need a %ENTRY_LBL(info_lbl) macro, so that instead of + JMP_(foo_entry) we can write jump %ENTRY_LBL(foo_info). + +----------------------------------------------------------------------------- + +* Move arg-descr from LFInfo to ClosureInfo? + But: only needed for functions + +* Move all of CgClosure.link_caf into NewCaf, and newDynCaf + +* If the case binder is dead, and the constr is nullary, + do we need to assign to Node? + + +------------------------- +* Relation between separate type sigs and pattern type sigs +f :: forall a. a->a +f :: b->b = e -- No: monomorphic + +f :: forall a. a->a +f :: forall a. a->a -- OK + +f :: forall a. [a] -> [a] +f :: forall b. b->b = e ??? + + +------------------------------- +NB: all floats are let-binds, but some non-rec lets + may be unlifted (with RHS ok-for-speculation) + + +simplArg: [use strictness] + [used for non-top-lvl non-rec RHS or function arg] + if strict-type || demanded + simplStrictExpr + else + simplExpr ---> (floats,expr) + float all the floats if exposes constr app, return expr + +simpl (applied lambda) ==> simplNonRecBind +simpl (Let (NonRec ...) ..) ==> simplNonRecBind + +simpl (Let (Rec ...) ..) ==> simplRecBind + +simplRecBind: + simplify binders (but not its IdInfo) + simplify the pairs one at a time + using simplRecPair + +simplNonRecBind: [was simplBeta] + [used for non-top-lvl non-rec bindings] + - check for PreInlineUnconditionally + - simplify binder, including its IdInfo + - simplArg + - if strict-type + addCaseBind [which makes a let if ok-for-spec] + else + completeLazyBind + +simplLazyBind: [binder already simplified, but not its IdInfo] + [used for both rec and top-lvl non-rec] + [must not be strict/unboxed; case not allowed] + - check for PreInlineUnconditionally + - substituteIdInfo and add result to in-scope + [so that rules are available in rec rhs] + - simplExpr --> (floats,expr) + - float: lifted floats only + if exposes constructor or pap (even if non-triv args) + or if top level + - completeLazyBind + + +completeLazyBind: [given a simplified RHS] + [used for both rec and non-rec bindings, top level and not] + - try discarding dead + - try PostInlineUnconditionally + - let-bind coerce arg and repeat + - try rhs tylam (float) + - try eta expand (float) [not if any float is unlifted && (non-spec || top_lvl || rec)] + - let-bind constructor args [not if any float is ..as above..] + + - add unfolding [this is the only place we add an unfolding] + add arity + + + +Right hand sides and arguments +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +In many ways we want to treat + (a) the right hand side of a let(rec), and + (b) a function argument +in the same way. But not always! In particular, we would +like to leave these arguments exactly as they are, so they +will match a RULE more easily. + + f (g x, h x) + g (+ x) + +It's harder to make the rule match if we ANF-ise the constructor, +or eta-expand the PAP: + + f (let { a = g x; b = h x } in (a,b)) + g (\y. + x y) + +On the other hand if we see the let-defns + + p = (g x, h x) + q = + x + +then we *do* want to ANF-ise and eta-expand, so that p and q +can be safely inlined. + +Even floating lets out is a bit dubious. For let RHS's we float lets +out if that exposes a value, so that the value can be inlined more vigorously. +For example + + r = let x = e in (x,x) + +Here, if we float the let out we'll expose a nice constructor. We did experiments +that showed this to be a generally good thing. 
But it was a bad thing to float +lets out unconditionally, because that meant they got allocated more often. + +For function arguments, there's less reason to expose a constructor (it won't +get inlined). Just possibly it might make a rule match, but I'm pretty skeptical. +So for the moment we don't float lets out of function arguments either. + + +Eta expansion +~~~~~~~~~~~~~~ +For eta expansion, we want to catch things like + + case e of (a,b) -> \x -> case a of (p,q) -> \y -> r + +If the \x was on the RHS of a let, we'd eta expand to bring the two +lambdas together. And in general that's a good thing to do. Perhaps +we should eta expand wherever we find a (value) lambda? Then the eta +expansion at a let RHS can concentrate solely on the PAP case. diff --git a/compiler/README b/compiler/README new file mode 100644 index 0000000000..ca619cdde0 --- /dev/null +++ b/compiler/README @@ -0,0 +1,11 @@ +This directory contains the source for Glorious Glasgow Haskell +compiler proper, normally a binary called "hsc". The source is +organized into _one_ level of directories, and the literate Haskell +source files sit in those directories (i.e., */*.lhs). + +The only "real" subdirectory is the tests/ directory [NB: not +distributed normally, but available to gluttons for punishment], which +includes some tests that we use to make sure we're not going +backwards. The subdirs of the test directory "match" the subdirs of +the main source directory; e.g., the desugarer is in subdir deSugar/, +and the tests for the desugarer are in tests/deSugar/. diff --git a/compiler/Simon-log b/compiler/Simon-log new file mode 100644 index 0000000000..9d60ccc6eb --- /dev/null +++ b/compiler/Simon-log @@ -0,0 +1,1260 @@ + ------------------------------------ + GHCI hacking + ------------------------------------ + +* Don't forget to put deferred-type-decls back into RnIfaces + +* Do we want to record a package name in a .hi file? + Does pi_mod have a ModuleName or a Module? + + ------------------------------------ + Mainly FunDeps (23 Jan 01) + ------------------------------------ + +This commit re-engineers the handling of functional dependencies. +A functional dependency is no longer an Inst; instead, the necessary +dependencies are snaffled out of their Class when necessary. + +As part of this exercise I found that I had to re-work how to do generalisation +in a binding group. There is rather exhaustive documentation on the new Plan +at the top of TcSimplify. + + ****************** + WARNING: I have compiled all the libraries with this new compiler + and all looks well, but I have not run many programs. + Things may break. Let me know if so. + ****************** + +The main changes are these: + +1. typecheck/TcBinds and TcSimplify have a lot of changes due to the + new generalisation and context reduction story. There are extensive + comments at the start of TcSimplify + +2. typecheck/TcImprove is removed altogether. Instead, improvement is + interleaved with context reduction (until a fixpoint is reached). + All this is done in TcSimplify. + +3. types/FunDeps has new exports + * 'improve' does improvement, returning a list of equations + * 'grow' and 'oclose' close a list of type variables wrt a set of + PredTypes, but in slightly different ways. Comments in file. + +4. I improved the way in which we check that main::IO t. It's tidier now. 
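A minimal sketch of the improvement idea, using a standard functional-dependency
example rather than anything from this commit (the class, method and function
names below are made up for illustration):

    {-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}
    module FunDepSketch where

    -- The dependency "c -> e" says: the collection type determines the
    -- element type.
    class Collects e c | c -> e where
      insert :: e -> c -> c

    -- Both calls constrain the same collection type for xs, so the
    -- dependency gives rise to an equation identifying x's type with Bool.
    -- Interleaving such improvement with context reduction (to a fixpoint)
    -- is what yields the inferred type
    --   f :: Collects Bool c => Bool -> c -> c
    f x xs = insert x (insert True xs)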
+ +In addition + +* typecheck/TcMatches: + a) Tidy up, introducing a common function tcCheckExistentialPat + + b) Improve the typechecking of parallel list comprehensions, + which wasn't quite right before. (see comments with tcStmts) + + WARNING: (b) is untested! Jeff, you might want to check. + +* Numerous other incidental changes in the typechecker + +* Manuel found that rules don't fire well when you have partial applications + from overloading. For example, we may get + + f a (d::Ord a) = let m_g = g a d + in + \y :: a -> ...(m_g (h y))... + + The 'method' m_g doesn't get inlined because (g a d) might be a redex. + Yet a rule that looks like + g a d (h y) = ... + won't fire because that doesn't show up. One way out would be to make + the rule matcher a bit less paranoid about duplicating work, but instead + I've added a flag + -fno-method-sharing + which controls whether we generate things like m_g in the first place. + It's not clear that they are a win in the first place. + + The flag is actually consulted in Inst.tcInstId + + + + ------------------------------------ + Mainly PredTypes (28 Sept 00) + ------------------------------------ + +Three things in this commit: + + 1. Main thing: tidy up PredTypes + 2. Move all Keys into PrelNames + 3. Check for unboxed tuples in function args + +1. Tidy up PredTypes +~~~~~~~~~~~~~~~~~~~~ +The main thing in this commit is to modify the representation of Types +so that they are a (much) better for the qualified-type world. This +should simplify Jeff's life as he proceeds with implicit parameters +and functional dependencies. In particular, PredType, introduced by +Jeff, is now blessed and dignified with a place in TypeRep.lhs: + + data PredType = Class Class [Type] + | IParam Name Type + +Consider these examples: + f :: (Eq a) => a -> Int + g :: (?x :: Int -> Int) => a -> Int + h :: (r\l) => {r} => {l::Int | r} + +Here the "Eq a" and "?x :: Int -> Int" and "r\l" are all called +*predicates*, and are represented by a PredType. (We don't support +TREX records yet, but the setup is designed to expand to allow them.) + +In addition, Type gains an extra constructor: + + data Type = .... | PredTy PredType + +so that PredType is injected directly into Type. So the type + p => t +is represented by + PredType p `FunTy` t + +I have deleted the hackish IPNote stuff; predicates are dealt with entirely +through PredTys, not through NoteTy at all. + + +2. Move Keys into PrelNames +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +This is just a housekeeping operation. I've moved all the pre-assigned Uniques +(aka Keys) from Unique.lhs into PrelNames.lhs. I've also moved knowKeyRdrNames +from PrelInfo down into PrelNames. This localises in PrelNames lots of stuff +about predefined names. Previously one had to alter three files to add one, +now only one. + +3. Unboxed tuples +~~~~~~~~~~~~~~~~~~ +Add a static check for unboxed tuple arguments. E.g. 
+ data T = T (# Int, Int #) +is illegal + + + + --------------------------------------- + Update in place + --------------------------------------- + +-funfolding-update-in-place +Switching it on doesn't affect many programs, except these +sphere is because it makes a critical function (vecsub) more inlinable + + sphere 66465k -20.61% + infer 13390k +1.27% + parstof 1461k +1.18% + fluid 3442k +1.61% + atom 177163k +13.20% + bspt 4837k +4.85% + cichelli 33546k +2.69% + typecheck 146023k +1.47% + + + --------------------------------------- + Simon's tuning changes: early Sept 2000 + --------------------------------------- + +Library changes +~~~~~~~~~~~~~~~ +* Eta expand PrelShow.showLitChar. It's impossible to compile this well, + and it makes a big difference to some programs (e.g. gen_regexps) + +* Make PrelList.concat into a good producer (in the foldr/build sense) + + +Flag changes +~~~~~~~~~~~~ +* Add -ddump-hi-diffs to print out changes in interface files. Useful + when watching what the compiler is doing + +* Add -funfolding-update-in-place to enable the experimental optimisation + that makes the inliner a bit keener to inline if it's in the RHS of + a thunk that might be updated in place. Sometimes this is a bad idea + (one example is in spectral/sphere; see notes in nofib/Simon-nofib-notes) + + +Tuning things +~~~~~~~~~~~~~ +* Fix a bug in SetLevels.lvlMFE. (change ctxt_lvl to dest_level) + I don't think this has any performance effect, but it saves making + a redundant let-binding that is later eliminated. + +* Desugar.dsProgram and DsForeign + Glom together all the bindings into a single Rec. Previously the + bindings generated by 'foreign' declarations were not glommed together, but + this led to an infelicity (i.e. poorer code than necessary) in the modules + that actually declare Float and Double (explained a bit more in Desugar.dsProgram) + +* OccurAnal.shortMeOut and IdInfo.shortableIdInfo + Don't do the occurrence analyser's shorting out stuff for things which + have rules. Comments near IdInfo.shortableIdInfo. + This is deeply boring, and mainly to do with making rules work well. + Maybe rules should have phases attached too.... + +* CprAnalyse.addIdCprInfo + Be a bit more willing to add CPR information to thunks; + in particular, if the strictness analyser has just discovered that this + is a strict let, then the let-to-case transform will happen, and CPR is fine. + This made a big difference to PrelBase.modInt, which had something like + modInt = \ x -> let r = ... -> I# v in + ...body strict in r... + r's RHS isn't a value yet; but modInt returns r in various branches, so + if r doesn't have the CPR property then neither does modInt + +* MkId.mkDataConWrapId + Arrange that vanilla constructors, like (:) and I#, get unfoldings that are + just a simple variable $w:, $wI#. This ensures they'll be inlined even into + rules etc, which makes matching a bit more reliable. The downside is that in + situations like (map (:) xs), we'll end up with (map (\y ys. $w: y ys) xs. + Which is tiresome but it doesn't happen much. + +* SaAbsInt.findStrictness + Deal with the case where a thing with no arguments is bottom. This is Good. + E.g. module M where { foo = error "help" } + Suppose we have in another module + case M.foo of ... + Then we'd like to do the case-of-error transform, without inlining foo. + + +Tidying up things +~~~~~~~~~~~~~~~~~ +* Reorganised Simplify.completeBinding (again). + +* Removed the is_bot field in CoreUnfolding (is_cheap is true if is_bot is!) 
+ This is just a tidy up + +* HsDecls and others + Remove the NewCon constructor from ConDecl. It just added code, and nothing else. + And it led to a bug in MkIface, which though that a newtype decl was always changing! + +* IdInfo and many others + Remove all vestiges of UpdateInfo (hasn't been used for years) + + ------------------------------ + Join points Sept 2000 + ------------------------------ + +With Andrew Kennedy, I found out why a few of the join points introduced by +the simplifier end up as *not* let-no-escpaed. Here's an example: + +f x y = case (pwr x b) == 1 of + False -> False + True -> pwr x c == 1 + +This compiles to: + f = \ @ t w :: Integer -> + let { + $j :: (State# RealWorld -> Bool) + P + $j + = \ w1 :: (State# RealWorld) -> + case pwr w c of wild { + S# i -> case i of wild1 { 1 -> $wTrue; __DEFAULT -> $wFalse }; + J# s d1 -> + case cmpIntegerInt# s d1 1 of wild2 { + 0 -> $wTrue; __DEFAULT -> $wFalse + } + } + } in + case pwr w b of wild { + S# i -> + case i of wild1 { 1 -> $j realWorld#; __DEFAULT -> $wFalse }; + J# s d1 -> + case cmpIntegerInt# s d1 1 of wild2 { + 0 -> $j realWorld#; __DEFAULT -> $wFalse + } + } + +Now consider + + case (f x) of + True -> False + False -> True + +Suppose f is inlined into this case. No new join points are introduced, +because the alternatives are both small. But the consumer + case [.] of {True -> False; False -> True} +will move into the body of f, be duplicated 4 ways, and end up consuming +the result of the four outcomes at the body of f. This yields: + $j :: (State# RealWorld -> Bool) + P + $j + = \ w1 :: (State# RealWorld) -> + case pwr w c of wild { + S# i -> case i of wild1 { 1 -> $wTrue; __DEFAULT -> $wFalse }; + J# s d1 -> + case cmpIntegerInt# s d1 1 of wild2 { + 0 -> $wTrue; __DEFAULT -> $wFalse + } + } + } in + case pwr w b of wild { + S# i -> + case i of wild1 { 1 -> case $j realWorld# of {T->F; F->T} + ; __DEFAULT -> $wTrue }; + J# s d1 -> + case cmpIntegerInt# s d1 1 of wild2 { + 0 -> case $j realWorld# of {T->F; F->T} + ; __DEFAULT -> $wTrue + } + } + +And, voila, the join point $j isn't let-no-escaped any more. +The point is that the consuming context can't "see inside" the join point. +It's a phase ordering thing. If f is inlined before the join points +are built in the first place, then all is well. + + + + ----------------------------- + Sept 7 2000 + ----------------------------- + +* Make the simplifier's Stop continuation record whether the expression being + simplified is the RHS of a thunk, or (say) the body of a lambda or case RHS. + In the thunk case we want to be a bit keener about inlining if the type of + the thunk is amenable to update in place. + +* SetLevels was being a bit too eager to float things to the top + level; e.g. _inline_me_ (\a -> e); here e got floated... + Easily fixed by a change to ltMajLvl + +* Make CoreUnfold.calcUnfoldingGuidance a bit less keen to make case expressions + seem small. The original idea was to make inlined wrappers look small, so that + when we inline a wrapper it doesn't make call site (much) bigger + Otherwise we get nasty phase ordering stuff: + -- f x = g x x + -- h y = ...(f e)... + If we inline g's wrapper, f looks big, and doesn't get inlined + into h; if we inline f first, while it looks small, then g's + wrapper will get inlined later anyway. To avoid this nasty + ordering difference, we make (case a of (x,y) -> ...), + *where a is one of the arguments* look free. + + BUT (a) It's too eager. 
We don't want to inline a wrapper into a + context with no benefit. + E.g. \ x. f (x+x) o point in inlining (+) here! + + (b) It's ineffective. Once g's wrapper is inlined, its case-expressions + aren't scrutinising arguments any more + + So I've rescinded this idea for now. cases still look fairly small. + +* Fix interestingArg, which was being too liberal, and hence doing + too much inlining. + +* Extended CoreUtils.exprIsCheap to make two more things cheap: + - case (coerce x) of ... + - let x = y +# z + This makes a bit more eta expansion happen. It was provoked by + a program of Marcin's. + +* The simplifier used to glom together all the top-level bindings into + a single Rec every time it was invoked. The reason for this is explained + in SimplCore.lhs, but for at least one simple program it meant that the + simplifier never got around to unravelling the recursive group into + non-recursive pieces. So I've put the glomming under explicit flag + control with a -fglom-binds simplifier pass. A side benefit is + that because it happens less often, the (expensive) SCC algorithm + runs less often. + +* MkIface.ifaceBinds. Make sure that we emit rules for things + (like class operations) that don't get a top-level binding in the + interface file. Previously such rules were silently forgotten. + +* Move transformRhs to *after* simplification, which makes it a + little easier to do, and means that the arity it computes is + readily available to completeBinding. This gets much better + arities. + +* Do coerce splitting in completeBinding. This gets good code for + newtype CInt = CInt Int + + test:: CInt -> Int + test x = case x of + 1 -> 2 + 2 -> 4 + 3 -> 8 + 4 -> 16 + _ -> 0 + +* Modify the meaning of "arity" so that during compilation it means + "if you apply this function to fewer args, it will do virtually + no work". So, for example + f = coerce t (\x -> e) + has arity at least 1. When a function is exported, it's arity becomes + the number of exposed, top-level lambdas, which is subtly different. + But that's ok. + + I removed CoreUtils.exprArity altogether: it looked only at the exposed + lambdas. Instead, we use exprEtaExpandArity exclusively. + + All of this makes I/O programs work much better. + + + ----------------------------- + Sept 4 2000 + ----------------------------- + +* PrimRep, TysPrim. Add PrimPtrRep as the representation for + MVars and MutVars. Previously they were given PtrRep, but that + crashed dataReturnConvPrim! Here's the program the killed it: + data STRef s a = STRef (MutVar# s a) + from (STRef x) = x + +* Make the desugarer use string equality for string literal + patterns longer than 1 character. And put a specialised + eqString into PrelBase, with a suitable specialisation rule. + This makes a huge difference to the size of the code generated + by deriving(Read) notably in Time.lhs + + ----------------------------- + Marktoberdorf Commits (Aug 2000) + ----------------------------- + +1. Tidy up the renaming story for "system binders", such as +dictionary functions, default methods, constructor workers etc. These +are now documented in HsDecls. The main effect of the change, apart +from tidying up, is to make the *type-checker* (instead of the +renamer) generate names for dict-funs and default-methods. This is +good because Sergei's generic-class stuff generates new classes at +typecheck time. + + +2. Fix the CSE pass so it does not require the no-shadowing invariant. +Keith discovered that the simplifier occasionally returns a result +with shadowing. 
After much fiddling around (which has improved the +code in the simplifier a bit) I found that it is nearly impossible to +arrange that it really does do no-shadowing. So I gave up and fixed +the CSE pass (which is the only one to rely on it) instead. + + +3. Fix a performance bug in the simplifier. The change is in +SimplUtils.interestingArg. It computes whether an argment should +be considered "interesting"; if a function is applied to an interesting +argument, we are more likely to inline that function. +Consider this case + let x = 3 in f x +The 'x' argument was considered "uninteresting" for a silly reason. +Since x only occurs once, it was unconditionally substituted, but +interestingArg didn't take account of that case. Now it does. + +I also made interestingArg a bit more liberal. Let's see if we +get too much inlining now. + + +4. In the occurrence analyser, we were choosing a bad loop breaker. +Here's the comment that's now in OccurAnal.reOrderRec + + score ((bndr, rhs), _, _) + | exprIsTrivial rhs = 3 -- Practically certain to be inlined + -- Used to have also: && not (isExportedId bndr) + -- But I found this sometimes cost an extra iteration when we have + -- rec { d = (a,b); a = ...df...; b = ...df...; df = d } + -- where df is the exported dictionary. Then df makes a really + -- bad choice for loop breaker + +I also increased the score for bindings with a non-functional type, so that +dictionaries have a better chance of getting inlined early + + +5. Add a hash code to the InScopeSet (and make it properly abstract) +This should make uniqAway a lot more robust. Simple experiments suggest +that uniqAway no longer gets into the long iteration chains that it used +to. + + +6. Fix a bug in the inliner that made the simplifier tend to get into +a loop where it would keep iterating ("4 iterations, bailing out" message). +In SimplUtils.mkRhsTyLam we float bindings out past a big lambda, thus: + x = /\ b -> let g = \x -> f x x + in E +becomes + g* = /\a -> \x -> f x x + x = /\ b -> let g = g* b in E + +It's essential that we don't simply inling g* back into the RHS of g, +else we will be back to square 1. The inliner is meant not to do this +because there's no benefit to the inlining, but the size calculation +was a little off in CoreUnfold. + + +7. In SetLevels we were bogus-ly building a Subst with an empty in-scope +set, so a WARNING popped up when compiling some modules. (knights/ChessSetList +was the example that tickled it.) Now in fact the warning wasn't an error, +but the Right Thing to do is to carry down a proper Subst in SetLevels, so +that is what I have now done. It is very little more expensive. + + + + ~~~~~~~~~~~~ + Apr/May 2000 + ~~~~~~~~~~~~ + +This is a pretty big commit! It adds stuff I've been working on +over the last month or so. DO NOT MERGE IT WITH 4.07! + +Recompilation checking +~~~~~~~~~~~~~~~~~~~~~~ +Substantial improvement in recompilation checking. The version management +is now entirely internal to GHC. ghc-iface.lprl is dead! + +The trick is to generate the new interface file in two steps: + - first convert Types etc to HsTypes etc, and thereby + build a new ParsedIface + - then compare against the parsed (but not renamed) version of the old + interface file +Doing this meant adding code to convert *to* HsSyn things, and to +compare HsSyn things for equality. That is the main tedious bit. + +Another improvement is that we now track version info for +fixities and rules, which was missing before. 
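+
+To make the two-step comparison concrete, here is a minimal sketch of the
+per-declaration version rule described above.  The name declVersion and the
+plain Eq-based comparison are illustrative only, not the actual GHC code;
+the real comparison works over the HsSyn forms of the declarations.
+
+    type Version = Int
+
+    bumpVersion :: Version -> Version
+    bumpVersion v = v + 1
+
+    -- Compare the old decl (parsed from the old interface) against the
+    -- new decl (converted back to HsSyn); bump only when they differ.
+    declVersion :: Eq decl => Version -> decl -> decl -> Version
+    declVersion old_vers old_decl new_decl
+      | old_decl == new_decl = old_vers              -- unchanged: importers need not recompile
+      | otherwise            = bumpVersion old_vers  -- changed: importers must recompile
+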
+ + +Interface file reading +~~~~~~~~~~~~~~~~~~~~~~ +Make interface files reading more robust. + * If the old interface file is unreadable, don't fail. [bug fix] + + * If the old interface file mentions interfaces + that are unreadable, don't fail. [bug fix] + + * When we can't find the interface file, + print the directories we are looking in. [feature] + + +Type signatures +~~~~~~~~~~~~~~~ + * New flag -ddump-types to print type signatures + + +Type pruning +~~~~~~~~~~~~ +When importing + data T = T1 A | T2 B | T3 C +it seems excessive to import the types A, B, C as well, unless +the constructors T1, T2 etc are used. A,B,C might be more types, +and importing them may mean reading more interfaces, and so on. + So the idea is that the renamer will just import the decl + data T +unless one of the constructors is used. This turns out to be quite +easy to implement. The downside is that we must make sure the +constructors are always available if they are really needed, so +I regard this as an experimental feature. + + +Elimininate ThinAir names +~~~~~~~~~~~~~~~~~~~~~~~~~ +Eliminate ThinAir.lhs and all its works. It was always a hack, and now +the desugarer carries around an environment I think we can nuke ThinAir +altogether. + +As part of this, I had to move all the Prelude RdrName defns from PrelInfo +to PrelMods --- so I renamed PrelMods as PrelNames. + +I also had to move the builtinRules so that they are injected by the renamer +(rather than appearing out of the blue in SimplCore). This is if anything simpler. + +Miscellaneous +~~~~~~~~~~~~~ +* Tidy up the data types involved in Rules + +* Eliminate RnEnv.better_provenance; use Name.hasBetterProv instead + +* Add Unique.hasKey :: Uniquable a => a -> Unique -> Bool + It's useful in a lot of places + +* Fix a bug in interface file parsing for __U[!] + + +======================================= +To-do +~~~~~ +* Try the effect of enhancing update in place with the CPR + idea in CoreUnfold.calcUnfoldingGuidance + +* Check with Simon M re srt on Lit + +* Make all primops return a data type so that we can't over-apply a primop + This makes code gen simpler. Currently the only primops with a polymorphic + return type are: + raise# :: a -> b + catch# :: a -> (b->a) -> a + tagToEnum# :: Int -> a + + Very strange code for PrelException.catchException! What has STret got + to do with it? + +* Liberate case + +* Missing w/w for coerce in go2 functions of fibToList' in fibheaps + +* Watch out for re-boxing in workers; sometimes it happens + and then w/w is a Bad Thing + +* Only two uses of mkCompulsoryUnfolding -- try to nuke it + +* Note that mkDupAlt makes alts that have binders that + are guaranteed to appear just once or not at all + (a,b) -> j a + Same for case binder, but that's harder to take into account. + +* max :: Int -> Int -> Int could be CPRd but isn't. + +* In mandel2 we do a little less well than 4.04 because we aren't + inlining point_colour, and that means we have to box up an argument + before calling it. [This was due to a bug in 4.04] + There's also a great opportunity for liberateCase + in check_radius, where it loops around with two lazy F# built each time + +* In PrelShow.itos' we find a thunk like: + tpl = case chrzh {(zpzh {(remIntzh {x{-aMf-} 10}) 48})} + of tpl{-X1j-} __D P { __DEFAULT -> + PrelBase.Czh{-62,s-} {tpl{-X1j-}} + } + This is a pity. The remInt# can't fail because the divisor isn't 0, + so we could do the sum eagerly and allocate a charcter instead of a thunk. + +* It's good to do let-to-case before we wrap up. 
Consider + f b xs = let ys = partition isUpper xs + zs = case ys of (a,b) -> a + in case b of + True -> case ys of + (a,b) -> (zs,[]) + False -> case ys of + (a,b) -> (zs ++ xs,[]) + If we don't do let-to-case at all, we get 3 redundant case ys left. + On the other hand we don't want to do it too early, because it + prevents inlining into strict arg positions, which is important for + rules to work. + +* Strict dictionaries. + +* INLINE functions are not always INLINEd, so it's sad to leave + stuff in their bodies like constructors that havn't been inlined. + +* If let x = e in b is strict, then CPR can use the CPR info from x + This bites in the mod method of Integral Int + +* Inline wrappers if they are the RHS of a let, so that update in place + can happen? + +* Consider doing unboxing on strict constr args in a pattern match, + as part of w/w. + +* In spectral/expert/Search.ask there's a statically visible CSE. Catching this + depends almost entirely on chance, which is a pity. + +* Think about exprEtaExpandArity in WwLib. Perhaps eliminate eta expand in simplify? + Perhaps use even if no coerces etc, just eta expansion. (e.g. PrelArr.done) + +* In knights/KnightHeuristic, we don't find that possibleMoves is strict + (with important knock-on effects) unless we apply rules before floating + out the literal list [A,B,C...]. + Similarly, in f_se (F_Cmp ...) in listcompr (but a smaller effect) + +* Floating can float the entire body of an INLINE thing out. + e.g. PrelArr.done + This is sad, and a bit stupid. + +* In spectral/multiplier, we have + xor = lift21 forceBit f + where f :: Bit -> Bit -> Bit + f 0 0 = 0 + f 0 1 = 1 + f 1 0 = 1 + f 1 1 = 0 + Trouble is, f is CPR'd, and that means that instead of returning + the constants I# 0, I# 1, it returns 0,1 and then boxes them. + So allocation goes up. I don't see a way around this. + +* spectral/hartel/parstof ends up saying + case (unpackCString "x") of { c:cs -> ... } + quite a bit. We should spot these and behave accordingly. + +* Try a different hashing algorithms in hashUFM. This might reduce long CSE lists + as well as making uniqAway faster. + +* [I'm not sure this is really important in the end.] + Don't float out partial applications in lvlMFE. E.g. (in hPutStr defn of shoveString) + \x -> case .. of + [] -> setBufWPtr a b + ... + setBufWPtr has arity 3. Floating it out is plain silly. And in this particular + case it's harmful, because it ends up preventing eta expansion on the \x. + That in turn leads to a big extra cost in hPutStr. + + *** Try not doing lvlMFE on the body of a lambda and case alternative *** + +* PrelNumExtra.lhs we get three copies of dropTrailing0s. Too much inlining! + drop0 has cost 21, but gets a discount of 6 (3 * #constrs) for its arg. + With a keen-neess factor of 2, that makes a discount of 12. Add two for + the arguments and we get 21-12-2, which is just small enough to inline. + But that is plainly stupid. + + Add one for cases; and decrease discount for constructors. + +* IO.hGetContents still doesn't see that it is strict in the handle. + Coerces still getting in the way. + +* Try not having really_interesting_cont (subsumed by changes in the + way guidance is calculated for inline things?) 
+ +* Enumeration types in worker/wrapper for strictness analysis + +* This should be reported as an error: + data T k = MkT (k Int#) + +* Bogus report of overlapped pattern for + f (R {field = [c]}) = 1 + f (R {}) = 2 + This shows up for TyCon.maybeTyConSingleCon + +* > module Main( main ) where + + > f :: String -> Int + > f "=<" = 0 + > f "=" = 0 + + > g :: [Char] -> Int + > g ['=','<'] = 0 + > g ['='] = 0 + + > main = return () + + For ``f'' the following is reported. + + tmp.lhs:4: + Pattern match(es) are overlapped in the definition of function `f' + "=" = ... + + There are no complaints for definition for ``g''. + +* Without -O I don't think we need change the module version + if the usages change; I forget why it changes even with -O + +* Record selectors for existential type; no good! What to do? + Record update doesn't make sense either. + + Need to be careful when figuring out strictness, and when generating + worker-wrapper split. + + Also when deriving. + + + Jan 2000 + ~~~~~~~~ + +A fairly big pile of work originally aimed at +removing the Con form of Core expression, and replacing it with simple +Lit form. However, I wanted to make sure that the resulting thing +performed better than the original, so I ended up making an absolute +raft of other changes. + +Removing the Con form of Core expressions +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +The big thing is that + + For every constructor C there are now *two* Ids: + + C is the constructor's *wrapper*. It evaluates and unboxes arguments + before calling $wC. It has a perfectly ordinary top-level defn + in the module defining the data type. + + $wC is the constructor's *worker*. It is like a primop that simply + allocates and builds the constructor value. Its arguments are the + actual representation arguments of the constructor. + + For every primop P there is *one* Id, its (curried) Id + + Neither contructor worker Id nor the primop Id have a defminition anywhere. + Instead they are saturated during the core-to-STG pass, and the code generator + generates code for them directly. The STG language still has saturated + primops and constructor applications. + +* The Const type disappears, along with Const.lhs. The literal part + of Const.lhs reappears as Literal.lhs. Much tidying up in here, + to bring all the range checking into this one module. + +* I got rid of NoRep literals entirely. They just seem to be too much trouble. + +* Because Con's don't exist any more, the funny C { args } syntax + disappears from inteface files. + +* Every constructor, C, comes with a + + *wrapper*, called C, whose type is exactly what it looks like + in the source program. It is an ordinary function, + and it gets a top-level binding like any other function + + *worker*, called $wC, which is the actual data constructor. + Its type may be different to C, because: + - useless dict args are dropped + - strict args may be flattened + It does not have a binding. + + The worker is very like a primop, in that it has no binding, + + +Parsing +~~~~~~~ +* Result type signatures now work + f :: Int -> Int = \x -> x + -- The Int->Int is the type of f + + g x y :: Int = x+y + -- The Int is the type of the result of (g x y) + + +Recompilation checking and make +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +* The .hi file for a modules is not touched if it doesn't change. (It used to + be touched regardless, forcing a chain of recompilations.) The penalty for this + is that we record exported things just as if they were mentioned in the body of + the module. 
And the penalty for that is that we may recompile a module when + the only things that have changed are the things it is passing on without using. + But it seems like a good trade. + +* -recomp is on by default + +Foreign declarations +~~~~~~~~~~~~~~~~~~~~ +* If you say + foreign export zoo :: Int -> IO Int + then you get a C produre called 'zoo', not 'zzoo' as before. + I've also added a check that complains if you export (or import) a C + procedure whose name isn't legal C. + + +Code generation and labels +~~~~~~~~~~~~~~~~~~~~~~~~~~ +* Now that constructor workers and wrappers have distinct names, there's + no need to have a Foo_static_closure and a Foo_closure for constructor Foo. + I nuked the entire StaticClosure story. This has effects in some of + the RTS headers (i.e. s/static_closure/closure/g) + + +Rules, constant folding +~~~~~~~~~~~~~~~~~~~~~~~ +* Constant folding becomes just another rewrite rule, attached to the Id for the + PrimOp. To achieve this, there's a new form of Rule, a BuiltinRule (see CoreSyn.lhs). + The prelude rules are in prelude/PrelRules.lhs, while simplCore/ConFold.lhs has gone. + +* Appending of constant strings now works, using fold/build fusion, plus + the rewrite rule + unpack "foo" c (unpack "baz" c n) = unpack "foobaz" c n + Implemented in PrelRules.lhs + +* The CCall primop is tidied up quite a bit. There is now a data type CCall, + defined in PrimOp, that packages up the info needed for a particular CCall. + There is a new Id for each new ccall, with an big "occurrence name" + {__ccall "foo" gc Int# -> Int#} + In interface files, this is parsed as a single Id, which is what it is, really. + +Miscellaneous +~~~~~~~~~~~~~ +* There were numerous places where the host compiler's + minInt/maxInt was being used as the target machine's minInt/maxInt. + I nuked all of these; everything is localised to inIntRange and inWordRange, + in Literal.lhs + +* Desugaring record updates was broken: it didn't generate correct matches when + used withe records with fancy unboxing etc. It now uses matchWrapper. + +* Significant tidying up in codeGen/SMRep.lhs + +* Add __word, __word64, __int64 terminals to signal the obvious types + in interface files. Add the ability to print word values in hex into + C code. + +* PrimOp.lhs is no longer part of a loop. Remove PrimOp.hi-boot* + + +Types +~~~~~ +* isProductTyCon no longer returns False for recursive products, nor + for unboxed products; you have to test for these separately. + There's no reason not to do CPR for recursive product types, for example. + Ditto splitProductType_maybe. + +Simplification +~~~~~~~~~~~~~~~ +* New -fno-case-of-case flag for the simplifier. We use this in the first run + of the simplifier, where it helps to stop messing up expressions that + the (subsequent) full laziness pass would otherwise find float out. + It's much more effective than previous half-baked hacks in inlining. + + Actually, it turned out that there were three places in Simplify.lhs that + needed to know use this flag. + +* Make the float-in pass push duplicatable bindings into the branches of + a case expression, in the hope that we never have to allocate them. + (see FloatIn.sepBindsByDropPoint) + +* Arrange that top-level bottoming Ids get a NOINLINE pragma + This reduced gratuitous inlining of error messages. + But arrange that such things still get w/w'd. 
+ +* Arrange that a strict argument position is regarded as an 'interesting' + context, so that if we see + foldr k z (g x) + then we'll be inclined to inline g; this can expose a build. + +* There was a missing case in CoreUtils.exprEtaExpandArity that meant + we were missing some obvious cases for eta expansion + Also improve the code when handling applications. + +* Make record selectors (identifiable by their IdFlavour) into "cheap" operations. + [The change is a 2-liner in CoreUtils.exprIsCheap] + This means that record selection may be inlined into function bodies, which + greatly improves the arities of overloaded functions. + +* Make a cleaner job of inlining "lone variables". There was some distributed + cunning, but I've centralised it all now in SimplUtils.analyseCont, which + analyses the context of a call to decide whether it is "interesting". + +* Don't specialise very small functions in Specialise.specDefn + It's better to inline it. Rather like the worker/wrapper case. + +* Be just a little more aggressive when floating out of let rhss. + See comments with Simplify.wantToExpose + A small change with an occasional big effect. + +* Make the inline-size computation think that + case x of I# x -> ... + is *free*. + + +CPR analysis +~~~~~~~~~~~~ +* Fix what was essentially a bug in CPR analysis. Consider + + letrec f x = let g y = let ... in f e1 + in + if ... then (a,b) else g x + + g has the CPR property if f does; so when generating the final annotated + RHS for f, we must use an envt in which f is bound to its final abstract + value. This wasn't happening. Instead, f was given the CPR tag but g + wasn't; but of course the w/w pass gives rotten results in that case!! + (Because f's CPR-ness relied on g's.) + + On they way I tidied up the code in CprAnalyse. It's quite a bit shorter. + + The fact that some data constructors return a constructed product shows + up in their CPR info (MkId.mkDataConId) not in CprAnalyse.lhs + + + +Strictness analysis and worker/wrapper +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +* BIG THING: pass in the demand to StrictAnal.saExpr. This affects situations + like + f (let x = e1 in (x,x)) + where f turns out to have strictness u(SS), say. In this case we can + mark x as demanded, and use a case expression for it. + + The situation before is that we didn't "know" that there is the u(SS) + demand on the argument, so we simply computed that the body of the let + expression is lazy in x, and marked x as lazily-demanded. Then even after + f was w/w'd we got + + let x = e1 in case (x,x) of (a,b) -> $wf a b + + and hence + + let x = e1 in $wf a b + + I found a much more complicated situation in spectral/sphere/Main.shade, + which improved quite a bit with this change. + +* Moved the StrictnessInfo type from IdInfo to Demand. It's the logical + place for it, and helps avoid module loops + +* Do worker/wrapper for coerces even if the arity is zero. Thus: + stdout = coerce Handle (..blurg..) + ==> + wibble = (...blurg...) + stdout = coerce Handle wibble + This is good because I found places where we were saying + case coerce t stdout of { MVar a -> + ... + case coerce t stdout of { MVar b -> + ... + and the redundant case wasn't getting eliminated because of the coerce. + + + +End December +~~~~~~~~~~~~ +* Fix a few renamer bugs + +* Substantially reorganise the Prelude to eliminate all orphan declarations. 
+ Details in PrelBase.lhs + +* Do a much better job of appending literal strings + - remove NoRepStr + - move unpackCString stuff to PrelBase + - add BuiltinRules to the Rule type + - add fold/build rules for literal strings + + + +Week of Mon 25 Oct +~~~~~~~~~~~~~~~~~~ +* Fix a terrible bug in Simplify.mkDupableAlt; we were duplicating a small + *InAlt*, but doing so invalidated occurrence info, which could lead to + substantial code duplication. + +* Fix a bug in WwLib.mkWWcpr; I was generating CPR wrappers like + I# (case x of ...) + which is utterly wrong. It should be + case x of ...(I# r) + (The effect was to make functions stricter than they really are.) + +* Try doing no inlining at all in phase 0. This noticeably improved + spectral/fish (esp Main.hs I think), by improving floating. + This single change has quite a large effect on some programs (allocation) + + Don't inline Don't inline + wrappers anything + in phase 0 in phase 0 + awards 113k -7.08% + cichelli 28962k -3.12% + wave4main 88089k +130.45% + fibheaps 31731k +19.01% + fish 8273k -1.64% + typecheck 148713k +4.91% + + But I found that fish worked much better if we inline *local* things + in phase 0, but not *imported* things. + +* Fix a terrible bug in Simplify.mkLamBndrZapper. It was counting + type args in one place, but not type binders, so it was sometimes + inlining into unsaturated lambdas! + +* I found that there were some very bad loss-of-arity cases in PrelShow. + In particular, we had: + + showl "" = showChar '"' s + showl ('"':xs) = showString "\\\"" . showl xs + showl (x:xs) = showLitChar x . showl xs + + Trouble is, we get + showl = \xs -> case xs of + ... + (x:xs) -> let f = showLitChar x + g = showl xs + in \s -> f (g x) + which is TERRIBLE. We can't spot that showLitChar has arity 2 because + it looks like this: + + ...other eqns... + showLitChar c = showString ('\\' : asciiTab!!ord c) + + notice that the (asciiTab!!orc c) is outside the \s, so GHC can't rewrite it to + + showLitChar c = \s -> showString ('\\' : asciiTab!!ord c) s + + So I've changed PrelShow.showLitChar to use explicit \s. Even then, showl + doesn't work, because GHC can't see that showl xs can be pushed inside the \s. + So I've put an explict \s there too. + + showl "" s = showChar '"' s + showl ('"':xs) s = showString "\\\"" (showl xs s) + showl (x:xs) s = showLitChar x (showl xs s) + + Net result: imaginary/gen_regexps more than halves in allocation! + + Turns out that the mkLamBndrZapper bug (above) meant that showl was + erroneously inlining showLitChar x and showl xs, which is why this + problem hasn't shown up before. + +* Improve CSE a bit. In ptic + case h x of y -> ...(h x)... + replaces (h x) by y. + +* Inline INLINE things very agressively, even though we get code duplication + thereby. Reason: otherwise we sometimes call the original un-inlined INLINE + defns, which have constructors etc still un-inlined in their RHSs. The + improvement is dramatic for a few programs: + + typecheck 150865k -1.43% + wave4main 114216k -22.87% + boyer 28793k -7.86% + cichelli 33786k -14.28% + ida 59505k -1.79% + rewrite 14665k -4.91% + sched 17641k -4.22% + + Code size increases by 10% which is not so good. There must be a better way. + Another bad thing showed up in fish/Main.hs. Here we have + (x1,y1) `vec_add` (x2,y2) = (x1+x2, y1+y2) + which tends to get inlined. But if we first inline (+), it looks big, + so we don't inline it. Sigh. + + +* Don't inline constructors in INLINE RHSs. Ever. Otherwise rules don't match. + E.g. 
build + +* In ebnf2ps/Lexer.uncommentString, it would be a good idea to inline a constructor + that occurs once in each branch of a case. That way it doesn't get allocated + in the branches that don't use it. And in fact in this particular case + something else good happens. So CoreUnfold now does that. + +* Reverted to n_val_binders+2 in calcUnfoldingGuidance + Otherwise wrappers are inlined even if there's no benefit. + + +Week of Mon 18 Oct +~~~~~~~~~~ +* Arrange that simplConArgs works in one less pass than before. + This exposed a bug: a bogus call to completeBeta. + +* Add a top-level flag in CoreUnfolding, used in callSiteInline + +* Extend w/w to use etaExpandArity, so it does eta/coerce expansion + +* Don't float anything out of an INLINE. + Don't float things to top level unless they also escape a value lambda. + [see comments with SetLevels.lvlMFE + Without at least one of these changes, I found that + {-# INLINE concat #-} + concat = __inline (/\a -> foldr (++) []) + was getting floated to + concat = __inline( /\a -> lvl a ) + lvl = ...inlined version of foldr... + + Subsequently I found that not floating constants out of an INLINE + gave really bad code like + __inline (let x = e in \y -> ...) + so I now let things float out of INLINE + +* Implement inline phases. The meaning of the inline pragmas is + described in CoreUnfold.lhs + +* Implement the "reverse-mapping" idea for CSE; actually it turned out to be easier + to implement it in SetLevels, and may benefit full laziness too. + +Thurs 14 Oct +~~~~~~~~~~~~ +* It's a good idea to inline inRange. Consider + + index (l,h) i = case inRange (l,h) i of + True -> l+i + False -> error + inRange itself isn't strict in h, but if it't inlined then 'index' + *does* become strict in h. Interesting! + +* Big change to the way unfoldings and occurrence info is propagated in the simplifier + The plan is described in Subst.lhs with the Subst type + Occurrence info is now in a separate IdInfo field than user pragmas + +* I found that + (coerce T (coerce S (\x.e))) y + didn't simplify in one round. First we get to + (\x.e) y + and only then do the beta. Solution: cancel the coerces in the continuation + +* Amazingly, CoreUnfold wasn't counting the cost of a function an application. + +Early Oct +~~~~~~~~~ +* No commas between for-alls in RULES + +* Disable rules in initial simplifier run. Otherwise full laziness + doesn't get a chance to lift out a MFE before a rule (e.g. fusion) + zaps it. queens is a case in point + +* Improve float-out stuff significantly. The big change is that if we have + + \x -> ... /\a -> ...let p = ..a.. in let q = ...p... + + where p's rhs doesn't x, we abstract a from p, so that we can get p past x. + (We did that before.) But we also substitute (p a) for p in q, and then + we can do the same thing for q. (We didn't do that, so q got stuck.) + This is much better. It involves doing a substitution "as we go" in SetLevels, + though. + + +Weds 15 Sept +~~~~~~~~~~~~ +* exprIsDupable for an application (f e1 .. en) wasn't calling exprIsDupable + on the arguments!! So applications with few, but large, args were being dupliated. + +* sizeExpr on an application wasn't doing a nukeScrutDiscount on the arg of + an application!! So bogus discounts could accumulate from arguments! + +* Improve handling of INLINE pragmas in calcUnfoldingGuidance. It was really + wrong before + +* Substantially improve handling of coerces in worker/wrapper + +Tuesday 6 June +~~~~~~~~~~~~~~ +* Fix Kevin Atkinson's cant-find-instance bug. 
Turns out that Rename.slurpSourceRefs + needs to repeatedly call getImportedInstDecls, and then go back to slurping + source-refs. Comments with Rename.slurpSourceRefs. + +* Add a case to Simplify.mkDupableAlt for the quite-common case where there's + a very simple alternative, in which case there's no point in creating a + join-point binding. + +* Fix CoreUtils.exprOkForSpeculation so that it returns True of (==# a# b#). + This lack meant that + case ==# a# b# of { True -> x; False -> x } + was not simplifying + +* Make float-out dump bindings at the top of a function argument, as + at the top of a let(rec) rhs. See notes with FloatOut.floatRhs + +* Make the ArgOf case of mkDupableAlt generate a OneShot lambda. + This gave a noticeable boost to spectral/boyer2 + + +Monday 5 June +~~~~~~~~~~~~~ +Work, using IO.hPutStr as an example, to reduce the number of coerces. +The main idea is in WwLib.mkWWcoerce. The gloss is that we must do +the w/w split even for small non-recursive things. See notes with +WorkWrap.tryWw. + + +Friday 2 June +~~~~~~~~~~~~~ +Study why gen_regexps is slower than before. Problem is in IO.writeLines, +in particular the local defn shoveString. Two things are getting +in the way of arity expansion, which means we build far more function +closures than we should: + shove = \ x -> let lvl = \s -> ... + in \s -> ... lvl ... + +The two things are: + a) coerces + b) full laziness floats + + +Solution to (a): add coerces to the worker/wrapper stuff. +See notes with WwLib.mkWWcoerce. + +This further complicated getWorkerId, so I finally bit the bullet and +make the workerInfo field of the IdInfo work properly, including +under substitutions. Death to getWorkerId. + + + +Solution to (b): make all lambdas over realWorldStatePrimTy +into one-shot lambdas. This is a GROSS HACK. + +* Also make the occurrence analyser aware of one-shot lambdas. + + +Thurs 1 June +~~~~~~~~~~~~ +Fix SetLevels so that it does not clone top-level bindings, but it +*does* clone bindings that are destined for the top level. + +The global invariant is that the top level bindings are always +unique, and never cloned. 
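+
+As a small source-level illustration of why that invariant matters (an
+illustrative example, not code from the compiler):
+
+    f = \x -> let n = length [1,2,3] in n + x
+    g = \y -> let n = length [4,5]   in n * y
+
+Both local bindings for n are constant, so SetLevels may float them towards
+the top level; on the way up they must be cloned (to n1 and n2, say) or they
+would clash.  Bindings that are already at top level, by contrast, are
+guaranteed unique and are never renamed.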
diff --git a/compiler/basicTypes/BasicTypes.lhs b/compiler/basicTypes/BasicTypes.lhs new file mode 100644 index 0000000000..6b662bd6a6 --- /dev/null +++ b/compiler/basicTypes/BasicTypes.lhs @@ -0,0 +1,508 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1997-1998 +% +\section[BasicTypes]{Miscellanous types} + +This module defines a miscellaneously collection of very simple +types that + +\begin{itemize} +\item have no other obvious home +\item don't depend on any other complicated types +\item are used in more than one "part" of the compiler +\end{itemize} + +\begin{code} +module BasicTypes( + Version, bumpVersion, initialVersion, + + Arity, + + DeprecTxt, + + Fixity(..), FixityDirection(..), + defaultFixity, maxPrecedence, + negateFixity, funTyFixity, + compareFixity, + + IPName(..), ipNameName, mapIPName, + + RecFlag(..), isRec, isNonRec, boolToRecFlag, + + TopLevelFlag(..), isTopLevel, isNotTopLevel, + + Boxity(..), isBoxed, + + TupCon(..), tupleParens, + + OccInfo(..), seqOccInfo, isFragileOcc, isOneOcc, + isDeadOcc, isLoopBreaker, isNoOcc, + + InsideLam, insideLam, notInsideLam, + OneBranch, oneBranch, notOneBranch, + InterestingCxt, + + EP(..), + + StrictnessMark(..), isMarkedUnboxed, isMarkedStrict, + + CompilerPhase, + Activation(..), isActive, isNeverActive, isAlwaysActive, + InlineSpec(..), defaultInlineSpec, alwaysInlineSpec, neverInlineSpec, + + SuccessFlag(..), succeeded, failed, successIf + ) where + +#include "HsVersions.h" + +import FastString( FastString ) +import Outputable +\end{code} + +%************************************************************************ +%* * +\subsection[Arity]{Arity} +%* * +%************************************************************************ + +\begin{code} +type Arity = Int +\end{code} + + +%************************************************************************ +%* * +\subsection[Version]{Module and identifier version numbers} +%* * +%************************************************************************ + +\begin{code} +type Version = Int + +bumpVersion :: Version -> Version +bumpVersion v = v+1 + +initialVersion :: Version +initialVersion = 1 +\end{code} + +%************************************************************************ +%* * + Deprecations +%* * +%************************************************************************ + + +\begin{code} +type DeprecTxt = FastString -- reason/explanation for deprecation +\end{code} + +%************************************************************************ +%* * +\subsection{Implicit parameter identity} +%* * +%************************************************************************ + +The @IPName@ type is here because it is used in TypeRep (i.e. very +early in the hierarchy), but also in HsSyn. + +\begin{code} +data IPName name + = Dupable name -- ?x: you can freely duplicate this implicit parameter + | Linear name -- %x: you must use the splitting function to duplicate it + deriving( Eq, Ord ) -- Ord is used in the IP name cache finite map + -- (used in HscTypes.OrigIParamCache) + + +ipNameName :: IPName name -> name +ipNameName (Dupable n) = n +ipNameName (Linear n) = n + +mapIPName :: (a->b) -> IPName a -> IPName b +mapIPName f (Dupable n) = Dupable (f n) +mapIPName f (Linear n) = Linear (f n) + +instance Outputable name => Outputable (IPName name) where + ppr (Dupable n) = char '?' 
<> ppr n -- Ordinary implicit parameters + ppr (Linear n) = char '%' <> ppr n -- Splittable implicit parameters +\end{code} + + +%************************************************************************ +%* * +\subsection[Fixity]{Fixity info} +%* * +%************************************************************************ + +\begin{code} +------------------------ +data Fixity = Fixity Int FixityDirection + +instance Outputable Fixity where + ppr (Fixity prec dir) = hcat [ppr dir, space, int prec] + +instance Eq Fixity where -- Used to determine if two fixities conflict + (Fixity p1 dir1) == (Fixity p2 dir2) = p1==p2 && dir1 == dir2 + +------------------------ +data FixityDirection = InfixL | InfixR | InfixN + deriving(Eq) + +instance Outputable FixityDirection where + ppr InfixL = ptext SLIT("infixl") + ppr InfixR = ptext SLIT("infixr") + ppr InfixN = ptext SLIT("infix") + +------------------------ +maxPrecedence = (9::Int) +defaultFixity = Fixity maxPrecedence InfixL + +negateFixity, funTyFixity :: Fixity +-- Wired-in fixities +negateFixity = Fixity 6 InfixL -- Fixity of unary negate +funTyFixity = Fixity 0 InfixR -- Fixity of '->' +\end{code} + +Consider + +\begin{verbatim} + a `op1` b `op2` c +\end{verbatim} +@(compareFixity op1 op2)@ tells which way to arrange appication, or +whether there's an error. + +\begin{code} +compareFixity :: Fixity -> Fixity + -> (Bool, -- Error please + Bool) -- Associate to the right: a op1 (b op2 c) +compareFixity (Fixity prec1 dir1) (Fixity prec2 dir2) + = case prec1 `compare` prec2 of + GT -> left + LT -> right + EQ -> case (dir1, dir2) of + (InfixR, InfixR) -> right + (InfixL, InfixL) -> left + _ -> error_please + where + right = (False, True) + left = (False, False) + error_please = (True, False) +\end{code} + + +%************************************************************************ +%* * +\subsection[Top-level/local]{Top-level/not-top level flag} +%* * +%************************************************************************ + +\begin{code} +data TopLevelFlag + = TopLevel + | NotTopLevel + +isTopLevel, isNotTopLevel :: TopLevelFlag -> Bool + +isNotTopLevel NotTopLevel = True +isNotTopLevel TopLevel = False + +isTopLevel TopLevel = True +isTopLevel NotTopLevel = False + +instance Outputable TopLevelFlag where + ppr TopLevel = ptext SLIT("<TopLevel>") + ppr NotTopLevel = ptext SLIT("<NotTopLevel>") +\end{code} + + +%************************************************************************ +%* * +\subsection[Top-level/local]{Top-level/not-top level flag} +%* * +%************************************************************************ + +\begin{code} +data Boxity + = Boxed + | Unboxed + deriving( Eq ) + +isBoxed :: Boxity -> Bool +isBoxed Boxed = True +isBoxed Unboxed = False +\end{code} + + +%************************************************************************ +%* * +\subsection[Recursive/Non-Recursive]{Recursive/Non-Recursive flag} +%* * +%************************************************************************ + +\begin{code} +data RecFlag = Recursive + | NonRecursive + deriving( Eq ) + +isRec :: RecFlag -> Bool +isRec Recursive = True +isRec NonRecursive = False + +isNonRec :: RecFlag -> Bool +isNonRec Recursive = False +isNonRec NonRecursive = True + +boolToRecFlag :: Bool -> RecFlag +boolToRecFlag True = Recursive +boolToRecFlag False = NonRecursive + +instance Outputable RecFlag where + ppr Recursive = ptext SLIT("Recursive") + ppr NonRecursive = ptext SLIT("NonRecursive") +\end{code} + 
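+Returning to @compareFixity@ above, a few worked calls may help (these are
+illustrative expressions, not definitions in this module):
+
+\begin{verbatim}
+  compareFixity (Fixity 6 InfixL) (Fixity 6 InfixL) == (False, False)
+        -- e.g.  a - b - c   parses as  (a - b) - c
+  compareFixity (Fixity 5 InfixR) (Fixity 5 InfixR) == (False, True)
+        -- e.g.  x : y : zs  parses as  x : (y : zs)
+  compareFixity (Fixity 4 InfixN) (Fixity 4 InfixN) == (True, False)
+        -- e.g.  a == b == c  is reported as an error
+\end{verbatim}
+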
+%************************************************************************ +%* * + Tuples +%* * +%************************************************************************ + +\begin{code} +data TupCon = TupCon Boxity Arity + +instance Eq TupCon where + (TupCon b1 a1) == (TupCon b2 a2) = b1==b2 && a1==a2 + +tupleParens :: Boxity -> SDoc -> SDoc +tupleParens Boxed p = parens p +tupleParens Unboxed p = ptext SLIT("(#") <+> p <+> ptext SLIT("#)") +\end{code} + +%************************************************************************ +%* * +\subsection[Generic]{Generic flag} +%* * +%************************************************************************ + +This is the "Embedding-Projection pair" datatype, it contains +two pieces of code (normally either RenamedExpr's or Id's) +If we have a such a pair (EP from to), the idea is that 'from' and 'to' +represents functions of type + + from :: T -> Tring + to :: Tring -> T + +And we should have + + to (from x) = x + +T and Tring are arbitrary, but typically T is the 'main' type while +Tring is the 'representation' type. (This just helps us remember +whether to use 'from' or 'to'. + +\begin{code} +data EP a = EP { fromEP :: a, -- :: T -> Tring + toEP :: a } -- :: Tring -> T +\end{code} + +Embedding-projection pairs are used in several places: + +First of all, each type constructor has an EP associated with it, the +code in EP converts (datatype T) from T to Tring and back again. + +Secondly, when we are filling in Generic methods (in the typechecker, +tcMethodBinds), we are constructing bimaps by induction on the structure +of the type of the method signature. + + +%************************************************************************ +%* * +\subsection{Occurrence information} +%* * +%************************************************************************ + +This data type is used exclusively by the simplifier, but it appears in a +SubstResult, which is currently defined in VarEnv, which is pretty near +the base of the module hierarchy. So it seemed simpler to put the +defn of OccInfo here, safely at the bottom + +\begin{code} +data OccInfo + = NoOccInfo + + | IAmDead -- Marks unused variables. Sometimes useful for + -- lambda and case-bound variables. + + | OneOcc !InsideLam + !OneBranch + !InterestingCxt + + | IAmALoopBreaker -- Used by the occurrence analyser to mark loop-breakers + -- in a group of recursive definitions + +isNoOcc :: OccInfo -> Bool +isNoOcc NoOccInfo = True +isNoOcc other = False + +seqOccInfo :: OccInfo -> () +seqOccInfo occ = occ `seq` () + +----------------- +type InterestingCxt = Bool -- True <=> Function: is applied + -- Data value: scrutinised by a case with + -- at least one non-DEFAULT branch + +----------------- +type InsideLam = Bool -- True <=> Occurs inside a non-linear lambda + -- Substituting a redex for this occurrence is + -- dangerous because it might duplicate work. 
+insideLam = True +notInsideLam = False + +----------------- +type OneBranch = Bool -- True <=> Occurs in only one case branch + -- so no code-duplication issue to worry about +oneBranch = True +notOneBranch = False + +isLoopBreaker :: OccInfo -> Bool +isLoopBreaker IAmALoopBreaker = True +isLoopBreaker other = False + +isDeadOcc :: OccInfo -> Bool +isDeadOcc IAmDead = True +isDeadOcc other = False + +isOneOcc (OneOcc _ _ _) = True +isOneOcc other = False + +isFragileOcc :: OccInfo -> Bool +isFragileOcc (OneOcc _ _ _) = True +isFragileOcc other = False +\end{code} + +\begin{code} +instance Outputable OccInfo where + -- only used for debugging; never parsed. KSW 1999-07 + ppr NoOccInfo = empty + ppr IAmALoopBreaker = ptext SLIT("LoopBreaker") + ppr IAmDead = ptext SLIT("Dead") + ppr (OneOcc inside_lam one_branch int_cxt) + = ptext SLIT("Once") <> pp_lam <> pp_br <> pp_args + where + pp_lam | inside_lam = char 'L' + | otherwise = empty + pp_br | one_branch = empty + | otherwise = char '*' + pp_args | int_cxt = char '!' + | otherwise = empty + +instance Show OccInfo where + showsPrec p occ = showsPrecSDoc p (ppr occ) +\end{code} + +%************************************************************************ +%* * +\subsection{Strictness indication} +%* * +%************************************************************************ + +The strictness annotations on types in data type declarations +e.g. data T = MkT !Int !(Bool,Bool) + +\begin{code} +data StrictnessMark -- Used in interface decls only + = MarkedStrict + | MarkedUnboxed + | NotMarkedStrict + deriving( Eq ) + +isMarkedUnboxed MarkedUnboxed = True +isMarkedUnboxed other = False + +isMarkedStrict NotMarkedStrict = False +isMarkedStrict other = True -- All others are strict + +instance Outputable StrictnessMark where + ppr MarkedStrict = ptext SLIT("!") + ppr MarkedUnboxed = ptext SLIT("!!") + ppr NotMarkedStrict = ptext SLIT("_") +\end{code} + + +%************************************************************************ +%* * +\subsection{Success flag} +%* * +%************************************************************************ + +\begin{code} +data SuccessFlag = Succeeded | Failed + +successIf :: Bool -> SuccessFlag +successIf True = Succeeded +successIf False = Failed + +succeeded, failed :: SuccessFlag -> Bool +succeeded Succeeded = True +succeeded Failed = False + +failed Succeeded = False +failed Failed = True +\end{code} + + +%************************************************************************ +%* * +\subsection{Activation} +%* * +%************************************************************************ + +When a rule or inlining is active + +\begin{code} +type CompilerPhase = Int -- Compilation phase + -- Phases decrease towards zero + -- Zero is the last phase + +data Activation = NeverActive + | AlwaysActive + | ActiveBefore CompilerPhase -- Active only *before* this phase + | ActiveAfter CompilerPhase -- Active in this phase and later + deriving( Eq ) -- Eq used in comparing rules in HsDecls + +data InlineSpec + = Inline + Activation -- Says during which phases inlining is allowed + Bool -- True <=> make the RHS look small, so that when inlining + -- is enabled, it will definitely actually happen + deriving( Eq ) + +defaultInlineSpec = Inline AlwaysActive False -- Inlining is OK, but not forced +alwaysInlineSpec = Inline AlwaysActive True -- INLINE always +neverInlineSpec = Inline NeverActive False -- NOINLINE + +instance Outputable Activation where + ppr AlwaysActive = empty -- The default + ppr (ActiveBefore n) = 
brackets (char '~' <> int n) + ppr (ActiveAfter n) = brackets (int n) + ppr NeverActive = ptext SLIT("NEVER") + +instance Outputable InlineSpec where + ppr (Inline act True) = ptext SLIT("INLINE") <> ppr act + ppr (Inline act False) = ptext SLIT("NOINLINE") <> ppr act + +isActive :: CompilerPhase -> Activation -> Bool +isActive p NeverActive = False +isActive p AlwaysActive = True +isActive p (ActiveAfter n) = p <= n +isActive p (ActiveBefore n) = p > n + +isNeverActive, isAlwaysActive :: Activation -> Bool +isNeverActive NeverActive = True +isNeverActive act = False + +isAlwaysActive AlwaysActive = True +isAlwaysActive other = False +\end{code} + diff --git a/compiler/basicTypes/DataCon.hi-boot-5 b/compiler/basicTypes/DataCon.hi-boot-5 new file mode 100644 index 0000000000..f5a8a2d6a8 --- /dev/null +++ b/compiler/basicTypes/DataCon.hi-boot-5 @@ -0,0 +1,5 @@ +__interface DataCon 1 0 where +__export DataCon DataCon isExistentialDataCon dataConName ; +1 data DataCon ; +1 isExistentialDataCon :: DataCon -> PrelBase.Bool ; +1 dataConName :: DataCon -> Name.Name ; diff --git a/compiler/basicTypes/DataCon.hi-boot-6 b/compiler/basicTypes/DataCon.hi-boot-6 new file mode 100644 index 0000000000..7882469bce --- /dev/null +++ b/compiler/basicTypes/DataCon.hi-boot-6 @@ -0,0 +1,5 @@ +module DataCon where + +data DataCon +dataConName :: DataCon -> Name.Name +isVanillaDataCon :: DataCon -> GHC.Base.Bool diff --git a/compiler/basicTypes/DataCon.lhs b/compiler/basicTypes/DataCon.lhs new file mode 100644 index 0000000000..805ef73c59 --- /dev/null +++ b/compiler/basicTypes/DataCon.lhs @@ -0,0 +1,632 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1998 +% +\section[DataCon]{@DataCon@: Data Constructors} + +\begin{code} +module DataCon ( + DataCon, DataConIds(..), + ConTag, fIRST_TAG, + mkDataCon, + dataConRepType, dataConSig, dataConName, dataConTag, dataConTyCon, + dataConTyVars, dataConResTys, + dataConStupidTheta, + dataConInstArgTys, dataConOrigArgTys, dataConInstResTy, + dataConInstOrigArgTys, dataConRepArgTys, + dataConFieldLabels, dataConFieldType, + dataConStrictMarks, dataConExStricts, + dataConSourceArity, dataConRepArity, + dataConIsInfix, + dataConWorkId, dataConWrapId, dataConWrapId_maybe, dataConImplicitIds, + dataConRepStrictness, + isNullarySrcDataCon, isNullaryRepDataCon, isTupleCon, isUnboxedTupleCon, + isVanillaDataCon, classDataCon, + + splitProductType_maybe, splitProductType, + ) where + +#include "HsVersions.h" + +import Type ( Type, ThetaType, substTyWith, substTy, zipOpenTvSubst, + mkForAllTys, mkFunTys, mkTyConApp, + splitTyConApp_maybe, + mkPredTys, isStrictPred, pprType + ) +import TyCon ( TyCon, FieldLabel, tyConDataCons, + isProductTyCon, isTupleTyCon, isUnboxedTupleTyCon ) +import Class ( Class, classTyCon ) +import Name ( Name, NamedThing(..), nameUnique ) +import Var ( TyVar, Id ) +import BasicTypes ( Arity, StrictnessMark(..) ) +import Outputable +import Unique ( Unique, Uniquable(..) ) +import ListSetOps ( assoc ) +import Util ( zipEqual, zipWithEqual ) +import Maybes ( expectJust ) +\end{code} + + +Data constructor representation +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Consider the following Haskell data type declaration + + data T = T !Int ![Int] + +Using the strictness annotations, GHC will represent this as + + data T = T Int# [Int] + +That is, the Int has been unboxed. 
Furthermore, the Haskell source construction + + T e1 e2 + +is translated to + + case e1 of { I# x -> + case e2 of { r -> + T x r }} + +That is, the first argument is unboxed, and the second is evaluated. Finally, +pattern matching is translated too: + + case e of { T a b -> ... } + +becomes + + case e of { T a' b -> let a = I# a' in ... } + +To keep ourselves sane, we name the different versions of the data constructor +differently, as follows. + + +Note [Data Constructor Naming] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Each data constructor C has two, and possibly three, Names associated with it: + + OccName Name space Used for + --------------------------------------------------------------------------- + * The "source data con" C DataName The DataCon itself + * The "real data con" C VarName Its worker Id + * The "wrapper data con" $WC VarName Wrapper Id (optional) + +Each of these three has a distinct Unique. The "source data con" name +appears in the output of the renamer, and names the Haskell-source +data constructor. The type checker translates it into either the wrapper Id +(if it exists) or worker Id (otherwise). + +The data con has one or two Ids associated with it: + + The "worker Id", is the actual data constructor. + Its type may be different to the Haskell source constructor + because: + - useless dict args are dropped + - strict args may be flattened + The worker is very like a primop, in that it has no binding. + + Newtypes have no worker Id + + + The "wrapper Id", $WC, whose type is exactly what it looks like + in the source program. It is an ordinary function, + and it gets a top-level binding like any other function. + + The wrapper Id isn't generated for a data type if the worker + and wrapper are identical. It's always generated for a newtype. + + + +A note about the stupid context +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Data types can have a context: + + data (Eq a, Ord b) => T a b = T1 a b | T2 a + +and that makes the constructors have a context too +(notice that T2's context is "thinned"): + + T1 :: (Eq a, Ord b) => a -> b -> T a b + T2 :: (Eq a) => a -> T a b + +Furthermore, this context pops up when pattern matching +(though GHC hasn't implemented this, but it is in H98, and +I've fixed GHC so that it now does): + + f (T2 x) = x +gets inferred type + f :: Eq a => T a b -> a + +I say the context is "stupid" because the dictionaries passed +are immediately discarded -- they do nothing and have no benefit. +It's a flaw in the language. + + Up to now [March 2002] I have put this stupid context into the + type of the "wrapper" constructors functions, T1 and T2, but + that turned out to be jolly inconvenient for generics, and + record update, and other functions that build values of type T + (because they don't have suitable dictionaries available). + + So now I've taken the stupid context out. I simply deal with + it separately in the type checker on occurrences of a + constructor, either in an expression or in a pattern. + + [May 2003: actually I think this decision could evasily be + reversed now, and probably should be. Generics could be + disabled for types with a stupid context; record updates now + (H98) needs the context too; etc. It's an unforced change, so + I'm leaving it for now --- but it does seem odd that the + wrapper doesn't include the stupid context.] + +[July 04] With the advent of generalised data types, it's less obvious +what the "stupid context" is. Consider + C :: forall a. 
Ord a => a -> a -> T (Foo a) +Does the C constructor in Core contain the Ord dictionary? Yes, it must: + + f :: T b -> Ordering + f = /\b. \x:T b. + case x of + C a (d:Ord a) (p:a) (q:a) -> compare d p q + +Note that (Foo a) might not be an instance of Ord. + +%************************************************************************ +%* * +\subsection{Data constructors} +%* * +%************************************************************************ + +\begin{code} +data DataCon + = MkData { + dcName :: Name, -- This is the name of the *source data con* + -- (see "Note [Data Constructor Naming]" above) + dcUnique :: Unique, -- Cached from Name + dcTag :: ConTag, + + -- Running example: + -- + -- data Eq a => T a = forall b. Ord b => MkT a [b] + + -- The next six fields express the type of the constructor, in pieces + -- e.g. + -- + -- dcTyVars = [a,b] + -- dcStupidTheta = [Eq a] + -- dcTheta = [Ord b] + -- dcOrigArgTys = [a,List b] + -- dcTyCon = T + -- dcTyArgs = [a,b] + + dcVanilla :: Bool, -- True <=> This is a vanilla Haskell 98 data constructor + -- Its type is of form + -- forall a1..an . t1 -> ... tm -> T a1..an + -- No existentials, no GADTs, nothing. + -- + -- NB1: the order of the forall'd variables does matter; + -- for a vanilla constructor, we assume that if the result + -- type is (T t1 ... tn) then we can instantiate the constr + -- at types [t1, ..., tn] + -- + -- NB2: a vanilla constructor can still be declared in GADT-style + -- syntax, provided its type looks like the above. + + dcTyVars :: [TyVar], -- Universally-quantified type vars + -- for the data constructor. + -- See NB1 on dcVanilla for the conneciton between dcTyVars and dcResTys + -- + -- In general, the dcTyVars are NOT NECESSARILY THE SAME AS THE TYVARS + -- FOR THE PARENT TyCon. With GADTs the data con might not even have + -- the same number of type variables. + -- [This is a change (Oct05): previously, vanilla datacons guaranteed to + -- have the same type variables as their parent TyCon, but that seems ugly.] + + dcStupidTheta :: ThetaType, -- This is a "thinned" version of + -- the context of the data decl. + -- "Thinned", because the Report says + -- to eliminate any constraints that don't mention + -- tyvars free in the arg types for this constructor + -- + -- "Stupid", because the dictionaries aren't used for anything. + -- + -- Indeed, [as of March 02] they are no + -- longer in the type of the wrapper Id, because + -- that makes it harder to use the wrap-id to rebuild + -- values after record selection or in generics. + -- + -- Fact: the free tyvars of dcStupidTheta are a subset of + -- the free tyvars of dcResTys + -- Reason: dcStupidTeta is gotten by instantiating the + -- stupid theta from the tycon (see BuildTyCl.mkDataConStupidTheta) + + dcTheta :: ThetaType, -- The existentially quantified stuff + + dcOrigArgTys :: [Type], -- Original argument types + -- (before unboxing and flattening of + -- strict fields) + + -- Result type of constructor is T t1..tn + dcTyCon :: TyCon, -- Result tycon, T + dcResTys :: [Type], -- Result type args, t1..tn + + -- Now the strictness annotations and field labels of the constructor + dcStrictMarks :: [StrictnessMark], + -- Strictness annotations as decided by the compiler. + -- Does *not* include the existential dictionaries + -- length = dataConSourceArity dataCon + + dcFields :: [FieldLabel], + -- Field labels for this constructor, in the + -- same order as the argument types; + -- length = 0 (if not a record) or dataConSourceArity. 
+ + -- Constructor representation + dcRepArgTys :: [Type], -- Final, representation argument types, + -- after unboxing and flattening, + -- and *including* existential dictionaries + + dcRepStrictness :: [StrictnessMark], -- One for each *representation* argument + + dcRepType :: Type, -- Type of the constructor + -- forall a b . Ord b => a -> [b] -> MkT a + -- (this is *not* of the constructor wrapper Id: + -- see notes after this data type declaration) + -- + -- Notice that the existential type parameters come *second*. + -- Reason: in a case expression we may find: + -- case (e :: T t) of { MkT b (d:Ord b) (x:t) (xs:[b]) -> ... } + -- It's convenient to apply the rep-type of MkT to 't', to get + -- forall b. Ord b => ... + -- and use that to check the pattern. Mind you, this is really only + -- use in CoreLint. + + + -- Finally, the curried worker function that corresponds to the constructor + -- It doesn't have an unfolding; the code generator saturates these Ids + -- and allocates a real constructor when it finds one. + -- + -- An entirely separate wrapper function is built in TcTyDecls + dcIds :: DataConIds, + + dcInfix :: Bool -- True <=> declared infix + -- Used for Template Haskell and 'deriving' only + -- The actual fixity is stored elsewhere + } + +data DataConIds + = NewDC Id -- Newtypes have only a wrapper, but no worker + | AlgDC (Maybe Id) Id -- Algebraic data types always have a worker, and + -- may or may not have a wrapper, depending on whether + -- the wrapper does anything. + + -- _Neither_ the worker _nor_ the wrapper take the dcStupidTheta dicts as arguments + + -- The wrapper takes dcOrigArgTys as its arguments + -- The worker takes dcRepArgTys as its arguments + -- If the worker is absent, dcRepArgTys is the same as dcOrigArgTys + + -- The 'Nothing' case of AlgDC is important + -- Not only is this efficient, + -- but it also ensures that the wrapper is replaced + -- by the worker (becuase it *is* the wroker) + -- even when there are no args. E.g. in + -- f (:) x + -- the (:) *is* the worker. + -- This is really important in rule matching, + -- (We could match on the wrappers, + -- but that makes it less likely that rules will match + -- when we bring bits of unfoldings together.) + +type ConTag = Int + +fIRST_TAG :: ConTag +fIRST_TAG = 1 -- Tags allocated from here for real constructors +\end{code} + +The dcRepType field contains the type of the representation of a contructor +This may differ from the type of the contructor *Id* (built +by MkId.mkDataConId) for two reasons: + a) the constructor Id may be overloaded, but the dictionary isn't stored + e.g. data Eq a => T a = MkT a a + + b) the constructor may store an unboxed version of a strict field. + +Here's an example illustrating both: + data Ord a => T a = MkT Int! a +Here + T :: Ord a => Int -> a -> T a +but the rep type is + Trep :: Int# -> a -> T a +Actually, the unboxed part isn't implemented yet! 
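+
+To summarise the two views, using the running example from the comments above
+(informal notation, not checked code):
+
+    data Eq a => T a = forall b. Ord b => MkT a [b]
+
+    dataConOrigArgTys MkT  =  [a, [b]]          -- wrapper's view: source argument types
+    dataConRepArgTys  MkT  =  [Ord b, a, [b]]   -- worker's view: existential dict included
+    dataConRepType    MkT  =  forall a b. Ord b => a -> [b] -> T a
+
+Note that the stupid (Eq a) context appears in neither list, for the reasons
+explained above.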
+ + +%************************************************************************ +%* * +\subsection{Instances} +%* * +%************************************************************************ + +\begin{code} +instance Eq DataCon where + a == b = getUnique a == getUnique b + a /= b = getUnique a /= getUnique b + +instance Ord DataCon where + a <= b = getUnique a <= getUnique b + a < b = getUnique a < getUnique b + a >= b = getUnique a >= getUnique b + a > b = getUnique a > getUnique b + compare a b = getUnique a `compare` getUnique b + +instance Uniquable DataCon where + getUnique = dcUnique + +instance NamedThing DataCon where + getName = dcName + +instance Outputable DataCon where + ppr con = ppr (dataConName con) + +instance Show DataCon where + showsPrec p con = showsPrecSDoc p (ppr con) +\end{code} + + +%************************************************************************ +%* * +\subsection{Construction} +%* * +%************************************************************************ + +\begin{code} +mkDataCon :: Name + -> Bool -- Declared infix + -> Bool -- Vanilla (see notes with dcVanilla) + -> [StrictnessMark] -> [FieldLabel] + -> [TyVar] -> ThetaType -> ThetaType + -> [Type] -> TyCon -> [Type] + -> DataConIds + -> DataCon + -- Can get the tag from the TyCon + +mkDataCon name declared_infix vanilla + arg_stricts -- Must match orig_arg_tys 1-1 + fields + tyvars stupid_theta theta orig_arg_tys tycon res_tys + ids + = con + where + con = MkData {dcName = name, + dcUnique = nameUnique name, dcVanilla = vanilla, + dcTyVars = tyvars, dcStupidTheta = stupid_theta, dcTheta = theta, + dcOrigArgTys = orig_arg_tys, dcTyCon = tycon, dcResTys = res_tys, + dcRepArgTys = rep_arg_tys, + dcStrictMarks = arg_stricts, dcRepStrictness = rep_arg_stricts, + dcFields = fields, dcTag = tag, dcRepType = ty, + dcIds = ids, dcInfix = declared_infix} + + -- Strictness marks for source-args + -- *after unboxing choices*, + -- but *including existential dictionaries* + -- + -- The 'arg_stricts' passed to mkDataCon are simply those for the + -- source-language arguments. We add extra ones for the + -- dictionary arguments right here. 
+ dict_tys = mkPredTys theta + real_arg_tys = dict_tys ++ orig_arg_tys + real_stricts = map mk_dict_strict_mark theta ++ arg_stricts + + -- Representation arguments and demands + (rep_arg_stricts, rep_arg_tys) = computeRep real_stricts real_arg_tys + + tag = assoc "mkDataCon" (tyConDataCons tycon `zip` [fIRST_TAG..]) con + ty = mkForAllTys tyvars (mkFunTys rep_arg_tys result_ty) + -- NB: the existential dict args are already in rep_arg_tys + + result_ty = mkTyConApp tycon res_tys + +mk_dict_strict_mark pred | isStrictPred pred = MarkedStrict + | otherwise = NotMarkedStrict +\end{code} + +\begin{code} +dataConName :: DataCon -> Name +dataConName = dcName + +dataConTag :: DataCon -> ConTag +dataConTag = dcTag + +dataConTyCon :: DataCon -> TyCon +dataConTyCon = dcTyCon + +dataConRepType :: DataCon -> Type +dataConRepType = dcRepType + +dataConIsInfix :: DataCon -> Bool +dataConIsInfix = dcInfix + +dataConTyVars :: DataCon -> [TyVar] +dataConTyVars = dcTyVars + +dataConWorkId :: DataCon -> Id +dataConWorkId dc = case dcIds dc of + AlgDC _ wrk_id -> wrk_id + NewDC _ -> pprPanic "dataConWorkId" (ppr dc) + +dataConWrapId_maybe :: DataCon -> Maybe Id +dataConWrapId_maybe dc = case dcIds dc of + AlgDC mb_wrap _ -> mb_wrap + NewDC wrap -> Just wrap + +dataConWrapId :: DataCon -> Id +-- Returns an Id which looks like the Haskell-source constructor +dataConWrapId dc = case dcIds dc of + AlgDC (Just wrap) _ -> wrap + AlgDC Nothing wrk -> wrk -- worker=wrapper + NewDC wrap -> wrap + +dataConImplicitIds :: DataCon -> [Id] +dataConImplicitIds dc = case dcIds dc of + AlgDC (Just wrap) work -> [wrap,work] + AlgDC Nothing work -> [work] + NewDC wrap -> [wrap] + +dataConFieldLabels :: DataCon -> [FieldLabel] +dataConFieldLabels = dcFields + +dataConFieldType :: DataCon -> FieldLabel -> Type +dataConFieldType con label = expectJust "unexpected label" $ + lookup label (dcFields con `zip` dcOrigArgTys con) + +dataConStrictMarks :: DataCon -> [StrictnessMark] +dataConStrictMarks = dcStrictMarks + +dataConExStricts :: DataCon -> [StrictnessMark] +-- Strictness of *existential* arguments only +-- Usually empty, so we don't bother to cache this +dataConExStricts dc = map mk_dict_strict_mark (dcTheta dc) + +dataConSourceArity :: DataCon -> Arity + -- Source-level arity of the data constructor +dataConSourceArity dc = length (dcOrigArgTys dc) + +-- dataConRepArity gives the number of actual fields in the +-- {\em representation} of the data constructor. 
This may be more than appear +-- in the source code; the extra ones are the existentially quantified +-- dictionaries +dataConRepArity (MkData {dcRepArgTys = arg_tys}) = length arg_tys + +isNullarySrcDataCon, isNullaryRepDataCon :: DataCon -> Bool +isNullarySrcDataCon dc = null (dcOrigArgTys dc) +isNullaryRepDataCon dc = null (dcRepArgTys dc) + +dataConRepStrictness :: DataCon -> [StrictnessMark] + -- Give the demands on the arguments of a + -- Core constructor application (Con dc args) +dataConRepStrictness dc = dcRepStrictness dc + +dataConSig :: DataCon -> ([TyVar], ThetaType, + [Type], TyCon, [Type]) + +dataConSig (MkData {dcTyVars = tyvars, dcTheta = theta, + dcOrigArgTys = arg_tys, dcTyCon = tycon, dcResTys = res_tys}) + = (tyvars, theta, arg_tys, tycon, res_tys) + +dataConStupidTheta :: DataCon -> ThetaType +dataConStupidTheta dc = dcStupidTheta dc + +dataConResTys :: DataCon -> [Type] +dataConResTys dc = dcResTys dc + +dataConInstArgTys :: DataCon + -> [Type] -- Instantiated at these types + -- NB: these INCLUDE the existentially quantified arg types + -> [Type] -- Needs arguments of these types + -- NB: these INCLUDE the existentially quantified dict args + -- but EXCLUDE the data-decl context which is discarded + -- It's all post-flattening etc; this is a representation type +dataConInstArgTys (MkData {dcRepArgTys = arg_tys, dcTyVars = tyvars}) inst_tys + = ASSERT( length tyvars == length inst_tys ) + map (substTyWith tyvars inst_tys) arg_tys + +dataConInstResTy :: DataCon -> [Type] -> Type +dataConInstResTy (MkData {dcTyVars = tyvars, dcTyCon = tc, dcResTys = res_tys}) inst_tys + = ASSERT( length tyvars == length inst_tys ) + substTy (zipOpenTvSubst tyvars inst_tys) (mkTyConApp tc res_tys) + -- res_tys can't currently contain any foralls, + -- but might in future; hence zipOpenTvSubst + +-- And the same deal for the original arg tys +dataConInstOrigArgTys :: DataCon -> [Type] -> [Type] +dataConInstOrigArgTys (MkData {dcOrigArgTys = arg_tys, dcTyVars = tyvars}) inst_tys + = ASSERT( length tyvars == length inst_tys ) + map (substTyWith tyvars inst_tys) arg_tys +\end{code} + +These two functions get the real argument types of the constructor, +without substituting for any type variables. + +dataConOrigArgTys returns the arg types of the wrapper, excluding all dictionary args. + +dataConRepArgTys retuns the arg types of the worker, including all dictionaries, and +after any flattening has been done. 
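A small self-contained example may make the difference vivid; Box and MkBox are invented names, and the bracketed lists in the comments only indicate roughly what the two functions would return.

\begin{code}
{-# LANGUAGE ExistentialQuantification #-}
-- Illustrative example only.
module OrigVsRepExample where

-- For this constructor:
--   dataConOrigArgTys MkBox  ~  [a, [a]]          (wrapper args, dictionary excluded)
--   dataConRepArgTys  MkBox  ~  [Ord a, a, [a]]   (worker args, dictionary included)
data Box = forall a. Ord a => MkBox a [a]
\end{code}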
+ +\begin{code} +dataConOrigArgTys :: DataCon -> [Type] +dataConOrigArgTys dc = dcOrigArgTys dc + +dataConRepArgTys :: DataCon -> [Type] +dataConRepArgTys dc = dcRepArgTys dc +\end{code} + + +\begin{code} +isTupleCon :: DataCon -> Bool +isTupleCon (MkData {dcTyCon = tc}) = isTupleTyCon tc + +isUnboxedTupleCon :: DataCon -> Bool +isUnboxedTupleCon (MkData {dcTyCon = tc}) = isUnboxedTupleTyCon tc + +isVanillaDataCon :: DataCon -> Bool +isVanillaDataCon dc = dcVanilla dc +\end{code} + + +\begin{code} +classDataCon :: Class -> DataCon +classDataCon clas = case tyConDataCons (classTyCon clas) of + (dict_constr:no_more) -> ASSERT( null no_more ) dict_constr +\end{code} + +%************************************************************************ +%* * +\subsection{Splitting products} +%* * +%************************************************************************ + +\begin{code} +splitProductType_maybe + :: Type -- A product type, perhaps + -> Maybe (TyCon, -- The type constructor + [Type], -- Type args of the tycon + DataCon, -- The data constructor + [Type]) -- Its *representation* arg types + + -- Returns (Just ...) for any + -- concrete (i.e. constructors visible) + -- single-constructor + -- not existentially quantified + -- type whether a data type or a new type + -- + -- Rejecing existentials is conservative. Maybe some things + -- could be made to work with them, but I'm not going to sweat + -- it through till someone finds it's important. + +splitProductType_maybe ty + = case splitTyConApp_maybe ty of + Just (tycon,ty_args) + | isProductTyCon tycon -- Includes check for non-existential, + -- and for constructors visible + -> Just (tycon, ty_args, data_con, dataConInstArgTys data_con ty_args) + where + data_con = head (tyConDataCons tycon) + other -> Nothing + +splitProductType str ty + = case splitProductType_maybe ty of + Just stuff -> stuff + Nothing -> pprPanic (str ++ ": not a product") (pprType ty) + + +computeRep :: [StrictnessMark] -- Original arg strictness + -> [Type] -- and types + -> ([StrictnessMark], -- Representation arg strictness + [Type]) -- And type + +computeRep stricts tys + = unzip $ concat $ zipWithEqual "computeRep" unbox stricts tys + where + unbox NotMarkedStrict ty = [(NotMarkedStrict, ty)] + unbox MarkedStrict ty = [(MarkedStrict, ty)] + unbox MarkedUnboxed ty = zipEqual "computeRep" (dataConRepStrictness arg_dc) arg_tys + where + (_, _, arg_dc, arg_tys) = splitProductType "unbox_strict_arg_ty" ty +\end{code} diff --git a/compiler/basicTypes/DataCon.lhs-boot b/compiler/basicTypes/DataCon.lhs-boot new file mode 100644 index 0000000000..c5e05c9ecd --- /dev/null +++ b/compiler/basicTypes/DataCon.lhs-boot @@ -0,0 +1,8 @@ +\begin{code} +module DataCon where +import Name( Name ) + +data DataCon +dataConName :: DataCon -> Name +isVanillaDataCon :: DataCon -> Bool +\end{code} diff --git a/compiler/basicTypes/Demand.lhs b/compiler/basicTypes/Demand.lhs new file mode 100644 index 0000000000..50bb0c6ffa --- /dev/null +++ b/compiler/basicTypes/Demand.lhs @@ -0,0 +1,208 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[Demand]{@Demand@: the amount of demand on a value} + +\begin{code} +#ifndef OLD_STRICTNESS +module Demand () where +#else + +module Demand( + Demand(..), + + wwLazy, wwStrict, wwUnpack, wwPrim, wwEnum, + isStrict, isLazy, isPrim, + + pprDemands, seqDemand, seqDemands, + + StrictnessInfo(..), + mkStrictnessInfo, + noStrictnessInfo, + ppStrictnessInfo, seqStrictnessInfo, + isBottomingStrictness, appIsBottom, + + ) where + +#include 
"HsVersions.h" + +import Outputable +import Util ( listLengthCmp ) +\end{code} + + +%************************************************************************ +%* * +\subsection{The @Demand@ data type} +%* * +%************************************************************************ + +\begin{code} +data Demand + = WwLazy -- Argument is lazy as far as we know + MaybeAbsent -- (does not imply worker's existence [etc]). + -- If MaybeAbsent == True, then it is + -- *definitely* lazy. (NB: Absence implies + -- a worker...) + + | WwStrict -- Argument is strict but that's all we know + -- (does not imply worker's existence or any + -- calling-convention magic) + + | WwUnpack -- Argument is strict & a single-constructor type + Bool -- True <=> wrapper unpacks it; False <=> doesn't + [Demand] -- Its constituent parts (whose StrictInfos + -- are in the list) should be passed + -- as arguments to the worker. + + | WwPrim -- Argument is of primitive type, therefore + -- strict; doesn't imply existence of a worker; + -- argument should be passed as is to worker. + + | WwEnum -- Argument is strict & an enumeration type; + -- an Int# representing the tag (start counting + -- at zero) should be passed to the worker. + deriving( Eq ) + +type MaybeAbsent = Bool -- True <=> not even used + +-- versions that don't worry about Absence: +wwLazy = WwLazy False +wwStrict = WwStrict +wwUnpack xs = WwUnpack False xs +wwPrim = WwPrim +wwEnum = WwEnum + +seqDemand :: Demand -> () +seqDemand (WwLazy a) = a `seq` () +seqDemand (WwUnpack b ds) = b `seq` seqDemands ds +seqDemand other = () + +seqDemands [] = () +seqDemands (d:ds) = seqDemand d `seq` seqDemands ds +\end{code} + + +%************************************************************************ +%* * +\subsection{Functions over @Demand@} +%* * +%************************************************************************ + +\begin{code} +isLazy :: Demand -> Bool +isLazy (WwLazy _) = True +isLazy _ = False + +isStrict :: Demand -> Bool +isStrict d = not (isLazy d) + +isPrim :: Demand -> Bool +isPrim WwPrim = True +isPrim other = False +\end{code} + + +%************************************************************************ +%* * +\subsection{Instances} +%* * +%************************************************************************ + + +\begin{code} +pprDemands demands bot = hcat (map pprDemand demands) <> pp_bot + where + pp_bot | bot = ptext SLIT("B") + | otherwise = empty + + +pprDemand (WwLazy False) = char 'L' +pprDemand (WwLazy True) = char 'A' +pprDemand WwStrict = char 'S' +pprDemand WwPrim = char 'P' +pprDemand WwEnum = char 'E' +pprDemand (WwUnpack wu args) = char ch <> parens (hcat (map pprDemand args)) + where + ch = if wu then 'U' else 'u' + +instance Outputable Demand where + ppr (WwLazy False) = empty + ppr other_demand = ptext SLIT("__D") <+> pprDemand other_demand + +instance Show Demand where + showsPrec p d = showsPrecSDoc p (ppr d) + +-- Reading demands is done in Lex.lhs +\end{code} + + +%************************************************************************ +%* * +\subsection[strictness-IdInfo]{Strictness info about an @Id@} +%* * +%************************************************************************ + +We specify the strictness of a function by giving information about +each of the ``wrapper's'' arguments (see the description about +worker/wrapper-style transformations in the PJ/Launchbury paper on +unboxed types). 
+ +The list of @Demands@ specifies: (a)~the strictness properties of a +function's arguments; and (b)~the type signature of that worker (if it +exists); i.e. its calling convention. + +Note that the existence of a worker function is now denoted by the Id's +workerInfo field. + +\begin{code} +data StrictnessInfo + = NoStrictnessInfo + + | StrictnessInfo [Demand] -- Demands on the arguments. + + Bool -- True <=> the function diverges regardless of its arguments + -- Useful for "error" and other disguised variants thereof. + -- BUT NB: f = \x y. error "urk" + -- will have info SI [SS] True + -- but still (f) and (f 2) are not bot; only (f 3 2) is bot + deriving( Eq ) + + -- NOTA BENE: if the arg demands are, say, [S,L], this means that + -- (f bot) is not necy bot, only (f bot x) is bot + -- We simply cannot express accurately the strictness of a function + -- like f = \x -> case x of (a,b) -> \y -> ... + -- The up-side is that we don't need to restrict the strictness info + -- to the visible arity of the function. + +seqStrictnessInfo :: StrictnessInfo -> () +seqStrictnessInfo (StrictnessInfo ds b) = b `seq` seqDemands ds +seqStrictnessInfo other = () +\end{code} + +\begin{code} +mkStrictnessInfo :: ([Demand], Bool) -> StrictnessInfo + +mkStrictnessInfo (xs, is_bot) + | all totally_boring xs && not is_bot = NoStrictnessInfo -- Uninteresting + | otherwise = StrictnessInfo xs is_bot + where + totally_boring (WwLazy False) = True + totally_boring other = False + +noStrictnessInfo = NoStrictnessInfo + +isBottomingStrictness (StrictnessInfo _ bot) = bot +isBottomingStrictness NoStrictnessInfo = False + +-- appIsBottom returns true if an application to n args would diverge +appIsBottom (StrictnessInfo ds bot) n = bot && (listLengthCmp ds n /=GT) -- not more than 'n' elts in 'ds'. +appIsBottom NoStrictnessInfo n = False + +ppStrictnessInfo NoStrictnessInfo = empty +ppStrictnessInfo (StrictnessInfo wrapper_args bot) = hsep [pprDemands wrapper_args bot] +\end{code} + +\begin{code} +#endif /* OLD_STRICTNESS */ +\end{code} diff --git a/compiler/basicTypes/FieldLabel.lhs b/compiler/basicTypes/FieldLabel.lhs new file mode 100644 index 0000000000..b388d378d7 --- /dev/null +++ b/compiler/basicTypes/FieldLabel.lhs @@ -0,0 +1,71 @@ +% +% (c) The AQUA Project, Glasgow University, 1996-1998 +% +\section[FieldLabel]{The @FieldLabel@ type} + +\begin{code} +module FieldLabel( + FieldLabel, -- Abstract + + mkFieldLabel, + fieldLabelName, fieldLabelTyCon, fieldLabelType, fieldLabelTag, + + FieldLabelTag, + firstFieldLabelTag, allFieldLabelTags + ) where + +#include "HsVersions.h" + +import Type( Type ) +import TyCon( TyCon ) +import Name ( Name{-instance Eq/Outputable-}, NamedThing(..), nameUnique ) +import Outputable +import Unique ( Uniquable(..) ) +\end{code} + +\begin{code} +data FieldLabel + = FieldLabel Name -- Also used as the Name of the field selector Id + + TyCon -- Parent type constructor + + Type -- Type of the field; may have free type variables that + -- are the tyvars of its parent *data* constructor, and + -- those will be the same as the tyvars of its parent *type* constructor + -- e.g. data T a = MkT { op1 :: a -> a, op2 :: a -> Int } + -- The type in the FieldLabel for op1 will be simply (a->a). + + FieldLabelTag -- Indicates position within constructor + -- (starting with firstFieldLabelTag) + -- + -- If the same field occurs in more than one constructor + -- then it'll have a separate FieldLabel on each occasion, + -- but with a single name (and presumably the same type!) 
+ +type FieldLabelTag = Int + +mkFieldLabel = FieldLabel + +firstFieldLabelTag :: FieldLabelTag +firstFieldLabelTag = 1 + +allFieldLabelTags :: [FieldLabelTag] +allFieldLabelTags = [firstFieldLabelTag..] + +fieldLabelName (FieldLabel n _ _ _) = n +fieldLabelTyCon (FieldLabel _ tc _ _) = tc +fieldLabelType (FieldLabel _ _ ty _) = ty +fieldLabelTag (FieldLabel _ _ _ tag) = tag + +instance Eq FieldLabel where + fl1 == fl2 = fieldLabelName fl1 == fieldLabelName fl2 + +instance Outputable FieldLabel where + ppr fl = ppr (fieldLabelName fl) + +instance NamedThing FieldLabel where + getName = fieldLabelName + +instance Uniquable FieldLabel where + getUnique fl = nameUnique (fieldLabelName fl) +\end{code} diff --git a/compiler/basicTypes/Id.lhs b/compiler/basicTypes/Id.lhs new file mode 100644 index 0000000000..c7ce818adb --- /dev/null +++ b/compiler/basicTypes/Id.lhs @@ -0,0 +1,529 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[Id]{@Ids@: Value and constructor identifiers} + +\begin{code} +module Id ( + Id, DictId, + + -- Simple construction + mkGlobalId, mkLocalId, mkLocalIdWithInfo, + mkSysLocal, mkUserLocal, mkVanillaGlobal, + mkTemplateLocals, mkTemplateLocalsNum, mkWildId, mkTemplateLocal, + mkWorkerId, mkExportedLocalId, + + -- Taking an Id apart + idName, idType, idUnique, idInfo, + isId, globalIdDetails, idPrimRep, + recordSelectorFieldLabel, + + -- Modifying an Id + setIdName, setIdUnique, Id.setIdType, setIdExported, setIdNotExported, + setIdInfo, lazySetIdInfo, modifyIdInfo, maybeModifyIdInfo, + zapLamIdInfo, zapDemandIdInfo, + + -- Predicates + isImplicitId, isDeadBinder, isDictId, + isExportedId, isLocalId, isGlobalId, + isRecordSelector, isNaughtyRecordSelector, + isClassOpId_maybe, + isPrimOpId, isPrimOpId_maybe, + isFCallId, isFCallId_maybe, + isDataConWorkId, isDataConWorkId_maybe, isDataConId_maybe, idDataCon, + isBottomingId, idIsFrom, + hasNoBinding, + + -- Inline pragma stuff + idInlinePragma, setInlinePragma, modifyInlinePragma, + + + -- One shot lambda stuff + isOneShotBndr, isOneShotLambda, isStateHackType, + setOneShotLambda, clearOneShotLambda, + + -- IdInfo stuff + setIdUnfolding, + setIdArity, + setIdNewDemandInfo, + setIdNewStrictness, zapIdNewStrictness, + setIdWorkerInfo, + setIdSpecialisation, + setIdCafInfo, + setIdOccInfo, + +#ifdef OLD_STRICTNESS + idDemandInfo, + idStrictness, + idCprInfo, + setIdStrictness, + setIdDemandInfo, + setIdCprInfo, +#endif + + idArity, + idNewDemandInfo, idNewDemandInfo_maybe, + idNewStrictness, idNewStrictness_maybe, + idWorkerInfo, + idUnfolding, + idSpecialisation, idCoreRules, + idCafInfo, + idLBVarInfo, + idOccInfo, + +#ifdef OLD_STRICTNESS + newStrictnessFromOld -- Temporary +#endif + + ) where + +#include "HsVersions.h" + + +import CoreSyn ( Unfolding, CoreRule ) +import BasicTypes ( Arity ) +import Var ( Id, DictId, + isId, isExportedId, isLocalId, + idName, idType, idUnique, idInfo, isGlobalId, + setIdName, setIdType, setIdUnique, + setIdExported, setIdNotExported, + setIdInfo, lazySetIdInfo, modifyIdInfo, + maybeModifyIdInfo, + globalIdDetails + ) +import qualified Var ( mkLocalId, mkGlobalId, mkExportedLocalId ) +import TyCon ( FieldLabel, TyCon ) +import Type ( Type, typePrimRep, addFreeTyVars, seqType, + splitTyConApp_maybe, PrimRep ) +import TcType ( isDictTy ) +import TysPrim ( statePrimTyCon ) +import IdInfo + +#ifdef OLD_STRICTNESS +import qualified Demand ( Demand ) +#endif +import DataCon ( DataCon, isUnboxedTupleCon ) +import NewDemand ( Demand, StrictSig, topDmd, topSig, 
isBottomingSig ) +import Name ( Name, OccName, nameIsLocalOrFrom, + mkSystemVarName, mkInternalName, getOccName, + getSrcLoc ) +import Module ( Module ) +import OccName ( mkWorkerOcc ) +import Maybes ( orElse ) +import SrcLoc ( SrcLoc ) +import Outputable +import Unique ( Unique, mkBuiltinUnique ) +import FastString ( FastString ) +import StaticFlags ( opt_NoStateHack ) + +-- infixl so you can say (id `set` a `set` b) +infixl 1 `setIdUnfolding`, + `setIdArity`, + `setIdNewDemandInfo`, + `setIdNewStrictness`, + `setIdWorkerInfo`, + `setIdSpecialisation`, + `setInlinePragma`, + `idCafInfo` +#ifdef OLD_STRICTNESS + ,`idCprInfo` + ,`setIdStrictness` + ,`setIdDemandInfo` +#endif +\end{code} + + + +%************************************************************************ +%* * +\subsection{Simple Id construction} +%* * +%************************************************************************ + +Absolutely all Ids are made by mkId. It is just like Var.mkId, +but in addition it pins free-tyvar-info onto the Id's type, +where it can easily be found. + +\begin{code} +mkLocalIdWithInfo :: Name -> Type -> IdInfo -> Id +mkLocalIdWithInfo name ty info = Var.mkLocalId name (addFreeTyVars ty) info + +mkExportedLocalId :: Name -> Type -> Id +mkExportedLocalId name ty = Var.mkExportedLocalId name (addFreeTyVars ty) vanillaIdInfo + +mkGlobalId :: GlobalIdDetails -> Name -> Type -> IdInfo -> Id +mkGlobalId details name ty info = Var.mkGlobalId details name (addFreeTyVars ty) info +\end{code} + +\begin{code} +mkLocalId :: Name -> Type -> Id +mkLocalId name ty = mkLocalIdWithInfo name ty vanillaIdInfo + +-- SysLocal: for an Id being created by the compiler out of thin air... +-- UserLocal: an Id with a name the user might recognize... +mkUserLocal :: OccName -> Unique -> Type -> SrcLoc -> Id +mkSysLocal :: FastString -> Unique -> Type -> Id +mkVanillaGlobal :: Name -> Type -> IdInfo -> Id + +mkSysLocal fs uniq ty = mkLocalId (mkSystemVarName uniq fs) ty + +mkUserLocal occ uniq ty loc = mkLocalId (mkInternalName uniq occ loc) ty +mkVanillaGlobal = mkGlobalId VanillaGlobal +\end{code} + +Make some local @Ids@ for a template @CoreExpr@. These have bogus +@Uniques@, but that's OK because the templates are supposed to be +instantiated before use. + +\begin{code} +-- "Wild Id" typically used when you need a binder that you don't expect to use +mkWildId :: Type -> Id +mkWildId ty = mkSysLocal FSLIT("wild") (mkBuiltinUnique 1) ty + +mkWorkerId :: Unique -> Id -> Type -> Id +-- A worker gets a local name. CoreTidy will externalise it if necessary. +mkWorkerId uniq unwrkr ty + = mkLocalId wkr_name ty + where + wkr_name = mkInternalName uniq (mkWorkerOcc (getOccName unwrkr)) (getSrcLoc unwrkr) + +-- "Template locals" typically used in unfoldings +mkTemplateLocals :: [Type] -> [Id] +mkTemplateLocals tys = zipWith mkTemplateLocal [1..] tys + +mkTemplateLocalsNum :: Int -> [Type] -> [Id] +-- The Int gives the starting point for unique allocation +mkTemplateLocalsNum n tys = zipWith mkTemplateLocal [n..] 
tys + +mkTemplateLocal :: Int -> Type -> Id +mkTemplateLocal i ty = mkSysLocal FSLIT("tpl") (mkBuiltinUnique i) ty +\end{code} + + +%************************************************************************ +%* * +\subsection[Id-general-funs]{General @Id@-related functions} +%* * +%************************************************************************ + +\begin{code} +setIdType :: Id -> Type -> Id + -- Add free tyvar info to the type +setIdType id ty = seqType ty `seq` Var.setIdType id (addFreeTyVars ty) + +idPrimRep :: Id -> PrimRep +idPrimRep id = typePrimRep (idType id) +\end{code} + + +%************************************************************************ +%* * +\subsection{Special Ids} +%* * +%************************************************************************ + +\begin{code} +recordSelectorFieldLabel :: Id -> (TyCon, FieldLabel) +recordSelectorFieldLabel id = case globalIdDetails id of + RecordSelId tycon lbl _ -> (tycon,lbl) + other -> panic "recordSelectorFieldLabel" + +isRecordSelector id = case globalIdDetails id of + RecordSelId {} -> True + other -> False + +isNaughtyRecordSelector id = case globalIdDetails id of + RecordSelId { sel_naughty = n } -> n + other -> False + +isClassOpId_maybe id = case globalIdDetails id of + ClassOpId cls -> Just cls + _other -> Nothing + +isPrimOpId id = case globalIdDetails id of + PrimOpId op -> True + other -> False + +isPrimOpId_maybe id = case globalIdDetails id of + PrimOpId op -> Just op + other -> Nothing + +isFCallId id = case globalIdDetails id of + FCallId call -> True + other -> False + +isFCallId_maybe id = case globalIdDetails id of + FCallId call -> Just call + other -> Nothing + +isDataConWorkId id = case globalIdDetails id of + DataConWorkId _ -> True + other -> False + +isDataConWorkId_maybe id = case globalIdDetails id of + DataConWorkId con -> Just con + other -> Nothing + +isDataConId_maybe :: Id -> Maybe DataCon +isDataConId_maybe id = case globalIdDetails id of + DataConWorkId con -> Just con + DataConWrapId con -> Just con + other -> Nothing + +idDataCon :: Id -> DataCon +-- Get from either the worker or the wrapper to the DataCon +-- Currently used only in the desugarer +-- INVARIANT: idDataCon (dataConWrapId d) = d +-- (Remember, dataConWrapId can return either the wrapper or the worker.) +idDataCon id = case globalIdDetails id of + DataConWorkId con -> con + DataConWrapId con -> con + other -> pprPanic "idDataCon" (ppr id) + + +isDictId :: Id -> Bool +isDictId id = isDictTy (idType id) + +-- hasNoBinding returns True of an Id which may not have a +-- binding, even though it is defined in this module. +-- Data constructor workers used to be things of this kind, but +-- they aren't any more. Instead, we inject a binding for +-- them at the CorePrep stage. +-- EXCEPT: unboxed tuples, which definitely have no binding +hasNoBinding id = case globalIdDetails id of + PrimOpId _ -> True + FCallId _ -> True + DataConWorkId dc -> isUnboxedTupleCon dc + other -> False + +isImplicitId :: Id -> Bool + -- isImplicitId tells whether an Id's info is implied by other + -- declarations, so we don't need to put its signature in an interface + -- file, even if it's mentioned in some other interface unfolding. +isImplicitId id + = case globalIdDetails id of + RecordSelId {} -> True + FCallId _ -> True + PrimOpId _ -> True + ClassOpId _ -> True + DataConWorkId _ -> True + DataConWrapId _ -> True + -- These are are implied by their type or class decl; + -- remember that all type and class decls appear in the interface file. 
+ -- The dfun id is not an implicit Id; it must *not* be omitted, because + -- it carries version info for the instance decl + other -> False + +idIsFrom :: Module -> Id -> Bool +idIsFrom mod id = nameIsLocalOrFrom mod (idName id) +\end{code} + +\begin{code} +isDeadBinder :: Id -> Bool +isDeadBinder bndr | isId bndr = isDeadOcc (idOccInfo bndr) + | otherwise = False -- TyVars count as not dead +\end{code} + + +%************************************************************************ +%* * +\subsection{IdInfo stuff} +%* * +%************************************************************************ + +\begin{code} + --------------------------------- + -- ARITY +idArity :: Id -> Arity +idArity id = arityInfo (idInfo id) + +setIdArity :: Id -> Arity -> Id +setIdArity id arity = modifyIdInfo (`setArityInfo` arity) id + +#ifdef OLD_STRICTNESS + --------------------------------- + -- (OLD) STRICTNESS +idStrictness :: Id -> StrictnessInfo +idStrictness id = strictnessInfo (idInfo id) + +setIdStrictness :: Id -> StrictnessInfo -> Id +setIdStrictness id strict_info = modifyIdInfo (`setStrictnessInfo` strict_info) id +#endif + +-- isBottomingId returns true if an application to n args would diverge +isBottomingId :: Id -> Bool +isBottomingId id = isBottomingSig (idNewStrictness id) + +idNewStrictness_maybe :: Id -> Maybe StrictSig +idNewStrictness :: Id -> StrictSig + +idNewStrictness_maybe id = newStrictnessInfo (idInfo id) +idNewStrictness id = idNewStrictness_maybe id `orElse` topSig + +setIdNewStrictness :: Id -> StrictSig -> Id +setIdNewStrictness id sig = modifyIdInfo (`setNewStrictnessInfo` Just sig) id + +zapIdNewStrictness :: Id -> Id +zapIdNewStrictness id = modifyIdInfo (`setNewStrictnessInfo` Nothing) id + + --------------------------------- + -- WORKER ID +idWorkerInfo :: Id -> WorkerInfo +idWorkerInfo id = workerInfo (idInfo id) + +setIdWorkerInfo :: Id -> WorkerInfo -> Id +setIdWorkerInfo id work_info = modifyIdInfo (`setWorkerInfo` work_info) id + + --------------------------------- + -- UNFOLDING +idUnfolding :: Id -> Unfolding +idUnfolding id = unfoldingInfo (idInfo id) + +setIdUnfolding :: Id -> Unfolding -> Id +setIdUnfolding id unfolding = modifyIdInfo (`setUnfoldingInfo` unfolding) id + +#ifdef OLD_STRICTNESS + --------------------------------- + -- (OLD) DEMAND +idDemandInfo :: Id -> Demand.Demand +idDemandInfo id = demandInfo (idInfo id) + +setIdDemandInfo :: Id -> Demand.Demand -> Id +setIdDemandInfo id demand_info = modifyIdInfo (`setDemandInfo` demand_info) id +#endif + +idNewDemandInfo_maybe :: Id -> Maybe NewDemand.Demand +idNewDemandInfo :: Id -> NewDemand.Demand + +idNewDemandInfo_maybe id = newDemandInfo (idInfo id) +idNewDemandInfo id = newDemandInfo (idInfo id) `orElse` NewDemand.topDmd + +setIdNewDemandInfo :: Id -> NewDemand.Demand -> Id +setIdNewDemandInfo id dmd = modifyIdInfo (`setNewDemandInfo` Just dmd) id + + --------------------------------- + -- SPECIALISATION +idSpecialisation :: Id -> SpecInfo +idSpecialisation id = specInfo (idInfo id) + +idCoreRules :: Id -> [CoreRule] +idCoreRules id = specInfoRules (idSpecialisation id) + +setIdSpecialisation :: Id -> SpecInfo -> Id +setIdSpecialisation id spec_info = modifyIdInfo (`setSpecInfo` spec_info) id + + --------------------------------- + -- CAF INFO +idCafInfo :: Id -> CafInfo +#ifdef OLD_STRICTNESS +idCafInfo id = case cgInfo (idInfo id) of + NoCgInfo -> pprPanic "idCafInfo" (ppr id) + info -> cgCafInfo info +#else +idCafInfo id = cafInfo (idInfo id) +#endif + +setIdCafInfo :: Id -> CafInfo -> Id 
+setIdCafInfo id caf_info = modifyIdInfo (`setCafInfo` caf_info) id + + --------------------------------- + -- CPR INFO +#ifdef OLD_STRICTNESS +idCprInfo :: Id -> CprInfo +idCprInfo id = cprInfo (idInfo id) + +setIdCprInfo :: Id -> CprInfo -> Id +setIdCprInfo id cpr_info = modifyIdInfo (`setCprInfo` cpr_info) id +#endif + + --------------------------------- + -- Occcurrence INFO +idOccInfo :: Id -> OccInfo +idOccInfo id = occInfo (idInfo id) + +setIdOccInfo :: Id -> OccInfo -> Id +setIdOccInfo id occ_info = modifyIdInfo (`setOccInfo` occ_info) id +\end{code} + + + --------------------------------- + -- INLINING +The inline pragma tells us to be very keen to inline this Id, but it's still +OK not to if optimisation is switched off. + +\begin{code} +idInlinePragma :: Id -> InlinePragInfo +idInlinePragma id = inlinePragInfo (idInfo id) + +setInlinePragma :: Id -> InlinePragInfo -> Id +setInlinePragma id prag = modifyIdInfo (`setInlinePragInfo` prag) id + +modifyInlinePragma :: Id -> (InlinePragInfo -> InlinePragInfo) -> Id +modifyInlinePragma id fn = modifyIdInfo (\info -> info `setInlinePragInfo` (fn (inlinePragInfo info))) id +\end{code} + + + --------------------------------- + -- ONE-SHOT LAMBDAS +\begin{code} +idLBVarInfo :: Id -> LBVarInfo +idLBVarInfo id = lbvarInfo (idInfo id) + +isOneShotBndr :: Id -> Bool +-- This one is the "business end", called externally. +-- Its main purpose is to encapsulate the Horrible State Hack +isOneShotBndr id = isOneShotLambda id || (isStateHackType (idType id)) + +isStateHackType :: Type -> Bool +isStateHackType ty + | opt_NoStateHack + = False + | otherwise + = case splitTyConApp_maybe ty of + Just (tycon,_) -> tycon == statePrimTyCon + other -> False + -- This is a gross hack. It claims that + -- every function over realWorldStatePrimTy is a one-shot + -- function. This is pretty true in practice, and makes a big + -- difference. For example, consider + -- a `thenST` \ r -> ...E... + -- The early full laziness pass, if it doesn't know that r is one-shot + -- will pull out E (let's say it doesn't mention r) to give + -- let lvl = E in a `thenST` \ r -> ...lvl... + -- When `thenST` gets inlined, we end up with + -- let lvl = E in \s -> case a s of (r, s') -> ...lvl... + -- and we don't re-inline E. + -- + -- It would be better to spot that r was one-shot to start with, but + -- I don't want to rely on that. + -- + -- Another good example is in fill_in in PrelPack.lhs. We should be able to + -- spot that fill_in has arity 2 (and when Keith is done, we will) but we can't yet. 
+ + +-- The OneShotLambda functions simply fiddle with the IdInfo flag +isOneShotLambda :: Id -> Bool +isOneShotLambda id = case idLBVarInfo id of + IsOneShotLambda -> True + NoLBVarInfo -> False + +setOneShotLambda :: Id -> Id +setOneShotLambda id = modifyIdInfo (`setLBVarInfo` IsOneShotLambda) id + +clearOneShotLambda :: Id -> Id +clearOneShotLambda id + | isOneShotLambda id = modifyIdInfo (`setLBVarInfo` NoLBVarInfo) id + | otherwise = id + +-- But watch out: this may change the type of something else +-- f = \x -> e +-- If we change the one-shot-ness of x, f's type changes +\end{code} + +\begin{code} +zapLamIdInfo :: Id -> Id +zapLamIdInfo id = maybeModifyIdInfo zapLamInfo id + +zapDemandIdInfo id = maybeModifyIdInfo zapDemandInfo id +\end{code} + diff --git a/compiler/basicTypes/IdInfo.hi-boot-5 b/compiler/basicTypes/IdInfo.hi-boot-5 new file mode 100644 index 0000000000..4a326cad6f --- /dev/null +++ b/compiler/basicTypes/IdInfo.hi-boot-5 @@ -0,0 +1,8 @@ +__interface IdInfo 1 0 where +__export IdInfo IdInfo GlobalIdDetails notGlobalId seqIdInfo vanillaIdInfo ; +1 data IdInfo ; +1 data GlobalIdDetails ; +1 notGlobalId :: GlobalIdDetails ; +1 seqIdInfo :: IdInfo -> PrelBase.Z0T ; +1 vanillaIdInfo :: IdInfo ; + diff --git a/compiler/basicTypes/IdInfo.hi-boot-6 b/compiler/basicTypes/IdInfo.hi-boot-6 new file mode 100644 index 0000000000..e090800d61 --- /dev/null +++ b/compiler/basicTypes/IdInfo.hi-boot-6 @@ -0,0 +1,8 @@ +module IdInfo where + +data IdInfo +data GlobalIdDetails + +notGlobalId :: GlobalIdDetails +seqIdInfo :: IdInfo -> () +vanillaIdInfo :: IdInfo diff --git a/compiler/basicTypes/IdInfo.lhs b/compiler/basicTypes/IdInfo.lhs new file mode 100644 index 0000000000..d53bf5627d --- /dev/null +++ b/compiler/basicTypes/IdInfo.lhs @@ -0,0 +1,699 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-1998 +% +\section[IdInfo]{@IdInfos@: Non-essential information about @Ids@} + +(And a pretty good illustration of quite a few things wrong with +Haskell. 
[WDP 94/11]) + +\begin{code} +module IdInfo ( + GlobalIdDetails(..), notGlobalId, -- Not abstract + + IdInfo, -- Abstract + vanillaIdInfo, noCafIdInfo, + seqIdInfo, megaSeqIdInfo, + + -- Zapping + zapLamInfo, zapDemandInfo, + + -- Arity + ArityInfo, + unknownArity, + arityInfo, setArityInfo, ppArityInfo, + + -- New demand and strictness info + newStrictnessInfo, setNewStrictnessInfo, + newDemandInfo, setNewDemandInfo, pprNewStrictness, + setAllStrictnessInfo, + +#ifdef OLD_STRICTNESS + -- Strictness; imported from Demand + StrictnessInfo(..), + mkStrictnessInfo, noStrictnessInfo, + ppStrictnessInfo,isBottomingStrictness, +#endif + + -- Worker + WorkerInfo(..), workerExists, wrapperArity, workerId, + workerInfo, setWorkerInfo, ppWorkerInfo, + + -- Unfolding + unfoldingInfo, setUnfoldingInfo, setUnfoldingInfoLazily, + +#ifdef OLD_STRICTNESS + -- Old DemandInfo and StrictnessInfo + demandInfo, setDemandInfo, + strictnessInfo, setStrictnessInfo, + cprInfoFromNewStrictness, + oldStrictnessFromNew, newStrictnessFromOld, + oldDemand, newDemand, + + -- Constructed Product Result Info + CprInfo(..), cprInfo, setCprInfo, ppCprInfo, noCprInfo, +#endif + + -- Inline prags + InlinePragInfo, + inlinePragInfo, setInlinePragInfo, + + -- Occurrence info + OccInfo(..), isFragileOcc, isDeadOcc, isLoopBreaker, + InsideLam, OneBranch, insideLam, notInsideLam, oneBranch, notOneBranch, + occInfo, setOccInfo, + + -- Specialisation + SpecInfo(..), specInfo, setSpecInfo, isEmptySpecInfo, + specInfoFreeVars, specInfoRules, seqSpecInfo, + + -- CAF info + CafInfo(..), cafInfo, ppCafInfo, setCafInfo, mayHaveCafRefs, + + -- Lambda-bound variable info + LBVarInfo(..), lbvarInfo, setLBVarInfo, noLBVarInfo, hasNoLBVarInfo + ) where + +#include "HsVersions.h" + + +import CoreSyn +import Class ( Class ) +import PrimOp ( PrimOp ) +import Var ( Id ) +import VarSet ( VarSet, emptyVarSet, seqVarSet ) +import BasicTypes ( OccInfo(..), isFragileOcc, isDeadOcc, seqOccInfo, isLoopBreaker, + InsideLam, insideLam, notInsideLam, + OneBranch, oneBranch, notOneBranch, + Arity, + Activation(..) 
+ ) +import DataCon ( DataCon ) +import TyCon ( TyCon, FieldLabel ) +import ForeignCall ( ForeignCall ) +import NewDemand +import Outputable +import Maybe ( isJust ) + +#ifdef OLD_STRICTNESS +import Name ( Name ) +import Demand hiding( Demand, seqDemand ) +import qualified Demand +import Util ( listLengthCmp ) +import List ( replicate ) +#endif + +-- infixl so you can say (id `set` a `set` b) +infixl 1 `setSpecInfo`, + `setArityInfo`, + `setInlinePragInfo`, + `setUnfoldingInfo`, + `setWorkerInfo`, + `setLBVarInfo`, + `setOccInfo`, + `setCafInfo`, + `setNewStrictnessInfo`, + `setAllStrictnessInfo`, + `setNewDemandInfo` +#ifdef OLD_STRICTNESS + , `setCprInfo` + , `setDemandInfo` + , `setStrictnessInfo` +#endif +\end{code} + +%************************************************************************ +%* * +\subsection{New strictness info} +%* * +%************************************************************************ + +To be removed later + +\begin{code} +-- setAllStrictnessInfo :: IdInfo -> Maybe StrictSig -> IdInfo +-- Set old and new strictness info +setAllStrictnessInfo info Nothing + = info { newStrictnessInfo = Nothing +#ifdef OLD_STRICTNESS + , strictnessInfo = NoStrictnessInfo + , cprInfo = NoCPRInfo +#endif + } + +setAllStrictnessInfo info (Just sig) + = info { newStrictnessInfo = Just sig +#ifdef OLD_STRICTNESS + , strictnessInfo = oldStrictnessFromNew sig + , cprInfo = cprInfoFromNewStrictness sig +#endif + } + +seqNewStrictnessInfo Nothing = () +seqNewStrictnessInfo (Just ty) = seqStrictSig ty + +pprNewStrictness Nothing = empty +pprNewStrictness (Just sig) = ftext FSLIT("Str:") <+> ppr sig + +#ifdef OLD_STRICTNESS +oldStrictnessFromNew :: StrictSig -> Demand.StrictnessInfo +oldStrictnessFromNew sig = mkStrictnessInfo (map oldDemand dmds, isBotRes res_info) + where + (dmds, res_info) = splitStrictSig sig + +cprInfoFromNewStrictness :: StrictSig -> CprInfo +cprInfoFromNewStrictness sig = case strictSigResInfo sig of + RetCPR -> ReturnsCPR + other -> NoCPRInfo + +newStrictnessFromOld :: Name -> Arity -> Demand.StrictnessInfo -> CprInfo -> StrictSig +newStrictnessFromOld name arity (Demand.StrictnessInfo ds res) cpr + | listLengthCmp ds arity /= GT -- length ds <= arity + -- Sometimes the old strictness analyser has more + -- demands than the arity justifies + = mk_strict_sig name arity $ + mkTopDmdType (map newDemand ds) (newRes res cpr) + +newStrictnessFromOld name arity other cpr + = -- Either no strictness info, or arity is too small + -- In either case we can't say anything useful + mk_strict_sig name arity $ + mkTopDmdType (replicate arity lazyDmd) (newRes False cpr) + +mk_strict_sig name arity dmd_ty + = WARN( arity /= dmdTypeDepth dmd_ty, ppr name <+> (ppr arity $$ ppr dmd_ty) ) + mkStrictSig dmd_ty + +newRes True _ = BotRes +newRes False ReturnsCPR = retCPR +newRes False NoCPRInfo = TopRes + +newDemand :: Demand.Demand -> NewDemand.Demand +newDemand (WwLazy True) = Abs +newDemand (WwLazy False) = lazyDmd +newDemand WwStrict = evalDmd +newDemand (WwUnpack unpk ds) = Eval (Prod (map newDemand ds)) +newDemand WwPrim = lazyDmd +newDemand WwEnum = evalDmd + +oldDemand :: NewDemand.Demand -> Demand.Demand +oldDemand Abs = WwLazy True +oldDemand Top = WwLazy False +oldDemand Bot = WwStrict +oldDemand (Box Bot) = WwStrict +oldDemand (Box Abs) = WwLazy False +oldDemand (Box (Eval _)) = WwStrict -- Pass box only +oldDemand (Defer d) = WwLazy False +oldDemand (Eval (Prod ds)) = WwUnpack True (map oldDemand ds) +oldDemand (Eval (Poly _)) = WwStrict +oldDemand (Call _) = WwStrict + 
+#endif /* OLD_STRICTNESS */ +\end{code} + + +\begin{code} +seqNewDemandInfo Nothing = () +seqNewDemandInfo (Just dmd) = seqDemand dmd +\end{code} + + +%************************************************************************ +%* * +\subsection{GlobalIdDetails +%* * +%************************************************************************ + +This type is here (rather than in Id.lhs) mainly because there's +an IdInfo.hi-boot, but no Id.hi-boot, and GlobalIdDetails is imported +(recursively) by Var.lhs. + +\begin{code} +data GlobalIdDetails + = VanillaGlobal -- Imported from elsewhere, a default method Id. + + | RecordSelId -- The Id for a record selector + { sel_tycon :: TyCon + , sel_label :: FieldLabel + , sel_naughty :: Bool -- True <=> naughty + } -- See Note [Naughty record selectors] + -- with MkId.mkRecordSelectorId + + | DataConWorkId DataCon -- The Id for a data constructor *worker* + | DataConWrapId DataCon -- The Id for a data constructor *wrapper* + -- [the only reasons we need to know is so that + -- a) to support isImplicitId + -- b) when desugaring a RecordCon we can get + -- from the Id back to the data con] + + | ClassOpId Class -- An operation of a class + + | PrimOpId PrimOp -- The Id for a primitive operator + | FCallId ForeignCall -- The Id for a foreign call + + | NotGlobalId -- Used as a convenient extra return value from globalIdDetails + +notGlobalId = NotGlobalId + +instance Outputable GlobalIdDetails where + ppr NotGlobalId = ptext SLIT("[***NotGlobalId***]") + ppr VanillaGlobal = ptext SLIT("[GlobalId]") + ppr (DataConWorkId _) = ptext SLIT("[DataCon]") + ppr (DataConWrapId _) = ptext SLIT("[DataConWrapper]") + ppr (ClassOpId _) = ptext SLIT("[ClassOp]") + ppr (PrimOpId _) = ptext SLIT("[PrimOp]") + ppr (FCallId _) = ptext SLIT("[ForeignCall]") + ppr (RecordSelId {}) = ptext SLIT("[RecSel]") +\end{code} + + +%************************************************************************ +%* * +\subsection{The main IdInfo type} +%* * +%************************************************************************ + +An @IdInfo@ gives {\em optional} information about an @Id@. If +present it never lies, but it may not be present, in which case there +is always a conservative assumption which can be made. + +Two @Id@s may have different info even though they have the same +@Unique@ (and are hence the same @Id@); for example, one might lack +the properties attached to the other. + +The @IdInfo@ gives information about the value, or definition, of the +@Id@. It does {\em not} contain information about the @Id@'s usage +(except for @DemandInfo@? ToDo). (@lbvarInfo@ is also a marginal +case. KSW 1999-04). 
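The type that follows, together with vanillaIdInfo and the setters further below, follows a simple pattern: a record of facts with conservative defaults, updated through infixl-declared setters so that several updates chain from left to right. A minimal standalone sketch of that pattern (Info, setArity' and setCaf' are invented names) is:

\begin{code}
module InfoPattern where

infixl 1 `setArity'`, `setCaf'`

data Info = Info
  { infoArity   :: Int   -- 0 means "don't know": a safe, conservative default
  , infoHasCafs :: Bool  -- True means "may refer to CAFs": also conservative
  }

vanillaInfo :: Info
vanillaInfo = Info { infoArity = 0, infoHasCafs = True }

setArity' :: Info -> Int -> Info
setArity' i n = i { infoArity = n }

setCaf' :: Info -> Bool -> Info
setCaf' i c = i { infoHasCafs = c }

-- Updates chain because the setters are infixl:
example :: Info
example = vanillaInfo `setArity'` 2 `setCaf'` False
\end{code}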
+ +\begin{code} +data IdInfo + = IdInfo { + arityInfo :: !ArityInfo, -- Its arity + specInfo :: SpecInfo, -- Specialisations of this function which exist +#ifdef OLD_STRICTNESS + cprInfo :: CprInfo, -- Function always constructs a product result + demandInfo :: Demand.Demand, -- Whether or not it is definitely demanded + strictnessInfo :: StrictnessInfo, -- Strictness properties +#endif + workerInfo :: WorkerInfo, -- Pointer to Worker Function + -- Within one module this is irrelevant; the + -- inlining of a worker is handled via the Unfolding + -- WorkerInfo is used *only* to indicate the form of + -- the RHS, so that interface files don't actually + -- need to contain the RHS; it can be derived from + -- the strictness info + + unfoldingInfo :: Unfolding, -- Its unfolding + cafInfo :: CafInfo, -- CAF info + lbvarInfo :: LBVarInfo, -- Info about a lambda-bound variable + inlinePragInfo :: InlinePragInfo, -- Inline pragma + occInfo :: OccInfo, -- How it occurs + + newStrictnessInfo :: Maybe StrictSig, -- Reason for Maybe: the DmdAnal phase needs to + -- know whether whether this is the first visit, + -- so it can assign botSig. Other customers want + -- topSig. So Nothing is good. + + newDemandInfo :: Maybe Demand -- Similarly we want to know if there's no + -- known demand yet, for when we are looking for + -- CPR info + } + +seqIdInfo :: IdInfo -> () +seqIdInfo (IdInfo {}) = () + +megaSeqIdInfo :: IdInfo -> () +megaSeqIdInfo info + = seqSpecInfo (specInfo info) `seq` + seqWorker (workerInfo info) `seq` + +-- Omitting this improves runtimes a little, presumably because +-- some unfoldings are not calculated at all +-- seqUnfolding (unfoldingInfo info) `seq` + + seqNewDemandInfo (newDemandInfo info) `seq` + seqNewStrictnessInfo (newStrictnessInfo info) `seq` + +#ifdef OLD_STRICTNESS + Demand.seqDemand (demandInfo info) `seq` + seqStrictnessInfo (strictnessInfo info) `seq` + seqCpr (cprInfo info) `seq` +#endif + + seqCaf (cafInfo info) `seq` + seqLBVar (lbvarInfo info) `seq` + seqOccInfo (occInfo info) +\end{code} + +Setters + +\begin{code} +setWorkerInfo info wk = wk `seq` info { workerInfo = wk } +setSpecInfo info sp = sp `seq` info { specInfo = sp } +setInlinePragInfo info pr = pr `seq` info { inlinePragInfo = pr } +setOccInfo info oc = oc `seq` info { occInfo = oc } +#ifdef OLD_STRICTNESS +setStrictnessInfo info st = st `seq` info { strictnessInfo = st } +#endif + -- Try to avoid spack leaks by seq'ing + +setUnfoldingInfoLazily info uf -- Lazy variant to avoid looking at the + = -- unfolding of an imported Id unless necessary + info { unfoldingInfo = uf } -- (In this case the demand-zapping is redundant.) + +setUnfoldingInfo info uf + -- We do *not* seq on the unfolding info, For some reason, doing so + -- actually increases residency significantly. 
+ = info { unfoldingInfo = uf } + +#ifdef OLD_STRICTNESS +setDemandInfo info dd = info { demandInfo = dd } +setCprInfo info cp = info { cprInfo = cp } +#endif + +setArityInfo info ar = info { arityInfo = ar } +setCafInfo info caf = info { cafInfo = caf } + +setLBVarInfo info lb = {-lb `seq`-} info { lbvarInfo = lb } + +setNewDemandInfo info dd = dd `seq` info { newDemandInfo = dd } +setNewStrictnessInfo info dd = dd `seq` info { newStrictnessInfo = dd } +\end{code} + + +\begin{code} +vanillaIdInfo :: IdInfo +vanillaIdInfo + = IdInfo { + cafInfo = vanillaCafInfo, + arityInfo = unknownArity, +#ifdef OLD_STRICTNESS + cprInfo = NoCPRInfo, + demandInfo = wwLazy, + strictnessInfo = NoStrictnessInfo, +#endif + specInfo = emptySpecInfo, + workerInfo = NoWorker, + unfoldingInfo = noUnfolding, + lbvarInfo = NoLBVarInfo, + inlinePragInfo = AlwaysActive, + occInfo = NoOccInfo, + newDemandInfo = Nothing, + newStrictnessInfo = Nothing + } + +noCafIdInfo = vanillaIdInfo `setCafInfo` NoCafRefs + -- Used for built-in type Ids in MkId. +\end{code} + + +%************************************************************************ +%* * +\subsection[arity-IdInfo]{Arity info about an @Id@} +%* * +%************************************************************************ + +For locally-defined Ids, the code generator maintains its own notion +of their arities; so it should not be asking... (but other things +besides the code-generator need arity info!) + +\begin{code} +type ArityInfo = Arity + -- A partial application of this Id to up to n-1 value arguments + -- does essentially no work. That is not necessarily the + -- same as saying that it has n leading lambdas, because coerces + -- may get in the way. + + -- The arity might increase later in the compilation process, if + -- an extra lambda floats up to the binding site. + +unknownArity = 0 :: Arity + +ppArityInfo 0 = empty +ppArityInfo n = hsep [ptext SLIT("Arity"), int n] +\end{code} + +%************************************************************************ +%* * +\subsection{Inline-pragma information} +%* * +%************************************************************************ + +\begin{code} +type InlinePragInfo = Activation + -- Tells when the inlining is active + -- When it is active the thing may be inlined, depending on how + -- big it is. + -- + -- If there was an INLINE pragma, then as a separate matter, the + -- RHS will have been made to look small with a CoreSyn Inline Note + + -- The default InlinePragInfo is AlwaysActive, so the info serves + -- entirely as a way to inhibit inlining until we want it +\end{code} + + +%************************************************************************ +%* * + SpecInfo +%* * +%************************************************************************ + +\begin{code} +-- CoreRules is used only in an idSpecialisation (move to IdInfo?) 
+data SpecInfo + = SpecInfo [CoreRule] VarSet -- Locally-defined free vars of RHSs + +emptySpecInfo :: SpecInfo +emptySpecInfo = SpecInfo [] emptyVarSet + +isEmptySpecInfo :: SpecInfo -> Bool +isEmptySpecInfo (SpecInfo rs _) = null rs + +specInfoFreeVars :: SpecInfo -> VarSet +specInfoFreeVars (SpecInfo _ fvs) = fvs + +specInfoRules :: SpecInfo -> [CoreRule] +specInfoRules (SpecInfo rules _) = rules + +seqSpecInfo (SpecInfo rules fvs) = seqRules rules `seq` seqVarSet fvs +\end{code} + + +%************************************************************************ +%* * +\subsection[worker-IdInfo]{Worker info about an @Id@} +%* * +%************************************************************************ + +If this Id has a worker then we store a reference to it. Worker +functions are generated by the worker/wrapper pass. This uses +information from strictness analysis. + +There might not be a worker, even for a strict function, because: +(a) the function might be small enough to inline, so no need + for w/w split +(b) the strictness info might be "SSS" or something, so no w/w split. + +Sometimes the arity of a wrapper changes from the original arity from +which it was generated, so we always emit the "original" arity into +the interface file, as part of the worker info. + +How can this happen? Sometimes we get + f = coerce t (\x y -> $wf x y) +at the moment of w/w split; but the eta reducer turns it into + f = coerce t $wf +which is perfectly fine except that the exposed arity so far as +the code generator is concerned (zero) differs from the arity +when we did the split (2). + +All this arises because we use 'arity' to mean "exactly how many +top level lambdas are there" in interface files; but during the +compilation of this module it means "how many things can I apply +this to". + +\begin{code} + +data WorkerInfo = NoWorker + | HasWorker Id Arity + -- The Arity is the arity of the *wrapper* at the moment of the + -- w/w split. See notes above. + +seqWorker :: WorkerInfo -> () +seqWorker (HasWorker id a) = id `seq` a `seq` () +seqWorker NoWorker = () + +ppWorkerInfo NoWorker = empty +ppWorkerInfo (HasWorker wk_id _) = ptext SLIT("Worker") <+> ppr wk_id + +workerExists :: WorkerInfo -> Bool +workerExists NoWorker = False +workerExists (HasWorker _ _) = True + +workerId :: WorkerInfo -> Id +workerId (HasWorker id _) = id + +wrapperArity :: WorkerInfo -> Arity +wrapperArity (HasWorker _ a) = a +\end{code} + + +%************************************************************************ +%* * +\subsection[CG-IdInfo]{Code generator-related information} +%* * +%************************************************************************ + +\begin{code} +-- CafInfo is used to build Static Reference Tables (see simplStg/SRT.lhs). + +data CafInfo + = MayHaveCafRefs -- either: + -- (1) A function or static constructor + -- that refers to one or more CAFs, + -- (2) A real live CAF + + | NoCafRefs -- A function or static constructor + -- that refers to no CAFs. + +vanillaCafInfo = MayHaveCafRefs -- Definitely safe + +mayHaveCafRefs MayHaveCafRefs = True +mayHaveCafRefs _ = False + +seqCaf c = c `seq` () + +ppCafInfo NoCafRefs = ptext SLIT("NoCafRefs") +ppCafInfo MayHaveCafRefs = empty +\end{code} + +%************************************************************************ +%* * +\subsection[cpr-IdInfo]{Constructed Product Result info about an @Id@} +%* * +%************************************************************************ + +If the @Id@ is a function then it may have CPR info. 
A CPR analysis +phase detects whether: + +\begin{enumerate} +\item +The function's return value has a product type, i.e. an algebraic type +with a single constructor. Examples of such types are tuples and boxed +primitive values. +\item +The function always 'constructs' the value that it is returning. It +must do this on every path through, and it's OK if it calls another +function which constructs the result. +\end{enumerate} + +If this is the case then we store a template which tells us the +function has the CPR property and which components of the result are +also CPRs. + +\begin{code} +#ifdef OLD_STRICTNESS +data CprInfo + = NoCPRInfo + | ReturnsCPR -- Yes, this function returns a constructed product + -- Implicitly, this means "after the function has been applied + -- to all its arguments", so the worker/wrapper builder in + -- WwLib.mkWWcpr checks that that it is indeed saturated before + -- making use of the CPR info + + -- We used to keep nested info about sub-components, but + -- we never used it so I threw it away + +seqCpr :: CprInfo -> () +seqCpr ReturnsCPR = () +seqCpr NoCPRInfo = () + +noCprInfo = NoCPRInfo + +ppCprInfo NoCPRInfo = empty +ppCprInfo ReturnsCPR = ptext SLIT("__M") + +instance Outputable CprInfo where + ppr = ppCprInfo + +instance Show CprInfo where + showsPrec p c = showsPrecSDoc p (ppr c) +#endif +\end{code} + + +%************************************************************************ +%* * +\subsection[lbvar-IdInfo]{Lambda-bound var info about an @Id@} +%* * +%************************************************************************ + +If the @Id@ is a lambda-bound variable then it may have lambda-bound +var info. Sometimes we know whether the lambda binding this var is a +``one-shot'' lambda; that is, whether it is applied at most once. + +This information may be useful in optimisation, as computations may +safely be floated inside such a lambda without risk of duplicating +work. + +\begin{code} +data LBVarInfo = NoLBVarInfo + | IsOneShotLambda -- The lambda is applied at most once). 
+ +seqLBVar l = l `seq` () +\end{code} + +\begin{code} +hasNoLBVarInfo NoLBVarInfo = True +hasNoLBVarInfo IsOneShotLambda = False + +noLBVarInfo = NoLBVarInfo + +pprLBVarInfo NoLBVarInfo = empty +pprLBVarInfo IsOneShotLambda = ptext SLIT("OneShot") + +instance Outputable LBVarInfo where + ppr = pprLBVarInfo + +instance Show LBVarInfo where + showsPrec p c = showsPrecSDoc p (ppr c) +\end{code} + + +%************************************************************************ +%* * +\subsection{Bulk operations on IdInfo} +%* * +%************************************************************************ + +@zapLamInfo@ is used for lambda binders that turn out to to be +part of an unsaturated lambda + +\begin{code} +zapLamInfo :: IdInfo -> Maybe IdInfo +zapLamInfo info@(IdInfo {occInfo = occ, newDemandInfo = demand}) + | is_safe_occ occ && is_safe_dmd demand + = Nothing + | otherwise + = Just (info {occInfo = safe_occ, newDemandInfo = Nothing}) + where + -- The "unsafe" occ info is the ones that say I'm not in a lambda + -- because that might not be true for an unsaturated lambda + is_safe_occ (OneOcc in_lam _ _) = in_lam + is_safe_occ other = True + + safe_occ = case occ of + OneOcc _ once int_cxt -> OneOcc insideLam once int_cxt + other -> occ + + is_safe_dmd Nothing = True + is_safe_dmd (Just dmd) = not (isStrictDmd dmd) +\end{code} + +\begin{code} +zapDemandInfo :: IdInfo -> Maybe IdInfo +zapDemandInfo info@(IdInfo {newDemandInfo = dmd}) + | isJust dmd = Just (info {newDemandInfo = Nothing}) + | otherwise = Nothing +\end{code} + diff --git a/compiler/basicTypes/IdInfo.lhs-boot b/compiler/basicTypes/IdInfo.lhs-boot new file mode 100644 index 0000000000..90cf36f90b --- /dev/null +++ b/compiler/basicTypes/IdInfo.lhs-boot @@ -0,0 +1,9 @@ +\begin{code} +module IdInfo where + +data IdInfo +data GlobalIdDetails + +notGlobalId :: GlobalIdDetails +seqIdInfo :: IdInfo -> () +\end{code}
\ No newline at end of file diff --git a/compiler/basicTypes/Literal.lhs b/compiler/basicTypes/Literal.lhs new file mode 100644 index 0000000000..e83ea9db74 --- /dev/null +++ b/compiler/basicTypes/Literal.lhs @@ -0,0 +1,405 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1998 +% +\section[Literal]{@Literal@: Machine literals (unboxed, of course)} + +\begin{code} +module Literal + ( Literal(..) -- Exported to ParseIface + , mkMachInt, mkMachWord + , mkMachInt64, mkMachWord64, mkStringLit + , litSize + , litIsDupable, litIsTrivial + , literalType + , hashLiteral + + , inIntRange, inWordRange, tARGET_MAX_INT, inCharRange + , isZeroLit + + , word2IntLit, int2WordLit + , narrow8IntLit, narrow16IntLit, narrow32IntLit + , narrow8WordLit, narrow16WordLit, narrow32WordLit + , char2IntLit, int2CharLit + , float2IntLit, int2FloatLit, double2IntLit, int2DoubleLit + , nullAddrLit, float2DoubleLit, double2FloatLit + ) where + +#include "HsVersions.h" + +import TysPrim ( charPrimTy, addrPrimTy, floatPrimTy, doublePrimTy, + intPrimTy, wordPrimTy, int64PrimTy, word64PrimTy + ) +import Type ( Type ) +import Outputable +import FastTypes +import FastString +import Binary + +import Ratio ( numerator ) +import FastString ( uniqueOfFS, lengthFS ) +import DATA_INT ( Int8, Int16, Int32 ) +import DATA_WORD ( Word8, Word16, Word32 ) +import Char ( ord, chr ) +\end{code} + + + +%************************************************************************ +%* * +\subsection{Sizes} +%* * +%************************************************************************ + +If we're compiling with GHC (and we're not cross-compiling), then we +know that minBound and maxBound :: Int are the right values for the +target architecture. Otherwise, we assume -2^31 and 2^31-1 +respectively (which will be wrong on a 64-bit machine). + +\begin{code} +tARGET_MIN_INT, tARGET_MAX_INT, tARGET_MAX_WORD :: Integer +#if __GLASGOW_HASKELL__ +tARGET_MIN_INT = toInteger (minBound :: Int) +tARGET_MAX_INT = toInteger (maxBound :: Int) +#else +tARGET_MIN_INT = -2147483648 +tARGET_MAX_INT = 2147483647 +#endif +tARGET_MAX_WORD = (tARGET_MAX_INT * 2) + 1 + +tARGET_MAX_CHAR :: Int +tARGET_MAX_CHAR = 0x10ffff +\end{code} + + +%************************************************************************ +%* * +\subsection{Literals} +%* * +%************************************************************************ + +So-called @Literals@ are {\em either}: +\begin{itemize} +\item +An unboxed (``machine'') literal (type: @IntPrim@, @FloatPrim@, etc.), +which is presumed to be surrounded by appropriate constructors +(@mKINT@, etc.), so that the overall thing makes sense. +\item +An Integer, Rational, or String literal whose representation we are +{\em uncommitted} about; i.e., the surrounding with constructors, +function applications, etc., etc., has not yet been done. +\end{itemize} + +\begin{code} +data Literal + = ------------------ + -- First the primitive guys + MachChar Char -- Char# At least 31 bits + + | MachStr FastString -- A string-literal: stored and emitted + -- UTF-8 encoded, we'll arrange to decode it + -- at runtime. Also emitted with a '\0' + -- terminator. + + | MachNullAddr -- the NULL pointer, the only pointer value + -- that can be represented as a Literal. 
+ + | MachInt Integer -- Int# At least WORD_SIZE_IN_BITS bits + | MachInt64 Integer -- Int64# At least 64 bits + | MachWord Integer -- Word# At least WORD_SIZE_IN_BITS bits + | MachWord64 Integer -- Word64# At least 64 bits + + | MachFloat Rational + | MachDouble Rational + + -- MachLabel is used (only) for the literal derived from a + -- "foreign label" declaration. + -- string argument is the name of a symbol. This literal + -- refers to the *address* of the label. + | MachLabel FastString -- always an Addr# + (Maybe Int) -- the size (in bytes) of the arguments + -- the label expects. Only applicable with + -- 'stdcall' labels. + -- Just x => "@<x>" will be appended to label + -- name when emitting asm. +\end{code} + +Binary instance + +\begin{code} +instance Binary Literal where + put_ bh (MachChar aa) = do putByte bh 0; put_ bh aa + put_ bh (MachStr ab) = do putByte bh 1; put_ bh ab + put_ bh (MachNullAddr) = do putByte bh 2 + put_ bh (MachInt ad) = do putByte bh 3; put_ bh ad + put_ bh (MachInt64 ae) = do putByte bh 4; put_ bh ae + put_ bh (MachWord af) = do putByte bh 5; put_ bh af + put_ bh (MachWord64 ag) = do putByte bh 6; put_ bh ag + put_ bh (MachFloat ah) = do putByte bh 7; put_ bh ah + put_ bh (MachDouble ai) = do putByte bh 8; put_ bh ai + put_ bh (MachLabel aj mb) = do putByte bh 9; put_ bh aj ; put_ bh mb + get bh = do + h <- getByte bh + case h of + 0 -> do + aa <- get bh + return (MachChar aa) + 1 -> do + ab <- get bh + return (MachStr ab) + 2 -> do + return (MachNullAddr) + 3 -> do + ad <- get bh + return (MachInt ad) + 4 -> do + ae <- get bh + return (MachInt64 ae) + 5 -> do + af <- get bh + return (MachWord af) + 6 -> do + ag <- get bh + return (MachWord64 ag) + 7 -> do + ah <- get bh + return (MachFloat ah) + 8 -> do + ai <- get bh + return (MachDouble ai) + 9 -> do + aj <- get bh + mb <- get bh + return (MachLabel aj mb) +\end{code} + +\begin{code} +instance Outputable Literal where + ppr lit = pprLit lit + +instance Show Literal where + showsPrec p lit = showsPrecSDoc p (ppr lit) + +instance Eq Literal where + a == b = case (a `compare` b) of { EQ -> True; _ -> False } + a /= b = case (a `compare` b) of { EQ -> False; _ -> True } + +instance Ord Literal where + a <= b = case (a `compare` b) of { LT -> True; EQ -> True; GT -> False } + a < b = case (a `compare` b) of { LT -> True; EQ -> False; GT -> False } + a >= b = case (a `compare` b) of { LT -> False; EQ -> True; GT -> True } + a > b = case (a `compare` b) of { LT -> False; EQ -> False; GT -> True } + compare a b = cmpLit a b +\end{code} + + + Construction + ~~~~~~~~~~~~ +\begin{code} +mkMachInt, mkMachWord, mkMachInt64, mkMachWord64 :: Integer -> Literal + +mkMachInt x = -- ASSERT2( inIntRange x, integer x ) + -- Not true: you can write out of range Int# literals + -- For example, one can write (intToWord# 0xffff0000) to + -- get a particular Word bit-pattern, and there's no other + -- convenient way to write such literals, which is why we allow it. 
+ MachInt x +mkMachWord x = -- ASSERT2( inWordRange x, integer x ) + MachWord x +mkMachInt64 x = MachInt64 x +mkMachWord64 x = MachWord64 x + +mkStringLit :: String -> Literal +mkStringLit s = MachStr (mkFastString s) -- stored UTF-8 encoded + +inIntRange, inWordRange :: Integer -> Bool +inIntRange x = x >= tARGET_MIN_INT && x <= tARGET_MAX_INT +inWordRange x = x >= 0 && x <= tARGET_MAX_WORD + +inCharRange :: Char -> Bool +inCharRange c = c >= '\0' && c <= chr tARGET_MAX_CHAR + +isZeroLit :: Literal -> Bool +isZeroLit (MachInt 0) = True +isZeroLit (MachInt64 0) = True +isZeroLit (MachWord 0) = True +isZeroLit (MachWord64 0) = True +isZeroLit (MachFloat 0) = True +isZeroLit (MachDouble 0) = True +isZeroLit other = False +\end{code} + + Coercions + ~~~~~~~~~ +\begin{code} +word2IntLit, int2WordLit, + narrow8IntLit, narrow16IntLit, narrow32IntLit, + narrow8WordLit, narrow16WordLit, narrow32WordLit, + char2IntLit, int2CharLit, + float2IntLit, int2FloatLit, double2IntLit, int2DoubleLit, + float2DoubleLit, double2FloatLit + :: Literal -> Literal + +word2IntLit (MachWord w) + | w > tARGET_MAX_INT = MachInt (w - tARGET_MAX_WORD - 1) + | otherwise = MachInt w + +int2WordLit (MachInt i) + | i < 0 = MachWord (1 + tARGET_MAX_WORD + i) -- (-1) ---> tARGET_MAX_WORD + | otherwise = MachWord i + +narrow8IntLit (MachInt i) = MachInt (toInteger (fromInteger i :: Int8)) +narrow16IntLit (MachInt i) = MachInt (toInteger (fromInteger i :: Int16)) +narrow32IntLit (MachInt i) = MachInt (toInteger (fromInteger i :: Int32)) +narrow8WordLit (MachWord w) = MachWord (toInteger (fromInteger w :: Word8)) +narrow16WordLit (MachWord w) = MachWord (toInteger (fromInteger w :: Word16)) +narrow32WordLit (MachWord w) = MachWord (toInteger (fromInteger w :: Word32)) + +char2IntLit (MachChar c) = MachInt (toInteger (ord c)) +int2CharLit (MachInt i) = MachChar (chr (fromInteger i)) + +float2IntLit (MachFloat f) = MachInt (truncate f) +int2FloatLit (MachInt i) = MachFloat (fromInteger i) + +double2IntLit (MachDouble f) = MachInt (truncate f) +int2DoubleLit (MachInt i) = MachDouble (fromInteger i) + +float2DoubleLit (MachFloat f) = MachDouble f +double2FloatLit (MachDouble d) = MachFloat d + +nullAddrLit :: Literal +nullAddrLit = MachNullAddr +\end{code} + + Predicates + ~~~~~~~~~~ +\begin{code} +litIsTrivial :: Literal -> Bool +-- True if there is absolutely no penalty to duplicating the literal +-- c.f. CoreUtils.exprIsTrivial +-- False principally of strings +litIsTrivial (MachStr _) = False +litIsTrivial other = True + +litIsDupable :: Literal -> Bool +-- True if code space does not go bad if we duplicate this literal +-- c.f. 
CoreUtils.exprIsDupable +-- Currently we treat it just like litIsTrivial +litIsDupable (MachStr _) = False +litIsDupable other = True + +litSize :: Literal -> Int +-- Used by CoreUnfold.sizeExpr +litSize (MachStr str) = 1 + ((lengthFS str + 3) `div` 4) + -- Every literal has size at least 1, otherwise + -- f "x" + -- might be too small + -- [Sept03: make literal strings a bit bigger to avoid fruitless + -- duplication of little strings] +litSize _other = 1 +\end{code} + + Types + ~~~~~ +\begin{code} +literalType :: Literal -> Type +literalType MachNullAddr = addrPrimTy +literalType (MachChar _) = charPrimTy +literalType (MachStr _) = addrPrimTy +literalType (MachInt _) = intPrimTy +literalType (MachWord _) = wordPrimTy +literalType (MachInt64 _) = int64PrimTy +literalType (MachWord64 _) = word64PrimTy +literalType (MachFloat _) = floatPrimTy +literalType (MachDouble _) = doublePrimTy +literalType (MachLabel _ _) = addrPrimTy +\end{code} + + + Comparison + ~~~~~~~~~~ +\begin{code} +cmpLit (MachChar a) (MachChar b) = a `compare` b +cmpLit (MachStr a) (MachStr b) = a `compare` b +cmpLit (MachNullAddr) (MachNullAddr) = EQ +cmpLit (MachInt a) (MachInt b) = a `compare` b +cmpLit (MachWord a) (MachWord b) = a `compare` b +cmpLit (MachInt64 a) (MachInt64 b) = a `compare` b +cmpLit (MachWord64 a) (MachWord64 b) = a `compare` b +cmpLit (MachFloat a) (MachFloat b) = a `compare` b +cmpLit (MachDouble a) (MachDouble b) = a `compare` b +cmpLit (MachLabel a _) (MachLabel b _) = a `compare` b +cmpLit lit1 lit2 | litTag lit1 <# litTag lit2 = LT + | otherwise = GT + +litTag (MachChar _) = _ILIT(1) +litTag (MachStr _) = _ILIT(2) +litTag (MachNullAddr) = _ILIT(3) +litTag (MachInt _) = _ILIT(4) +litTag (MachWord _) = _ILIT(5) +litTag (MachInt64 _) = _ILIT(6) +litTag (MachWord64 _) = _ILIT(7) +litTag (MachFloat _) = _ILIT(8) +litTag (MachDouble _) = _ILIT(9) +litTag (MachLabel _ _) = _ILIT(10) +\end{code} + + Printing + ~~~~~~~~ +* MachX (i.e. unboxed) things are printed unadornded (e.g. 3, 'a', "foo") + exceptions: MachFloat gets an initial keyword prefix. + +\begin{code} +pprLit (MachChar ch) = pprHsChar ch +pprLit (MachStr s) = pprHsString s +pprLit (MachInt i) = pprIntVal i +pprLit (MachInt64 i) = ptext SLIT("__int64") <+> integer i +pprLit (MachWord w) = ptext SLIT("__word") <+> integer w +pprLit (MachWord64 w) = ptext SLIT("__word64") <+> integer w +pprLit (MachFloat f) = ptext SLIT("__float") <+> rational f +pprLit (MachDouble d) = rational d +pprLit (MachNullAddr) = ptext SLIT("__NULL") +pprLit (MachLabel l mb) = ptext SLIT("__label") <+> + case mb of + Nothing -> pprHsString l + Just x -> doubleQuotes (text (unpackFS l ++ '@':show x)) + +pprIntVal :: Integer -> SDoc +-- Print negative integers with parens to be sure it's unambiguous +pprIntVal i | i < 0 = parens (integer i) + | otherwise = integer i +\end{code} + + +%************************************************************************ +%* * +\subsection{Hashing} +%* * +%************************************************************************ + +Hash values should be zero or a positive integer. No negatives please. +(They mess up the UniqFM for some reason.) 
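
As a small editorial aside (not part of the committed file), the reason zero
and negative values are banned becomes clear once hash values are combined by
multiplication, as the comment on hashInteger below notes. Here combineHash
is a hypothetical helper, used purely for illustration.

\begin{code}
-- Illustrative sketch only: 'combineHash' is a hypothetical helper,
-- not a function defined in this module.
combineHash :: Int -> Int -> Int
combineHash a b = a * b

-- A zero component annihilates every combination it touches:
--   combineHash 0 17 == 0,  combineHash 0 42 == 0
-- whereas strictly positive components preserve distinctions:
--   combineHash 3 17 == 51, combineHash 3 42 == 126
\end{code}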
+ +\begin{code} +hashLiteral :: Literal -> Int +hashLiteral (MachChar c) = ord c + 1000 -- Keep it out of range of common ints +hashLiteral (MachStr s) = hashFS s +hashLiteral (MachNullAddr) = 0 +hashLiteral (MachInt i) = hashInteger i +hashLiteral (MachInt64 i) = hashInteger i +hashLiteral (MachWord i) = hashInteger i +hashLiteral (MachWord64 i) = hashInteger i +hashLiteral (MachFloat r) = hashRational r +hashLiteral (MachDouble r) = hashRational r +hashLiteral (MachLabel s _) = hashFS s + +hashRational :: Rational -> Int +hashRational r = hashInteger (numerator r) + +hashInteger :: Integer -> Int +hashInteger i = 1 + abs (fromInteger (i `rem` 10000)) + -- The 1+ is to avoid zero, which is a Bad Number + -- since we use * to combine hash values + +hashFS :: FastString -> Int +hashFS s = iBox (uniqueOfFS s) +\end{code} diff --git a/compiler/basicTypes/MkId.hi-boot-5 b/compiler/basicTypes/MkId.hi-boot-5 new file mode 100644 index 0000000000..ff901a5840 --- /dev/null +++ b/compiler/basicTypes/MkId.hi-boot-5 @@ -0,0 +1,3 @@ +__interface MkId 1 0 where +__export MkId mkDataConIds ; +1 mkDataConIds :: Name.Name -> Name.Name -> DataCon.DataCon -> DataCon.DataConIds ; diff --git a/compiler/basicTypes/MkId.hi-boot-6 b/compiler/basicTypes/MkId.hi-boot-6 new file mode 100644 index 0000000000..d3f22527f3 --- /dev/null +++ b/compiler/basicTypes/MkId.hi-boot-6 @@ -0,0 +1,5 @@ +module MkId where + +mkDataConIds :: Name.Name -> Name.Name -> DataCon.DataCon -> DataCon.DataConIds + + diff --git a/compiler/basicTypes/MkId.lhs b/compiler/basicTypes/MkId.lhs new file mode 100644 index 0000000000..84b3546e62 --- /dev/null +++ b/compiler/basicTypes/MkId.lhs @@ -0,0 +1,1044 @@ +% +% (c) The AQUA Project, Glasgow University, 1998 +% +\section[StdIdInfo]{Standard unfoldings} + +This module contains definitions for the IdInfo for things that +have a standard form, namely: + + * data constructors + * record selectors + * method and superclass selectors + * primitive operations + +\begin{code} +module MkId ( + mkDictFunId, mkDefaultMethodId, + mkDictSelId, + + mkDataConIds, + mkRecordSelId, + mkPrimOpId, mkFCallId, + + mkReboxingAlt, mkNewTypeBody, + + -- And some particular Ids; see below for why they are wired in + wiredInIds, ghcPrimIds, + unsafeCoerceId, realWorldPrimId, voidArgId, nullAddrId, seqId, + lazyId, lazyIdUnfolding, lazyIdKey, + + mkRuntimeErrorApp, + rEC_CON_ERROR_ID, iRREFUT_PAT_ERROR_ID, rUNTIME_ERROR_ID, + nON_EXHAUSTIVE_GUARDS_ERROR_ID, nO_METHOD_BINDING_ERROR_ID, + pAT_ERROR_ID, eRROR_ID, + + unsafeCoerceName + ) where + +#include "HsVersions.h" + + +import BasicTypes ( Arity, StrictnessMark(..), isMarkedUnboxed, isMarkedStrict ) +import Rules ( mkSpecInfo ) +import TysPrim ( openAlphaTyVars, alphaTyVar, alphaTy, + realWorldStatePrimTy, addrPrimTy + ) +import TysWiredIn ( charTy, mkListTy ) +import PrelRules ( primOpRules ) +import Type ( TyThing(..), mkForAllTy, tyVarsOfTypes ) +import TcType ( Type, ThetaType, mkDictTy, mkPredTys, mkPredTy, + mkTyConApp, mkTyVarTys, mkClassPred, + mkFunTys, mkFunTy, mkSigmaTy, tcSplitSigmaTy, + isUnLiftedType, mkForAllTys, mkTyVarTy, tyVarsOfType, + tcSplitFunTys, tcSplitForAllTys, dataConsStupidTheta + ) +import CoreUtils ( exprType ) +import CoreUnfold ( mkTopUnfolding, mkCompulsoryUnfolding ) +import Literal ( nullAddrLit, mkStringLit ) +import TyCon ( TyCon, isNewTyCon, tyConDataCons, FieldLabel, + tyConStupidTheta, isProductTyCon, isDataTyCon, isRecursiveTyCon ) +import Class ( Class, classTyCon, classSelIds ) +import Var ( Id, TyVar, Var ) +import VarSet 
( isEmptyVarSet, subVarSet, varSetElems ) +import Name ( mkFCallName, mkWiredInName, Name, BuiltInSyntax(..) ) +import OccName ( mkOccNameFS, varName ) +import PrimOp ( PrimOp, primOpSig, primOpOcc, primOpTag ) +import ForeignCall ( ForeignCall ) +import DataCon ( DataCon, DataConIds(..), dataConTyVars, + dataConFieldLabels, dataConRepArity, dataConResTys, + dataConRepArgTys, dataConRepType, + dataConSig, dataConStrictMarks, dataConExStricts, + splitProductType, isVanillaDataCon, dataConFieldType, + dataConInstOrigArgTys + ) +import Id ( idType, mkGlobalId, mkVanillaGlobal, mkSysLocal, + mkTemplateLocals, mkTemplateLocalsNum, mkExportedLocalId, + mkTemplateLocal, idName + ) +import IdInfo ( IdInfo, noCafIdInfo, setUnfoldingInfo, + setArityInfo, setSpecInfo, setCafInfo, + setAllStrictnessInfo, vanillaIdInfo, + GlobalIdDetails(..), CafInfo(..) + ) +import NewDemand ( mkStrictSig, DmdResult(..), + mkTopDmdType, topDmd, evalDmd, lazyDmd, retCPR, + Demand(..), Demands(..) ) +import DmdAnal ( dmdAnalTopRhs ) +import CoreSyn +import Unique ( mkBuiltinUnique, mkPrimOpIdUnique ) +import Maybes +import PrelNames +import Util ( dropList, isSingleton ) +import Outputable +import FastString +import ListSetOps ( assoc ) +\end{code} + +%************************************************************************ +%* * +\subsection{Wired in Ids} +%* * +%************************************************************************ + +\begin{code} +wiredInIds + = [ -- These error-y things are wired in because we don't yet have + -- a way to express in an interface file that the result type variable + -- is 'open'; that is can be unified with an unboxed type + -- + -- [The interface file format now carry such information, but there's + -- no way yet of expressing at the definition site for these + -- error-reporting functions that they have an 'open' + -- result type. -- sof 1/99] + + eRROR_ID, -- This one isn't used anywhere else in the compiler + -- But we still need it in wiredInIds so that when GHC + -- compiles a program that mentions 'error' we don't + -- import its type from the interface file; we just get + -- the Id defined here. Which has an 'open-tyvar' type. + + rUNTIME_ERROR_ID, + iRREFUT_PAT_ERROR_ID, + nON_EXHAUSTIVE_GUARDS_ERROR_ID, + nO_METHOD_BINDING_ERROR_ID, + pAT_ERROR_ID, + rEC_CON_ERROR_ID, + + lazyId + ] ++ ghcPrimIds + +-- These Ids are exported from GHC.Prim +ghcPrimIds + = [ -- These can't be defined in Haskell, but they have + -- perfectly reasonable unfoldings in Core + realWorldPrimId, + unsafeCoerceId, + nullAddrId, + seqId + ] +\end{code} + +%************************************************************************ +%* * +\subsection{Data constructors} +%* * +%************************************************************************ + +The wrapper for a constructor is an ordinary top-level binding that evaluates +any strict args, unboxes any args that are going to be flattened, and calls +the worker. + +We're going to build a constructor that looks like: + + data (Data a, C b) => T a b = T1 !a !Int b + + T1 = /\ a b -> + \d1::Data a, d2::C b -> + \p q r -> case p of { p -> + case q of { q -> + Con T1 [a,b] [p,q,r]}} + +Notice that + +* d2 is thrown away --- a context in a data decl is used to make sure + one *could* construct dictionaries at the site the constructor + is used, but the dictionary isn't actually used. + +* We have to check that we can construct Data dictionaries for + the types a and Int. Once we've done that we can throw d1 away too. + +* We use (case p of q -> ...) 
to evaluate p, rather than "seq" because + all that matters is that the arguments are evaluated. "seq" is + very careful to preserve evaluation order, which we don't need + to be here. + + You might think that we could simply give constructors some strictness + info, like PrimOps, and let CoreToStg do the let-to-case transformation. + But we don't do that because in the case of primops and functions strictness + is a *property* not a *requirement*. In the case of constructors we need to + do something active to evaluate the argument. + + Making an explicit case expression allows the simplifier to eliminate + it in the (common) case where the constructor arg is already evaluated. + + +\begin{code} +mkDataConIds :: Name -> Name -> DataCon -> DataConIds + -- Makes the *worker* for the data constructor; that is, the function + -- that takes the reprsentation arguments and builds the constructor. +mkDataConIds wrap_name wkr_name data_con + | isNewTyCon tycon + = NewDC nt_wrap_id + + | any isMarkedStrict all_strict_marks -- Algebraic, needs wrapper + = AlgDC (Just alg_wrap_id) wrk_id + + | otherwise -- Algebraic, no wrapper + = AlgDC Nothing wrk_id + where + (tyvars, theta, orig_arg_tys, tycon, res_tys) = dataConSig data_con + + dict_tys = mkPredTys theta + all_arg_tys = dict_tys ++ orig_arg_tys + result_ty = mkTyConApp tycon res_tys + + wrap_ty = mkForAllTys tyvars (mkFunTys all_arg_tys result_ty) + -- We used to include the stupid theta in the wrapper's args + -- but now we don't. Instead the type checker just injects these + -- extra constraints where necessary. + + ----------- Worker (algebraic data types only) -------------- + wrk_id = mkGlobalId (DataConWorkId data_con) wkr_name + (dataConRepType data_con) wkr_info + + wkr_arity = dataConRepArity data_con + wkr_info = noCafIdInfo + `setArityInfo` wkr_arity + `setAllStrictnessInfo` Just wkr_sig + `setUnfoldingInfo` evaldUnfolding -- Record that it's evaluated, + -- even if arity = 0 + + wkr_sig = mkStrictSig (mkTopDmdType (replicate wkr_arity topDmd) cpr_info) + -- Notice that we do *not* say the worker is strict + -- even if the data constructor is declared strict + -- e.g. data T = MkT !(Int,Int) + -- Why? Because the *wrapper* is strict (and its unfolding has case + -- expresssions that do the evals) but the *worker* itself is not. + -- If we pretend it is strict then when we see + -- case x of y -> $wMkT y + -- the simplifier thinks that y is "sure to be evaluated" (because + -- $wMkT is strict) and drops the case. No, $wMkT is not strict. + -- + -- When the simplifer sees a pattern + -- case e of MkT x -> ... + -- it uses the dataConRepStrictness of MkT to mark x as evaluated; + -- but that's fine... dataConRepStrictness comes from the data con + -- not from the worker Id. + + cpr_info | isProductTyCon tycon && + isDataTyCon tycon && + wkr_arity > 0 && + wkr_arity <= mAX_CPR_SIZE = retCPR + | otherwise = TopRes + -- RetCPR is only true for products that are real data types; + -- that is, not unboxed tuples or [non-recursive] newtypes + + ----------- Wrappers for newtypes -------------- + nt_wrap_id = mkGlobalId (DataConWrapId data_con) wrap_name wrap_ty nt_wrap_info + nt_wrap_info = noCafIdInfo -- The NoCaf-ness is set by noCafIdInfo + `setArityInfo` 1 -- Arity 1 + `setUnfoldingInfo` newtype_unf + newtype_unf = ASSERT( isVanillaDataCon data_con && + isSingleton orig_arg_tys ) + -- No existentials on a newtype, but it can have a context + -- e.g. newtype Eq a => T a = MkT (...) 
+ mkTopUnfolding $ Note InlineMe $ + mkLams tyvars $ Lam id_arg1 $ + mkNewTypeBody tycon result_ty (Var id_arg1) + + id_arg1 = mkTemplateLocal 1 (head orig_arg_tys) + + ----------- Wrappers for algebraic data types -------------- + alg_wrap_id = mkGlobalId (DataConWrapId data_con) wrap_name wrap_ty alg_wrap_info + alg_wrap_info = noCafIdInfo -- The NoCaf-ness is set by noCafIdInfo + `setArityInfo` alg_arity + -- It's important to specify the arity, so that partial + -- applications are treated as values + `setUnfoldingInfo` alg_unf + `setAllStrictnessInfo` Just wrap_sig + + all_strict_marks = dataConExStricts data_con ++ dataConStrictMarks data_con + wrap_sig = mkStrictSig (mkTopDmdType arg_dmds cpr_info) + arg_dmds = map mk_dmd all_strict_marks + mk_dmd str | isMarkedStrict str = evalDmd + | otherwise = lazyDmd + -- The Cpr info can be important inside INLINE rhss, where the + -- wrapper constructor isn't inlined. + -- And the argument strictness can be important too; we + -- may not inline a contructor when it is partially applied. + -- For example: + -- data W = C !Int !Int !Int + -- ...(let w = C x in ...(w p q)...)... + -- we want to see that w is strict in its two arguments + + alg_unf = mkTopUnfolding $ Note InlineMe $ + mkLams tyvars $ + mkLams dict_args $ mkLams id_args $ + foldr mk_case con_app + (zip (dict_args ++ id_args) all_strict_marks) + i3 [] + + con_app i rep_ids = mkApps (Var wrk_id) + (map varToCoreExpr (tyvars ++ reverse rep_ids)) + + (dict_args,i2) = mkLocals 1 dict_tys + (id_args,i3) = mkLocals i2 orig_arg_tys + alg_arity = i3-1 + + mk_case + :: (Id, StrictnessMark) -- Arg, strictness + -> (Int -> [Id] -> CoreExpr) -- Body + -> Int -- Next rep arg id + -> [Id] -- Rep args so far, reversed + -> CoreExpr + mk_case (arg,strict) body i rep_args + = case strict of + NotMarkedStrict -> body i (arg:rep_args) + MarkedStrict + | isUnLiftedType (idType arg) -> body i (arg:rep_args) + | otherwise -> + Case (Var arg) arg result_ty [(DEFAULT,[], body i (arg:rep_args))] + + MarkedUnboxed + -> case splitProductType "do_unbox" (idType arg) of + (tycon, tycon_args, con, tys) -> + Case (Var arg) arg result_ty + [(DataAlt con, + con_args, + body i' (reverse con_args ++ rep_args))] + where + (con_args, i') = mkLocals i tys + +mAX_CPR_SIZE :: Arity +mAX_CPR_SIZE = 10 +-- We do not treat very big tuples as CPR-ish: +-- a) for a start we get into trouble because there aren't +-- "enough" unboxed tuple types (a tiresome restriction, +-- but hard to fix), +-- b) more importantly, big unboxed tuples get returned mainly +-- on the stack, and are often then allocated in the heap +-- by the caller. So doing CPR for them may in fact make +-- things worse. + +mkLocals i tys = (zipWith mkTemplateLocal [i..i+n-1] tys, i+n) + where + n = length tys +\end{code} + + +%************************************************************************ +%* * +\subsection{Record selectors} +%* * +%************************************************************************ + +We're going to build a record selector unfolding that looks like this: + + data T a b c = T1 { ..., op :: a, ...} + | T2 { ..., op :: a, ...} + | T3 + + sel = /\ a b c -> \ d -> case d of + T1 ... x ... -> x + T2 ... x ... -> x + other -> error "..." + +Similarly for newtypes + + newtype N a = MkN { unN :: a->a } + + unN :: N a -> a -> a + unN n = coerce (a->a) n + +We need to take a little care if the field has a polymorphic type: + + data R = R { f :: forall a. a->a } + +Then we want + + f :: forall a. 
R -> a -> a + f = /\ a \ r = case r of + R f -> f a + +(not f :: R -> forall a. a->a, which gives the type inference mechanism +problems at call sites) + +Similarly for (recursive) newtypes + + newtype N = MkN { unN :: forall a. a->a } + + unN :: forall b. N -> b -> b + unN = /\b -> \n:N -> (coerce (forall a. a->a) n) + + +Note [Naughty record selectors] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +A "naughty" field is one for which we can't define a record +selector, because an existential type variable would escape. For example: + data T = forall a. MkT { x,y::a } +We obviously can't define + x (MkT v _) = v +Nevertheless we *do* put a RecordSelId into the type environment +so that if the user tries to use 'x' as a selector we can bleat +helpfully, rather than saying unhelpfully that 'x' is not in scope. +Hence the sel_naughty flag, to identify record selcectors that don't really exist. + +In general, a field is naughty if its type mentions a type variable that +isn't in the result type of the constructor. + +For GADTs, we require that all constructors with a common field 'f' have the same +result type (modulo alpha conversion). [Checked in TcTyClsDecls.checkValidTyCon] +E.g. + data T where + T1 { f :: a } :: T [a] + T2 { f :: a, y :: b } :: T [a] +and now the selector takes that type as its argument: + f :: forall a. T [a] -> a + f t = case t of + T1 { f = v } -> v + T2 { f = v } -> v +Note the forall'd tyvars of the selector are just the free tyvars +of the result type; there may be other tyvars in the constructor's +type (e.g. 'b' in T2). + +\begin{code} + +-- XXX - autrijus - +-- Plan: 1. Determine naughtiness by comparing field type vs result type +-- 2. Install naughty ones with selector_ty of type _|_ and fill in mzero for info +-- 3. If it's not naughty, do the normal plan. + +mkRecordSelId :: TyCon -> FieldLabel -> Id +mkRecordSelId tycon field_label + -- Assumes that all fields with the same field label have the same type + | is_naughty = naughty_id + | otherwise = sel_id + where + is_naughty = not (tyVarsOfType field_ty `subVarSet` tyvar_set) + sel_id_details = RecordSelId tycon field_label is_naughty + + -- Escapist case here for naughty construcotrs + -- We give it no IdInfo, and a type of forall a.a (never looked at) + naughty_id = mkGlobalId sel_id_details field_label forall_a_a noCafIdInfo + forall_a_a = mkForAllTy alphaTyVar (mkTyVarTy alphaTyVar) + + -- Normal case starts here + sel_id = mkGlobalId sel_id_details field_label selector_ty info + data_cons = tyConDataCons tycon + data_cons_w_field = filter has_field data_cons -- Can't be empty! + has_field con = field_label `elem` dataConFieldLabels con + + con1 = head data_cons_w_field + res_tys = dataConResTys con1 + tyvar_set = tyVarsOfTypes res_tys + tyvars = varSetElems tyvar_set + data_ty = mkTyConApp tycon res_tys + field_ty = dataConFieldType con1 field_label + + -- *Very* tiresomely, the selectors are (unnecessarily!) overloaded over + -- just the dictionaries in the types of the constructors that contain + -- the relevant field. [The Report says that pattern matching on a + -- constructor gives the same constraints as applying it.] Urgh. + -- + -- However, not all data cons have all constraints (because of + -- BuildTyCl.mkDataConStupidTheta). So we need to find all the data cons + -- involved in the pattern match and take the union of their constraints. 
+ stupid_dict_tys = mkPredTys (dataConsStupidTheta data_cons_w_field) + n_stupid_dicts = length stupid_dict_tys + + (field_tyvars,field_theta,field_tau) = tcSplitSigmaTy field_ty + field_dict_tys = mkPredTys field_theta + n_field_dict_tys = length field_dict_tys + -- If the field has a universally quantified type we have to + -- be a bit careful. Suppose we have + -- data R = R { op :: forall a. Foo a => a -> a } + -- Then we can't give op the type + -- op :: R -> forall a. Foo a => a -> a + -- because the typechecker doesn't understand foralls to the + -- right of an arrow. The "right" type to give it is + -- op :: forall a. Foo a => R -> a -> a + -- But then we must generate the right unfolding too: + -- op = /\a -> \dfoo -> \ r -> + -- case r of + -- R op -> op a dfoo + -- Note that this is exactly the type we'd infer from a user defn + -- op (R op) = op + + selector_ty :: Type + selector_ty = mkForAllTys tyvars $ mkForAllTys field_tyvars $ + mkFunTys stupid_dict_tys $ mkFunTys field_dict_tys $ + mkFunTy data_ty field_tau + + arity = 1 + n_stupid_dicts + n_field_dict_tys + + (strict_sig, rhs_w_str) = dmdAnalTopRhs sel_rhs + -- Use the demand analyser to work out strictness. + -- With all this unpackery it's not easy! + + info = noCafIdInfo + `setCafInfo` caf_info + `setArityInfo` arity + `setUnfoldingInfo` mkTopUnfolding rhs_w_str + `setAllStrictnessInfo` Just strict_sig + + -- Allocate Ids. We do it a funny way round because field_dict_tys is + -- almost always empty. Also note that we use max_dict_tys + -- rather than n_dict_tys, because the latter gives an infinite loop: + -- n_dict tys depends on the_alts, which depens on arg_ids, which depends + -- on arity, which depends on n_dict tys. Sigh! Mega sigh! + stupid_dict_ids = mkTemplateLocalsNum 1 stupid_dict_tys + max_stupid_dicts = length (tyConStupidTheta tycon) + field_dict_base = max_stupid_dicts + 1 + field_dict_ids = mkTemplateLocalsNum field_dict_base field_dict_tys + dict_id_base = field_dict_base + n_field_dict_tys + data_id = mkTemplateLocal dict_id_base data_ty + arg_base = dict_id_base + 1 + + the_alts :: [CoreAlt] + the_alts = map mk_alt data_cons_w_field -- Already sorted by data-con + no_default = length data_cons == length data_cons_w_field -- No default needed + + default_alt | no_default = [] + | otherwise = [(DEFAULT, [], error_expr)] + + -- The default branch may have CAF refs, because it calls recSelError etc. + caf_info | no_default = NoCafRefs + | otherwise = MayHaveCafRefs + + sel_rhs = mkLams tyvars $ mkLams field_tyvars $ + mkLams stupid_dict_ids $ mkLams field_dict_ids $ + Lam data_id $ sel_body + + sel_body | isNewTyCon tycon = mk_result (mkNewTypeBody tycon field_ty (Var data_id)) + | otherwise = Case (Var data_id) data_id field_tau (default_alt ++ the_alts) + + mk_result poly_result = mkVarApps (mkVarApps poly_result field_tyvars) field_dict_ids + -- We pull the field lambdas to the top, so we need to + -- apply them in the body. For example: + -- data T = MkT { foo :: forall a. a->a } + -- + -- foo :: forall a. T -> a -> a + -- foo = /\a. \t:T. 
case t of { MkT f -> f a } + + mk_alt data_con + = -- In the non-vanilla case, the pattern must bind type variables and + -- the context stuff; hence the arg_prefix binding below + mkReboxingAlt uniqs data_con (arg_prefix ++ arg_ids) + (mk_result (Var the_arg_id)) + where + (arg_prefix, arg_ids) + | isVanillaDataCon data_con -- Instantiate from commmon base + = ([], mkTemplateLocalsNum arg_base (dataConInstOrigArgTys data_con res_tys)) + | otherwise -- The case pattern binds type variables, which are used + -- in the types of the arguments of the pattern + = (dc_tyvars ++ mkTemplateLocalsNum arg_base (mkPredTys dc_theta), + mkTemplateLocalsNum arg_base' dc_arg_tys) + + (dc_tyvars, dc_theta, dc_arg_tys, _, _) = dataConSig data_con + arg_base' = arg_base + length dc_theta + + unpack_base = arg_base' + length dc_arg_tys + uniqs = map mkBuiltinUnique [unpack_base..] + + the_arg_id = assoc "mkRecordSelId:mk_alt" (field_lbls `zip` arg_ids) field_label + field_lbls = dataConFieldLabels data_con + + error_expr = mkRuntimeErrorApp rEC_SEL_ERROR_ID field_tau full_msg + full_msg = showSDoc (sep [text "No match in record selector", ppr sel_id]) + + +-- (mkReboxingAlt us con xs rhs) basically constructs the case +-- alternative (con, xs, rhs) +-- but it does the reboxing necessary to construct the *source* +-- arguments, xs, from the representation arguments ys. +-- For example: +-- data T = MkT !(Int,Int) Bool +-- +-- mkReboxingAlt MkT [x,b] r +-- = (DataAlt MkT, [y::Int,z::Int,b], let x = (y,z) in r) +-- +-- mkDataAlt should really be in DataCon, but it can't because +-- it manipulates CoreSyn. + +mkReboxingAlt + :: [Unique] -- Uniques for the new Ids + -> DataCon + -> [Var] -- Source-level args, including existential dicts + -> CoreExpr -- RHS + -> CoreAlt + +mkReboxingAlt us con args rhs + | not (any isMarkedUnboxed stricts) + = (DataAlt con, args, rhs) + + | otherwise + = let + (binds, args') = go args stricts us + in + (DataAlt con, args', mkLets binds rhs) + + where + stricts = dataConExStricts con ++ dataConStrictMarks con + + go [] stricts us = ([], []) + + -- Type variable case + go (arg:args) stricts us + | isTyVar arg + = let (binds, args') = go args stricts us + in (binds, arg:args') + + -- Term variable case + go (arg:args) (str:stricts) us + | isMarkedUnboxed str + = let + (_, tycon_args, pack_con, con_arg_tys) + = splitProductType "mkReboxingAlt" (idType arg) + + unpacked_args = zipWith (mkSysLocal FSLIT("rb")) us con_arg_tys + (binds, args') = go args stricts (dropList con_arg_tys us) + con_app = mkConApp pack_con (map Type tycon_args ++ map Var unpacked_args) + in + (NonRec arg con_app : binds, unpacked_args ++ args') + + | otherwise + = let (binds, args') = go args stricts us + in (binds, arg:args') +\end{code} + + +%************************************************************************ +%* * +\subsection{Dictionary selectors} +%* * +%************************************************************************ + +Selecting a field for a dictionary. If there is just one field, then +there's nothing to do. + +Dictionary selectors may get nested forall-types. Thus: + + class Foo a where + op :: forall b. Ord b => a -> b -> b + +Then the top-level type for op is + + op :: forall a. Foo a => + forall b. Ord b => + a -> b -> b + +This is unlike ordinary record selectors, which have all the for-alls +at the outside. When dealing with classes it's very convenient to +recover the original type signature from the class op selector. 
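
Before the code, a small editorial illustration (not part of the committed
file): Foo below is a hypothetical single-method class, showing the shape of
selector type that mkDictSelId has to build.

\begin{code}
-- Hypothetical class, for illustration only.
class Foo a where
  op :: Ord b => a -> b -> b

-- The selector keeps the class variable outermost and the method's own
-- quantifier nested inside, mirroring the declared signature:
--   op :: forall a. Foo a => forall b. Ord b => a -> b -> b
-- For a single-method class like this one the dictionary is represented by
-- the method itself, so the selector built below merely exposes that
-- representation rather than projecting a field out of a record.
\end{code}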
+
+\begin{code}
+mkDictSelId :: Name -> Class -> Id
+mkDictSelId name clas
+ = mkGlobalId (ClassOpId clas) name sel_ty info
+ where
+ sel_ty = mkForAllTys tyvars (mkFunTy (idType dict_id) (idType the_arg_id))
+ -- We can't just say (exprType rhs), because that would give a type
+ -- C a -> C a
+ -- for a single-op class (after all, the selector is the identity)
+ -- But its type must expose the representation of the dictionary
+ -- to get (say) C a -> (a -> a)
+
+ info = noCafIdInfo
+ `setArityInfo` 1
+ `setUnfoldingInfo` mkTopUnfolding rhs
+ `setAllStrictnessInfo` Just strict_sig
+
+ -- We no longer use 'must-inline' on record selectors. They'll
+ -- inline like crazy if they scrutinise a constructor
+
+ -- The strictness signature is of the form U(AAAVAAAA) -> T
+ -- where the V depends on which item we are selecting
+ -- It's worth giving one, so that absence info etc is generated
+ -- even if the selector isn't inlined
+ strict_sig = mkStrictSig (mkTopDmdType [arg_dmd] TopRes)
+ arg_dmd | isNewTyCon tycon = evalDmd
+ | otherwise = Eval (Prod [ if the_arg_id == id then evalDmd else Abs
+ | id <- arg_ids ])
+
+ tycon = classTyCon clas
+ [data_con] = tyConDataCons tycon
+ tyvars = dataConTyVars data_con
+ arg_tys = dataConRepArgTys data_con
+ the_arg_id = assoc "MkId.mkDictSelId" (map idName (classSelIds clas) `zip` arg_ids) name
+
+ pred = mkClassPred clas (mkTyVarTys tyvars)
+ (dict_id:arg_ids) = mkTemplateLocals (mkPredTy pred : arg_tys)
+
+ rhs | isNewTyCon tycon = mkLams tyvars $ Lam dict_id $
+ mkNewTypeBody tycon (head arg_tys) (Var dict_id)
+ | otherwise = mkLams tyvars $ Lam dict_id $
+ Case (Var dict_id) dict_id (idType the_arg_id)
+ [(DataAlt data_con, arg_ids, Var the_arg_id)]
+
+mkNewTypeBody tycon result_ty result_expr
+ -- Adds a coerce where necessary
+ -- Used for both wrapping and unwrapping
+ | isRecursiveTyCon tycon -- Recursive case; use a coerce
+ = Note (Coerce result_ty (exprType result_expr)) result_expr
+ | otherwise -- Normal case
+ = result_expr
+\end{code}
+
+
+%************************************************************************
+%* *
+\subsection{Primitive operations}
+%* *
+%************************************************************************
+
+\begin{code}
+mkPrimOpId :: PrimOp -> Id
+mkPrimOpId prim_op
+ = id
+ where
+ (tyvars,arg_tys,res_ty, arity, strict_sig) = primOpSig prim_op
+ ty = mkForAllTys tyvars (mkFunTys arg_tys res_ty)
+ name = mkWiredInName gHC_PRIM (primOpOcc prim_op)
+ (mkPrimOpIdUnique (primOpTag prim_op))
+ Nothing (AnId id) UserSyntax
+ id = mkGlobalId (PrimOpId prim_op) name ty info
+
+ info = noCafIdInfo
+ `setSpecInfo` mkSpecInfo (primOpRules prim_op name)
+ `setArityInfo` arity
+ `setAllStrictnessInfo` Just strict_sig
+
+-- For each ccall we manufacture a separate CCallOpId, giving it
+-- a fresh unique, a type that is correct for this particular ccall,
+-- and a CCall structure that gives the correct details about calling
+-- convention etc.
+--
+-- The *name* of this Id is a local name whose OccName gives the full
+-- details of the ccall, type and all.
This means that the interface +-- file reader can reconstruct a suitable Id + +mkFCallId :: Unique -> ForeignCall -> Type -> Id +mkFCallId uniq fcall ty + = ASSERT( isEmptyVarSet (tyVarsOfType ty) ) + -- A CCallOpId should have no free type variables; + -- when doing substitutions won't substitute over it + mkGlobalId (FCallId fcall) name ty info + where + occ_str = showSDoc (braces (ppr fcall <+> ppr ty)) + -- The "occurrence name" of a ccall is the full info about the + -- ccall; it is encoded, but may have embedded spaces etc! + + name = mkFCallName uniq occ_str + + info = noCafIdInfo + `setArityInfo` arity + `setAllStrictnessInfo` Just strict_sig + + (_, tau) = tcSplitForAllTys ty + (arg_tys, _) = tcSplitFunTys tau + arity = length arg_tys + strict_sig = mkStrictSig (mkTopDmdType (replicate arity evalDmd) TopRes) +\end{code} + + +%************************************************************************ +%* * +\subsection{DictFuns and default methods} +%* * +%************************************************************************ + +Important notes about dict funs and default methods +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Dict funs and default methods are *not* ImplicitIds. Their definition +involves user-written code, so we can't figure out their strictness etc +based on fixed info, as we can for constructors and record selectors (say). + +We build them as LocalIds, but with External Names. This ensures that +they are taken to account by free-variable finding and dependency +analysis (e.g. CoreFVs.exprFreeVars). + +Why shouldn't they be bound as GlobalIds? Because, in particular, if +they are globals, the specialiser floats dict uses above their defns, +which prevents good simplifications happening. Also the strictness +analyser treats a occurrence of a GlobalId as imported and assumes it +contains strictness in its IdInfo, which isn't true if the thing is +bound in the same module as the occurrence. + +It's OK for dfuns to be LocalIds, because we form the instance-env to +pass on to the next module (md_insts) in CoreTidy, afer tidying +and globalising the top-level Ids. + +BUT make sure they are *exported* LocalIds (mkExportedLocalId) so +that they aren't discarded by the occurrence analyser. + +\begin{code} +mkDefaultMethodId dm_name ty = mkExportedLocalId dm_name ty + +mkDictFunId :: Name -- Name to use for the dict fun; + -> [TyVar] + -> ThetaType + -> Class + -> [Type] + -> Id + +mkDictFunId dfun_name inst_tyvars dfun_theta clas inst_tys + = mkExportedLocalId dfun_name dfun_ty + where + dfun_ty = mkSigmaTy inst_tyvars dfun_theta (mkDictTy clas inst_tys) + +{- 1 dec 99: disable the Mark Jones optimisation for the sake + of compatibility with Hugs. + See `types/InstEnv' for a discussion related to this. + + (class_tyvars, sc_theta, _, _) = classBigSig clas + not_const (clas, tys) = not (isEmptyVarSet (tyVarsOfTypes tys)) + sc_theta' = substClasses (zipTopTvSubst class_tyvars inst_tys) sc_theta + dfun_theta = case inst_decl_theta of + [] -> [] -- If inst_decl_theta is empty, then we don't + -- want to have any dict arguments, so that we can + -- expose the constant methods. + + other -> nub (inst_decl_theta ++ filter not_const sc_theta') + -- Otherwise we pass the superclass dictionaries to + -- the dictionary function; the Mark Jones optimisation. + -- + -- NOTE the "nub". I got caught by this one: + -- class Monad m => MonadT t m where ... + -- instance Monad m => MonadT (EnvT env) m where ... 
+
+ -- Here, the inst_decl_theta has (Monad m); but so
+ -- does the sc_theta'!
+ --
+ -- NOTE the "not_const". I got caught by this one too:
+ -- class Foo a => Baz a b where ...
+ -- instance Wob b => Baz T b where ...
+ -- Now sc_theta' has Foo T
+-}
+\end{code}
+
+
+%************************************************************************
+%* *
+\subsection{Un-definable}
+%* *
+%************************************************************************
+
+These Ids can't be defined in Haskell. They could be defined in
+unfoldings in the wired-in GHC.Prim interface file, but we'd have to
+ensure that they were definitely, definitely inlined, because there is
+no curried identifier for them. That's what mkCompulsoryUnfolding
+does. If we had a way to get a compulsory unfolding from an interface
+file, we could do that, but we don't right now.
+
+unsafeCoerce# isn't so much a PrimOp as a phantom identifier that
+just gets expanded into a type coercion wherever it occurs. Hence we
+add it as a built-in Id with an unfolding here.
+
+The type variables we use here are "open" type variables: this means
+they can unify with both unlifted and lifted types. Hence we provide
+another gun with which to shoot yourself in the foot.
+
+\begin{code}
+mkWiredInIdName mod fs uniq id
+ = mkWiredInName mod (mkOccNameFS varName fs) uniq Nothing (AnId id) UserSyntax
+
+unsafeCoerceName = mkWiredInIdName gHC_PRIM FSLIT("unsafeCoerce#") unsafeCoerceIdKey unsafeCoerceId
+nullAddrName = mkWiredInIdName gHC_PRIM FSLIT("nullAddr#") nullAddrIdKey nullAddrId
+seqName = mkWiredInIdName gHC_PRIM FSLIT("seq") seqIdKey seqId
+realWorldName = mkWiredInIdName gHC_PRIM FSLIT("realWorld#") realWorldPrimIdKey realWorldPrimId
+lazyIdName = mkWiredInIdName pREL_BASE FSLIT("lazy") lazyIdKey lazyId
+
+errorName = mkWiredInIdName pREL_ERR FSLIT("error") errorIdKey eRROR_ID
+recSelErrorName = mkWiredInIdName pREL_ERR FSLIT("recSelError") recSelErrorIdKey rEC_SEL_ERROR_ID
+runtimeErrorName = mkWiredInIdName pREL_ERR FSLIT("runtimeError") runtimeErrorIdKey rUNTIME_ERROR_ID
+irrefutPatErrorName = mkWiredInIdName pREL_ERR FSLIT("irrefutPatError") irrefutPatErrorIdKey iRREFUT_PAT_ERROR_ID
+recConErrorName = mkWiredInIdName pREL_ERR FSLIT("recConError") recConErrorIdKey rEC_CON_ERROR_ID
+patErrorName = mkWiredInIdName pREL_ERR FSLIT("patError") patErrorIdKey pAT_ERROR_ID
+noMethodBindingErrorName = mkWiredInIdName pREL_ERR FSLIT("noMethodBindingError")
+ noMethodBindingErrorIdKey nO_METHOD_BINDING_ERROR_ID
+nonExhaustiveGuardsErrorName
+ = mkWiredInIdName pREL_ERR FSLIT("nonExhaustiveGuardsError")
+ nonExhaustiveGuardsErrorIdKey nON_EXHAUSTIVE_GUARDS_ERROR_ID
+\end{code}
+
+\begin{code}
+-- unsafeCoerce# :: forall a b. a -> b
+unsafeCoerceId
+ = pcMiscPrelId unsafeCoerceName ty info
+ where
+ info = noCafIdInfo `setUnfoldingInfo` mkCompulsoryUnfolding rhs
+
+
+ ty = mkForAllTys [openAlphaTyVar,openBetaTyVar]
+ (mkFunTy openAlphaTy openBetaTy)
+ [x] = mkTemplateLocals [openAlphaTy]
+ rhs = mkLams [openAlphaTyVar,openBetaTyVar,x] $
+ Note (Coerce openBetaTy openAlphaTy) (Var x)
+
+-- nullAddr# :: Addr#
+-- The reason it is here is that we don't provide
+-- a way to write this literal in Haskell.
+nullAddrId + = pcMiscPrelId nullAddrName addrPrimTy info + where + info = noCafIdInfo `setUnfoldingInfo` + mkCompulsoryUnfolding (Lit nullAddrLit) + +seqId + = pcMiscPrelId seqName ty info + where + info = noCafIdInfo `setUnfoldingInfo` mkCompulsoryUnfolding rhs + + + ty = mkForAllTys [alphaTyVar,openBetaTyVar] + (mkFunTy alphaTy (mkFunTy openBetaTy openBetaTy)) + [x,y] = mkTemplateLocals [alphaTy, openBetaTy] +-- gaw 2004 + rhs = mkLams [alphaTyVar,openBetaTyVar,x,y] (Case (Var x) x openBetaTy [(DEFAULT, [], Var y)]) + +-- lazy :: forall a?. a? -> a? (i.e. works for unboxed types too) +-- Used to lazify pseq: pseq a b = a `seq` lazy b +-- No unfolding: it gets "inlined" by the worker/wrapper pass +-- Also, no strictness: by being a built-in Id, it overrides all +-- the info in PrelBase.hi. This is important, because the strictness +-- analyser will spot it as strict! +lazyId + = pcMiscPrelId lazyIdName ty info + where + info = noCafIdInfo + ty = mkForAllTys [alphaTyVar] (mkFunTy alphaTy alphaTy) + +lazyIdUnfolding :: CoreExpr -- Used to expand LazyOp after strictness anal +lazyIdUnfolding = mkLams [openAlphaTyVar,x] (Var x) + where + [x] = mkTemplateLocals [openAlphaTy] +\end{code} + +@realWorld#@ used to be a magic literal, \tr{void#}. If things get +nasty as-is, change it back to a literal (@Literal@). + +voidArgId is a Local Id used simply as an argument in functions +where we just want an arg to avoid having a thunk of unlifted type. +E.g. + x = \ void :: State# RealWorld -> (# p, q #) + +This comes up in strictness analysis + +\begin{code} +realWorldPrimId -- :: State# RealWorld + = pcMiscPrelId realWorldName realWorldStatePrimTy + (noCafIdInfo `setUnfoldingInfo` evaldUnfolding) + -- The evaldUnfolding makes it look that realWorld# is evaluated + -- which in turn makes Simplify.interestingArg return True, + -- which in turn makes INLINE things applied to realWorld# likely + -- to be inlined + +voidArgId -- :: State# RealWorld + = mkSysLocal FSLIT("void") voidArgIdKey realWorldStatePrimTy +\end{code} + + +%************************************************************************ +%* * +\subsection[PrelVals-error-related]{@error@ and friends; @trace@} +%* * +%************************************************************************ + +GHC randomly injects these into the code. + +@patError@ is just a version of @error@ for pattern-matching +failures. It knows various ``codes'' which expand to longer +strings---this saves space! + +@absentErr@ is a thing we put in for ``absent'' arguments. They jolly +well shouldn't be yanked on, but if one is, then you will get a +friendly message from @absentErr@ (rather than a totally random +crash). + +@parError@ is a special version of @error@ which the compiler does +not know to be a bottoming Id. It is used in the @_par_@ and @_seq_@ +templates, but we don't ever expect to generate code for it. + +\begin{code} +mkRuntimeErrorApp + :: Id -- Should be of type (forall a. 
Addr# -> a) + -- where Addr# points to a UTF8 encoded string + -> Type -- The type to instantiate 'a' + -> String -- The string to print + -> CoreExpr + +mkRuntimeErrorApp err_id res_ty err_msg + = mkApps (Var err_id) [Type res_ty, err_string] + where + err_string = Lit (mkStringLit err_msg) + +rEC_SEL_ERROR_ID = mkRuntimeErrorId recSelErrorName +rUNTIME_ERROR_ID = mkRuntimeErrorId runtimeErrorName +iRREFUT_PAT_ERROR_ID = mkRuntimeErrorId irrefutPatErrorName +rEC_CON_ERROR_ID = mkRuntimeErrorId recConErrorName +pAT_ERROR_ID = mkRuntimeErrorId patErrorName +nO_METHOD_BINDING_ERROR_ID = mkRuntimeErrorId noMethodBindingErrorName +nON_EXHAUSTIVE_GUARDS_ERROR_ID = mkRuntimeErrorId nonExhaustiveGuardsErrorName + +-- The runtime error Ids take a UTF8-encoded string as argument +mkRuntimeErrorId name = pc_bottoming_Id name runtimeErrorTy +runtimeErrorTy = mkSigmaTy [openAlphaTyVar] [] (mkFunTy addrPrimTy openAlphaTy) +\end{code} + +\begin{code} +eRROR_ID = pc_bottoming_Id errorName errorTy + +errorTy :: Type +errorTy = mkSigmaTy [openAlphaTyVar] [] (mkFunTys [mkListTy charTy] openAlphaTy) + -- Notice the openAlphaTyVar. It says that "error" can be applied + -- to unboxed as well as boxed types. This is OK because it never + -- returns, so the return type is irrelevant. +\end{code} + + +%************************************************************************ +%* * +\subsection{Utilities} +%* * +%************************************************************************ + +\begin{code} +pcMiscPrelId :: Name -> Type -> IdInfo -> Id +pcMiscPrelId name ty info + = mkVanillaGlobal name ty info + -- We lie and say the thing is imported; otherwise, we get into + -- a mess with dependency analysis; e.g., core2stg may heave in + -- random calls to GHCbase.unpackPS__. If GHCbase is the module + -- being compiled, then it's just a matter of luck if the definition + -- will be in "the right place" to be in scope. + +pc_bottoming_Id name ty + = pcMiscPrelId name ty bottoming_info + where + bottoming_info = vanillaIdInfo `setAllStrictnessInfo` Just strict_sig + -- Do *not* mark them as NoCafRefs, because they can indeed have + -- CAF refs. For example, pAT_ERROR_ID calls GHC.Err.untangle, + -- which has some CAFs + -- In due course we may arrange that these error-y things are + -- regarded by the GC as permanently live, in which case we + -- can give them NoCaf info. As it is, any function that calls + -- any pc_bottoming_Id will itself have CafRefs, which bloats + -- SRTs. 
+ + strict_sig = mkStrictSig (mkTopDmdType [evalDmd] BotRes) + -- These "bottom" out, no matter what their arguments + +(openAlphaTyVar:openBetaTyVar:_) = openAlphaTyVars +openAlphaTy = mkTyVarTy openAlphaTyVar +openBetaTy = mkTyVarTy openBetaTyVar +\end{code} + diff --git a/compiler/basicTypes/MkId.lhs-boot b/compiler/basicTypes/MkId.lhs-boot new file mode 100644 index 0000000000..4f9615a061 --- /dev/null +++ b/compiler/basicTypes/MkId.lhs-boot @@ -0,0 +1,9 @@ +\begin{code} +module MkId where +import Name( Name ) +import DataCon( DataCon, DataConIds ) + +mkDataConIds :: Name -> Name -> DataCon -> DataConIds +\end{code} + + diff --git a/compiler/basicTypes/Module.hi-boot-5 b/compiler/basicTypes/Module.hi-boot-5 new file mode 100644 index 0000000000..cdc5fbf581 --- /dev/null +++ b/compiler/basicTypes/Module.hi-boot-5 @@ -0,0 +1,4 @@ +__interface Module 1 0 where +__export Module Module ; +1 data Module ; + diff --git a/compiler/basicTypes/Module.hi-boot-6 b/compiler/basicTypes/Module.hi-boot-6 new file mode 100644 index 0000000000..c4d4b5d474 --- /dev/null +++ b/compiler/basicTypes/Module.hi-boot-6 @@ -0,0 +1,3 @@ +module Module where +data Module + diff --git a/compiler/basicTypes/Module.lhs b/compiler/basicTypes/Module.lhs new file mode 100644 index 0000000000..69521625b0 --- /dev/null +++ b/compiler/basicTypes/Module.lhs @@ -0,0 +1,216 @@ +% +% (c) The University of Glasgow, 2004 +% + +Module +~~~~~~~~~~ +Simply the name of a module, represented as a FastString. +These are Uniquable, hence we can build FiniteMaps with ModuleNames as +the keys. + +\begin{code} +module Module + ( + Module -- Abstract, instance of Eq, Ord, Outputable + , pprModule -- :: ModuleName -> SDoc + + , ModLocation(..) + , addBootSuffix, addBootSuffix_maybe, addBootSuffixLocn + + , moduleString -- :: ModuleName -> String + , moduleFS -- :: ModuleName -> FastString + + , mkModule -- :: String -> ModuleName + , mkModuleFS -- :: FastString -> ModuleName + + , ModuleEnv + , elemModuleEnv, extendModuleEnv, extendModuleEnvList, plusModuleEnv_C + , delModuleEnvList, delModuleEnv, plusModuleEnv, lookupModuleEnv + , lookupWithDefaultModuleEnv, mapModuleEnv, mkModuleEnv, emptyModuleEnv + , moduleEnvElts, unitModuleEnv, isEmptyModuleEnv, foldModuleEnv + , extendModuleEnv_C, filterModuleEnv + + , ModuleSet, emptyModuleSet, mkModuleSet, moduleSetElts, extendModuleSet, elemModuleSet + + ) where + +#include "HsVersions.h" +import Outputable +import Unique ( Uniquable(..) ) +import UniqFM +import UniqSet +import Binary +import FastString +\end{code} + +%************************************************************************ +%* * +\subsection{Module locations} +%* * +%************************************************************************ + +\begin{code} +data ModLocation + = ModLocation { + ml_hs_file :: Maybe FilePath, + -- The source file, if we have one. Package modules + -- probably don't have source files. + + ml_hi_file :: FilePath, + -- Where the .hi file is, whether or not it exists + -- yet. Always of form foo.hi, even if there is an + -- hi-boot file (we add the -boot suffix later) + + ml_obj_file :: FilePath + -- Where the .o file is, whether or not it exists yet. + -- (might not exist either because the module hasn't + -- been compiled yet, or because it is part of a + -- package with a .a file) + } deriving Show + +instance Outputable ModLocation where + ppr = text . show +\end{code} + +For a module in another package, the hs_file and obj_file +components of ModLocation are undefined. 
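
As a concrete editorial illustration (not part of the committed file), a
home-package module Foo might carry a location such as fooLocation below; the
value is hypothetical, and the -boot variant is obtained with
addBootSuffixLocn, defined a little further down.

\begin{code}
-- Hypothetical example value, for illustration only.
fooLocation :: ModLocation
fooLocation = ModLocation
  { ml_hs_file  = Just "Foo.hs" -- home modules normally have a source file
  , ml_hi_file  = "Foo.hi"      -- interface file, whether or not written yet
  , ml_obj_file = "Foo.o"       -- object file, whether or not compiled yet
  }

-- addBootSuffixLocn fooLocation
--   == ModLocation (Just "Foo.hs-boot") "Foo.hi-boot" "Foo.o-boot"
\end{code}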
+ +The locations specified by a ModLocation may or may not +correspond to actual files yet: for example, even if the object +file doesn't exist, the ModLocation still contains the path to +where the object file will reside if/when it is created. + +\begin{code} +addBootSuffix :: FilePath -> FilePath +-- Add the "-boot" suffix to .hs, .hi and .o files +addBootSuffix path = path ++ "-boot" + +addBootSuffix_maybe :: Bool -> FilePath -> FilePath +addBootSuffix_maybe is_boot path + | is_boot = addBootSuffix path + | otherwise = path + +addBootSuffixLocn :: ModLocation -> ModLocation +addBootSuffixLocn locn + = locn { ml_hs_file = fmap addBootSuffix (ml_hs_file locn) + , ml_hi_file = addBootSuffix (ml_hi_file locn) + , ml_obj_file = addBootSuffix (ml_obj_file locn) } +\end{code} + + +%************************************************************************ +%* * +\subsection{The name of a module} +%* * +%************************************************************************ + +\begin{code} +newtype Module = Module FastString + -- Haskell module names can include the quote character ', + -- so the module names have the z-encoding applied to them + +instance Binary Module where + put_ bh (Module m) = put_ bh m + get bh = do m <- get bh; return (Module m) + +instance Uniquable Module where + getUnique (Module nm) = getUnique nm + +instance Eq Module where + nm1 == nm2 = getUnique nm1 == getUnique nm2 + +-- Warning: gives an ordering relation based on the uniques of the +-- FastStrings which are the (encoded) module names. This is _not_ +-- a lexicographical ordering. +instance Ord Module where + nm1 `compare` nm2 = getUnique nm1 `compare` getUnique nm2 + +instance Outputable Module where + ppr = pprModule + +pprModule :: Module -> SDoc +pprModule (Module nm) = + getPprStyle $ \ sty -> + if codeStyle sty + then ftext (zEncodeFS nm) + else ftext nm + +moduleFS :: Module -> FastString +moduleFS (Module mod) = mod + +moduleString :: Module -> String +moduleString (Module mod) = unpackFS mod + +-- used to be called mkSrcModule +mkModule :: String -> Module +mkModule s = Module (mkFastString s) + +-- used to be called mkSrcModuleFS +mkModuleFS :: FastString -> Module +mkModuleFS s = Module s +\end{code} + +%************************************************************************ +%* * +\subsection{@ModuleEnv@s} +%* * +%************************************************************************ + +\begin{code} +type ModuleEnv elt = UniqFM elt + +emptyModuleEnv :: ModuleEnv a +mkModuleEnv :: [(Module, a)] -> ModuleEnv a +unitModuleEnv :: Module -> a -> ModuleEnv a +extendModuleEnv :: ModuleEnv a -> Module -> a -> ModuleEnv a +extendModuleEnv_C :: (a->a->a) -> ModuleEnv a -> Module -> a -> ModuleEnv a +plusModuleEnv :: ModuleEnv a -> ModuleEnv a -> ModuleEnv a +extendModuleEnvList :: ModuleEnv a -> [(Module, a)] -> ModuleEnv a + +delModuleEnvList :: ModuleEnv a -> [Module] -> ModuleEnv a +delModuleEnv :: ModuleEnv a -> Module -> ModuleEnv a +plusModuleEnv_C :: (a -> a -> a) -> ModuleEnv a -> ModuleEnv a -> ModuleEnv a +mapModuleEnv :: (a -> b) -> ModuleEnv a -> ModuleEnv b +moduleEnvElts :: ModuleEnv a -> [a] + +isEmptyModuleEnv :: ModuleEnv a -> Bool +lookupModuleEnv :: ModuleEnv a -> Module -> Maybe a +lookupWithDefaultModuleEnv :: ModuleEnv a -> a -> Module -> a +elemModuleEnv :: Module -> ModuleEnv a -> Bool +foldModuleEnv :: (a -> b -> b) -> b -> ModuleEnv a -> b +filterModuleEnv :: (a -> Bool) -> ModuleEnv a -> ModuleEnv a + +filterModuleEnv = filterUFM +elemModuleEnv = elemUFM +extendModuleEnv = 
addToUFM +extendModuleEnv_C = addToUFM_C +extendModuleEnvList = addListToUFM +plusModuleEnv_C = plusUFM_C +delModuleEnvList = delListFromUFM +delModuleEnv = delFromUFM +plusModuleEnv = plusUFM +lookupModuleEnv = lookupUFM +lookupWithDefaultModuleEnv = lookupWithDefaultUFM +mapModuleEnv = mapUFM +mkModuleEnv = listToUFM +emptyModuleEnv = emptyUFM +moduleEnvElts = eltsUFM +unitModuleEnv = unitUFM +isEmptyModuleEnv = isNullUFM +foldModuleEnv = foldUFM +\end{code} + +\begin{code} +type ModuleSet = UniqSet Module +mkModuleSet :: [Module] -> ModuleSet +extendModuleSet :: ModuleSet -> Module -> ModuleSet +emptyModuleSet :: ModuleSet +moduleSetElts :: ModuleSet -> [Module] +elemModuleSet :: Module -> ModuleSet -> Bool + +emptyModuleSet = emptyUniqSet +mkModuleSet = mkUniqSet +extendModuleSet = addOneToUniqSet +moduleSetElts = uniqSetToList +elemModuleSet = elementOfUniqSet +\end{code} diff --git a/compiler/basicTypes/Module.lhs-boot b/compiler/basicTypes/Module.lhs-boot new file mode 100644 index 0000000000..d75c032d45 --- /dev/null +++ b/compiler/basicTypes/Module.lhs-boot @@ -0,0 +1,6 @@ +\begin{code} +module Module where + +data Module +\end{code} + diff --git a/compiler/basicTypes/Name.hi-boot-5 b/compiler/basicTypes/Name.hi-boot-5 new file mode 100644 index 0000000000..634d95433c --- /dev/null +++ b/compiler/basicTypes/Name.hi-boot-5 @@ -0,0 +1,3 @@ +__interface Name 1 0 where +__export Name Name; +1 data Name ; diff --git a/compiler/basicTypes/Name.hi-boot-6 b/compiler/basicTypes/Name.hi-boot-6 new file mode 100644 index 0000000000..c4eeca4d68 --- /dev/null +++ b/compiler/basicTypes/Name.hi-boot-6 @@ -0,0 +1,3 @@ +module Name where + +data Name diff --git a/compiler/basicTypes/Name.lhs b/compiler/basicTypes/Name.lhs new file mode 100644 index 0000000000..1e1fb31f84 --- /dev/null +++ b/compiler/basicTypes/Name.lhs @@ -0,0 +1,384 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[Name]{@Name@: to transmit name info from renamer to typechecker} + +\begin{code} +module Name ( + -- Re-export the OccName stuff + module OccName, + + -- The Name type + Name, -- Abstract + BuiltInSyntax(..), + mkInternalName, mkSystemName, + mkSystemVarName, mkSysTvName, + mkFCallName, mkIPName, + mkExternalName, mkWiredInName, + + nameUnique, setNameUnique, + nameOccName, nameModule, nameModule_maybe, + tidyNameOcc, + hashName, localiseName, + + nameSrcLoc, nameParent, nameParent_maybe, isImplicitName, + + isSystemName, isInternalName, isExternalName, + isTyVarName, isWiredInName, isBuiltInSyntax, + wiredInNameTyThing_maybe, + nameIsLocalOrFrom, + + -- Class NamedThing and overloaded friends + NamedThing(..), + getSrcLoc, getOccString + ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} TypeRep( TyThing ) + +import OccName -- All of it +import Module ( Module, moduleFS ) +import SrcLoc ( noSrcLoc, wiredInSrcLoc, SrcLoc ) +import Unique ( Unique, Uniquable(..), getKey, pprUnique ) +import Maybes ( orElse, isJust ) +import FastString ( FastString, zEncodeFS ) +import Outputable +\end{code} + +%************************************************************************ +%* * +\subsection[Name-datatype]{The @Name@ datatype, and name construction} +%* * +%************************************************************************ + +\begin{code} +data Name = Name { + n_sort :: NameSort, -- What sort of name it is + n_occ :: !OccName, -- Its occurrence name + n_uniq :: Unique, + n_loc :: !SrcLoc -- Definition site + } + +-- NOTE: we make the n_loc field strict to eliminate some potential 
+-- (and real!) space leaks, due to the fact that we don't look at +-- the SrcLoc in a Name all that often. + +data NameSort + = External Module (Maybe Name) + -- (Just parent) => this Name is a subordinate name of 'parent' + -- e.g. data constructor of a data type, method of a class + -- Nothing => not a subordinate + + | WiredIn Module (Maybe Name) TyThing BuiltInSyntax + -- A variant of External, for wired-in things + + | Internal -- A user-defined Id or TyVar + -- defined in the module being compiled + + | System -- A system-defined Id or TyVar. Typically the + -- OccName is very uninformative (like 's') + +data BuiltInSyntax = BuiltInSyntax | UserSyntax +-- BuiltInSyntax is for things like (:), [], tuples etc, +-- which have special syntactic forms. They aren't "in scope" +-- as such. +\end{code} + +Notes about the NameSorts: + +1. Initially, top-level Ids (including locally-defined ones) get External names, + and all other local Ids get Internal names + +2. Things with a External name are given C static labels, so they finally + appear in the .o file's symbol table. They appear in the symbol table + in the form M.n. If originally-local things have this property they + must be made @External@ first. + +3. In the tidy-core phase, a External that is not visible to an importer + is changed to Internal, and a Internal that is visible is changed to External + +4. A System Name differs in the following ways: + a) has unique attached when printing dumps + b) unifier eliminates sys tyvars in favour of user provs where possible + + Before anything gets printed in interface files or output code, it's + fed through a 'tidy' processor, which zaps the OccNames to have + unique names; and converts all sys-locals to user locals + If any desugarer sys-locals have survived that far, they get changed to + "ds1", "ds2", etc. + +Built-in syntax => It's a syntactic form, not "in scope" (e.g. []) + +Wired-in thing => The thing (Id, TyCon) is fully known to the compiler, + not read from an interface file. + E.g. Bool, True, Int, Float, and many others + +All built-in syntax is for wired-in things. 
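To make the Internal/External distinction above concrete, here is a minimal usage sketch (an editorial illustration, not part of this commit; the names `exampleLocal` and `exampleTopLevel` are hypothetical, and the Uniques are assumed to come from a UniqSupply elsewhere). It uses the constructors defined later in this file.

\begin{code}
-- Illustration only: the two most common sorts of Name.
exampleLocal :: Unique -> Name
exampleLocal u = mkInternalName u (mkVarOcc "x") noSrcLoc
        -- Internal: compared by its Unique, never qualified by a module,
        -- and never appears in an object file's symbol table

exampleTopLevel :: Unique -> Module -> Name
exampleTopLevel u mod = mkExternalName u mod (mkVarOcc "f") Nothing noSrcLoc
        -- External: nameModule returns 'mod', isExternalName is True,
        -- and in code style it prints as a module-qualified linker symbol
\end{code}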
+ +\begin{code} +nameUnique :: Name -> Unique +nameOccName :: Name -> OccName +nameModule :: Name -> Module +nameSrcLoc :: Name -> SrcLoc + +nameUnique name = n_uniq name +nameOccName name = n_occ name +nameSrcLoc name = n_loc name +\end{code} + +\begin{code} +nameIsLocalOrFrom :: Module -> Name -> Bool +isInternalName :: Name -> Bool +isExternalName :: Name -> Bool +isSystemName :: Name -> Bool +isWiredInName :: Name -> Bool + +isWiredInName (Name {n_sort = WiredIn _ _ _ _}) = True +isWiredInName other = False + +wiredInNameTyThing_maybe :: Name -> Maybe TyThing +wiredInNameTyThing_maybe (Name {n_sort = WiredIn _ _ thing _}) = Just thing +wiredInNameTyThing_maybe other = Nothing + +isBuiltInSyntax (Name {n_sort = WiredIn _ _ _ BuiltInSyntax}) = True +isBuiltInSyntax other = False + +isExternalName (Name {n_sort = External _ _}) = True +isExternalName (Name {n_sort = WiredIn _ _ _ _}) = True +isExternalName other = False + +isInternalName name = not (isExternalName name) + +nameParent_maybe :: Name -> Maybe Name +nameParent_maybe (Name {n_sort = External _ p}) = p +nameParent_maybe (Name {n_sort = WiredIn _ p _ _}) = p +nameParent_maybe other = Nothing + +nameParent :: Name -> Name +nameParent name = case nameParent_maybe name of + Just parent -> parent + Nothing -> name + +isImplicitName :: Name -> Bool +-- An Implicit Name is one has a parent; that is, one whose definition +-- derives from the parent thing +isImplicitName name = isJust (nameParent_maybe name) + +nameModule name = nameModule_maybe name `orElse` pprPanic "nameModule" (ppr name) +nameModule_maybe (Name { n_sort = External mod _}) = Just mod +nameModule_maybe (Name { n_sort = WiredIn mod _ _ _}) = Just mod +nameModule_maybe name = Nothing + +nameIsLocalOrFrom from name + | isExternalName name = from == nameModule name + | otherwise = True + +isTyVarName :: Name -> Bool +isTyVarName name = isTvOcc (nameOccName name) + +isSystemName (Name {n_sort = System}) = True +isSystemName other = False +\end{code} + + +%************************************************************************ +%* * +\subsection{Making names} +%* * +%************************************************************************ + +\begin{code} +mkInternalName :: Unique -> OccName -> SrcLoc -> Name +mkInternalName uniq occ loc = Name { n_uniq = uniq, n_sort = Internal, n_occ = occ, n_loc = loc } + -- NB: You might worry that after lots of huffing and + -- puffing we might end up with two local names with distinct + -- uniques, but the same OccName. 
Indeed we can, but that's ok + -- * the insides of the compiler don't care: they use the Unique + -- * when printing for -ddump-xxx you can switch on -dppr-debug to get the + -- uniques if you get confused + -- * for interface files we tidyCore first, which puts the uniques + -- into the print name (see setNameVisibility below) + +mkExternalName :: Unique -> Module -> OccName -> Maybe Name -> SrcLoc -> Name +mkExternalName uniq mod occ mb_parent loc + = Name { n_uniq = uniq, n_sort = External mod mb_parent, + n_occ = occ, n_loc = loc } + +mkWiredInName :: Module -> OccName -> Unique + -> Maybe Name -> TyThing -> BuiltInSyntax -> Name +mkWiredInName mod occ uniq mb_parent thing built_in + = Name { n_uniq = uniq, + n_sort = WiredIn mod mb_parent thing built_in, + n_occ = occ, n_loc = wiredInSrcLoc } + +mkSystemName :: Unique -> OccName -> Name +mkSystemName uniq occ = Name { n_uniq = uniq, n_sort = System, + n_occ = occ, n_loc = noSrcLoc } + +mkSystemVarName :: Unique -> FastString -> Name +mkSystemVarName uniq fs = mkSystemName uniq (mkVarOccFS fs) + +mkSysTvName :: Unique -> FastString -> Name +mkSysTvName uniq fs = mkSystemName uniq (mkOccNameFS tvName fs) + +mkFCallName :: Unique -> String -> Name + -- The encoded string completely describes the ccall +mkFCallName uniq str = Name { n_uniq = uniq, n_sort = Internal, + n_occ = mkVarOcc str, n_loc = noSrcLoc } + +mkIPName :: Unique -> OccName -> Name +mkIPName uniq occ + = Name { n_uniq = uniq, + n_sort = Internal, + n_occ = occ, + n_loc = noSrcLoc } +\end{code} + +\begin{code} +-- When we renumber/rename things, we need to be +-- able to change a Name's Unique to match the cached +-- one in the thing it's the name of. If you know what I mean. +setNameUnique name uniq = name {n_uniq = uniq} + +tidyNameOcc :: Name -> OccName -> Name +-- We set the OccName of a Name when tidying +-- In doing so, we change System --> Internal, so that when we print +-- it we don't get the unique by default. It's tidy now! 
+tidyNameOcc name@(Name { n_sort = System }) occ = name { n_occ = occ, n_sort = Internal} +tidyNameOcc name occ = name { n_occ = occ } + +localiseName :: Name -> Name +localiseName n = n { n_sort = Internal } +\end{code} + + +%************************************************************************ +%* * +\subsection{Predicates and selectors} +%* * +%************************************************************************ + +\begin{code} +hashName :: Name -> Int +hashName name = getKey (nameUnique name) +\end{code} + + +%************************************************************************ +%* * +\subsection[Name-instances]{Instance declarations} +%* * +%************************************************************************ + +\begin{code} +cmpName n1 n2 = n_uniq n1 `compare` n_uniq n2 +\end{code} + +\begin{code} +instance Eq Name where + a == b = case (a `compare` b) of { EQ -> True; _ -> False } + a /= b = case (a `compare` b) of { EQ -> False; _ -> True } + +instance Ord Name where + a <= b = case (a `compare` b) of { LT -> True; EQ -> True; GT -> False } + a < b = case (a `compare` b) of { LT -> True; EQ -> False; GT -> False } + a >= b = case (a `compare` b) of { LT -> False; EQ -> True; GT -> True } + a > b = case (a `compare` b) of { LT -> False; EQ -> False; GT -> True } + compare a b = cmpName a b + +instance Uniquable Name where + getUnique = nameUnique + +instance NamedThing Name where + getName n = n +\end{code} + + +%************************************************************************ +%* * +\subsection{Pretty printing} +%* * +%************************************************************************ + +\begin{code} +instance Outputable Name where + ppr name = pprName name + +instance OutputableBndr Name where + pprBndr _ name = pprName name + +pprName (Name {n_sort = sort, n_uniq = uniq, n_occ = occ}) + = getPprStyle $ \ sty -> + case sort of + WiredIn mod _ _ builtin -> pprExternal sty uniq mod occ True builtin + External mod _ -> pprExternal sty uniq mod occ False UserSyntax + System -> pprSystem sty uniq occ + Internal -> pprInternal sty uniq occ + +pprExternal sty uniq mod occ is_wired is_builtin + | codeStyle sty = ppr_z_module mod <> char '_' <> ppr_z_occ_name occ + -- In code style, always qualify + -- ToDo: maybe we could print all wired-in things unqualified + -- in code style, to reduce symbol table bloat? + | debugStyle sty = ppr mod <> dot <> ppr_occ_name occ + <> braces (hsep [if is_wired then ptext SLIT("(w)") else empty, + pprNameSpaceBrief (occNameSpace occ), + pprUnique uniq]) + | BuiltInSyntax <- is_builtin = ppr_occ_name occ + -- never qualify builtin syntax + | unqualStyle sty mod occ = ppr_occ_name occ + | otherwise = ppr mod <> dot <> ppr_occ_name occ + +pprInternal sty uniq occ + | codeStyle sty = pprUnique uniq + | debugStyle sty = ppr_occ_name occ <> braces (hsep [pprNameSpaceBrief (occNameSpace occ), + pprUnique uniq]) + | dumpStyle sty = ppr_occ_name occ <> char '_' <> pprUnique uniq + -- For debug dumps, we're not necessarily dumping + -- tidied code, so we need to print the uniques. 
+ | otherwise = ppr_occ_name occ -- User style + +-- Like Internal, except that we only omit the unique in Iface style +pprSystem sty uniq occ + | codeStyle sty = pprUnique uniq + | debugStyle sty = ppr_occ_name occ <> char '_' <> pprUnique uniq + <> braces (pprNameSpaceBrief (occNameSpace occ)) + | otherwise = ppr_occ_name occ <> char '_' <> pprUnique uniq + -- If the tidy phase hasn't run, the OccName + -- is unlikely to be informative (like 's'), + -- so print the unique + +ppr_occ_name occ = ftext (occNameFS occ) + -- Don't use pprOccName; instead, just print the string of the OccName; + -- we print the namespace in the debug stuff above + +-- In code style, we Z-encode the strings. The results of Z-encoding each FastString are +-- cached behind the scenes in the FastString implementation. +ppr_z_occ_name occ = ftext (zEncodeFS (occNameFS occ)) +ppr_z_module mod = ftext (zEncodeFS (moduleFS mod)) + +\end{code} + +%************************************************************************ +%* * +\subsection{Overloaded functions related to Names} +%* * +%************************************************************************ + +\begin{code} +class NamedThing a where + getOccName :: a -> OccName + getName :: a -> Name + + getOccName n = nameOccName (getName n) -- Default method +\end{code} + +\begin{code} +getSrcLoc :: NamedThing a => a -> SrcLoc +getOccString :: NamedThing a => a -> String + +getSrcLoc = nameSrcLoc . getName +getOccString = occNameString . getOccName +\end{code} + diff --git a/compiler/basicTypes/Name.lhs-boot b/compiler/basicTypes/Name.lhs-boot new file mode 100644 index 0000000000..167ce4242d --- /dev/null +++ b/compiler/basicTypes/Name.lhs-boot @@ -0,0 +1,5 @@ +\begin{code} +module Name where + +data Name +\end{code} diff --git a/compiler/basicTypes/NameEnv.lhs b/compiler/basicTypes/NameEnv.lhs new file mode 100644 index 0000000000..ff637010aa --- /dev/null +++ b/compiler/basicTypes/NameEnv.lhs @@ -0,0 +1,72 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[NameEnv]{@NameEnv@: name environments} + +\begin{code} +module NameEnv ( + NameEnv, mkNameEnv, + emptyNameEnv, unitNameEnv, nameEnvElts, + extendNameEnv_C, extendNameEnv_Acc, extendNameEnv, extendNameEnvList, + foldNameEnv, filterNameEnv, + plusNameEnv, plusNameEnv_C, + lookupNameEnv, lookupNameEnv_NF, delFromNameEnv, delListFromNameEnv, + elemNameEnv, mapNameEnv + ) where + +#include "HsVersions.h" + +import Name ( Name ) +import UniqFM +import Maybes ( expectJust ) +\end{code} + +%************************************************************************ +%* * +\subsection{Name environment} +%* * +%************************************************************************ + +\begin{code} +type NameEnv a = UniqFM a -- Domain is Name + +emptyNameEnv :: NameEnv a +mkNameEnv :: [(Name,a)] -> NameEnv a +nameEnvElts :: NameEnv a -> [a] +extendNameEnv_C :: (a->a->a) -> NameEnv a -> Name -> a -> NameEnv a +extendNameEnv_Acc :: (a->b->b) -> (a->b) -> NameEnv b -> Name -> a -> NameEnv b +extendNameEnv :: NameEnv a -> Name -> a -> NameEnv a +plusNameEnv :: NameEnv a -> NameEnv a -> NameEnv a +plusNameEnv_C :: (a->a->a) -> NameEnv a -> NameEnv a -> NameEnv a +extendNameEnvList :: NameEnv a -> [(Name,a)] -> NameEnv a +delFromNameEnv :: NameEnv a -> Name -> NameEnv a +delListFromNameEnv :: NameEnv a -> [Name] -> NameEnv a +elemNameEnv :: Name -> NameEnv a -> Bool +unitNameEnv :: Name -> a -> NameEnv a +lookupNameEnv :: NameEnv a -> Name -> Maybe a +lookupNameEnv_NF :: NameEnv a -> Name -> a 
+foldNameEnv :: (a -> b -> b) -> b -> NameEnv a -> b +filterNameEnv :: (elt -> Bool) -> NameEnv elt -> NameEnv elt +mapNameEnv :: (elt1 -> elt2) -> NameEnv elt1 -> NameEnv elt2 + +emptyNameEnv = emptyUFM +foldNameEnv = foldUFM +mkNameEnv = listToUFM +nameEnvElts = eltsUFM +extendNameEnv_C = addToUFM_C +extendNameEnv_Acc = addToUFM_Acc +extendNameEnv = addToUFM +plusNameEnv = plusUFM +plusNameEnv_C = plusUFM_C +extendNameEnvList = addListToUFM +delFromNameEnv = delFromUFM +delListFromNameEnv = delListFromUFM +elemNameEnv = elemUFM +unitNameEnv = unitUFM +filterNameEnv = filterUFM +mapNameEnv = mapUFM + +lookupNameEnv = lookupUFM +lookupNameEnv_NF env n = expectJust "lookupNameEnv_NF" (lookupUFM env n) +\end{code} + diff --git a/compiler/basicTypes/NameSet.lhs b/compiler/basicTypes/NameSet.lhs new file mode 100644 index 0000000000..d0e55dec68 --- /dev/null +++ b/compiler/basicTypes/NameSet.lhs @@ -0,0 +1,190 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1998 +% +\section[NameSet]{@NameSets@} + +\begin{code} +module NameSet ( + -- Sets of Names + NameSet, + emptyNameSet, unitNameSet, mkNameSet, unionNameSets, unionManyNameSets, + minusNameSet, elemNameSet, nameSetToList, addOneToNameSet, addListToNameSet, + delFromNameSet, delListFromNameSet, isEmptyNameSet, foldNameSet, filterNameSet, + intersectsNameSet, intersectNameSet, + + -- Free variables + FreeVars, isEmptyFVs, emptyFVs, plusFVs, plusFV, + mkFVs, addOneFV, unitFV, delFV, delFVs, + + -- Defs and uses + Defs, Uses, DefUse, DefUses, + emptyDUs, usesOnly, mkDUs, plusDU, + findUses, duDefs, duUses, allUses + ) where + +#include "HsVersions.h" + +import Name +import UniqSet +\end{code} + + +%************************************************************************ +%* * +\subsection[Sets of names} +%* * +%************************************************************************ + +\begin{code} +type NameSet = UniqSet Name +emptyNameSet :: NameSet +unitNameSet :: Name -> NameSet +addListToNameSet :: NameSet -> [Name] -> NameSet +addOneToNameSet :: NameSet -> Name -> NameSet +mkNameSet :: [Name] -> NameSet +unionNameSets :: NameSet -> NameSet -> NameSet +unionManyNameSets :: [NameSet] -> NameSet +minusNameSet :: NameSet -> NameSet -> NameSet +elemNameSet :: Name -> NameSet -> Bool +nameSetToList :: NameSet -> [Name] +isEmptyNameSet :: NameSet -> Bool +delFromNameSet :: NameSet -> Name -> NameSet +delListFromNameSet :: NameSet -> [Name] -> NameSet +foldNameSet :: (Name -> b -> b) -> b -> NameSet -> b +filterNameSet :: (Name -> Bool) -> NameSet -> NameSet +intersectNameSet :: NameSet -> NameSet -> NameSet +intersectsNameSet :: NameSet -> NameSet -> Bool -- True if non-empty intersection + -- (s1 `intersectsVarSet` s2) doesn't compute s2 if s1 is empty + +isEmptyNameSet = isEmptyUniqSet +emptyNameSet = emptyUniqSet +unitNameSet = unitUniqSet +mkNameSet = mkUniqSet +addListToNameSet = addListToUniqSet +addOneToNameSet = addOneToUniqSet +unionNameSets = unionUniqSets +unionManyNameSets = unionManyUniqSets +minusNameSet = minusUniqSet +elemNameSet = elementOfUniqSet +nameSetToList = uniqSetToList +delFromNameSet = delOneFromUniqSet +foldNameSet = foldUniqSet +filterNameSet = filterUniqSet +intersectNameSet = intersectUniqSets + +delListFromNameSet set ns = foldl delFromNameSet set ns + +intersectsNameSet s1 s2 = not (isEmptyNameSet (s1 `intersectNameSet` s2)) +\end{code} + + +%************************************************************************ +%* * +\subsection{Free variables} +%* * 
+%************************************************************************ + +These synonyms are useful when we are thinking of free variables + +\begin{code} +type FreeVars = NameSet + +plusFV :: FreeVars -> FreeVars -> FreeVars +addOneFV :: FreeVars -> Name -> FreeVars +unitFV :: Name -> FreeVars +emptyFVs :: FreeVars +plusFVs :: [FreeVars] -> FreeVars +mkFVs :: [Name] -> FreeVars +delFV :: Name -> FreeVars -> FreeVars +delFVs :: [Name] -> FreeVars -> FreeVars + +isEmptyFVs = isEmptyNameSet +emptyFVs = emptyNameSet +plusFVs = unionManyNameSets +plusFV = unionNameSets +mkFVs = mkNameSet +addOneFV = addOneToNameSet +unitFV = unitNameSet +delFV n s = delFromNameSet s n +delFVs ns s = delListFromNameSet s ns +\end{code} + + +%************************************************************************ +%* * + Defs and uses +%* * +%************************************************************************ + +\begin{code} +type Defs = NameSet +type Uses = NameSet + +type DefUses = [DefUse] + -- In dependency order: earlier Defs scope over later Uses + +type DefUse = (Maybe Defs, Uses) + -- For items (Just ds, us), the use of any member + -- of the ds implies that all the us are used too + -- + -- Also, us may mention ds + -- + -- Nothing => Nothing defined in this group, but + -- nevertheless all the uses are essential. + -- Used for instance declarations, for example + +emptyDUs :: DefUses +emptyDUs = [] + +usesOnly :: Uses -> DefUses +usesOnly uses = [(Nothing, uses)] + +mkDUs :: [(Defs,Uses)] -> DefUses +mkDUs pairs = [(Just defs, uses) | (defs,uses) <- pairs] + +plusDU :: DefUses -> DefUses -> DefUses +plusDU = (++) + +duDefs :: DefUses -> Defs +duDefs dus = foldr get emptyNameSet dus + where + get (Nothing, u1) d2 = d2 + get (Just d1, u1) d2 = d1 `unionNameSets` d2 + +duUses :: DefUses -> Uses +-- Just like allUses, but defs are not eliminated +duUses dus = foldr get emptyNameSet dus + where + get (d1, u1) u2 = u1 `unionNameSets` u2 + +allUses :: DefUses -> Uses +-- Collect all uses, regardless of +-- whether the group is itself used, +-- but remove defs on the way +allUses dus + = foldr get emptyNameSet dus + where + get (Nothing, rhs_uses) uses = rhs_uses `unionNameSets` uses + get (Just defs, rhs_uses) uses = (rhs_uses `unionNameSets` uses) + `minusNameSet` defs + +findUses :: DefUses -> Uses -> Uses +-- Given some DefUses and some Uses, +-- find all the uses, transitively. +-- The result is a superset of the input uses; +-- and includes things defined in the input DefUses +-- (but only if they are used) +findUses dus uses + = foldr get uses dus + where + get (Nothing, rhs_uses) uses + = rhs_uses `unionNameSets` uses + get (Just defs, rhs_uses) uses + | defs `intersectsNameSet` uses -- Used + || not (all (reportIfUnused . nameOccName) (nameSetToList defs)) + -- At least one starts with an "_", + -- so treat the group as used + = rhs_uses `unionNameSets` uses + | otherwise -- No def is used + = uses +\end{code}
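To illustrate the DefUses machinery just defined, here is a small sketch (editorial illustration only; `exampleDUs` is a hypothetical helper, and the three Names are assumed to be supplied by the caller). It models a dependency chain in which g uses h and f uses g, listed in dependency order as the comment above requires.

\begin{code}
-- Illustration only: h defined first, then g (which uses h), then f (which uses g).
exampleDUs :: Name -> Name -> Name -> DefUses
exampleDUs h g f = mkDUs [ (unitNameSet h, emptyNameSet)   -- h uses nothing
                         , (unitNameSet g, unitNameSet h)  -- g uses h
                         , (unitNameSet f, unitNameSet g)  -- f uses g
                         ]

-- Then:
--   findUses (exampleDUs h g f) (unitNameSet f)  contains f, g and h
--   duDefs   (exampleDUs h g f)                  contains h, g and f
--   allUses  (exampleDUs h g f)                  is empty, because every
--                                                use is also defined in the group
\end{code}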
\ No newline at end of file diff --git a/compiler/basicTypes/NewDemand.lhs b/compiler/basicTypes/NewDemand.lhs new file mode 100644 index 0000000000..8e68fd87d2 --- /dev/null +++ b/compiler/basicTypes/NewDemand.lhs @@ -0,0 +1,318 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[Demand]{@Demand@: the amount of demand on a value} + +\begin{code} +module NewDemand( + Demand(..), + topDmd, lazyDmd, seqDmd, evalDmd, errDmd, isStrictDmd, + isTop, isAbsent, seqDemand, + + DmdType(..), topDmdType, botDmdType, mkDmdType, mkTopDmdType, + dmdTypeDepth, seqDmdType, + DmdEnv, emptyDmdEnv, + DmdResult(..), retCPR, isBotRes, returnsCPR, resTypeArgDmd, + + Demands(..), mapDmds, zipWithDmds, allTop, seqDemands, + + StrictSig(..), mkStrictSig, topSig, botSig, cprSig, + isTopSig, + splitStrictSig, + pprIfaceStrictSig, appIsBottom, isBottomingSig, seqStrictSig, + ) where + +#include "HsVersions.h" + +import StaticFlags ( opt_CprOff ) +import BasicTypes ( Arity ) +import VarEnv ( VarEnv, emptyVarEnv, isEmptyVarEnv ) +import UniqFM ( ufmToList ) +import Util ( listLengthCmp, zipWithEqual ) +import Outputable +\end{code} + + +%************************************************************************ +%* * +\subsection{Demands} +%* * +%************************************************************************ + +\begin{code} +data Demand + = Top -- T; used for unlifted types too, so that + -- A `lub` T = T + | Abs -- A + + | Call Demand -- C(d) + + | Eval Demands -- U(ds) + + | Defer Demands -- D(ds) + + | Box Demand -- B(d) + + | Bot -- B + deriving( Eq ) + -- Equality needed for fixpoints in DmdAnal + +data Demands = Poly Demand -- Polymorphic case + | Prod [Demand] -- Product case + deriving( Eq ) + +allTop (Poly d) = isTop d +allTop (Prod ds) = all isTop ds + +isTop Top = True +isTop d = False + +isAbsent Abs = True +isAbsent d = False + +mapDmds :: (Demand -> Demand) -> Demands -> Demands +mapDmds f (Poly d) = Poly (f d) +mapDmds f (Prod ds) = Prod (map f ds) + +zipWithDmds :: (Demand -> Demand -> Demand) + -> Demands -> Demands -> Demands +zipWithDmds f (Poly d1) (Poly d2) = Poly (d1 `f` d2) +zipWithDmds f (Prod ds1) (Poly d2) = Prod [d1 `f` d2 | d1 <- ds1] +zipWithDmds f (Poly d1) (Prod ds2) = Prod [d1 `f` d2 | d2 <- ds2] +zipWithDmds f (Prod ds1) (Prod ds2) = Prod (zipWithEqual "zipWithDmds" f ds1 ds2) + +topDmd, lazyDmd, seqDmd :: Demand +topDmd = Top -- The most uninformative demand +lazyDmd = Box Abs +seqDmd = Eval (Poly Abs) -- Polymorphic seq demand +evalDmd = Box seqDmd -- Evaluate and return +errDmd = Box Bot -- This used to be called X + +isStrictDmd :: Demand -> Bool +isStrictDmd Bot = True +isStrictDmd (Eval _) = True +isStrictDmd (Call _) = True +isStrictDmd (Box d) = isStrictDmd d +isStrictDmd other = False + +seqDemand :: Demand -> () +seqDemand (Call d) = seqDemand d +seqDemand (Eval ds) = seqDemands ds +seqDemand (Defer ds) = seqDemands ds +seqDemand (Box d) = seqDemand d +seqDemand _ = () + +seqDemands :: Demands -> () +seqDemands (Poly d) = seqDemand d +seqDemands (Prod ds) = seqDemandList ds + +seqDemandList :: [Demand] -> () +seqDemandList [] = () +seqDemandList (d:ds) = seqDemand d `seq` seqDemandList ds + +instance Outputable Demand where + ppr Top = char 'T' + ppr Abs = char 'A' + ppr Bot = char 'B' + + ppr (Defer ds) = char 'D' <> ppr ds + ppr (Eval ds) = char 'U' <> ppr ds + + ppr (Box (Eval ds)) = char 'S' <> ppr ds + ppr (Box Abs) = char 'L' + ppr (Box Bot) = char 'X' + + ppr (Call d) = char 'C' <> parens (ppr d) + + +instance Outputable Demands where 
+ ppr (Poly Abs) = empty + ppr (Poly d) = parens (ppr d <> char '*') + ppr (Prod ds) = parens (hcat (map ppr ds)) + -- At one time I printed U(AAA) as U, but that + -- confuses (Poly Abs) with (Prod AAA), and the + -- worker/wrapper generation differs slightly for these two + -- [Reason: in the latter case we can avoid passing the arg; + -- see notes with WwLib.mkWWstr_one.] +\end{code} + + +%************************************************************************ +%* * +\subsection{Demand types} +%* * +%************************************************************************ + +\begin{code} +data DmdType = DmdType + DmdEnv -- Demand on explicitly-mentioned + -- free variables + [Demand] -- Demand on arguments + DmdResult -- Nature of result + + -- IMPORTANT INVARIANT + -- The default demand on free variables not in the DmdEnv is: + -- DmdResult = BotRes <=> Bot + -- DmdResult = TopRes/ResCPR <=> Abs + + -- ANOTHER IMPORTANT INVARIANT + -- The Demands in the argument list are never + -- Bot, Defer d + -- Handwavey reason: these don't correspond to calling conventions + -- See DmdAnal.funArgDemand for details + + +-- This guy lets us switch off CPR analysis +-- by making sure that everything uses TopRes instead of RetCPR +-- Assuming, of course, that they don't mention RetCPR by name. +-- They should onlyu use retCPR +retCPR | opt_CprOff = TopRes + | otherwise = RetCPR + +seqDmdType (DmdType env ds res) = + {- ??? env `seq` -} seqDemandList ds `seq` res `seq` () + +type DmdEnv = VarEnv Demand + +data DmdResult = TopRes -- Nothing known + | RetCPR -- Returns a constructed product + | BotRes -- Diverges or errors + deriving( Eq, Show ) + -- Equality for fixpoints + -- Show needed for Show in Lex.Token (sigh) + +-- Equality needed for fixpoints in DmdAnal +instance Eq DmdType where + (==) (DmdType fv1 ds1 res1) + (DmdType fv2 ds2 res2) = ufmToList fv1 == ufmToList fv2 + && ds1 == ds2 && res1 == res2 + +instance Outputable DmdType where + ppr (DmdType fv ds res) + = hsep [text "DmdType", + hcat (map ppr ds) <> ppr res, + if null fv_elts then empty + else braces (fsep (map pp_elt fv_elts))] + where + pp_elt (uniq, dmd) = ppr uniq <> text "->" <> ppr dmd + fv_elts = ufmToList fv + +instance Outputable DmdResult where + ppr TopRes = empty -- Keep these distinct from Demand letters + ppr RetCPR = char 'm' -- so that we can print strictness sigs as + ppr BotRes = char 'b' -- dddr + -- without ambiguity + +emptyDmdEnv = emptyVarEnv + +topDmdType = DmdType emptyDmdEnv [] TopRes +botDmdType = DmdType emptyDmdEnv [] BotRes +cprDmdType = DmdType emptyVarEnv [] retCPR + +isTopDmdType :: DmdType -> Bool +-- Only used on top-level types, hence the assert +isTopDmdType (DmdType env [] TopRes) = ASSERT( isEmptyVarEnv env) True +isTopDmdType other = False + +isBotRes :: DmdResult -> Bool +isBotRes BotRes = True +isBotRes other = False + +resTypeArgDmd :: DmdResult -> Demand +-- TopRes and BotRes are polymorphic, so that +-- BotRes = Bot -> BotRes +-- TopRes = Top -> TopRes +-- This function makes that concrete +-- We can get a RetCPR, because of the way in which we are (now) +-- giving CPR info to strict arguments. 
On the first pass, when +-- nothing has demand info, we optimistically give CPR info or RetCPR to all args +resTypeArgDmd TopRes = Top +resTypeArgDmd RetCPR = Top +resTypeArgDmd BotRes = Bot + +returnsCPR :: DmdResult -> Bool +returnsCPR RetCPR = True +returnsCPR other = False + +mkDmdType :: DmdEnv -> [Demand] -> DmdResult -> DmdType +mkDmdType fv ds res = DmdType fv ds res + +mkTopDmdType :: [Demand] -> DmdResult -> DmdType +mkTopDmdType ds res = DmdType emptyDmdEnv ds res + +dmdTypeDepth :: DmdType -> Arity +dmdTypeDepth (DmdType _ ds _) = length ds +\end{code} + + +%************************************************************************ +%* * +\subsection{Strictness signature +%* * +%************************************************************************ + +In a let-bound Id we record its strictness info. +In principle, this strictness info is a demand transformer, mapping +a demand on the Id into a DmdType, which gives + a) the free vars of the Id's value + b) the Id's arguments + c) an indication of the result of applying + the Id to its arguments + +However, in fact we store in the Id an extremely emascuated demand transfomer, +namely + a single DmdType +(Nevertheless we dignify StrictSig as a distinct type.) + +This DmdType gives the demands unleashed by the Id when it is applied +to as many arguments as are given in by the arg demands in the DmdType. + +For example, the demand transformer described by the DmdType + DmdType {x -> U(LL)} [V,A] Top +says that when the function is applied to two arguments, it +unleashes demand U(LL) on the free var x, V on the first arg, +and A on the second. + +If this same function is applied to one arg, all we can say is +that it uses x with U*(LL), and its arg with demand L. + +\begin{code} +newtype StrictSig = StrictSig DmdType + deriving( Eq ) + +instance Outputable StrictSig where + ppr (StrictSig ty) = ppr ty + +instance Show StrictSig where + show (StrictSig ty) = showSDoc (ppr ty) + +mkStrictSig :: DmdType -> StrictSig +mkStrictSig dmd_ty = StrictSig dmd_ty + +splitStrictSig :: StrictSig -> ([Demand], DmdResult) +splitStrictSig (StrictSig (DmdType _ dmds res)) = (dmds, res) + +isTopSig (StrictSig ty) = isTopDmdType ty + +topSig, botSig, cprSig :: StrictSig +topSig = StrictSig topDmdType +botSig = StrictSig botDmdType +cprSig = StrictSig cprDmdType + + +-- appIsBottom returns true if an application to n args would diverge +appIsBottom (StrictSig (DmdType _ ds BotRes)) n = listLengthCmp ds n /= GT +appIsBottom _ _ = False + +isBottomingSig (StrictSig (DmdType _ _ BotRes)) = True +isBottomingSig _ = False + +seqStrictSig (StrictSig ty) = seqDmdType ty + +pprIfaceStrictSig :: StrictSig -> SDoc +-- Used for printing top-level strictness pragmas in interface files +pprIfaceStrictSig (StrictSig (DmdType _ dmds res)) + = hcat (map ppr dmds) <> ppr res +\end{code} + + diff --git a/compiler/basicTypes/OccName.hi-boot-6 b/compiler/basicTypes/OccName.hi-boot-6 new file mode 100644 index 0000000000..705f9b1bd0 --- /dev/null +++ b/compiler/basicTypes/OccName.hi-boot-6 @@ -0,0 +1,4 @@ +module OccName where + +data OccName + diff --git a/compiler/basicTypes/OccName.lhs b/compiler/basicTypes/OccName.lhs new file mode 100644 index 0000000000..a3661a9ab0 --- /dev/null +++ b/compiler/basicTypes/OccName.lhs @@ -0,0 +1,676 @@ +{-% DrIFT (Automatic class derivations for Haskell) v1.1 %-} +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% + +\section[OccName]{@OccName@} + +\begin{code} +module OccName ( + -- * The NameSpace type; abstact + 
NameSpace, tcName, clsName, tcClsName, dataName, varName, + tvName, srcDataName, + + -- ** Printing + pprNameSpace, pprNonVarNameSpace, pprNameSpaceBrief, + + -- * The OccName type + OccName, -- Abstract, instance of Outputable + pprOccName, + + -- ** Construction + mkOccName, mkOccNameFS, + mkVarOcc, mkVarOccFS, + mkTyVarOcc, + mkDFunOcc, + mkTupleOcc, + setOccNameSpace, + + -- ** Derived OccNames + mkDataConWrapperOcc, mkWorkerOcc, mkDefaultMethodOcc, mkDerivedTyConOcc, + mkClassTyConOcc, mkClassDataConOcc, mkDictOcc, mkIPOcc, + mkSpecOcc, mkForeignExportOcc, mkGenOcc1, mkGenOcc2, + mkDataTOcc, mkDataCOcc, mkDataConWorkerOcc, + mkSuperDictSelOcc, mkLocalOcc, mkMethodOcc, + + -- ** Deconstruction + occNameFS, occNameString, occNameSpace, + + isVarOcc, isTvOcc, isTcOcc, isDataOcc, isDataSymOcc, isSymOcc, isValOcc, + parenSymOcc, reportIfUnused, isTcClsName, isVarName, + + isTupleOcc_maybe, + + -- The OccEnv type + OccEnv, emptyOccEnv, unitOccEnv, extendOccEnv, mapOccEnv, + lookupOccEnv, mkOccEnv, extendOccEnvList, elemOccEnv, + occEnvElts, foldOccEnv, plusOccEnv, plusOccEnv_C, extendOccEnv_C, + + -- The OccSet type + OccSet, emptyOccSet, unitOccSet, mkOccSet, extendOccSet, + extendOccSetList, + unionOccSets, unionManyOccSets, minusOccSet, elemOccSet, occSetElts, + foldOccSet, isEmptyOccSet, intersectOccSet, intersectsOccSet, + + -- Tidying up + TidyOccEnv, emptyTidyOccEnv, tidyOccName, initTidyOccEnv, + + -- The basic form of names + isLexCon, isLexVar, isLexId, isLexSym, + isLexConId, isLexConSym, isLexVarId, isLexVarSym, + startsVarSym, startsVarId, startsConSym, startsConId + ) where + +#include "HsVersions.h" + +import Util ( thenCmp ) +import Unique ( Unique, mkUnique, Uniquable(..) ) +import BasicTypes ( Boxity(..), Arity ) +import StaticFlags ( opt_PprStyle_Debug ) +import UniqFM +import UniqSet +import FastString +import Outputable +import Binary + +import GLAEXTS + +import Data.Char ( isUpper, isLower, ord ) + +-- Unicode TODO: put isSymbol in libcompat +#if __GLASGOW_HASKELL__ > 604 +import Data.Char ( isSymbol ) +#else +isSymbol = const False +#endif + +\end{code} + +%************************************************************************ +%* * +\subsection{Name space} +%* * +%************************************************************************ + +\begin{code} +data NameSpace = VarName -- Variables, including "source" data constructors + | DataName -- "Real" data constructors + | TvName -- Type variables + | TcClsName -- Type constructors and classes; Haskell has them + -- in the same name space for now. + deriving( Eq, Ord ) + {-! derive: Binary !-} + +-- Note [Data Constructors] +-- see also: Note [Data Constructor Naming] in DataCon.lhs +-- +-- "Source" data constructors are the data constructors mentioned +-- in Haskell source code +-- +-- "Real" data constructors are the data constructors of the +-- representation type, which may not be the same as the source +-- type + +-- Example: +-- data T = T !(Int,Int) +-- +-- The source datacon has type (Int,Int) -> T +-- The real datacon has type Int -> Int -> T +-- GHC chooses a representation based on the strictness etc. + + +-- Though type constructors and classes are in the same name space now, +-- the NameSpace type is abstract, so we can easily separate them later +tcName = TcClsName -- Type constructors +clsName = TcClsName -- Classes +tcClsName = TcClsName -- Not sure which! 
+ +dataName = DataName +srcDataName = DataName -- Haskell-source data constructors should be + -- in the Data name space + +tvName = TvName +varName = VarName + +isTcClsName :: NameSpace -> Bool +isTcClsName TcClsName = True +isTcClsName _ = False + +isVarName :: NameSpace -> Bool -- Variables or type variables, but not constructors +isVarName TvName = True +isVarName VarName = True +isVarName other = False + +pprNameSpace :: NameSpace -> SDoc +pprNameSpace DataName = ptext SLIT("data constructor") +pprNameSpace VarName = ptext SLIT("variable") +pprNameSpace TvName = ptext SLIT("type variable") +pprNameSpace TcClsName = ptext SLIT("type constructor or class") + +pprNonVarNameSpace :: NameSpace -> SDoc +pprNonVarNameSpace VarName = empty +pprNonVarNameSpace ns = pprNameSpace ns + +pprNameSpaceBrief DataName = char 'd' +pprNameSpaceBrief VarName = char 'v' +pprNameSpaceBrief TvName = ptext SLIT("tv") +pprNameSpaceBrief TcClsName = ptext SLIT("tc") +\end{code} + + +%************************************************************************ +%* * +\subsection[Name-pieces-datatypes]{The @OccName@ datatypes} +%* * +%************************************************************************ + +\begin{code} +data OccName = OccName + { occNameSpace :: !NameSpace + , occNameFS :: !FastString + } +\end{code} + + +\begin{code} +instance Eq OccName where + (OccName sp1 s1) == (OccName sp2 s2) = s1 == s2 && sp1 == sp2 + +instance Ord OccName where + compare (OccName sp1 s1) (OccName sp2 s2) = (s1 `compare` s2) `thenCmp` + (sp1 `compare` sp2) +\end{code} + + +%************************************************************************ +%* * +\subsection{Printing} +%* * +%************************************************************************ + +\begin{code} +instance Outputable OccName where + ppr = pprOccName + +pprOccName :: OccName -> SDoc +pprOccName (OccName sp occ) + = getPprStyle $ \ sty -> + if codeStyle sty + then ftext (zEncodeFS occ) + else ftext occ <> if debugStyle sty + then braces (pprNameSpaceBrief sp) + else empty +\end{code} + + +%************************************************************************ +%* * +\subsection{Construction} +%* * +%************************************************************************ + +\begin{code} +mkOccName :: NameSpace -> String -> OccName +mkOccName occ_sp str = OccName occ_sp (mkFastString str) + +mkOccNameFS :: NameSpace -> FastString -> OccName +mkOccNameFS occ_sp fs = OccName occ_sp fs + +mkVarOcc :: String -> OccName +mkVarOcc s = mkOccName varName s + +mkVarOccFS :: FastString -> OccName +mkVarOccFS fs = mkOccNameFS varName fs + +mkTyVarOcc :: FastString -> OccName +mkTyVarOcc fs = mkOccNameFS tvName fs +\end{code} + + +%************************************************************************ +%* * + Environments +%* * +%************************************************************************ + +OccEnvs are used mainly for the envts in ModIfaces. + +They are efficient, because FastStrings have unique Int# keys. We assume +this key is less than 2^24, so we can make a Unique using + mkUnique ns key :: Unique +where 'ns' is a Char reprsenting the name space. This in turn makes it +easy to build an OccEnv. 
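As a quick usage sketch of the environments this scheme makes cheap (editorial illustration only; `exampleOccEnv` and `exampleLookup` are hypothetical names, using the OccEnv operations defined just below):

\begin{code}
-- Illustration only: an OccEnv behaves like a finite map keyed by OccName.
exampleOccEnv :: OccEnv Int
exampleOccEnv = mkOccEnv [ (mkVarOcc "f", 1)
                         , (mkVarOcc "g", 2) ]

exampleLookup :: Maybe Int
exampleLookup = lookupOccEnv exampleOccEnv (mkVarOcc "f")
        -- Just 1: both occurrences of "f" share the same FastString,
        -- hence the same Unique, hence the same key
\end{code}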
+ +\begin{code} +instance Uniquable OccName where + getUnique (OccName ns fs) + = mkUnique char (I# (uniqueOfFS fs)) + where -- See notes above about this getUnique function + char = case ns of + VarName -> 'i' + DataName -> 'd' + TvName -> 'v' + TcClsName -> 't' + +type OccEnv a = UniqFM a + +emptyOccEnv :: OccEnv a +unitOccEnv :: OccName -> a -> OccEnv a +extendOccEnv :: OccEnv a -> OccName -> a -> OccEnv a +extendOccEnvList :: OccEnv a -> [(OccName, a)] -> OccEnv a +lookupOccEnv :: OccEnv a -> OccName -> Maybe a +mkOccEnv :: [(OccName,a)] -> OccEnv a +elemOccEnv :: OccName -> OccEnv a -> Bool +foldOccEnv :: (a -> b -> b) -> b -> OccEnv a -> b +occEnvElts :: OccEnv a -> [a] +extendOccEnv_C :: (a->a->a) -> OccEnv a -> OccName -> a -> OccEnv a +plusOccEnv :: OccEnv a -> OccEnv a -> OccEnv a +plusOccEnv_C :: (a->a->a) -> OccEnv a -> OccEnv a -> OccEnv a +mapOccEnv :: (a->b) -> OccEnv a -> OccEnv b + +emptyOccEnv = emptyUFM +unitOccEnv = unitUFM +extendOccEnv = addToUFM +extendOccEnvList = addListToUFM +lookupOccEnv = lookupUFM +mkOccEnv = listToUFM +elemOccEnv = elemUFM +foldOccEnv = foldUFM +occEnvElts = eltsUFM +plusOccEnv = plusUFM +plusOccEnv_C = plusUFM_C +extendOccEnv_C = addToUFM_C +mapOccEnv = mapUFM + +type OccSet = UniqFM OccName + +emptyOccSet :: OccSet +unitOccSet :: OccName -> OccSet +mkOccSet :: [OccName] -> OccSet +extendOccSet :: OccSet -> OccName -> OccSet +extendOccSetList :: OccSet -> [OccName] -> OccSet +unionOccSets :: OccSet -> OccSet -> OccSet +unionManyOccSets :: [OccSet] -> OccSet +minusOccSet :: OccSet -> OccSet -> OccSet +elemOccSet :: OccName -> OccSet -> Bool +occSetElts :: OccSet -> [OccName] +foldOccSet :: (OccName -> b -> b) -> b -> OccSet -> b +isEmptyOccSet :: OccSet -> Bool +intersectOccSet :: OccSet -> OccSet -> OccSet +intersectsOccSet :: OccSet -> OccSet -> Bool + +emptyOccSet = emptyUniqSet +unitOccSet = unitUniqSet +mkOccSet = mkUniqSet +extendOccSet = addOneToUniqSet +extendOccSetList = addListToUniqSet +unionOccSets = unionUniqSets +unionManyOccSets = unionManyUniqSets +minusOccSet = minusUniqSet +elemOccSet = elementOfUniqSet +occSetElts = uniqSetToList +foldOccSet = foldUniqSet +isEmptyOccSet = isEmptyUniqSet +intersectOccSet = intersectUniqSets +intersectsOccSet s1 s2 = not (isEmptyOccSet (s1 `intersectOccSet` s2)) +\end{code} + + +%************************************************************************ +%* * +\subsection{Predicates and taking them apart} +%* * +%************************************************************************ + +\begin{code} +occNameString :: OccName -> String +occNameString (OccName _ s) = unpackFS s + +setOccNameSpace :: NameSpace -> OccName -> OccName +setOccNameSpace sp (OccName _ occ) = OccName sp occ + +isVarOcc, isTvOcc, isDataSymOcc, isSymOcc, isTcOcc :: OccName -> Bool + +isVarOcc (OccName VarName _) = True +isVarOcc other = False + +isTvOcc (OccName TvName _) = True +isTvOcc other = False + +isTcOcc (OccName TcClsName _) = True +isTcOcc other = False + +isValOcc (OccName VarName _) = True +isValOcc (OccName DataName _) = True +isValOcc other = False + +-- Data constructor operator (starts with ':', or '[]') +-- Pretty inefficient! 
+isDataSymOcc (OccName DataName s) = isLexConSym s +isDataSymOcc (OccName VarName s) + | isLexConSym s = pprPanic "isDataSymOcc: check me" (ppr s) + -- Jan06: I don't think this should happen +isDataSymOcc other = False + +isDataOcc (OccName DataName _) = True +isDataOcc (OccName VarName s) + | isLexCon s = pprPanic "isDataOcc: check me" (ppr s) + -- Jan06: I don't think this should happen +isDataOcc other = False + +-- Any operator (data constructor or variable) +-- Pretty inefficient! +isSymOcc (OccName DataName s) = isLexConSym s +isSymOcc (OccName TcClsName s) = isLexConSym s +isSymOcc (OccName VarName s) = isLexSym s +isSymOcc other = False + +parenSymOcc :: OccName -> SDoc -> SDoc +-- Wrap parens around an operator +parenSymOcc occ doc | isSymOcc occ = parens doc + | otherwise = doc +\end{code} + + +\begin{code} +reportIfUnused :: OccName -> Bool + -- Haskell 98 encourages compilers to suppress warnings about + -- unused names in a pattern if they start with "_". +reportIfUnused occ = case occNameString occ of + ('_' : _) -> False + _other -> True +\end{code} + + +%************************************************************************ +%* * +\subsection{Making system names} +%* * +%************************************************************************ + +Here's our convention for splitting up the interface file name space: + + d... dictionary identifiers + (local variables, so no name-clash worries) + + $f... dict-fun identifiers (from inst decls) + $dm... default methods + $p... superclass selectors + $w... workers + :T... compiler-generated tycons for dictionaries + :D... ...ditto data cons + $sf.. specialised version of f + + in encoded form these appear as Zdfxxx etc + + :... keywords (export:, letrec: etc.) +--- I THINK THIS IS WRONG! + +This knowledge is encoded in the following functions. + + +@mk_deriv@ generates an @OccName@ from the prefix and a string. +NB: The string must already be encoded! + +\begin{code} +mk_deriv :: NameSpace + -> String -- Distinguishes one sort of derived name from another + -> String + -> OccName + +mk_deriv occ_sp sys_prefix str = mkOccName occ_sp (sys_prefix ++ str) +\end{code} + +\begin{code} +mkDataConWrapperOcc, mkWorkerOcc, mkDefaultMethodOcc, mkDerivedTyConOcc, + mkClassTyConOcc, mkClassDataConOcc, mkDictOcc, mkIPOcc, + mkSpecOcc, mkForeignExportOcc, mkGenOcc1, mkGenOcc2, + mkDataTOcc, mkDataCOcc, mkDataConWorkerOcc + :: OccName -> OccName + +-- These derived variables have a prefix that no Haskell value could have +mkDataConWrapperOcc = mk_simple_deriv varName "$W" +mkWorkerOcc = mk_simple_deriv varName "$w" +mkDefaultMethodOcc = mk_simple_deriv varName "$dm" +mkDerivedTyConOcc = mk_simple_deriv tcName ":" -- The : prefix makes sure it classifies +mkClassTyConOcc = mk_simple_deriv tcName ":T" -- as a tycon/datacon +mkClassDataConOcc = mk_simple_deriv dataName ":D" -- We go straight to the "real" data con + -- for datacons from classes +mkDictOcc = mk_simple_deriv varName "$d" +mkIPOcc = mk_simple_deriv varName "$i" +mkSpecOcc = mk_simple_deriv varName "$s" +mkForeignExportOcc = mk_simple_deriv varName "$f" + +-- Generic derivable classes +mkGenOcc1 = mk_simple_deriv varName "$gfrom" +mkGenOcc2 = mk_simple_deriv varName "$gto" + +-- data T = MkT ... 
deriving( Data ) needs defintions for +-- $tT :: Data.Generics.Basics.DataType +-- $cMkT :: Data.Generics.Basics.Constr +mkDataTOcc = mk_simple_deriv varName "$t" +mkDataCOcc = mk_simple_deriv varName "$c" + +mk_simple_deriv sp px occ = mk_deriv sp px (occNameString occ) + +-- Data constructor workers are made by setting the name space +-- of the data constructor OccName (which should be a DataName) +-- to VarName +mkDataConWorkerOcc datacon_occ = setOccNameSpace varName datacon_occ +\end{code} + +\begin{code} +mkSuperDictSelOcc :: Int -- Index of superclass, eg 3 + -> OccName -- Class, eg "Ord" + -> OccName -- eg "$p3Ord" +mkSuperDictSelOcc index cls_occ + = mk_deriv varName "$p" (show index ++ occNameString cls_occ) + +mkLocalOcc :: Unique -- Unique + -> OccName -- Local name (e.g. "sat") + -> OccName -- Nice unique version ("$L23sat") +mkLocalOcc uniq occ + = mk_deriv varName ("$L" ++ show uniq) (occNameString occ) + -- The Unique might print with characters + -- that need encoding (e.g. 'z'!) +\end{code} + + +\begin{code} +mkDFunOcc :: String -- Typically the class and type glommed together e.g. "OrdMaybe" + -- Only used in debug mode, for extra clarity + -> Bool -- True <=> hs-boot instance dfun + -> Int -- Unique index + -> OccName -- "$f3OrdMaybe" + +-- In hs-boot files we make dict funs like $fx7ClsTy, which get bound to the real +-- thing when we compile the mother module. Reason: we don't know exactly +-- what the mother module will call it. + +mkDFunOcc info_str is_boot index + = mk_deriv VarName prefix string + where + prefix | is_boot = "$fx" + | otherwise = "$f" + string | opt_PprStyle_Debug = show index ++ info_str + | otherwise = show index +\end{code} + +We used to add a '$m' to indicate a method, but that gives rise to bad +error messages from the type checker when we print the function name or pattern +of an instance-decl binding. Why? Because the binding is zapped +to use the method name in place of the selector name. +(See TcClassDcl.tcMethodBind) + +The way it is now, -ddump-xx output may look confusing, but +you can always say -dppr-debug to get the uniques. + +However, we *do* have to zap the first character to be lower case, +because overloaded constructors (blarg) generate methods too. +And convert to VarName space + +e.g. a call to constructor MkFoo where + data (Ord a) => Foo a = MkFoo a + +If this is necessary, we do it by prefixing '$m'. These +guys never show up in error messages. What a hack. + +\begin{code} +mkMethodOcc :: OccName -> OccName +mkMethodOcc occ@(OccName VarName fs) = occ +mkMethodOcc occ = mk_simple_deriv varName "$m" occ +\end{code} + + +%************************************************************************ +%* * +\subsection{Tidying them up} +%* * +%************************************************************************ + +Before we print chunks of code we like to rename it so that +we don't have to print lots of silly uniques in it. But we mustn't +accidentally introduce name clashes! So the idea is that we leave the +OccName alone unless it accidentally clashes with one that is already +in scope; if so, we tack on '1' at the end and try again, then '2', and +so on till we find a unique one. + +There's a wrinkle for operators. Consider '>>='. We can't use '>>=1' +because that isn't a single lexeme. So we encode it to 'lle' and *then* +tack on the '1', if necessary. 
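A sketch of how the tidying functions defined below behave on a clash (editorial illustration only; `exampleTidy` is a hypothetical name):

\begin{code}
-- Illustration only: successive tidies of an OccName that is already in scope.
exampleTidy :: (OccName, OccName)
exampleTidy = (occ1, occ2)
  where
    env0         = initTidyOccEnv [mkVarOcc "x"]    -- "x" is already taken
    (env1, occ1) = tidyOccName env0 (mkVarOcc "x")  -- clash, so occ1 is "x1"
    (_,    occ2) = tidyOccName env1 (mkVarOcc "x")  -- clash again, so occ2 is "x2"
\end{code}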
+ +\begin{code} +type TidyOccEnv = OccEnv Int -- The in-scope OccNames + -- Range gives a plausible starting point for new guesses + +emptyTidyOccEnv = emptyOccEnv + +initTidyOccEnv :: [OccName] -> TidyOccEnv -- Initialise with names to avoid! +initTidyOccEnv = foldl (\env occ -> extendOccEnv env occ 1) emptyTidyOccEnv + +tidyOccName :: TidyOccEnv -> OccName -> (TidyOccEnv, OccName) + +tidyOccName in_scope occ@(OccName occ_sp fs) + = case lookupOccEnv in_scope occ of + Nothing -> -- Not already used: make it used + (extendOccEnv in_scope occ 1, occ) + + Just n -> -- Already used: make a new guess, + -- change the guess base, and try again + tidyOccName (extendOccEnv in_scope occ (n+1)) + (mkOccName occ_sp (unpackFS fs ++ show n)) +\end{code} + +%************************************************************************ +%* * + Stuff for dealing with tuples +%* * +%************************************************************************ + +\begin{code} +mkTupleOcc :: NameSpace -> Boxity -> Arity -> OccName +mkTupleOcc ns bx ar = OccName ns (mkFastString str) + where + -- no need to cache these, the caching is done in the caller + -- (TysWiredIn.mk_tuple) + str = case bx of + Boxed -> '(' : commas ++ ")" + Unboxed -> '(' : '#' : commas ++ "#)" + + commas = take (ar-1) (repeat ',') + +isTupleOcc_maybe :: OccName -> Maybe (NameSpace, Boxity, Arity) +-- Tuples are special, because there are so many of them! +isTupleOcc_maybe (OccName ns fs) + = case unpackFS fs of + '(':'#':',':rest -> Just (ns, Unboxed, 2 + count_commas rest) + '(':',':rest -> Just (ns, Boxed, 2 + count_commas rest) + _other -> Nothing + where + count_commas (',':rest) = 1 + count_commas rest + count_commas _ = 0 +\end{code} + +%************************************************************************ +%* * +\subsection{Lexical categories} +%* * +%************************************************************************ + +These functions test strings to see if they fit the lexical categories +defined in the Haskell report. + +\begin{code} +isLexCon, isLexVar, isLexId, isLexSym :: FastString -> Bool +isLexConId, isLexConSym, isLexVarId, isLexVarSym :: FastString -> Bool + +isLexCon cs = isLexConId cs || isLexConSym cs +isLexVar cs = isLexVarId cs || isLexVarSym cs + +isLexId cs = isLexConId cs || isLexVarId cs +isLexSym cs = isLexConSym cs || isLexVarSym cs + +------------- + +isLexConId cs -- Prefix type or data constructors + | nullFS cs = False -- e.g. "Foo", "[]", "(,)" + | cs == FSLIT("[]") = True + | otherwise = startsConId (headFS cs) + +isLexVarId cs -- Ordinary prefix identifiers + | nullFS cs = False -- e.g. "x", "_x" + | otherwise = startsVarId (headFS cs) + +isLexConSym cs -- Infix type or data constructors + | nullFS cs = False -- e.g. ":-:", ":", "->" + | cs == FSLIT("->") = True + | otherwise = startsConSym (headFS cs) + +isLexVarSym cs -- Infix identifiers + | nullFS cs = False -- e.g. 
"+" + | otherwise = startsVarSym (headFS cs) + +------------- +startsVarSym, startsVarId, startsConSym, startsConId :: Char -> Bool +startsVarSym c = isSymbolASCII c || (ord c > 0x7f && isSymbol c) -- Infix Ids +startsConSym c = c == ':' -- Infix data constructors +startsVarId c = isLower c || c == '_' -- Ordinary Ids +startsConId c = isUpper c || c == '(' -- Ordinary type constructors and data constructors + +isSymbolASCII c = c `elem` "!#$%&*+./<=>?@\\^|~-" +\end{code} + +%************************************************************************ +%* * + Binary instance + Here rather than BinIface because OccName is abstract +%* * +%************************************************************************ + +\begin{code} +instance Binary NameSpace where + put_ bh VarName = do + putByte bh 0 + put_ bh DataName = do + putByte bh 1 + put_ bh TvName = do + putByte bh 2 + put_ bh TcClsName = do + putByte bh 3 + get bh = do + h <- getByte bh + case h of + 0 -> do return VarName + 1 -> do return DataName + 2 -> do return TvName + _ -> do return TcClsName + +instance Binary OccName where + put_ bh (OccName aa ab) = do + put_ bh aa + put_ bh ab + get bh = do + aa <- get bh + ab <- get bh + return (OccName aa ab) +\end{code} diff --git a/compiler/basicTypes/OccName.lhs-boot b/compiler/basicTypes/OccName.lhs-boot new file mode 100644 index 0000000000..d9c7fcd141 --- /dev/null +++ b/compiler/basicTypes/OccName.lhs-boot @@ -0,0 +1,5 @@ +\begin{code} +module OccName where + +data OccName +\end{code} diff --git a/compiler/basicTypes/RdrName.lhs b/compiler/basicTypes/RdrName.lhs new file mode 100644 index 0000000000..030aa1f609 --- /dev/null +++ b/compiler/basicTypes/RdrName.lhs @@ -0,0 +1,540 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% + +\section[RdrName]{@RdrName@} + +\begin{code} +module RdrName ( + RdrName(..), -- Constructors exported only to BinIface + + -- Construction + mkRdrUnqual, mkRdrQual, + mkUnqual, mkVarUnqual, mkQual, mkOrig, + nameRdrName, getRdrName, + mkDerivedRdrName, + + -- Destruction + rdrNameModule, rdrNameOcc, setRdrNameSpace, + isRdrDataCon, isRdrTyVar, isRdrTc, isQual, isUnqual, + isOrig, isOrig_maybe, isExact, isExact_maybe, isSrcRdrName, + + -- Printing; instance Outputable RdrName + + -- LocalRdrEnv + LocalRdrEnv, emptyLocalRdrEnv, extendLocalRdrEnv, + lookupLocalRdrEnv, elemLocalRdrEnv, + + -- GlobalRdrEnv + GlobalRdrEnv, emptyGlobalRdrEnv, mkGlobalRdrEnv, plusGlobalRdrEnv, + lookupGlobalRdrEnv, extendGlobalRdrEnv, + pprGlobalRdrEnv, globalRdrEnvElts, + lookupGRE_RdrName, lookupGRE_Name, + + -- GlobalRdrElt, Provenance, ImportSpec + GlobalRdrElt(..), isLocalGRE, unQualOK, + Provenance(..), pprNameProvenance, + ImportSpec(..), ImpDeclSpec(..), ImpItemSpec(..), + importSpecLoc, importSpecModule + ) where + +#include "HsVersions.h" + +import OccName +import Module ( Module, mkModuleFS ) +import Name ( Name, NamedThing(getName), nameModule, nameParent_maybe, + nameOccName, isExternalName, nameSrcLoc ) +import Maybes ( mapCatMaybes ) +import SrcLoc ( isGoodSrcLoc, SrcSpan ) +import FastString ( FastString ) +import Outputable +import Util ( thenCmp ) +\end{code} + +%************************************************************************ +%* * +\subsection{The main data type} +%* * +%************************************************************************ + +\begin{code} +data RdrName + = Unqual OccName + -- Used for ordinary, unqualified occurrences + + | Qual Module OccName + -- A qualified name written by the user in + -- *source* code. 
The module isn't necessarily + -- the module where the thing is defined; + -- just the one from which it is imported + + | Orig Module OccName + -- An original name; the module is the *defining* module. + -- This is used when GHC generates code that will be fed + -- into the renamer (e.g. from deriving clauses), but where + -- we want to say "Use Prelude.map dammit". + + | Exact Name + -- We know exactly the Name. This is used + -- (a) when the parser parses built-in syntax like "[]" + -- and "(,)", but wants a RdrName from it + -- (b) when converting names to the RdrNames in IfaceTypes + -- Here an Exact RdrName always contains an External Name + -- (Internal Names are converted to simple Unquals) + -- (c) by Template Haskell, when TH has generated a unique name +\end{code} + + +%************************************************************************ +%* * +\subsection{Simple functions} +%* * +%************************************************************************ + +\begin{code} +rdrNameModule :: RdrName -> Module +rdrNameModule (Qual m _) = m +rdrNameModule (Orig m _) = m +rdrNameModule (Exact n) = nameModule n +rdrNameModule (Unqual n) = pprPanic "rdrNameModule" (ppr n) + +rdrNameOcc :: RdrName -> OccName +rdrNameOcc (Qual _ occ) = occ +rdrNameOcc (Unqual occ) = occ +rdrNameOcc (Orig _ occ) = occ +rdrNameOcc (Exact name) = nameOccName name + +setRdrNameSpace :: RdrName -> NameSpace -> RdrName +-- This rather gruesome function is used mainly by the parser +-- When parsing data T a = T | T1 Int +-- we parse the data constructors as *types* because of parser ambiguities, +-- so then we need to change the *type constr* to a *data constr* +-- +-- The original-name case *can* occur when parsing +-- data [] a = [] | a : [a] +-- For the orig-name case we return an unqualified name. 
+setRdrNameSpace (Unqual occ) ns = Unqual (setOccNameSpace ns occ) +setRdrNameSpace (Qual m occ) ns = Qual m (setOccNameSpace ns occ) +setRdrNameSpace (Orig m occ) ns = Orig m (setOccNameSpace ns occ) +setRdrNameSpace (Exact n) ns = Orig (nameModule n) + (setOccNameSpace ns (nameOccName n)) +\end{code} + +\begin{code} + -- These two are the basic constructors +mkRdrUnqual :: OccName -> RdrName +mkRdrUnqual occ = Unqual occ + +mkRdrQual :: Module -> OccName -> RdrName +mkRdrQual mod occ = Qual mod occ + +mkOrig :: Module -> OccName -> RdrName +mkOrig mod occ = Orig mod occ + +--------------- +mkDerivedRdrName :: Name -> (OccName -> OccName) -> (RdrName) +mkDerivedRdrName parent mk_occ + = mkOrig (nameModule parent) (mk_occ (nameOccName parent)) + +--------------- + -- These two are used when parsing source files + -- They do encode the module and occurrence names +mkUnqual :: NameSpace -> FastString -> RdrName +mkUnqual sp n = Unqual (mkOccNameFS sp n) + +mkVarUnqual :: FastString -> RdrName +mkVarUnqual n = Unqual (mkVarOccFS n) + +mkQual :: NameSpace -> (FastString, FastString) -> RdrName +mkQual sp (m, n) = Qual (mkModuleFS m) (mkOccNameFS sp n) + +getRdrName :: NamedThing thing => thing -> RdrName +getRdrName name = nameRdrName (getName name) + +nameRdrName :: Name -> RdrName +nameRdrName name = Exact name +-- Keep the Name even for Internal names, so that the +-- unique is still there for debug printing, particularly +-- of Types (which are converted to IfaceTypes before printing) + +nukeExact :: Name -> RdrName +nukeExact n + | isExternalName n = Orig (nameModule n) (nameOccName n) + | otherwise = Unqual (nameOccName n) +\end{code} + +\begin{code} +isRdrDataCon rn = isDataOcc (rdrNameOcc rn) +isRdrTyVar rn = isTvOcc (rdrNameOcc rn) +isRdrTc rn = isTcOcc (rdrNameOcc rn) + +isSrcRdrName (Unqual _) = True +isSrcRdrName (Qual _ _) = True +isSrcRdrName _ = False + +isUnqual (Unqual _) = True +isUnqual other = False + +isQual (Qual _ _) = True +isQual _ = False + +isOrig (Orig _ _) = True +isOrig _ = False + +isOrig_maybe (Orig m n) = Just (m,n) +isOrig_maybe _ = Nothing + +isExact (Exact _) = True +isExact other = False + +isExact_maybe (Exact n) = Just n +isExact_maybe other = Nothing +\end{code} + + +%************************************************************************ +%* * +\subsection{Instances} +%* * +%************************************************************************ + +\begin{code} +instance Outputable RdrName where + ppr (Exact name) = ppr name + ppr (Unqual occ) = ppr occ <+> ppr_name_space occ + ppr (Qual mod occ) = ppr mod <> dot <> ppr occ <+> ppr_name_space occ + ppr (Orig mod occ) = ppr mod <> dot <> ppr occ <+> ppr_name_space occ + +ppr_name_space occ = ifPprDebug (parens (pprNonVarNameSpace (occNameSpace occ))) + +instance OutputableBndr RdrName where + pprBndr _ n + | isTvOcc (rdrNameOcc n) = char '@' <+> ppr n + | otherwise = ppr n + +instance Eq RdrName where + (Exact n1) == (Exact n2) = n1==n2 + -- Convert exact to orig + (Exact n1) == r2@(Orig _ _) = nukeExact n1 == r2 + r1@(Orig _ _) == (Exact n2) = r1 == nukeExact n2 + + (Orig m1 o1) == (Orig m2 o2) = m1==m2 && o1==o2 + (Qual m1 o1) == (Qual m2 o2) = m1==m2 && o1==o2 + (Unqual o1) == (Unqual o2) = o1==o2 + r1 == r2 = False + +instance Ord RdrName where + a <= b = case (a `compare` b) of { LT -> True; EQ -> True; GT -> False } + a < b = case (a `compare` b) of { LT -> True; EQ -> False; GT -> False } + a >= b = case (a `compare` b) of { LT -> False; EQ -> True; GT -> True } + a > b = case (a `compare` b) of 
{ LT -> False; EQ -> False; GT -> True } + + -- Exact < Unqual < Qual < Orig + -- [Note: Apr 2004] We used to use nukeExact to convert Exact to Orig + -- before comparing so that Prelude.map == the exact Prelude.map, but + -- that meant that we reported duplicates when renaming bindings + -- generated by Template Haskell; e.g + -- do { n1 <- newName "foo"; n2 <- newName "foo"; + -- <decl involving n1,n2> } + -- I think we can do without this conversion + compare (Exact n1) (Exact n2) = n1 `compare` n2 + compare (Exact n1) n2 = LT + + compare (Unqual _) (Exact _) = GT + compare (Unqual o1) (Unqual o2) = o1 `compare` o2 + compare (Unqual _) _ = LT + + compare (Qual _ _) (Exact _) = GT + compare (Qual _ _) (Unqual _) = GT + compare (Qual m1 o1) (Qual m2 o2) = (o1 `compare` o2) `thenCmp` (m1 `compare` m2) + compare (Qual _ _) (Orig _ _) = LT + + compare (Orig m1 o1) (Orig m2 o2) = (o1 `compare` o2) `thenCmp` (m1 `compare` m2) + compare (Orig _ _) _ = GT +\end{code} + + + +%************************************************************************ +%* * + LocalRdrEnv +%* * +%************************************************************************ + +A LocalRdrEnv is used for local bindings (let, where, lambda, case) +It is keyed by OccName, because we never use it for qualified names. + +\begin{code} +type LocalRdrEnv = OccEnv Name + +emptyLocalRdrEnv = emptyOccEnv + +extendLocalRdrEnv :: LocalRdrEnv -> [Name] -> LocalRdrEnv +extendLocalRdrEnv env names + = extendOccEnvList env [(nameOccName n, n) | n <- names] + +lookupLocalRdrEnv :: LocalRdrEnv -> RdrName -> Maybe Name +lookupLocalRdrEnv env (Exact name) = Just name +lookupLocalRdrEnv env (Unqual occ) = lookupOccEnv env occ +lookupLocalRdrEnv env other = Nothing + +elemLocalRdrEnv :: RdrName -> LocalRdrEnv -> Bool +elemLocalRdrEnv rdr_name env + | isUnqual rdr_name = rdrNameOcc rdr_name `elemOccEnv` env + | otherwise = False +\end{code} + + +%************************************************************************ +%* * + GlobalRdrEnv +%* * +%************************************************************************ + +\begin{code} +type GlobalRdrEnv = OccEnv [GlobalRdrElt] + -- Keyed by OccName; when looking up a qualified name + -- we look up the OccName part, and then check the Provenance + -- to see if the appropriate qualification is valid. This + -- saves routinely doubling the size of the env by adding both + -- qualified and unqualified names to the domain. 
+ -- + -- The list in the range is reqd because there may be name clashes + -- These only get reported on lookup, not on construction + + -- INVARIANT: All the members of the list have distinct + -- gre_name fields; that is, no duplicate Names + +emptyGlobalRdrEnv = emptyOccEnv + +globalRdrEnvElts :: GlobalRdrEnv -> [GlobalRdrElt] +globalRdrEnvElts env = foldOccEnv (++) [] env + +data GlobalRdrElt + = GRE { gre_name :: Name, + gre_prov :: Provenance -- Why it's in scope + } + +instance Outputable GlobalRdrElt where + ppr gre = ppr name <+> pp_parent (nameParent_maybe name) + <+> parens (pprNameProvenance gre) + where + name = gre_name gre + pp_parent (Just p) = brackets (text "parent:" <+> ppr p) + pp_parent Nothing = empty + +pprGlobalRdrEnv :: GlobalRdrEnv -> SDoc +pprGlobalRdrEnv env + = vcat (map pp (occEnvElts env)) + where + pp gres = ppr (nameOccName (gre_name (head gres))) <> colon <+> + vcat [ ppr (gre_name gre) <+> pprNameProvenance gre + | gre <- gres] +\end{code} + +\begin{code} +lookupGlobalRdrEnv :: GlobalRdrEnv -> OccName -> [GlobalRdrElt] +lookupGlobalRdrEnv env rdr_name = case lookupOccEnv env rdr_name of + Nothing -> [] + Just gres -> gres + +extendGlobalRdrEnv :: GlobalRdrEnv -> GlobalRdrElt -> GlobalRdrEnv +extendGlobalRdrEnv env gre = extendOccEnv_C add env occ [gre] + where + occ = nameOccName (gre_name gre) + add gres _ = gre:gres + +lookupGRE_RdrName :: RdrName -> GlobalRdrEnv -> [GlobalRdrElt] +lookupGRE_RdrName rdr_name env + = case lookupOccEnv env (rdrNameOcc rdr_name) of + Nothing -> [] + Just gres -> pickGREs rdr_name gres + +lookupGRE_Name :: GlobalRdrEnv -> Name -> [GlobalRdrElt] +lookupGRE_Name env name + = [ gre | gre <- lookupGlobalRdrEnv env (nameOccName name), + gre_name gre == name ] + + +pickGREs :: RdrName -> [GlobalRdrElt] -> [GlobalRdrElt] +-- Take a list of GREs which have the right OccName +-- Pick those GREs that are suitable for this RdrName +-- And for those, keep only only the Provenances that are suitable +-- +-- Consider +-- module A ( f ) where +-- import qualified Foo( f ) +-- import Baz( f ) +-- f = undefined +-- Let's suppose that Foo.f and Baz.f are the same entity really. +-- The export of f is ambiguous because it's in scope from the local def +-- and the import. The lookup of (Unqual f) should return a GRE for +-- the locally-defined f, and a GRE for the imported f, with a *single* +-- provenance, namely the one for Baz(f). +pickGREs rdr_name gres + = mapCatMaybes pick gres + where + is_unqual = isUnqual rdr_name + mod = rdrNameModule rdr_name + + pick :: GlobalRdrElt -> Maybe GlobalRdrElt + pick gre@(GRE {gre_prov = LocalDef, gre_name = n}) -- Local def + | is_unqual || nameModule n == mod = Just gre + | otherwise = Nothing + pick gre@(GRE {gre_prov = Imported [is]}) -- Single import (efficiency) + | is_unqual = if not (is_qual (is_decl is)) then Just gre + else Nothing + | otherwise = if mod == is_as (is_decl is) then Just gre + else Nothing + pick gre@(GRE {gre_prov = Imported is}) -- Multiple import + | null filtered_is = Nothing + | otherwise = Just (gre {gre_prov = Imported filtered_is}) + where + filtered_is | is_unqual = filter (not . is_qual . is_decl) is + | otherwise = filter ((== mod) . is_as . is_decl) is + +isLocalGRE :: GlobalRdrElt -> Bool +isLocalGRE (GRE {gre_prov = LocalDef}) = True +isLocalGRE other = False + +unQualOK :: GlobalRdrElt -> Bool +-- An unqualifed version of this thing is in scope +unQualOK (GRE {gre_prov = LocalDef}) = True +unQualOK (GRE {gre_prov = Imported is}) = not (all (is_qual . 
is_decl) is) + +plusGlobalRdrEnv :: GlobalRdrEnv -> GlobalRdrEnv -> GlobalRdrEnv +plusGlobalRdrEnv env1 env2 = plusOccEnv_C (foldr insertGRE) env1 env2 + +mkGlobalRdrEnv :: [GlobalRdrElt] -> GlobalRdrEnv +mkGlobalRdrEnv gres + = foldr add emptyGlobalRdrEnv gres + where + add gre env = extendOccEnv_C (foldr insertGRE) env + (nameOccName (gre_name gre)) + [gre] + +insertGRE :: GlobalRdrElt -> [GlobalRdrElt] -> [GlobalRdrElt] +insertGRE new_g [] = [new_g] +insertGRE new_g (old_g : old_gs) + | gre_name new_g == gre_name old_g + = new_g `plusGRE` old_g : old_gs + | otherwise + = old_g : insertGRE new_g old_gs + +plusGRE :: GlobalRdrElt -> GlobalRdrElt -> GlobalRdrElt +-- Used when the gre_name fields match +plusGRE g1 g2 + = GRE { gre_name = gre_name g1, + gre_prov = gre_prov g1 `plusProv` gre_prov g2 } +\end{code} + + +%************************************************************************ +%* * + Provenance +%* * +%************************************************************************ + +The "provenance" of something says how it came to be in scope. +It's quite elaborate so that we can give accurate unused-name warnings. + +\begin{code} +data Provenance + = LocalDef -- Defined locally + | Imported -- Imported + [ImportSpec] -- INVARIANT: non-empty + +data ImportSpec = ImpSpec { is_decl :: ImpDeclSpec, + is_item :: ImpItemSpec } + deriving( Eq, Ord ) + +data ImpDeclSpec -- Describes a particular import declaration + -- Shared among all the Provenaces for that decl + = ImpDeclSpec { + is_mod :: Module, -- 'import Muggle' + -- Note the Muggle may well not be + -- the defining module for this thing! + is_as :: Module, -- 'as M' (or 'Muggle' if there is no 'as' clause) + is_qual :: Bool, -- True <=> qualified (only) + is_dloc :: SrcSpan -- Location of import declaration + } + +data ImpItemSpec -- Describes import info a particular Name + = ImpAll -- The import had no import list, + -- or had a hiding list + + | ImpSome { -- The import had an import list + is_explicit :: Bool, + is_iloc :: SrcSpan -- Location of the import item + } + -- The is_explicit field is True iff the thing was named + -- *explicitly* in the import specs rather + -- than being imported as part of a "..." group + -- e.g. import C( T(..) ) + -- Here the constructors of T are not named explicitly; + -- only T is named explicitly. 
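	-- An illustrative (hypothetical) example of these records: for a declaration
	--	import qualified Data.List as L ( sortBy )
	-- an occurrence of L.sortBy would carry a provenance of the form
	--	Imported [ ImpSpec { is_decl = ImpDeclSpec { is_mod = Data.List, is_as = L,
	--						     is_qual = True, ... },
	--			     is_item = ImpSome { is_explicit = True, ... } } ]
	-- whereas a plain "import Data.List" with no import list would have
	-- is_qual = False and is_item = ImpAll.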
+ +importSpecLoc :: ImportSpec -> SrcSpan +importSpecLoc (ImpSpec decl ImpAll) = is_dloc decl +importSpecLoc (ImpSpec _ item) = is_iloc item + +importSpecModule :: ImportSpec -> Module +importSpecModule is = is_mod (is_decl is) + +-- Note [Comparing provenance] +-- Comparison of provenance is just used for grouping +-- error messages (in RnEnv.warnUnusedBinds) +instance Eq Provenance where + p1 == p2 = case p1 `compare` p2 of EQ -> True; _ -> False + +instance Eq ImpDeclSpec where + p1 == p2 = case p1 `compare` p2 of EQ -> True; _ -> False + +instance Eq ImpItemSpec where + p1 == p2 = case p1 `compare` p2 of EQ -> True; _ -> False + +instance Ord Provenance where + compare LocalDef LocalDef = EQ + compare LocalDef (Imported _) = LT + compare (Imported _ ) LocalDef = GT + compare (Imported is1) (Imported is2) = compare (head is1) + {- See Note [Comparing provenance] -} (head is2) + +instance Ord ImpDeclSpec where + compare is1 is2 = (is_mod is1 `compare` is_mod is2) `thenCmp` + (is_dloc is1 `compare` is_dloc is2) + +instance Ord ImpItemSpec where + compare is1 is2 = is_iloc is1 `compare` is_iloc is2 +\end{code} + +\begin{code} +plusProv :: Provenance -> Provenance -> Provenance +-- Choose LocalDef over Imported +-- There is an obscure bug lurking here; in the presence +-- of recursive modules, something can be imported *and* locally +-- defined, and one might refer to it with a qualified name from +-- the import -- but I'm going to ignore that because it makes +-- the isLocalGRE predicate so much nicer this way +plusProv LocalDef LocalDef = panic "plusProv" +plusProv LocalDef p2 = LocalDef +plusProv p1 LocalDef = LocalDef +plusProv (Imported is1) (Imported is2) = Imported (is1++is2) + +pprNameProvenance :: GlobalRdrElt -> SDoc +-- Print out the place where the name was imported +pprNameProvenance (GRE {gre_name = name, gre_prov = LocalDef}) + = ptext SLIT("defined at") <+> ppr (nameSrcLoc name) +pprNameProvenance (GRE {gre_name = name, gre_prov = Imported (why:whys)}) + = sep [ppr why, nest 2 (ppr_defn (nameSrcLoc name))] + +-- If we know the exact definition point (which we may do with GHCi) +-- then show that too. But not if it's just "imported from X". 
+ppr_defn loc | isGoodSrcLoc loc = parens (ptext SLIT("defined at") <+> ppr loc) + | otherwise = empty + +instance Outputable ImportSpec where + ppr imp_spec@(ImpSpec imp_decl _) + = ptext SLIT("imported from") <+> ppr (is_mod imp_decl) + <+> ptext SLIT("at") <+> ppr (importSpecLoc imp_spec) +\end{code} diff --git a/compiler/basicTypes/SrcLoc.lhs b/compiler/basicTypes/SrcLoc.lhs new file mode 100644 index 0000000000..51d4318b0b --- /dev/null +++ b/compiler/basicTypes/SrcLoc.lhs @@ -0,0 +1,386 @@ +% +% (c) The University of Glasgow, 1992-2003 +% +%************************************************************************ +%* * +\section[SrcLoc]{The @SrcLoc@ type} +%* * +%************************************************************************ + +\begin{code} +module SrcLoc ( + SrcLoc, -- Abstract + + mkSrcLoc, isGoodSrcLoc, mkGeneralSrcLoc, + noSrcLoc, -- "I'm sorry, I haven't a clue" + advanceSrcLoc, + + importedSrcLoc, -- Unknown place in an interface + wiredInSrcLoc, -- Something wired into the compiler + generatedSrcLoc, -- Code generated within the compiler + interactiveSrcLoc, -- Code from an interactive session + + srcLocFile, -- return the file name part + srcLocLine, -- return the line part + srcLocCol, -- return the column part + pprDefnLoc, + + SrcSpan, -- Abstract + noSrcSpan, + mkGeneralSrcSpan, + isGoodSrcSpan, + mkSrcSpan, srcLocSpan, + combineSrcSpans, + srcSpanFile, + srcSpanStartLine, srcSpanEndLine, + srcSpanStartCol, srcSpanEndCol, + srcSpanStart, srcSpanEnd, + + Located(..), getLoc, unLoc, noLoc, eqLocated, cmpLocated, combineLocs, addCLoc + ) where + +#include "HsVersions.h" + +import Util ( thenCmp ) +import Outputable +import FastString +\end{code} + +%************************************************************************ +%* * +\subsection[SrcLoc-SrcLocations]{Source-location information} +%* * +%************************************************************************ + +We keep information about the {\em definition} point for each entity; +this is the obvious stuff: +\begin{code} +data SrcLoc + = SrcLoc FastString -- A precise location (file name) + !Int -- line number, begins at 1 + !Int -- column number, begins at 0 + -- Don't ask me why lines start at 1 and columns start at + -- zero. That's just the way it is, so there. --SDM + + | ImportedLoc String -- Module name + + | UnhelpfulLoc FastString -- Just a general indication +\end{code} + +Note that an entity might be imported via more than one route, and +there could be more than one ``definition point'' --- in two or more +\tr{.hi} files. We deemed it probably-unworthwhile to cater for this +rare case. 
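To make the line/column conventions concrete, here is a small sketch
(illustrative only, not part of the original module; @mkFastString@ is assumed
to come from FastString, and @advanceSrcLoc@ is defined below):

\begin{code}
-- Illustrative only: walk a SrcLoc over the string "ab\ncd".
-- Starting at line 1, column 0, the result is line 2, column 2.
exampleLoc :: SrcLoc
exampleLoc = foldl advanceSrcLoc start "ab\ncd"
  where
    start = mkSrcLoc (mkFastString "Example.hs") 1 0
\end{code}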
+ +%************************************************************************ +%* * +\subsection[SrcLoc-access-fns]{Access functions for names} +%* * +%************************************************************************ + +Things to make 'em: +\begin{code} +mkSrcLoc x line col = SrcLoc x line col +noSrcLoc = UnhelpfulLoc FSLIT("<no location info>") +generatedSrcLoc = UnhelpfulLoc FSLIT("<compiler-generated code>") +wiredInSrcLoc = UnhelpfulLoc FSLIT("<wired into compiler>") +interactiveSrcLoc = UnhelpfulLoc FSLIT("<interactive session>") + +mkGeneralSrcLoc :: FastString -> SrcLoc +mkGeneralSrcLoc = UnhelpfulLoc + +importedSrcLoc :: String -> SrcLoc +importedSrcLoc mod_name = ImportedLoc mod_name + +isGoodSrcLoc (SrcLoc _ _ _) = True +isGoodSrcLoc other = False + +srcLocFile :: SrcLoc -> FastString +srcLocFile (SrcLoc fname _ _) = fname +srcLocFile other = FSLIT("<unknown file") + +srcLocLine :: SrcLoc -> Int +srcLocLine (SrcLoc _ l c) = l +srcLocLine other = panic "srcLocLine: unknown line" + +srcLocCol :: SrcLoc -> Int +srcLocCol (SrcLoc _ l c) = c +srcLocCol other = panic "srcLocCol: unknown col" + +advanceSrcLoc :: SrcLoc -> Char -> SrcLoc +advanceSrcLoc (SrcLoc f l c) '\n' = SrcLoc f (l + 1) 0 +advanceSrcLoc (SrcLoc f l c) _ = SrcLoc f l (c + 1) +advanceSrcLoc loc _ = loc -- Better than nothing +\end{code} + +%************************************************************************ +%* * +\subsection[SrcLoc-instances]{Instance declarations for various names} +%* * +%************************************************************************ + +\begin{code} +-- SrcLoc is an instance of Ord so that we can sort error messages easily +instance Eq SrcLoc where + loc1 == loc2 = case loc1 `cmpSrcLoc` loc2 of + EQ -> True + other -> False + +instance Ord SrcLoc where + compare = cmpSrcLoc + +cmpSrcLoc (UnhelpfulLoc s1) (UnhelpfulLoc s2) = s1 `compare` s2 +cmpSrcLoc (UnhelpfulLoc _) other = LT + +cmpSrcLoc (ImportedLoc _) (UnhelpfulLoc _) = GT +cmpSrcLoc (ImportedLoc m1) (ImportedLoc m2) = m1 `compare` m2 +cmpSrcLoc (ImportedLoc _) other = LT + +cmpSrcLoc (SrcLoc s1 l1 c1) (SrcLoc s2 l2 c2) + = (s1 `compare` s2) `thenCmp` (l1 `cmpline` l2) `thenCmp` (c1 `cmpline` c2) + where + l1 `cmpline` l2 | l1 < l2 = LT + | l1 == l2 = EQ + | otherwise = GT +cmpSrcLoc (SrcLoc _ _ _) other = GT + +instance Outputable SrcLoc where + ppr (SrcLoc src_path src_line src_col) + = getPprStyle $ \ sty -> + if userStyle sty || debugStyle sty then + hcat [ ftext src_path, char ':', + int src_line, + char ':', int src_col + ] + else + hcat [text "{-# LINE ", int src_line, space, + char '\"', ftext src_path, text " #-}"] + + ppr (ImportedLoc mod) = ptext SLIT("Imported from") <+> text mod + ppr (UnhelpfulLoc s) = ftext s +\end{code} + +%************************************************************************ +%* * +\subsection[SrcSpan]{Source Spans} +%* * +%************************************************************************ + +\begin{code} +{- | +A SrcSpan delimits a portion of a text file. It could be represented +by a pair of (line,column) coordinates, but in fact we optimise +slightly by using more compact representations for single-line and +zero-length spans, both of which are quite common. + +The end position is defined to be the column *after* the end of the +span. That is, a span of (1,1)-(1,2) is one character long, and a +span of (1,1)-(1,1) is zero characters long. 
+-} +data SrcSpan + = SrcSpanOneLine -- a common case: a single line + { srcSpanFile :: FastString, + srcSpanLine :: !Int, + srcSpanSCol :: !Int, + srcSpanECol :: !Int + } + + | SrcSpanMultiLine + { srcSpanFile :: FastString, + srcSpanSLine :: !Int, + srcSpanSCol :: !Int, + srcSpanELine :: !Int, + srcSpanECol :: !Int + } + + | SrcSpanPoint + { srcSpanFile :: FastString, + srcSpanLine :: !Int, + srcSpanCol :: !Int + } + + | ImportedSpan String -- Module name + + | UnhelpfulSpan FastString -- Just a general indication + -- also used to indicate an empty span + + deriving Eq + +-- We want to order SrcSpans first by the start point, then by the end point. +instance Ord SrcSpan where + a `compare` b = + (srcSpanStart a `compare` srcSpanStart b) `thenCmp` + (srcSpanEnd a `compare` srcSpanEnd b) + +noSrcSpan = UnhelpfulSpan FSLIT("<no location info>") + +mkGeneralSrcSpan :: FastString -> SrcSpan +mkGeneralSrcSpan = UnhelpfulSpan + +isGoodSrcSpan SrcSpanOneLine{} = True +isGoodSrcSpan SrcSpanMultiLine{} = True +isGoodSrcSpan SrcSpanPoint{} = True +isGoodSrcSpan _ = False + +srcSpanStartLine SrcSpanOneLine{ srcSpanLine=l } = l +srcSpanStartLine SrcSpanMultiLine{ srcSpanSLine=l } = l +srcSpanStartLine SrcSpanPoint{ srcSpanLine=l } = l +srcSpanStartLine _ = panic "SrcLoc.srcSpanStartLine" + +srcSpanEndLine SrcSpanOneLine{ srcSpanLine=l } = l +srcSpanEndLine SrcSpanMultiLine{ srcSpanELine=l } = l +srcSpanEndLine SrcSpanPoint{ srcSpanLine=l } = l +srcSpanEndLine _ = panic "SrcLoc.srcSpanEndLine" + +srcSpanStartCol SrcSpanOneLine{ srcSpanSCol=l } = l +srcSpanStartCol SrcSpanMultiLine{ srcSpanSCol=l } = l +srcSpanStartCol SrcSpanPoint{ srcSpanCol=l } = l +srcSpanStartCol _ = panic "SrcLoc.srcSpanStartCol" + +srcSpanEndCol SrcSpanOneLine{ srcSpanECol=c } = c +srcSpanEndCol SrcSpanMultiLine{ srcSpanECol=c } = c +srcSpanEndCol SrcSpanPoint{ srcSpanCol=c } = c +srcSpanEndCol _ = panic "SrcLoc.srcSpanEndCol" + +srcSpanStart (ImportedSpan str) = ImportedLoc str +srcSpanStart (UnhelpfulSpan str) = UnhelpfulLoc str +srcSpanStart s = + mkSrcLoc (srcSpanFile s) + (srcSpanStartLine s) + (srcSpanStartCol s) + +srcSpanEnd (ImportedSpan str) = ImportedLoc str +srcSpanEnd (UnhelpfulSpan str) = UnhelpfulLoc str +srcSpanEnd s = + mkSrcLoc (srcSpanFile s) + (srcSpanEndLine s) + (srcSpanEndCol s) + +srcLocSpan :: SrcLoc -> SrcSpan +srcLocSpan (ImportedLoc str) = ImportedSpan str +srcLocSpan (UnhelpfulLoc str) = UnhelpfulSpan str +srcLocSpan (SrcLoc file line col) = SrcSpanPoint file line col + +mkSrcSpan :: SrcLoc -> SrcLoc -> SrcSpan +mkSrcSpan (ImportedLoc str) _ = ImportedSpan str +mkSrcSpan (UnhelpfulLoc str) _ = UnhelpfulSpan str +mkSrcSpan _ (ImportedLoc str) = ImportedSpan str +mkSrcSpan _ (UnhelpfulLoc str) = UnhelpfulSpan str +mkSrcSpan loc1 loc2 + | line1 == line2 = if col1 == col2 + then SrcSpanPoint file line1 col1 + else SrcSpanOneLine file line1 col1 col2 + | otherwise = SrcSpanMultiLine file line1 col1 line2 col2 + where + line1 = srcLocLine loc1 + line2 = srcLocLine loc2 + col1 = srcLocCol loc1 + col2 = srcLocCol loc2 + file = srcLocFile loc1 + +combineSrcSpans :: SrcSpan -> SrcSpan -> SrcSpan +-- Assumes the 'file' part is the same in both +combineSrcSpans (ImportedSpan str) _ = ImportedSpan str +combineSrcSpans (UnhelpfulSpan str) r = r -- this seems more useful +combineSrcSpans _ (ImportedSpan str) = ImportedSpan str +combineSrcSpans l (UnhelpfulSpan str) = l +combineSrcSpans start end + = case line1 `compare` line2 of + EQ -> case col1 `compare` col2 of + EQ -> SrcSpanPoint file line1 col1 + LT -> 
SrcSpanOneLine file line1 col1 col2 + GT -> SrcSpanOneLine file line1 col2 col1 + LT -> SrcSpanMultiLine file line1 col1 line2 col2 + GT -> SrcSpanMultiLine file line2 col2 line1 col1 + where + line1 = srcSpanStartLine start + col1 = srcSpanStartCol start + line2 = srcSpanEndLine end + col2 = srcSpanEndCol end + file = srcSpanFile start + +pprDefnLoc :: SrcLoc -> SDoc +-- "defined at ..." or "imported from ..." +pprDefnLoc loc + | isGoodSrcLoc loc = ptext SLIT("Defined at") <+> ppr loc + | otherwise = ppr loc + +instance Outputable SrcSpan where + ppr span + = getPprStyle $ \ sty -> + if userStyle sty || debugStyle sty then + pprUserSpan span + else + hcat [text "{-# LINE ", int (srcSpanStartLine span), space, + char '\"', ftext (srcSpanFile span), text " #-}"] + + +pprUserSpan (SrcSpanOneLine src_path line start_col end_col) + = hcat [ ftext src_path, char ':', + int line, + char ':', int start_col + ] + <> if end_col - start_col <= 1 + then empty + -- for single-character or point spans, we just output the starting + -- column number + else char '-' <> int (end_col-1) + +pprUserSpan (SrcSpanMultiLine src_path sline scol eline ecol) + = hcat [ ftext src_path, char ':', + parens (int sline <> char ',' <> int scol), + char '-', + parens (int eline <> char ',' <> + if ecol == 0 then int ecol else int (ecol-1)) + ] + +pprUserSpan (SrcSpanPoint src_path line col) + = hcat [ ftext src_path, char ':', + int line, + char ':', int col + ] + +pprUserSpan (ImportedSpan mod) = ptext SLIT("Imported from") <+> quotes (text mod) +pprUserSpan (UnhelpfulSpan s) = ftext s +\end{code} + +%************************************************************************ +%* * +\subsection[Located]{Attaching SrcSpans to things} +%* * +%************************************************************************ + +\begin{code} +-- | We attach SrcSpans to lots of things, so let's have a datatype for it. +data Located e = L SrcSpan e + +unLoc :: Located e -> e +unLoc (L _ e) = e + +getLoc :: Located e -> SrcSpan +getLoc (L l _) = l + +noLoc :: e -> Located e +noLoc e = L noSrcSpan e + +combineLocs :: Located a -> Located b -> SrcSpan +combineLocs a b = combineSrcSpans (getLoc a) (getLoc b) + +addCLoc :: Located a -> Located b -> c -> Located c +addCLoc a b c = L (combineSrcSpans (getLoc a) (getLoc b)) c + +-- not clear whether to add a general Eq instance, but this is useful sometimes: +eqLocated :: Eq a => Located a -> Located a -> Bool +eqLocated a b = unLoc a == unLoc b + +-- not clear whether to add a general Eq instance, but this is useful sometimes: +cmpLocated :: Ord a => Located a -> Located a -> Ordering +cmpLocated a b = unLoc a `compare` unLoc b + +instance Functor Located where + fmap f (L l e) = L l (f e) + +instance Outputable e => Outputable (Located e) where + ppr (L span e) = ppr e + -- do we want to dump the span in debugSty mode? 
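-- An illustrative (hypothetical) use of the combinators above: a parser
-- building an application node from two located sub-terms can give the
-- result the combined span of its children:
--	mkApp :: Located e -> Located e -> Located (e, e)   -- hypothetical helper
--	mkApp f x = addCLoc f x (unLoc f, unLoc x)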
+\end{code} diff --git a/compiler/basicTypes/UniqSupply.lhs b/compiler/basicTypes/UniqSupply.lhs new file mode 100644 index 0000000000..41ad5c0f60 --- /dev/null +++ b/compiler/basicTypes/UniqSupply.lhs @@ -0,0 +1,203 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[UniqSupply]{The @UniqueSupply@ data type and a (monadic) supply thereof} + +\begin{code} +module UniqSupply ( + + UniqSupply, -- Abstractly + + uniqFromSupply, uniqsFromSupply, -- basic ops + + UniqSM, -- type: unique supply monad + initUs, initUs_, thenUs, thenUs_, returnUs, fixUs, getUs, withUs, + getUniqueUs, getUniquesUs, + mapUs, mapAndUnzipUs, mapAndUnzip3Us, + thenMaybeUs, mapAccumLUs, + lazyThenUs, lazyMapUs, + + mkSplitUniqSupply, + splitUniqSupply + ) where + +#include "HsVersions.h" + +import Unique + +import GLAEXTS +import UNSAFE_IO ( unsafeInterleaveIO ) + +w2i x = word2Int# x +i2w x = int2Word# x +i2w_s x = (x :: Int#) +\end{code} + + +%************************************************************************ +%* * +\subsection{Splittable Unique supply: @UniqSupply@} +%* * +%************************************************************************ + +A value of type @UniqSupply@ is unique, and it can +supply {\em one} distinct @Unique@. Also, from the supply, one can +also manufacture an arbitrary number of further @UniqueSupplies@, +which will be distinct from the first and from all others. + +\begin{code} +data UniqSupply + = MkSplitUniqSupply Int -- make the Unique with this + UniqSupply UniqSupply + -- when split => these two supplies +\end{code} + +\begin{code} +mkSplitUniqSupply :: Char -> IO UniqSupply + +splitUniqSupply :: UniqSupply -> (UniqSupply, UniqSupply) +uniqFromSupply :: UniqSupply -> Unique +uniqsFromSupply :: UniqSupply -> [Unique] -- Infinite +\end{code} + +\begin{code} +mkSplitUniqSupply (C# c#) + = let +#if __GLASGOW_HASKELL__ >= 503 + mask# = (i2w (ord# c#)) `uncheckedShiftL#` (i2w_s 24#) +#else + mask# = (i2w (ord# c#)) `shiftL#` (i2w_s 24#) +#endif + -- here comes THE MAGIC: + + -- This is one of the most hammered bits in the whole compiler + mk_supply# + = unsafeInterleaveIO ( + mk_unique >>= \ uniq -> + mk_supply# >>= \ s1 -> + mk_supply# >>= \ s2 -> + return (MkSplitUniqSupply uniq s1 s2) + ) + + mk_unique = genSymZh >>= \ (W# u#) -> + return (I# (w2i (mask# `or#` u#))) + in + mk_supply# + +foreign import ccall unsafe "genSymZh" genSymZh :: IO Word + +splitUniqSupply (MkSplitUniqSupply _ s1 s2) = (s1, s2) +\end{code} + +\begin{code} +uniqFromSupply (MkSplitUniqSupply n _ _) = mkUniqueGrimily n +uniqsFromSupply (MkSplitUniqSupply n _ s2) = mkUniqueGrimily n : uniqsFromSupply s2 +\end{code} + +%************************************************************************ +%* * +\subsubsection[UniqSupply-monad]{@UniqSupply@ monad: @UniqSM@} +%* * +%************************************************************************ + +\begin{code} +type UniqSM result = UniqSupply -> (result, UniqSupply) + +-- the initUs function also returns the final UniqSupply; initUs_ drops it +initUs :: UniqSupply -> UniqSM a -> (a,UniqSupply) +initUs init_us m = case m init_us of { (r,us) -> (r,us) } + +initUs_ :: UniqSupply -> UniqSM a -> a +initUs_ init_us m = case m init_us of { (r,us) -> r } + +{-# INLINE thenUs #-} +{-# INLINE lazyThenUs #-} +{-# INLINE returnUs #-} +{-# INLINE splitUniqSupply #-} +\end{code} + +@thenUs@ is where we split the @UniqSupply@. 
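Before the combinators themselves, a small sketch (illustrative only, using
just the operations defined in this module) of how the supply is threaded
through the monad:

\begin{code}
-- Illustrative only: draw two distinct uniques and pair them up.
twoUniqs :: UniqSM (Unique, Unique)
twoUniqs = getUniqueUs  `thenUs` \ u1 ->
           getUniqueUs  `thenUs` \ u2 ->
           returnUs (u1, u2)

-- A typical caller allocates a supply in IO and then runs the monad
-- (the tag character 'x' is arbitrary here):
--	do { us <- mkSplitUniqSupply 'x'
--	   ; let (u1, u2) = initUs_ us twoUniqs
--	   ; ... }
\end{code}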
+\begin{code} +fixUs :: (a -> UniqSM a) -> UniqSM a +fixUs m us + = (r,us') where (r,us') = m r us + +thenUs :: UniqSM a -> (a -> UniqSM b) -> UniqSM b +thenUs expr cont us + = case (expr us) of { (result, us') -> cont result us' } + +lazyThenUs :: UniqSM a -> (a -> UniqSM b) -> UniqSM b +lazyThenUs expr cont us + = let (result, us') = expr us in cont result us' + +thenUs_ :: UniqSM a -> UniqSM b -> UniqSM b +thenUs_ expr cont us + = case (expr us) of { (_, us') -> cont us' } + + +returnUs :: a -> UniqSM a +returnUs result us = (result, us) + +withUs :: (UniqSupply -> (a, UniqSupply)) -> UniqSM a +withUs f us = f us -- Ha ha! + +getUs :: UniqSM UniqSupply +getUs us = splitUniqSupply us + +getUniqueUs :: UniqSM Unique +getUniqueUs us = case splitUniqSupply us of + (us1,us2) -> (uniqFromSupply us1, us2) + +getUniquesUs :: UniqSM [Unique] +getUniquesUs us = case splitUniqSupply us of + (us1,us2) -> (uniqsFromSupply us1, us2) +\end{code} + +\begin{code} +mapUs :: (a -> UniqSM b) -> [a] -> UniqSM [b] +mapUs f [] = returnUs [] +mapUs f (x:xs) + = f x `thenUs` \ r -> + mapUs f xs `thenUs` \ rs -> + returnUs (r:rs) + +lazyMapUs :: (a -> UniqSM b) -> [a] -> UniqSM [b] +lazyMapUs f [] = returnUs [] +lazyMapUs f (x:xs) + = f x `lazyThenUs` \ r -> + lazyMapUs f xs `lazyThenUs` \ rs -> + returnUs (r:rs) + +mapAndUnzipUs :: (a -> UniqSM (b,c)) -> [a] -> UniqSM ([b],[c]) +mapAndUnzip3Us :: (a -> UniqSM (b,c,d)) -> [a] -> UniqSM ([b],[c],[d]) + +mapAndUnzipUs f [] = returnUs ([],[]) +mapAndUnzipUs f (x:xs) + = f x `thenUs` \ (r1, r2) -> + mapAndUnzipUs f xs `thenUs` \ (rs1, rs2) -> + returnUs (r1:rs1, r2:rs2) + +mapAndUnzip3Us f [] = returnUs ([],[],[]) +mapAndUnzip3Us f (x:xs) + = f x `thenUs` \ (r1, r2, r3) -> + mapAndUnzip3Us f xs `thenUs` \ (rs1, rs2, rs3) -> + returnUs (r1:rs1, r2:rs2, r3:rs3) + +thenMaybeUs :: UniqSM (Maybe a) -> (a -> UniqSM (Maybe b)) -> UniqSM (Maybe b) +thenMaybeUs m k + = m `thenUs` \ result -> + case result of + Nothing -> returnUs Nothing + Just x -> k x + +mapAccumLUs :: (acc -> x -> UniqSM (acc, y)) + -> acc + -> [x] + -> UniqSM (acc, [y]) + +mapAccumLUs f b [] = returnUs (b, []) +mapAccumLUs f b (x:xs) + = f b x `thenUs` \ (b__2, x__2) -> + mapAccumLUs f b__2 xs `thenUs` \ (b__3, xs__2) -> + returnUs (b__3, x__2:xs__2) +\end{code} diff --git a/compiler/basicTypes/Unique.lhs b/compiler/basicTypes/Unique.lhs new file mode 100644 index 0000000000..874328863e --- /dev/null +++ b/compiler/basicTypes/Unique.lhs @@ -0,0 +1,330 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% + +@Uniques@ are used to distinguish entities in the compiler (@Ids@, +@Classes@, etc.) from each other. Thus, @Uniques@ are the basic +comparison key in the compiler. + +If there is any single operation that needs to be fast, it is @Unique@ +comparison. Unsurprisingly, there is quite a bit of huff-and-puff +directed to that end. + +Some of the other hair in this code is to be able to use a +``splittable @UniqueSupply@'' if requested/possible (not standard +Haskell). + +\begin{code} +module Unique ( + Unique, Uniquable(..), hasKey, + + pprUnique, + + mkUnique, -- Used in UniqSupply + mkUniqueGrimily, -- Used in UniqSupply only! + getKey, getKey#, -- Used in Var, UniqFM, Name only! + + incrUnique, -- Used for renumbering + deriveUnique, -- Ditto + newTagUnique, -- Used in CgCase + initTyVarUnique, + + isTupleKey, + + -- now all the built-in Uniques (and functions to make them) + -- [the Oh-So-Wonderful Haskell module system wins again...] 
+ mkAlphaTyVarUnique, + mkPrimOpIdUnique, + mkTupleTyConUnique, mkTupleDataConUnique, + mkPreludeMiscIdUnique, mkPreludeDataConUnique, + mkPreludeTyConUnique, mkPreludeClassUnique, + mkPArrDataConUnique, + + mkBuiltinUnique, + mkPseudoUniqueC, + mkPseudoUniqueD, + mkPseudoUniqueE, + mkPseudoUniqueH + ) where + +#include "HsVersions.h" + +import BasicTypes ( Boxity(..) ) +import PackageConfig ( PackageId, packageIdFS ) +import FastString ( FastString, uniqueOfFS ) +import Outputable +import FastTypes + +import GLAEXTS + +import Char ( chr, ord ) +\end{code} + +%************************************************************************ +%* * +\subsection[Unique-type]{@Unique@ type and operations} +%* * +%************************************************************************ + +The @Chars@ are ``tag letters'' that identify the @UniqueSupply@. +Fast comparison is everything on @Uniques@: + +\begin{code} +data Unique = MkUnique Int# +\end{code} + +Now come the functions which construct uniques from their pieces, and vice versa. +The stuff about unique *supplies* is handled further down this module. + +\begin{code} +mkUnique :: Char -> Int -> Unique -- Builds a unique from pieces +unpkUnique :: Unique -> (Char, Int) -- The reverse + +mkUniqueGrimily :: Int -> Unique -- A trap-door for UniqSupply +getKey :: Unique -> Int -- for Var +getKey# :: Unique -> Int# -- for Var + +incrUnique :: Unique -> Unique +deriveUnique :: Unique -> Int -> Unique +newTagUnique :: Unique -> Char -> Unique + +isTupleKey :: Unique -> Bool +\end{code} + + +\begin{code} +mkUniqueGrimily (I# x) = MkUnique x + +{-# INLINE getKey #-} +getKey (MkUnique x) = I# x +{-# INLINE getKey# #-} +getKey# (MkUnique x) = x + +incrUnique (MkUnique i) = MkUnique (i +# 1#) + +-- deriveUnique uses an 'X' tag so that it won't clash with +-- any of the uniques produced any other way +deriveUnique (MkUnique i) delta = mkUnique 'X' (I# i + delta) + +-- newTagUnique changes the "domain" of a unique to a different char +newTagUnique u c = mkUnique c i where (_,i) = unpkUnique u + +-- pop the Char in the top 8 bits of the Unique(Supply) + +-- No 64-bit bugs here, as long as we have at least 32 bits. 
--JSM + +w2i x = word2Int# x +i2w x = int2Word# x +i2w_s x = (x::Int#) + +mkUnique (C# c) (I# i) + = MkUnique (w2i (tag `or#` bits)) + where +#if __GLASGOW_HASKELL__ >= 503 + tag = i2w (ord# c) `uncheckedShiftL#` i2w_s 24# +#else + tag = i2w (ord# c) `shiftL#` i2w_s 24# +#endif + bits = i2w i `and#` (i2w 16777215#){-``0x00ffffff''-} + +unpkUnique (MkUnique u) + = let + tag = C# (chr# (w2i ((i2w u) `shiftr` (i2w_s 24#)))) + i = I# (w2i ((i2w u) `and#` (i2w 16777215#){-``0x00ffffff''-})) + in + (tag, i) + where +#if __GLASGOW_HASKELL__ >= 503 + shiftr x y = uncheckedShiftRL# x y +#else + shiftr x y = shiftRL# x y +#endif +\end{code} + + + +%************************************************************************ +%* * +\subsection[Uniquable-class]{The @Uniquable@ class} +%* * +%************************************************************************ + +\begin{code} +class Uniquable a where + getUnique :: a -> Unique + +hasKey :: Uniquable a => a -> Unique -> Bool +x `hasKey` k = getUnique x == k + +instance Uniquable FastString where + getUnique fs = mkUniqueGrimily (I# (uniqueOfFS fs)) + +instance Uniquable PackageId where + getUnique pid = getUnique (packageIdFS pid) + +instance Uniquable Int where + getUnique i = mkUniqueGrimily i +\end{code} + + +%************************************************************************ +%* * +\subsection[Unique-instances]{Instance declarations for @Unique@} +%* * +%************************************************************************ + +And the whole point (besides uniqueness) is fast equality. We don't +use `deriving' because we want {\em precise} control of ordering +(equality on @Uniques@ is v common). + +\begin{code} +eqUnique (MkUnique u1) (MkUnique u2) = u1 ==# u2 +ltUnique (MkUnique u1) (MkUnique u2) = u1 <# u2 +leUnique (MkUnique u1) (MkUnique u2) = u1 <=# u2 + +cmpUnique (MkUnique u1) (MkUnique u2) + = if u1 ==# u2 then EQ else if u1 <# u2 then LT else GT + +instance Eq Unique where + a == b = eqUnique a b + a /= b = not (eqUnique a b) + +instance Ord Unique where + a < b = ltUnique a b + a <= b = leUnique a b + a > b = not (leUnique a b) + a >= b = not (ltUnique a b) + compare a b = cmpUnique a b + +----------------- +instance Uniquable Unique where + getUnique u = u +\end{code} + +We do sometimes make strings with @Uniques@ in them: +\begin{code} +pprUnique :: Unique -> SDoc +pprUnique uniq + = case unpkUnique uniq of + (tag, u) -> finish_ppr tag u (text (iToBase62 u)) + +#ifdef UNUSED +pprUnique10 :: Unique -> SDoc +pprUnique10 uniq -- in base-10, dudes + = case unpkUnique uniq of + (tag, u) -> finish_ppr tag u (int u) +#endif + +finish_ppr 't' u pp_u | u < 26 + = -- Special case to make v common tyvars, t1, t2, ... + -- come out as a, b, ... (shorter, easier to read) + char (chr (ord 'a' + u)) +finish_ppr tag u pp_u = char tag <> pp_u + +instance Outputable Unique where + ppr u = pprUnique u + +instance Show Unique where + showsPrec p uniq = showsPrecSDoc p (pprUnique uniq) +\end{code} + +%************************************************************************ +%* * +\subsection[Utils-base62]{Base-62 numbers} +%* * +%************************************************************************ + +A character-stingy way to read/write numbers (notably Uniques). +The ``62-its'' are \tr{[0-9a-zA-Z]}. We don't handle negative Ints. +Code stolen from Lennart. 
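For concreteness, here is a boxed reference version of the encoding (an
illustrative sketch only; the real @iToBase62@ below works on unboxed values):

\begin{code}
-- Illustrative only: base-62 rendering with ordinary boxed Ints.
toBase62 :: Int -> String
toBase62 n = go n ""
  where
    digits = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    go m cs | m < 62    = digits !! m : cs
            | otherwise = case m `quotRem` 62 of
                            (q, r) -> go q (digits !! r : cs)
-- For example: toBase62 61 == "Z", toBase62 62 == "10", toBase62 4097 == "145".
\end{code}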
+ +\begin{code} +iToBase62 :: Int -> String +iToBase62 n@(I# n#) + = ASSERT(n >= 0) go n# "" + where + go n# cs | n# <# 62# + = case (indexCharOffAddr# chars62# n#) of { c# -> C# c# : cs } + | otherwise + = case (quotRem (I# n#) 62) of { (I# q#, I# r#) -> + case (indexCharOffAddr# chars62# r#) of { c# -> + go q# (C# c# : cs) }} + + chars62# = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"# +\end{code} + +%************************************************************************ +%* * +\subsection[Uniques-prelude]{@Uniques@ for wired-in Prelude things} +%* * +%************************************************************************ + +Allocation of unique supply characters: + v,t,u : for renumbering value-, type- and usage- vars. + B: builtin + C-E: pseudo uniques (used in native-code generator) + X: uniques derived by deriveUnique + _: unifiable tyvars (above) + 0-9: prelude things below + + other a-z: lower case chars for unique supplies. Used so far: + + d desugarer + f AbsC flattener + g SimplStg + l ndpFlatten + n Native codegen + r Hsc name cache + s simplifier + +\begin{code} +mkAlphaTyVarUnique i = mkUnique '1' i + +mkPreludeClassUnique i = mkUnique '2' i + +-- Prelude type constructors occupy *three* slots. +-- The first is for the tycon itself; the latter two +-- are for the generic to/from Ids. See TysWiredIn.mk_tc_gen_info. + +mkPreludeTyConUnique i = mkUnique '3' (3*i) +mkTupleTyConUnique Boxed a = mkUnique '4' (3*a) +mkTupleTyConUnique Unboxed a = mkUnique '5' (3*a) + +-- Data constructor keys occupy *two* slots. The first is used for the +-- data constructor itself and its wrapper function (the function that +-- evaluates arguments as necessary and calls the worker). The second is +-- used for the worker function (the function that builds the constructor +-- representation). + +mkPreludeDataConUnique i = mkUnique '6' (2*i) -- Must be alphabetic +mkTupleDataConUnique Boxed a = mkUnique '7' (2*a) -- ditto (*may* be used in C labels) +mkTupleDataConUnique Unboxed a = mkUnique '8' (2*a) + +-- This one is used for a tiresome reason +-- to improve a consistency-checking error check in the renamer +isTupleKey u = case unpkUnique u of + (tag,_) -> tag == '4' || tag == '5' || tag == '7' || tag == '8' + +mkPrimOpIdUnique op = mkUnique '9' op +mkPreludeMiscIdUnique i = mkUnique '0' i + +-- No numbers left anymore, so I pick something different for the character +-- tag +mkPArrDataConUnique a = mkUnique ':' (2*a) + +-- The "tyvar uniques" print specially nicely: a, b, c, etc. 
+-- See pprUnique for details + +initTyVarUnique :: Unique +initTyVarUnique = mkUnique 't' 0 + +mkPseudoUniqueC, mkPseudoUniqueD, mkPseudoUniqueE, mkPseudoUniqueH, + mkBuiltinUnique :: Int -> Unique + +mkBuiltinUnique i = mkUnique 'B' i +mkPseudoUniqueC i = mkUnique 'C' i -- used for getUnique on Regs +mkPseudoUniqueD i = mkUnique 'D' i -- used in NCG for getUnique on RealRegs +mkPseudoUniqueE i = mkUnique 'E' i -- used in NCG spiller to create spill VirtualRegs +mkPseudoUniqueH i = mkUnique 'H' i -- used in NCG spiller to create spill VirtualRegs +\end{code} + diff --git a/compiler/basicTypes/Var.lhs b/compiler/basicTypes/Var.lhs new file mode 100644 index 0000000000..60fdf3831c --- /dev/null +++ b/compiler/basicTypes/Var.lhs @@ -0,0 +1,337 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section{@Vars@: Variables} + +\begin{code} +module Var ( + Var, + varName, varUnique, + setVarName, setVarUnique, + + -- TyVars + TyVar, mkTyVar, mkTcTyVar, + tyVarName, tyVarKind, + setTyVarName, setTyVarUnique, + tcTyVarDetails, + + -- Ids + Id, DictId, + idName, idType, idUnique, idInfo, modifyIdInfo, maybeModifyIdInfo, + setIdName, setIdUnique, setIdType, setIdInfo, lazySetIdInfo, + setIdExported, setIdNotExported, + + globalIdDetails, globaliseId, + + mkLocalId, mkExportedLocalId, mkGlobalId, + + isTyVar, isTcTyVar, isId, isLocalVar, isLocalId, + isGlobalId, isExportedId, + mustHaveLocalBinding + ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} TypeRep( Type ) +import {-# SOURCE #-} TcType( TcTyVarDetails, pprTcTyVarDetails ) +import {-# SOURCE #-} IdInfo( GlobalIdDetails, notGlobalId, IdInfo, seqIdInfo ) + +import Name ( Name, NamedThing(..), + setNameUnique, nameUnique + ) +import Kind ( Kind ) +import Unique ( Unique, Uniquable(..), mkUniqueGrimily, getKey# ) +import FastTypes +import Outputable +\end{code} + + +%************************************************************************ +%* * +\subsection{The main data type declarations} +%* * +%************************************************************************ + + +Every @Var@ has a @Unique@, to uniquify it and for fast comparison, a +@Type@, and an @IdInfo@ (non-essential info about it, e.g., +strictness). The essential info about different kinds of @Vars@ is +in its @VarDetails@. + +\begin{code} +data Var + = TyVar { + varName :: !Name, + realUnique :: FastInt, -- Key for fast comparison + -- Identical to the Unique in the name, + -- cached here for speed + tyVarKind :: Kind } + + | TcTyVar { -- Used only during type inference + varName :: !Name, + realUnique :: FastInt, + tyVarKind :: Kind, + tcTyVarDetails :: TcTyVarDetails } + + | GlobalId { -- Used for imported Ids, dict selectors etc + varName :: !Name, + realUnique :: FastInt, + idType :: Type, + idInfo :: IdInfo, + gblDetails :: GlobalIdDetails } + + | LocalId { -- Used for locally-defined Ids (see NOTE below) + varName :: !Name, + realUnique :: FastInt, + idType :: Type, + idInfo :: IdInfo, + lclDetails :: LocalIdDetails } + +data LocalIdDetails + = NotExported -- Not exported + | Exported -- Exported + -- Exported Ids are kept alive; + -- NotExported things may be discarded as dead code. 
+\end{code} + +LocalId and GlobalId +~~~~~~~~~~~~~~~~~~~~ +A GlobalId is + * always a constant (top-level) + * imported, or data constructor, or primop, or record selector + * has a Unique that is globally unique across the whole + GHC invocation (a single invocation may compile multiple modules) + +A LocalId is + * bound within an expression (lambda, case, local let(rec)) + * or defined at top level in the module being compiled + +After CoreTidy, top-level LocalIds are turned into GlobalIds + + +\begin{code} +instance Outputable Var where + ppr var = ppr (varName var) <+> ifPprDebug (brackets extra) + where + extra = case var of + GlobalId {} -> ptext SLIT("gid") + LocalId {} -> ptext SLIT("lid") + TyVar {} -> ptext SLIT("tv") + TcTyVar {tcTyVarDetails = details} -> pprTcTyVarDetails details + +instance Show Var where + showsPrec p var = showsPrecSDoc p (ppr var) + +instance NamedThing Var where + getName = varName + +instance Uniquable Var where + getUnique = varUnique + +instance Eq Var where + a == b = realUnique a ==# realUnique b + +instance Ord Var where + a <= b = realUnique a <=# realUnique b + a < b = realUnique a <# realUnique b + a >= b = realUnique a >=# realUnique b + a > b = realUnique a ># realUnique b + a `compare` b = varUnique a `compare` varUnique b +\end{code} + + +\begin{code} +varUnique :: Var -> Unique +varUnique var = mkUniqueGrimily (iBox (realUnique var)) + +setVarUnique :: Var -> Unique -> Var +setVarUnique var uniq + = var { realUnique = getKey# uniq, + varName = setNameUnique (varName var) uniq } + +setVarName :: Var -> Name -> Var +setVarName var new_name + = var { realUnique = getKey# (getUnique new_name), + varName = new_name } +\end{code} + + +%************************************************************************ +%* * +\subsection{Type variables} +%* * +%************************************************************************ + +\begin{code} +type TyVar = Var + +tyVarName = varName + +setTyVarUnique = setVarUnique +setTyVarName = setVarName +\end{code} + +\begin{code} +mkTyVar :: Name -> Kind -> TyVar +mkTyVar name kind = TyVar { varName = name + , realUnique = getKey# (nameUnique name) + , tyVarKind = kind + } + +mkTcTyVar :: Name -> Kind -> TcTyVarDetails -> TyVar +mkTcTyVar name kind details + = TcTyVar { varName = name, + realUnique = getKey# (nameUnique name), + tyVarKind = kind, + tcTyVarDetails = details + } +\end{code} + + +%************************************************************************ +%* * +\subsection{Id Construction} +%* * +%************************************************************************ + +Most Id-related functions are in Id.lhs and MkId.lhs + +\begin{code} +type Id = Var +type DictId = Id +\end{code} + +\begin{code} +idName = varName +idUnique = varUnique + +setIdUnique :: Id -> Unique -> Id +setIdUnique = setVarUnique + +setIdName :: Id -> Name -> Id +setIdName = setVarName + +setIdType :: Id -> Type -> Id +setIdType id ty = id {idType = ty} + +setIdExported :: Id -> Id +-- Can be called on GlobalIds, such as data cons and class ops, +-- which are "born" as GlobalIds and automatically exported +setIdExported id@(LocalId {}) = id { lclDetails = Exported } +setIdExported other_id = ASSERT( isId other_id ) other_id + +setIdNotExported :: Id -> Id +-- We can only do this to LocalIds +setIdNotExported id = ASSERT( isLocalId id ) id { lclDetails = NotExported } + +globaliseId :: GlobalIdDetails -> Id -> Id +-- If it's a local, make it global +globaliseId details id = GlobalId { varName = varName id, + realUnique = 
realUnique id, + idType = idType id, + idInfo = idInfo id, + gblDetails = details } + +lazySetIdInfo :: Id -> IdInfo -> Id +lazySetIdInfo id info = id {idInfo = info} + +setIdInfo :: Id -> IdInfo -> Id +setIdInfo id info = seqIdInfo info `seq` id {idInfo = info} + -- Try to avoid spack leaks by seq'ing + +modifyIdInfo :: (IdInfo -> IdInfo) -> Id -> Id +modifyIdInfo fn id + = seqIdInfo new_info `seq` id {idInfo = new_info} + where + new_info = fn (idInfo id) + +-- maybeModifyIdInfo tries to avoid unnecesary thrashing +maybeModifyIdInfo :: (IdInfo -> Maybe IdInfo) -> Id -> Id +maybeModifyIdInfo fn id + = case fn (idInfo id) of + Nothing -> id + Just new_info -> id {idInfo = new_info} +\end{code} + +%************************************************************************ +%* * +\subsection{Predicates over variables +%* * +%************************************************************************ + +\begin{code} +mkGlobalId :: GlobalIdDetails -> Name -> Type -> IdInfo -> Id +mkGlobalId details name ty info + = GlobalId { varName = name, + realUnique = getKey# (nameUnique name), -- Cache the unique + idType = ty, + gblDetails = details, + idInfo = info } + +mk_local_id :: Name -> Type -> LocalIdDetails -> IdInfo -> Id +mk_local_id name ty details info + = LocalId { varName = name, + realUnique = getKey# (nameUnique name), -- Cache the unique + idType = ty, + lclDetails = details, + idInfo = info } + +mkLocalId :: Name -> Type -> IdInfo -> Id +mkLocalId name ty info = mk_local_id name ty NotExported info + +mkExportedLocalId :: Name -> Type -> IdInfo -> Id +mkExportedLocalId name ty info = mk_local_id name ty Exported info +\end{code} + +\begin{code} +isTyVar, isTcTyVar :: Var -> Bool +isId, isLocalVar, isLocalId :: Var -> Bool +isGlobalId, isExportedId :: Var -> Bool +mustHaveLocalBinding :: Var -> Bool + +isTyVar (TyVar {}) = True +isTyVar (TcTyVar {}) = True +isTyVar other = False + +isTcTyVar (TcTyVar {}) = True +isTcTyVar other = False + +isId (LocalId {}) = True +isId (GlobalId {}) = True +isId other = False + +isLocalId (LocalId {}) = True +isLocalId other = False + +-- isLocalVar returns True for type variables as well as local Ids +-- These are the variables that we need to pay attention to when finding free +-- variables, or doing dependency analysis. +isLocalVar (GlobalId {}) = False +isLocalVar other = True + +-- mustHaveLocalBinding returns True of Ids and TyVars +-- that must have a binding in this module. The converse +-- is not quite right: there are some GlobalIds that must have +-- bindings, such as record selectors. 
But that doesn't matter, +-- because it's only used for assertions +mustHaveLocalBinding var = isLocalVar var + +isGlobalId (GlobalId {}) = True +isGlobalId other = False + +-- isExportedId means "don't throw this away" +isExportedId (GlobalId {}) = True +isExportedId (LocalId {lclDetails = details}) + = case details of + Exported -> True + other -> False +isExportedId other = False +\end{code} + +\begin{code} +globalIdDetails :: Var -> GlobalIdDetails +-- Works OK on local Ids too, returning notGlobalId +globalIdDetails (GlobalId {gblDetails = details}) = details +globalIdDetails other = notGlobalId +\end{code} + diff --git a/compiler/basicTypes/VarEnv.lhs b/compiler/basicTypes/VarEnv.lhs new file mode 100644 index 0000000000..bfeecdc923 --- /dev/null +++ b/compiler/basicTypes/VarEnv.lhs @@ -0,0 +1,344 @@ + +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section{@VarEnvs@: Variable environments} + +\begin{code} +module VarEnv ( + VarEnv, IdEnv, TyVarEnv, + emptyVarEnv, unitVarEnv, mkVarEnv, + elemVarEnv, varEnvElts, varEnvKeys, + extendVarEnv, extendVarEnv_C, extendVarEnvList, + plusVarEnv, plusVarEnv_C, + delVarEnvList, delVarEnv, + lookupVarEnv, lookupVarEnv_NF, lookupWithDefaultVarEnv, + mapVarEnv, zipVarEnv, + modifyVarEnv, modifyVarEnv_Directly, + isEmptyVarEnv, foldVarEnv, + elemVarEnvByKey, lookupVarEnv_Directly, + filterVarEnv_Directly, + + -- InScopeSet + InScopeSet, emptyInScopeSet, mkInScopeSet, delInScopeSet, + extendInScopeSet, extendInScopeSetList, modifyInScopeSet, + getInScopeVars, lookupInScope, elemInScopeSet, uniqAway, + mapInScopeSet, + + -- RnEnv2 and its operations + RnEnv2, mkRnEnv2, rnBndr2, rnBndrs2, rnOccL, rnOccR, inRnEnvL, inRnEnvR, + rnBndrL, rnBndrR, nukeRnEnvL, nukeRnEnvR, + + -- TidyEnvs + TidyEnv, emptyTidyEnv + ) where + +#include "HsVersions.h" + +import OccName ( TidyOccEnv, emptyTidyOccEnv ) +import Var ( Var, setVarUnique ) +import VarSet +import UniqFM +import Unique ( Unique, deriveUnique, getUnique ) +import Util ( zipEqual, foldl2 ) +import Maybes ( orElse, isJust ) +import StaticFlags( opt_PprStyle_Debug ) +import Outputable +import FastTypes +\end{code} + + +%************************************************************************ +%* * + In-scope sets +%* * +%************************************************************************ + +\begin{code} +data InScopeSet = InScope (VarEnv Var) FastInt + -- The Int# is a kind of hash-value used by uniqAway + -- For example, it might be the size of the set + -- INVARIANT: it's not zero; we use it as a multiplier in uniqAway + +instance Outputable InScopeSet where + ppr (InScope s i) = ptext SLIT("InScope") <+> ppr s + +emptyInScopeSet :: InScopeSet +emptyInScopeSet = InScope emptyVarSet 1# + +getInScopeVars :: InScopeSet -> VarEnv Var +getInScopeVars (InScope vs _) = vs + +mkInScopeSet :: VarEnv Var -> InScopeSet +mkInScopeSet in_scope = InScope in_scope 1# + +extendInScopeSet :: InScopeSet -> Var -> InScopeSet +extendInScopeSet (InScope in_scope n) v = InScope (extendVarEnv in_scope v v) (n +# 1#) + +extendInScopeSetList :: InScopeSet -> [Var] -> InScopeSet +extendInScopeSetList (InScope in_scope n) vs + = InScope (foldl (\s v -> extendVarEnv s v v) in_scope vs) + (n +# iUnbox (length vs)) + +modifyInScopeSet :: InScopeSet -> Var -> Var -> InScopeSet +-- Exploit the fact that the in-scope "set" is really a map +-- Make old_v map to new_v +modifyInScopeSet (InScope in_scope n) old_v new_v = InScope (extendVarEnv in_scope old_v new_v) (n +# 1#) + +delInScopeSet :: InScopeSet -> Var -> 
InScopeSet +delInScopeSet (InScope in_scope n) v = InScope (in_scope `delVarEnv` v) n + +mapInScopeSet :: (Var -> Var) -> InScopeSet -> InScopeSet +mapInScopeSet f (InScope in_scope n) = InScope (mapVarEnv f in_scope) n + +elemInScopeSet :: Var -> InScopeSet -> Bool +elemInScopeSet v (InScope in_scope n) = v `elemVarEnv` in_scope + +lookupInScope :: InScopeSet -> Var -> Maybe Var +-- It's important to look for a fixed point +-- When we see (case x of y { I# v -> ... }) +-- we add [x -> y] to the in-scope set (Simplify.simplCaseBinder). +-- When we lookup up an occurrence of x, we map to y, but then +-- we want to look up y in case it has acquired more evaluation information by now. +lookupInScope (InScope in_scope n) v + = go v + where + go v = case lookupVarEnv in_scope v of + Just v' | v == v' -> Just v' -- Reached a fixed point + | otherwise -> go v' + Nothing -> Nothing +\end{code} + +\begin{code} +uniqAway :: InScopeSet -> Var -> Var +-- (uniqAway in_scope v) finds a unique that is not used in the +-- in-scope set, and gives that to v. It starts with v's current unique, of course, +-- in the hope that it won't have to change it, and thereafter uses a combination +-- of that and the hash-code found in the in-scope set +uniqAway in_scope var + | var `elemInScopeSet` in_scope = uniqAway' in_scope var -- Make a new one + | otherwise = var -- Nothing to do + +uniqAway' :: InScopeSet -> Var -> Var +-- This one *always* makes up a new variable +uniqAway' (InScope set n) var + = try 1# + where + orig_unique = getUnique var + try k +#ifdef DEBUG + | k ># 1000# + = pprPanic "uniqAway loop:" (ppr (iBox k) <+> text "tries" <+> ppr var <+> int (iBox n)) +#endif + | uniq `elemVarSetByKey` set = try (k +# 1#) +#ifdef DEBUG + | opt_PprStyle_Debug && k ># 3# + = pprTrace "uniqAway:" (ppr (iBox k) <+> text "tries" <+> ppr var <+> int (iBox n)) + setVarUnique var uniq +#endif + | otherwise = setVarUnique var uniq + where + uniq = deriveUnique orig_unique (iBox (n *# k)) +\end{code} + + +%************************************************************************ +%* * + Dual renaming +%* * +%************************************************************************ + +When we are comparing (or matching) types or terms, we are faced with +"going under" corresponding binders. E.g. when comparing + \x. e1 ~ \y. e2 + +Basically we want to rename [x->y] or [y->x], but there are lots of +things we must be careful of. In particular, x might be free in e2, or +y in e1. So the idea is that we come up with a fresh binder that is free +in neither, and rename x and y respectively. That means we must maintain + a) a renaming for the left-hand expression + b) a renaming for the right-hand expressions + c) an in-scope set + +Furthermore, when matching, we want to be able to have an 'occurs check', +to prevent + \x. f ~ \y. y +matching with f->y. So for each expression we want to know that set of +locally-bound variables. That is precisely the domain of the mappings (a) +and (b), but we must ensure that we always extend the mappings as we go in. + + +\begin{code} +data RnEnv2 + = RV2 { envL :: VarEnv Var -- Renaming for Left term + , envR :: VarEnv Var -- Renaming for Right term + , in_scope :: InScopeSet } -- In scope in left or right terms + +-- The renamings envL and envR are *guaranteed* to contain a binding +-- for every variable bound as we go into the term, even if it is not +-- renamed. 
That way we can ask what variables are locally bound +-- (inRnEnvL, inRnEnvR) + +mkRnEnv2 :: InScopeSet -> RnEnv2 +mkRnEnv2 vars = RV2 { envL = emptyVarEnv + , envR = emptyVarEnv + , in_scope = vars } + +rnBndrs2 :: RnEnv2 -> [Var] -> [Var] -> RnEnv2 +-- Arg lists must be of equal length +rnBndrs2 env bsL bsR = foldl2 rnBndr2 env bsL bsR + +rnBndr2 :: RnEnv2 -> Var -> Var -> RnEnv2 +-- (rnBndr2 env bL bR) go under a binder bL in the Left term 1, +-- and binder bR in the Right term +-- It finds a new binder, new_b, +-- and returns an environment mapping bL->new_b and bR->new_b resp. +rnBndr2 (RV2 { envL = envL, envR = envR, in_scope = in_scope }) bL bR + = RV2 { envL = extendVarEnv envL bL new_b -- See Note + , envR = extendVarEnv envR bR new_b -- [Rebinding] + , in_scope = extendInScopeSet in_scope new_b } + where + -- Find a new binder not in scope in either term + new_b | not (bL `elemInScopeSet` in_scope) = bL + | not (bR `elemInScopeSet` in_scope) = bR + | otherwise = uniqAway' in_scope bL + + -- Note [Rebinding] + -- If the new var is the same as the old one, note that + -- the extendVarEnv *deletes* any current renaming + -- E.g. (\x. \x. ...) ~ (\y. \z. ...) + -- + -- Inside \x \y { [x->y], [y->y], {y} } + -- \x \z { [x->x], [y->y, z->x], {y,x} } + +rnBndrL, rnBndrR :: RnEnv2 -> Var -> (RnEnv2, Var) +-- Used when there's a binder on one side or the other only +-- Useful when eta-expanding +rnBndrL (RV2 { envL = envL, envR = envR, in_scope = in_scope }) bL + = (RV2 { envL = extendVarEnv envL bL new_b + , envR = envR + , in_scope = extendInScopeSet in_scope new_b }, new_b) + where + new_b | not (bL `elemInScopeSet` in_scope) = bL + | otherwise = uniqAway' in_scope bL + +rnBndrR (RV2 { envL = envL, envR = envR, in_scope = in_scope }) bR + = (RV2 { envL = envL + , envR = extendVarEnv envR bR new_b + , in_scope = extendInScopeSet in_scope new_b }, new_b) + where + new_b | not (bR `elemInScopeSet` in_scope) = bR + | otherwise = uniqAway' in_scope bR + +rnOccL, rnOccR :: RnEnv2 -> Var -> Var +-- Look up the renaming of an occurrence in the left or right term +rnOccL (RV2 { envL = env }) v = lookupVarEnv env v `orElse` v +rnOccR (RV2 { envR = env }) v = lookupVarEnv env v `orElse` v + +inRnEnvL, inRnEnvR :: RnEnv2 -> Var -> Bool +-- Tells whether a variable is locally bound +inRnEnvL (RV2 { envL = env }) v = isJust (lookupVarEnv env v) +inRnEnvR (RV2 { envR = env }) v = isJust (lookupVarEnv env v) + +nukeRnEnvL, nukeRnEnvR :: RnEnv2 -> RnEnv2 +nukeRnEnvL env = env { envL = emptyVarEnv } +nukeRnEnvR env = env { envR = emptyVarEnv } +\end{code} + + +%************************************************************************ +%* * + Tidying +%* * +%************************************************************************ + +When tidying up print names, we keep a mapping of in-scope occ-names +(the TidyOccEnv) and a Var-to-Var of the current renamings. 
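As a rough sketch (illustrative only; the tidying traversals themselves live
in other modules), an occurrence is tidied by consulting the Var-to-Var part
of the pair defined below, falling back to the variable itself when it has
not been renamed:

\begin{code}
-- Illustrative only: tidy an occurrence by looking it up in the renaming.
tidyVarOcc :: TidyEnv -> Var -> Var
tidyVarOcc (_, var_env) v = lookupVarEnv var_env v `orElse` v
\end{code}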
+ +\begin{code} +type TidyEnv = (TidyOccEnv, VarEnv Var) + +emptyTidyEnv :: TidyEnv +emptyTidyEnv = (emptyTidyOccEnv, emptyVarEnv) +\end{code} + + +%************************************************************************ +%* * +\subsection{@VarEnv@s} +%* * +%************************************************************************ + +\begin{code} +type VarEnv elt = UniqFM elt +type IdEnv elt = VarEnv elt +type TyVarEnv elt = VarEnv elt + +emptyVarEnv :: VarEnv a +mkVarEnv :: [(Var, a)] -> VarEnv a +zipVarEnv :: [Var] -> [a] -> VarEnv a +unitVarEnv :: Var -> a -> VarEnv a +extendVarEnv :: VarEnv a -> Var -> a -> VarEnv a +extendVarEnv_C :: (a->a->a) -> VarEnv a -> Var -> a -> VarEnv a +plusVarEnv :: VarEnv a -> VarEnv a -> VarEnv a +extendVarEnvList :: VarEnv a -> [(Var, a)] -> VarEnv a + +lookupVarEnv_Directly :: VarEnv a -> Unique -> Maybe a +filterVarEnv_Directly :: (Unique -> a -> Bool) -> VarEnv a -> VarEnv a +delVarEnvList :: VarEnv a -> [Var] -> VarEnv a +delVarEnv :: VarEnv a -> Var -> VarEnv a +plusVarEnv_C :: (a -> a -> a) -> VarEnv a -> VarEnv a -> VarEnv a +mapVarEnv :: (a -> b) -> VarEnv a -> VarEnv b +modifyVarEnv :: (a -> a) -> VarEnv a -> Var -> VarEnv a +varEnvElts :: VarEnv a -> [a] +varEnvKeys :: VarEnv a -> [Unique] + +isEmptyVarEnv :: VarEnv a -> Bool +lookupVarEnv :: VarEnv a -> Var -> Maybe a +lookupVarEnv_NF :: VarEnv a -> Var -> a +lookupWithDefaultVarEnv :: VarEnv a -> a -> Var -> a +elemVarEnv :: Var -> VarEnv a -> Bool +elemVarEnvByKey :: Unique -> VarEnv a -> Bool +foldVarEnv :: (a -> b -> b) -> b -> VarEnv a -> b +\end{code} + +\begin{code} +elemVarEnv = elemUFM +elemVarEnvByKey = elemUFM_Directly +extendVarEnv = addToUFM +extendVarEnv_C = addToUFM_C +extendVarEnvList = addListToUFM +plusVarEnv_C = plusUFM_C +delVarEnvList = delListFromUFM +delVarEnv = delFromUFM +plusVarEnv = plusUFM +lookupVarEnv = lookupUFM +lookupWithDefaultVarEnv = lookupWithDefaultUFM +mapVarEnv = mapUFM +mkVarEnv = listToUFM +emptyVarEnv = emptyUFM +varEnvElts = eltsUFM +varEnvKeys = keysUFM +unitVarEnv = unitUFM +isEmptyVarEnv = isNullUFM +foldVarEnv = foldUFM +lookupVarEnv_Directly = lookupUFM_Directly +filterVarEnv_Directly = filterUFM_Directly + +zipVarEnv tyvars tys = mkVarEnv (zipEqual "zipVarEnv" tyvars tys) +lookupVarEnv_NF env id = case (lookupVarEnv env id) of { Just xx -> xx } +\end{code} + +@modifyVarEnv@: Look up a thing in the VarEnv, +then mash it with the modify function, and put it back. 
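+For example (an illustrative use only): with a @VarEnv Int@, the call
+@modifyVarEnv (+1) env v@ bumps the entry for @v@ if there is one, and
+returns @env@ unchanged otherwise.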
+ +\begin{code} +modifyVarEnv mangle_fn env key + = case (lookupVarEnv env key) of + Nothing -> env + Just xx -> extendVarEnv env key (mangle_fn xx) + +modifyVarEnv_Directly mangle_fn env key + = case (lookupUFM_Directly env key) of + Nothing -> env + Just xx -> addToUFM_Directly env key (mangle_fn xx) +\end{code} diff --git a/compiler/basicTypes/VarSet.lhs b/compiler/basicTypes/VarSet.lhs new file mode 100644 index 0000000000..55e82a8515 --- /dev/null +++ b/compiler/basicTypes/VarSet.lhs @@ -0,0 +1,105 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section{@VarSet@: Variable sets} + +\begin{code} +module VarSet ( + VarSet, IdSet, TyVarSet, + emptyVarSet, unitVarSet, mkVarSet, + extendVarSet, extendVarSetList, extendVarSet_C, + elemVarSet, varSetElems, subVarSet, + unionVarSet, unionVarSets, + intersectVarSet, intersectsVarSet, + isEmptyVarSet, delVarSet, delVarSetList, delVarSetByKey, + minusVarSet, foldVarSet, filterVarSet, + lookupVarSet, mapVarSet, sizeVarSet, seqVarSet, + elemVarSetByKey + ) where + +#include "HsVersions.h" + +import Var ( Var, Id, TyVar ) +import Unique ( Unique ) +import UniqSet +import UniqFM ( delFromUFM_Directly, addToUFM_C ) +\end{code} + +%************************************************************************ +%* * +\subsection{@VarSet@s} +%* * +%************************************************************************ + +\begin{code} +type VarSet = UniqSet Var +type IdSet = UniqSet Id +type TyVarSet = UniqSet TyVar + +emptyVarSet :: VarSet +intersectVarSet :: VarSet -> VarSet -> VarSet +unionVarSet :: VarSet -> VarSet -> VarSet +unionVarSets :: [VarSet] -> VarSet +varSetElems :: VarSet -> [Var] +unitVarSet :: Var -> VarSet +extendVarSet :: VarSet -> Var -> VarSet +extendVarSetList:: VarSet -> [Var] -> VarSet +elemVarSet :: Var -> VarSet -> Bool +delVarSet :: VarSet -> Var -> VarSet +delVarSetList :: VarSet -> [Var] -> VarSet +minusVarSet :: VarSet -> VarSet -> VarSet +isEmptyVarSet :: VarSet -> Bool +mkVarSet :: [Var] -> VarSet +foldVarSet :: (Var -> a -> a) -> a -> VarSet -> a +lookupVarSet :: VarSet -> Var -> Maybe Var + -- Returns the set element, which may be + -- (==) to the argument, but not the same as +mapVarSet :: (Var -> Var) -> VarSet -> VarSet +sizeVarSet :: VarSet -> Int +filterVarSet :: (Var -> Bool) -> VarSet -> VarSet +extendVarSet_C :: (Var->Var->Var) -> VarSet -> Var -> VarSet + +delVarSetByKey :: VarSet -> Unique -> VarSet +elemVarSetByKey :: Unique -> VarSet -> Bool + +emptyVarSet = emptyUniqSet +unitVarSet = unitUniqSet +extendVarSet = addOneToUniqSet +extendVarSetList= addListToUniqSet +intersectVarSet = intersectUniqSets + +intersectsVarSet:: VarSet -> VarSet -> Bool -- True if non-empty intersection + -- (s1 `intersectsVarSet` s2) doesn't compute s2 if s1 is empty +subVarSet :: VarSet -> VarSet -> Bool -- True if first arg is subset of second + -- (s1 `subVarSet` s2) doesn't compute s2 if s1 is empty + +unionVarSet = unionUniqSets +unionVarSets = unionManyUniqSets +varSetElems = uniqSetToList +elemVarSet = elementOfUniqSet +minusVarSet = minusUniqSet +delVarSet = delOneFromUniqSet +delVarSetList = delListFromUniqSet +isEmptyVarSet = isEmptyUniqSet +mkVarSet = mkUniqSet +foldVarSet = foldUniqSet +lookupVarSet = lookupUniqSet +mapVarSet = mapUniqSet +sizeVarSet = sizeUniqSet +filterVarSet = filterUniqSet +extendVarSet_C combine s x = addToUFM_C combine s x x +delVarSetByKey = delFromUFM_Directly -- Can't be bothered to add this to UniqSet +elemVarSetByKey = elemUniqSet_Directly +\end{code} + +\begin{code} +-- See 
comments with type signatures +intersectsVarSet s1 s2 = not (isEmptyVarSet (s1 `intersectVarSet` s2)) +a `subVarSet` b = isEmptyVarSet (a `minusVarSet` b) +\end{code} + +\begin{code} +seqVarSet :: VarSet -> () +seqVarSet s = sizeVarSet s `seq` () +\end{code} + diff --git a/compiler/cbits/rawSystem.c b/compiler/cbits/rawSystem.c new file mode 100644 index 0000000000..d103f4808b --- /dev/null +++ b/compiler/cbits/rawSystem.c @@ -0,0 +1,6 @@ +/* Grab rawSystem from the library sources iff we're bootstrapping with an + * old version of GHC. + */ +#if __GLASGOW_HASKELL__ < 601 +#include "../../libraries/base/cbits/rawSystem.c" +#endif diff --git a/compiler/cmm/CLabel.hs b/compiler/cmm/CLabel.hs new file mode 100644 index 0000000000..e42b92db5a --- /dev/null +++ b/compiler/cmm/CLabel.hs @@ -0,0 +1,831 @@ +----------------------------------------------------------------------------- +-- +-- Object-file symbols (called CLabel for histerical raisins). +-- +-- (c) The University of Glasgow 2004 +-- +----------------------------------------------------------------------------- + +module CLabel ( + CLabel, -- abstract type + + mkClosureLabel, + mkSRTLabel, + mkSRTDescLabel, + mkInfoTableLabel, + mkEntryLabel, + mkSlowEntryLabel, + mkConEntryLabel, + mkStaticConEntryLabel, + mkRednCountsLabel, + mkConInfoTableLabel, + mkStaticInfoTableLabel, + mkApEntryLabel, + mkApInfoTableLabel, + mkClosureTableLabel, + + mkLocalClosureLabel, + mkLocalInfoTableLabel, + mkLocalEntryLabel, + mkLocalConEntryLabel, + mkLocalStaticConEntryLabel, + mkLocalConInfoTableLabel, + mkLocalStaticInfoTableLabel, + mkLocalClosureTableLabel, + + mkReturnPtLabel, + mkReturnInfoLabel, + mkAltLabel, + mkDefaultLabel, + mkBitmapLabel, + mkStringLitLabel, + + mkAsmTempLabel, + + mkModuleInitLabel, + mkPlainModuleInitLabel, + + mkSplitMarkerLabel, + mkDirty_MUT_VAR_Label, + mkUpdInfoLabel, + mkSeqInfoLabel, + mkIndStaticInfoLabel, + mkMainCapabilityLabel, + mkMAP_FROZEN_infoLabel, + mkMAP_DIRTY_infoLabel, + mkEMPTY_MVAR_infoLabel, + + mkTopTickyCtrLabel, + mkCAFBlackHoleInfoTableLabel, + mkSECAFBlackHoleInfoTableLabel, + mkRtsPrimOpLabel, + mkRtsSlowTickyCtrLabel, + + moduleRegdLabel, + + mkSelectorInfoLabel, + mkSelectorEntryLabel, + + mkRtsInfoLabel, + mkRtsEntryLabel, + mkRtsRetInfoLabel, + mkRtsRetLabel, + mkRtsCodeLabel, + mkRtsDataLabel, + + mkRtsInfoLabelFS, + mkRtsEntryLabelFS, + mkRtsRetInfoLabelFS, + mkRtsRetLabelFS, + mkRtsCodeLabelFS, + mkRtsDataLabelFS, + + mkRtsApFastLabel, + + mkForeignLabel, + + mkCCLabel, mkCCSLabel, + + DynamicLinkerLabelInfo(..), + mkDynamicLinkerLabel, + dynamicLinkerLabelInfo, + + mkPicBaseLabel, + mkDeadStripPreventer, + + infoLblToEntryLbl, entryLblToInfoLbl, + needsCDecl, isAsmTemp, externallyVisibleCLabel, + CLabelType(..), labelType, labelDynamic, + + pprCLabel + ) where + + +#include "HsVersions.h" + +import Packages ( HomeModules ) +import StaticFlags ( opt_Static, opt_DoTickyProfiling ) +import Packages ( isHomeModule, isDllName ) +import DataCon ( ConTag ) +import Module ( moduleFS, Module ) +import Name ( Name, isExternalName ) +import Unique ( pprUnique, Unique ) +import PrimOp ( PrimOp ) +import Config ( cLeadingUnderscore ) +import CostCentre ( CostCentre, CostCentreStack ) +import Outputable +import FastString + +-- ----------------------------------------------------------------------------- +-- The CLabel type + +{- +CLabel is an abstract type that supports the following operations: + + - Pretty printing + + - In a C file, does it need to be declared before use? (i.e. 
is it
+    guaranteed to be already in scope in the places we need to refer to it?)
+
+  - If it needs to be declared, what type (code or data) should it be
+    declared to have?
+
+  - Is it visible outside this object file or not?
+
+  - Is it "dynamic" (see details below)
+
+  - Eq and Ord, so that we can make sets of CLabels (currently only
+    used in outputting C as far as I can tell, to avoid generating
+    more than one declaration for any given label).
+
+  - Converting an info table label into an entry label.
+-}
+
+data CLabel
+  = IdLabel             -- A family of labels related to the
+        Name            -- definition of a particular Id or Con
+        IdLabelInfo
+
+  | DynIdLabel          -- like IdLabel, but in a separate package,
+        Name            -- and might therefore need a dynamic
+        IdLabelInfo     -- reference.
+
+  | CaseLabel           -- A family of labels related to a particular
+                        -- case expression.
+        {-# UNPACK #-} !Unique  -- Unique says which case expression
+        CaseLabelInfo
+
+  | AsmTempLabel
+        {-# UNPACK #-} !Unique
+
+  | StringLitLabel
+        {-# UNPACK #-} !Unique
+
+  | ModuleInitLabel
+        Module                  -- the module name
+        String                  -- its "way"
+        Bool                    -- True <=> is in a different package
+        -- at some point we might want some kind of version number in
+        -- the module init label, to guard against compiling modules in
+        -- the wrong order. We can't use the interface file version however,
+        -- because we don't always recompile modules which depend on a module
+        -- whose version has changed.
+
+  | PlainModuleInitLabel        -- without the version & way info
+        Module
+        Bool                    -- True <=> is in a different package
+
+  | ModuleRegdLabel
+
+  | RtsLabel RtsLabelInfo
+
+  | ForeignLabel FastString     -- a 'C' (or otherwise foreign) label
+        (Maybe Int)             -- possible '@n' suffix for stdcall functions
+                -- When generating C, the '@n' suffix is omitted, but when
+                -- generating assembler we must add it to the label.
+        Bool                    -- True <=> is dynamic
+
+  | CC_Label CostCentre
+  | CCS_Label CostCentreStack
+
+      -- Dynamic Linking in the NCG:
+      -- generated and used inside the NCG only,
+      -- see module PositionIndependentCode for details.
+
+  | DynamicLinkerLabel DynamicLinkerLabelInfo CLabel
+        -- special variants of a label used for dynamic linking
+
+  | PicBaseLabel        -- a label used as a base for PIC calculations
+                        -- on some platforms.
+                        -- It takes the form of a local numeric
+                        -- assembler label '1'; it is pretty-printed
+                        -- as 1b, referring to the previous definition
+                        -- of 1: in the assembler source file.
+ + | DeadStripPreventer CLabel + -- label before an info table to prevent excessive dead-stripping on darwin + + deriving (Eq, Ord) + +data IdLabelInfo + = Closure -- Label for closure + | SRT -- Static reference table + | SRTDesc -- Static reference table descriptor + | InfoTable -- Info tables for closures; always read-only + | Entry -- entry point + | Slow -- slow entry point + + | RednCounts -- Label of place to keep Ticky-ticky info for + -- this Id + + | Bitmap -- A bitmap (function or case return) + + | ConEntry -- constructor entry point + | ConInfoTable -- corresponding info table + | StaticConEntry -- static constructor entry point + | StaticInfoTable -- corresponding info table + + | ClosureTable -- table of closures for Enum tycons + + deriving (Eq, Ord) + + +data CaseLabelInfo + = CaseReturnPt + | CaseReturnInfo + | CaseAlt ConTag + | CaseDefault + deriving (Eq, Ord) + + +data RtsLabelInfo + = RtsSelectorInfoTable Bool{-updatable-} Int{-offset-} -- Selector thunks + | RtsSelectorEntry Bool{-updatable-} Int{-offset-} + + | RtsApInfoTable Bool{-updatable-} Int{-arity-} -- AP thunks + | RtsApEntry Bool{-updatable-} Int{-arity-} + + | RtsPrimOp PrimOp + + | RtsInfo LitString -- misc rts info tables + | RtsEntry LitString -- misc rts entry points + | RtsRetInfo LitString -- misc rts ret info tables + | RtsRet LitString -- misc rts return points + | RtsData LitString -- misc rts data bits, eg CHARLIKE_closure + | RtsCode LitString -- misc rts code + + | RtsInfoFS FastString -- misc rts info tables + | RtsEntryFS FastString -- misc rts entry points + | RtsRetInfoFS FastString -- misc rts ret info tables + | RtsRetFS FastString -- misc rts return points + | RtsDataFS FastString -- misc rts data bits, eg CHARLIKE_closure + | RtsCodeFS FastString -- misc rts code + + | RtsApFast LitString -- _fast versions of generic apply + + | RtsSlowTickyCtr String + + deriving (Eq, Ord) + -- NOTE: Eq on LitString compares the pointer only, so this isn't + -- a real equality. 
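+
+-- Some illustrative examples of how these are printed (see pprCLbl below):
+--      RtsInfo  SLIT("stg_upd_frame")   =>  stg_upd_frame_info
+--      RtsEntry SLIT("stg_upd_frame")   =>  stg_upd_frame_entry
+--      RtsData  SLIT("MainCapability")  =>  MainCapability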
+ +data DynamicLinkerLabelInfo + = CodeStub -- MachO: Lfoo$stub, ELF: foo@plt + | SymbolPtr -- MachO: Lfoo$non_lazy_ptr, Windows: __imp_foo + | GotSymbolPtr -- ELF: foo@got + | GotSymbolOffset -- ELF: foo@gotoff + + deriving (Eq, Ord) + +-- ----------------------------------------------------------------------------- +-- Constructing CLabels + +-- These are always local: +mkSRTLabel name = IdLabel name SRT +mkSRTDescLabel name = IdLabel name SRTDesc +mkSlowEntryLabel name = IdLabel name Slow +mkBitmapLabel name = IdLabel name Bitmap +mkRednCountsLabel name = IdLabel name RednCounts + +-- These have local & (possibly) external variants: +mkLocalClosureLabel name = IdLabel name Closure +mkLocalInfoTableLabel name = IdLabel name InfoTable +mkLocalEntryLabel name = IdLabel name Entry +mkLocalClosureTableLabel name = IdLabel name ClosureTable + +mkClosureLabel hmods name + | isDllName hmods name = DynIdLabel name Closure + | otherwise = IdLabel name Closure + +mkInfoTableLabel hmods name + | isDllName hmods name = DynIdLabel name InfoTable + | otherwise = IdLabel name InfoTable + +mkEntryLabel hmods name + | isDllName hmods name = DynIdLabel name Entry + | otherwise = IdLabel name Entry + +mkClosureTableLabel hmods name + | isDllName hmods name = DynIdLabel name ClosureTable + | otherwise = IdLabel name ClosureTable + +mkLocalConInfoTableLabel con = IdLabel con ConInfoTable +mkLocalConEntryLabel con = IdLabel con ConEntry +mkLocalStaticInfoTableLabel con = IdLabel con StaticInfoTable +mkLocalStaticConEntryLabel con = IdLabel con StaticConEntry + +mkConInfoTableLabel name False = IdLabel name ConInfoTable +mkConInfoTableLabel name True = DynIdLabel name ConInfoTable + +mkStaticInfoTableLabel name False = IdLabel name StaticInfoTable +mkStaticInfoTableLabel name True = DynIdLabel name StaticInfoTable + +mkConEntryLabel hmods name + | isDllName hmods name = DynIdLabel name ConEntry + | otherwise = IdLabel name ConEntry + +mkStaticConEntryLabel hmods name + | isDllName hmods name = DynIdLabel name StaticConEntry + | otherwise = IdLabel name StaticConEntry + + +mkReturnPtLabel uniq = CaseLabel uniq CaseReturnPt +mkReturnInfoLabel uniq = CaseLabel uniq CaseReturnInfo +mkAltLabel uniq tag = CaseLabel uniq (CaseAlt tag) +mkDefaultLabel uniq = CaseLabel uniq CaseDefault + +mkStringLitLabel = StringLitLabel +mkAsmTempLabel = AsmTempLabel + +mkModuleInitLabel :: HomeModules -> Module -> String -> CLabel +mkModuleInitLabel hmods mod way + = ModuleInitLabel mod way $! (not (isHomeModule hmods mod)) + +mkPlainModuleInitLabel :: HomeModules -> Module -> CLabel +mkPlainModuleInitLabel hmods mod + = PlainModuleInitLabel mod $! 
(not (isHomeModule hmods mod)) + + -- Some fixed runtime system labels + +mkSplitMarkerLabel = RtsLabel (RtsCode SLIT("__stg_split_marker")) +mkDirty_MUT_VAR_Label = RtsLabel (RtsCode SLIT("dirty_MUT_VAR")) +mkUpdInfoLabel = RtsLabel (RtsInfo SLIT("stg_upd_frame")) +mkSeqInfoLabel = RtsLabel (RtsInfo SLIT("stg_seq_frame")) +mkIndStaticInfoLabel = RtsLabel (RtsInfo SLIT("stg_IND_STATIC")) +mkMainCapabilityLabel = RtsLabel (RtsData SLIT("MainCapability")) +mkMAP_FROZEN_infoLabel = RtsLabel (RtsInfo SLIT("stg_MUT_ARR_PTRS_FROZEN0")) +mkMAP_DIRTY_infoLabel = RtsLabel (RtsInfo SLIT("stg_MUT_ARR_PTRS_DIRTY")) +mkEMPTY_MVAR_infoLabel = RtsLabel (RtsInfo SLIT("stg_EMPTY_MVAR")) + +mkTopTickyCtrLabel = RtsLabel (RtsData SLIT("top_ct")) +mkCAFBlackHoleInfoTableLabel = RtsLabel (RtsInfo SLIT("stg_CAF_BLACKHOLE")) +mkSECAFBlackHoleInfoTableLabel = if opt_DoTickyProfiling then + RtsLabel (RtsInfo SLIT("stg_SE_CAF_BLACKHOLE")) + else -- RTS won't have info table unless -ticky is on + panic "mkSECAFBlackHoleInfoTableLabel requires -ticky" +mkRtsPrimOpLabel primop = RtsLabel (RtsPrimOp primop) + +moduleRegdLabel = ModuleRegdLabel + +mkSelectorInfoLabel upd off = RtsLabel (RtsSelectorInfoTable upd off) +mkSelectorEntryLabel upd off = RtsLabel (RtsSelectorEntry upd off) + +mkApInfoTableLabel upd off = RtsLabel (RtsApInfoTable upd off) +mkApEntryLabel upd off = RtsLabel (RtsApEntry upd off) + + -- Foreign labels + +mkForeignLabel :: FastString -> Maybe Int -> Bool -> CLabel +mkForeignLabel str mb_sz is_dynamic = ForeignLabel str mb_sz is_dynamic + + -- Cost centres etc. + +mkCCLabel cc = CC_Label cc +mkCCSLabel ccs = CCS_Label ccs + +mkRtsInfoLabel str = RtsLabel (RtsInfo str) +mkRtsEntryLabel str = RtsLabel (RtsEntry str) +mkRtsRetInfoLabel str = RtsLabel (RtsRetInfo str) +mkRtsRetLabel str = RtsLabel (RtsRet str) +mkRtsCodeLabel str = RtsLabel (RtsCode str) +mkRtsDataLabel str = RtsLabel (RtsData str) + +mkRtsInfoLabelFS str = RtsLabel (RtsInfoFS str) +mkRtsEntryLabelFS str = RtsLabel (RtsEntryFS str) +mkRtsRetInfoLabelFS str = RtsLabel (RtsRetInfoFS str) +mkRtsRetLabelFS str = RtsLabel (RtsRetFS str) +mkRtsCodeLabelFS str = RtsLabel (RtsCodeFS str) +mkRtsDataLabelFS str = RtsLabel (RtsDataFS str) + +mkRtsApFastLabel str = RtsLabel (RtsApFast str) + +mkRtsSlowTickyCtrLabel :: String -> CLabel +mkRtsSlowTickyCtrLabel pat = RtsLabel (RtsSlowTickyCtr pat) + + -- Dynamic linking + +mkDynamicLinkerLabel :: DynamicLinkerLabelInfo -> CLabel -> CLabel +mkDynamicLinkerLabel = DynamicLinkerLabel + +dynamicLinkerLabelInfo :: CLabel -> Maybe (DynamicLinkerLabelInfo, CLabel) +dynamicLinkerLabelInfo (DynamicLinkerLabel info lbl) = Just (info, lbl) +dynamicLinkerLabelInfo _ = Nothing + + -- Position independent code + +mkPicBaseLabel :: CLabel +mkPicBaseLabel = PicBaseLabel + +mkDeadStripPreventer :: CLabel -> CLabel +mkDeadStripPreventer lbl = DeadStripPreventer lbl + +-- ----------------------------------------------------------------------------- +-- Converting info labels to entry labels. 
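+--
+-- For example (illustrative): an IdLabel whose info table prints as
+-- something like Foo_f_info is mapped to the entry label that prints as
+-- Foo_f_entry, and RtsInfo SLIT("stg_upd_frame") is mapped to
+-- RtsEntry SLIT("stg_upd_frame"), i.e. stg_upd_frame_info <-> stg_upd_frame_entry.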
+ +infoLblToEntryLbl :: CLabel -> CLabel +infoLblToEntryLbl (IdLabel n InfoTable) = IdLabel n Entry +infoLblToEntryLbl (IdLabel n ConInfoTable) = IdLabel n ConEntry +infoLblToEntryLbl (IdLabel n StaticInfoTable) = IdLabel n StaticConEntry +infoLblToEntryLbl (DynIdLabel n InfoTable) = DynIdLabel n Entry +infoLblToEntryLbl (DynIdLabel n ConInfoTable) = DynIdLabel n ConEntry +infoLblToEntryLbl (DynIdLabel n StaticInfoTable) = DynIdLabel n StaticConEntry +infoLblToEntryLbl (CaseLabel n CaseReturnInfo) = CaseLabel n CaseReturnPt +infoLblToEntryLbl (RtsLabel (RtsInfo s)) = RtsLabel (RtsEntry s) +infoLblToEntryLbl (RtsLabel (RtsRetInfo s)) = RtsLabel (RtsRet s) +infoLblToEntryLbl (RtsLabel (RtsInfoFS s)) = RtsLabel (RtsEntryFS s) +infoLblToEntryLbl (RtsLabel (RtsRetInfoFS s)) = RtsLabel (RtsRetFS s) +infoLblToEntryLbl _ = panic "CLabel.infoLblToEntryLbl" + +entryLblToInfoLbl :: CLabel -> CLabel +entryLblToInfoLbl (IdLabel n Entry) = IdLabel n InfoTable +entryLblToInfoLbl (IdLabel n ConEntry) = IdLabel n ConInfoTable +entryLblToInfoLbl (IdLabel n StaticConEntry) = IdLabel n StaticInfoTable +entryLblToInfoLbl (DynIdLabel n Entry) = DynIdLabel n InfoTable +entryLblToInfoLbl (DynIdLabel n ConEntry) = DynIdLabel n ConInfoTable +entryLblToInfoLbl (DynIdLabel n StaticConEntry) = DynIdLabel n StaticInfoTable +entryLblToInfoLbl (CaseLabel n CaseReturnPt) = CaseLabel n CaseReturnInfo +entryLblToInfoLbl (RtsLabel (RtsEntry s)) = RtsLabel (RtsInfo s) +entryLblToInfoLbl (RtsLabel (RtsRet s)) = RtsLabel (RtsRetInfo s) +entryLblToInfoLbl (RtsLabel (RtsEntryFS s)) = RtsLabel (RtsInfoFS s) +entryLblToInfoLbl (RtsLabel (RtsRetFS s)) = RtsLabel (RtsRetInfoFS s) +entryLblToInfoLbl l = pprPanic "CLabel.entryLblToInfoLbl" (pprCLabel l) + +-- ----------------------------------------------------------------------------- +-- Does a CLabel need declaring before use or not? + +needsCDecl :: CLabel -> Bool + -- False <=> it's pre-declared; don't bother + -- don't bother declaring SRT & Bitmap labels, we always make sure + -- they are defined before use. +needsCDecl (IdLabel _ SRT) = False +needsCDecl (IdLabel _ SRTDesc) = False +needsCDecl (IdLabel _ Bitmap) = False +needsCDecl (IdLabel _ _) = True +needsCDecl (DynIdLabel _ _) = True +needsCDecl (CaseLabel _ _) = True +needsCDecl (ModuleInitLabel _ _ _) = True +needsCDecl (PlainModuleInitLabel _ _) = True +needsCDecl ModuleRegdLabel = False + +needsCDecl (StringLitLabel _) = False +needsCDecl (AsmTempLabel _) = False +needsCDecl (RtsLabel _) = False +needsCDecl (ForeignLabel _ _ _) = False +needsCDecl (CC_Label _) = True +needsCDecl (CCS_Label _) = True + +-- Whether the label is an assembler temporary: + +isAsmTemp :: CLabel -> Bool -- is a local temporary for native code generation +isAsmTemp (AsmTempLabel _) = True +isAsmTemp _ = False + +-- ----------------------------------------------------------------------------- +-- Is a CLabel visible outside this object file or not? + +-- From the point of view of the code generator, a name is +-- externally visible if it has to be declared as exported +-- in the .o file's symbol table; that is, made non-static. 
+ +externallyVisibleCLabel :: CLabel -> Bool -- not C "static" +externallyVisibleCLabel (CaseLabel _ _) = False +externallyVisibleCLabel (StringLitLabel _) = False +externallyVisibleCLabel (AsmTempLabel _) = False +externallyVisibleCLabel (ModuleInitLabel _ _ _)= True +externallyVisibleCLabel (PlainModuleInitLabel _ _)= True +externallyVisibleCLabel ModuleRegdLabel = False +externallyVisibleCLabel (RtsLabel _) = True +externallyVisibleCLabel (ForeignLabel _ _ _) = True +externallyVisibleCLabel (IdLabel name _) = isExternalName name +externallyVisibleCLabel (DynIdLabel name _) = isExternalName name +externallyVisibleCLabel (CC_Label _) = True +externallyVisibleCLabel (CCS_Label _) = True +externallyVisibleCLabel (DynamicLinkerLabel _ _) = False + +-- ----------------------------------------------------------------------------- +-- Finding the "type" of a CLabel + +-- For generating correct types in label declarations: + +data CLabelType + = CodeLabel + | DataLabel + +labelType :: CLabel -> CLabelType +labelType (RtsLabel (RtsSelectorInfoTable _ _)) = DataLabel +labelType (RtsLabel (RtsApInfoTable _ _)) = DataLabel +labelType (RtsLabel (RtsData _)) = DataLabel +labelType (RtsLabel (RtsCode _)) = CodeLabel +labelType (RtsLabel (RtsInfo _)) = DataLabel +labelType (RtsLabel (RtsEntry _)) = CodeLabel +labelType (RtsLabel (RtsRetInfo _)) = DataLabel +labelType (RtsLabel (RtsRet _)) = CodeLabel +labelType (RtsLabel (RtsDataFS _)) = DataLabel +labelType (RtsLabel (RtsCodeFS _)) = CodeLabel +labelType (RtsLabel (RtsInfoFS _)) = DataLabel +labelType (RtsLabel (RtsEntryFS _)) = CodeLabel +labelType (RtsLabel (RtsRetInfoFS _)) = DataLabel +labelType (RtsLabel (RtsRetFS _)) = CodeLabel +labelType (RtsLabel (RtsApFast _)) = CodeLabel +labelType (CaseLabel _ CaseReturnInfo) = DataLabel +labelType (CaseLabel _ _) = CodeLabel +labelType (ModuleInitLabel _ _ _) = CodeLabel +labelType (PlainModuleInitLabel _ _) = CodeLabel + +labelType (IdLabel _ info) = idInfoLabelType info +labelType (DynIdLabel _ info) = idInfoLabelType info +labelType _ = DataLabel + +idInfoLabelType info = + case info of + InfoTable -> DataLabel + Closure -> DataLabel + Bitmap -> DataLabel + ConInfoTable -> DataLabel + StaticInfoTable -> DataLabel + ClosureTable -> DataLabel + _ -> CodeLabel + + +-- ----------------------------------------------------------------------------- +-- Does a CLabel need dynamic linkage? + +-- When referring to data in code, we need to know whether +-- that data resides in a DLL or not. [Win32 only.] +-- @labelDynamic@ returns @True@ if the label is located +-- in a DLL, be it a data reference or not. + +labelDynamic :: CLabel -> Bool +labelDynamic lbl = + case lbl of + RtsLabel _ -> not opt_Static -- i.e., is the RTS in a DLL or not? + IdLabel n k -> False + DynIdLabel n k -> True +#if mingw32_TARGET_OS + ForeignLabel _ _ d -> d +#else + -- On Mac OS X and on ELF platforms, false positives are OK, + -- so we claim that all foreign imports come from dynamic libraries + ForeignLabel _ _ _ -> True +#endif + ModuleInitLabel m _ dyn -> not opt_Static && dyn + PlainModuleInitLabel m dyn -> not opt_Static && dyn + + -- Note that DynamicLinkerLabels do NOT require dynamic linking themselves. + _ -> False + +{- +OLD?: These GRAN functions are needed for spitting out GRAN_FETCH() at the +right places. It is used to detect when the abstractC statement of an +CCodeBlock actually contains the code for a slow entry point. 
-- HWL + +We need at least @Eq@ for @CLabels@, because we want to avoid +duplicate declarations in generating C (see @labelSeenTE@ in +@PprAbsC@). +-} + +----------------------------------------------------------------------------- +-- Printing out CLabels. + +{- +Convention: + + <name>_<type> + +where <name> is <Module>_<name> for external names and <unique> for +internal names. <type> is one of the following: + + info Info table + srt Static reference table + srtd Static reference table descriptor + entry Entry code (function, closure) + slow Slow entry code (if any) + ret Direct return address + vtbl Vector table + <n>_alt Case alternative (tag n) + dflt Default case alternative + btm Large bitmap vector + closure Static closure + con_entry Dynamic Constructor entry code + con_info Dynamic Constructor info table + static_entry Static Constructor entry code + static_info Static Constructor info table + sel_info Selector info table + sel_entry Selector entry code + cc Cost centre + ccs Cost centre stack + +Many of these distinctions are only for documentation reasons. For +example, _ret is only distinguished from _entry to make it easy to +tell whether a code fragment is a return point or a closure/function +entry. +-} + +instance Outputable CLabel where + ppr = pprCLabel + +pprCLabel :: CLabel -> SDoc + +#if ! OMIT_NATIVE_CODEGEN +pprCLabel (AsmTempLabel u) + = getPprStyle $ \ sty -> + if asmStyle sty then + ptext asmTempLabelPrefix <> pprUnique u + else + char '_' <> pprUnique u + +pprCLabel (DynamicLinkerLabel info lbl) + = pprDynamicLinkerAsmLabel info lbl + +pprCLabel PicBaseLabel + = ptext SLIT("1b") + +pprCLabel (DeadStripPreventer lbl) + = pprCLabel lbl <> ptext SLIT("_dsp") +#endif + +pprCLabel lbl = +#if ! OMIT_NATIVE_CODEGEN + getPprStyle $ \ sty -> + if asmStyle sty then + maybe_underscore (pprAsmCLbl lbl) + else +#endif + pprCLbl lbl + +maybe_underscore doc + | underscorePrefix = pp_cSEP <> doc + | otherwise = doc + +#ifdef mingw32_TARGET_OS +-- In asm mode, we need to put the suffix on a stdcall ForeignLabel. +-- (The C compiler does this itself). 
+pprAsmCLbl (ForeignLabel fs (Just sz) _) + = ftext fs <> char '@' <> int sz +#endif +pprAsmCLbl lbl + = pprCLbl lbl + +pprCLbl (StringLitLabel u) + = pprUnique u <> ptext SLIT("_str") + +pprCLbl (CaseLabel u CaseReturnPt) + = hcat [pprUnique u, ptext SLIT("_ret")] +pprCLbl (CaseLabel u CaseReturnInfo) + = hcat [pprUnique u, ptext SLIT("_info")] +pprCLbl (CaseLabel u (CaseAlt tag)) + = hcat [pprUnique u, pp_cSEP, int tag, ptext SLIT("_alt")] +pprCLbl (CaseLabel u CaseDefault) + = hcat [pprUnique u, ptext SLIT("_dflt")] + +pprCLbl (RtsLabel (RtsCode str)) = ptext str +pprCLbl (RtsLabel (RtsData str)) = ptext str +pprCLbl (RtsLabel (RtsCodeFS str)) = ftext str +pprCLbl (RtsLabel (RtsDataFS str)) = ftext str + +pprCLbl (RtsLabel (RtsApFast str)) = ptext str <> ptext SLIT("_fast") + +pprCLbl (RtsLabel (RtsSelectorInfoTable upd_reqd offset)) + = hcat [ptext SLIT("stg_sel_"), text (show offset), + ptext (if upd_reqd + then SLIT("_upd_info") + else SLIT("_noupd_info")) + ] + +pprCLbl (RtsLabel (RtsSelectorEntry upd_reqd offset)) + = hcat [ptext SLIT("stg_sel_"), text (show offset), + ptext (if upd_reqd + then SLIT("_upd_entry") + else SLIT("_noupd_entry")) + ] + +pprCLbl (RtsLabel (RtsApInfoTable upd_reqd arity)) + = hcat [ptext SLIT("stg_ap_"), text (show arity), + ptext (if upd_reqd + then SLIT("_upd_info") + else SLIT("_noupd_info")) + ] + +pprCLbl (RtsLabel (RtsApEntry upd_reqd arity)) + = hcat [ptext SLIT("stg_ap_"), text (show arity), + ptext (if upd_reqd + then SLIT("_upd_entry") + else SLIT("_noupd_entry")) + ] + +pprCLbl (RtsLabel (RtsInfo fs)) + = ptext fs <> ptext SLIT("_info") + +pprCLbl (RtsLabel (RtsEntry fs)) + = ptext fs <> ptext SLIT("_entry") + +pprCLbl (RtsLabel (RtsRetInfo fs)) + = ptext fs <> ptext SLIT("_info") + +pprCLbl (RtsLabel (RtsRet fs)) + = ptext fs <> ptext SLIT("_ret") + +pprCLbl (RtsLabel (RtsInfoFS fs)) + = ftext fs <> ptext SLIT("_info") + +pprCLbl (RtsLabel (RtsEntryFS fs)) + = ftext fs <> ptext SLIT("_entry") + +pprCLbl (RtsLabel (RtsRetInfoFS fs)) + = ftext fs <> ptext SLIT("_info") + +pprCLbl (RtsLabel (RtsRetFS fs)) + = ftext fs <> ptext SLIT("_ret") + +pprCLbl (RtsLabel (RtsPrimOp primop)) + = ppr primop <> ptext SLIT("_fast") + +pprCLbl (RtsLabel (RtsSlowTickyCtr pat)) + = ptext SLIT("SLOW_CALL_") <> text pat <> ptext SLIT("_ctr") + +pprCLbl ModuleRegdLabel + = ptext SLIT("_module_registered") + +pprCLbl (ForeignLabel str _ _) + = ftext str + +pprCLbl (IdLabel name flavor) = ppr name <> ppIdFlavor flavor +pprCLbl (DynIdLabel name flavor) = ppr name <> ppIdFlavor flavor + +pprCLbl (CC_Label cc) = ppr cc +pprCLbl (CCS_Label ccs) = ppr ccs + +pprCLbl (ModuleInitLabel mod way _) + = ptext SLIT("__stginit_") <> ppr mod + <> char '_' <> text way +pprCLbl (PlainModuleInitLabel mod _) + = ptext SLIT("__stginit_") <> ppr mod + +ppIdFlavor :: IdLabelInfo -> SDoc +ppIdFlavor x = pp_cSEP <> + (case x of + Closure -> ptext SLIT("closure") + SRT -> ptext SLIT("srt") + SRTDesc -> ptext SLIT("srtd") + InfoTable -> ptext SLIT("info") + Entry -> ptext SLIT("entry") + Slow -> ptext SLIT("slow") + RednCounts -> ptext SLIT("ct") + Bitmap -> ptext SLIT("btm") + ConEntry -> ptext SLIT("con_entry") + ConInfoTable -> ptext SLIT("con_info") + StaticConEntry -> ptext SLIT("static_entry") + StaticInfoTable -> ptext SLIT("static_info") + ClosureTable -> ptext SLIT("closure_tbl") + ) + + +pp_cSEP = char '_' + +-- ----------------------------------------------------------------------------- +-- Machine-dependent knowledge about labels. 
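+--
+-- For example (illustrative): on targets where cLeadingUnderscore is
+-- "YES", maybe_underscore above prepends an underscore, so a label that
+-- would otherwise print as Foo_f_info is emitted as _Foo_f_info in the
+-- assembler output.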
+ +underscorePrefix :: Bool -- leading underscore on assembler labels? +underscorePrefix = (cLeadingUnderscore == "YES") + +asmTempLabelPrefix :: LitString -- for formatting labels +asmTempLabelPrefix = +#if alpha_TARGET_OS + {- The alpha assembler likes temporary labels to look like $L123 + instead of L123. (Don't toss the L, because then Lf28 + turns into $f28.) + -} + SLIT("$") +#elif darwin_TARGET_OS + SLIT("L") +#else + SLIT(".L") +#endif + +pprDynamicLinkerAsmLabel :: DynamicLinkerLabelInfo -> CLabel -> SDoc + +#if darwin_TARGET_OS +pprDynamicLinkerAsmLabel SymbolPtr lbl + = char 'L' <> pprCLabel lbl <> text "$non_lazy_ptr" +pprDynamicLinkerAsmLabel CodeStub lbl + = char 'L' <> pprCLabel lbl <> text "$stub" +#elif powerpc_TARGET_ARCH && linux_TARGET_OS +pprDynamicLinkerAsmLabel CodeStub lbl + = pprCLabel lbl <> text "@plt" +pprDynamicLinkerAsmLabel SymbolPtr lbl + = text ".LC_" <> pprCLabel lbl +#elif linux_TARGET_OS +pprDynamicLinkerAsmLabel CodeStub lbl + = pprCLabel lbl <> text "@plt" +pprDynamicLinkerAsmLabel GotSymbolPtr lbl + = pprCLabel lbl <> text "@got" +pprDynamicLinkerAsmLabel GotSymbolOffset lbl + = pprCLabel lbl <> text "@gotoff" +pprDynamicLinkerAsmLabel SymbolPtr lbl + = text ".LC_" <> pprCLabel lbl +#elif mingw32_TARGET_OS +pprDynamicLinkerAsmLabel SymbolPtr lbl + = text "__imp_" <> pprCLabel lbl +#endif +pprDynamicLinkerAsmLabel _ _ + = panic "pprDynamicLinkerAsmLabel" diff --git a/compiler/cmm/Cmm.hs b/compiler/cmm/Cmm.hs new file mode 100644 index 0000000000..13961c15d3 --- /dev/null +++ b/compiler/cmm/Cmm.hs @@ -0,0 +1,322 @@ +----------------------------------------------------------------------------- +-- +-- Cmm data types +-- +-- (c) The University of Glasgow 2004 +-- +----------------------------------------------------------------------------- + +module Cmm ( + GenCmm(..), Cmm, + GenCmmTop(..), CmmTop, + GenBasicBlock(..), CmmBasicBlock, blockId, blockStmts, + CmmStmt(..), + CmmCallTarget(..), + CmmStatic(..), Section(..), + CmmExpr(..), cmmExprRep, + CmmReg(..), cmmRegRep, + CmmLit(..), cmmLitRep, + LocalReg(..), localRegRep, + BlockId(..), + GlobalReg(..), globalRegRep, + + node, nodeReg, spReg, hpReg, + ) where + +#include "HsVersions.h" + +import MachOp +import CLabel ( CLabel ) +import ForeignCall ( CCallConv ) +import Unique ( Unique, Uniquable(..) ) +import FastString ( FastString ) +import DATA_WORD ( Word8 ) + +----------------------------------------------------------------------------- +-- Cmm, CmmTop, CmmBasicBlock +----------------------------------------------------------------------------- + +-- A file is a list of top-level chunks. These may be arbitrarily +-- re-orderd during code generation. + +-- GenCmm is abstracted over +-- (a) the type of static data elements +-- (b) the contents of a basic block. +-- We expect there to be two main instances of this type: +-- (a) Plain C--, i.e. populated with CmmLit and CmmExpr respectively, +-- (b) Native code, populated with instructions +-- +newtype GenCmm d i = Cmm [GenCmmTop d i] + +type Cmm = GenCmm CmmStatic CmmStmt + +-- A top-level chunk, abstracted over the type of the contents of +-- the basic blocks (Cmm or instructions are the likely instantiations). +data GenCmmTop d i + = CmmProc + [d] -- Info table, may be empty + CLabel -- Used to generate both info & entry labels + [LocalReg] -- Argument locals live on entry (C-- procedure params) + [GenBasicBlock i] -- Code, may be empty. The first block is + -- the entry point. 
The order is otherwise initially
+                -- unimportant, but at some point the code gen will
+                -- fix the order.
+
+                -- the BlockId of the first block does not give rise
+                -- to a label. To jump to the first block in a Proc,
+                -- use the appropriate CLabel.
+
+  -- some static data.
+  | CmmData Section [d]  -- constant values only
+
+type CmmTop = GenCmmTop CmmStatic CmmStmt
+
+-- A basic block containing a single label, at the beginning.
+-- The list of basic blocks in a top-level code block may be re-ordered.
+-- Fall-through is not allowed: there must be an explicit jump at the
+-- end of each basic block, but the code generator might rearrange basic
+-- blocks in order to turn some jumps into fallthroughs.
+
+data GenBasicBlock i = BasicBlock BlockId [i]
+  -- ToDo: Julian suggests that we might need to annotate this type
+  -- with the out & in edges in the graph, i.e. two * [BlockId]. This
+  -- information can be derived from the contents, but it might be
+  -- helpful to cache it here.
+
+type CmmBasicBlock = GenBasicBlock CmmStmt
+
+blockId :: GenBasicBlock i -> BlockId
+-- The branch block id is that of the first block in
+-- the branch, which is that branch's entry point
+blockId (BasicBlock blk_id _ ) = blk_id
+
+blockStmts :: GenBasicBlock i -> [i]
+blockStmts (BasicBlock _ stmts) = stmts
+
+
+-----------------------------------------------------------------------------
+-- CmmStmt
+-- A "statement". Note that all branches are explicit: there are no
+-- control transfers to computed addresses, except when transferring
+-- control to a new function.
+-----------------------------------------------------------------------------
+
+data CmmStmt
+  = CmmNop
+  | CmmComment FastString
+
+  | CmmAssign CmmReg CmmExpr    -- Assign to register
+
+  | CmmStore CmmExpr CmmExpr    -- Assign to memory location. Size is
+                                -- given by cmmExprRep of the rhs.
+
+  | CmmCall                     -- A foreign call, with
+        CmmCallTarget
+        [(CmmReg,MachHint)]     -- zero or more results
+        [(CmmExpr,MachHint)]    -- zero or more arguments
+        (Maybe [GlobalReg])     -- Global regs that may need to be saved
+                                -- if they will be clobbered by the call.
+                                -- Nothing <=> save *all* globals that
+                                -- might be clobbered.
+
+  | CmmBranch BlockId           -- branch to another BB in this fn
+
+  | CmmCondBranch CmmExpr BlockId -- conditional branch
+
+  | CmmSwitch CmmExpr [Maybe BlockId]   -- Table branch
+        -- The scrutinee is zero-based;
+        --      zero -> first block
+        --      one  -> second block etc
+        -- Undefined outside range, and when there's a Nothing
+
+  | CmmJump CmmExpr [LocalReg]  -- Jump to another function, with these
+                                -- parameters.
+
+-----------------------------------------------------------------------------
+-- CmmCallTarget
+--
+-- The target of a CmmCall.
+-----------------------------------------------------------------------------
+
+data CmmCallTarget
+  = CmmForeignCall              -- Call to a foreign function
+        CmmExpr                 -- literal label <=> static call
+                                -- other expression <=> dynamic call
+        CCallConv               -- The calling convention
+
+  | CmmPrim                     -- Call to a "primitive" (eg. sin, cos)
+        CallishMachOp           -- These might be implemented as inline
+                                -- code by the backend.
+
+-----------------------------------------------------------------------------
+-- CmmExpr
+-- An expression. Expressions have no side effects.
+-----------------------------------------------------------------------------
+
+data CmmExpr
+  = CmmLit CmmLit               -- Literal
+  | CmmLoad CmmExpr MachRep     -- Read memory location
+  | CmmReg CmmReg               -- Contents of register
+  | CmmMachOp MachOp [CmmExpr]  -- Machine operation (+, -, *, etc.)
+  | CmmRegOff CmmReg Int
+        -- CmmRegOff reg i
+        --      ** is shorthand only, meaning **
+        -- CmmMachOp (MO_S_Add rep (CmmReg reg) (CmmLit (CmmInt i rep)))
+        --      where rep = cmmRegRep reg
+
+cmmExprRep :: CmmExpr -> MachRep
+cmmExprRep (CmmLit lit) = cmmLitRep lit
+cmmExprRep (CmmLoad _ rep) = rep
+cmmExprRep (CmmReg reg) = cmmRegRep reg
+cmmExprRep (CmmMachOp op _) = resultRepOfMachOp op
+cmmExprRep (CmmRegOff reg _) = cmmRegRep reg
+
+data CmmReg
+  = CmmLocal LocalReg
+  | CmmGlobal GlobalReg
+  deriving( Eq )
+
+cmmRegRep :: CmmReg -> MachRep
+cmmRegRep (CmmLocal reg) = localRegRep reg
+cmmRegRep (CmmGlobal reg) = globalRegRep reg
+
+data LocalReg
+  = LocalReg !Unique MachRep
+
+instance Eq LocalReg where
+  (LocalReg u1 _) == (LocalReg u2 _) = u1 == u2
+
+instance Uniquable LocalReg where
+  getUnique (LocalReg uniq _) = uniq
+
+localRegRep :: LocalReg -> MachRep
+localRegRep (LocalReg _ rep) = rep
+
+data CmmLit
+  = CmmInt Integer MachRep
+        -- Interpretation: the 2's complement representation of the value
+        -- is truncated to the specified size. This is easier than trying
+        -- to keep the value within range, because we don't know whether
+        -- it will be used as a signed or unsigned value (the MachRep doesn't
+        -- distinguish between signed & unsigned).
+  | CmmFloat Rational MachRep
+  | CmmLabel CLabel             -- Address of label
+  | CmmLabelOff CLabel Int      -- Address of label + byte offset
+
+        -- Due to limitations in the C backend, the following
+        -- MUST ONLY be used inside the info table indicated by label2
+        -- (label2 must be the info label), and label1 must be an
+        -- SRT, a slow entrypoint or a large bitmap (see the Mangler)
+        -- Don't use it at all unless tablesNextToCode.
+        -- It is also used inside the NCG when generating
+        -- position-independent code.
+  | CmmLabelDiffOff CLabel CLabel Int   -- label1 - label2 + offset
+
+cmmLitRep :: CmmLit -> MachRep
+cmmLitRep (CmmInt _ rep) = rep
+cmmLitRep (CmmFloat _ rep) = rep
+cmmLitRep (CmmLabel _) = wordRep
+cmmLitRep (CmmLabelOff _ _) = wordRep
+cmmLitRep (CmmLabelDiffOff _ _ _) = wordRep
+
+-----------------------------------------------------------------------------
+-- A local label.
+
+-- Local labels must be unique within a single compilation unit.
+
+newtype BlockId = BlockId Unique
+  deriving (Eq,Ord)
+
+instance Uniquable BlockId where
+  getUnique (BlockId u) = u
+
+-----------------------------------------------------------------------------
+-- Static Data
+-----------------------------------------------------------------------------
+
+data Section
+  = Text
+  | Data
+  | ReadOnlyData
+  | RelocatableReadOnlyData
+  | UninitialisedData
+  | ReadOnlyData16      -- .rodata.cst16 on x86_64, 16-byte aligned
+  | OtherSection String
+
+data CmmStatic
+  = CmmStaticLit CmmLit
+        -- a literal value, size given by cmmLitRep of the literal.
+  | CmmUninitialised Int
+        -- uninitialised data, N bytes long
+  | CmmAlign Int
+        -- align to next N-byte boundary (N must be a power of 2).
+  | CmmDataLabel CLabel
+        -- label the current position in this section.
+  | CmmString [Word8]
+        -- string of 8-bit values only, not zero terminated.
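+
+-- An illustrative (hypothetical) example of a static data chunk built from
+-- these pieces: a labelled, read-only word followed by a string
+--
+--      CmmData ReadOnlyData
+--        [ CmmDataLabel some_lbl
+--        , CmmStaticLit (CmmInt 0 wordRep)
+--        , CmmString (map (fromIntegral . ord) "hello") ]
+--
+-- where some_lbl stands for whatever CLabel the chunk is given.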
+ +----------------------------------------------------------------------------- +-- Global STG registers +----------------------------------------------------------------------------- + +data GlobalReg + -- Argument and return registers + = VanillaReg -- pointers, unboxed ints and chars + {-# UNPACK #-} !Int -- its number + + | FloatReg -- single-precision floating-point registers + {-# UNPACK #-} !Int -- its number + + | DoubleReg -- double-precision floating-point registers + {-# UNPACK #-} !Int -- its number + + | LongReg -- long int registers (64-bit, really) + {-# UNPACK #-} !Int -- its number + + -- STG registers + | Sp -- Stack ptr; points to last occupied stack location. + | SpLim -- Stack limit + | Hp -- Heap ptr; points to last occupied heap location. + | HpLim -- Heap limit register + | CurrentTSO -- pointer to current thread's TSO + | CurrentNursery -- pointer to allocation area + | HpAlloc -- allocation count for heap check failure + + -- We keep the address of some commonly-called + -- functions in the register table, to keep code + -- size down: + | GCEnter1 -- stg_gc_enter_1 + | GCFun -- stg_gc_fun + + -- Base offset for the register table, used for accessing registers + -- which do not have real registers assigned to them. This register + -- will only appear after we have expanded GlobalReg into memory accesses + -- (where necessary) in the native code generator. + | BaseReg + + -- Base Register for PIC (position-independent code) calculations + -- Only used inside the native code generator. It's exact meaning differs + -- from platform to platform (see module PositionIndependentCode). + | PicBaseReg + + deriving( Eq +#ifdef DEBUG + , Show +#endif + ) + +-- convenient aliases +spReg, hpReg, nodeReg :: CmmReg +spReg = CmmGlobal Sp +hpReg = CmmGlobal Hp +nodeReg = CmmGlobal node + +node :: GlobalReg +node = VanillaReg 1 + +globalRegRep :: GlobalReg -> MachRep +globalRegRep (VanillaReg _) = wordRep +globalRegRep (FloatReg _) = F32 +globalRegRep (DoubleReg _) = F64 +globalRegRep (LongReg _) = I64 +globalRegRep _ = wordRep diff --git a/compiler/cmm/CmmLex.x b/compiler/cmm/CmmLex.x new file mode 100644 index 0000000000..c2efd17710 --- /dev/null +++ b/compiler/cmm/CmmLex.x @@ -0,0 +1,311 @@ +----------------------------------------------------------------------------- +-- (c) The University of Glasgow, 2004 +-- +-- Lexer for concrete Cmm. We try to stay close to the C-- spec, but there +-- are a few minor differences: +-- +-- * extra keywords for our macros, and float32/float64 types +-- * global registers (Sp,Hp, etc.) +-- +----------------------------------------------------------------------------- + +{ +module CmmLex ( + CmmToken(..), cmmlex, + ) where + +#include "HsVersions.h" + +import Cmm +import Lexer + +import SrcLoc +import UniqFM +import StringBuffer +import FastString +import Ctype +import Util ( readRational ) +--import TRACE +} + +$whitechar = [\ \t\n\r\f\v\xa0] +$white_no_nl = $whitechar # \n + +$ascdigit = 0-9 +$unidigit = \x01 +$digit = [$ascdigit $unidigit] +$octit = 0-7 +$hexit = [$digit A-F a-f] + +$unilarge = \x03 +$asclarge = [A-Z \xc0-\xd6 \xd8-\xde] +$large = [$asclarge $unilarge] + +$unismall = \x04 +$ascsmall = [a-z \xdf-\xf6 \xf8-\xff] +$small = [$ascsmall $unismall \_] + +$namebegin = [$large $small \_ \. \$ \@] +$namechar = [$namebegin $digit] + +@decimal = $digit+ +@octal = $octit+ +@hexadecimal = $hexit+ +@exponent = [eE] [\-\+]? @decimal + +@floating_point = @decimal \. @decimal @exponent? | @decimal @exponent + +@escape = \\ ([abfnrt\\\'\"\?] 
| x @hexadecimal | @octal) +@strchar = ($printable # [\"\\]) | @escape + +cmm :- + +$white_no_nl+ ; +^\# pragma .* \n ; -- Apple GCC 3.3 CPP generates pragmas in its output + +^\# (line)? { begin line_prag } + +-- single-line line pragmas, of the form +-- # <line> "<file>" <extra-stuff> \n +<line_prag> $digit+ { setLine line_prag1 } +<line_prag1> \" ($printable # \")* \" { setFile line_prag2 } +<line_prag2> .* { pop } + +<0> { + \n ; + + [\:\;\{\}\[\]\(\)\=\`\~\/\*\%\-\+\&\^\|\>\<\,\!] { special_char } + + ".." { kw CmmT_DotDot } + "::" { kw CmmT_DoubleColon } + ">>" { kw CmmT_Shr } + "<<" { kw CmmT_Shl } + ">=" { kw CmmT_Ge } + "<=" { kw CmmT_Le } + "==" { kw CmmT_Eq } + "!=" { kw CmmT_Ne } + "&&" { kw CmmT_BoolAnd } + "||" { kw CmmT_BoolOr } + + R@decimal { global_regN VanillaReg } + F@decimal { global_regN FloatReg } + D@decimal { global_regN DoubleReg } + L@decimal { global_regN LongReg } + Sp { global_reg Sp } + SpLim { global_reg SpLim } + Hp { global_reg Hp } + HpLim { global_reg HpLim } + CurrentTSO { global_reg CurrentTSO } + CurrentNursery { global_reg CurrentNursery } + HpAlloc { global_reg HpAlloc } + BaseReg { global_reg BaseReg } + + $namebegin $namechar* { name } + + 0 @octal { tok_octal } + @decimal { tok_decimal } + 0[xX] @hexadecimal { tok_hexadecimal } + @floating_point { strtoken tok_float } + + \" @strchar* \" { strtoken tok_string } +} + +{ +data CmmToken + = CmmT_SpecChar Char + | CmmT_DotDot + | CmmT_DoubleColon + | CmmT_Shr + | CmmT_Shl + | CmmT_Ge + | CmmT_Le + | CmmT_Eq + | CmmT_Ne + | CmmT_BoolAnd + | CmmT_BoolOr + | CmmT_CLOSURE + | CmmT_INFO_TABLE + | CmmT_INFO_TABLE_RET + | CmmT_INFO_TABLE_FUN + | CmmT_INFO_TABLE_CONSTR + | CmmT_INFO_TABLE_SELECTOR + | CmmT_else + | CmmT_export + | CmmT_section + | CmmT_align + | CmmT_goto + | CmmT_if + | CmmT_jump + | CmmT_foreign + | CmmT_import + | CmmT_switch + | CmmT_case + | CmmT_default + | CmmT_bits8 + | CmmT_bits16 + | CmmT_bits32 + | CmmT_bits64 + | CmmT_float32 + | CmmT_float64 + | CmmT_GlobalReg GlobalReg + | CmmT_Name FastString + | CmmT_String String + | CmmT_Int Integer + | CmmT_Float Rational + | CmmT_EOF +#ifdef DEBUG + deriving (Show) +#endif + +-- ----------------------------------------------------------------------------- +-- Lexer actions + +type Action = SrcSpan -> StringBuffer -> Int -> P (Located CmmToken) + +begin :: Int -> Action +begin code _span _str _len = do pushLexState code; lexToken + +pop :: Action +pop _span _buf _len = do popLexState; lexToken + +special_char :: Action +special_char span buf len = return (L span (CmmT_SpecChar (currentChar buf))) + +kw :: CmmToken -> Action +kw tok span buf len = return (L span tok) + +global_regN :: (Int -> GlobalReg) -> Action +global_regN con span buf len + = return (L span (CmmT_GlobalReg (con (fromIntegral n)))) + where buf' = stepOn buf + n = parseInteger buf' (len-1) 10 octDecDigit + +global_reg :: GlobalReg -> Action +global_reg r span buf len = return (L span (CmmT_GlobalReg r)) + +strtoken :: (String -> CmmToken) -> Action +strtoken f span buf len = + return (L span $! (f $! 
lexemeToString buf len)) + +name :: Action +name span buf len = + case lookupUFM reservedWordsFM fs of + Just tok -> return (L span tok) + Nothing -> return (L span (CmmT_Name fs)) + where + fs = lexemeToFastString buf len + +reservedWordsFM = listToUFM $ + map (\(x, y) -> (mkFastString x, y)) [ + ( "CLOSURE", CmmT_CLOSURE ), + ( "INFO_TABLE", CmmT_INFO_TABLE ), + ( "INFO_TABLE_RET", CmmT_INFO_TABLE_RET ), + ( "INFO_TABLE_FUN", CmmT_INFO_TABLE_FUN ), + ( "INFO_TABLE_CONSTR", CmmT_INFO_TABLE_CONSTR ), + ( "INFO_TABLE_SELECTOR",CmmT_INFO_TABLE_SELECTOR ), + ( "else", CmmT_else ), + ( "export", CmmT_export ), + ( "section", CmmT_section ), + ( "align", CmmT_align ), + ( "goto", CmmT_goto ), + ( "if", CmmT_if ), + ( "jump", CmmT_jump ), + ( "foreign", CmmT_foreign ), + ( "import", CmmT_import ), + ( "switch", CmmT_switch ), + ( "case", CmmT_case ), + ( "default", CmmT_default ), + ( "bits8", CmmT_bits8 ), + ( "bits16", CmmT_bits16 ), + ( "bits32", CmmT_bits32 ), + ( "bits64", CmmT_bits64 ), + ( "float32", CmmT_float32 ), + ( "float64", CmmT_float64 ) + ] + +tok_decimal span buf len + = return (L span (CmmT_Int $! parseInteger buf len 10 octDecDigit)) + +tok_octal span buf len + = return (L span (CmmT_Int $! parseInteger (offsetBytes 1 buf) (len-1) 8 octDecDigit)) + +tok_hexadecimal span buf len + = return (L span (CmmT_Int $! parseInteger (offsetBytes 2 buf) (len-2) 16 hexDigit)) + +tok_float str = CmmT_Float $! readRational str + +tok_string str = CmmT_String (read str) + -- urk, not quite right, but it'll do for now + +-- ----------------------------------------------------------------------------- +-- Line pragmas + +setLine :: Int -> Action +setLine code span buf len = do + let line = parseInteger buf len 10 octDecDigit + setSrcLoc (mkSrcLoc (srcSpanFile span) (fromIntegral line - 1) 0) + -- subtract one: the line number refers to the *following* line + -- trace ("setLine " ++ show line) $ do + popLexState + pushLexState code + lexToken + +setFile :: Int -> Action +setFile code span buf len = do + let file = lexemeToFastString (stepOn buf) (len-2) + setSrcLoc (mkSrcLoc file (srcSpanEndLine span) (srcSpanEndCol span)) + popLexState + pushLexState code + lexToken + +-- ----------------------------------------------------------------------------- +-- This is the top-level function: called from the parser each time a +-- new token is to be read from the input. 
+ +cmmlex :: (Located CmmToken -> P a) -> P a +cmmlex cont = do + tok@(L _ tok__) <- lexToken + --trace ("token: " ++ show tok__) $ do + cont tok + +lexToken :: P (Located CmmToken) +lexToken = do + inp@(loc1,buf) <- getInput + sc <- getLexState + case alexScan inp sc of + AlexEOF -> do let span = mkSrcSpan loc1 loc1 + setLastToken span 0 + return (L span CmmT_EOF) + AlexError (loc2,_) -> do failLocMsgP loc1 loc2 "lexical error" + AlexSkip inp2 _ -> do + setInput inp2 + lexToken + AlexToken inp2@(end,buf2) len t -> do + setInput inp2 + let span = mkSrcSpan loc1 end + span `seq` setLastToken span len + t span buf len + +-- ----------------------------------------------------------------------------- +-- Monad stuff + +-- Stuff that Alex needs to know about our input type: +type AlexInput = (SrcLoc,StringBuffer) + +alexInputPrevChar :: AlexInput -> Char +alexInputPrevChar (_,s) = prevChar s '\n' + +alexGetChar :: AlexInput -> Maybe (Char,AlexInput) +alexGetChar (loc,s) + | atEnd s = Nothing + | otherwise = c `seq` loc' `seq` s' `seq` Just (c, (loc', s')) + where c = currentChar s + loc' = advanceSrcLoc loc c + s' = stepOn s + +getInput :: P AlexInput +getInput = P $ \s@PState{ loc=l, buffer=b } -> POk s (l,b) + +setInput :: AlexInput -> P () +setInput (l,b) = P $ \s -> POk s{ loc=l, buffer=b } () +} diff --git a/compiler/cmm/CmmLint.hs b/compiler/cmm/CmmLint.hs new file mode 100644 index 0000000000..fbfb14c165 --- /dev/null +++ b/compiler/cmm/CmmLint.hs @@ -0,0 +1,159 @@ +----------------------------------------------------------------------------- +-- +-- CmmLint: checking the correctness of Cmm statements and expressions +-- +-- (c) The University of Glasgow 2004 +-- +----------------------------------------------------------------------------- + +module CmmLint ( + cmmLint, cmmLintTop + ) where + +#include "HsVersions.h" + +import Cmm +import CLabel ( pprCLabel ) +import MachOp +import Outputable +import PprCmm +import Unique ( getUnique ) +import Constants ( wORD_SIZE ) + +import Monad ( when ) + +-- ----------------------------------------------------------------------------- +-- Exported entry points: + +cmmLint :: Cmm -> Maybe SDoc +cmmLint (Cmm tops) = runCmmLint $ mapM_ lintCmmTop tops + +cmmLintTop :: CmmTop -> Maybe SDoc +cmmLintTop top = runCmmLint $ lintCmmTop top + +runCmmLint :: CmmLint a -> Maybe SDoc +runCmmLint l = + case unCL l of + Left err -> Just (ptext SLIT("Cmm lint error:") $$ nest 2 err) + Right _ -> Nothing + +lintCmmTop (CmmProc _info lbl _args blocks) + = addLintInfo (text "in proc " <> pprCLabel lbl) $ + mapM_ lintCmmBlock blocks +lintCmmTop _other + = return () + +lintCmmBlock (BasicBlock id stmts) + = addLintInfo (text "in basic block " <> ppr (getUnique id)) $ + mapM_ lintCmmStmt stmts + +-- ----------------------------------------------------------------------------- +-- lintCmmExpr + +-- Checks whether a CmmExpr is "type-correct", and check for obvious-looking +-- byte/word mismatches. 
+ +lintCmmExpr :: CmmExpr -> CmmLint MachRep +lintCmmExpr (CmmLoad expr rep) = do + lintCmmExpr expr + when (machRepByteWidth rep >= wORD_SIZE) $ + cmmCheckWordAddress expr + return rep +lintCmmExpr expr@(CmmMachOp op args) = do + mapM_ lintCmmExpr args + if map cmmExprRep args == machOpArgReps op + then cmmCheckMachOp op args + else cmmLintMachOpErr expr +lintCmmExpr (CmmRegOff reg offset) + = lintCmmExpr (CmmMachOp (MO_Add rep) + [CmmReg reg, CmmLit (CmmInt (fromIntegral offset) rep)]) + where rep = cmmRegRep reg +lintCmmExpr lit@(CmmLit (CmmInt _ rep)) + | isFloatingRep rep + = cmmLintErr (text "integer literal with floating MachRep: " <> ppr lit) +lintCmmExpr expr = + return (cmmExprRep expr) + +-- Check for some common byte/word mismatches (eg. Sp + 1) +cmmCheckMachOp op args@[CmmReg reg, CmmLit (CmmInt i _)] + | isWordOffsetReg reg && isOffsetOp op && i `rem` fromIntegral wORD_SIZE /= 0 + = cmmLintDubiousWordOffset (CmmMachOp op args) +cmmCheckMachOp op [lit@(CmmLit (CmmInt i _)), reg@(CmmReg _)] + = cmmCheckMachOp op [reg, lit] +cmmCheckMachOp op@(MO_U_Conv from to) args + | isFloatingRep from || isFloatingRep to + = cmmLintErr (text "unsigned conversion from/to floating rep: " + <> ppr (CmmMachOp op args)) +cmmCheckMachOp op args + = return (resultRepOfMachOp op) + +isWordOffsetReg (CmmGlobal Sp) = True +isWordOffsetReg (CmmGlobal Hp) = True +isWordOffsetReg _ = False + +isOffsetOp (MO_Add _) = True +isOffsetOp (MO_Sub _) = True +isOffsetOp _ = False + +-- This expression should be an address from which a word can be loaded: +-- check for funny-looking sub-word offsets. +cmmCheckWordAddress e@(CmmMachOp op [arg, CmmLit (CmmInt i _)]) + | isOffsetOp op && i `rem` fromIntegral wORD_SIZE /= 0 + = cmmLintDubiousWordOffset e +cmmCheckWordAddress e@(CmmMachOp op [CmmLit (CmmInt i _), arg]) + | isOffsetOp op && i `rem` fromIntegral wORD_SIZE /= 0 + = cmmLintDubiousWordOffset e +cmmCheckWordAddress _ + = return () + + +lintCmmStmt :: CmmStmt -> CmmLint () +lintCmmStmt stmt@(CmmAssign reg expr) = do + erep <- lintCmmExpr expr + if (erep == cmmRegRep reg) + then return () + else cmmLintAssignErr stmt +lintCmmStmt (CmmStore l r) = do + lintCmmExpr l + lintCmmExpr r + return () +lintCmmStmt (CmmCall _target _res args _vols) = mapM_ (lintCmmExpr.fst) args +lintCmmStmt (CmmCondBranch e _id) = lintCmmExpr e >> return () +lintCmmStmt (CmmSwitch e _branches) = lintCmmExpr e >> return () +lintCmmStmt (CmmJump e _args) = lintCmmExpr e >> return () +lintCmmStmt _other = return () + +-- ----------------------------------------------------------------------------- +-- CmmLint monad + +-- just a basic error monad: + +newtype CmmLint a = CmmLint { unCL :: Either SDoc a } + +instance Monad CmmLint where + CmmLint m >>= k = CmmLint $ case m of + Left e -> Left e + Right a -> unCL (k a) + return a = CmmLint (Right a) + +cmmLintErr :: SDoc -> CmmLint a +cmmLintErr msg = CmmLint (Left msg) + +addLintInfo :: SDoc -> CmmLint a -> CmmLint a +addLintInfo info thing = CmmLint $ + case unCL thing of + Left err -> Left (hang info 2 err) + Right a -> Right a + +cmmLintMachOpErr :: CmmExpr -> CmmLint a +cmmLintMachOpErr expr = cmmLintErr (text "in MachOp application: " $$ + nest 2 (pprExpr expr)) + +cmmLintAssignErr :: CmmStmt -> CmmLint a +cmmLintAssignErr stmt = cmmLintErr (text "in assignment: " $$ + nest 2 (pprStmt stmt)) + +cmmLintDubiousWordOffset :: CmmExpr -> CmmLint a +cmmLintDubiousWordOffset expr + = cmmLintErr (text "offset is not a multiple of words: " $$ + nest 2 (pprExpr expr)) diff --git 
a/compiler/cmm/CmmOpt.hs b/compiler/cmm/CmmOpt.hs new file mode 100644 index 0000000000..c454ff4c6a --- /dev/null +++ b/compiler/cmm/CmmOpt.hs @@ -0,0 +1,507 @@ +----------------------------------------------------------------------------- +-- +-- Cmm optimisation +-- +-- (c) The University of Glasgow 2006 +-- +----------------------------------------------------------------------------- + +module CmmOpt ( + cmmMiniInline, + cmmMachOpFold, + cmmLoopifyForC, + ) where + +#include "HsVersions.h" + +import Cmm +import CmmUtils ( hasNoGlobalRegs ) +import CLabel ( entryLblToInfoLbl ) +import MachOp +import SMRep ( tablesNextToCode ) + +import UniqFM +import Unique ( Unique ) +import Panic ( panic ) + +import Outputable + +import Bits +import Word +import Int +import GLAEXTS + + +-- ----------------------------------------------------------------------------- +-- The mini-inliner + +{- +This pass inlines assignments to temporaries that are used just +once. It works as follows: + + - count uses of each temporary + - for each temporary that occurs just once: + - attempt to push it forward to the statement that uses it + - only push forward past assignments to other temporaries + (assumes that temporaries are single-assignment) + - if we reach the statement that uses it, inline the rhs + and delete the original assignment. + +Possible generalisations: here is an example from factorial + +Fac_zdwfac_entry: + cmG: + _smi = R2; + if (_smi != 0) goto cmK; + R1 = R3; + jump I64[Sp]; + cmK: + _smn = _smi * R3; + R2 = _smi + (-1); + R3 = _smn; + jump Fac_zdwfac_info; + +We want to inline _smi and _smn. To inline _smn: + + - we must be able to push forward past assignments to global regs. + We can do this if the rhs of the assignment we are pushing + forward doesn't refer to the global reg being assigned to; easy + to test. + +To inline _smi: + + - It is a trivial replacement, reg for reg, but it occurs more than + once. + - We can inline trivial assignments even if the temporary occurs + more than once, as long as we don't eliminate the original assignment + (this doesn't help much on its own). + - We need to be able to propagate the assignment forward through jumps; + if we did this, we would find that it can be inlined safely in all + its occurrences. +-} + +cmmMiniInline :: [CmmBasicBlock] -> [CmmBasicBlock] +cmmMiniInline blocks = map do_inline blocks + where + blockUses (BasicBlock _ stmts) + = foldr (plusUFM_C (+)) emptyUFM (map getStmtUses stmts) + + uses = foldr (plusUFM_C (+)) emptyUFM (map blockUses blocks) + + do_inline (BasicBlock id stmts) + = BasicBlock id (cmmMiniInlineStmts uses stmts) + + +cmmMiniInlineStmts :: UniqFM Int -> [CmmStmt] -> [CmmStmt] +cmmMiniInlineStmts uses [] = [] +cmmMiniInlineStmts uses (stmt@(CmmAssign (CmmLocal (LocalReg u _)) expr) : stmts) + | Just 1 <- lookupUFM uses u, + Just stmts' <- lookForInline u expr stmts + = +#ifdef NCG_DEBUG + trace ("nativeGen: inlining " ++ showSDoc (pprStmt stmt)) $ +#endif + cmmMiniInlineStmts uses stmts' + +cmmMiniInlineStmts uses (stmt:stmts) + = stmt : cmmMiniInlineStmts uses stmts + + +-- Try to inline a temporary assignment. We can skip over assignments to +-- other tempoararies, because we know that expressions aren't side-effecting +-- and temporaries are single-assignment. 
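--
-- A small illustration (hypothetical statements): given
--
--      _c1 = R1 + 8;
--      _c2 = I64[_c1];
--      I64[Sp] = _c2;
--
-- and assuming _c1 and _c2 are each used exactly once, the pass first
-- pushes _c1 into the second statement, giving  _c2 = I64[R1 + 8],  and
-- deletes the original assignment; then it pushes _c2 into the store,
-- leaving just  I64[Sp] = I64[R1 + 8].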
+lookForInline u expr (stmt@(CmmAssign (CmmLocal (LocalReg u' _)) rhs) : rest) + | u /= u' + = case lookupUFM (getExprUses rhs) u of + Just 1 -> Just (inlineStmt u expr stmt : rest) + _other -> case lookForInline u expr rest of + Nothing -> Nothing + Just stmts -> Just (stmt:stmts) + +lookForInline u expr (CmmNop : rest) + = lookForInline u expr rest + +lookForInline u expr (stmt:stmts) + = case lookupUFM (getStmtUses stmt) u of + Just 1 | ok_to_inline -> Just (inlineStmt u expr stmt : stmts) + _other -> Nothing + where + -- we don't inline into CmmCall if the expression refers to global + -- registers. This is a HACK to avoid global registers clashing with + -- C argument-passing registers, really the back-end ought to be able + -- to handle it properly, but currently neither PprC nor the NCG can + -- do it. See also CgForeignCall:load_args_into_temps. + ok_to_inline = case stmt of + CmmCall{} -> hasNoGlobalRegs expr + _ -> True + +-- ----------------------------------------------------------------------------- +-- Boring Cmm traversals for collecting usage info and substitutions. + +getStmtUses :: CmmStmt -> UniqFM Int +getStmtUses (CmmAssign _ e) = getExprUses e +getStmtUses (CmmStore e1 e2) = plusUFM_C (+) (getExprUses e1) (getExprUses e2) +getStmtUses (CmmCall target _ es _) + = plusUFM_C (+) (uses target) (getExprsUses (map fst es)) + where uses (CmmForeignCall e _) = getExprUses e + uses _ = emptyUFM +getStmtUses (CmmCondBranch e _) = getExprUses e +getStmtUses (CmmSwitch e _) = getExprUses e +getStmtUses (CmmJump e _) = getExprUses e +getStmtUses _ = emptyUFM + +getExprUses :: CmmExpr -> UniqFM Int +getExprUses (CmmReg (CmmLocal (LocalReg u _))) = unitUFM u 1 +getExprUses (CmmRegOff (CmmLocal (LocalReg u _)) _) = unitUFM u 1 +getExprUses (CmmLoad e _) = getExprUses e +getExprUses (CmmMachOp _ es) = getExprsUses es +getExprUses _other = emptyUFM + +getExprsUses es = foldr (plusUFM_C (+)) emptyUFM (map getExprUses es) + +inlineStmt :: Unique -> CmmExpr -> CmmStmt -> CmmStmt +inlineStmt u a (CmmAssign r e) = CmmAssign r (inlineExpr u a e) +inlineStmt u a (CmmStore e1 e2) = CmmStore (inlineExpr u a e1) (inlineExpr u a e2) +inlineStmt u a (CmmCall target regs es vols) + = CmmCall (infn target) regs es' vols + where infn (CmmForeignCall fn cconv) = CmmForeignCall fn cconv + infn (CmmPrim p) = CmmPrim p + es' = [ (inlineExpr u a e, hint) | (e,hint) <- es ] +inlineStmt u a (CmmCondBranch e d) = CmmCondBranch (inlineExpr u a e) d +inlineStmt u a (CmmSwitch e d) = CmmSwitch (inlineExpr u a e) d +inlineStmt u a (CmmJump e d) = CmmJump (inlineExpr u a e) d +inlineStmt u a other_stmt = other_stmt + +inlineExpr :: Unique -> CmmExpr -> CmmExpr -> CmmExpr +inlineExpr u a e@(CmmReg (CmmLocal (LocalReg u' _))) + | u == u' = a + | otherwise = e +inlineExpr u a e@(CmmRegOff (CmmLocal (LocalReg u' rep)) off) + | u == u' = CmmMachOp (MO_Add rep) [a, CmmLit (CmmInt (fromIntegral off) rep)] + | otherwise = e +inlineExpr u a (CmmLoad e rep) = CmmLoad (inlineExpr u a e) rep +inlineExpr u a (CmmMachOp op es) = CmmMachOp op (map (inlineExpr u a) es) +inlineExpr u a other_expr = other_expr + +-- ----------------------------------------------------------------------------- +-- MachOp constant folder + +-- Now, try to constant-fold the MachOps. The arguments have already +-- been optimized and folded. 
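--
-- Some representative folds (hypothetical expressions, I64 reps):
--
--      cmmMachOpFold (MO_Add I64) [CmmLit (CmmInt 3 I64), CmmLit (CmmInt 4 I64)]
--              ==>  CmmLit (CmmInt 7 I64)
--
--      cmmMachOpFold (MO_S_Conv I64 I64) [x]   ==>  x     (for non-literal x)
--
--      cmmMachOpFold (MO_Add I64) [CmmReg r, CmmLit (CmmInt 8 I64)]
--              ==>  CmmRegOff r 8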
+ +cmmMachOpFold + :: MachOp -- The operation from an CmmMachOp + -> [CmmExpr] -- The optimized arguments + -> CmmExpr + +cmmMachOpFold op arg@[CmmLit (CmmInt x rep)] + = case op of + MO_S_Neg r -> CmmLit (CmmInt (-x) rep) + MO_Not r -> CmmLit (CmmInt (complement x) rep) + + -- these are interesting: we must first narrow to the + -- "from" type, in order to truncate to the correct size. + -- The final narrow/widen to the destination type + -- is implicit in the CmmLit. + MO_S_Conv from to + | isFloatingRep to -> CmmLit (CmmFloat (fromInteger x) to) + | otherwise -> CmmLit (CmmInt (narrowS from x) to) + MO_U_Conv from to -> CmmLit (CmmInt (narrowU from x) to) + + _ -> panic "cmmMachOpFold: unknown unary op" + + +-- Eliminate conversion NOPs +cmmMachOpFold (MO_S_Conv rep1 rep2) [x] | rep1 == rep2 = x +cmmMachOpFold (MO_U_Conv rep1 rep2) [x] | rep1 == rep2 = x + +-- Eliminate nested conversions where possible +cmmMachOpFold conv_outer args@[CmmMachOp conv_inner [x]] + | Just (rep1,rep2,signed1) <- isIntConversion conv_inner, + Just (_, rep3,signed2) <- isIntConversion conv_outer + = case () of + -- widen then narrow to the same size is a nop + _ | rep1 < rep2 && rep1 == rep3 -> x + -- Widen then narrow to different size: collapse to single conversion + -- but remember to use the signedness from the widening, just in case + -- the final conversion is a widen. + | rep1 < rep2 && rep2 > rep3 -> + cmmMachOpFold (intconv signed1 rep1 rep3) [x] + -- Nested widenings: collapse if the signedness is the same + | rep1 < rep2 && rep2 < rep3 && signed1 == signed2 -> + cmmMachOpFold (intconv signed1 rep1 rep3) [x] + -- Nested narrowings: collapse + | rep1 > rep2 && rep2 > rep3 -> + cmmMachOpFold (MO_U_Conv rep1 rep3) [x] + | otherwise -> + CmmMachOp conv_outer args + where + isIntConversion (MO_U_Conv rep1 rep2) + | not (isFloatingRep rep1) && not (isFloatingRep rep2) + = Just (rep1,rep2,False) + isIntConversion (MO_S_Conv rep1 rep2) + | not (isFloatingRep rep1) && not (isFloatingRep rep2) + = Just (rep1,rep2,True) + isIntConversion _ = Nothing + + intconv True = MO_S_Conv + intconv False = MO_U_Conv + +-- ToDo: a narrow of a load can be collapsed into a narrow load, right? +-- but what if the architecture only supports word-sized loads, should +-- we do the transformation anyway? + +cmmMachOpFold mop args@[CmmLit (CmmInt x xrep), CmmLit (CmmInt y _)] + = case mop of + -- for comparisons: don't forget to narrow the arguments before + -- comparing, since they might be out of range. + MO_Eq r -> CmmLit (CmmInt (if x_u == y_u then 1 else 0) wordRep) + MO_Ne r -> CmmLit (CmmInt (if x_u /= y_u then 1 else 0) wordRep) + + MO_U_Gt r -> CmmLit (CmmInt (if x_u > y_u then 1 else 0) wordRep) + MO_U_Ge r -> CmmLit (CmmInt (if x_u >= y_u then 1 else 0) wordRep) + MO_U_Lt r -> CmmLit (CmmInt (if x_u < y_u then 1 else 0) wordRep) + MO_U_Le r -> CmmLit (CmmInt (if x_u <= y_u then 1 else 0) wordRep) + + MO_S_Gt r -> CmmLit (CmmInt (if x_s > y_s then 1 else 0) wordRep) + MO_S_Ge r -> CmmLit (CmmInt (if x_s >= y_s then 1 else 0) wordRep) + MO_S_Lt r -> CmmLit (CmmInt (if x_s < y_s then 1 else 0) wordRep) + MO_S_Le r -> CmmLit (CmmInt (if x_s <= y_s then 1 else 0) wordRep) + + MO_Add r -> CmmLit (CmmInt (x + y) r) + MO_Sub r -> CmmLit (CmmInt (x - y) r) + MO_Mul r -> CmmLit (CmmInt (x * y) r) + MO_S_Quot r | y /= 0 -> CmmLit (CmmInt (x `quot` y) r) + MO_S_Rem r | y /= 0 -> CmmLit (CmmInt (x `rem` y) r) + + MO_And r -> CmmLit (CmmInt (x .&. y) r) + MO_Or r -> CmmLit (CmmInt (x .|. 
y) r) + MO_Xor r -> CmmLit (CmmInt (x `xor` y) r) + + MO_Shl r -> CmmLit (CmmInt (x `shiftL` fromIntegral y) r) + MO_U_Shr r -> CmmLit (CmmInt (x_u `shiftR` fromIntegral y) r) + MO_S_Shr r -> CmmLit (CmmInt (x `shiftR` fromIntegral y) r) + + other -> CmmMachOp mop args + + where + x_u = narrowU xrep x + y_u = narrowU xrep y + x_s = narrowS xrep x + y_s = narrowS xrep y + + +-- When possible, shift the constants to the right-hand side, so that we +-- can match for strength reductions. Note that the code generator will +-- also assume that constants have been shifted to the right when +-- possible. + +cmmMachOpFold op [x@(CmmLit _), y] + | not (isLit y) && isCommutableMachOp op + = cmmMachOpFold op [y, x] + +-- Turn (a+b)+c into a+(b+c) where possible. Because literals are +-- moved to the right, it is more likely that we will find +-- opportunities for constant folding when the expression is +-- right-associated. +-- +-- ToDo: this appears to introduce a quadratic behaviour due to the +-- nested cmmMachOpFold. Can we fix this? +-- +-- Why do we check isLit arg1? If arg1 is a lit, it means that arg2 +-- is also a lit (otherwise arg1 would be on the right). If we +-- put arg1 on the left of the rearranged expression, we'll get into a +-- loop: (x1+x2)+x3 => x1+(x2+x3) => (x2+x3)+x1 => x2+(x3+x1) ... +-- +cmmMachOpFold mop1 [CmmMachOp mop2 [arg1,arg2], arg3] + | mop1 == mop2 && isAssociativeMachOp mop1 && not (isLit arg1) + = cmmMachOpFold mop1 [arg1, cmmMachOpFold mop2 [arg2,arg3]] + +-- Make a RegOff if we can +cmmMachOpFold (MO_Add _) [CmmReg reg, CmmLit (CmmInt n rep)] + = CmmRegOff reg (fromIntegral (narrowS rep n)) +cmmMachOpFold (MO_Add _) [CmmRegOff reg off, CmmLit (CmmInt n rep)] + = CmmRegOff reg (off + fromIntegral (narrowS rep n)) +cmmMachOpFold (MO_Sub _) [CmmReg reg, CmmLit (CmmInt n rep)] + = CmmRegOff reg (- fromIntegral (narrowS rep n)) +cmmMachOpFold (MO_Sub _) [CmmRegOff reg off, CmmLit (CmmInt n rep)] + = CmmRegOff reg (off - fromIntegral (narrowS rep n)) + +-- Fold label(+/-)offset into a CmmLit where possible + +cmmMachOpFold (MO_Add _) [CmmLit (CmmLabel lbl), CmmLit (CmmInt i rep)] + = CmmLit (CmmLabelOff lbl (fromIntegral (narrowU rep i))) +cmmMachOpFold (MO_Add _) [CmmLit (CmmInt i rep), CmmLit (CmmLabel lbl)] + = CmmLit (CmmLabelOff lbl (fromIntegral (narrowU rep i))) +cmmMachOpFold (MO_Sub _) [CmmLit (CmmLabel lbl), CmmLit (CmmInt i rep)] + = CmmLit (CmmLabelOff lbl (fromIntegral (negate (narrowU rep i)))) + +-- We can often do something with constants of 0 and 1 ... 
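--
-- For instance (in concrete C-- syntax):
--
--      x + 0  ==>  x          x * 1  ==>  x
--      x * 0  ==>  0          x % 1  ==>  0
--      (a < b) != 0  ==>  a < b
--      (a < b) == 0  ==>  a >= b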
+ +cmmMachOpFold mop args@[x, y@(CmmLit (CmmInt 0 _))] + = case mop of + MO_Add r -> x + MO_Sub r -> x + MO_Mul r -> y + MO_And r -> y + MO_Or r -> x + MO_Xor r -> x + MO_Shl r -> x + MO_S_Shr r -> x + MO_U_Shr r -> x + MO_Ne r | isComparisonExpr x -> x + MO_Eq r | Just x' <- maybeInvertConditionalExpr x -> x' + MO_U_Gt r | isComparisonExpr x -> x + MO_S_Gt r | isComparisonExpr x -> x + MO_U_Lt r | isComparisonExpr x -> CmmLit (CmmInt 0 wordRep) + MO_S_Lt r | isComparisonExpr x -> CmmLit (CmmInt 0 wordRep) + MO_U_Ge r | isComparisonExpr x -> CmmLit (CmmInt 1 wordRep) + MO_S_Ge r | isComparisonExpr x -> CmmLit (CmmInt 1 wordRep) + MO_U_Le r | Just x' <- maybeInvertConditionalExpr x -> x' + MO_S_Le r | Just x' <- maybeInvertConditionalExpr x -> x' + other -> CmmMachOp mop args + +cmmMachOpFold mop args@[x, y@(CmmLit (CmmInt 1 rep))] + = case mop of + MO_Mul r -> x + MO_S_Quot r -> x + MO_U_Quot r -> x + MO_S_Rem r -> CmmLit (CmmInt 0 rep) + MO_U_Rem r -> CmmLit (CmmInt 0 rep) + MO_Ne r | Just x' <- maybeInvertConditionalExpr x -> x' + MO_Eq r | isComparisonExpr x -> x + MO_U_Lt r | Just x' <- maybeInvertConditionalExpr x -> x' + MO_S_Lt r | Just x' <- maybeInvertConditionalExpr x -> x' + MO_U_Gt r | isComparisonExpr x -> CmmLit (CmmInt 0 wordRep) + MO_S_Gt r | isComparisonExpr x -> CmmLit (CmmInt 0 wordRep) + MO_U_Le r | isComparisonExpr x -> CmmLit (CmmInt 1 wordRep) + MO_S_Le r | isComparisonExpr x -> CmmLit (CmmInt 1 wordRep) + MO_U_Ge r | isComparisonExpr x -> x + MO_S_Ge r | isComparisonExpr x -> x + other -> CmmMachOp mop args + +-- Now look for multiplication/division by powers of 2 (integers). + +cmmMachOpFold mop args@[x, y@(CmmLit (CmmInt n _))] + = case mop of + MO_Mul rep + -> case exactLog2 n of + Nothing -> unchanged + Just p -> CmmMachOp (MO_Shl rep) [x, CmmLit (CmmInt p rep)] + MO_S_Quot rep + -> case exactLog2 n of + Nothing -> unchanged + Just p -> CmmMachOp (MO_S_Shr rep) [x, CmmLit (CmmInt p rep)] + other + -> unchanged + where + unchanged = CmmMachOp mop args + +-- Anything else is just too hard. + +cmmMachOpFold mop args = CmmMachOp mop args + +-- ----------------------------------------------------------------------------- +-- exactLog2 + +-- This algorithm for determining the $\log_2$ of exact powers of 2 comes +-- from GCC. It requires bit manipulation primitives, and we use GHC +-- extensions. Tough. +-- +-- Used to be in MachInstrs --SDM. +-- ToDo: remove use of unboxery --SDM. 
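--
-- A boxed reference version, for comparison (a sketch only; the name is
-- hypothetical and the unboxed code that follows is what is actually
-- used):
--
--      exactLog2' :: Integer -> Maybe Integer
--      exactLog2' x
--        | x <= 0 || x >= 2147483648 = Nothing
--        | x .&. negate x /= x       = Nothing    -- not a power of two
--        | otherwise                 = Just (go x 0)
--        where go 1 n = n
--              go v n = go (v `shiftR` 1) (n + 1)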
+ +w2i x = word2Int# x +i2w x = int2Word# x + +exactLog2 :: Integer -> Maybe Integer +exactLog2 x + = if (x <= 0 || x >= 2147483648) then + Nothing + else + case fromInteger x of { I# x# -> + if (w2i ((i2w x#) `and#` (i2w (0# -# x#))) /=# x#) then + Nothing + else + Just (toInteger (I# (pow2 x#))) + } + where + pow2 x# | x# ==# 1# = 0# + | otherwise = 1# +# pow2 (w2i (i2w x# `shiftRL#` 1#)) + + +-- ----------------------------------------------------------------------------- +-- widening / narrowing + +narrowU :: MachRep -> Integer -> Integer +narrowU I8 x = fromIntegral (fromIntegral x :: Word8) +narrowU I16 x = fromIntegral (fromIntegral x :: Word16) +narrowU I32 x = fromIntegral (fromIntegral x :: Word32) +narrowU I64 x = fromIntegral (fromIntegral x :: Word64) +narrowU _ _ = panic "narrowTo" + +narrowS :: MachRep -> Integer -> Integer +narrowS I8 x = fromIntegral (fromIntegral x :: Int8) +narrowS I16 x = fromIntegral (fromIntegral x :: Int16) +narrowS I32 x = fromIntegral (fromIntegral x :: Int32) +narrowS I64 x = fromIntegral (fromIntegral x :: Int64) +narrowS _ _ = panic "narrowTo" + +-- ----------------------------------------------------------------------------- +-- Loopify for C + +{- + This is a simple pass that replaces tail-recursive functions like this: + + fac() { + ... + jump fac(); + } + + with this: + + fac() { + L: + ... + goto L; + } + + the latter generates better C code, because the C compiler treats it + like a loop, and brings full loop optimisation to bear. + + In my measurements this makes little or no difference to anything + except factorial, but what the hell. +-} + +cmmLoopifyForC :: CmmTop -> CmmTop +cmmLoopifyForC p@(CmmProc info entry_lbl [] blocks@(BasicBlock top_id _ : _)) + | null info = p -- only if there's an info table, ignore case alts + | otherwise = +-- pprTrace "jump_lbl" (ppr jump_lbl <+> ppr entry_lbl) $ + CmmProc info entry_lbl [] blocks' + where blocks' = [ BasicBlock id (map do_stmt stmts) + | BasicBlock id stmts <- blocks ] + + do_stmt (CmmJump (CmmLit (CmmLabel lbl)) _) | lbl == jump_lbl + = CmmBranch top_id + do_stmt stmt = stmt + + jump_lbl | tablesNextToCode = entryLblToInfoLbl entry_lbl + | otherwise = entry_lbl + +cmmLoopifyForC top = top + +-- ----------------------------------------------------------------------------- +-- Utils + +isLit (CmmLit _) = True +isLit _ = False + +isComparisonExpr :: CmmExpr -> Bool +isComparisonExpr (CmmMachOp op _) = isComparisonMachOp op +isComparisonExpr _other = False + +maybeInvertConditionalExpr :: CmmExpr -> Maybe CmmExpr +maybeInvertConditionalExpr (CmmMachOp op args) + | Just op' <- maybeInvertComparison op = Just (CmmMachOp op' args) +maybeInvertConditionalExpr _ = Nothing diff --git a/compiler/cmm/CmmParse.y b/compiler/cmm/CmmParse.y new file mode 100644 index 0000000000..73618bc35b --- /dev/null +++ b/compiler/cmm/CmmParse.y @@ -0,0 +1,890 @@ +----------------------------------------------------------------------------- +-- +-- (c) The University of Glasgow, 2004 +-- +-- Parser for concrete Cmm. +-- +----------------------------------------------------------------------------- + +{ +module CmmParse ( parseCmmFile ) where + +import CgMonad +import CgHeapery +import CgUtils +import CgProf +import CgTicky +import CgInfoTbls +import CgForeignCall +import CgTailCall ( pushUnboxedTuple ) +import CgStackery ( emitPushUpdateFrame ) +import ClosureInfo ( C_SRT(..) 
) +import CgCallConv ( smallLiveness ) +import CgClosure ( emitBlackHoleCode ) +import CostCentre ( dontCareCCS ) + +import Cmm +import PprCmm +import CmmUtils ( mkIntCLit ) +import CmmLex +import CLabel +import MachOp +import SMRep ( fixedHdrSize, CgRep(..) ) +import Lexer + +import ForeignCall ( CCallConv(..), Safety(..) ) +import Literal ( mkMachInt ) +import Unique +import UniqFM +import SrcLoc +import DynFlags ( DynFlags, DynFlag(..) ) +import Packages ( HomeModules ) +import StaticFlags ( opt_SccProfilingOn ) +import ErrUtils ( printError, dumpIfSet_dyn, showPass ) +import StringBuffer ( hGetStringBuffer ) +import FastString +import Panic ( panic ) +import Constants ( wORD_SIZE ) +import Outputable + +import Monad ( when ) +import Data.Char ( ord ) + +#include "HsVersions.h" +} + +%token + ':' { L _ (CmmT_SpecChar ':') } + ';' { L _ (CmmT_SpecChar ';') } + '{' { L _ (CmmT_SpecChar '{') } + '}' { L _ (CmmT_SpecChar '}') } + '[' { L _ (CmmT_SpecChar '[') } + ']' { L _ (CmmT_SpecChar ']') } + '(' { L _ (CmmT_SpecChar '(') } + ')' { L _ (CmmT_SpecChar ')') } + '=' { L _ (CmmT_SpecChar '=') } + '`' { L _ (CmmT_SpecChar '`') } + '~' { L _ (CmmT_SpecChar '~') } + '/' { L _ (CmmT_SpecChar '/') } + '*' { L _ (CmmT_SpecChar '*') } + '%' { L _ (CmmT_SpecChar '%') } + '-' { L _ (CmmT_SpecChar '-') } + '+' { L _ (CmmT_SpecChar '+') } + '&' { L _ (CmmT_SpecChar '&') } + '^' { L _ (CmmT_SpecChar '^') } + '|' { L _ (CmmT_SpecChar '|') } + '>' { L _ (CmmT_SpecChar '>') } + '<' { L _ (CmmT_SpecChar '<') } + ',' { L _ (CmmT_SpecChar ',') } + '!' { L _ (CmmT_SpecChar '!') } + + '..' { L _ (CmmT_DotDot) } + '::' { L _ (CmmT_DoubleColon) } + '>>' { L _ (CmmT_Shr) } + '<<' { L _ (CmmT_Shl) } + '>=' { L _ (CmmT_Ge) } + '<=' { L _ (CmmT_Le) } + '==' { L _ (CmmT_Eq) } + '!=' { L _ (CmmT_Ne) } + '&&' { L _ (CmmT_BoolAnd) } + '||' { L _ (CmmT_BoolOr) } + + 'CLOSURE' { L _ (CmmT_CLOSURE) } + 'INFO_TABLE' { L _ (CmmT_INFO_TABLE) } + 'INFO_TABLE_RET'{ L _ (CmmT_INFO_TABLE_RET) } + 'INFO_TABLE_FUN'{ L _ (CmmT_INFO_TABLE_FUN) } + 'INFO_TABLE_CONSTR'{ L _ (CmmT_INFO_TABLE_CONSTR) } + 'INFO_TABLE_SELECTOR'{ L _ (CmmT_INFO_TABLE_SELECTOR) } + 'else' { L _ (CmmT_else) } + 'export' { L _ (CmmT_export) } + 'section' { L _ (CmmT_section) } + 'align' { L _ (CmmT_align) } + 'goto' { L _ (CmmT_goto) } + 'if' { L _ (CmmT_if) } + 'jump' { L _ (CmmT_jump) } + 'foreign' { L _ (CmmT_foreign) } + 'import' { L _ (CmmT_import) } + 'switch' { L _ (CmmT_switch) } + 'case' { L _ (CmmT_case) } + 'default' { L _ (CmmT_default) } + 'bits8' { L _ (CmmT_bits8) } + 'bits16' { L _ (CmmT_bits16) } + 'bits32' { L _ (CmmT_bits32) } + 'bits64' { L _ (CmmT_bits64) } + 'float32' { L _ (CmmT_float32) } + 'float64' { L _ (CmmT_float64) } + + GLOBALREG { L _ (CmmT_GlobalReg $$) } + NAME { L _ (CmmT_Name $$) } + STRING { L _ (CmmT_String $$) } + INT { L _ (CmmT_Int $$) } + FLOAT { L _ (CmmT_Float $$) } + +%monad { P } { >>= } { return } +%lexer { cmmlex } { L _ CmmT_EOF } +%name cmmParse cmm +%tokentype { Located CmmToken } + +-- C-- operator precedences, taken from the C-- spec +%right '||' -- non-std extension, called %disjoin in C-- +%right '&&' -- non-std extension, called %conjoin in C-- +%right '!' 
+%nonassoc '>=' '>' '<=' '<' '!=' '==' +%left '|' +%left '^' +%left '&' +%left '>>' '<<' +%left '-' '+' +%left '/' '*' '%' +%right '~' + +%% + +cmm :: { ExtCode } + : {- empty -} { return () } + | cmmtop cmm { do $1; $2 } + +cmmtop :: { ExtCode } + : cmmproc { $1 } + | cmmdata { $1 } + | decl { $1 } + | 'CLOSURE' '(' NAME ',' NAME lits ')' ';' + { do lits <- sequence $6; + staticClosure $3 $5 (map getLit lits) } + +-- The only static closures in the RTS are dummy closures like +-- stg_END_TSO_QUEUE_closure and stg_dummy_ret. We don't need +-- to provide the full generality of static closures here. +-- In particular: +-- * CCS can always be CCS_DONT_CARE +-- * closure is always extern +-- * payload is always empty +-- * we can derive closure and info table labels from a single NAME + +cmmdata :: { ExtCode } + : 'section' STRING '{' statics '}' + { do ss <- sequence $4; + code (emitData (section $2) (concat ss)) } + +statics :: { [ExtFCode [CmmStatic]] } + : {- empty -} { [] } + | static statics { $1 : $2 } + +-- Strings aren't used much in the RTS HC code, so it doesn't seem +-- worth allowing inline strings. C-- doesn't allow them anyway. +static :: { ExtFCode [CmmStatic] } + : NAME ':' { return [CmmDataLabel (mkRtsDataLabelFS $1)] } + | type expr ';' { do e <- $2; + return [CmmStaticLit (getLit e)] } + | type ';' { return [CmmUninitialised + (machRepByteWidth $1)] } + | 'bits8' '[' ']' STRING ';' { return [mkString $4] } + | 'bits8' '[' INT ']' ';' { return [CmmUninitialised + (fromIntegral $3)] } + | typenot8 '[' INT ']' ';' { return [CmmUninitialised + (machRepByteWidth $1 * + fromIntegral $3)] } + | 'align' INT ';' { return [CmmAlign (fromIntegral $2)] } + | 'CLOSURE' '(' NAME lits ')' + { do lits <- sequence $4; + return $ map CmmStaticLit $ + mkStaticClosure (mkRtsInfoLabelFS $3) + dontCareCCS (map getLit lits) [] [] [] } + -- arrays of closures required for the CHARLIKE & INTLIKE arrays + +lits :: { [ExtFCode CmmExpr] } + : {- empty -} { [] } + | ',' expr lits { $2 : $3 } + +cmmproc :: { ExtCode } + : info '{' body '}' + { do (info_lbl, info1, info2) <- $1; + stmts <- getCgStmtsEC (loopDecls $3) + blks <- code (cgStmtsToBlocks stmts) + code (emitInfoTableAndCode info_lbl info1 info2 [] blks) } + + | info ';' + { do (info_lbl, info1, info2) <- $1; + code (emitInfoTableAndCode info_lbl info1 info2 [] []) } + + | NAME '{' body '}' + { do stmts <- getCgStmtsEC (loopDecls $3); + blks <- code (cgStmtsToBlocks stmts) + code (emitProc [] (mkRtsCodeLabelFS $1) [] blks) } + +info :: { ExtFCode (CLabel, [CmmLit],[CmmLit]) } + : 'INFO_TABLE' '(' NAME ',' INT ',' INT ',' INT ',' STRING ',' STRING ')' + -- ptrs, nptrs, closure type, description, type + { stdInfo $3 $5 $7 0 $9 $11 $13 } + + | 'INFO_TABLE_FUN' '(' NAME ',' INT ',' INT ',' INT ',' STRING ',' STRING ',' INT ')' + -- ptrs, nptrs, closure type, description, type, fun type + { funInfo $3 $5 $7 $9 $11 $13 $15 } + + | 'INFO_TABLE_CONSTR' '(' NAME ',' INT ',' INT ',' INT ',' INT ',' STRING ',' STRING ')' + -- ptrs, nptrs, tag, closure type, description, type + { stdInfo $3 $5 $7 $9 $11 $13 $15 } + + | 'INFO_TABLE_SELECTOR' '(' NAME ',' INT ',' INT ',' STRING ',' STRING ')' + -- selector, closure type, description, type + { basicInfo $3 (mkIntCLit (fromIntegral $5)) 0 $7 $9 $11 } + + | 'INFO_TABLE_RET' '(' NAME ',' INT ',' INT ',' INT maybe_vec ')' + { retInfo $3 $5 $7 $9 $10 } + +maybe_vec :: { [CmmLit] } + : {- empty -} { [] } + | ',' NAME maybe_vec { CmmLabel (mkRtsCodeLabelFS $2) : $3 } + +body :: { ExtCode } + : {- empty -} { return () 
} + | decl body { do $1; $2 } + | stmt body { do $1; $2 } + +decl :: { ExtCode } + : type names ';' { mapM_ (newLocal $1) $2 } + | 'import' names ';' { return () } -- ignore imports + | 'export' names ';' { return () } -- ignore exports + +names :: { [FastString] } + : NAME { [$1] } + | NAME ',' names { $1 : $3 } + +stmt :: { ExtCode } + : ';' { nopEC } + + | block_id ':' { code (labelC $1) } + + | lreg '=' expr ';' + { do reg <- $1; e <- $3; stmtEC (CmmAssign reg e) } + | type '[' expr ']' '=' expr ';' + { doStore $1 $3 $6 } + | 'foreign' STRING expr '(' hint_exprs0 ')' vols ';' + {% foreignCall $2 [] $3 $5 $7 } + | lreg '=' 'foreign' STRING expr '(' hint_exprs0 ')' vols ';' + {% let result = do r <- $1; return (r,NoHint) in + foreignCall $4 [result] $5 $7 $9 } + | STRING lreg '=' 'foreign' STRING expr '(' hint_exprs0 ')' vols ';' + {% do h <- parseHint $1; + let result = do r <- $2; return (r,h) in + foreignCall $5 [result] $6 $8 $10 } + -- stmt-level macros, stealing syntax from ordinary C-- function calls. + -- Perhaps we ought to use the %%-form? + | NAME '(' exprs0 ')' ';' + {% stmtMacro $1 $3 } + | 'switch' maybe_range expr '{' arms default '}' + { doSwitch $2 $3 $5 $6 } + | 'goto' block_id ';' + { stmtEC (CmmBranch $2) } + | 'jump' expr {-maybe_actuals-} ';' + { do e <- $2; stmtEC (CmmJump e []) } + | 'if' bool_expr '{' body '}' else + { ifThenElse $2 $4 $6 } + +bool_expr :: { ExtFCode BoolExpr } + : bool_op { $1 } + | expr { do e <- $1; return (BoolTest e) } + +bool_op :: { ExtFCode BoolExpr } + : bool_expr '&&' bool_expr { do e1 <- $1; e2 <- $3; + return (BoolAnd e1 e2) } + | bool_expr '||' bool_expr { do e1 <- $1; e2 <- $3; + return (BoolOr e1 e2) } + | '!' bool_expr { do e <- $2; return (BoolNot e) } + | '(' bool_op ')' { $2 } + +-- This is not C-- syntax. What to do? +vols :: { Maybe [GlobalReg] } + : {- empty -} { Nothing } + | '[' ']' { Just [] } + | '[' globals ']' { Just $2 } + +globals :: { [GlobalReg] } + : GLOBALREG { [$1] } + | GLOBALREG ',' globals { $1 : $3 } + +maybe_range :: { Maybe (Int,Int) } + : '[' INT '..' INT ']' { Just (fromIntegral $2, fromIntegral $4) } + | {- empty -} { Nothing } + +arms :: { [([Int],ExtCode)] } + : {- empty -} { [] } + | arm arms { $1 : $2 } + +arm :: { ([Int],ExtCode) } + : 'case' ints ':' '{' body '}' { ($2, $5) } + +ints :: { [Int] } + : INT { [ fromIntegral $1 ] } + | INT ',' ints { fromIntegral $1 : $3 } + +default :: { Maybe ExtCode } + : 'default' ':' '{' body '}' { Just $4 } + -- taking a few liberties with the C-- syntax here; C-- doesn't have + -- 'default' branches + | {- empty -} { Nothing } + +else :: { ExtCode } + : {- empty -} { nopEC } + | 'else' '{' body '}' { $3 } + +-- we have to write this out longhand so that Happy's precedence rules +-- can kick in. 
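--
-- For example, with the precedences declared above,
--
--      R1 + R2 * 4 < bits64[Sp]
--
-- parses as  (R1 + (R2 * 4)) < bits64[Sp],  and a chained comparison
-- like  a < b < c  is rejected, because the comparison operators are
-- declared %nonassoc.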
+expr :: { ExtFCode CmmExpr } + : expr '/' expr { mkMachOp MO_U_Quot [$1,$3] } + | expr '*' expr { mkMachOp MO_Mul [$1,$3] } + | expr '%' expr { mkMachOp MO_U_Rem [$1,$3] } + | expr '-' expr { mkMachOp MO_Sub [$1,$3] } + | expr '+' expr { mkMachOp MO_Add [$1,$3] } + | expr '>>' expr { mkMachOp MO_U_Shr [$1,$3] } + | expr '<<' expr { mkMachOp MO_Shl [$1,$3] } + | expr '&' expr { mkMachOp MO_And [$1,$3] } + | expr '^' expr { mkMachOp MO_Xor [$1,$3] } + | expr '|' expr { mkMachOp MO_Or [$1,$3] } + | expr '>=' expr { mkMachOp MO_U_Ge [$1,$3] } + | expr '>' expr { mkMachOp MO_U_Gt [$1,$3] } + | expr '<=' expr { mkMachOp MO_U_Le [$1,$3] } + | expr '<' expr { mkMachOp MO_U_Lt [$1,$3] } + | expr '!=' expr { mkMachOp MO_Ne [$1,$3] } + | expr '==' expr { mkMachOp MO_Eq [$1,$3] } + | '~' expr { mkMachOp MO_Not [$2] } + | '-' expr { mkMachOp MO_S_Neg [$2] } + | expr0 '`' NAME '`' expr0 {% do { mo <- nameToMachOp $3 ; + return (mkMachOp mo [$1,$5]) } } + | expr0 { $1 } + +expr0 :: { ExtFCode CmmExpr } + : INT maybe_ty { return (CmmLit (CmmInt $1 $2)) } + | FLOAT maybe_ty { return (CmmLit (CmmFloat $1 $2)) } + | STRING { do s <- code (mkStringCLit $1); + return (CmmLit s) } + | reg { $1 } + | type '[' expr ']' { do e <- $3; return (CmmLoad e $1) } + | '%' NAME '(' exprs0 ')' {% exprOp $2 $4 } + | '(' expr ')' { $2 } + + +-- leaving out the type of a literal gives you the native word size in C-- +maybe_ty :: { MachRep } + : {- empty -} { wordRep } + | '::' type { $2 } + +hint_exprs0 :: { [ExtFCode (CmmExpr, MachHint)] } + : {- empty -} { [] } + | hint_exprs { $1 } + +hint_exprs :: { [ExtFCode (CmmExpr, MachHint)] } + : hint_expr { [$1] } + | hint_expr ',' hint_exprs { $1 : $3 } + +hint_expr :: { ExtFCode (CmmExpr, MachHint) } + : expr { do e <- $1; return (e, inferHint e) } + | expr STRING {% do h <- parseHint $2; + return $ do + e <- $1; return (e,h) } + +exprs0 :: { [ExtFCode CmmExpr] } + : {- empty -} { [] } + | exprs { $1 } + +exprs :: { [ExtFCode CmmExpr] } + : expr { [ $1 ] } + | expr ',' exprs { $1 : $3 } + +reg :: { ExtFCode CmmExpr } + : NAME { lookupName $1 } + | GLOBALREG { return (CmmReg (CmmGlobal $1)) } + +lreg :: { ExtFCode CmmReg } + : NAME { do e <- lookupName $1; + return $ + case e of + CmmReg r -> r + other -> pprPanic "CmmParse:" (ftext $1 <> text " not a register") } + | GLOBALREG { return (CmmGlobal $1) } + +block_id :: { BlockId } + : NAME { BlockId (newTagUnique (getUnique $1) 'L') } + -- TODO: ugh. The unique of a FastString has a null + -- tag, so we have to put our own tag on. We should + -- really make a new unique for every label, and keep + -- them in an environment. + +type :: { MachRep } + : 'bits8' { I8 } + | typenot8 { $1 } + +typenot8 :: { MachRep } + : 'bits16' { I16 } + | 'bits32' { I32 } + | 'bits64' { I64 } + | 'float32' { F32 } + | 'float64' { F64 } +{ +section :: String -> Section +section "text" = Text +section "data" = Data +section "rodata" = ReadOnlyData +section "bss" = UninitialisedData +section s = OtherSection s + +mkString :: String -> CmmStatic +mkString s = CmmString (map (fromIntegral.ord) s) + +-- mkMachOp infers the type of the MachOp from the type of its first +-- argument. We assume that this is correct: for MachOps that don't have +-- symmetrical args (e.g. shift ops), the first arg determines the type of +-- the op. 
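--
-- For example (hypothetical arguments): if the first argument elaborates
-- to an expression whose rep is I64, then
--
--      mkMachOp MO_Add [arg1, arg2]
--
-- builds  CmmMachOp (MO_Add I64) [e1, e2],  where e1 and e2 are the
-- elaborated argument expressions.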
+mkMachOp :: (MachRep -> MachOp) -> [ExtFCode CmmExpr] -> ExtFCode CmmExpr +mkMachOp fn args = do + arg_exprs <- sequence args + return (CmmMachOp (fn (cmmExprRep (head arg_exprs))) arg_exprs) + +getLit :: CmmExpr -> CmmLit +getLit (CmmLit l) = l +getLit (CmmMachOp (MO_S_Neg _) [CmmLit (CmmInt i r)]) = CmmInt (negate i) r +getLit _ = panic "invalid literal" -- TODO messy failure + +nameToMachOp :: FastString -> P (MachRep -> MachOp) +nameToMachOp name = + case lookupUFM machOps name of + Nothing -> fail ("unknown primitive " ++ unpackFS name) + Just m -> return m + +exprOp :: FastString -> [ExtFCode CmmExpr] -> P (ExtFCode CmmExpr) +exprOp name args_code = + case lookupUFM exprMacros name of + Just f -> return $ do + args <- sequence args_code + return (f args) + Nothing -> do + mo <- nameToMachOp name + return $ mkMachOp mo args_code + +exprMacros :: UniqFM ([CmmExpr] -> CmmExpr) +exprMacros = listToUFM [ + ( FSLIT("ENTRY_CODE"), \ [x] -> entryCode x ), + ( FSLIT("INFO_PTR"), \ [x] -> closureInfoPtr x ), + ( FSLIT("STD_INFO"), \ [x] -> infoTable x ), + ( FSLIT("FUN_INFO"), \ [x] -> funInfoTable x ), + ( FSLIT("GET_ENTRY"), \ [x] -> entryCode (closureInfoPtr x) ), + ( FSLIT("GET_STD_INFO"), \ [x] -> infoTable (closureInfoPtr x) ), + ( FSLIT("GET_FUN_INFO"), \ [x] -> funInfoTable (closureInfoPtr x) ), + ( FSLIT("INFO_TYPE"), \ [x] -> infoTableClosureType x ), + ( FSLIT("INFO_PTRS"), \ [x] -> infoTablePtrs x ), + ( FSLIT("INFO_NPTRS"), \ [x] -> infoTableNonPtrs x ), + ( FSLIT("RET_VEC"), \ [info, conZ] -> retVec info conZ ) + ] + +-- we understand a subset of C-- primitives: +machOps = listToUFM $ + map (\(x, y) -> (mkFastString x, y)) [ + ( "add", MO_Add ), + ( "sub", MO_Sub ), + ( "eq", MO_Eq ), + ( "ne", MO_Ne ), + ( "mul", MO_Mul ), + ( "neg", MO_S_Neg ), + ( "quot", MO_S_Quot ), + ( "rem", MO_S_Rem ), + ( "divu", MO_U_Quot ), + ( "modu", MO_U_Rem ), + + ( "ge", MO_S_Ge ), + ( "le", MO_S_Le ), + ( "gt", MO_S_Gt ), + ( "lt", MO_S_Lt ), + + ( "geu", MO_U_Ge ), + ( "leu", MO_U_Le ), + ( "gtu", MO_U_Gt ), + ( "ltu", MO_U_Lt ), + + ( "flt", MO_S_Lt ), + ( "fle", MO_S_Le ), + ( "feq", MO_Eq ), + ( "fne", MO_Ne ), + ( "fgt", MO_S_Gt ), + ( "fge", MO_S_Ge ), + ( "fneg", MO_S_Neg ), + + ( "and", MO_And ), + ( "or", MO_Or ), + ( "xor", MO_Xor ), + ( "com", MO_Not ), + ( "shl", MO_Shl ), + ( "shrl", MO_U_Shr ), + ( "shra", MO_S_Shr ), + + ( "lobits8", flip MO_U_Conv I8 ), + ( "lobits16", flip MO_U_Conv I16 ), + ( "lobits32", flip MO_U_Conv I32 ), + ( "lobits64", flip MO_U_Conv I64 ), + ( "sx16", flip MO_S_Conv I16 ), + ( "sx32", flip MO_S_Conv I32 ), + ( "sx64", flip MO_S_Conv I64 ), + ( "zx16", flip MO_U_Conv I16 ), + ( "zx32", flip MO_U_Conv I32 ), + ( "zx64", flip MO_U_Conv I64 ), + ( "f2f32", flip MO_S_Conv F32 ), -- TODO; rounding mode + ( "f2f64", flip MO_S_Conv F64 ), -- TODO; rounding mode + ( "f2i8", flip MO_S_Conv I8 ), + ( "f2i16", flip MO_S_Conv I8 ), + ( "f2i32", flip MO_S_Conv I8 ), + ( "f2i64", flip MO_S_Conv I8 ), + ( "i2f32", flip MO_S_Conv F32 ), + ( "i2f64", flip MO_S_Conv F64 ) + ] + +parseHint :: String -> P MachHint +parseHint "ptr" = return PtrHint +parseHint "signed" = return SignedHint +parseHint "float" = return FloatHint +parseHint str = fail ("unrecognised hint: " ++ str) + +-- labels are always pointers, so we might as well infer the hint +inferHint :: CmmExpr -> MachHint +inferHint (CmmLit (CmmLabel _)) = PtrHint +inferHint (CmmReg (CmmGlobal g)) | isPtrGlobalReg g = PtrHint +inferHint _ = NoHint + +isPtrGlobalReg Sp = True +isPtrGlobalReg SpLim = True +isPtrGlobalReg 
Hp = True +isPtrGlobalReg HpLim = True +isPtrGlobalReg CurrentTSO = True +isPtrGlobalReg CurrentNursery = True +isPtrGlobalReg _ = False + +happyError :: P a +happyError = srcParseFail + +-- ----------------------------------------------------------------------------- +-- Statement-level macros + +stmtMacro :: FastString -> [ExtFCode CmmExpr] -> P ExtCode +stmtMacro fun args_code = do + case lookupUFM stmtMacros fun of + Nothing -> fail ("unknown macro: " ++ unpackFS fun) + Just fcode -> return $ do + args <- sequence args_code + code (fcode args) + +stmtMacros :: UniqFM ([CmmExpr] -> Code) +stmtMacros = listToUFM [ + ( FSLIT("CCS_ALLOC"), \[words,ccs] -> profAlloc words ccs ), + ( FSLIT("CLOSE_NURSERY"), \[] -> emitCloseNursery ), + ( FSLIT("ENTER_CCS_PAP_CL"), \[e] -> enterCostCentrePAP e ), + ( FSLIT("ENTER_CCS_THUNK"), \[e] -> enterCostCentreThunk e ), + ( FSLIT("HP_CHK_GEN"), \[words,liveness,reentry] -> + hpChkGen words liveness reentry ), + ( FSLIT("HP_CHK_NP_ASSIGN_SP0"), \[e,f] -> hpChkNodePointsAssignSp0 e f ), + ( FSLIT("LOAD_THREAD_STATE"), \[] -> emitLoadThreadState ), + ( FSLIT("LDV_ENTER"), \[e] -> ldvEnter e ), + ( FSLIT("LDV_RECORD_CREATE"), \[e] -> ldvRecordCreate e ), + ( FSLIT("OPEN_NURSERY"), \[] -> emitOpenNursery ), + ( FSLIT("PUSH_UPD_FRAME"), \[sp,e] -> emitPushUpdateFrame sp e ), + ( FSLIT("SAVE_THREAD_STATE"), \[] -> emitSaveThreadState ), + ( FSLIT("SET_HDR"), \[ptr,info,ccs] -> + emitSetDynHdr ptr info ccs ), + ( FSLIT("STK_CHK_GEN"), \[words,liveness,reentry] -> + stkChkGen words liveness reentry ), + ( FSLIT("STK_CHK_NP"), \[e] -> stkChkNodePoints e ), + ( FSLIT("TICK_ALLOC_PRIM"), \[hdr,goods,slop] -> + tickyAllocPrim hdr goods slop ), + ( FSLIT("TICK_ALLOC_PAP"), \[goods,slop] -> + tickyAllocPAP goods slop ), + ( FSLIT("TICK_ALLOC_UP_THK"), \[goods,slop] -> + tickyAllocThunk goods slop ), + ( FSLIT("UPD_BH_UPDATABLE"), \[] -> emitBlackHoleCode False ), + ( FSLIT("UPD_BH_SINGLE_ENTRY"), \[] -> emitBlackHoleCode True ), + + ( FSLIT("RET_P"), \[a] -> emitRetUT [(PtrArg,a)]), + ( FSLIT("RET_N"), \[a] -> emitRetUT [(NonPtrArg,a)]), + ( FSLIT("RET_PP"), \[a,b] -> emitRetUT [(PtrArg,a),(PtrArg,b)]), + ( FSLIT("RET_NN"), \[a,b] -> emitRetUT [(NonPtrArg,a),(NonPtrArg,b)]), + ( FSLIT("RET_NP"), \[a,b] -> emitRetUT [(NonPtrArg,a),(PtrArg,b)]), + ( FSLIT("RET_PPP"), \[a,b,c] -> emitRetUT [(PtrArg,a),(PtrArg,b),(PtrArg,c)]), + ( FSLIT("RET_NNP"), \[a,b,c] -> emitRetUT [(NonPtrArg,a),(NonPtrArg,b),(PtrArg,c)]), + ( FSLIT("RET_NNNP"), \[a,b,c,d] -> emitRetUT [(NonPtrArg,a),(NonPtrArg,b),(NonPtrArg,c),(PtrArg,d)]), + ( FSLIT("RET_NPNP"), \[a,b,c,d] -> emitRetUT [(NonPtrArg,a),(PtrArg,b),(NonPtrArg,c),(PtrArg,d)]) + + ] + +-- ----------------------------------------------------------------------------- +-- Our extended FCode monad. + +-- We add a mapping from names to CmmExpr, to support local variable names in +-- the concrete C-- code. The unique supply of the underlying FCode monad +-- is used to grab a new unique for each local variable. + +-- In C--, a local variable can be declared anywhere within a proc, +-- and it scopes from the beginning of the proc to the end. Hence, we have +-- to collect declarations as we parse the proc, and feed the environment +-- back in circularly (to avoid a two-pass algorithm). 
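--
-- The same knot-tying idea in miniature (a standalone sketch, not part
-- of this module, assuming the references are acyclic):
--
--      import qualified Data.Map as M
--
--      resolve :: [(String, String)] -> M.Map String Int
--      resolve binds = env
--        where env = M.fromList [ (x, 1 + M.findWithDefault 0 y env)
--                               | (x, y) <- binds ]
--
-- The map is built once, and each value lazily consults the finished
-- map; nothing forces the environment while it is under construction.
-- loopDecls (via fixC) plays the same trick with the declaration list.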
+ +type Decls = [(FastString,CmmExpr)] +type Env = UniqFM CmmExpr + +newtype ExtFCode a = EC { unEC :: Env -> Decls -> FCode (Decls, a) } + +type ExtCode = ExtFCode () + +returnExtFC a = EC $ \e s -> return (s, a) +thenExtFC (EC m) k = EC $ \e s -> do (s',r) <- m e s; unEC (k r) e s' + +instance Monad ExtFCode where + (>>=) = thenExtFC + return = returnExtFC + +-- This function takes the variable decarations and imports and makes +-- an environment, which is looped back into the computation. In this +-- way, we can have embedded declarations that scope over the whole +-- procedure, and imports that scope over the entire module. +loopDecls :: ExtFCode a -> ExtFCode a +loopDecls (EC fcode) = + EC $ \e s -> fixC (\ ~(decls,a) -> fcode (addListToUFM e decls) []) + +getEnv :: ExtFCode Env +getEnv = EC $ \e s -> return (s, e) + +addVarDecl :: FastString -> CmmExpr -> ExtCode +addVarDecl var expr = EC $ \e s -> return ((var,expr):s, ()) + +newLocal :: MachRep -> FastString -> ExtCode +newLocal ty name = do + u <- code newUnique + addVarDecl name (CmmReg (CmmLocal (LocalReg u ty))) + +-- Unknown names are treated as if they had been 'import'ed. +-- This saves us a lot of bother in the RTS sources, at the expense of +-- deferring some errors to link time. +lookupName :: FastString -> ExtFCode CmmExpr +lookupName name = do + env <- getEnv + return $ + case lookupUFM env name of + Nothing -> CmmLit (CmmLabel (mkRtsCodeLabelFS name)) + Just e -> e + +-- Lifting FCode computations into the ExtFCode monad: +code :: FCode a -> ExtFCode a +code fc = EC $ \e s -> do r <- fc; return (s, r) + +code2 :: (FCode (Decls,b) -> FCode ((Decls,b),c)) + -> ExtFCode b -> ExtFCode c +code2 f (EC ec) = EC $ \e s -> do ((s',b),c) <- f (ec e s); return (s',c) + +nopEC = code nopC +stmtEC stmt = code (stmtC stmt) +stmtsEC stmts = code (stmtsC stmts) +getCgStmtsEC = code2 getCgStmts' + +forkLabelledCodeEC ec = do + stmts <- getCgStmtsEC ec + code (forkCgStmts stmts) + +retInfo name size live_bits cl_type vector = do + let liveness = smallLiveness (fromIntegral size) (fromIntegral live_bits) + info_lbl = mkRtsRetInfoLabelFS name + (info1,info2) = mkRetInfoTable info_lbl liveness NoC_SRT + (fromIntegral cl_type) vector + return (info_lbl, info1, info2) + +stdInfo name ptrs nptrs srt_bitmap cl_type desc_str ty_str = + basicInfo name (packHalfWordsCLit ptrs nptrs) + srt_bitmap cl_type desc_str ty_str + +basicInfo name layout srt_bitmap cl_type desc_str ty_str = do + lit1 <- if opt_SccProfilingOn + then code $ mkStringCLit desc_str + else return (mkIntCLit 0) + lit2 <- if opt_SccProfilingOn + then code $ mkStringCLit ty_str + else return (mkIntCLit 0) + let info1 = mkStdInfoTable lit1 lit2 (fromIntegral cl_type) + (fromIntegral srt_bitmap) + layout + return (mkRtsInfoLabelFS name, info1, []) + +funInfo name ptrs nptrs cl_type desc_str ty_str fun_type = do + (label,info1,_) <- stdInfo name ptrs nptrs 0{-srt_bitmap-} + cl_type desc_str ty_str + let info2 = mkFunGenInfoExtraBits (fromIntegral fun_type) 0 zero zero zero + -- we leave most of the fields zero here. This is only used + -- to generate the BCO info table in the RTS at the moment. 
+ return (label,info1,info2) + where + zero = mkIntCLit 0 + + +staticClosure :: FastString -> FastString -> [CmmLit] -> ExtCode +staticClosure cl_label info payload + = code $ emitDataLits (mkRtsDataLabelFS cl_label) lits + where lits = mkStaticClosure (mkRtsInfoLabelFS info) dontCareCCS payload [] [] [] + +foreignCall + :: String + -> [ExtFCode (CmmReg,MachHint)] + -> ExtFCode CmmExpr + -> [ExtFCode (CmmExpr,MachHint)] + -> Maybe [GlobalReg] -> P ExtCode +foreignCall "C" results_code expr_code args_code vols + = return $ do + results <- sequence results_code + expr <- expr_code + args <- sequence args_code + code (emitForeignCall' PlayRisky results + (CmmForeignCall expr CCallConv) args vols) +foreignCall conv _ _ _ _ + = fail ("unknown calling convention: " ++ conv) + +doStore :: MachRep -> ExtFCode CmmExpr -> ExtFCode CmmExpr -> ExtCode +doStore rep addr_code val_code + = do addr <- addr_code + val <- val_code + -- if the specified store type does not match the type of the expr + -- on the rhs, then we insert a coercion that will cause the type + -- mismatch to be flagged by cmm-lint. If we don't do this, then + -- the store will happen at the wrong type, and the error will not + -- be noticed. + let coerce_val + | cmmExprRep val /= rep = CmmMachOp (MO_U_Conv rep rep) [val] + | otherwise = val + stmtEC (CmmStore addr coerce_val) + +-- Return an unboxed tuple. +emitRetUT :: [(CgRep,CmmExpr)] -> Code +emitRetUT args = do + tickyUnboxedTupleReturn (length args) -- TICK + (sp, stmts) <- pushUnboxedTuple 0 args + emitStmts stmts + when (sp /= 0) $ stmtC (CmmAssign spReg (cmmRegOffW spReg (-sp))) + stmtC (CmmJump (entryCode (CmmLoad (cmmRegOffW spReg sp) wordRep)) []) + +-- ----------------------------------------------------------------------------- +-- If-then-else and boolean expressions + +data BoolExpr + = BoolExpr `BoolAnd` BoolExpr + | BoolExpr `BoolOr` BoolExpr + | BoolNot BoolExpr + | BoolTest CmmExpr + +-- ToDo: smart constructors which simplify the boolean expression. + +ifThenElse cond then_part else_part = do + then_id <- code newLabelC + join_id <- code newLabelC + c <- cond + emitCond c then_id + else_part + stmtEC (CmmBranch join_id) + code (labelC then_id) + then_part + -- fall through to join + code (labelC join_id) + +-- 'emitCond cond true_id' emits code to test whether the cond is true, +-- branching to true_id if so, and falling through otherwise. +emitCond (BoolTest e) then_id = do + stmtEC (CmmCondBranch e then_id) +emitCond (BoolNot (BoolTest (CmmMachOp op args))) then_id + | Just op' <- maybeInvertComparison op + = emitCond (BoolTest (CmmMachOp op' args)) then_id +emitCond (BoolNot e) then_id = do + else_id <- code newLabelC + emitCond e else_id + stmtEC (CmmBranch then_id) + code (labelC else_id) +emitCond (e1 `BoolOr` e2) then_id = do + emitCond e1 then_id + emitCond e2 then_id +emitCond (e1 `BoolAnd` e2) then_id = do + -- we'd like to invert one of the conditionals here to avoid an + -- extra branch instruction, but we can't use maybeInvertComparison + -- here because we can't look too closely at the expression since + -- we're in a loop. + and_id <- code newLabelC + else_id <- code newLabelC + emitCond e1 and_id + stmtEC (CmmBranch else_id) + code (labelC and_id) + emitCond e2 then_id + code (labelC else_id) + + +-- ----------------------------------------------------------------------------- +-- Table jumps + +-- We use a simplified form of C-- switch statements for now. A +-- switch statement always compiles to a table jump. 
Each arm can +-- specify a list of values (not ranges), and there can be a single +-- default branch. The range of the table is given either by the +-- optional range on the switch (eg. switch [0..7] {...}), or by +-- the minimum/maximum values from the branches. + +doSwitch :: Maybe (Int,Int) -> ExtFCode CmmExpr -> [([Int],ExtCode)] + -> Maybe ExtCode -> ExtCode +doSwitch mb_range scrut arms deflt + = do + -- Compile code for the default branch + dflt_entry <- + case deflt of + Nothing -> return Nothing + Just e -> do b <- forkLabelledCodeEC e; return (Just b) + + -- Compile each case branch + table_entries <- mapM emitArm arms + + -- Construct the table + let + all_entries = concat table_entries + ixs = map fst all_entries + (min,max) + | Just (l,u) <- mb_range = (l,u) + | otherwise = (minimum ixs, maximum ixs) + + entries = elems (accumArray (\_ a -> Just a) dflt_entry (min,max) + all_entries) + expr <- scrut + -- ToDo: check for out of range and jump to default if necessary + stmtEC (CmmSwitch expr entries) + where + emitArm :: ([Int],ExtCode) -> ExtFCode [(Int,BlockId)] + emitArm (ints,code) = do + blockid <- forkLabelledCodeEC code + return [ (i,blockid) | i <- ints ] + + +-- ----------------------------------------------------------------------------- +-- Putting it all together + +-- The initial environment: we define some constants that the compiler +-- knows about here. +initEnv :: Env +initEnv = listToUFM [ + ( FSLIT("SIZEOF_StgHeader"), + CmmLit (CmmInt (fromIntegral (fixedHdrSize * wORD_SIZE)) wordRep) ), + ( FSLIT("SIZEOF_StgInfoTable"), + CmmLit (CmmInt (fromIntegral stdInfoTableSizeB) wordRep) ) + ] + +parseCmmFile :: DynFlags -> HomeModules -> FilePath -> IO (Maybe Cmm) +parseCmmFile dflags hmods filename = do + showPass dflags "ParseCmm" + buf <- hGetStringBuffer filename + let + init_loc = mkSrcLoc (mkFastString filename) 1 0 + init_state = (mkPState buf init_loc dflags) { lex_state = [0] } + -- reset the lex_state: the Lexer monad leaves some stuff + -- in there we don't want. + case unP cmmParse init_state of + PFailed span err -> do printError span err; return Nothing + POk _ code -> do + cmm <- initC dflags hmods no_module (getCmm (unEC code initEnv [] >> return ())) + dumpIfSet_dyn dflags Opt_D_dump_cmm "Cmm" (pprCmms [cmm]) + return (Just cmm) + where + no_module = panic "parseCmmFile: no module" +} diff --git a/compiler/cmm/CmmUtils.hs b/compiler/cmm/CmmUtils.hs new file mode 100644 index 0000000000..a04935b279 --- /dev/null +++ b/compiler/cmm/CmmUtils.hs @@ -0,0 +1,177 @@ +----------------------------------------------------------------------------- +-- +-- Cmm utilities. 
+-- +-- (c) The University of Glasgow 2004 +-- +----------------------------------------------------------------------------- + +module CmmUtils( + CmmStmts, noStmts, oneStmt, mkStmts, plusStmts, stmtList, + isNopStmt, + + isTrivialCmmExpr, hasNoGlobalRegs, + + cmmRegOff, cmmLabelOff, cmmOffset, cmmOffsetLit, cmmIndex, + cmmOffsetExpr, cmmIndexExpr, cmmLoadIndex, + + mkIntCLit, zeroCLit, + + mkLblExpr, + ) where + +#include "HsVersions.h" + +import CLabel ( CLabel ) +import Cmm +import MachOp +import OrdList +import Outputable + +--------------------------------------------------- +-- +-- CmmStmts +-- +--------------------------------------------------- + +type CmmStmts = OrdList CmmStmt + +noStmts :: CmmStmts +noStmts = nilOL + +oneStmt :: CmmStmt -> CmmStmts +oneStmt = unitOL + +mkStmts :: [CmmStmt] -> CmmStmts +mkStmts = toOL + +plusStmts :: CmmStmts -> CmmStmts -> CmmStmts +plusStmts = appOL + +stmtList :: CmmStmts -> [CmmStmt] +stmtList = fromOL + + +--------------------------------------------------- +-- +-- CmmStmt +-- +--------------------------------------------------- + +isNopStmt :: CmmStmt -> Bool +-- If isNopStmt returns True, the stmt is definitely a no-op; +-- but it might be a no-op even if isNopStmt returns False +isNopStmt CmmNop = True +isNopStmt (CmmAssign r e) = cheapEqReg r e +isNopStmt (CmmStore e1 (CmmLoad e2 _)) = cheapEqExpr e1 e2 +isNopStmt s = False + +cheapEqExpr :: CmmExpr -> CmmExpr -> Bool +cheapEqExpr (CmmReg r) e = cheapEqReg r e +cheapEqExpr (CmmRegOff r 0) e = cheapEqReg r e +cheapEqExpr (CmmRegOff r n) (CmmRegOff r' n') = r==r' && n==n' +cheapEqExpr e1 e2 = False + +cheapEqReg :: CmmReg -> CmmExpr -> Bool +cheapEqReg r (CmmReg r') = r==r' +cheapEqReg r (CmmRegOff r' 0) = r==r' +cheapEqReg r e = False + +--------------------------------------------------- +-- +-- CmmExpr +-- +--------------------------------------------------- + +isTrivialCmmExpr :: CmmExpr -> Bool +isTrivialCmmExpr (CmmLoad _ _) = False +isTrivialCmmExpr (CmmMachOp _ _) = False +isTrivialCmmExpr (CmmLit _) = True +isTrivialCmmExpr (CmmReg _) = True +isTrivialCmmExpr (CmmRegOff _ _) = True + +hasNoGlobalRegs :: CmmExpr -> Bool +hasNoGlobalRegs (CmmLoad e _) = hasNoGlobalRegs e +hasNoGlobalRegs (CmmMachOp _ es) = all hasNoGlobalRegs es +hasNoGlobalRegs (CmmLit _) = True +hasNoGlobalRegs (CmmReg (CmmLocal _)) = True +hasNoGlobalRegs (CmmRegOff (CmmLocal _) _) = True +hasNoGlobalRegs _ = False + +--------------------------------------------------- +-- +-- Expr Construction helpers +-- +--------------------------------------------------- + +cmmOffsetExpr :: CmmExpr -> CmmExpr -> CmmExpr +-- assumes base and offset have the same MachRep +cmmOffsetExpr e (CmmLit (CmmInt n _)) = cmmOffset e (fromInteger n) +cmmOffsetExpr e byte_off = CmmMachOp (MO_Add (cmmExprRep e)) [e, byte_off] + +-- NB. Do *not* inspect the value of the offset in these smart constructors!!! +-- +-- because the offset is sometimes involved in a loop in the code generator +-- (we don't know the real Hp offset until we've generated code for the entire +-- basic block, for example). So we cannot eliminate zero offsets at this +-- stage; they're eliminated later instead (either during printing or +-- a later optimisation step on Cmm). 
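--
-- Concretely: cmmRegOff reg off always builds  CmmRegOff reg off,  even
-- when off happens to be 0, because testing  off == 0  here could force
-- an offset that is still part of an unresolved circular computation
-- (e.g. a heap-check offset that is only known once the whole basic
-- block has been generated).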
+-- +cmmOffset :: CmmExpr -> Int -> CmmExpr +cmmOffset (CmmReg reg) byte_off = cmmRegOff reg byte_off +cmmOffset (CmmRegOff reg m) byte_off = cmmRegOff reg (m+byte_off) +cmmOffset (CmmLit lit) byte_off = CmmLit (cmmOffsetLit lit byte_off) +cmmOffset (CmmMachOp (MO_Add rep) [expr, CmmLit (CmmInt byte_off1 _rep)]) byte_off2 + = CmmMachOp (MO_Add rep) + [expr, CmmLit (CmmInt (byte_off1 + toInteger byte_off2) rep)] +cmmOffset expr byte_off + = CmmMachOp (MO_Add rep) [expr, CmmLit (CmmInt (toInteger byte_off) rep)] + where + rep = cmmExprRep expr + +-- Smart constructor for CmmRegOff. Same caveats as cmmOffset above. +cmmRegOff :: CmmReg -> Int -> CmmExpr +cmmRegOff reg byte_off = CmmRegOff reg byte_off + +cmmOffsetLit :: CmmLit -> Int -> CmmLit +cmmOffsetLit (CmmLabel l) byte_off = cmmLabelOff l byte_off +cmmOffsetLit (CmmLabelOff l m) byte_off = cmmLabelOff l (m+byte_off) +cmmOffsetLit (CmmInt m rep) byte_off = CmmInt (m + fromIntegral byte_off) rep +cmmOffsetLit other byte_off = pprPanic "cmmOffsetLit" (ppr byte_off) + +cmmLabelOff :: CLabel -> Int -> CmmLit +-- Smart constructor for CmmLabelOff +cmmLabelOff lbl 0 = CmmLabel lbl +cmmLabelOff lbl byte_off = CmmLabelOff lbl byte_off + +-- | Useful for creating an index into an array, with a staticaly known offset. +cmmIndex :: MachRep -> CmmExpr -> Int -> CmmExpr +cmmIndex rep base idx = cmmOffset base (idx * machRepByteWidth rep) + +-- | Useful for creating an index into an array, with an unknown offset. +cmmIndexExpr :: MachRep -> CmmExpr -> CmmExpr -> CmmExpr +cmmIndexExpr rep base (CmmLit (CmmInt n _)) = cmmIndex rep base (fromInteger n) +cmmIndexExpr rep base idx = + cmmOffsetExpr base byte_off + where + idx_rep = cmmExprRep idx + byte_off = CmmMachOp (MO_Shl idx_rep) [ + idx, CmmLit (mkIntCLit (machRepLogWidth rep))] + +cmmLoadIndex :: MachRep -> CmmExpr -> Int -> CmmExpr +cmmLoadIndex rep expr ix = CmmLoad (cmmIndex rep expr ix) rep + +--------------------------------------------------- +-- +-- Literal construction functions +-- +--------------------------------------------------- + +mkIntCLit :: Int -> CmmLit +mkIntCLit i = CmmInt (toInteger i) wordRep + +zeroCLit :: CmmLit +zeroCLit = CmmInt 0 wordRep + +mkLblExpr :: CLabel -> CmmExpr +mkLblExpr lbl = CmmLit (CmmLabel lbl) diff --git a/compiler/cmm/MachOp.hs b/compiler/cmm/MachOp.hs new file mode 100644 index 0000000000..5bbff6de78 --- /dev/null +++ b/compiler/cmm/MachOp.hs @@ -0,0 +1,652 @@ +----------------------------------------------------------------------------- +-- +-- (c) The University of Glasgow 2002-2004 +-- +-- Low-level machine operations, used in the Cmm datatype. 
+-- +----------------------------------------------------------------------------- + +module MachOp ( + MachRep(..), + machRepBitWidth, + machRepByteWidth, + machRepLogWidth, + isFloatingRep, + + MachHint(..), + + MachOp(..), + pprMachOp, + isCommutableMachOp, + isAssociativeMachOp, + isComparisonMachOp, + resultRepOfMachOp, + machOpArgReps, + maybeInvertComparison, + + CallishMachOp(..), + pprCallishMachOp, + + wordRep, + halfWordRep, + cIntRep, cLongRep, + + mo_wordAdd, + mo_wordSub, + mo_wordEq, + mo_wordNe, + mo_wordMul, + mo_wordSQuot, + mo_wordSRem, + mo_wordSNeg, + mo_wordUQuot, + mo_wordURem, + + mo_wordSGe, + mo_wordSLe, + mo_wordSGt, + mo_wordSLt, + + mo_wordUGe, + mo_wordULe, + mo_wordUGt, + mo_wordULt, + + mo_wordAnd, + mo_wordOr, + mo_wordXor, + mo_wordNot, + mo_wordShl, + mo_wordSShr, + mo_wordUShr, + + mo_u_8To32, + mo_s_8To32, + mo_u_16To32, + mo_s_16To32, + + mo_u_8ToWord, + mo_s_8ToWord, + mo_u_16ToWord, + mo_s_16ToWord, + mo_u_32ToWord, + mo_s_32ToWord, + + mo_32To8, + mo_32To16, + mo_WordTo8, + mo_WordTo16, + mo_WordTo32, + ) where + +#include "HsVersions.h" + +import Constants +import Outputable + +-- ----------------------------------------------------------------------------- +-- MachRep + +{- | +A MachRep is the "representation" of a value in Cmm. It is used for +resource allocation: eg. which kind of register a value should be +stored in. + +The primary requirement is that there exists a function + + cmmExprRep :: CmmExpr -> MachRep + +This means that: + + - a register has an implicit MachRep + - a literal has an implicit MachRep + - an operation (MachOp) has an implicit result MachRep + +It also means that we can check that the arguments to a MachOp have +the correct MachRep, i.e. we can do a kind of lint-style type checking +on Cmm. +-} + +data MachRep + = I8 + | I16 + | I32 + | I64 + | I128 + | F32 + | F64 + | F80 -- extended double-precision, used in x86 native codegen only. + deriving (Eq, Ord, Show) + +mrStr I8 = SLIT("I8") +mrStr I16 = SLIT("I16") +mrStr I32 = SLIT("I32") +mrStr I64 = SLIT("I64") +mrStr I128 = SLIT("I128") +mrStr F32 = SLIT("F32") +mrStr F64 = SLIT("F64") +mrStr F80 = SLIT("F80") + +instance Outputable MachRep where + ppr rep = ptext (mrStr rep) + +{- +Implementation notes: + +It might suffice to keep just a width, without distinguishing between +floating and integer types. However, keeping the distinction will +help the native code generator to assign registers more easily. +-} + +{- +Should a MachRep include a signed vs. unsigned distinction? + +This is very much like a "hint" in C-- terminology: it isn't necessary +in order to generate correct code, but it might be useful in that the +compiler can generate better code if it has access to higher-level +hints about data. This is important at call boundaries, because the +definition of a function is not visible at all of its call sites, so +the compiler cannot infer the hints. + +Here in Cmm, we're taking a slightly different approach. We include +the int vs. float hint in the MachRep, because (a) the majority of +platforms have a strong distinction between float and int registers, +and (b) we don't want to do any heavyweight hint-inference in the +native code backend in order to get good code. We're treating the +hint more like a type: our Cmm is always completely consistent with +respect to hints. All coercions between float and int are explicit. + +What about the signed vs. unsigned hint? 
This information might be +useful if we want to keep sub-word-sized values in word-size +registers, which we must do if we only have word-sized registers. + +On such a system, there are two straightforward conventions for +representing sub-word-sized values: + +(a) Leave the upper bits undefined. Comparison operations must + sign- or zero-extend both operands before comparing them, + depending on whether the comparison is signed or unsigned. + +(b) Always keep the values sign- or zero-extended as appropriate. + Arithmetic operations must narrow the result to the appropriate + size. + +A clever compiler might not use either (a) or (b) exclusively, instead +it would attempt to minimize the coercions by analysis: the same kind +of analysis that propagates hints around. In Cmm we don't want to +have to do this, so we plump for having richer types and keeping the +type information consistent. + +If signed/unsigned hints are missing from MachRep, then the only +choice we have is (a), because we don't know whether the result of an +operation should be sign- or zero-extended. + +Many architectures have extending load operations, which work well +with (b). To make use of them with (a), you need to know whether the +value is going to be sign- or zero-extended by an enclosing comparison +(for example), which involves knowing above the context. This is +doable but more complex. + +Further complicating the issue is foreign calls: a foreign calling +convention can specify that signed 8-bit quantities are passed as +sign-extended 32 bit quantities, for example (this is the case on the +PowerPC). So we *do* need sign information on foreign call arguments. + +Pros for adding signed vs. unsigned to MachRep: + + - It would let us use convention (b) above, and get easier + code generation for extending loads. + + - Less information required on foreign calls. + + - MachOp type would be simpler + +Cons: + + - More complexity + + - What is the MachRep for a VanillaReg? Currently it is + always wordRep, but now we have to decide whether it is + signed or unsigned. The same VanillaReg can thus have + different MachReps in different parts of the program. + + - Extra coercions cluttering up expressions. + +Currently for GHC, the foreign call point is moot, because we do our +own promotion of sub-word-sized values to word-sized values. The Int8 +type is represnted by an Int# which is kept sign-extended at all times +(this is slightly naughty, because we're making assumptions about the +C calling convention rather early on in the compiler). However, given +this, the cons outweigh the pros. + +-} + + +machRepBitWidth :: MachRep -> Int +machRepBitWidth I8 = 8 +machRepBitWidth I16 = 16 +machRepBitWidth I32 = 32 +machRepBitWidth I64 = 64 +machRepBitWidth I128 = 128 +machRepBitWidth F32 = 32 +machRepBitWidth F64 = 64 +machRepBitWidth F80 = 80 + +machRepByteWidth :: MachRep -> Int +machRepByteWidth I8 = 1 +machRepByteWidth I16 = 2 +machRepByteWidth I32 = 4 +machRepByteWidth I64 = 8 +machRepByteWidth I128 = 16 +machRepByteWidth F32 = 4 +machRepByteWidth F64 = 8 +machRepByteWidth F80 = 10 + +-- log_2 of the width in bytes, useful for generating shifts. 
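-- (An illustrative aside, not part of the patch: for every rep except F80,
-- machRepByteWidth r == 2 ^ machRepLogWidth r, so scaling an index by the
-- element size is just a shift.  The indexing helper cmmIndexExpr above
-- relies on this: the n-th I64 element of an array is addressed as
-- base + (n << 3) rather than base + n * 8.)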
+machRepLogWidth :: MachRep -> Int +machRepLogWidth I8 = 0 +machRepLogWidth I16 = 1 +machRepLogWidth I32 = 2 +machRepLogWidth I64 = 3 +machRepLogWidth I128 = 4 +machRepLogWidth F32 = 2 +machRepLogWidth F64 = 3 +machRepLogWidth F80 = panic "machRepLogWidth: F80" + +isFloatingRep :: MachRep -> Bool +isFloatingRep F32 = True +isFloatingRep F64 = True +isFloatingRep F80 = True +isFloatingRep _ = False + +-- ----------------------------------------------------------------------------- +-- Hints + +{- +A hint gives a little more information about a data value. Hints are +used on the arguments to a foreign call, where the code generator needs +to know some extra information on top of the MachRep of each argument in +order to generate a correct call. +-} + +data MachHint + = NoHint + | PtrHint + | SignedHint + | FloatHint + deriving Eq + +mhStr NoHint = SLIT("NoHint") +mhStr PtrHint = SLIT("PtrHint") +mhStr SignedHint = SLIT("SignedHint") +mhStr FloatHint = SLIT("FloatHint") + +instance Outputable MachHint where + ppr hint = ptext (mhStr hint) + +-- ----------------------------------------------------------------------------- +-- MachOp + +{- | +Machine-level primops; ones which we can reasonably delegate to the +native code generators to handle. Basically contains C's primops +and no others. + +Nomenclature: all ops indicate width and signedness, where +appropriate. Widths: 8\/16\/32\/64 means the given size, obviously. +Nat means the operation works on STG word sized objects. +Signedness: S means signed, U means unsigned. For operations where +signedness is irrelevant or makes no difference (for example +integer add), the signedness component is omitted. + +An exception: NatP is a ptr-typed native word. From the point of +view of the native code generators this distinction is irrelevant, +but the C code generator sometimes needs this info to emit the +right casts. +-} + +data MachOp + + -- Integer operations + = MO_Add MachRep + | MO_Sub MachRep + | MO_Eq MachRep + | MO_Ne MachRep + | MO_Mul MachRep -- low word of multiply + | MO_S_MulMayOflo MachRep -- nonzero if signed multiply overflows + | MO_S_Quot MachRep -- signed / (same semantics as IntQuotOp) + | MO_S_Rem MachRep -- signed % (same semantics as IntRemOp) + | MO_S_Neg MachRep -- unary - + | MO_U_MulMayOflo MachRep -- nonzero if unsigned multiply overflows + | MO_U_Quot MachRep -- unsigned / (same semantics as WordQuotOp) + | MO_U_Rem MachRep -- unsigned % (same semantics as WordRemOp) + + -- Signed comparisons (floating-point comparisons also use these) + | MO_S_Ge MachRep + | MO_S_Le MachRep + | MO_S_Gt MachRep + | MO_S_Lt MachRep + + -- Unsigned comparisons + | MO_U_Ge MachRep + | MO_U_Le MachRep + | MO_U_Gt MachRep + | MO_U_Lt MachRep + + -- Bitwise operations. Not all of these may be supported at all sizes, + -- and only integral MachReps are valid. + | MO_And MachRep + | MO_Or MachRep + | MO_Xor MachRep + | MO_Not MachRep + | MO_Shl MachRep + | MO_U_Shr MachRep -- unsigned shift right + | MO_S_Shr MachRep -- signed shift right + + -- Conversions. Some of these will be NOPs. + -- Floating-point conversions use the signed variant. + | MO_S_Conv MachRep{-from-} MachRep{-to-} -- signed conversion + | MO_U_Conv MachRep{-from-} MachRep{-to-} -- unsigned conversion + + deriving (Eq, Show) + +pprMachOp :: MachOp -> SDoc +pprMachOp mo = text (show mo) + + +-- These MachOps tend to be implemented by foreign calls in some backends, +-- so we separate them out. 
In Cmm, these can only occur in a +-- statement position, in contrast to an ordinary MachOp which can occur +-- anywhere in an expression. +data CallishMachOp + = MO_F64_Pwr + | MO_F64_Sin + | MO_F64_Cos + | MO_F64_Tan + | MO_F64_Sinh + | MO_F64_Cosh + | MO_F64_Tanh + | MO_F64_Asin + | MO_F64_Acos + | MO_F64_Atan + | MO_F64_Log + | MO_F64_Exp + | MO_F64_Sqrt + | MO_F32_Pwr + | MO_F32_Sin + | MO_F32_Cos + | MO_F32_Tan + | MO_F32_Sinh + | MO_F32_Cosh + | MO_F32_Tanh + | MO_F32_Asin + | MO_F32_Acos + | MO_F32_Atan + | MO_F32_Log + | MO_F32_Exp + | MO_F32_Sqrt + deriving (Eq, Show) + +pprCallishMachOp :: CallishMachOp -> SDoc +pprCallishMachOp mo = text (show mo) + +-- ----------------------------------------------------------------------------- +-- Some common MachReps + +-- A 'wordRep' is a machine word on the target architecture +-- Specifically, it is the size of an Int#, Word#, Addr# +-- and the unit of allocation on the stack and the heap +-- Any pointer is also guaranteed to be a wordRep. + +wordRep | wORD_SIZE == 4 = I32 + | wORD_SIZE == 8 = I64 + | otherwise = panic "MachOp.wordRep: Unknown word size" + +halfWordRep | wORD_SIZE == 4 = I16 + | wORD_SIZE == 8 = I32 + | otherwise = panic "MachOp.halfWordRep: Unknown word size" + +mo_wordAdd = MO_Add wordRep +mo_wordSub = MO_Sub wordRep +mo_wordEq = MO_Eq wordRep +mo_wordNe = MO_Ne wordRep +mo_wordMul = MO_Mul wordRep +mo_wordSQuot = MO_S_Quot wordRep +mo_wordSRem = MO_S_Rem wordRep +mo_wordSNeg = MO_S_Neg wordRep +mo_wordUQuot = MO_U_Quot wordRep +mo_wordURem = MO_U_Rem wordRep + +mo_wordSGe = MO_S_Ge wordRep +mo_wordSLe = MO_S_Le wordRep +mo_wordSGt = MO_S_Gt wordRep +mo_wordSLt = MO_S_Lt wordRep + +mo_wordUGe = MO_U_Ge wordRep +mo_wordULe = MO_U_Le wordRep +mo_wordUGt = MO_U_Gt wordRep +mo_wordULt = MO_U_Lt wordRep + +mo_wordAnd = MO_And wordRep +mo_wordOr = MO_Or wordRep +mo_wordXor = MO_Xor wordRep +mo_wordNot = MO_Not wordRep +mo_wordShl = MO_Shl wordRep +mo_wordSShr = MO_S_Shr wordRep +mo_wordUShr = MO_U_Shr wordRep + +mo_u_8To32 = MO_U_Conv I8 I32 +mo_s_8To32 = MO_S_Conv I8 I32 +mo_u_16To32 = MO_U_Conv I16 I32 +mo_s_16To32 = MO_S_Conv I16 I32 + +mo_u_8ToWord = MO_U_Conv I8 wordRep +mo_s_8ToWord = MO_S_Conv I8 wordRep +mo_u_16ToWord = MO_U_Conv I16 wordRep +mo_s_16ToWord = MO_S_Conv I16 wordRep +mo_s_32ToWord = MO_S_Conv I32 wordRep +mo_u_32ToWord = MO_U_Conv I32 wordRep + +mo_WordTo8 = MO_U_Conv wordRep I8 +mo_WordTo16 = MO_U_Conv wordRep I16 +mo_WordTo32 = MO_U_Conv wordRep I32 + +mo_32To8 = MO_U_Conv I32 I8 +mo_32To16 = MO_U_Conv I32 I16 + +-- cIntRep is the MachRep for a C-language 'int' +#if SIZEOF_INT == 4 +cIntRep = I32 +#elif SIZEOF_INT == 8 +cIntRep = I64 +#endif + +#if SIZEOF_LONG == 4 +cLongRep = I32 +#elif SIZEOF_LONG == 8 +cLongRep = I64 +#endif + +-- ---------------------------------------------------------------------------- +-- isCommutableMachOp + +{- | +Returns 'True' if the MachOp has commutable arguments. This is used +in the platform-independent Cmm optimisations. + +If in doubt, return 'False'. This generates worse code on the +native routes, but is otherwise harmless. 
+-} +isCommutableMachOp :: MachOp -> Bool +isCommutableMachOp mop = + case mop of + MO_Add _ -> True + MO_Eq _ -> True + MO_Ne _ -> True + MO_Mul _ -> True + MO_S_MulMayOflo _ -> True + MO_U_MulMayOflo _ -> True + MO_And _ -> True + MO_Or _ -> True + MO_Xor _ -> True + _other -> False + +-- ---------------------------------------------------------------------------- +-- isAssociativeMachOp + +{- | +Returns 'True' if the MachOp is associative (i.e. @(x+y)+z == x+(y+z)@) +This is used in the platform-independent Cmm optimisations. + +If in doubt, return 'False'. This generates worse code on the +native routes, but is otherwise harmless. +-} +isAssociativeMachOp :: MachOp -> Bool +isAssociativeMachOp mop = + case mop of + MO_Add r -> not (isFloatingRep r) + MO_Mul r -> not (isFloatingRep r) + MO_And _ -> True + MO_Or _ -> True + MO_Xor _ -> True + _other -> False + +-- ---------------------------------------------------------------------------- +-- isComparisonMachOp + +{- | +Returns 'True' if the MachOp is a comparison. + +If in doubt, return False. This generates worse code on the +native routes, but is otherwise harmless. +-} +isComparisonMachOp :: MachOp -> Bool +isComparisonMachOp mop = + case mop of + MO_Eq _ -> True + MO_Ne _ -> True + MO_S_Ge _ -> True + MO_S_Le _ -> True + MO_S_Gt _ -> True + MO_S_Lt _ -> True + MO_U_Ge _ -> True + MO_U_Le _ -> True + MO_U_Gt _ -> True + MO_U_Lt _ -> True + _other -> False + +-- ----------------------------------------------------------------------------- +-- Inverting conditions + +-- Sometimes it's useful to be able to invert the sense of a +-- condition. Not all conditional tests are invertible: in +-- particular, floating point conditionals cannot be inverted, because +-- there exist floating-point values which return False for both senses +-- of a condition (eg. !(NaN > NaN) && !(NaN /<= NaN)). + +maybeInvertComparison :: MachOp -> Maybe MachOp +maybeInvertComparison op + = case op of + MO_Eq r | not (isFloatingRep r) -> Just (MO_Ne r) + MO_Ne r | not (isFloatingRep r) -> Just (MO_Eq r) + MO_U_Lt r | not (isFloatingRep r) -> Just (MO_U_Ge r) + MO_U_Gt r | not (isFloatingRep r) -> Just (MO_U_Le r) + MO_U_Le r | not (isFloatingRep r) -> Just (MO_U_Gt r) + MO_U_Ge r | not (isFloatingRep r) -> Just (MO_U_Lt r) + MO_S_Lt r | not (isFloatingRep r) -> Just (MO_S_Ge r) + MO_S_Gt r | not (isFloatingRep r) -> Just (MO_S_Le r) + MO_S_Le r | not (isFloatingRep r) -> Just (MO_S_Gt r) + MO_S_Ge r | not (isFloatingRep r) -> Just (MO_S_Lt r) + _other -> Nothing + +-- ---------------------------------------------------------------------------- +-- resultRepOfMachOp + +{- | +Returns the MachRep of the result of a MachOp. +-} +resultRepOfMachOp :: MachOp -> MachRep +resultRepOfMachOp mop = + case mop of + MO_Add r -> r + MO_Sub r -> r + MO_Eq r -> comparisonResultRep + MO_Ne r -> comparisonResultRep + MO_Mul r -> r + MO_S_MulMayOflo r -> r + MO_S_Quot r -> r + MO_S_Rem r -> r + MO_S_Neg r -> r + MO_U_MulMayOflo r -> r + MO_U_Quot r -> r + MO_U_Rem r -> r + + MO_S_Ge r -> comparisonResultRep + MO_S_Le r -> comparisonResultRep + MO_S_Gt r -> comparisonResultRep + MO_S_Lt r -> comparisonResultRep + + MO_U_Ge r -> comparisonResultRep + MO_U_Le r -> comparisonResultRep + MO_U_Gt r -> comparisonResultRep + MO_U_Lt r -> comparisonResultRep + + MO_And r -> r + MO_Or r -> r + MO_Xor r -> r + MO_Not r -> r + MO_Shl r -> r + MO_U_Shr r -> r + MO_S_Shr r -> r + + MO_S_Conv from to -> to + MO_U_Conv from to -> to + + +comparisonResultRep = wordRep -- is it? 
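-- Illustrative sketch (editorial, not part of the patch): a peephole in the
-- spirit of the block-reordering mentioned in PprC below could use
-- maybeInvertComparison to flip a conditional branch over an unconditional
-- one, turning `if (c) goto t; goto f;` into `if (!c) goto f; goto t;`,
-- which pays off when t is about to become the fall-through block.  The
-- hypothetical swapBranches is shown only as commented-out pseudocode,
-- since this module does not import the Cmm syntax it would need:
--
--   swapBranches :: [CmmStmt] -> [CmmStmt]
--   swapBranches (CmmCondBranch (CmmMachOp op args) t : CmmBranch f : rest)
--     | Just op' <- maybeInvertComparison op
--     = CmmCondBranch (CmmMachOp op' args) f : CmmBranch t : rest
--   swapBranches stmts = stmts
--
-- Floating-point comparisons yield Nothing above, so NaN-sensitive tests
-- are never inverted by such a pass.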
+ + +-- ----------------------------------------------------------------------------- +-- machOpArgReps + +-- | This function is used for debugging only: we can check whether an +-- application of a MachOp is "type-correct" by checking that the MachReps of +-- its arguments are the same as the MachOp expects. This is used when +-- linting a CmmExpr. + +machOpArgReps :: MachOp -> [MachRep] +machOpArgReps op = + case op of + MO_Add r -> [r,r] + MO_Sub r -> [r,r] + MO_Eq r -> [r,r] + MO_Ne r -> [r,r] + MO_Mul r -> [r,r] + MO_S_MulMayOflo r -> [r,r] + MO_S_Quot r -> [r,r] + MO_S_Rem r -> [r,r] + MO_S_Neg r -> [r] + MO_U_MulMayOflo r -> [r,r] + MO_U_Quot r -> [r,r] + MO_U_Rem r -> [r,r] + + MO_S_Ge r -> [r,r] + MO_S_Le r -> [r,r] + MO_S_Gt r -> [r,r] + MO_S_Lt r -> [r,r] + + MO_U_Ge r -> [r,r] + MO_U_Le r -> [r,r] + MO_U_Gt r -> [r,r] + MO_U_Lt r -> [r,r] + + MO_And r -> [r,r] + MO_Or r -> [r,r] + MO_Xor r -> [r,r] + MO_Not r -> [r] + MO_Shl r -> [r,wordRep] + MO_U_Shr r -> [r,wordRep] + MO_S_Shr r -> [r,wordRep] + + MO_S_Conv from to -> [from] + MO_U_Conv from to -> [from] diff --git a/compiler/cmm/PprC.hs b/compiler/cmm/PprC.hs new file mode 100644 index 0000000000..a8d30668b7 --- /dev/null +++ b/compiler/cmm/PprC.hs @@ -0,0 +1,1028 @@ +----------------------------------------------------------------------------- +-- +-- Pretty-printing of Cmm as C, suitable for feeding gcc +-- +-- (c) The University of Glasgow 2004 +-- +----------------------------------------------------------------------------- + +-- +-- Print Cmm as real C, for -fvia-C +-- +-- This is simpler than the old PprAbsC, because Cmm is "macro-expanded" +-- relative to the old AbstractC, and many oddities/decorations have +-- disappeared from the data type. +-- + +-- ToDo: save/restore volatile registers around calls. + +module PprC ( + writeCs, + pprStringInCStyle + ) where + +#include "HsVersions.h" + +-- Cmm stuff +import Cmm +import CLabel +import MachOp +import ForeignCall + +-- Utils +import DynFlags ( DynFlags, DynFlag(..), dopt ) +import Unique ( getUnique ) +import UniqSet +import FiniteMap +import UniqFM ( eltsUFM ) +import FastString +import Outputable +import Constants +import StaticFlags ( opt_Unregisterised ) + +-- The rest +import Data.List ( intersperse, groupBy ) +import Data.Bits ( shiftR ) +import Char ( ord, chr ) +import IO ( Handle ) +import DATA_BITS +import Data.Word ( Word8 ) + +#ifdef DEBUG +import PprCmm () -- instances only +-- import Debug.Trace +#endif + +#if __GLASGOW_HASKELL__ >= 504 +import Data.Array.ST +#endif +import MONAD_ST + +-- -------------------------------------------------------------------------- +-- Top level + +pprCs :: DynFlags -> [Cmm] -> SDoc +pprCs dflags cmms + = pprCode CStyle (vcat $ map (\c -> split_marker $$ pprC c) cmms) + where + split_marker + | dopt Opt_SplitObjs dflags = ptext SLIT("__STG_SPLIT_MARKER") + | otherwise = empty + +writeCs :: DynFlags -> Handle -> [Cmm] -> IO () +writeCs dflags handle cmms + = printForC handle (pprCs dflags cmms) + +-- -------------------------------------------------------------------------- +-- Now do some real work +-- +-- for fun, we could call cmmToCmm over the tops... 
+-- + +pprC :: Cmm -> SDoc +pprC (Cmm tops) = vcat $ intersperse (text "") $ map pprTop tops + +-- +-- top level procs +-- +pprTop :: CmmTop -> SDoc +pprTop (CmmProc info clbl _params blocks) = + (if not (null info) + then pprDataExterns info $$ + pprWordArray (entryLblToInfoLbl clbl) info + else empty) $$ + (case blocks of + [] -> empty + -- the first block doesn't get a label: + (BasicBlock _ stmts : rest) -> vcat [ + text "", + extern_decls, + (if (externallyVisibleCLabel clbl) + then mkFN_ else mkIF_) (pprCLabel clbl) <+> lbrace, + nest 8 temp_decls, + nest 8 mkFB_, + nest 8 (vcat (map pprStmt stmts)) $$ + vcat (map pprBBlock rest), + nest 8 mkFE_, + rbrace ] + ) + where + (temp_decls, extern_decls) = pprTempAndExternDecls blocks + + +-- Chunks of static data. + +-- We only handle (a) arrays of word-sized things and (b) strings. + +pprTop (CmmData _section _ds@[CmmDataLabel lbl, CmmString str]) = + hcat [ + pprLocalness lbl, ptext SLIT("char "), pprCLabel lbl, + ptext SLIT("[] = "), pprStringInCStyle str, semi + ] + +pprTop (CmmData _section _ds@[CmmDataLabel lbl, CmmUninitialised size]) = + hcat [ + pprLocalness lbl, ptext SLIT("char "), pprCLabel lbl, + brackets (int size), semi + ] + +pprTop top@(CmmData _section (CmmDataLabel lbl : lits)) = + pprDataExterns lits $$ + pprWordArray lbl lits + +-- these shouldn't appear? +pprTop (CmmData _ _) = panic "PprC.pprTop: can't handle this data" + + +-- -------------------------------------------------------------------------- +-- BasicBlocks are self-contained entities: they always end in a jump. +-- +-- Like nativeGen/AsmCodeGen, we could probably reorder blocks to turn +-- as many jumps as possible into fall throughs. +-- + +pprBBlock :: CmmBasicBlock -> SDoc +pprBBlock (BasicBlock lbl stmts) = + if null stmts then + pprTrace "pprC.pprBBlock: curious empty code block for" + (pprBlockId lbl) empty + else + nest 4 (pprBlockId lbl <> colon) $$ + nest 8 (vcat (map pprStmt stmts)) + +-- -------------------------------------------------------------------------- +-- Info tables. Just arrays of words. +-- See codeGen/ClosureInfo, and nativeGen/PprMach + +pprWordArray :: CLabel -> [CmmStatic] -> SDoc +pprWordArray lbl ds + = hcat [ pprLocalness lbl, ptext SLIT("StgWord") + , space, pprCLabel lbl, ptext SLIT("[] = {") ] + $$ nest 8 (commafy (pprStatics ds)) + $$ ptext SLIT("};") + +-- +-- has to be static, if it isn't globally visible +-- +pprLocalness :: CLabel -> SDoc +pprLocalness lbl | not $ externallyVisibleCLabel lbl = ptext SLIT("static ") + | otherwise = empty + +-- -------------------------------------------------------------------------- +-- Statements. +-- + +pprStmt :: CmmStmt -> SDoc + +pprStmt stmt = case stmt of + CmmNop -> empty + CmmComment s -> (hang (ptext SLIT("/*")) 3 (ftext s)) $$ ptext SLIT("*/") + + CmmAssign dest src -> pprAssign dest src + + CmmStore dest src + | rep == I64 && wordRep /= I64 + -> ptext SLIT("ASSIGN_Word64") <> + parens (mkP_ <> pprExpr1 dest <> comma <> pprExpr src) <> semi + + | rep == F64 && wordRep /= I64 + -> ptext SLIT("ASSIGN_DBL") <> + parens (mkP_ <> pprExpr1 dest <> comma <> pprExpr src) <> semi + + | otherwise + -> hsep [ pprExpr (CmmLoad dest rep), equals, pprExpr src <> semi ] + where + rep = cmmExprRep src + + CmmCall (CmmForeignCall fn cconv) results args volatile -> + -- Controversial: leave this out for now. 
+ -- pprUndef fn $$ + + pprCall ppr_fn cconv results args volatile + where + ppr_fn = case fn of + CmmLit (CmmLabel lbl) -> pprCLabel lbl + _other -> parens (cCast (pprCFunType cconv results args) fn) + -- for a dynamic call, cast the expression to + -- a function of the right type (we hope). + + -- we #undef a function before calling it: the FFI is supposed to be + -- an interface specifically to C, not to C+CPP. For one thing, this + -- makes the via-C route more compatible with the NCG. If macros + -- are being used for optimisation, then inline functions are probably + -- better anyway. + pprUndef (CmmLit (CmmLabel lbl)) = + ptext SLIT("#undef") <+> pprCLabel lbl + pprUndef _ = empty + + CmmCall (CmmPrim op) results args volatile -> + pprCall ppr_fn CCallConv results args volatile + where + ppr_fn = pprCallishMachOp_for_C op + + CmmBranch ident -> pprBranch ident + CmmCondBranch expr ident -> pprCondBranch expr ident + CmmJump lbl _params -> mkJMP_(pprExpr lbl) <> semi + CmmSwitch arg ids -> pprSwitch arg ids + +pprCFunType :: CCallConv -> [(CmmReg,MachHint)] -> [(CmmExpr,MachHint)] -> SDoc +pprCFunType cconv ress args + = hcat [ + res_type ress, + parens (text (ccallConvAttribute cconv) <> char '*'), + parens (commafy (map arg_type args)) + ] + where + res_type [] = ptext SLIT("void") + res_type [(one,hint)] = machRepHintCType (cmmRegRep one) hint + + arg_type (expr,hint) = machRepHintCType (cmmExprRep expr) hint + +-- --------------------------------------------------------------------- +-- unconditional branches +pprBranch :: BlockId -> SDoc +pprBranch ident = ptext SLIT("goto") <+> pprBlockId ident <> semi + + +-- --------------------------------------------------------------------- +-- conditional branches to local labels +pprCondBranch :: CmmExpr -> BlockId -> SDoc +pprCondBranch expr ident + = hsep [ ptext SLIT("if") , parens(pprExpr expr) , + ptext SLIT("goto") , (pprBlockId ident) <> semi ] + + +-- --------------------------------------------------------------------- +-- a local table branch +-- +-- we find the fall-through cases +-- +-- N.B. we remove Nothing's from the list of branches, as they are +-- 'undefined'. However, they may be defined one day, so we better +-- document this behaviour. +-- +pprSwitch :: CmmExpr -> [ Maybe BlockId ] -> SDoc +pprSwitch e maybe_ids + = let pairs = [ (ix, ident) | (ix,Just ident) <- zip [0..] maybe_ids ] + pairs2 = [ (map fst as, snd (head as)) | as <- groupBy sndEq pairs ] + in + (hang (ptext SLIT("switch") <+> parens ( pprExpr e ) <+> lbrace) + 4 (vcat ( map caseify pairs2 ))) + $$ rbrace + + where + sndEq (_,x) (_,y) = x == y + + -- fall through case + caseify (ix:ixs, ident) = vcat (map do_fallthrough ixs) $$ final_branch ix + where + do_fallthrough ix = + hsep [ ptext SLIT("case") , pprHexVal ix wordRep <> colon , + ptext SLIT("/* fall through */") ] + + final_branch ix = + hsep [ ptext SLIT("case") , pprHexVal ix wordRep <> colon , + ptext SLIT("goto") , (pprBlockId ident) <> semi ] + +-- --------------------------------------------------------------------- +-- Expressions. +-- + +-- C Types: the invariant is that the C expression generated by +-- +-- pprExpr e +-- +-- has a type in C which is also given by +-- +-- machRepCType (cmmExprRep e) +-- +-- (similar invariants apply to the rest of the pretty printer). 
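-- For instance (an illustrative case, not taken from the patch): on a
-- 64-bit target, CmmLoad <addr> I32 goes through the general CmmLoad case
-- below and prints as  *((StgWord32 *)<addr>),  and dereferencing an
-- StgWord32 * in C yields an StgWord32, which is exactly machRepCType I32.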
+ +pprExpr :: CmmExpr -> SDoc +pprExpr e = case e of + CmmLit lit -> pprLit lit + + CmmLoad e I64 | wordRep /= I64 + -> ptext SLIT("PK_Word64") <> parens (mkP_ <> pprExpr1 e) + + CmmLoad e F64 | wordRep /= I64 + -> ptext SLIT("PK_DBL") <> parens (mkP_ <> pprExpr1 e) + + CmmLoad (CmmReg r) rep + | isPtrReg r && rep == wordRep + -> char '*' <> pprAsPtrReg r + + CmmLoad (CmmRegOff r 0) rep + | isPtrReg r && rep == wordRep + -> char '*' <> pprAsPtrReg r + + CmmLoad (CmmRegOff r off) rep + | isPtrReg r && rep == wordRep + -- ToDo: check that the offset is a word multiple? + -> pprAsPtrReg r <> brackets (ppr (off `shiftR` wordShift)) + + CmmLoad expr rep -> + -- the general case: + char '*' <> parens (cCast (machRepPtrCType rep) expr) + + CmmReg reg -> pprCastReg reg + CmmRegOff reg 0 -> pprCastReg reg + + CmmRegOff reg i + | i > 0 -> pprRegOff (char '+') i + | otherwise -> pprRegOff (char '-') (-i) + where + pprRegOff op i' = pprCastReg reg <> op <> int i' + + CmmMachOp mop args -> pprMachOpApp mop args + +pprExpr1 :: CmmExpr -> SDoc +pprExpr1 (CmmLit lit) = pprLit1 lit +pprExpr1 e@(CmmReg _reg) = pprExpr e +pprExpr1 other = parens (pprExpr other) + +-- -------------------------------------------------------------------------- +-- MachOp applications + +pprMachOpApp :: MachOp -> [CmmExpr] -> SDoc + +pprMachOpApp op args + | isMulMayOfloOp op + = ptext SLIT("mulIntMayOflo") <> parens (commafy (map pprExpr args)) + where isMulMayOfloOp (MO_U_MulMayOflo _) = True + isMulMayOfloOp (MO_S_MulMayOflo _) = True + isMulMayOfloOp _ = False + +pprMachOpApp mop args + = case args of + -- dyadic + [x,y] -> pprArg x <+> pprMachOp_for_C mop <+> pprArg y + + -- unary + [x] -> pprMachOp_for_C mop <> parens (pprArg x) + + _ -> panic "PprC.pprMachOp : machop with wrong number of args" + + where + pprArg e | signedOp mop = cCast (machRepSignedCType (cmmExprRep e)) e + | otherwise = pprExpr1 e + +-- -------------------------------------------------------------------------- +-- Literals + +pprLit :: CmmLit -> SDoc +pprLit lit = case lit of + CmmInt i rep -> pprHexVal i rep + CmmFloat f rep -> parens (machRepCType rep) <> (rational f) + CmmLabel clbl -> mkW_ <> pprCLabelAddr clbl + CmmLabelOff clbl i -> mkW_ <> pprCLabelAddr clbl <> char '+' <> int i + CmmLabelDiffOff clbl1 clbl2 i + -- WARNING: + -- * the lit must occur in the info table clbl2 + -- * clbl1 must be an SRT, a slow entry point or a large bitmap + -- The Mangler is expected to convert any reference to an SRT, + -- a slow entry point or a large bitmap + -- from an info table to an offset. 
+ -> mkW_ <> pprCLabelAddr clbl1 <> char '+' <> int i + +pprCLabelAddr lbl = char '&' <> pprCLabel lbl + +pprLit1 :: CmmLit -> SDoc +pprLit1 lit@(CmmLabelOff _ _) = parens (pprLit lit) +pprLit1 lit@(CmmLabelDiffOff _ _ _) = parens (pprLit lit) +pprLit1 lit@(CmmFloat _ _) = parens (pprLit lit) +pprLit1 other = pprLit other + +-- --------------------------------------------------------------------------- +-- Static data + +pprStatics :: [CmmStatic] -> [SDoc] +pprStatics [] = [] +pprStatics (CmmStaticLit (CmmFloat f F32) : rest) + = pprLit1 (floatToWord f) : pprStatics rest +pprStatics (CmmStaticLit (CmmFloat f F64) : rest) + = map pprLit1 (doubleToWords f) ++ pprStatics rest +pprStatics (CmmStaticLit (CmmInt i I64) : rest) + | machRepByteWidth I32 == wORD_SIZE +#ifdef WORDS_BIGENDIAN + = pprStatics (CmmStaticLit (CmmInt q I32) : + CmmStaticLit (CmmInt r I32) : rest) +#else + = pprStatics (CmmStaticLit (CmmInt r I32) : + CmmStaticLit (CmmInt q I32) : rest) +#endif + where r = i .&. 0xffffffff + q = i `shiftR` 32 +pprStatics (CmmStaticLit lit : rest) + = pprLit1 lit : pprStatics rest +pprStatics (other : rest) + = pprPanic "pprWord" (pprStatic other) + +pprStatic :: CmmStatic -> SDoc +pprStatic s = case s of + + CmmStaticLit lit -> nest 4 (pprLit lit) + CmmAlign i -> nest 4 (ptext SLIT("/* align */") <+> int i) + CmmDataLabel clbl -> pprCLabel clbl <> colon + CmmUninitialised i -> nest 4 (mkC_ <> brackets (int i)) + + -- these should be inlined, like the old .hc + CmmString s' -> nest 4 (mkW_ <> parens(pprStringInCStyle s')) + + +-- --------------------------------------------------------------------------- +-- Block Ids + +pprBlockId :: BlockId -> SDoc +pprBlockId b = char '_' <> ppr (getUnique b) + +-- -------------------------------------------------------------------------- +-- Print a MachOp in a way suitable for emitting via C. +-- + +pprMachOp_for_C :: MachOp -> SDoc + +pprMachOp_for_C mop = case mop of + + -- Integer operations + MO_Add _ -> char '+' + MO_Sub _ -> char '-' + MO_Eq _ -> ptext SLIT("==") + MO_Ne _ -> ptext SLIT("!=") + MO_Mul _ -> char '*' + + MO_S_Quot _ -> char '/' + MO_S_Rem _ -> char '%' + MO_S_Neg _ -> char '-' + + MO_U_Quot _ -> char '/' + MO_U_Rem _ -> char '%' + + -- Signed comparisons (floating-point comparisons also use these) + -- & Unsigned comparisons + MO_S_Ge _ -> ptext SLIT(">=") + MO_S_Le _ -> ptext SLIT("<=") + MO_S_Gt _ -> char '>' + MO_S_Lt _ -> char '<' + + MO_U_Ge _ -> ptext SLIT(">=") + MO_U_Le _ -> ptext SLIT("<=") + MO_U_Gt _ -> char '>' + MO_U_Lt _ -> char '<' + + -- Bitwise operations. Not all of these may be supported at all + -- sizes, and only integral MachReps are valid. + MO_And _ -> char '&' + MO_Or _ -> char '|' + MO_Xor _ -> char '^' + MO_Not _ -> char '~' + MO_Shl _ -> ptext SLIT("<<") + MO_U_Shr _ -> ptext SLIT(">>") -- unsigned shift right + MO_S_Shr _ -> ptext SLIT(">>") -- signed shift right + +-- Conversions. Some of these will be NOPs. +-- Floating-point conversions use the signed variant. 
+-- We won't know to generate (void*) casts here, but maybe from +-- context elsewhere + +-- noop casts + MO_U_Conv I8 I8 -> empty + MO_U_Conv I16 I16 -> empty + MO_U_Conv I32 I32 -> empty + MO_U_Conv I64 I64 -> empty + MO_U_Conv I128 I128 -> empty + MO_S_Conv I8 I8 -> empty + MO_S_Conv I16 I16 -> empty + MO_S_Conv I32 I32 -> empty + MO_S_Conv I64 I64 -> empty + MO_S_Conv I128 I128 -> empty + + MO_U_Conv _from to -> parens (machRepCType to) + MO_S_Conv _from to -> parens (machRepSignedCType to) + + _ -> panic "PprC.pprMachOp_for_C: unknown machop" + +signedOp :: MachOp -> Bool +signedOp (MO_S_Quot _) = True +signedOp (MO_S_Rem _) = True +signedOp (MO_S_Neg _) = True +signedOp (MO_S_Ge _) = True +signedOp (MO_S_Le _) = True +signedOp (MO_S_Gt _) = True +signedOp (MO_S_Lt _) = True +signedOp (MO_S_Shr _) = True +signedOp (MO_S_Conv _ _) = True +signedOp _ = False + +-- --------------------------------------------------------------------- +-- tend to be implemented by foreign calls + +pprCallishMachOp_for_C :: CallishMachOp -> SDoc + +pprCallishMachOp_for_C mop + = case mop of + MO_F64_Pwr -> ptext SLIT("pow") + MO_F64_Sin -> ptext SLIT("sin") + MO_F64_Cos -> ptext SLIT("cos") + MO_F64_Tan -> ptext SLIT("tan") + MO_F64_Sinh -> ptext SLIT("sinh") + MO_F64_Cosh -> ptext SLIT("cosh") + MO_F64_Tanh -> ptext SLIT("tanh") + MO_F64_Asin -> ptext SLIT("asin") + MO_F64_Acos -> ptext SLIT("acos") + MO_F64_Atan -> ptext SLIT("atan") + MO_F64_Log -> ptext SLIT("log") + MO_F64_Exp -> ptext SLIT("exp") + MO_F64_Sqrt -> ptext SLIT("sqrt") + MO_F32_Pwr -> ptext SLIT("powf") + MO_F32_Sin -> ptext SLIT("sinf") + MO_F32_Cos -> ptext SLIT("cosf") + MO_F32_Tan -> ptext SLIT("tanf") + MO_F32_Sinh -> ptext SLIT("sinhf") + MO_F32_Cosh -> ptext SLIT("coshf") + MO_F32_Tanh -> ptext SLIT("tanhf") + MO_F32_Asin -> ptext SLIT("asinf") + MO_F32_Acos -> ptext SLIT("acosf") + MO_F32_Atan -> ptext SLIT("atanf") + MO_F32_Log -> ptext SLIT("logf") + MO_F32_Exp -> ptext SLIT("expf") + MO_F32_Sqrt -> ptext SLIT("sqrtf") + +-- --------------------------------------------------------------------- +-- Useful #defines +-- + +mkJMP_, mkFN_, mkIF_ :: SDoc -> SDoc + +mkJMP_ i = ptext SLIT("JMP_") <> parens i +mkFN_ i = ptext SLIT("FN_") <> parens i -- externally visible function +mkIF_ i = ptext SLIT("IF_") <> parens i -- locally visible + + +mkFB_, mkFE_ :: SDoc +mkFB_ = ptext SLIT("FB_") -- function code begin +mkFE_ = ptext SLIT("FE_") -- function code end + +-- from includes/Stg.h +-- +mkC_,mkW_,mkP_,mkPP_,mkI_,mkA_,mkD_,mkF_,mkB_,mkL_,mkLI_,mkLW_ :: SDoc + +mkC_ = ptext SLIT("(C_)") -- StgChar +mkW_ = ptext SLIT("(W_)") -- StgWord +mkP_ = ptext SLIT("(P_)") -- StgWord* +mkPP_ = ptext SLIT("(PP_)") -- P_* +mkI_ = ptext SLIT("(I_)") -- StgInt +mkA_ = ptext SLIT("(A_)") -- StgAddr +mkD_ = ptext SLIT("(D_)") -- const StgWord* +mkF_ = ptext SLIT("(F_)") -- StgFunPtr +mkB_ = ptext SLIT("(B_)") -- StgByteArray +mkL_ = ptext SLIT("(L_)") -- StgClosurePtr + +mkLI_ = ptext SLIT("(LI_)") -- StgInt64 +mkLW_ = ptext SLIT("(LW_)") -- StgWord64 + + +-- --------------------------------------------------------------------- +-- +-- Assignments +-- +-- Generating assignments is what we're all about, here +-- +pprAssign :: CmmReg -> CmmExpr -> SDoc + +-- dest is a reg, rhs is a reg +pprAssign r1 (CmmReg r2) + | not (isStrangeTypeReg r1) && not (isStrangeTypeReg r2) + || isPtrReg r1 && isPtrReg r2 + = hcat [ pprAsPtrReg r1, equals, pprAsPtrReg r2, semi ] + +-- dest is a reg, rhs is a CmmRegOff +pprAssign r1 (CmmRegOff r2 off) + | not (isStrangeTypeReg 
r1) && not (isStrangeTypeReg r2) + || isPtrReg r1 && isPtrReg r2 + = hcat [ pprAsPtrReg r1, equals, pprAsPtrReg r2, op, int off', semi ] + where + off1 | isPtrReg r2 = off `shiftR` wordShift + | otherwise = off + + (op,off') | off >= 0 = (char '+', off1) + | otherwise = (char '-', -off1) + +-- dest is a reg, rhs is anything. +-- We can't cast the lvalue, so we have to cast the rhs if necessary. Casting +-- the lvalue elicits a warning from new GCC versions (3.4+). +pprAssign r1 r2 + | isPtrReg r1 + = pprAsPtrReg r1 <> ptext SLIT(" = ") <> mkP_ <> pprExpr1 r2 <> semi + | Just ty <- strangeRegType r1 + = pprReg r1 <> ptext SLIT(" = ") <> parens ty <> pprExpr1 r2 <> semi + | otherwise + = pprReg r1 <> ptext SLIT(" = ") <> pprExpr r2 <> semi + +-- --------------------------------------------------------------------- +-- Registers + +pprCastReg reg + | isStrangeTypeReg reg = mkW_ <> pprReg reg + | otherwise = pprReg reg + +-- True if the register has type StgPtr in C, otherwise it has an +-- integer type. We need to take care with pointer arithmetic on registers +-- with type StgPtr. +isPtrReg :: CmmReg -> Bool +isPtrReg (CmmLocal _) = False +isPtrReg (CmmGlobal r) = isPtrGlobalReg r + +isPtrGlobalReg :: GlobalReg -> Bool +isPtrGlobalReg (VanillaReg n) = True +isPtrGlobalReg Sp = True +isPtrGlobalReg Hp = True +isPtrGlobalReg HpLim = True +isPtrGlobalReg SpLim = True +isPtrGlobalReg _ = False + +-- True if in C this register doesn't have the type given by +-- (machRepCType (cmmRegRep reg)), so it has to be cast. +isStrangeTypeReg :: CmmReg -> Bool +isStrangeTypeReg (CmmLocal _) = False +isStrangeTypeReg (CmmGlobal g) = isStrangeTypeGlobal g + +isStrangeTypeGlobal :: GlobalReg -> Bool +isStrangeTypeGlobal CurrentTSO = True +isStrangeTypeGlobal CurrentNursery = True +isStrangeTypeGlobal BaseReg = True +isStrangeTypeGlobal r = isPtrGlobalReg r + +strangeRegType :: CmmReg -> Maybe SDoc +strangeRegType (CmmGlobal CurrentTSO) = Just (ptext SLIT("struct StgTSO_ *")) +strangeRegType (CmmGlobal CurrentNursery) = Just (ptext SLIT("struct bdescr_ *")) +strangeRegType (CmmGlobal BaseReg) = Just (ptext SLIT("struct StgRegTable_ *")) +strangeRegType _ = Nothing + +-- pprReg just prints the register name. 
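-- (A worked example for the pointer-register cases above, illustrative
-- only: on a 32-bit target, assigning CmmRegOff (CmmGlobal Sp) 8 to Sp
-- emits "Sp=Sp+2;", because Sp has C type P_ (StgWord *) and C pointer
-- arithmetic already scales by the word size; that is what the
-- off `shiftR` wordShift adjustment accounts for.)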
+-- +pprReg :: CmmReg -> SDoc +pprReg r = case r of + CmmLocal local -> pprLocalReg local + CmmGlobal global -> pprGlobalReg global + +pprAsPtrReg :: CmmReg -> SDoc +pprAsPtrReg (CmmGlobal (VanillaReg n)) = char 'R' <> int n <> ptext SLIT(".p") +pprAsPtrReg other_reg = pprReg other_reg + +pprGlobalReg :: GlobalReg -> SDoc +pprGlobalReg gr = case gr of + VanillaReg n -> char 'R' <> int n <> ptext SLIT(".w") + FloatReg n -> char 'F' <> int n + DoubleReg n -> char 'D' <> int n + LongReg n -> char 'L' <> int n + Sp -> ptext SLIT("Sp") + SpLim -> ptext SLIT("SpLim") + Hp -> ptext SLIT("Hp") + HpLim -> ptext SLIT("HpLim") + CurrentTSO -> ptext SLIT("CurrentTSO") + CurrentNursery -> ptext SLIT("CurrentNursery") + HpAlloc -> ptext SLIT("HpAlloc") + BaseReg -> ptext SLIT("BaseReg") + GCEnter1 -> ptext SLIT("stg_gc_enter_1") + GCFun -> ptext SLIT("stg_gc_fun") + +pprLocalReg :: LocalReg -> SDoc +pprLocalReg (LocalReg uniq _rep) = char '_' <> ppr uniq + +-- ----------------------------------------------------------------------------- +-- Foreign Calls + +pprCall :: SDoc -> CCallConv -> [(CmmReg,MachHint)] -> [(CmmExpr,MachHint)] + -> Maybe [GlobalReg] -> SDoc + +pprCall ppr_fn cconv results args vols + | not (is_cish cconv) + = panic "pprCall: unknown calling convention" + + | otherwise + = save vols $$ + ptext SLIT("CALLER_SAVE_SYSTEM") $$ +#if x86_64_TARGET_ARCH + -- HACK around gcc optimisations. + -- x86_64 needs a __DISCARD__() here, to create a barrier between + -- putting the arguments into temporaries and passing the arguments + -- to the callee, because the argument expressions may refer to + -- machine registers that are also used for passing arguments in the + -- C calling convention. + (if (not opt_Unregisterised) + then ptext SLIT("__DISCARD__();") + else empty) $$ +#endif + ppr_assign results (ppr_fn <> parens (commafy (map pprArg args))) <> semi $$ + ptext SLIT("CALLER_RESTORE_SYSTEM") $$ + restore vols + where + ppr_assign [] rhs = rhs + ppr_assign [(reg@(CmmGlobal BaseReg), hint)] rhs + | Just ty <- strangeRegType reg + = ptext SLIT("ASSIGN_BaseReg") <> parens (parens ty <> rhs) + -- BaseReg is special, sometimes it isn't an lvalue and we + -- can't assign to it. + ppr_assign [(one,hint)] rhs + | Just ty <- strangeRegType one + = pprReg one <> ptext SLIT(" = ") <> parens ty <> rhs + | otherwise + = pprReg one <> ptext SLIT(" = ") + <> pprUnHint hint (cmmRegRep one) <> rhs + ppr_assign _other _rhs = panic "pprCall: multiple results" + + pprArg (expr, PtrHint) + = cCast (ptext SLIT("void *")) expr + -- see comment by machRepHintCType below + pprArg (expr, SignedHint) + = cCast (machRepSignedCType (cmmExprRep expr)) expr + pprArg (expr, _other) + = pprExpr expr + + pprUnHint PtrHint rep = parens (machRepCType rep) + pprUnHint SignedHint rep = parens (machRepCType rep) + pprUnHint _ _ = empty + + save = save_restore SLIT("CALLER_SAVE") + restore = save_restore SLIT("CALLER_RESTORE") + + -- Nothing says "I don't know what's live; save everything" + -- CALLER_SAVE_USER is defined in ghc/includes/Regs.h + save_restore txt Nothing = ptext txt <> ptext SLIT("_USER") + save_restore txt (Just these) = vcat (map saveRestoreGlobal these) + where saveRestoreGlobal r = ptext txt <> char '_' <> pprGlobalRegName r + +pprGlobalRegName :: GlobalReg -> SDoc +pprGlobalRegName gr = case gr of + VanillaReg n -> char 'R' <> int n -- without the .w suffix + _ -> pprGlobalReg gr + +-- Currently we only have these two calling conventions, but this might +-- change in the future... 
+is_cish CCallConv = True +is_cish StdCallConv = True + +-- --------------------------------------------------------------------- +-- Find and print local and external declarations for a list of +-- Cmm statements. +-- +pprTempAndExternDecls :: [CmmBasicBlock] -> (SDoc{-temps-}, SDoc{-externs-}) +pprTempAndExternDecls stmts + = (vcat (map pprTempDecl (eltsUFM temps)), + vcat (map (pprExternDecl False{-ToDo-}) (keysFM lbls))) + where (temps, lbls) = runTE (mapM_ te_BB stmts) + +pprDataExterns :: [CmmStatic] -> SDoc +pprDataExterns statics + = vcat (map (pprExternDecl False{-ToDo-}) (keysFM lbls)) + where (_, lbls) = runTE (mapM_ te_Static statics) + +pprTempDecl :: LocalReg -> SDoc +pprTempDecl l@(LocalReg _uniq rep) + = hcat [ machRepCType rep, space, pprLocalReg l, semi ] + +pprExternDecl :: Bool -> CLabel -> SDoc +pprExternDecl in_srt lbl + -- do not print anything for "known external" things + | not (needsCDecl lbl) = empty + | otherwise = + hcat [ visibility, label_type (labelType lbl), + lparen, dyn_wrapper (pprCLabel lbl), text ");" ] + where + dyn_wrapper d + | in_srt && labelDynamic lbl = text "DLL_IMPORT_DATA_VAR" <> parens d + | otherwise = d + + label_type CodeLabel = ptext SLIT("F_") + label_type DataLabel = ptext SLIT("I_") + + visibility + | externallyVisibleCLabel lbl = char 'E' + | otherwise = char 'I' + + +type TEState = (UniqSet LocalReg, FiniteMap CLabel ()) +newtype TE a = TE { unTE :: TEState -> (a, TEState) } + +instance Monad TE where + TE m >>= k = TE $ \s -> case m s of (a, s') -> unTE (k a) s' + return a = TE $ \s -> (a, s) + +te_lbl :: CLabel -> TE () +te_lbl lbl = TE $ \(temps,lbls) -> ((), (temps, addToFM lbls lbl ())) + +te_temp :: LocalReg -> TE () +te_temp r = TE $ \(temps,lbls) -> ((), (addOneToUniqSet temps r, lbls)) + +runTE :: TE () -> TEState +runTE (TE m) = snd (m (emptyUniqSet, emptyFM)) + +te_Static :: CmmStatic -> TE () +te_Static (CmmStaticLit lit) = te_Lit lit +te_Static _ = return () + +te_BB :: CmmBasicBlock -> TE () +te_BB (BasicBlock _ ss) = mapM_ te_Stmt ss + +te_Lit :: CmmLit -> TE () +te_Lit (CmmLabel l) = te_lbl l +te_Lit (CmmLabelOff l _) = te_lbl l +te_Lit (CmmLabelDiffOff l1 l2 _) = te_lbl l1 +te_Lit _ = return () + +te_Stmt :: CmmStmt -> TE () +te_Stmt (CmmAssign r e) = te_Reg r >> te_Expr e +te_Stmt (CmmStore l r) = te_Expr l >> te_Expr r +te_Stmt (CmmCall _ rs es _) = mapM_ (te_Reg.fst) rs >> + mapM_ (te_Expr.fst) es +te_Stmt (CmmCondBranch e _) = te_Expr e +te_Stmt (CmmSwitch e _) = te_Expr e +te_Stmt (CmmJump e _) = te_Expr e +te_Stmt _ = return () + +te_Expr :: CmmExpr -> TE () +te_Expr (CmmLit lit) = te_Lit lit +te_Expr (CmmLoad e _) = te_Expr e +te_Expr (CmmReg r) = te_Reg r +te_Expr (CmmMachOp _ es) = mapM_ te_Expr es +te_Expr (CmmRegOff r _) = te_Reg r + +te_Reg :: CmmReg -> TE () +te_Reg (CmmLocal l) = te_temp l +te_Reg _ = return () + + +-- --------------------------------------------------------------------- +-- C types for MachReps + +cCast :: SDoc -> CmmExpr -> SDoc +cCast ty expr = parens ty <> pprExpr1 expr + +-- This is for finding the types of foreign call arguments. For a pointer +-- argument, we always cast the argument to (void *), to avoid warnings from +-- the C compiler. 
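-- For example (illustrative): a PtrHint argument is passed as
-- (void *)<arg>, a SignedHint argument is cast to its signed C type
-- (e.g. StgInt32 for an I32 argument on a 64-bit target), and an
-- unhinted argument is printed with no extra cast.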
+machRepHintCType :: MachRep -> MachHint -> SDoc +machRepHintCType rep PtrHint = ptext SLIT("void *") +machRepHintCType rep SignedHint = machRepSignedCType rep +machRepHintCType rep _other = machRepCType rep + +machRepPtrCType :: MachRep -> SDoc +machRepPtrCType r | r == wordRep = ptext SLIT("P_") + | otherwise = machRepCType r <> char '*' + +machRepCType :: MachRep -> SDoc +machRepCType r | r == wordRep = ptext SLIT("W_") + | otherwise = sized_type + where sized_type = case r of + I8 -> ptext SLIT("StgWord8") + I16 -> ptext SLIT("StgWord16") + I32 -> ptext SLIT("StgWord32") + I64 -> ptext SLIT("StgWord64") + F32 -> ptext SLIT("StgFloat") -- ToDo: correct? + F64 -> ptext SLIT("StgDouble") + _ -> panic "machRepCType" + +machRepSignedCType :: MachRep -> SDoc +machRepSignedCType r | r == wordRep = ptext SLIT("I_") + | otherwise = sized_type + where sized_type = case r of + I8 -> ptext SLIT("StgInt8") + I16 -> ptext SLIT("StgInt16") + I32 -> ptext SLIT("StgInt32") + I64 -> ptext SLIT("StgInt64") + F32 -> ptext SLIT("StgFloat") -- ToDo: correct? + F64 -> ptext SLIT("StgDouble") + _ -> panic "machRepCType" + +-- --------------------------------------------------------------------- +-- print strings as valid C strings + +pprStringInCStyle :: [Word8] -> SDoc +pprStringInCStyle s = doubleQuotes (text (concatMap charToC s)) + +charToC :: Word8 -> String +charToC w = + case chr (fromIntegral w) of + '\"' -> "\\\"" + '\'' -> "\\\'" + '\\' -> "\\\\" + c | c >= ' ' && c <= '~' -> [c] + | otherwise -> ['\\', + chr (ord '0' + ord c `div` 64), + chr (ord '0' + ord c `div` 8 `mod` 8), + chr (ord '0' + ord c `mod` 8)] + +-- --------------------------------------------------------------------------- +-- Initialising static objects with floating-point numbers. We can't +-- just emit the floating point number, because C will cast it to an int +-- by rounding it. We want the actual bit-representation of the float. + +-- This is a hack to turn the floating point numbers into ints that we +-- can safely initialise to static locations. 
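-- For instance (illustrative): floatToWord 1 produces the word
-- 0x3f800000, the IEEE-754 single-precision bit pattern of 1.0, and that
-- is what ends up in the emitted static initialiser rather than a C
-- floating-point literal.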
+ +big_doubles + | machRepByteWidth F64 == 2 * wORD_SIZE = True + | machRepByteWidth F64 == wORD_SIZE = False + | otherwise = panic "big_doubles" + +#if __GLASGOW_HASKELL__ >= 504 +newFloatArray :: (Int,Int) -> ST s (STUArray s Int Float) +newFloatArray = newArray_ + +newDoubleArray :: (Int,Int) -> ST s (STUArray s Int Double) +newDoubleArray = newArray_ + +castFloatToIntArray :: STUArray s Int Float -> ST s (STUArray s Int Int) +castFloatToIntArray = castSTUArray + +castDoubleToIntArray :: STUArray s Int Double -> ST s (STUArray s Int Int) +castDoubleToIntArray = castSTUArray + +writeFloatArray :: STUArray s Int Float -> Int -> Float -> ST s () +writeFloatArray = writeArray + +writeDoubleArray :: STUArray s Int Double -> Int -> Double -> ST s () +writeDoubleArray = writeArray + +readIntArray :: STUArray s Int Int -> Int -> ST s Int +readIntArray = readArray + +#else + +castFloatToIntArray :: MutableByteArray s t -> ST s (MutableByteArray s t) +castFloatToIntArray = return + +castDoubleToIntArray :: MutableByteArray s t -> ST s (MutableByteArray s t) +castDoubleToIntArray = return + +#endif + +-- floats are always 1 word +floatToWord :: Rational -> CmmLit +floatToWord r + = runST (do + arr <- newFloatArray ((0::Int),0) + writeFloatArray arr 0 (fromRational r) + arr' <- castFloatToIntArray arr + i <- readIntArray arr' 0 + return (CmmInt (toInteger i) wordRep) + ) + +doubleToWords :: Rational -> [CmmLit] +doubleToWords r + | big_doubles -- doubles are 2 words + = runST (do + arr <- newDoubleArray ((0::Int),1) + writeDoubleArray arr 0 (fromRational r) + arr' <- castDoubleToIntArray arr + i1 <- readIntArray arr' 0 + i2 <- readIntArray arr' 1 + return [ CmmInt (toInteger i1) wordRep + , CmmInt (toInteger i2) wordRep + ] + ) + | otherwise -- doubles are 1 word + = runST (do + arr <- newDoubleArray ((0::Int),0) + writeDoubleArray arr 0 (fromRational r) + arr' <- castDoubleToIntArray arr + i <- readIntArray arr' 0 + return [ CmmInt (toInteger i) wordRep ] + ) + +-- --------------------------------------------------------------------------- +-- Utils + +wordShift :: Int +wordShift = machRepLogWidth wordRep + +commafy :: [SDoc] -> SDoc +commafy xs = hsep $ punctuate comma xs + +-- Print in C hex format: 0x13fa +pprHexVal :: Integer -> MachRep -> SDoc +pprHexVal 0 _ = ptext SLIT("0x0") +pprHexVal w rep + | w < 0 = parens (char '-' <> ptext SLIT("0x") <> go (-w) <> repsuffix rep) + | otherwise = ptext SLIT("0x") <> go w <> repsuffix rep + where + -- type suffix for literals: + -- Integer literals are unsigned in Cmm/C. We explicitly cast to + -- signed values for doing signed operations, but at all other + -- times values are unsigned. This also helps eliminate occasional + -- warnings about integer overflow from gcc. 
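        -- (illustrative: pprHexVal 255 I32 renders as 0xffU, and negative
        -- literals are parenthesised, e.g. pprHexVal (-1) I32 gives (-0x1U))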
+ + -- on 32-bit platforms, add "ULL" to 64-bit literals + repsuffix I64 | wORD_SIZE == 4 = ptext SLIT("ULL") + -- on 64-bit platforms with 32-bit int, add "L" to 64-bit literals + repsuffix I64 | cINT_SIZE == 4 = ptext SLIT("UL") + repsuffix _ = char 'U' + + go 0 = empty + go w' = go q <> dig + where + (q,r) = w' `quotRem` 16 + dig | r < 10 = char (chr (fromInteger r + ord '0')) + | otherwise = char (chr (fromInteger r - 10 + ord 'a')) + diff --git a/compiler/cmm/PprCmm.hs b/compiler/cmm/PprCmm.hs new file mode 100644 index 0000000000..6e8367d662 --- /dev/null +++ b/compiler/cmm/PprCmm.hs @@ -0,0 +1,462 @@ +---------------------------------------------------------------------------- +-- +-- Pretty-printing of Cmm as (a superset of) C-- +-- +-- (c) The University of Glasgow 2004 +-- +----------------------------------------------------------------------------- + +-- +-- This is where we walk over Cmm emitting an external representation, +-- suitable for parsing, in a syntax strongly reminiscent of C--. This +-- is the "External Core" for the Cmm layer. +-- +-- As such, this should be a well-defined syntax: we want it to look nice. +-- Thus, we try wherever possible to use syntax defined in [1], +-- "The C-- Reference Manual", http://www.cminusminus.org/. We differ +-- slightly, in some cases. For one, we use I8 .. I64 for types, rather +-- than C--'s bits8 .. bits64. +-- +-- We try to ensure that all information available in the abstract +-- syntax is reproduced, or reproducible, in the concrete syntax. +-- Data that is not in printed out can be reconstructed according to +-- conventions used in the pretty printer. There are at least two such +-- cases: +-- 1) if a value has wordRep type, the type is not appended in the +-- output. +-- 2) MachOps that operate over wordRep type are printed in a +-- C-style, rather than as their internal MachRep name. +-- +-- These conventions produce much more readable Cmm output. +-- +-- A useful example pass over Cmm is in nativeGen/MachCodeGen.hs +-- + +module PprCmm ( + writeCmms, pprCmms, pprCmm, pprStmt, pprExpr + ) where + +#include "HsVersions.h" + +import Cmm +import CmmUtils ( isTrivialCmmExpr ) +import MachOp ( MachOp(..), pprMachOp, MachRep(..), wordRep ) +import CLabel ( pprCLabel, mkForeignLabel, entryLblToInfoLbl ) + +import ForeignCall ( CCallConv(..) 
) +import Unique ( getUnique ) +import Outputable +import FastString ( mkFastString ) + +import Data.List ( intersperse, groupBy ) +import IO ( Handle ) +import Maybe ( isJust ) +import Data.Char ( chr ) + +pprCmms :: [Cmm] -> SDoc +pprCmms cmms = pprCode CStyle (vcat (intersperse separator $ map ppr cmms)) + where + separator = space $$ ptext SLIT("-------------------") $$ space + +writeCmms :: Handle -> [Cmm] -> IO () +writeCmms handle cmms = printForC handle (pprCmms cmms) + +----------------------------------------------------------------------------- + +instance Outputable Cmm where + ppr c = pprCmm c + +instance Outputable CmmTop where + ppr t = pprTop t + +instance Outputable CmmBasicBlock where + ppr b = pprBBlock b + +instance Outputable CmmStmt where + ppr s = pprStmt s + +instance Outputable CmmExpr where + ppr e = pprExpr e + +instance Outputable CmmReg where + ppr e = pprReg e + +instance Outputable GlobalReg where + ppr e = pprGlobalReg e + +----------------------------------------------------------------------------- + +pprCmm :: Cmm -> SDoc +pprCmm (Cmm tops) = vcat $ intersperse (text "") $ map pprTop tops + +-- -------------------------------------------------------------------------- +-- Top level `procedure' blocks. The info tables, if not null, are +-- printed in the style of C--'s 'stackdata' declaration, just inside +-- the proc body, and are labelled with the procedure name ++ "_info". +-- +pprTop :: CmmTop -> SDoc +pprTop (CmmProc info lbl params blocks ) + + = vcat [ pprCLabel lbl <> parens (commafy $ map pprLocalReg params) <+> lbrace + , nest 8 $ pprInfo info lbl + , nest 4 $ vcat (map ppr blocks) + , rbrace ] + + where + pprInfo [] _ = empty + pprInfo i label = + (hang (pprCLabel (entryLblToInfoLbl label) <+> lbrace ) + 4 $ vcat (map pprStatic i)) + $$ rbrace + +-- -------------------------------------------------------------------------- +-- We follow [1], 4.5 +-- +-- section "data" { ... } +-- +pprTop (CmmData section ds) = + (hang (pprSection section <+> lbrace) 4 (vcat (map pprStatic ds))) + $$ rbrace + + +-- -------------------------------------------------------------------------- +-- Basic blocks look like assembly blocks. +-- lbl: stmt ; stmt ; .. +pprBBlock :: CmmBasicBlock -> SDoc +pprBBlock (BasicBlock ident stmts) = + hang (pprBlockId ident <> colon) 4 (vcat (map ppr stmts)) + +-- -------------------------------------------------------------------------- +-- Statements. C-- usually, exceptions to this should be obvious. 
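-- (Illustrative renderings, not from the patch: an assignment
--  CmmAssign reg e prints as  reg = <e>;  and a store
--  CmmStore (CmmRegOff (CmmGlobal Sp) 4) e, where e has rep I32, prints
--  as  I32[Sp + 4] = <e>;)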
+-- +pprStmt :: CmmStmt -> SDoc +pprStmt stmt = case stmt of + + -- ; + CmmNop -> semi + + -- // text + CmmComment s -> text "//" <+> ftext s + + -- reg = expr; + CmmAssign reg expr -> ppr reg <+> equals <+> ppr expr <> semi + + -- rep[lv] = expr; + CmmStore lv expr -> rep <> brackets(ppr lv) <+> equals <+> ppr expr <> semi + where + rep = ppr ( cmmExprRep expr ) + + -- call "ccall" foo(x, y)[r1, r2]; + -- ToDo ppr volatile + CmmCall (CmmForeignCall fn cconv) results args _volatile -> + hcat [ ptext SLIT("call"), space, + doubleQuotes(ppr cconv), space, + target fn, parens ( commafy $ map ppr args ), + (if null results + then empty + else brackets( commafy $ map ppr results)), semi ] + where + target (CmmLit lit) = pprLit lit + target fn' = parens (ppr fn') + + CmmCall (CmmPrim op) results args volatile -> + pprStmt (CmmCall (CmmForeignCall (CmmLit lbl) CCallConv) + results args volatile) + where + lbl = CmmLabel (mkForeignLabel (mkFastString (show op)) Nothing False) + + CmmBranch ident -> genBranch ident + CmmCondBranch expr ident -> genCondBranch expr ident + CmmJump expr params -> genJump expr params + CmmSwitch arg ids -> genSwitch arg ids + +-- -------------------------------------------------------------------------- +-- goto local label. [1], section 6.6 +-- +-- goto lbl; +-- +genBranch :: BlockId -> SDoc +genBranch ident = + ptext SLIT("goto") <+> pprBlockId ident <> semi + +-- -------------------------------------------------------------------------- +-- Conditional. [1], section 6.4 +-- +-- if (expr) { goto lbl; } +-- +genCondBranch :: CmmExpr -> BlockId -> SDoc +genCondBranch expr ident = + hsep [ ptext SLIT("if") + , parens(ppr expr) + , ptext SLIT("goto") + , pprBlockId ident <> semi ] + +-- -------------------------------------------------------------------------- +-- A tail call. [1], Section 6.9 +-- +-- jump foo(a, b, c); +-- +genJump :: CmmExpr -> [LocalReg] -> SDoc +genJump expr actuals = + + hcat [ ptext SLIT("jump") + , space + , if isTrivialCmmExpr expr + then pprExpr expr + else case expr of + CmmLoad (CmmReg _) _ -> pprExpr expr + _ -> parens (pprExpr expr) + , pprActuals actuals + , semi ] + + where + pprActuals [] = empty + pprActuals as = parens ( commafy $ map pprLocalReg as ) + +-- -------------------------------------------------------------------------- +-- Tabled jump to local label +-- +-- The syntax is from [1], section 6.5 +-- +-- switch [0 .. n] (expr) { case ... ; } +-- +genSwitch :: CmmExpr -> [Maybe BlockId] -> SDoc +genSwitch expr maybe_ids + + = let pairs = groupBy snds (zip [0 .. ] maybe_ids ) + + in hang (hcat [ ptext SLIT("switch [0 .. 
") + , int (length maybe_ids - 1) + , ptext SLIT("] ") + , if isTrivialCmmExpr expr + then pprExpr expr + else parens (pprExpr expr) + , ptext SLIT(" {") + ]) + 4 (vcat ( map caseify pairs )) $$ rbrace + + where + snds a b = (snd a) == (snd b) + + caseify :: [(Int,Maybe BlockId)] -> SDoc + caseify ixs@((i,Nothing):_) + = ptext SLIT("/* impossible: ") <> hcat (intersperse comma (map (int.fst) ixs)) + <> ptext SLIT(" */") + caseify as + = let (is,ids) = unzip as + in hsep [ ptext SLIT("case") + , hcat (punctuate comma (map int is)) + , ptext SLIT(": goto") + , pprBlockId (head [ id | Just id <- ids]) <> semi ] + +-- -------------------------------------------------------------------------- +-- Expressions +-- + +pprExpr :: CmmExpr -> SDoc +pprExpr e + = case e of + CmmRegOff reg i -> + pprExpr (CmmMachOp (MO_Add rep) + [CmmReg reg, CmmLit (CmmInt (fromIntegral i) rep)]) + where rep = cmmRegRep reg + CmmLit lit -> pprLit lit + _other -> pprExpr1 e + +-- Here's the precedence table from CmmParse.y: +-- %nonassoc '>=' '>' '<=' '<' '!=' '==' +-- %left '|' +-- %left '^' +-- %left '&' +-- %left '>>' '<<' +-- %left '-' '+' +-- %left '/' '*' '%' +-- %right '~' + +-- We just cope with the common operators for now, the rest will get +-- a default conservative behaviour. + +-- %nonassoc '>=' '>' '<=' '<' '!=' '==' +pprExpr1 (CmmMachOp op [x,y]) | Just doc <- infixMachOp1 op + = pprExpr7 x <+> doc <+> pprExpr7 y +pprExpr1 e = pprExpr7 e + +infixMachOp1 (MO_Eq _) = Just (ptext SLIT("==")) +infixMachOp1 (MO_Ne _) = Just (ptext SLIT("!=")) +infixMachOp1 (MO_Shl _) = Just (ptext SLIT("<<")) +infixMachOp1 (MO_U_Shr _) = Just (ptext SLIT(">>")) +infixMachOp1 (MO_U_Ge _) = Just (ptext SLIT(">=")) +infixMachOp1 (MO_U_Le _) = Just (ptext SLIT("<=")) +infixMachOp1 (MO_U_Gt _) = Just (char '>') +infixMachOp1 (MO_U_Lt _) = Just (char '<') +infixMachOp1 _ = Nothing + +-- %left '-' '+' +pprExpr7 (CmmMachOp op [x,y]) | Just doc <- infixMachOp7 op + = pprExpr7 x <+> doc <+> pprExpr8 y +pprExpr7 e = pprExpr8 e + +infixMachOp7 (MO_Add _) = Just (char '+') +infixMachOp7 (MO_Sub _) = Just (char '-') +infixMachOp7 _ = Nothing + +-- %left '/' '*' '%' +pprExpr8 (CmmMachOp op [x,y]) | Just doc <- infixMachOp8 op + = pprExpr8 x <+> doc <+> pprExpr9 y +pprExpr8 e = pprExpr9 e + +infixMachOp8 (MO_U_Quot _) = Just (char '/') +infixMachOp8 (MO_Mul _) = Just (char '*') +infixMachOp8 (MO_U_Rem _) = Just (char '%') +infixMachOp8 _ = Nothing + +pprExpr9 :: CmmExpr -> SDoc +pprExpr9 e = + case e of + CmmLit lit -> pprLit1 lit + CmmLoad expr rep -> ppr rep <> brackets( ppr expr ) + CmmReg reg -> ppr reg + CmmRegOff reg off -> parens (ppr reg <+> char '+' <+> int off) + CmmMachOp mop args -> genMachOp mop args + +genMachOp :: MachOp -> [CmmExpr] -> SDoc +genMachOp mop args + | Just doc <- infixMachOp mop = case args of + -- dyadic + [x,y] -> pprExpr9 x <+> doc <+> pprExpr9 y + + -- unary + [x] -> doc <> pprExpr9 x + + _ -> pprTrace "PprCmm.genMachOp: machop with strange number of args" + (pprMachOp mop <+> + parens (hcat $ punctuate comma (map pprExpr args))) + empty + + | isJust (infixMachOp1 mop) + || isJust (infixMachOp7 mop) + || isJust (infixMachOp8 mop) = parens (pprExpr (CmmMachOp mop args)) + + | otherwise = char '%' <> pprMachOp mop <> parens (commafy (map pprExpr args)) + +-- +-- Unsigned ops on the word size of the machine get nice symbols. +-- All else get dumped in their ugly format. 
+-- +infixMachOp :: MachOp -> Maybe SDoc +infixMachOp mop + = case mop of + MO_And _ -> Just $ char '&' + MO_Or _ -> Just $ char '|' + MO_Xor _ -> Just $ char '^' + MO_Not _ -> Just $ char '~' + MO_S_Neg _ -> Just $ char '-' -- there is no unsigned neg :) + _ -> Nothing + +-- -------------------------------------------------------------------------- +-- Literals. +-- To minimise line noise we adopt the convention that if the literal +-- has the natural machine word size, we do not append the type +-- +pprLit :: CmmLit -> SDoc +pprLit lit = case lit of + CmmInt i rep -> + hcat [ (if i < 0 then parens else id)(integer i) + , (if rep == wordRep + then empty + else space <> dcolon <+> ppr rep) ] + + CmmFloat f rep -> hsep [ rational f, dcolon, ppr rep ] + CmmLabel clbl -> pprCLabel clbl + CmmLabelOff clbl i -> pprCLabel clbl <> ppr_offset i + CmmLabelDiffOff clbl1 clbl2 i -> pprCLabel clbl1 <> char '-' + <> pprCLabel clbl2 <> ppr_offset i + +pprLit1 lit@(CmmLabelOff clbl i) = parens (pprLit lit) +pprLit1 lit = pprLit lit + +ppr_offset :: Int -> SDoc +ppr_offset i + | i==0 = empty + | i>=0 = char '+' <> int i + | otherwise = char '-' <> int (-i) + +-- -------------------------------------------------------------------------- +-- Static data. +-- Strings are printed as C strings, and we print them as I8[], +-- following C-- +-- +pprStatic :: CmmStatic -> SDoc +pprStatic s = case s of + CmmStaticLit lit -> nest 4 $ ptext SLIT("const") <+> pprLit lit <> semi + CmmUninitialised i -> nest 4 $ text "I8" <> brackets (int i) + CmmAlign i -> nest 4 $ text "align" <+> int i + CmmDataLabel clbl -> pprCLabel clbl <> colon + CmmString s' -> nest 4 $ text "I8[]" <+> + doubleQuotes (text (map (chr.fromIntegral) s')) + +-- -------------------------------------------------------------------------- +-- Registers, whether local (temps) or global +-- +pprReg :: CmmReg -> SDoc +pprReg r + = case r of + CmmLocal local -> pprLocalReg local + CmmGlobal global -> pprGlobalReg global + +-- +-- We only print the type of the local reg if it isn't wordRep +-- +pprLocalReg :: LocalReg -> SDoc +pprLocalReg (LocalReg uniq rep) + = hcat [ char '_', ppr uniq, + (if rep == wordRep + then empty else dcolon <> ppr rep) ] + +-- needs to be kept in syn with Cmm.hs.GlobalReg +-- +pprGlobalReg :: GlobalReg -> SDoc +pprGlobalReg gr + = case gr of + VanillaReg n -> char 'R' <> int n + FloatReg n -> char 'F' <> int n + DoubleReg n -> char 'D' <> int n + LongReg n -> char 'L' <> int n + Sp -> ptext SLIT("Sp") + SpLim -> ptext SLIT("SpLim") + Hp -> ptext SLIT("Hp") + HpLim -> ptext SLIT("HpLim") + CurrentTSO -> ptext SLIT("CurrentTSO") + CurrentNursery -> ptext SLIT("CurrentNursery") + HpAlloc -> ptext SLIT("HpAlloc") + GCEnter1 -> ptext SLIT("stg_gc_enter_1") + GCFun -> ptext SLIT("stg_gc_fun") + BaseReg -> ptext SLIT("BaseReg") + PicBaseReg -> ptext SLIT("PicBaseReg") + +-- -------------------------------------------------------------------------- +-- data sections +-- +pprSection :: Section -> SDoc +pprSection s = case s of + Text -> section <+> doubleQuotes (ptext SLIT("text")) + Data -> section <+> doubleQuotes (ptext SLIT("data")) + ReadOnlyData -> section <+> doubleQuotes (ptext SLIT("readonly")) + RelocatableReadOnlyData + -> section <+> doubleQuotes (ptext SLIT("relreadonly")) + UninitialisedData -> section <+> doubleQuotes (ptext SLIT("uninitialised")) + OtherSection s' -> section <+> doubleQuotes (text s') + where + section = ptext SLIT("section") + +-- -------------------------------------------------------------------------- 
+-- Basic block ids +-- +pprBlockId :: BlockId -> SDoc +pprBlockId b = ppr $ getUnique b + +----------------------------------------------------------------------------- + +commafy :: [SDoc] -> SDoc +commafy xs = hsep $ punctuate comma xs + diff --git a/compiler/codeGen/Bitmap.hs b/compiler/codeGen/Bitmap.hs new file mode 100644 index 0000000000..c0b490978c --- /dev/null +++ b/compiler/codeGen/Bitmap.hs @@ -0,0 +1,79 @@ +-- +-- (c) The University of Glasgow 2003 +-- + +-- Functions for constructing bitmaps, which are used in various +-- places in generated code (stack frame liveness masks, function +-- argument liveness masks, SRT bitmaps). + +module Bitmap ( + Bitmap, mkBitmap, + intsToBitmap, intsToReverseBitmap, + mAX_SMALL_BITMAP_SIZE + ) where + +#include "HsVersions.h" +#include "../includes/MachDeps.h" + +import SMRep +import Constants +import DATA_BITS + +{-| +A bitmap represented by a sequence of 'StgWord's on the /target/ +architecture. These are used for bitmaps in info tables and other +generated code which need to be emitted as sequences of StgWords. +-} +type Bitmap = [StgWord] + +-- | Make a bitmap from a sequence of bits +mkBitmap :: [Bool] -> Bitmap +mkBitmap [] = [] +mkBitmap stuff = chunkToBitmap chunk : mkBitmap rest + where (chunk, rest) = splitAt wORD_SIZE_IN_BITS stuff + +chunkToBitmap :: [Bool] -> StgWord +chunkToBitmap chunk = + foldr (.|.) 0 [ 1 `shiftL` n | (True,n) <- zip chunk [0..] ] + +-- | Make a bitmap where the slots specified are the /ones/ in the bitmap. +-- eg. @[1,2,4], size 4 ==> 0xb@. +-- +-- The list of @Int@s /must/ be already sorted. +intsToBitmap :: Int -> [Int] -> Bitmap +intsToBitmap size slots{- must be sorted -} + | size <= 0 = [] + | otherwise = + (foldr (.|.) 0 (map (1 `shiftL`) these)) : + intsToBitmap (size - wORD_SIZE_IN_BITS) + (map (\x -> x - wORD_SIZE_IN_BITS) rest) + where (these,rest) = span (<wORD_SIZE_IN_BITS) slots + +-- | Make a bitmap where the slots specified are the /zeros/ in the bitmap. +-- eg. @[1,2,4], size 4 ==> 0x8@ (we leave any bits outside the size as zero, +-- just to make the bitmap easier to read). +-- +-- The list of @Int@s /must/ be already sorted. +intsToReverseBitmap :: Int -> [Int] -> Bitmap +intsToReverseBitmap size slots{- must be sorted -} + | size <= 0 = [] + | otherwise = + (foldr xor init (map (1 `shiftL`) these)) : + intsToReverseBitmap (size - wORD_SIZE_IN_BITS) + (map (\x -> x - wORD_SIZE_IN_BITS) rest) + where (these,rest) = span (<wORD_SIZE_IN_BITS) slots + init + | size >= wORD_SIZE_IN_BITS = complement 0 + | otherwise = (1 `shiftL` size) - 1 + +{- | +Magic number, must agree with @BITMAP_BITS_SHIFT@ in InfoTables.h. +Some kinds of bitmap pack a size\/bitmap into a single word if +possible, or fall back to an external pointer when the bitmap is too +large. This value represents the largest size of bitmap that can be +packed into a single word. 
+-} +mAX_SMALL_BITMAP_SIZE :: Int +mAX_SMALL_BITMAP_SIZE | wORD_SIZE == 4 = 27 + | otherwise = 58 + diff --git a/compiler/codeGen/CgBindery.hi-boot-5 b/compiler/codeGen/CgBindery.hi-boot-5 new file mode 100644 index 0000000000..f375fcc6e1 --- /dev/null +++ b/compiler/codeGen/CgBindery.hi-boot-5 @@ -0,0 +1,7 @@ +__interface CgBindery 1 0 where +__export CgBindery CgBindings CgIdInfo VolatileLoc StableLoc nukeVolatileBinds; +1 type CgBindings = VarEnv.IdEnv CgIdInfo; +1 data CgIdInfo; +1 data VolatileLoc; +1 data StableLoc; +1 nukeVolatileBinds :: CgBindings -> CgBindings ; diff --git a/compiler/codeGen/CgBindery.hi-boot-6 b/compiler/codeGen/CgBindery.hi-boot-6 new file mode 100644 index 0000000000..7d1f300a86 --- /dev/null +++ b/compiler/codeGen/CgBindery.hi-boot-6 @@ -0,0 +1,8 @@ +module CgBindery where + +type CgBindings = VarEnv.IdEnv CgIdInfo +data CgIdInfo +data VolatileLoc +data StableLoc + +nukeVolatileBinds :: CgBindings -> CgBindings diff --git a/compiler/codeGen/CgBindery.lhs b/compiler/codeGen/CgBindery.lhs new file mode 100644 index 0000000000..f78edda655 --- /dev/null +++ b/compiler/codeGen/CgBindery.lhs @@ -0,0 +1,494 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[CgBindery]{Utility functions related to doing @CgBindings@} + +\begin{code} +module CgBindery ( + CgBindings, CgIdInfo, + StableLoc, VolatileLoc, + + cgIdInfoId, cgIdInfoArgRep, cgIdInfoLF, + + stableIdInfo, heapIdInfo, + letNoEscapeIdInfo, idInfoToAmode, + + addBindC, addBindsC, + + nukeVolatileBinds, + nukeDeadBindings, + getLiveStackSlots, + + bindArgsToStack, rebindToStack, + bindNewToNode, bindNewToReg, bindArgsToRegs, + bindNewToTemp, + getArgAmode, getArgAmodes, + getCgIdInfo, + getCAddrModeIfVolatile, getVolatileRegs, + maybeLetNoEscape, + ) where + +#include "HsVersions.h" + +import CgMonad +import CgHeapery ( getHpRelOffset ) +import CgStackery ( freeStackSlots, getSpRelOffset ) +import CgUtils ( cgLit, cmmOffsetW ) +import CLabel ( mkClosureLabel, pprCLabel ) +import ClosureInfo ( mkLFImported, mkLFArgument, LambdaFormInfo ) + +import Cmm +import PprCmm ( {- instance Outputable -} ) +import SMRep ( CgRep(..), WordOff, isFollowableArg, + isVoidArg, cgRepSizeW, argMachRep, + idCgRep, typeCgRep ) +import Id ( Id, idName ) +import VarEnv +import VarSet ( varSetElems ) +import Literal ( literalType ) +import Maybes ( catMaybes ) +import Name ( isExternalName ) +import StgSyn ( StgArg, StgLiveVars, GenStgArg(..), isStgTypeArg ) +import Unique ( Uniquable(..) ) +import UniqSet ( elementOfUniqSet ) +import Outputable +\end{code} + + +%************************************************************************ +%* * +\subsection[Bindery-datatypes]{Data types} +%* * +%************************************************************************ + +@(CgBinding a b)@ is a type of finite maps from a to b. + +The assumption used to be that @lookupCgBind@ must get exactly one +match. This is {\em completely wrong} in the case of compiling +letrecs (where knot-tying is used). An initial binding is fed in (and +never evaluated); eventually, a correct binding is put into the +environment. So there can be two bindings for a given name. 
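To make the knot-tying point concrete, here is a toy sketch of the idea, using hypothetical names (bindLetrecToy, ToyEnv) and an ordinary Data.Map rather than the real @CgBindings@ machinery defined below; it is an editorial illustration, not part of the patch:

\begin{code}
-- Toy illustration only: a placeholder binding is inserted for the
-- knot-tying pass and never forced; the correct binding later supersedes it,
-- so only code that looked up "f" too early would ever see the placeholder.
import qualified Data.Map as Map

type ToyEnv = Map.Map String Int

bindLetrecToy :: ToyEnv -> ToyEnv
bindLetrecToy env0 = env2
  where
    placeholder = error "initial binding: never evaluated"
    env1 = Map.insert "f" placeholder env0   -- fed in, never demanded
    env2 = Map.insert "f" 42 env1            -- correct binding put in later

-- Map.lookup "f" (bindLetrecToy Map.empty)  ==>  Just 42
\end{code}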
+ +\begin{code} +type CgBindings = IdEnv CgIdInfo + +data CgIdInfo + = CgIdInfo + { cg_id :: Id -- Id that this is the info for + -- Can differ from the Id at occurrence sites by + -- virtue of being externalised, for splittable C + , cg_rep :: CgRep + , cg_vol :: VolatileLoc + , cg_stb :: StableLoc + , cg_lf :: LambdaFormInfo } + +mkCgIdInfo id vol stb lf + = CgIdInfo { cg_id = id, cg_vol = vol, cg_stb = stb, + cg_lf = lf, cg_rep = idCgRep id } + +voidIdInfo id = CgIdInfo { cg_id = id, cg_vol = NoVolatileLoc + , cg_stb = VoidLoc, cg_lf = mkLFArgument id + , cg_rep = VoidArg } + -- Used just for VoidRep things + +data VolatileLoc -- These locations die across a call + = NoVolatileLoc + | RegLoc CmmReg -- In one of the registers (global or local) + | VirHpLoc VirtualHpOffset -- Hp+offset (address of closure) + | VirNodeLoc VirtualHpOffset -- Cts of offset indirect from Node + -- ie *(Node+offset) +\end{code} + +@StableLoc@ encodes where an Id can be found, used by +the @CgBindings@ environment in @CgBindery@. + +\begin{code} +data StableLoc + = NoStableLoc + + | VirStkLoc VirtualSpOffset -- The thing is held in this + -- stack slot + + | VirStkLNE VirtualSpOffset -- A let-no-escape thing; the + -- value is this stack pointer + -- (as opposed to the contents of the slot) + + | StableLoc CmmExpr + | VoidLoc -- Used only for VoidRep variables. They never need to + -- be saved, so it makes sense to treat treat them as + -- having a stable location +\end{code} + +\begin{code} +instance Outputable CgIdInfo where + ppr (CgIdInfo id rep vol stb lf) + = ppr id <+> ptext SLIT("-->") <+> vcat [ppr vol, ppr stb] + +instance Outputable VolatileLoc where + ppr NoVolatileLoc = empty + ppr (RegLoc r) = ptext SLIT("reg") <+> ppr r + ppr (VirHpLoc v) = ptext SLIT("vh") <+> ppr v + ppr (VirNodeLoc v) = ptext SLIT("vn") <+> ppr v + +instance Outputable StableLoc where + ppr NoStableLoc = empty + ppr VoidLoc = ptext SLIT("void") + ppr (VirStkLoc v) = ptext SLIT("vs") <+> ppr v + ppr (VirStkLNE v) = ptext SLIT("lne") <+> ppr v + ppr (StableLoc a) = ptext SLIT("amode") <+> ppr a +\end{code} + +%************************************************************************ +%* * +\subsection[Bindery-idInfo]{Manipulating IdInfo} +%* * +%************************************************************************ + +\begin{code} +stableIdInfo id amode lf_info = mkCgIdInfo id NoVolatileLoc (StableLoc amode) lf_info +heapIdInfo id offset lf_info = mkCgIdInfo id (VirHpLoc offset) NoStableLoc lf_info +letNoEscapeIdInfo id sp lf_info = mkCgIdInfo id NoVolatileLoc (VirStkLNE sp) lf_info +stackIdInfo id sp lf_info = mkCgIdInfo id NoVolatileLoc (VirStkLoc sp) lf_info +nodeIdInfo id offset lf_info = mkCgIdInfo id (VirNodeLoc offset) NoStableLoc lf_info +regIdInfo id reg lf_info = mkCgIdInfo id (RegLoc reg) NoStableLoc lf_info + +idInfoToAmode :: CgIdInfo -> FCode CmmExpr +idInfoToAmode info + = case cg_vol info of { + RegLoc reg -> returnFC (CmmReg reg) ; + VirNodeLoc nd_off -> returnFC (CmmLoad (cmmOffsetW (CmmReg nodeReg) nd_off) mach_rep) ; + VirHpLoc hp_off -> getHpRelOffset hp_off ; + NoVolatileLoc -> + + case cg_stb info of + StableLoc amode -> returnFC amode + VirStkLoc sp_off -> do { sp_rel <- getSpRelOffset sp_off + ; return (CmmLoad sp_rel mach_rep) } + + VirStkLNE sp_off -> getSpRelOffset sp_off + + VoidLoc -> return $ pprPanic "idInfoToAmode: void" (ppr (cg_id info)) + -- We return a 'bottom' amode, rather than panicing now + -- In this way getArgAmode returns a pair of (VoidArg, bottom) + -- and that's exactly what we 
want + + NoStableLoc -> pprPanic "idInfoToAmode: no loc" (ppr (cg_id info)) + } + where + mach_rep = argMachRep (cg_rep info) + +cgIdInfoId :: CgIdInfo -> Id +cgIdInfoId = cg_id + +cgIdInfoLF :: CgIdInfo -> LambdaFormInfo +cgIdInfoLF = cg_lf + +cgIdInfoArgRep :: CgIdInfo -> CgRep +cgIdInfoArgRep = cg_rep + +maybeLetNoEscape (CgIdInfo { cg_stb = VirStkLNE sp_off }) = Just sp_off +maybeLetNoEscape other = Nothing +\end{code} + +%************************************************************************ +%* * +\subsection[CgMonad-bindery]{Monad things for fiddling with @CgBindings@} +%* * +%************************************************************************ + +.There are three basic routines, for adding (@addBindC@), modifying +(@modifyBindC@) and looking up (@getCgIdInfo@) bindings. + +A @Id@ is bound to a @(VolatileLoc, StableLoc)@ triple. +The name should not already be bound. (nice ASSERT, eh?) + +\begin{code} +addBindC :: Id -> CgIdInfo -> Code +addBindC name stuff_to_bind = do + binds <- getBinds + setBinds $ extendVarEnv binds name stuff_to_bind + +addBindsC :: [(Id, CgIdInfo)] -> Code +addBindsC new_bindings = do + binds <- getBinds + let new_binds = foldl (\ binds (name,info) -> extendVarEnv binds name info) + binds + new_bindings + setBinds new_binds + +modifyBindC :: Id -> (CgIdInfo -> CgIdInfo) -> Code +modifyBindC name mangle_fn = do + binds <- getBinds + setBinds $ modifyVarEnv mangle_fn binds name + +getCgIdInfo :: Id -> FCode CgIdInfo +getCgIdInfo id + = do { -- Try local bindings first + ; local_binds <- getBinds + ; case lookupVarEnv local_binds id of { + Just info -> return info ; + Nothing -> do + + { -- Try top-level bindings + static_binds <- getStaticBinds + ; case lookupVarEnv static_binds id of { + Just info -> return info ; + Nothing -> + + -- Should be imported; make up a CgIdInfo for it + let + name = idName id + in + if isExternalName name then do + hmods <- getHomeModules + let ext_lbl = CmmLit (CmmLabel (mkClosureLabel hmods name)) + return (stableIdInfo id ext_lbl (mkLFImported id)) + else + if isVoidArg (idCgRep id) then + -- Void things are never in the environment + return (voidIdInfo id) + else + -- Bug + cgLookupPanic id + }}}} + + +cgLookupPanic :: Id -> FCode a +cgLookupPanic id + = do static_binds <- getStaticBinds + local_binds <- getBinds + srt <- getSRTLabel + pprPanic "cgPanic" + (vcat [ppr id, + ptext SLIT("static binds for:"), + vcat [ ppr (cg_id info) | info <- varEnvElts static_binds ], + ptext SLIT("local binds for:"), + vcat [ ppr (cg_id info) | info <- varEnvElts local_binds ], + ptext SLIT("SRT label") <+> pprCLabel srt + ]) +\end{code} + +%************************************************************************ +%* * +\subsection[Bindery-nuke-volatile]{Nuking volatile bindings} +%* * +%************************************************************************ + +We sometimes want to nuke all the volatile bindings; we must be sure +we don't leave any (NoVolatile, NoStable) binds around... 
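As a rough illustration of that rule (a deliberately simplified model with a hypothetical name, nukeVolatileToy, rather than the @CgIdInfo@ record the real @nukeVolatileBinds@ below operates on): a binding whose only location is volatile is dropped outright, while a binding that also has a stable location merely forgets its volatile half:

\begin{code}
-- Simplified model: a binding is just a (Maybe volatile, Maybe stable) pair.
nukeVolatileToy :: [(String, (Maybe String, Maybe String))]
                -> [(String, (Maybe String, Maybe String))]
nukeVolatileToy binds =
  [ (name, (Nothing, Just stable))           -- forget the volatile location
  | (name, (_vol, Just stable)) <- binds ]   -- drop bindings with no stable one

-- nukeVolatileToy [("x", (Just "R1", Nothing)), ("y", (Just "R2", Just "Sp[3]"))]
--   ==>  [("y", (Nothing, Just "Sp[3]"))]
\end{code}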
+ +\begin{code} +nukeVolatileBinds :: CgBindings -> CgBindings +nukeVolatileBinds binds + = mkVarEnv (foldr keep_if_stable [] (varEnvElts binds)) + where + keep_if_stable (CgIdInfo { cg_stb = NoStableLoc }) acc = acc + keep_if_stable info acc + = (cg_id info, info { cg_vol = NoVolatileLoc }) : acc +\end{code} + + +%************************************************************************ +%* * +\subsection[lookup-interface]{Interface functions to looking up bindings} +%* * +%************************************************************************ + +\begin{code} +getCAddrModeIfVolatile :: Id -> FCode (Maybe CmmExpr) +getCAddrModeIfVolatile id + = do { info <- getCgIdInfo id + ; case cg_stb info of + NoStableLoc -> do -- Aha! So it is volatile! + amode <- idInfoToAmode info + return $ Just amode + a_stable_loc -> return Nothing } +\end{code} + +@getVolatileRegs@ gets a set of live variables, and returns a list of +all registers on which these variables depend. These are the regs +which must be saved and restored across any C calls. If a variable is +both in a volatile location (depending on a register) {\em and} a +stable one (notably, on the stack), we modify the current bindings to +forget the volatile one. + +\begin{code} +getVolatileRegs :: StgLiveVars -> FCode [GlobalReg] + +getVolatileRegs vars = do + do { stuff <- mapFCs snaffle_it (varSetElems vars) + ; returnFC $ catMaybes stuff } + where + snaffle_it var = do + { info <- getCgIdInfo var + ; let + -- commoned-up code... + consider_reg reg + = -- We assume that all regs can die across C calls + -- We leave it to the save-macros to decide which + -- regs *really* need to be saved. + case cg_stb info of + NoStableLoc -> returnFC (Just reg) -- got one! + is_a_stable_loc -> do + { -- has both volatile & stable locations; + -- force it to rely on the stable location + modifyBindC var nuke_vol_bind + ; return Nothing } + + ; case cg_vol info of + RegLoc (CmmGlobal reg) -> consider_reg reg + VirNodeLoc _ -> consider_reg node + other_loc -> returnFC Nothing -- Local registers + } + + nuke_vol_bind info = info { cg_vol = NoVolatileLoc } +\end{code} + +\begin{code} +getArgAmode :: StgArg -> FCode (CgRep, CmmExpr) +getArgAmode (StgVarArg var) + = do { info <- getCgIdInfo var + ; amode <- idInfoToAmode info + ; return (cgIdInfoArgRep info, amode ) } + +getArgAmode (StgLitArg lit) + = do { cmm_lit <- cgLit lit + ; return (typeCgRep (literalType lit), CmmLit cmm_lit) } + +getArgAmode (StgTypeArg _) = panic "getArgAmode: type arg" + +getArgAmodes :: [StgArg] -> FCode [(CgRep, CmmExpr)] +getArgAmodes [] = returnFC [] +getArgAmodes (atom:atoms) + | isStgTypeArg atom = getArgAmodes atoms + | otherwise = do { amode <- getArgAmode atom + ; amodes <- getArgAmodes atoms + ; return ( amode : amodes ) } +\end{code} + +%************************************************************************ +%* * +\subsection[binding-and-rebinding-interface]{Interface functions for binding and re-binding names} +%* * +%************************************************************************ + +\begin{code} +bindArgsToStack :: [(Id, VirtualSpOffset)] -> Code +bindArgsToStack args + = mapCs bind args + where + bind(id, offset) = addBindC id (stackIdInfo id offset (mkLFArgument id)) + +bindArgsToRegs :: [(Id, GlobalReg)] -> Code +bindArgsToRegs args + = mapCs bind args + where + bind (arg, reg) = bindNewToReg arg (CmmGlobal reg) (mkLFArgument arg) + +bindNewToNode :: Id -> VirtualHpOffset -> LambdaFormInfo -> Code +bindNewToNode id offset lf_info + = addBindC id (nodeIdInfo id 
offset lf_info) + +-- Create a new temporary whose unique is that in the id, +-- bind the id to it, and return the addressing mode for the +-- temporary. +bindNewToTemp :: Id -> FCode CmmReg +bindNewToTemp id + = do addBindC id (regIdInfo id temp_reg lf_info) + return temp_reg + where + uniq = getUnique id + temp_reg = CmmLocal (LocalReg uniq (argMachRep (idCgRep id))) + lf_info = mkLFArgument id -- Always used of things we + -- know nothing about + +bindNewToReg :: Id -> CmmReg -> LambdaFormInfo -> Code +bindNewToReg name reg lf_info + = addBindC name info + where + info = mkCgIdInfo name (RegLoc reg) NoStableLoc lf_info +\end{code} + +\begin{code} +rebindToStack :: Id -> VirtualSpOffset -> Code +rebindToStack name offset + = modifyBindC name replace_stable_fn + where + replace_stable_fn info = info { cg_stb = VirStkLoc offset } +\end{code} + +%************************************************************************ +%* * +\subsection[CgMonad-deadslots]{Finding dead stack slots} +%* * +%************************************************************************ + +nukeDeadBindings does the following: + + - Removes all bindings from the environment other than those + for variables in the argument to nukeDeadBindings. + - Collects any stack slots so freed, and returns them to the stack free + list. + - Moves the virtual stack pointer to point to the topmost used + stack locations. + +You can have multi-word slots on the stack (where a Double# used to +be, for instance); if dead, such a slot will be reported as *several* +offsets (one per word). + +Probably *naughty* to look inside monad... + +\begin{code} +nukeDeadBindings :: StgLiveVars -- All the *live* variables + -> Code +nukeDeadBindings live_vars = do + binds <- getBinds + let (dead_stk_slots, bs') = + dead_slots live_vars + [] [] + [ (cg_id b, b) | b <- varEnvElts binds ] + setBinds $ mkVarEnv bs' + freeStackSlots dead_stk_slots +\end{code} + +Several boring auxiliary functions to do the dirty work. + +\begin{code} +dead_slots :: StgLiveVars + -> [(Id,CgIdInfo)] + -> [VirtualSpOffset] + -> [(Id,CgIdInfo)] + -> ([VirtualSpOffset], [(Id,CgIdInfo)]) + +-- dead_slots carries accumulating parameters for +-- filtered bindings, dead slots +dead_slots live_vars fbs ds [] + = (ds, reverse fbs) -- Finished; rm the dups, if any + +dead_slots live_vars fbs ds ((v,i):bs) + | v `elementOfUniqSet` live_vars + = dead_slots live_vars ((v,i):fbs) ds bs + -- Live, so don't record it in dead slots + -- Instead keep it in the filtered bindings + + | otherwise + = case cg_stb i of + VirStkLoc offset + | size > 0 + -> dead_slots live_vars fbs ([offset-size+1 .. offset] ++ ds) bs + + _ -> dead_slots live_vars fbs ds bs + where + size :: WordOff + size = cgRepSizeW (cg_rep i) +\end{code} + +\begin{code} +getLiveStackSlots :: FCode [VirtualSpOffset] +-- Return the offsets of slots in stack containig live pointers +getLiveStackSlots + = do { binds <- getBinds + ; return [off | CgIdInfo { cg_stb = VirStkLoc off, + cg_rep = rep } <- varEnvElts binds, + isFollowableArg rep] } +\end{code} diff --git a/compiler/codeGen/CgBindery.lhs-boot b/compiler/codeGen/CgBindery.lhs-boot new file mode 100644 index 0000000000..e504a6a9ba --- /dev/null +++ b/compiler/codeGen/CgBindery.lhs-boot @@ -0,0 +1,11 @@ +\begin{code} +module CgBindery where +import VarEnv( IdEnv ) + +data CgIdInfo +data VolatileLoc +data StableLoc +type CgBindings = IdEnv CgIdInfo + +nukeVolatileBinds :: CgBindings -> CgBindings +\end{code}
\ No newline at end of file diff --git a/compiler/codeGen/CgCallConv.hs b/compiler/codeGen/CgCallConv.hs new file mode 100644 index 0000000000..f463255807 --- /dev/null +++ b/compiler/codeGen/CgCallConv.hs @@ -0,0 +1,512 @@ +----------------------------------------------------------------------------- +-- +-- CgCallConv +-- +-- The datatypes and functions here encapsulate the +-- calling and return conventions used by the code generator. +-- +-- (c) The University of Glasgow 2004 +-- +----------------------------------------------------------------------------- + + +module CgCallConv ( + -- Argument descriptors + mkArgDescr, argDescrType, + + -- Liveness + isBigLiveness, buildContLiveness, mkRegLiveness, + smallLiveness, mkLivenessCLit, + + -- Register assignment + assignCallRegs, assignReturnRegs, assignPrimOpCallRegs, + + -- Calls + constructSlowCall, slowArgs, slowCallPattern, + + -- Returns + CtrlReturnConvention(..), + ctrlReturnConvAlg, + dataReturnConvPrim, + getSequelAmode + ) where + +#include "HsVersions.h" + +import CgUtils ( emitRODataLits, mkWordCLit ) +import CgMonad + +import Constants ( mAX_FAMILY_SIZE_FOR_VEC_RETURNS, + mAX_Vanilla_REG, mAX_Float_REG, + mAX_Double_REG, mAX_Long_REG, + mAX_Real_Vanilla_REG, mAX_Real_Float_REG, + mAX_Real_Double_REG, mAX_Real_Long_REG, + bITMAP_BITS_SHIFT + ) + +import ClosureInfo ( ArgDescr(..), Liveness(..) ) +import CgStackery ( getSpRelOffset ) +import SMRep +import MachOp ( wordRep ) +import Cmm ( CmmExpr(..), GlobalReg(..), CmmLit(..), CmmReg(..), node ) +import CmmUtils ( mkLblExpr ) +import CLabel +import Maybes ( mapCatMaybes ) +import Id ( Id ) +import Name ( Name ) +import TyCon ( TyCon, tyConFamilySize ) +import Bitmap ( Bitmap, mAX_SMALL_BITMAP_SIZE, + mkBitmap, intsToReverseBitmap ) +import Util ( isn'tIn, sortLe ) +import StaticFlags ( opt_Unregisterised ) +import FastString ( LitString ) +import Outputable +import DATA_BITS + + +------------------------------------------------------------------------- +-- +-- Making argument descriptors +-- +-- An argument descriptor describes the layout of args on the stack, +-- both for * GC (stack-layout) purposes, and +-- * saving/restoring registers when a heap-check fails +-- +-- Void arguments aren't important, therefore (contrast constructSlowCall) +-- +------------------------------------------------------------------------- + +-- bring in ARG_P, ARG_N, etc. 
+#include "../includes/StgFun.h" + +------------------------- +argDescrType :: ArgDescr -> Int +-- The "argument type" RTS field type +argDescrType (ArgSpec n) = n +argDescrType (ArgGen liveness) + | isBigLiveness liveness = ARG_GEN_BIG + | otherwise = ARG_GEN + + +mkArgDescr :: Name -> [Id] -> FCode ArgDescr +mkArgDescr nm args + = case stdPattern arg_reps of + Just spec_id -> return (ArgSpec spec_id) + Nothing -> do { liveness <- mkLiveness nm size bitmap + ; return (ArgGen liveness) } + where + arg_reps = filter nonVoidArg (map idCgRep args) + -- Getting rid of voids eases matching of standard patterns + + bitmap = mkBitmap arg_bits + arg_bits = argBits arg_reps + size = length arg_bits + +argBits :: [CgRep] -> [Bool] -- True for non-ptr, False for ptr +argBits [] = [] +argBits (PtrArg : args) = False : argBits args +argBits (arg : args) = take (cgRepSizeW arg) (repeat True) ++ argBits args + +stdPattern :: [CgRep] -> Maybe Int +stdPattern [] = Just ARG_NONE -- just void args, probably + +stdPattern [PtrArg] = Just ARG_P +stdPattern [FloatArg] = Just ARG_F +stdPattern [DoubleArg] = Just ARG_D +stdPattern [LongArg] = Just ARG_L +stdPattern [NonPtrArg] = Just ARG_N + +stdPattern [NonPtrArg,NonPtrArg] = Just ARG_NN +stdPattern [NonPtrArg,PtrArg] = Just ARG_NP +stdPattern [PtrArg,NonPtrArg] = Just ARG_PN +stdPattern [PtrArg,PtrArg] = Just ARG_PP + +stdPattern [NonPtrArg,NonPtrArg,NonPtrArg] = Just ARG_NNN +stdPattern [NonPtrArg,NonPtrArg,PtrArg] = Just ARG_NNP +stdPattern [NonPtrArg,PtrArg,NonPtrArg] = Just ARG_NPN +stdPattern [NonPtrArg,PtrArg,PtrArg] = Just ARG_NPP +stdPattern [PtrArg,NonPtrArg,NonPtrArg] = Just ARG_PNN +stdPattern [PtrArg,NonPtrArg,PtrArg] = Just ARG_PNP +stdPattern [PtrArg,PtrArg,NonPtrArg] = Just ARG_PPN +stdPattern [PtrArg,PtrArg,PtrArg] = Just ARG_PPP + +stdPattern [PtrArg,PtrArg,PtrArg,PtrArg] = Just ARG_PPPP +stdPattern [PtrArg,PtrArg,PtrArg,PtrArg,PtrArg] = Just ARG_PPPPP +stdPattern [PtrArg,PtrArg,PtrArg,PtrArg,PtrArg,PtrArg] = Just ARG_PPPPPP +stdPattern other = Nothing + + +------------------------------------------------------------------------- +-- +-- Liveness info +-- +------------------------------------------------------------------------- + +mkLiveness :: Name -> Int -> Bitmap -> FCode Liveness +mkLiveness name size bits + | size > mAX_SMALL_BITMAP_SIZE -- Bitmap does not fit in one word + = do { let lbl = mkBitmapLabel name + ; emitRODataLits lbl ( mkWordCLit (fromIntegral size) + : map mkWordCLit bits) + ; return (BigLiveness lbl) } + + | otherwise -- Bitmap fits in one word + = let + small_bits = case bits of + [] -> 0 + [b] -> fromIntegral b + _ -> panic "livenessToAddrMode" + in + return (smallLiveness size small_bits) + +smallLiveness :: Int -> StgWord -> Liveness +smallLiveness size small_bits = SmallLiveness bits + where bits = fromIntegral size .|. (small_bits `shiftL` bITMAP_BITS_SHIFT) + +------------------- +isBigLiveness :: Liveness -> Bool +isBigLiveness (BigLiveness _) = True +isBigLiveness (SmallLiveness _) = False + +------------------- +mkLivenessCLit :: Liveness -> CmmLit +mkLivenessCLit (BigLiveness lbl) = CmmLabel lbl +mkLivenessCLit (SmallLiveness bits) = mkWordCLit bits + + +------------------------------------------------------------------------- +-- +-- Bitmap describing register liveness +-- across GC when doing a "generic" heap check +-- (a RET_DYN stack frame). +-- +-- NB. Must agree with these macros (currently in StgMacros.h): +-- GET_NON_PTRS(), GET_PTRS(), GET_LIVENESS(). 
+------------------------------------------------------------------------- + +mkRegLiveness :: [(Id, GlobalReg)] -> Int -> Int -> StgWord +mkRegLiveness regs ptrs nptrs + = (fromIntegral nptrs `shiftL` 16) .|. + (fromIntegral ptrs `shiftL` 24) .|. + all_non_ptrs `xor` reg_bits regs + where + all_non_ptrs = 0xff + + reg_bits [] = 0 + reg_bits ((id, VanillaReg i) : regs) | isFollowableArg (idCgRep id) + = (1 `shiftL` (i - 1)) .|. reg_bits regs + reg_bits (_ : regs) + = reg_bits regs + +------------------------------------------------------------------------- +-- +-- Pushing the arguments for a slow call +-- +------------------------------------------------------------------------- + +-- For a slow call, we must take a bunch of arguments and intersperse +-- some stg_ap_<pattern>_ret_info return addresses. +constructSlowCall + :: [(CgRep,CmmExpr)] + -> (CLabel, -- RTS entry point for call + [(CgRep,CmmExpr)], -- args to pass to the entry point + [(CgRep,CmmExpr)]) -- stuff to save on the stack + + -- don't forget the zero case +constructSlowCall [] + = (mkRtsApFastLabel SLIT("stg_ap_0"), [], []) + +constructSlowCall amodes + = (stg_ap_pat, these, rest) + where + stg_ap_pat = mkRtsApFastLabel arg_pat + (arg_pat, these, rest) = matchSlowPattern amodes + +enterRtsRetLabel arg_pat + | tablesNextToCode = mkRtsRetInfoLabel arg_pat + | otherwise = mkRtsRetLabel arg_pat + +-- | 'slowArgs' takes a list of function arguments and prepares them for +-- pushing on the stack for "extra" arguments to a function which requires +-- fewer arguments than we currently have. +slowArgs :: [(CgRep,CmmExpr)] -> [(CgRep,CmmExpr)] +slowArgs [] = [] +slowArgs amodes = (NonPtrArg, mkLblExpr stg_ap_pat) : args ++ slowArgs rest + where (arg_pat, args, rest) = matchSlowPattern amodes + stg_ap_pat = mkRtsRetInfoLabel arg_pat + +matchSlowPattern :: [(CgRep,CmmExpr)] + -> (LitString, [(CgRep,CmmExpr)], [(CgRep,CmmExpr)]) +matchSlowPattern amodes = (arg_pat, these, rest) + where (arg_pat, n) = slowCallPattern (map fst amodes) + (these, rest) = splitAt n amodes + +-- These cases were found to cover about 99% of all slow calls: +slowCallPattern (PtrArg: PtrArg: PtrArg: PtrArg: PtrArg: PtrArg: _) = (SLIT("stg_ap_pppppp"), 6) +slowCallPattern (PtrArg: PtrArg: PtrArg: PtrArg: PtrArg: _) = (SLIT("stg_ap_ppppp"), 5) +slowCallPattern (PtrArg: PtrArg: PtrArg: PtrArg: _) = (SLIT("stg_ap_pppp"), 4) +slowCallPattern (PtrArg: PtrArg: PtrArg: VoidArg: _) = (SLIT("stg_ap_pppv"), 4) +slowCallPattern (PtrArg: PtrArg: PtrArg: _) = (SLIT("stg_ap_ppp"), 3) +slowCallPattern (PtrArg: PtrArg: VoidArg: _) = (SLIT("stg_ap_ppv"), 3) +slowCallPattern (PtrArg: PtrArg: _) = (SLIT("stg_ap_pp"), 2) +slowCallPattern (PtrArg: VoidArg: _) = (SLIT("stg_ap_pv"), 2) +slowCallPattern (PtrArg: _) = (SLIT("stg_ap_p"), 1) +slowCallPattern (VoidArg: _) = (SLIT("stg_ap_v"), 1) +slowCallPattern (NonPtrArg: _) = (SLIT("stg_ap_n"), 1) +slowCallPattern (FloatArg: _) = (SLIT("stg_ap_f"), 1) +slowCallPattern (DoubleArg: _) = (SLIT("stg_ap_d"), 1) +slowCallPattern (LongArg: _) = (SLIT("stg_ap_l"), 1) +slowCallPattern _ = panic "CgStackery.slowCallPattern" + +------------------------------------------------------------------------- +-- +-- Return conventions +-- +------------------------------------------------------------------------- + +-- A @CtrlReturnConvention@ says how {\em control} is returned. 
+ +data CtrlReturnConvention + = VectoredReturn Int -- size of the vector table (family size) + | UnvectoredReturn Int -- family size + +ctrlReturnConvAlg :: TyCon -> CtrlReturnConvention +ctrlReturnConvAlg tycon + = case (tyConFamilySize tycon) of + size -> -- we're supposed to know... + if (size > (1::Int) && size <= mAX_FAMILY_SIZE_FOR_VEC_RETURNS) then + VectoredReturn size + else + UnvectoredReturn size + -- NB: unvectored returns Include size 0 (no constructors), so that + -- the following perverse code compiles (it crashed GHC in 5.02) + -- data T1 + -- data T2 = T2 !T1 Int + -- The only value of type T1 is bottom, which never returns anyway. + +dataReturnConvPrim :: CgRep -> CmmReg +dataReturnConvPrim PtrArg = CmmGlobal (VanillaReg 1) +dataReturnConvPrim NonPtrArg = CmmGlobal (VanillaReg 1) +dataReturnConvPrim LongArg = CmmGlobal (LongReg 1) +dataReturnConvPrim FloatArg = CmmGlobal (FloatReg 1) +dataReturnConvPrim DoubleArg = CmmGlobal (DoubleReg 1) +dataReturnConvPrim VoidArg = panic "dataReturnConvPrim: void" + + +-- getSequelAmode returns an amode which refers to an info table. The info +-- table will always be of the RET(_VEC)?_(BIG|SMALL) kind. We're careful +-- not to handle real code pointers, just in case we're compiling for +-- an unregisterised/untailcallish architecture, where info pointers and +-- code pointers aren't the same. +-- DIRE WARNING. +-- The OnStack case of sequelToAmode delivers an Amode which is only +-- valid just before the final control transfer, because it assumes +-- that Sp is pointing to the top word of the return address. This +-- seems unclean but there you go. + +getSequelAmode :: FCode CmmExpr +getSequelAmode + = do { EndOfBlockInfo virt_sp sequel <- getEndOfBlockInfo + ; case sequel of + OnStack -> do { sp_rel <- getSpRelOffset virt_sp + ; returnFC (CmmLoad sp_rel wordRep) } + + UpdateCode -> returnFC (CmmLit (CmmLabel mkUpdInfoLabel)) + CaseAlts lbl _ _ True -> returnFC (CmmLit (CmmLabel mkSeqInfoLabel)) + CaseAlts lbl _ _ False -> returnFC (CmmLit (CmmLabel lbl)) + } + +------------------------------------------------------------------------- +-- +-- Build a liveness mask for the current stack +-- +------------------------------------------------------------------------- + +-- There are four kinds of things on the stack: +-- +-- - pointer variables (bound in the environment) +-- - non-pointer variables (boudn in the environment) +-- - free slots (recorded in the stack free list) +-- - non-pointer data slots (recorded in the stack free list) +-- +-- We build up a bitmap of non-pointer slots by searching the environment +-- for all the pointer variables, and subtracting these from a bitmap +-- with initially all bits set (up to the size of the stack frame). + +buildContLiveness :: Name -- Basis for label (only) + -> [VirtualSpOffset] -- Live stack slots + -> FCode Liveness +buildContLiveness name live_slots + = do { stk_usg <- getStkUsage + ; let StackUsage { realSp = real_sp, + frameSp = frame_sp } = stk_usg + + start_sp :: VirtualSpOffset + start_sp = real_sp - retAddrSizeW + -- In a continuation, we want a liveness mask that + -- starts from just after the return address, which is + -- on the stack at real_sp. + + frame_size :: WordOff + frame_size = start_sp - frame_sp + -- real_sp points to the frame-header for the current + -- stack frame, and the end of this frame is frame_sp. + -- The size is therefore real_sp - frame_sp - retAddrSizeW + -- (subtract one for the frame-header = return address). 
+ + rel_slots :: [WordOff] + rel_slots = sortLe (<=) + [ start_sp - ofs -- Get slots relative to top of frame + | ofs <- live_slots ] + + bitmap = intsToReverseBitmap frame_size rel_slots + + ; WARN( not (all (>=0) rel_slots), + ppr name $$ ppr live_slots $$ ppr frame_size $$ ppr start_sp $$ ppr rel_slots ) + mkLiveness name frame_size bitmap } + + +------------------------------------------------------------------------- +-- +-- Register assignment +-- +------------------------------------------------------------------------- + +-- How to assign registers for +-- +-- 1) Calling a fast entry point. +-- 2) Returning an unboxed tuple. +-- 3) Invoking an out-of-line PrimOp. +-- +-- Registers are assigned in order. +-- +-- If we run out, we don't attempt to assign any further registers (even +-- though we might have run out of only one kind of register); we just +-- return immediately with the left-overs specified. +-- +-- The alternative version @assignAllRegs@ uses the complete set of +-- registers, including those that aren't mapped to real machine +-- registers. This is used for calling special RTS functions and PrimOps +-- which expect their arguments to always be in the same registers. + +assignCallRegs, assignPrimOpCallRegs, assignReturnRegs + :: [(CgRep,a)] -- Arg or result values to assign + -> ([(a, GlobalReg)], -- Register assignment in same order + -- for *initial segment of* input list + -- (but reversed; doesn't matter) + -- VoidRep args do not appear here + [(CgRep,a)]) -- Leftover arg or result values + +assignCallRegs args + = assign_regs args (mkRegTbl [node]) + -- The entry convention for a function closure + -- never uses Node for argument passing; instead + -- Node points to the function closure itself + +assignPrimOpCallRegs args + = assign_regs args (mkRegTbl_allRegs []) + -- For primops, *all* arguments must be passed in registers + +assignReturnRegs args + = assign_regs args (mkRegTbl []) + -- For returning unboxed tuples etc, + -- we use all regs + +assign_regs :: [(CgRep,a)] -- Arg or result values to assign + -> AvailRegs -- Regs still avail: Vanilla, Float, Double, Longs + -> ([(a, GlobalReg)], [(CgRep, a)]) +assign_regs args supply + = go args [] supply + where + go [] acc supply = (acc, []) -- Return the results reversed (doesn't matter) + go ((VoidArg,_) : args) acc supply -- Skip void arguments; they aren't passed, and + = go args acc supply -- there's nothign to bind them to + go ((rep,arg) : args) acc supply + = case assign_reg rep supply of + Just (reg, supply') -> go args ((arg,reg):acc) supply' + Nothing -> (acc, (rep,arg):args) -- No more regs + +assign_reg :: CgRep -> AvailRegs -> Maybe (GlobalReg, AvailRegs) +assign_reg FloatArg (vs, f:fs, ds, ls) = Just (FloatReg f, (vs, fs, ds, ls)) +assign_reg DoubleArg (vs, fs, d:ds, ls) = Just (DoubleReg d, (vs, fs, ds, ls)) +assign_reg LongArg (vs, fs, ds, l:ls) = Just (LongReg l, (vs, fs, ds, ls)) +assign_reg PtrArg (v:vs, fs, ds, ls) = Just (VanillaReg v, (vs, fs, ds, ls)) +assign_reg NonPtrArg (v:vs, fs, ds, ls) = Just (VanillaReg v, (vs, fs, ds, ls)) + -- PtrArg and NonPtrArg both go in a vanilla register +assign_reg other not_enough_regs = Nothing + + +------------------------------------------------------------------------- +-- +-- Register supplies +-- +------------------------------------------------------------------------- + +-- Vanilla registers can contain pointers, Ints, Chars. +-- Floats and doubles have separate register supplies. 
+-- +-- We take these register supplies from the *real* registers, i.e. those +-- that are guaranteed to map to machine registers. + +useVanillaRegs | opt_Unregisterised = 0 + | otherwise = mAX_Real_Vanilla_REG +useFloatRegs | opt_Unregisterised = 0 + | otherwise = mAX_Real_Float_REG +useDoubleRegs | opt_Unregisterised = 0 + | otherwise = mAX_Real_Double_REG +useLongRegs | opt_Unregisterised = 0 + | otherwise = mAX_Real_Long_REG + +vanillaRegNos, floatRegNos, doubleRegNos, longRegNos :: [Int] +vanillaRegNos = regList useVanillaRegs +floatRegNos = regList useFloatRegs +doubleRegNos = regList useDoubleRegs +longRegNos = regList useLongRegs + +allVanillaRegNos, allFloatRegNos, allDoubleRegNos, allLongRegNos :: [Int] +allVanillaRegNos = regList mAX_Vanilla_REG +allFloatRegNos = regList mAX_Float_REG +allDoubleRegNos = regList mAX_Double_REG +allLongRegNos = regList mAX_Long_REG + +regList 0 = [] +regList n = [1 .. n] + +type AvailRegs = ( [Int] -- available vanilla regs. + , [Int] -- floats + , [Int] -- doubles + , [Int] -- longs (int64 and word64) + ) + +mkRegTbl :: [GlobalReg] -> AvailRegs +mkRegTbl regs_in_use + = mkRegTbl' regs_in_use vanillaRegNos floatRegNos doubleRegNos longRegNos + +mkRegTbl_allRegs :: [GlobalReg] -> AvailRegs +mkRegTbl_allRegs regs_in_use + = mkRegTbl' regs_in_use allVanillaRegNos allFloatRegNos allDoubleRegNos allLongRegNos + +mkRegTbl' regs_in_use vanillas floats doubles longs + = (ok_vanilla, ok_float, ok_double, ok_long) + where + ok_vanilla = mapCatMaybes (select VanillaReg) vanillas + ok_float = mapCatMaybes (select FloatReg) floats + ok_double = mapCatMaybes (select DoubleReg) doubles + ok_long = mapCatMaybes (select LongReg) longs + -- rep isn't looked at, hence we can use any old rep. + + select :: (Int -> GlobalReg) -> Int{-cand-} -> Maybe Int + -- one we've unboxed the Int, we make a GlobalReg + -- and see if it is already in use; if not, return its number. + + select mk_reg_fun cand + = let + reg = mk_reg_fun cand + in + if reg `not_elem` regs_in_use + then Just cand + else Nothing + where + not_elem = isn'tIn "mkRegTbl" + + diff --git a/compiler/codeGen/CgCase.lhs b/compiler/codeGen/CgCase.lhs new file mode 100644 index 0000000000..e7c08940c5 --- /dev/null +++ b/compiler/codeGen/CgCase.lhs @@ -0,0 +1,634 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +% $Id: CgCase.lhs,v 1.75 2005/06/21 10:44:41 simonmar Exp $ +% +%******************************************************** +%* * +\section[CgCase]{Converting @StgCase@ expressions} +%* * +%******************************************************** + +\begin{code} +module CgCase ( cgCase, saveVolatileVarsAndRegs, + restoreCurrentCostCentre + ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} CgExpr ( cgExpr ) + +import CgMonad +import StgSyn +import CgBindery ( getArgAmodes, + bindNewToReg, bindNewToTemp, + getCgIdInfo, getArgAmode, + rebindToStack, getCAddrModeIfVolatile, + nukeDeadBindings, idInfoToAmode + ) +import CgCon ( bindConArgs, bindUnboxedTupleComponents ) +import CgHeapery ( altHeapCheck, unbxTupleHeapCheck ) +import CgCallConv ( dataReturnConvPrim, ctrlReturnConvAlg, + CtrlReturnConvention(..) 
+ ) +import CgStackery ( allocPrimStack, allocStackTop, getSpRelOffset, + deAllocStackTop, freeStackSlots + ) +import CgTailCall ( performTailCall ) +import CgPrimOp ( cgPrimOp ) +import CgForeignCall ( cgForeignCall ) +import CgUtils ( newTemp, cgLit, emitLitSwitch, emitSwitch, + tagToClosure ) +import CgProf ( curCCS, curCCSAddr ) +import CgInfoTbls ( emitDirectReturnTarget, emitAlgReturnTarget, + dataConTagZ ) +import SMRep ( CgRep(..), retAddrSizeW, nonVoidArg, isVoidArg, + idCgRep, tyConCgRep, typeHint ) +import CmmUtils ( CmmStmts, noStmts, oneStmt, plusStmts ) +import Cmm +import MachOp ( wordRep ) +import ClosureInfo ( mkLFArgument ) +import StaticFlags ( opt_SccProfilingOn ) +import Id ( Id, idName, isDeadBinder, idType ) +import ForeignCall ( ForeignCall(..), CCallSpec(..), playSafe ) +import VarSet ( varSetElems ) +import CoreSyn ( AltCon(..) ) +import PrimOp ( PrimOp(..), primOpOutOfLine ) +import TyCon ( isEnumerationTyCon, tyConFamilySize ) +import Util ( isSingleton ) +import Outputable +\end{code} + +\begin{code} +data GCFlag + = GCMayHappen -- The scrutinee may involve GC, so everything must be + -- tidy before the code for the scrutinee. + + | NoGC -- The scrutinee is a primitive value, or a call to a + -- primitive op which does no GC. Hence the case can + -- be done inline, without tidying up first. +\end{code} + +It is quite interesting to decide whether to put a heap-check +at the start of each alternative. Of course we certainly have +to do so if the case forces an evaluation, or if there is a primitive +op which can trigger GC. + +A more interesting situation is this: + + \begin{verbatim} + !A!; + ...A... + case x# of + 0# -> !B!; ...B... + default -> !C!; ...C... + \end{verbatim} + +where \tr{!x!} indicates a possible heap-check point. The heap checks +in the alternatives {\em can} be omitted, in which case the topmost +heapcheck will take their worst case into account. + +In favour of omitting \tr{!B!}, \tr{!C!}: + + - {\em May} save a heap overflow test, + if ...A... allocates anything. The other advantage + of this is that we can use relative addressing + from a single Hp to get at all the closures so allocated. + + - No need to save volatile vars etc across the case + +Against: + + - May do more allocation than reqd. This sometimes bites us + badly. For example, nfib (ha!) allocates about 30\% more space if the + worst-casing is done, because many many calls to nfib are leaf calls + which don't need to allocate anything. + + This never hurts us if there is only one alternative. + +\begin{code} +cgCase :: StgExpr + -> StgLiveVars + -> StgLiveVars + -> Id + -> SRT + -> AltType + -> [StgAlt] + -> Code +\end{code} + +Special case #1: case of literal. + +\begin{code} +cgCase (StgLit lit) live_in_whole_case live_in_alts bndr srt + alt_type@(PrimAlt tycon) alts + = do { tmp_reg <- bindNewToTemp bndr + ; cm_lit <- cgLit lit + ; stmtC (CmmAssign tmp_reg (CmmLit cm_lit)) + ; cgPrimAlts NoGC alt_type tmp_reg alts } +\end{code} + +Special case #2: scrutinising a primitive-typed variable. No +evaluation required. We don't save volatile variables, nor do we do a +heap-check in the alternatives. Instead, the heap usage of the +alternatives is worst-cased and passed upstream. This can result in +allocating more heap than strictly necessary, but it will sometimes +eliminate a heap check altogether. + +\begin{code} +cgCase (StgApp v []) live_in_whole_case live_in_alts bndr srt + alt_type@(PrimAlt tycon) alts + = do { -- Careful! 
we can't just bind the default binder to the same thing + -- as the scrutinee, since it might be a stack location, and having + -- two bindings pointing at the same stack locn doesn't work (it + -- confuses nukeDeadBindings). Hence, use a new temp. + v_info <- getCgIdInfo v + ; amode <- idInfoToAmode v_info + ; tmp_reg <- bindNewToTemp bndr + ; stmtC (CmmAssign tmp_reg amode) + ; cgPrimAlts NoGC alt_type tmp_reg alts } +\end{code} + +Special case #3: inline PrimOps and foreign calls. + +\begin{code} +cgCase (StgOpApp op@(StgPrimOp primop) args _) + live_in_whole_case live_in_alts bndr srt alt_type alts + | not (primOpOutOfLine primop) + = cgInlinePrimOp primop args bndr alt_type live_in_alts alts +\end{code} + +TODO: Case-of-case of primop can probably be done inline too (but +maybe better to translate it out beforehand). See +ghc/lib/misc/PackedString.lhs for examples where this crops up (with +4.02). + +Special case #4: inline foreign calls: an unsafe foreign call can be done +right here, just like an inline primop. + +\begin{code} +cgCase (StgOpApp op@(StgFCallOp fcall _) args _) + live_in_whole_case live_in_alts bndr srt alt_type alts + | unsafe_foreign_call + = ASSERT( isSingleton alts ) + do -- *must* be an unboxed tuple alt. + -- exactly like the cgInlinePrimOp case for unboxed tuple alts.. + { res_tmps <- mapFCs bindNewToTemp non_void_res_ids + ; let res_hints = map (typeHint.idType) non_void_res_ids + ; cgForeignCall (zip res_tmps res_hints) fcall args live_in_alts + ; cgExpr rhs } + where + (_, res_ids, _, rhs) = head alts + non_void_res_ids = filter (nonVoidArg . idCgRep) res_ids + + unsafe_foreign_call + = case fcall of + CCall (CCallSpec _ _ s) -> not (playSafe s) + _other -> False +\end{code} + +Special case: scrutinising a non-primitive variable. +This can be done a little better than the general case, because +we can reuse/trim the stack slot holding the variable (if it is in one). + +\begin{code} +cgCase (StgApp fun args) + live_in_whole_case live_in_alts bndr srt alt_type alts + = do { fun_info <- getCgIdInfo fun + ; arg_amodes <- getArgAmodes args + + -- Nuking dead bindings *before* calculating the saves is the + -- value-add here. We might end up freeing up some slots currently + -- occupied by variables only required for the call. + -- NOTE: we need to look up the variables used in the call before + -- doing this, because some of them may not be in the environment + -- afterward. + ; nukeDeadBindings live_in_alts + ; (save_assts, alts_eob_info, maybe_cc_slot) + <- saveVolatileVarsAndRegs live_in_alts + + ; scrut_eob_info + <- forkEval alts_eob_info + (allocStackTop retAddrSizeW >> nopC) + (do { deAllocStackTop retAddrSizeW + ; cgEvalAlts maybe_cc_slot bndr srt alt_type alts }) + + ; setEndOfBlockInfo (maybeReserveSeqFrame alt_type scrut_eob_info) + (performTailCall fun_info arg_amodes save_assts) } +\end{code} + +Note about return addresses: we *always* push a return address, even +if because of an optimisation we end up jumping direct to the return +code (not through the address itself). The alternatives always assume +that the return address is on the stack. The return address is +required in case the alternative performs a heap check, since it +encodes the liveness of the slots in the activation record. + +On entry to the case alternative, we can re-use the slot containing +the return address immediately after the heap check. That's what the +deAllocStackTop call is doing above. + +Finally, here is the general case. 
+ +\begin{code} +cgCase expr live_in_whole_case live_in_alts bndr srt alt_type alts + = do { -- Figure out what volatile variables to save + nukeDeadBindings live_in_whole_case + + ; (save_assts, alts_eob_info, maybe_cc_slot) + <- saveVolatileVarsAndRegs live_in_alts + + -- Save those variables right now! + ; emitStmts save_assts + + -- generate code for the alts + ; scrut_eob_info + <- forkEval alts_eob_info + (do { nukeDeadBindings live_in_alts + ; allocStackTop retAddrSizeW -- space for retn address + ; nopC }) + (do { deAllocStackTop retAddrSizeW + ; cgEvalAlts maybe_cc_slot bndr srt alt_type alts }) + + ; setEndOfBlockInfo (maybeReserveSeqFrame alt_type scrut_eob_info) + (cgExpr expr) + } +\end{code} + +There's a lot of machinery going on behind the scenes to manage the +stack pointer here. forkEval takes the virtual Sp and free list from +the first argument, and turns that into the *real* Sp for the second +argument. It also uses this virtual Sp as the args-Sp in the EOB info +returned, so that the scrutinee will trim the real Sp back to the +right place before doing whatever it does. + --SDM (who just spent an hour figuring this out, and didn't want to + forget it). + +Why don't we push the return address just before evaluating the +scrutinee? Because the slot reserved for the return address might +contain something useful, so we wait until performing a tail call or +return before pushing the return address (see +CgTailCall.pushReturnAddress). + +This also means that the environment doesn't need to know about the +free stack slot for the return address (for generating bitmaps), +because we don't reserve it until just before the eval. + +TODO!! Problem: however, we have to save the current cost centre +stack somewhere, because at the eval point the current CCS might be +different. So we pick a free stack slot and save CCCS in it. One +consequence of this is that activation records on the stack don't +follow the layout of closures when we're profiling. The CCS could be +anywhere within the record). + +\begin{code} +maybeReserveSeqFrame PolyAlt (EndOfBlockInfo args_sp (CaseAlts amode stuff bndr _)) + = EndOfBlockInfo (args_sp + retAddrSizeW) (CaseAlts amode stuff bndr True) +maybeReserveSeqFrame other scrut_eob_info = scrut_eob_info +\end{code} + + +%************************************************************************ +%* * + Inline primops +%* * +%************************************************************************ + +\begin{code} +cgInlinePrimOp primop args bndr (PrimAlt tycon) live_in_alts alts + | isVoidArg (idCgRep bndr) + = ASSERT( con == DEFAULT && isSingleton alts && null bs ) + do { -- VOID RESULT; just sequencing, + -- so get in there and do it + cgPrimOp [] primop args live_in_alts + ; cgExpr rhs } + where + (con,bs,_,rhs) = head alts + +cgInlinePrimOp primop args bndr (PrimAlt tycon) live_in_alts alts + = do { -- PRIMITIVE ALTS, with non-void result + tmp_reg <- bindNewToTemp bndr + ; cgPrimOp [tmp_reg] primop args live_in_alts + ; cgPrimAlts NoGC (PrimAlt tycon) tmp_reg alts } + +cgInlinePrimOp primop args bndr (UbxTupAlt tycon) live_in_alts alts + = ASSERT( isSingleton alts ) + do { -- UNBOXED TUPLE ALTS + -- No heap check, no yield, just get in there and do it. + -- NB: the case binder isn't bound to anything; + -- it has a unboxed tuple type + + res_tmps <- mapFCs bindNewToTemp non_void_res_ids + ; cgPrimOp res_tmps primop args live_in_alts + ; cgExpr rhs } + where + (_, res_ids, _, rhs) = head alts + non_void_res_ids = filter (nonVoidArg . 
idCgRep) res_ids + +cgInlinePrimOp primop args bndr (AlgAlt tycon) live_in_alts alts + = do { -- ENUMERATION TYPE RETURN + -- Typical: case a ># b of { True -> ..; False -> .. } + -- The primop itself returns an index into the table of + -- closures for the enumeration type. + tag_amode <- ASSERT( isEnumerationTyCon tycon ) + do_enum_primop primop + + -- Bind the default binder if necessary + -- (avoiding it avoids the assignment) + -- The deadness info is set by StgVarInfo + ; hmods <- getHomeModules + ; whenC (not (isDeadBinder bndr)) + (do { tmp_reg <- bindNewToTemp bndr + ; stmtC (CmmAssign tmp_reg (tagToClosure hmods tycon tag_amode)) }) + + -- Compile the alts + ; (branches, mb_deflt) <- cgAlgAlts NoGC Nothing{-cc_slot-} + (AlgAlt tycon) alts + + -- Do the switch + ; emitSwitch tag_amode branches mb_deflt 0 (tyConFamilySize tycon - 1) + } + where + + do_enum_primop :: PrimOp -> FCode CmmExpr -- Returns amode for result + do_enum_primop TagToEnumOp -- No code! + | [arg] <- args = do + (_,e) <- getArgAmode arg + return e + do_enum_primop primop + = do tmp <- newTemp wordRep + cgPrimOp [tmp] primop args live_in_alts + returnFC (CmmReg tmp) + +cgInlinePrimOp primop arg_amodes bndr PolyAlt live_in_alts alts + = pprPanic "cgCase: case of primop has polymorphic type" (ppr bndr) +\end{code} + +%************************************************************************ +%* * +\subsection[CgCase-alts]{Alternatives} +%* * +%************************************************************************ + +@cgEvalAlts@ returns an addressing mode for a continuation for the +alternatives of a @case@, used in a context when there +is some evaluation to be done. + +\begin{code} +cgEvalAlts :: Maybe VirtualSpOffset -- Offset of cost-centre to be restored, if any + -> Id + -> SRT -- SRT for the continuation + -> AltType + -> [StgAlt] + -> FCode Sequel -- Any addr modes inside are guaranteed + -- to be a label so that we can duplicate it + -- without risk of duplicating code + +cgEvalAlts cc_slot bndr srt alt_type@(PrimAlt tycon) alts + = do { let rep = tyConCgRep tycon + reg = dataReturnConvPrim rep -- Bottom for voidRep + + ; abs_c <- forkProc $ do + { -- Bind the case binder, except if it's void + -- (reg is bottom in that case) + whenC (nonVoidArg rep) $ + bindNewToReg bndr reg (mkLFArgument bndr) + ; restoreCurrentCostCentre cc_slot True + ; cgPrimAlts GCMayHappen alt_type reg alts } + + ; lbl <- emitDirectReturnTarget (idName bndr) abs_c srt + ; returnFC (CaseAlts lbl Nothing bndr False) } + +cgEvalAlts cc_slot bndr srt (UbxTupAlt _) [(con,args,_,rhs)] + = -- Unboxed tuple case + -- By now, the simplifier should have have turned it + -- into case e of (# a,b #) -> e + -- There shouldn't be a + -- case e of DEFAULT -> e + ASSERT2( case con of { DataAlt _ -> True; other -> False }, + text "cgEvalAlts: dodgy case of unboxed tuple type" ) + do { -- forkAbsC for the RHS, so that the envt is + -- not changed for the emitDirectReturn call + abs_c <- forkProc $ do + { (live_regs, ptrs, nptrs, _) <- bindUnboxedTupleComponents args + -- Restore the CC *after* binding the tuple components, + -- so that we get the stack offset of the saved CC right. 
+ ; restoreCurrentCostCentre cc_slot True + -- Generate a heap check if necessary + -- and finally the code for the alternative + ; unbxTupleHeapCheck live_regs ptrs nptrs noStmts + (cgExpr rhs) } + ; lbl <- emitDirectReturnTarget (idName bndr) abs_c srt + ; returnFC (CaseAlts lbl Nothing bndr False) } + +cgEvalAlts cc_slot bndr srt alt_type alts + = -- Algebraic and polymorphic case + do { -- Bind the default binder + bindNewToReg bndr nodeReg (mkLFArgument bndr) + + -- Generate sequel info for use downstream + -- At the moment, we only do it if the type is vector-returnable. + -- Reason: if not, then it costs extra to label the + -- alternatives, because we'd get return code like: + -- + -- switch TagReg { 0 : JMP(alt_1); 1 : JMP(alt_2) ..etc } + -- + -- which is worse than having the alt code in the switch statement + + ; (alts, mb_deflt) <- cgAlgAlts GCMayHappen cc_slot alt_type alts + + ; (lbl, branches) <- emitAlgReturnTarget (idName bndr) + alts mb_deflt srt ret_conv + + ; returnFC (CaseAlts lbl branches bndr False) } + where + ret_conv = case alt_type of + AlgAlt tc -> ctrlReturnConvAlg tc + PolyAlt -> UnvectoredReturn 0 +\end{code} + + +HWL comment on {\em GrAnSim\/} (adding GRAN_YIELDs for context switch): If +we do an inlining of the case no separate functions for returning are +created, so we don't have to generate a GRAN_YIELD in that case. This info +must be propagated to cgAlgAltRhs (where the GRAN_YIELD macro might be +emitted). Hence, the new Bool arg to cgAlgAltRhs. + +%************************************************************************ +%* * +\subsection[CgCase-alg-alts]{Algebraic alternatives} +%* * +%************************************************************************ + +In @cgAlgAlts@, none of the binders in the alternatives are +assumed to be yet bound. + +HWL comment on {\em GrAnSim\/} (adding GRAN_YIELDs for context switch): The +last arg of cgAlgAlts indicates if we want a context switch at the +beginning of each alternative. Normally we want that. The only exception +are inlined alternatives. + +\begin{code} +cgAlgAlts :: GCFlag + -> Maybe VirtualSpOffset + -> AltType -- ** AlgAlt or PolyAlt only ** + -> [StgAlt] -- The alternatives + -> FCode ( [(ConTagZ, CgStmts)], -- The branches + Maybe CgStmts ) -- The default case + +cgAlgAlts gc_flag cc_slot alt_type alts + = do alts <- forkAlts [ cgAlgAlt gc_flag cc_slot alt_type alt | alt <- alts] + let + mb_deflt = case alts of -- DEFAULT is always first, if present + ((DEFAULT,blks) : _) -> Just blks + other -> Nothing + + branches = [(dataConTagZ con, blks) + | (DataAlt con, blks) <- alts] + -- in + return (branches, mb_deflt) + + +cgAlgAlt :: GCFlag + -> Maybe VirtualSpOffset -- Turgid state + -> AltType -- ** AlgAlt or PolyAlt only ** + -> StgAlt + -> FCode (AltCon, CgStmts) + +cgAlgAlt gc_flag cc_slot alt_type (con, args, use_mask, rhs) + = do { abs_c <- getCgStmts $ do + { bind_con_args con args + ; restoreCurrentCostCentre cc_slot True + ; maybeAltHeapCheck gc_flag alt_type (cgExpr rhs) } + ; return (con, abs_c) } + where + bind_con_args DEFAULT args = nopC + bind_con_args (DataAlt dc) args = bindConArgs dc args +\end{code} + + +%************************************************************************ +%* * +\subsection[CgCase-prim-alts]{Primitive alternatives} +%* * +%************************************************************************ + +@cgPrimAlts@ generates suitable a @CSwitch@ +for dealing with the alternatives of a primitive @case@, given an +addressing mode for the thing to scrutinise. 
It also keeps track of
+the maximum stack depth encountered down any branch.
+
+As usual, no binders in the alternatives are yet bound.
+
+\begin{code}
+cgPrimAlts :: GCFlag
+	   -> AltType	-- Always PrimAlt, but passed to maybeAltHeapCheck
+	   -> CmmReg	-- Scrutinee
+	   -> [StgAlt]	-- Alternatives
+	   -> Code
+-- NB: cgPrimAlts emits code that does the case analysis.
+-- It's often used in inline situations, rather than to generate
+-- a labelled return point.  That's why its interface is a little
+-- different to cgAlgAlts.
+--
+-- INVARIANT: the default binder is already bound
+cgPrimAlts gc_flag alt_type scrutinee alts
+  = do	{ tagged_absCs <- forkAlts (map (cgPrimAlt gc_flag alt_type) alts)
+	; let ((DEFAULT, deflt_absC) : others) = tagged_absCs	-- There is always a default
+	      alt_absCs = [(lit,rhs) | (LitAlt lit, rhs) <- others]
+	; emitLitSwitch (CmmReg scrutinee) alt_absCs deflt_absC }
+
+cgPrimAlt :: GCFlag
+	  -> AltType
+	  -> StgAlt			-- The alternative
+	  -> FCode (AltCon, CgStmts)	-- Its compiled form
+
+cgPrimAlt gc_flag alt_type (con, [], [], rhs)
+  = ASSERT( case con of { DEFAULT -> True; LitAlt _ -> True; other -> False } )
+    do	{ abs_c <- getCgStmts (maybeAltHeapCheck gc_flag alt_type (cgExpr rhs))
+	; returnFC (con, abs_c) }
+\end{code}
+
+
+%************************************************************************
+%* *
+\subsection[CgCase-tidy]{Code for tidying up prior to an eval}
+%* *
+%************************************************************************
+
+\begin{code}
+maybeAltHeapCheck
+	:: GCFlag
+	-> AltType	-- PolyAlt, PrimAlt, AlgAlt, but *not* UbxTupAlt
+	-> Code		-- Continuation
+	-> Code
+maybeAltHeapCheck NoGC	      _        code = code
+maybeAltHeapCheck GCMayHappen alt_type code = altHeapCheck alt_type code
+
+saveVolatileVarsAndRegs
+    :: StgLiveVars                    -- Vars which should be made safe
+    -> FCode (CmmStmts,               -- Assignments to do the saves
+              EndOfBlockInfo,         -- sequel for the alts
+              Maybe VirtualSpOffset)  -- Slot for current cost centre
+
+saveVolatileVarsAndRegs vars
+  = do	{ var_saves <- saveVolatileVars vars
+	; (maybe_cc_slot, cc_save) <- saveCurrentCostCentre
+	; eob_info <- getEndOfBlockInfo
+	; returnFC (var_saves `plusStmts` cc_save,
+		    eob_info,
+		    maybe_cc_slot) }
+
+
+saveVolatileVars :: StgLiveVars		-- Vars which should be made safe
+		 -> FCode CmmStmts	-- Assignments to do the saves
+
+saveVolatileVars vars
+  = do	{ stmts_s <- mapFCs save_it (varSetElems vars)
+	; return (foldr plusStmts noStmts stmts_s) }
+  where
+    save_it var
+      = do { v <- getCAddrModeIfVolatile var
+	   ; case v of
+		Nothing         -> return noStmts          -- Non-volatile
+		Just vol_amode  -> save_var var vol_amode  -- Aha! It's volatile
+	}
+
+    save_var var vol_amode
+      = do { slot <- allocPrimStack (idCgRep var)
+	   ; rebindToStack var slot
+	   ; sp_rel <- getSpRelOffset slot
+	   ; returnFC (oneStmt (CmmStore sp_rel vol_amode)) }
+\end{code}
+
+---------------------------------------------------------------------------
+
+When we save the current cost centre (which is done for lexical
+scoping), we allocate a free stack location, and return (a)~the
+virtual offset of the location, to pass on to the alternatives, and
+(b)~the assignment to do the save (just as for @saveVolatileVars@).
+ +\begin{code} +saveCurrentCostCentre :: + FCode (Maybe VirtualSpOffset, -- Where we decide to store it + CmmStmts) -- Assignment to save it + +saveCurrentCostCentre + | not opt_SccProfilingOn + = returnFC (Nothing, noStmts) + | otherwise + = do { slot <- allocPrimStack PtrArg + ; sp_rel <- getSpRelOffset slot + ; returnFC (Just slot, + oneStmt (CmmStore sp_rel curCCS)) } + +-- Sometimes we don't free the slot containing the cost centre after restoring it +-- (see CgLetNoEscape.cgLetNoEscapeBody). +restoreCurrentCostCentre :: Maybe VirtualSpOffset -> Bool -> Code +restoreCurrentCostCentre Nothing _freeit = nopC +restoreCurrentCostCentre (Just slot) freeit + = do { sp_rel <- getSpRelOffset slot + ; whenC freeit (freeStackSlots [slot]) + ; stmtC (CmmStore curCCSAddr (CmmLoad sp_rel wordRep)) } +\end{code} + diff --git a/compiler/codeGen/CgClosure.lhs b/compiler/codeGen/CgClosure.lhs new file mode 100644 index 0000000000..1a2cbc5202 --- /dev/null +++ b/compiler/codeGen/CgClosure.lhs @@ -0,0 +1,599 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +% $Id: CgClosure.lhs,v 1.72 2005/05/18 12:06:51 simonmar Exp $ +% +\section[CgClosure]{Code generation for closures} + +This module provides the support code for @StgToAbstractC@ to deal +with {\em closures} on the RHSs of let(rec)s. See also +@CgCon@, which deals with constructors. + +\begin{code} +module CgClosure ( cgTopRhsClosure, + cgStdRhsClosure, + cgRhsClosure, + emitBlackHoleCode, + ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} CgExpr ( cgExpr ) + +import CgMonad +import CgBindery +import CgHeapery +import CgStackery ( mkVirtStkOffsets, pushUpdateFrame, getVirtSp, + setRealAndVirtualSp ) +import CgProf ( chooseDynCostCentres, ldvEnter, enterCostCentre, + costCentreFrom ) +import CgTicky +import CgParallel ( granYield, granFetchAndReschedule ) +import CgInfoTbls ( emitClosureCodeAndInfoTable, getSRTInfo ) +import CgCallConv ( assignCallRegs, mkArgDescr ) +import CgUtils ( emitDataLits, addIdReps, cmmRegOffW, + emitRtsCallWithVols ) +import ClosureInfo -- lots and lots of stuff +import SMRep ( CgRep, cgRepSizeW, argMachRep, fixedHdrSize, WordOff, + idCgRep ) +import MachOp ( MachHint(..) ) +import Cmm +import CmmUtils ( CmmStmts, mkStmts, oneStmt, plusStmts, noStmts, + mkLblExpr ) +import CLabel +import StgSyn +import StaticFlags ( opt_DoTickyProfiling ) +import CostCentre +import Id ( Id, idName, idType ) +import Name ( Name, isExternalName ) +import Module ( Module, pprModule ) +import ListSetOps ( minusList ) +import Util ( isIn, mapAccumL, zipWithEqual ) +import BasicTypes ( TopLevelFlag(..) ) +import Constants ( oFFSET_StgInd_indirectee, wORD_SIZE ) +import Outputable +import FastString +\end{code} + +%******************************************************** +%* * +\subsection[closures-no-free-vars]{Top-level closures} +%* * +%******************************************************** + +For closures bound at top level, allocate in static space. +They should have no free variables. 
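(Purely as an illustration, not part of this patch: a binding such as

	answer :: Int
	answer = 42 * 101

	double :: Int -> Int
	double x = x + x

is top level, so both closures can be laid out as static data by
@cgTopRhsClosure@ below; the names they mention are themselves
top-level identifiers rather than free variables of an enclosing scope.)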
+ +\begin{code} +cgTopRhsClosure :: Id + -> CostCentreStack -- Optional cost centre annotation + -> StgBinderInfo + -> SRT + -> UpdateFlag + -> [Id] -- Args + -> StgExpr + -> FCode (Id, CgIdInfo) + +cgTopRhsClosure id ccs binder_info srt upd_flag args body = do + { -- LAY OUT THE OBJECT + let name = idName id + ; lf_info <- mkClosureLFInfo id TopLevel [] upd_flag args + ; srt_info <- getSRTInfo name srt + ; mod_name <- moduleName + ; let descr = closureDescription mod_name name + closure_info = mkClosureInfo True id lf_info 0 0 srt_info descr + closure_label = mkLocalClosureLabel name + cg_id_info = stableIdInfo id (mkLblExpr closure_label) lf_info + closure_rep = mkStaticClosureFields closure_info ccs True [] + + -- BUILD THE OBJECT, AND GENERATE INFO TABLE (IF NECESSARY) + ; emitDataLits closure_label closure_rep + ; forkClosureBody (closureCodeBody binder_info closure_info + ccs args body) + + ; returnFC (id, cg_id_info) } +\end{code} + +%******************************************************** +%* * +\subsection[non-top-level-closures]{Non top-level closures} +%* * +%******************************************************** + +For closures with free vars, allocate in heap. + +\begin{code} +cgStdRhsClosure + :: Id + -> CostCentreStack -- Optional cost centre annotation + -> StgBinderInfo + -> [Id] -- Free vars + -> [Id] -- Args + -> StgExpr + -> LambdaFormInfo + -> [StgArg] -- payload + -> FCode (Id, CgIdInfo) + +cgStdRhsClosure bndr cc bndr_info fvs args body lf_info payload + = do -- AHA! A STANDARD-FORM THUNK + { -- LAY OUT THE OBJECT + amodes <- getArgAmodes payload + ; mod_name <- moduleName + ; let (tot_wds, ptr_wds, amodes_w_offsets) + = mkVirtHeapOffsets (isLFThunk lf_info) amodes + + descr = closureDescription mod_name (idName bndr) + closure_info = mkClosureInfo False -- Not static + bndr lf_info tot_wds ptr_wds + NoC_SRT -- No SRT for a std-form closure + descr + + ; (use_cc, blame_cc) <- chooseDynCostCentres cc args body + + -- BUILD THE OBJECT + ; heap_offset <- allocDynClosure closure_info use_cc blame_cc amodes_w_offsets + + -- RETURN + ; returnFC (bndr, heapIdInfo bndr heap_offset lf_info) } +\end{code} + +Here's the general case. + +\begin{code} +cgRhsClosure :: Id + -> CostCentreStack -- Optional cost centre annotation + -> StgBinderInfo + -> SRT + -> [Id] -- Free vars + -> UpdateFlag + -> [Id] -- Args + -> StgExpr + -> FCode (Id, CgIdInfo) + +cgRhsClosure bndr cc bndr_info srt fvs upd_flag args body = do + { -- LAY OUT THE OBJECT + -- If the binder is itself a free variable, then don't store + -- it in the closure. Instead, just bind it to Node on entry. + -- NB we can be sure that Node will point to it, because we + -- havn't told mkClosureLFInfo about this; so if the binder + -- _was_ a free var of its RHS, mkClosureLFInfo thinks it *is* + -- stored in the closure itself, so it will make sure that + -- Node points to it... 
+	let
+	    name         = idName bndr
+	    is_elem      = isIn "cgRhsClosure"
+	    bndr_is_a_fv = bndr `is_elem` fvs
+	    reduced_fvs | bndr_is_a_fv = fvs `minusList` [bndr]
+			| otherwise    = fvs
+
+	; lf_info <- mkClosureLFInfo bndr NotTopLevel fvs upd_flag args
+	; fv_infos <- mapFCs getCgIdInfo reduced_fvs
+	; srt_info <- getSRTInfo name srt
+	; mod_name <- moduleName
+	; let	bind_details :: [(CgIdInfo, VirtualHpOffset)]
+		(tot_wds, ptr_wds, bind_details)
+		   = mkVirtHeapOffsets (isLFThunk lf_info) (map add_rep fv_infos)
+
+		add_rep info = (cgIdInfoArgRep info, info)
+
+		descr = closureDescription mod_name name
+		closure_info = mkClosureInfo False	-- Not static
+					     bndr lf_info tot_wds ptr_wds
+					     srt_info descr
+
+	-- BUILD ITS INFO TABLE AND CODE
+	; forkClosureBody (do
+		{   -- Bind the fvs
+		    let bind_fv (info, offset)
+			  = bindNewToNode (cgIdInfoId info) offset (cgIdInfoLF info)
+		  ; mapCs bind_fv bind_details
+
+		    -- Bind the binder itself, if it is a free var
+		  ; whenC bndr_is_a_fv (bindNewToReg bndr nodeReg lf_info)
+
+		    -- Compile the body
+		  ; closureCodeBody bndr_info closure_info cc args body })
+
+	-- BUILD THE OBJECT
+	; let
+		to_amode (info, offset) = do { amode <- idInfoToAmode info
+					     ; return (amode, offset) }
+	; (use_cc, blame_cc) <- chooseDynCostCentres cc args body
+	; amodes_w_offsets <- mapFCs to_amode bind_details
+	; heap_offset <- allocDynClosure closure_info use_cc blame_cc amodes_w_offsets
+
+	-- RETURN
+	; returnFC (bndr, heapIdInfo bndr heap_offset lf_info) }
+
+
+mkClosureLFInfo :: Id		-- The binder
+		-> TopLevelFlag	-- True if top level
+		-> [Id]		-- Free vars
+		-> UpdateFlag	-- Update flag
+		-> [Id]		-- Args
+		-> FCode LambdaFormInfo
+mkClosureLFInfo bndr top fvs upd_flag args
+  | null args = return (mkLFThunk (idType bndr) top fvs upd_flag)
+  | otherwise = do { arg_descr <- mkArgDescr (idName bndr) args
+		   ; return (mkLFReEntrant top fvs args arg_descr) }
+\end{code}
+
+
+%************************************************************************
+%* *
+\subsection[code-for-closures]{The code for closures}
+%* *
+%************************************************************************
+
+\begin{code}
+closureCodeBody :: StgBinderInfo
+		-> ClosureInfo	   -- Lots of information about this closure
+		-> CostCentreStack -- Optional cost centre attached to closure
+		-> [Id]
+		-> StgExpr
+		-> Code
+\end{code}
+
+There are two main cases for the code for closures.  If there are {\em
+no arguments}, then the closure is a thunk, and not in normal form.
+So it should set up an update frame (if it is shared).
+NB: Thunks cannot have a primitive type!
+
+\begin{code}
+closureCodeBody binder_info cl_info cc [{- No args i.e. thunk -}] body = do
+  { body_absC <- getCgStmts $ do
+	{ tickyEnterThunk cl_info
+	; ldvEnter (CmmReg nodeReg)  -- NB: Node always points when profiling
+	; thunkWrapper cl_info $ do
+		-- We only enter cc after setting up update so
+		-- that cc of enclosing scope will be recorded
+		-- in the update frame.  CAF/DICT functions will be
+		-- subsumed by this enclosing cc.
+	    { enterCostCentre cl_info cc body
+	    ; cgExpr body }
+	}
+
+  ; emitClosureCodeAndInfoTable cl_info [] body_absC }
+\end{code}
+
+If there is /at least one argument/, then this closure is in
+normal form, so there is no need to set up an update frame.
+
+The macros for GrAnSim are produced at the beginning of the
+argSatisfactionCheck (by calling fetchAndReschedule).  At that point,
+information about whether Node points to the closure is available.
-- HWL + +\begin{code} +closureCodeBody binder_info cl_info cc args body + = ASSERT( length args > 0 ) + do { -- Get the current virtual Sp (it might not be zero, + -- eg. if we're compiling a let-no-escape). + vSp <- getVirtSp + ; let (reg_args, other_args) = assignCallRegs (addIdReps args) + (sp_top, stk_args) = mkVirtStkOffsets vSp other_args + + -- Allocate the global ticky counter + ; let ticky_ctr_lbl = mkRednCountsLabel (closureName cl_info) + ; emitTickyCounter cl_info args sp_top + + -- ...and establish the ticky-counter + -- label for this block + ; setTickyCtrLabel ticky_ctr_lbl $ do + + -- Emit the slow-entry code + { reg_save_code <- mkSlowEntryCode cl_info reg_args + + -- Emit the main entry code + ; blks <- forkProc $ + mkFunEntryCode cl_info cc reg_args stk_args + sp_top reg_save_code body + ; emitClosureCodeAndInfoTable cl_info [] blks + }} + + + +mkFunEntryCode :: ClosureInfo + -> CostCentreStack + -> [(Id,GlobalReg)] -- Args in regs + -> [(Id,VirtualSpOffset)] -- Args on stack + -> VirtualSpOffset -- Last allocated word on stack + -> CmmStmts -- Register-save code in case of GC + -> StgExpr + -> Code +-- The main entry code for the closure +mkFunEntryCode cl_info cc reg_args stk_args sp_top reg_save_code body = do + { -- Bind args to regs/stack as appropriate, + -- and record expected position of sps + ; bindArgsToRegs reg_args + ; bindArgsToStack stk_args + ; setRealAndVirtualSp sp_top + + -- Enter the cost-centre, if required + -- ToDo: It's not clear why this is outside the funWrapper, + -- but the tickyEnterFun is inside. Perhaps we can put + -- them together? + ; enterCostCentre cl_info cc body + + -- Do the business + ; funWrapper cl_info reg_args reg_save_code $ do + { tickyEnterFun cl_info + ; cgExpr body } + } +\end{code} + +The "slow entry" code for a function. This entry point takes its +arguments on the stack. It loads the arguments into registers +according to the calling convention, and jumps to the function's +normal entry point. The function's closure is assumed to be in +R1/node. + +The slow entry point is used in two places: + + (a) unknown calls: eg. 
stg_PAP_entry
+  (b) returning from a heap-check failure
+
+\begin{code}
+mkSlowEntryCode :: ClosureInfo -> [(Id,GlobalReg)] -> FCode CmmStmts
+-- If this function doesn't have a specialised ArgDescr, we need
+-- to generate the function's arg bitmap, slow-entry code, and
+-- register-save code for the heap-check failure.
+-- Here, we emit the slow-entry code, and
+-- return the register-save assignments
+mkSlowEntryCode cl_info reg_args
+  | Just (_, ArgGen _) <- closureFunInfo cl_info
+  = do	{ emitSimpleProc slow_lbl (emitStmts load_stmts)
+	; return save_stmts }
+  | otherwise = return noStmts
+  where
+     name = closureName cl_info
+     slow_lbl = mkSlowEntryLabel name
+
+     load_stmts = mkStmts load_assts `plusStmts` mkStmts [stk_adj_pop, jump_to_entry]
+     save_stmts = oneStmt stk_adj_push `plusStmts` mkStmts save_assts
+
+     reps_w_regs :: [(CgRep,GlobalReg)]
+     reps_w_regs = [(idCgRep id, reg) | (id,reg) <- reverse reg_args]
+     (final_stk_offset, stk_offsets)
+	= mapAccumL (\off (rep,_) -> (off + cgRepSizeW rep, off))
+		    0 reps_w_regs
+
+     load_assts = zipWithEqual "mk_load" mk_load reps_w_regs stk_offsets
+     mk_load (rep,reg) offset = CmmAssign (CmmGlobal reg)
+					  (CmmLoad (cmmRegOffW spReg offset)
+						   (argMachRep rep))
+
+     save_assts = zipWithEqual "mk_save" mk_save reps_w_regs stk_offsets
+     mk_save (rep,reg) offset = ASSERT( argMachRep rep == globalRegRep reg )
+				CmmStore (cmmRegOffW spReg offset)
+					 (CmmReg (CmmGlobal reg))
+
+     stk_adj_pop   = CmmAssign spReg (cmmRegOffW spReg final_stk_offset)
+     stk_adj_push  = CmmAssign spReg (cmmRegOffW spReg (- final_stk_offset))
+     jump_to_entry = CmmJump (mkLblExpr (enterLocalIdLabel name)) []
+\end{code}
+
+
+%************************************************************************
+%* *
+\subsubsection[closure-code-wrappers]{Wrappers around closure code}
+%* *
+%************************************************************************
+
+\begin{code}
+thunkWrapper :: ClosureInfo -> Code -> Code
+thunkWrapper closure_info thunk_code = do
+  { let node_points = nodeMustPointToIt (closureLFInfo closure_info)
+
+    -- HWL: insert macros for GrAnSim; 2 versions depending on liveness of node
+    -- (we prefer fetchAndReschedule-style context switches to yield ones)
+  ; if node_points
+    then granFetchAndReschedule [] node_points
+    else granYield		[] node_points
+
+        -- Stack and/or heap checks
+  ; thunkEntryChecks closure_info $ do
+	{	-- Overwrite with black hole if necessary
+	  whenC (blackHoleOnEntry closure_info && node_points)
+		(blackHoleIt closure_info)
+	; setupUpdate closure_info thunk_code }
+		-- setupUpdate *encloses* the thunk_code
+  }
+
+funWrapper :: ClosureInfo	-- Closure whose code body this is
+	   -> [(Id,GlobalReg)]	-- List of argument registers (if any)
+	   -> CmmStmts		-- reg saves for the heap check failure
+	   -> Code		-- Body of function being compiled
+	   -> Code
+funWrapper closure_info arg_regs reg_save_code fun_body = do
+  { let node_points = nodeMustPointToIt (closureLFInfo closure_info)
+
+	-- Enter for Ldv profiling
+  ; whenC node_points (ldvEnter (CmmReg nodeReg))
+
+	-- GranSim yield point
+  ; granYield arg_regs node_points
+
+	-- Heap and/or stack checks wrap the function body
+  ; funEntryChecks closure_info reg_save_code
+		   fun_body
+  }
+\end{code}
+
+
+%************************************************************************
+%* *
+\subsubsubsection[update-and-BHs]{Update and black-hole wrappers}
+%* *
+%************************************************************************
+
+
+\begin{code}
+blackHoleIt :: ClosureInfo -> Code
+-- Only called for closures 
with no args +-- Node points to the closure +blackHoleIt closure_info = emitBlackHoleCode (closureSingleEntry closure_info) + +emitBlackHoleCode :: Bool -> Code +emitBlackHoleCode is_single_entry + | eager_blackholing = do + tickyBlackHole (not is_single_entry) + stmtC (CmmStore (CmmReg nodeReg) (CmmLit (CmmLabel bh_lbl))) + | otherwise = + nopC + where + bh_lbl | is_single_entry = mkRtsDataLabel SLIT("stg_SE_BLACKHOLE_info") + | otherwise = mkRtsDataLabel SLIT("stg_BLACKHOLE_info") + + -- If we wanted to do eager blackholing with slop filling, + -- we'd need to do it at the *end* of a basic block, otherwise + -- we overwrite the free variables in the thunk that we still + -- need. We have a patch for this from Andy Cheadle, but not + -- incorporated yet. --SDM [6/2004] + -- + -- Profiling needs slop filling (to support LDV profiling), so + -- currently eager blackholing doesn't work with profiling. + -- + -- TICKY_TICKY needs EAGER_BLACKHOLING to verify no double-entries of + -- single-entry thunks. + eager_blackholing + | opt_DoTickyProfiling = True + | otherwise = False + +\end{code} + +\begin{code} +setupUpdate :: ClosureInfo -> Code -> Code -- Only called for closures with no args + -- Nota Bene: this function does not change Node (even if it's a CAF), + -- so that the cost centre in the original closure can still be + -- extracted by a subsequent enterCostCentre +setupUpdate closure_info code + | closureReEntrant closure_info + = code + + | not (isStaticClosure closure_info) + = if closureUpdReqd closure_info + then do { tickyPushUpdateFrame; pushUpdateFrame (CmmReg nodeReg) code } + else do { tickyUpdateFrameOmitted; code } + + | otherwise -- A static closure + = do { tickyUpdateBhCaf closure_info + + ; if closureUpdReqd closure_info + then do -- Blackhole the (updatable) CAF: + { upd_closure <- link_caf closure_info True + ; pushUpdateFrame upd_closure code } + else do + { -- No update reqd, you'd think we don't need to + -- black-hole it. But when ticky-ticky is on, we + -- black-hole it regardless, to catch errors in which + -- an allegedly single-entry closure is entered twice + -- + -- We discard the pointer returned by link_caf, because + -- we don't push an update frame + whenC opt_DoTickyProfiling -- Blackhole even a SE CAF + (link_caf closure_info False >> nopC) + ; tickyUpdateFrameOmitted + ; code } + } + + +----------------------------------------------------------------------------- +-- Entering a CAF +-- +-- When a CAF is first entered, it creates a black hole in the heap, +-- and updates itself with an indirection to this new black hole. +-- +-- We update the CAF with an indirection to a newly-allocated black +-- hole in the heap. We also set the blocking queue on the newly +-- allocated black hole to be empty. +-- +-- Why do we make a black hole in the heap when we enter a CAF? +-- +-- - for a generational garbage collector, which needs a fast +-- test for whether an updatee is in an old generation or not +-- +-- - for the parallel system, which can implement updates more +-- easily if the updatee is always in the heap. (allegedly). +-- +-- When debugging, we maintain a separate CAF list so we can tell when +-- a CAF has been garbage collected. + +-- newCAF must be called before the itbl ptr is overwritten, since +-- newCAF records the old itbl ptr in order to do CAF reverting +-- (which Hugs needs to do in order that combined mode works right.) 
+-- + +-- ToDo [Feb 04] This entire link_caf nonsense could all be moved +-- into the "newCAF" RTS procedure, which we call anyway, including +-- the allocation of the black-hole indirection closure. +-- That way, code size would fall, the CAF-handling code would +-- be closer together, and the compiler wouldn't need to know +-- about off_indirectee etc. + +link_caf :: ClosureInfo + -> Bool -- True <=> updatable, False <=> single-entry + -> FCode CmmExpr -- Returns amode for closure to be updated +-- To update a CAF we must allocate a black hole, link the CAF onto the +-- CAF list, then update the CAF to point to the fresh black hole. +-- This function returns the address of the black hole, so it can be +-- updated with the new value when available. The reason for all of this +-- is that we only want to update dynamic heap objects, not static ones, +-- so that generational GC is easier. +link_caf cl_info is_upd = do + { -- Alloc black hole specifying CC_HDR(Node) as the cost centre + ; let use_cc = costCentreFrom (CmmReg nodeReg) + blame_cc = use_cc + ; hp_offset <- allocDynClosure bh_cl_info use_cc blame_cc [] + ; hp_rel <- getHpRelOffset hp_offset + + -- Call the RTS function newCAF to add the CAF to the CafList + -- so that the garbage collector can find them + -- This must be done *before* the info table pointer is overwritten, + -- because the old info table ptr is needed for reversion + ; emitRtsCallWithVols SLIT("newCAF") [(CmmReg nodeReg,PtrHint)] [node] + -- node is live, so save it. + + -- Overwrite the closure with a (static) indirection + -- to the newly-allocated black hole + ; stmtsC [ CmmStore (cmmRegOffW nodeReg off_indirectee) hp_rel + , CmmStore (CmmReg nodeReg) ind_static_info ] + + ; returnFC hp_rel } + where + bh_cl_info :: ClosureInfo + bh_cl_info | is_upd = cafBlackHoleClosureInfo cl_info + | otherwise = seCafBlackHoleClosureInfo cl_info + + ind_static_info :: CmmExpr + ind_static_info = mkLblExpr mkIndStaticInfoLabel + + off_indirectee :: WordOff + off_indirectee = fixedHdrSize + oFFSET_StgInd_indirectee*wORD_SIZE +\end{code} + + +%************************************************************************ +%* * +\subsection[CgClosure-Description]{Profiling Closure Description.} +%* * +%************************************************************************ + +For "global" data constructors the description is simply occurrence +name of the data constructor itself. Otherwise it is determined by +@closureDescription@ from the let binding information. + +\begin{code} +closureDescription :: Module -- Module + -> Name -- Id of closure binding + -> String + -- Not called for StgRhsCon which have global info tables built in + -- CgConTbls.lhs with a description generated from the data constructor +closureDescription mod_name name + = showSDocDump (char '<' <> + (if isExternalName name + then ppr name -- ppr will include the module name prefix + else pprModule mod_name <> char '.' <> ppr name) <> + char '>') + -- showSDocDump, because we want to see the unique on the Name. +\end{code} + diff --git a/compiler/codeGen/CgCon.lhs b/compiler/codeGen/CgCon.lhs new file mode 100644 index 0000000000..bfb55bf46e --- /dev/null +++ b/compiler/codeGen/CgCon.lhs @@ -0,0 +1,457 @@ +% +% (c) The GRASP Project, Glasgow University, 1992-1998 +% +\section[CgCon]{Code generation for constructors} + +This module provides the support code for @StgToAbstractC@ to deal +with {\em constructors} on the RHSs of let(rec)s. See also +@CgClosure@, which deals with closures. 
+ +\begin{code} +module CgCon ( + cgTopRhsCon, buildDynCon, + bindConArgs, bindUnboxedTupleComponents, + cgReturnDataCon, + cgTyCon + ) where + +#include "HsVersions.h" + +import CgMonad +import StgSyn + +import CgBindery ( getArgAmodes, bindNewToNode, + bindArgsToRegs, idInfoToAmode, stableIdInfo, + heapIdInfo, CgIdInfo, bindArgsToStack + ) +import CgStackery ( mkVirtStkOffsets, freeStackSlots, + getRealSp, getVirtSp, setRealAndVirtualSp ) +import CgUtils ( addIdReps, cmmLabelOffW, emitRODataLits, emitDataLits ) +import CgCallConv ( assignReturnRegs ) +import Constants ( mAX_INTLIKE, mIN_INTLIKE, mAX_CHARLIKE, mIN_CHARLIKE ) +import CgHeapery ( allocDynClosure, layOutDynConstr, + layOutStaticConstr, mkStaticClosureFields ) +import CgTailCall ( performReturn, emitKnownConReturnCode, returnUnboxedTuple ) +import CgProf ( mkCCostCentreStack, ldvEnter, curCCS ) +import CgTicky +import CgInfoTbls ( emitClosureCodeAndInfoTable, dataConTagZ ) +import CLabel +import ClosureInfo ( mkConLFInfo, mkLFArgument ) +import CmmUtils ( mkLblExpr ) +import Cmm +import SMRep ( WordOff, CgRep, separateByPtrFollowness, + fixedHdrSize, typeCgRep ) +import CostCentre ( currentOrSubsumedCCS, dontCareCCS, CostCentreStack, + currentCCS ) +import Constants ( mIN_INTLIKE, mAX_INTLIKE, mIN_CHARLIKE, mAX_CHARLIKE ) +import TyCon ( TyCon, tyConDataCons, isEnumerationTyCon, tyConName ) +import DataCon ( DataCon, dataConRepArgTys, isNullaryRepDataCon, + isUnboxedTupleCon, dataConWorkId, + dataConName, dataConRepArity + ) +import Id ( Id, idName, isDeadBinder ) +import Type ( Type ) +import PrelInfo ( maybeCharLikeCon, maybeIntLikeCon ) +import Outputable +import Util ( lengthIs ) +import ListSetOps ( assocMaybe ) +\end{code} + + +%************************************************************************ +%* * +\subsection[toplevel-constructors]{Top-level constructors} +%* * +%************************************************************************ + +\begin{code} +cgTopRhsCon :: Id -- Name of thing bound to this RHS + -> DataCon -- Id + -> [StgArg] -- Args + -> FCode (Id, CgIdInfo) +cgTopRhsCon id con args + = do { + ; hmods <- getHomeModules +#if mingw32_TARGET_OS + -- Windows DLLs have a problem with static cross-DLL refs. + ; ASSERT( not (isDllConApp hmods con args) ) return () +#endif + ; ASSERT( args `lengthIs` dataConRepArity con ) return () + + -- LAY IT OUT + ; amodes <- getArgAmodes args + + ; let + name = idName id + lf_info = mkConLFInfo con + closure_label = mkClosureLabel hmods name + caffy = any stgArgHasCafRefs args + (closure_info, amodes_w_offsets) = layOutStaticConstr hmods con amodes + closure_rep = mkStaticClosureFields + closure_info + dontCareCCS -- Because it's static data + caffy -- Has CAF refs + payload + + payload = map get_lit amodes_w_offsets + get_lit (CmmLit lit, _offset) = lit + get_lit other = pprPanic "CgCon.get_lit" (ppr other) + -- NB1: amodes_w_offsets is sorted into ptrs first, then non-ptrs + -- NB2: all the amodes should be Lits! 
+
+	-- BUILD THE OBJECT
+	; emitDataLits closure_label closure_rep
+
+	-- RETURN
+	; returnFC (id, stableIdInfo id (mkLblExpr closure_label) lf_info) }
+\end{code}
+
+%************************************************************************
+%* *
+%* non-top-level constructors *
+%* *
+%************************************************************************
+\subsection[code-for-constructors]{The code for constructors}
+
+\begin{code}
+buildDynCon :: Id		 -- Name of the thing to which this constr will
+				 -- be bound
+	    -> CostCentreStack	 -- Where to grab cost centre from;
+				 -- current CCS if currentOrSubsumedCCS
+	    -> DataCon		 -- The data constructor
+	    -> [(CgRep,CmmExpr)] -- Its args
+	    -> FCode CgIdInfo	 -- Return details about how to find it
+
+-- We used to pass a boolean indicating whether all the
+-- args were of size zero, so we could use a static
+-- constructor; but I concluded that it just isn't worth it.
+-- Now that I/O uses unboxed tuples, there just aren't any constructors
+-- with all size-zero args.
+--
+-- The reason for having a separate argument, rather than looking at
+-- the addr modes of the args, is that we may be in a "knot", and
+-- premature looking at the args will cause the compiler to black-hole!
+\end{code}
+
+First we deal with the case of zero-arity constructors.  Now, they
+will probably be unfolded, so we don't expect to see this case much,
+if at all, but it does no harm, and sets the scene for characters.
+
+In the case of zero-arity constructors, or, more accurately, those
+which have exclusively size-zero (VoidRep) args, we generate no code
+at all.
+
+\begin{code}
+buildDynCon binder cc con []
+  = do hmods <- getHomeModules
+       returnFC (stableIdInfo binder
+			   (mkLblExpr (mkClosureLabel hmods (dataConName con)))
+			   (mkConLFInfo con))
+\end{code}
+
+The following three paragraphs about @Char@-like and @Int@-like
+closures are obsolete, but I don't understand the details well enough
+to properly word them, sorry. I've changed the treatment of @Char@s to
+be analogous to @Int@s: only a subset is preallocated, because @Char@
+now has 31 bits. Only literals are handled here. -- Qrczak
+
+Now for @Char@-like closures.  We generate an assignment of the
+address of the closure to a temporary.  It would be possible simply to
+generate no code, and record the addressing mode in the environment,
+but we'd have to be careful if the argument wasn't a constant --- so
+for simplicity we just always assign to a temporary.
+
+Last special case: @Int@-like closures.  We only special-case the
+situation in which the argument is a literal in the range
+@mIN_INTLIKE@..@mAX_INTLIKE@.  NB: for @Char@-like closures we can
+work with any old argument, but for @Int@-like ones the argument has
+to be a literal.  Reason: @Char@-like closures have an argument type
+which is guaranteed in range.
+
+Because of this, we can safely return an addressing mode.
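(An illustrative aside, not part of this patch; the concrete numbers below
are made up purely for the example.  If mIN_INTLIKE were -16 and
fixedHdrSize were 1 word, the preallocated closure for the literal 3 would
sit at word offset

	(3 - (-16)) * (1 + 1) = 38

into the static stg_INTLIKE_closure array, since each entry is a header
plus a one-word payload.  The code below computes exactly this kind of
offset and returns it as a static addressing mode.)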
+ +\begin{code} +buildDynCon binder cc con [arg_amode] + | maybeIntLikeCon con + , (_, CmmLit (CmmInt val _)) <- arg_amode + , let val_int = (fromIntegral val) :: Int + , val_int <= mAX_INTLIKE && val_int >= mIN_INTLIKE + = do { let intlike_lbl = mkRtsDataLabel SLIT("stg_INTLIKE_closure") + offsetW = (val_int - mIN_INTLIKE) * (fixedHdrSize + 1) + -- INTLIKE closures consist of a header and one word payload + intlike_amode = CmmLit (cmmLabelOffW intlike_lbl offsetW) + ; returnFC (stableIdInfo binder intlike_amode (mkConLFInfo con)) } + +buildDynCon binder cc con [arg_amode] + | maybeCharLikeCon con + , (_, CmmLit (CmmInt val _)) <- arg_amode + , let val_int = (fromIntegral val) :: Int + , val_int <= mAX_CHARLIKE && val_int >= mIN_CHARLIKE + = do { let charlike_lbl = mkRtsDataLabel SLIT("stg_CHARLIKE_closure") + offsetW = (val_int - mIN_CHARLIKE) * (fixedHdrSize + 1) + -- CHARLIKE closures consist of a header and one word payload + charlike_amode = CmmLit (cmmLabelOffW charlike_lbl offsetW) + ; returnFC (stableIdInfo binder charlike_amode (mkConLFInfo con)) } +\end{code} + +Now the general case. + +\begin{code} +buildDynCon binder ccs con args + = do { + ; hmods <- getHomeModules + ; let + (closure_info, amodes_w_offsets) = layOutDynConstr hmods con args + + ; hp_off <- allocDynClosure closure_info use_cc blame_cc amodes_w_offsets + ; returnFC (heapIdInfo binder hp_off lf_info) } + where + lf_info = mkConLFInfo con + + use_cc -- cost-centre to stick in the object + | currentOrSubsumedCCS ccs = curCCS + | otherwise = CmmLit (mkCCostCentreStack ccs) + + blame_cc = use_cc -- cost-centre on which to blame the alloc (same) +\end{code} + + +%************************************************************************ +%* * +%* constructor-related utility function: * +%* bindConArgs is called from cgAlt of a case * +%* * +%************************************************************************ +\subsection[constructor-utilities]{@bindConArgs@: constructor-related utility} + +@bindConArgs@ $con args$ augments the environment with bindings for the +binders $args$, assuming that we have just returned from a @case@ which +found a $con$. + +\begin{code} +bindConArgs :: DataCon -> [Id] -> Code +bindConArgs con args + = do hmods <- getHomeModules + let + bind_arg (arg, offset) = bindNewToNode arg offset (mkLFArgument arg) + (_, args_w_offsets) = layOutDynConstr hmods con (addIdReps args) + -- + ASSERT(not (isUnboxedTupleCon con)) return () + mapCs bind_arg args_w_offsets +\end{code} + +Unboxed tuples are handled slightly differently - the object is +returned in registers and on the stack instead of the heap. 
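(Again purely illustrative, not part of this patch: with unboxed tuples
enabled via the usual extension flag, a source function such as

	quotRem2 :: Int -> (# Int, Int #)
	quotRem2 n = (# n `quot` 2, n `rem` 2 #)

never builds a heap object for the (# , #) value; the components come back
in return registers, with any overflow going to stack slots, which is what
@bindUnboxedTupleComponents@ below arranges for the case alternative that
consumes them.)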
+ +\begin{code} +bindUnboxedTupleComponents + :: [Id] -- Args + -> FCode ([(Id,GlobalReg)], -- Regs assigned + WordOff, -- Number of pointer stack slots + WordOff, -- Number of non-pointer stack slots + VirtualSpOffset) -- Offset of return address slot + -- (= realSP on entry) + +bindUnboxedTupleComponents args + = do { + vsp <- getVirtSp + ; rsp <- getRealSp + + -- Assign as many components as possible to registers + ; let (reg_args, stk_args) = assignReturnRegs (addIdReps args) + + -- Separate the rest of the args into pointers and non-pointers + (ptr_args, nptr_args) = separateByPtrFollowness stk_args + + -- Allocate the rest on the stack + -- The real SP points to the return address, above which any + -- leftover unboxed-tuple components will be allocated + (ptr_sp, ptr_offsets) = mkVirtStkOffsets rsp ptr_args + (nptr_sp, nptr_offsets) = mkVirtStkOffsets ptr_sp nptr_args + ptrs = ptr_sp - rsp + nptrs = nptr_sp - ptr_sp + + -- The stack pointer points to the last stack-allocated component + ; setRealAndVirtualSp nptr_sp + + -- We have just allocated slots starting at real SP + 1, and set the new + -- virtual SP to the topmost allocated slot. + -- If the virtual SP started *below* the real SP, we've just jumped over + -- some slots that won't be in the free-list, so put them there + -- This commonly happens because we've freed the return-address slot + -- (trimming back the virtual SP), but the real SP still points to that slot + ; freeStackSlots [vsp+1,vsp+2 .. rsp] + + ; bindArgsToRegs reg_args + ; bindArgsToStack ptr_offsets + ; bindArgsToStack nptr_offsets + + ; returnFC (reg_args, ptrs, nptrs, rsp) } +\end{code} + +%************************************************************************ +%* * + Actually generate code for a constructor return +%* * +%************************************************************************ + + +Note: it's the responsibility of the @cgReturnDataCon@ caller to be +sure the @amodes@ passed don't conflict with each other. +\begin{code} +cgReturnDataCon :: DataCon -> [(CgRep, CmmExpr)] -> Code + +cgReturnDataCon con amodes + = ASSERT( amodes `lengthIs` dataConRepArity con ) + do { EndOfBlockInfo _ sequel <- getEndOfBlockInfo + ; case sequel of + CaseAlts _ (Just (alts, deflt_lbl)) bndr _ + -> -- Ho! We know the constructor so we can + -- go straight to the right alternative + case assocMaybe alts (dataConTagZ con) of { + Just join_lbl -> build_it_then (jump_to join_lbl); + Nothing + -- Special case! We're returning a constructor to the default case + -- of an enclosing case. For example: + -- + -- case (case e of (a,b) -> C a b) of + -- D x -> ... + -- y -> ...<returning here!>... + -- + -- In this case, + -- if the default is a non-bind-default (ie does not use y), + -- then we should simply jump to the default join point; + + | isDeadBinder bndr -> performReturn (jump_to deflt_lbl) + | otherwise -> build_it_then (jump_to deflt_lbl) } + + other_sequel -- The usual case + | isUnboxedTupleCon con -> returnUnboxedTuple amodes + | otherwise -> build_it_then (emitKnownConReturnCode con) + } + where + jump_to lbl = stmtC (CmmJump (CmmLit lbl) []) + build_it_then return_code + = do { -- BUILD THE OBJECT IN THE HEAP + -- The first "con" says that the name bound to this + -- closure is "con", which is a bit of a fudge, but it only + -- affects profiling + + -- This Id is also used to get a unique for a + -- temporary variable, if the closure is a CHARLIKE. 
-- funnily enough, this makes the unique always come
+		-- out as '54' :-)
+	  tickyReturnNewCon (length amodes)
+	; idinfo <- buildDynCon (dataConWorkId con) currentCCS con amodes
+	; amode <- idInfoToAmode idinfo
+	; checkedAbsC (CmmAssign nodeReg amode)
+	; performReturn return_code }
+\end{code}
+
+
+%************************************************************************
+%* *
+	Generating static stuff for algebraic data types
+%* *
+%************************************************************************
+
+	[These comments are rather out of date]
+
+\begin{tabular}{lll}
+Info tbls &	     Macro  &	     Kind of constructor \\
+\hline
+info & @CONST_INFO_TABLE@& Zero arity (no info -- compiler uses static closure)\\
+info & @CHARLIKE_INFO_TABLE@& Charlike (no info -- compiler indexes fixed array)\\
+info & @INTLIKE_INFO_TABLE@& Intlike; the one macro generates both info tbls\\
+info & @SPEC_INFO_TABLE@& SPECish, and bigger than or equal to @MIN_UPD_SIZE@\\
+info & @GEN_INFO_TABLE@& GENish (hence bigger than or equal to @MIN_UPD_SIZE@)\\
+\end{tabular}
+
+Possible info tables for constructor con:
+
+\begin{description}
+\item[@_con_info@:]
+Used for dynamically let(rec)-bound occurrences of
+the constructor, and for updates. For constructors
+which are int-like, char-like or nullary, when GC occurs,
+the closure tries to get rid of itself.
+
+\item[@_static_info@:]
+Static occurrences of the constructor
+macro: @STATIC_INFO_TABLE@.
+\end{description}
+
+For zero-arity constructors, \tr{con}, we NO LONGER generate a static closure;
+its place is taken by the top level defn of the constructor.
+
+For charlike and intlike closures there is a fixed array of static
+closures predeclared.
+
+\begin{code}
+cgTyCon :: TyCon -> FCode [Cmm]  -- each constructor gets a separate Cmm
+cgTyCon tycon
+  = do	{ constrs <- mapM (getCmm . cgDataCon) (tyConDataCons tycon)
+
+	    -- Generate a table of static closures for an enumeration type
+	    -- Put the table after the data constructor decls, because the
+	    -- datatype closure table (for enumeration types) refers
+	    -- to (say) PrelBase_$wTrue_closure, which is defined in code_stuff
+	; extra <-
+	   if isEnumerationTyCon tycon then do
+		tbl <- getCmm (emitRODataLits (mkLocalClosureTableLabel
+						(tyConName tycon))
+			   [ CmmLabel (mkLocalClosureLabel (dataConName con))
+			   | con <- tyConDataCons tycon])
+		return [tbl]
+	   else
+		return []
+
+	; return (extra ++ constrs)
+	}
+\end{code}
+
+Generate the entry code, info tables, and (for niladic constructors) the
+static closure, for a constructor.
+
+\begin{code}
+cgDataCon :: DataCon -> Code
+cgDataCon data_con
+  = do	{ -- Don't need any dynamic closure code for zero-arity constructors
+	  hmods <- getHomeModules
+
+	; let
+	    -- To allow the debuggers, interpreters, etc to cope with
+	    -- static data structures (ie those built at compile
+	    -- time), we take care that info-table contains the
+	    -- information we need.
+ (static_cl_info, _) = + layOutStaticConstr hmods data_con arg_reps + + (dyn_cl_info, arg_things) = + layOutDynConstr hmods data_con arg_reps + + emit_info cl_info ticky_code + = do { code_blks <- getCgStmts the_code + ; emitClosureCodeAndInfoTable cl_info [] code_blks } + where + the_code = do { ticky_code + ; ldvEnter (CmmReg nodeReg) + ; body_code } + + arg_reps :: [(CgRep, Type)] + arg_reps = [(typeCgRep ty, ty) | ty <- dataConRepArgTys data_con] + + body_code = do { + -- NB: We don't set CC when entering data (WDP 94/06) + tickyReturnOldCon (length arg_things) + ; performReturn (emitKnownConReturnCode data_con) } + -- noStmts: Ptr to thing already in Node + + ; whenC (not (isNullaryRepDataCon data_con)) + (emit_info dyn_cl_info tickyEnterDynCon) + + -- Dynamic-Closure first, to reduce forward references + ; emit_info static_cl_info tickyEnterStaticCon } + + where +\end{code} diff --git a/compiler/codeGen/CgExpr.hi-boot-5 b/compiler/codeGen/CgExpr.hi-boot-5 new file mode 100644 index 0000000000..588e63f8f1 --- /dev/null +++ b/compiler/codeGen/CgExpr.hi-boot-5 @@ -0,0 +1,3 @@ +__interface CgExpr 1 0 where +__export CgExpr cgExpr; +1 cgExpr :: StgSyn.StgExpr -> CgMonad.Code ; diff --git a/compiler/codeGen/CgExpr.hi-boot-6 b/compiler/codeGen/CgExpr.hi-boot-6 new file mode 100644 index 0000000000..dc2d75cefe --- /dev/null +++ b/compiler/codeGen/CgExpr.hi-boot-6 @@ -0,0 +1,3 @@ +module CgExpr where + +cgExpr :: StgSyn.StgExpr -> CgMonad.Code diff --git a/compiler/codeGen/CgExpr.lhs b/compiler/codeGen/CgExpr.lhs new file mode 100644 index 0000000000..33d72f1608 --- /dev/null +++ b/compiler/codeGen/CgExpr.lhs @@ -0,0 +1,454 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +% $Id: CgExpr.lhs,v 1.62 2005/06/21 10:44:41 simonmar Exp $ +% +%******************************************************** +%* * +\section[CgExpr]{Converting @StgExpr@s} +%* * +%******************************************************** + +\begin{code} +module CgExpr ( cgExpr ) where + +#include "HsVersions.h" + +import Constants ( mAX_SPEC_SELECTEE_SIZE, mAX_SPEC_AP_SIZE ) +import StgSyn +import CgMonad + +import SMRep ( fixedHdrSize, isFollowableArg, CgRep(..), argMachRep, + nonVoidArg, idCgRep, typeCgRep, typeHint, + primRepToCgRep ) +import CoreSyn ( AltCon(..) ) +import CgProf ( emitSetCCC ) +import CgHeapery ( layOutDynConstr ) +import CgBindery ( getArgAmodes, getArgAmode, CgIdInfo, + nukeDeadBindings, addBindC, addBindsC ) +import CgCase ( cgCase, saveVolatileVarsAndRegs ) +import CgClosure ( cgRhsClosure, cgStdRhsClosure ) +import CgCon ( buildDynCon, cgReturnDataCon ) +import CgLetNoEscape ( cgLetNoEscapeClosure ) +import CgCallConv ( dataReturnConvPrim ) +import CgTailCall +import CgInfoTbls ( emitDirectReturnInstr ) +import CgForeignCall ( emitForeignCall, shimForeignCallArg ) +import CgPrimOp ( cgPrimOp ) +import CgUtils ( addIdReps, newTemp, assignTemp, cgLit, tagToClosure ) +import ClosureInfo ( mkSelectorLFInfo, mkApLFInfo ) +import Cmm ( CmmExpr(..), CmmStmt(..), CmmReg, nodeReg ) +import MachOp ( wordRep, MachHint ) +import VarSet +import Literal ( literalType ) +import PrimOp ( primOpOutOfLine, getPrimOpResultInfo, + PrimOp(..), PrimOpResultInfo(..) ) +import Id ( Id ) +import TyCon ( isUnboxedTupleTyCon, isEnumerationTyCon ) +import Type ( Type, tyConAppArgs, tyConAppTyCon, repType, + PrimRep(VoidRep) ) +import Maybes ( maybeToBool ) +import ListSetOps ( assocMaybe ) +import BasicTypes ( RecFlag(..) 
) +import Util ( lengthIs ) +import Outputable +\end{code} + +This module provides the support code for @StgToAbstractC@ to deal +with STG {\em expressions}. See also @CgClosure@, which deals +with closures, and @CgCon@, which deals with constructors. + +\begin{code} +cgExpr :: StgExpr -- input + -> Code -- output +\end{code} + +%******************************************************** +%* * +%* Tail calls * +%* * +%******************************************************** + +``Applications'' mean {\em tail calls}, a service provided by module +@CgTailCall@. This includes literals, which show up as +@(STGApp (StgLitArg 42) [])@. + +\begin{code} +cgExpr (StgApp fun args) = cgTailCall fun args +\end{code} + +%******************************************************** +%* * +%* STG ConApps (for inline versions) * +%* * +%******************************************************** + +\begin{code} +cgExpr (StgConApp con args) + = do { amodes <- getArgAmodes args + ; cgReturnDataCon con amodes } +\end{code} + +Literals are similar to constructors; they return by putting +themselves in an appropriate register and returning to the address on +top of the stack. + +\begin{code} +cgExpr (StgLit lit) + = do { cmm_lit <- cgLit lit + ; performPrimReturn rep (CmmLit cmm_lit) } + where + rep = typeCgRep (literalType lit) +\end{code} + + +%******************************************************** +%* * +%* PrimOps and foreign calls. +%* * +%******************************************************** + +NOTE about "safe" foreign calls: a safe foreign call is never compiled +inline in a case expression. When we see + + case (ccall ...) of { ... } + +We generate a proper return address for the alternatives and push the +stack frame before doing the call, so that in the event that the call +re-enters the RTS the stack is in a sane state. + +\begin{code} +cgExpr (StgOpApp (StgFCallOp fcall _) stg_args res_ty) = do + {- + First, copy the args into temporaries. We're going to push + a return address right before doing the call, so the args + must be out of the way. + -} + reps_n_amodes <- getArgAmodes stg_args + let + -- Get the *non-void* args, and jiggle them with shimForeignCall + arg_exprs = [ shimForeignCallArg stg_arg expr + | (stg_arg, (rep,expr)) <- stg_args `zip` reps_n_amodes, + nonVoidArg rep] + + -- in + arg_tmps <- mapM assignTemp arg_exprs + let + arg_hints = zip arg_tmps (map (typeHint.stgArgType) stg_args) + -- in + {- + Now, allocate some result regs. + -} + (res_reps,res_regs,res_hints) <- newUnboxedTupleRegs res_ty + ccallReturnUnboxedTuple (zip res_reps (map CmmReg res_regs)) $ + emitForeignCall (zip res_regs res_hints) fcall + arg_hints emptyVarSet{-no live vars-} + +-- tagToEnum# is special: we need to pull the constructor out of the table, +-- and perform an appropriate return. + +cgExpr (StgOpApp (StgPrimOp TagToEnumOp) [arg] res_ty) + = ASSERT(isEnumerationTyCon tycon) + do { (_,amode) <- getArgAmode arg + ; amode' <- assignTemp amode -- We're going to use it twice, + -- so save in a temp if non-trivial + ; hmods <- getHomeModules + ; stmtC (CmmAssign nodeReg (tagToClosure hmods tycon amode')) + ; performReturn (emitAlgReturnCode tycon amode') } + where + -- If you're reading this code in the attempt to figure + -- out why the compiler panic'ed here, it is probably because + -- you used tagToEnum# in a non-monomorphic setting, e.g., + -- intToTg :: Enum a => Int -> a ; intToTg (I# x#) = tagToEnum# x# + -- That won't work. 
+ tycon = tyConAppTyCon res_ty + + +cgExpr x@(StgOpApp op@(StgPrimOp primop) args res_ty) + | primOpOutOfLine primop + = tailCallPrimOp primop args + + | ReturnsPrim VoidRep <- result_info + = do cgPrimOp [] primop args emptyVarSet + performReturn emitDirectReturnInstr + + | ReturnsPrim rep <- result_info + = do cgPrimOp [dataReturnConvPrim (primRepToCgRep rep)] + primop args emptyVarSet + performReturn emitDirectReturnInstr + + | ReturnsAlg tycon <- result_info, isUnboxedTupleTyCon tycon + = do (reps, regs, _hints) <- newUnboxedTupleRegs res_ty + cgPrimOp regs primop args emptyVarSet{-no live vars-} + returnUnboxedTuple (zip reps (map CmmReg regs)) + + | ReturnsAlg tycon <- result_info, isEnumerationTyCon tycon + -- c.f. cgExpr (...TagToEnumOp...) + = do tag_reg <- newTemp wordRep + hmods <- getHomeModules + cgPrimOp [tag_reg] primop args emptyVarSet + stmtC (CmmAssign nodeReg (tagToClosure hmods tycon (CmmReg tag_reg))) + performReturn (emitAlgReturnCode tycon (CmmReg tag_reg)) + where + result_info = getPrimOpResultInfo primop +\end{code} + +%******************************************************** +%* * +%* Case expressions * +%* * +%******************************************************** +Case-expression conversion is complicated enough to have its own +module, @CgCase@. +\begin{code} + +cgExpr (StgCase expr live_vars save_vars bndr srt alt_type alts) + = cgCase expr live_vars save_vars bndr srt alt_type alts +\end{code} + + +%******************************************************** +%* * +%* Let and letrec * +%* * +%******************************************************** +\subsection[let-and-letrec-codegen]{Converting @StgLet@ and @StgLetrec@} + +\begin{code} +cgExpr (StgLet (StgNonRec name rhs) expr) + = cgRhs name rhs `thenFC` \ (name, info) -> + addBindC name info `thenC` + cgExpr expr + +cgExpr (StgLet (StgRec pairs) expr) + = fixC (\ new_bindings -> addBindsC new_bindings `thenC` + listFCs [ cgRhs b e | (b,e) <- pairs ] + ) `thenFC` \ new_bindings -> + + addBindsC new_bindings `thenC` + cgExpr expr +\end{code} + +\begin{code} +cgExpr (StgLetNoEscape live_in_whole_let live_in_rhss bindings body) + = do { -- Figure out what volatile variables to save + ; nukeDeadBindings live_in_whole_let + ; (save_assts, rhs_eob_info, maybe_cc_slot) + <- saveVolatileVarsAndRegs live_in_rhss + + -- Save those variables right now! + ; emitStmts save_assts + + -- Produce code for the rhss + -- and add suitable bindings to the environment + ; cgLetNoEscapeBindings live_in_rhss rhs_eob_info + maybe_cc_slot bindings + + -- Do the body + ; setEndOfBlockInfo rhs_eob_info (cgExpr body) } +\end{code} + + +%******************************************************** +%* * +%* SCC Expressions * +%* * +%******************************************************** + +SCC expressions are treated specially. They set the current cost +centre. + +\begin{code} +cgExpr (StgSCC cc expr) = do emitSetCCC cc; cgExpr expr +\end{code} + +%******************************************************** +%* * +%* Non-top-level bindings * +%* * +%******************************************************** +\subsection[non-top-level-bindings]{Converting non-top-level bindings} + +We rely on the support code in @CgCon@ (to do constructors) and +in @CgClosure@ (to do closures). 
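(A hypothetical example, just to fix ideas: in

	let c = Just x           -- constructor RHS, handled via CgCon
	    f = \y -> y + x      -- closure RHS, handled via CgClosure
	in ...

the binding for c shows up as an StgRhsCon and is built by @buildDynCon@,
while the binding for f shows up as an StgRhsClosure and goes through
@cgRhsClosure@.)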
+ +\begin{code} +cgRhs :: Id -> StgRhs -> FCode (Id, CgIdInfo) + -- the Id is passed along so a binding can be set up + +cgRhs name (StgRhsCon maybe_cc con args) + = do { amodes <- getArgAmodes args + ; idinfo <- buildDynCon name maybe_cc con amodes + ; returnFC (name, idinfo) } + +cgRhs name (StgRhsClosure cc bi fvs upd_flag srt args body) + = do hmods <- getHomeModules + mkRhsClosure hmods name cc bi srt fvs upd_flag args body +\end{code} + +mkRhsClosure looks for two special forms of the right-hand side: + a) selector thunks. + b) AP thunks + +If neither happens, it just calls mkClosureLFInfo. You might think +that mkClosureLFInfo should do all this, but it seems wrong for the +latter to look at the structure of an expression + +Selectors +~~~~~~~~~ +We look at the body of the closure to see if it's a selector---turgid, +but nothing deep. We are looking for a closure of {\em exactly} the +form: + +... = [the_fv] \ u [] -> + case the_fv of + con a_1 ... a_n -> a_i + + +\begin{code} +mkRhsClosure hmods bndr cc bi srt + [the_fv] -- Just one free var + upd_flag -- Updatable thunk + [] -- A thunk + body@(StgCase (StgApp scrutinee [{-no args-}]) + _ _ _ _ -- ignore uniq, etc. + (AlgAlt tycon) + [(DataAlt con, params, use_mask, + (StgApp selectee [{-no args-}]))]) + | the_fv == scrutinee -- Scrutinee is the only free variable + && maybeToBool maybe_offset -- Selectee is a component of the tuple + && offset_into_int <= mAX_SPEC_SELECTEE_SIZE -- Offset is small enough + = -- NOT TRUE: ASSERT(is_single_constructor) + -- The simplifier may have statically determined that the single alternative + -- is the only possible case and eliminated the others, even if there are + -- other constructors in the datatype. It's still ok to make a selector + -- thunk in this case, because we *know* which constructor the scrutinee + -- will evaluate to. + cgStdRhsClosure bndr cc bi [the_fv] [] body lf_info [StgVarArg the_fv] + where + lf_info = mkSelectorLFInfo bndr offset_into_int + (isUpdatable upd_flag) + (_, params_w_offsets) = layOutDynConstr hmods con (addIdReps params) + -- Just want the layout + maybe_offset = assocMaybe params_w_offsets selectee + Just the_offset = maybe_offset + offset_into_int = the_offset - fixedHdrSize +\end{code} + +Ap thunks +~~~~~~~~~ + +A more generic AP thunk of the form + + x = [ x_1...x_n ] \.. [] -> x_1 ... x_n + +A set of these is compiled statically into the RTS, so we just use +those. We could extend the idea to thunks where some of the x_i are +global ids (and hence not free variables), but this would entail +generating a larger thunk. It might be an option for non-optimising +compilation, though. + +We only generate an Ap thunk if all the free variables are pointers, +for semi-obvious reasons. + +\begin{code} +mkRhsClosure hmods bndr cc bi srt + fvs + upd_flag + [] -- No args; a thunk + body@(StgApp fun_id args) + + | args `lengthIs` (arity-1) + && all isFollowableArg (map idCgRep fvs) + && isUpdatable upd_flag + && arity <= mAX_SPEC_AP_SIZE + + -- Ha! an Ap thunk + = cgStdRhsClosure bndr cc bi fvs [] body lf_info payload + + where + lf_info = mkApLFInfo bndr upd_flag arity + -- the payload has to be in the correct order, hence we can't + -- just use the fvs. 
+ payload = StgVarArg fun_id : args + arity = length fvs +\end{code} + +The default case +~~~~~~~~~~~~~~~~ +\begin{code} +mkRhsClosure hmods bndr cc bi srt fvs upd_flag args body + = cgRhsClosure bndr cc bi srt fvs upd_flag args body +\end{code} + + +%******************************************************** +%* * +%* Let-no-escape bindings +%* * +%******************************************************** +\begin{code} +cgLetNoEscapeBindings live_in_rhss rhs_eob_info maybe_cc_slot + (StgNonRec binder rhs) + = do { (binder,info) <- cgLetNoEscapeRhs live_in_rhss rhs_eob_info + maybe_cc_slot + NonRecursive binder rhs + ; addBindC binder info } + +cgLetNoEscapeBindings live_in_rhss rhs_eob_info maybe_cc_slot (StgRec pairs) + = do { new_bindings <- fixC (\ new_bindings -> do + { addBindsC new_bindings + ; listFCs [ cgLetNoEscapeRhs full_live_in_rhss + rhs_eob_info maybe_cc_slot Recursive b e + | (b,e) <- pairs ] }) + + ; addBindsC new_bindings } + where + -- We add the binders to the live-in-rhss set so that we don't + -- delete the bindings for the binder from the environment! + full_live_in_rhss = live_in_rhss `unionVarSet` (mkVarSet [b | (b,r) <- pairs]) + +cgLetNoEscapeRhs + :: StgLiveVars -- Live in rhss + -> EndOfBlockInfo + -> Maybe VirtualSpOffset + -> RecFlag + -> Id + -> StgRhs + -> FCode (Id, CgIdInfo) + +cgLetNoEscapeRhs full_live_in_rhss rhs_eob_info maybe_cc_slot rec binder + (StgRhsClosure cc bi _ upd_flag srt args body) + = -- We could check the update flag, but currently we don't switch it off + -- for let-no-escaped things, so we omit the check too! + -- case upd_flag of + -- Updatable -> panic "cgLetNoEscapeRhs" -- Nothing to update! + -- other -> cgLetNoEscapeClosure binder cc bi live_in_whole_let live_in_rhss args body + cgLetNoEscapeClosure binder cc bi srt full_live_in_rhss rhs_eob_info + maybe_cc_slot rec args body + +-- For a constructor RHS we want to generate a single chunk of code which +-- can be jumped to from many places, which will return the constructor. +-- It's easy; just behave as if it was an StgRhsClosure with a ConApp inside! +cgLetNoEscapeRhs full_live_in_rhss rhs_eob_info maybe_cc_slot rec binder + (StgRhsCon cc con args) + = cgLetNoEscapeClosure binder cc noBinderInfo{-safe-} NoSRT + full_live_in_rhss rhs_eob_info maybe_cc_slot rec + [] --No args; the binder is data structure, not a function + (StgConApp con args) +\end{code} + +Little helper for primitives that return unboxed tuples. + +\begin{code} +newUnboxedTupleRegs :: Type -> FCode ([CgRep], [CmmReg], [MachHint]) +newUnboxedTupleRegs res_ty = + let + ty_args = tyConAppArgs (repType res_ty) + (reps,hints) = unzip [ (rep, typeHint ty) | ty <- ty_args, + let rep = typeCgRep ty, + nonVoidArg rep ] + in do + regs <- mapM (newTemp . argMachRep) reps + return (reps,regs,hints) +\end{code} diff --git a/compiler/codeGen/CgExpr.lhs-boot b/compiler/codeGen/CgExpr.lhs-boot new file mode 100644 index 0000000000..29cdc3a605 --- /dev/null +++ b/compiler/codeGen/CgExpr.lhs-boot @@ -0,0 +1,7 @@ +\begin{code} +module CgExpr where +import StgSyn( StgExpr ) +import CgMonad( Code ) + +cgExpr :: StgExpr -> Code +\end{code} diff --git a/compiler/codeGen/CgForeignCall.hs b/compiler/codeGen/CgForeignCall.hs new file mode 100644 index 0000000000..10f41bdf8b --- /dev/null +++ b/compiler/codeGen/CgForeignCall.hs @@ -0,0 +1,256 @@ +----------------------------------------------------------------------------- +-- +-- Code generation for foreign calls. 
+-- +-- (c) The University of Glasgow 2004 +-- +----------------------------------------------------------------------------- + +module CgForeignCall ( + cgForeignCall, + emitForeignCall, + emitForeignCall', + shimForeignCallArg, + emitSaveThreadState, -- will be needed by the Cmm parser + emitLoadThreadState, -- ditto + emitCloseNursery, + emitOpenNursery, + ) where + +#include "HsVersions.h" + +import StgSyn ( StgLiveVars, StgArg, stgArgType ) +import CgProf ( curCCS, curCCSAddr ) +import CgBindery ( getVolatileRegs, getArgAmodes ) +import CgMonad +import CgUtils ( cmmOffsetW, cmmOffsetB, cmmLoadIndexW, newTemp, + assignTemp ) +import Type ( tyConAppTyCon, repType ) +import TysPrim +import CLabel ( mkForeignLabel, mkRtsCodeLabel ) +import Cmm +import CmmUtils +import MachOp +import SMRep +import ForeignCall +import Constants +import StaticFlags ( opt_SccProfilingOn ) +import Outputable + +import Monad ( when ) + +-- ----------------------------------------------------------------------------- +-- Code generation for Foreign Calls + +cgForeignCall + :: [(CmmReg,MachHint)] -- where to put the results + -> ForeignCall -- the op + -> [StgArg] -- arguments + -> StgLiveVars -- live vars, in case we need to save them + -> Code +cgForeignCall results fcall stg_args live + = do + reps_n_amodes <- getArgAmodes stg_args + let + -- Get the *non-void* args, and jiggle them with shimForeignCall + arg_exprs = [ shimForeignCallArg stg_arg expr + | (stg_arg, (rep,expr)) <- stg_args `zip` reps_n_amodes, + nonVoidArg rep] + + arg_hints = zip arg_exprs (map (typeHint.stgArgType) stg_args) + -- in + emitForeignCall results fcall arg_hints live + + +emitForeignCall + :: [(CmmReg,MachHint)] -- where to put the results + -> ForeignCall -- the op + -> [(CmmExpr,MachHint)] -- arguments + -> StgLiveVars -- live vars, in case we need to save them + -> Code + +emitForeignCall results (CCall (CCallSpec target cconv safety)) args live + = do vols <- getVolatileRegs live + emitForeignCall' safety results + (CmmForeignCall cmm_target cconv) call_args (Just vols) + where + (call_args, cmm_target) + = case target of + StaticTarget lbl -> (args, CmmLit (CmmLabel + (mkForeignLabel lbl call_size False))) + DynamicTarget -> case args of (fn,_):rest -> (rest, fn) + + -- in the stdcall calling convention, the symbol needs @size appended + -- to it, where size is the total number of bytes of arguments. We + -- attach this info to the CLabel here, and the CLabel pretty printer + -- will generate the suffix when the label is printed. 
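-- For example (hypothetical import, not part of this module): a stdcall
-- declaration such as
--
--   foreign import stdcall unsafe "SetWindowTextA"
--     c_setWindowText :: Ptr () -> CString -> IO CInt
--
-- passes two pointer-sized arguments, so on a 32-bit target call_size below
-- works out to Just 8 and the symbol is printed as "SetWindowTextA@8".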
+ call_size + | StdCallConv <- cconv = Just (sum (map (arg_size.cmmExprRep.fst) args)) + | otherwise = Nothing + + -- ToDo: this might not be correct for 64-bit API + arg_size rep = max (machRepByteWidth rep) wORD_SIZE + +emitForeignCall results (DNCall _) args live + = panic "emitForeignCall: DNCall" + + +-- alternative entry point, used by CmmParse +emitForeignCall' + :: Safety + -> [(CmmReg,MachHint)] -- where to put the results + -> CmmCallTarget -- the op + -> [(CmmExpr,MachHint)] -- arguments + -> Maybe [GlobalReg] -- live vars, in case we need to save them + -> Code +emitForeignCall' safety results target args vols + | not (playSafe safety) = do + temp_args <- load_args_into_temps args + stmtC (CmmCall target results temp_args vols) + + | otherwise = do + id <- newTemp wordRep + temp_args <- load_args_into_temps args + emitSaveThreadState + stmtC (CmmCall (CmmForeignCall suspendThread CCallConv) + [(id,PtrHint)] + [ (CmmReg (CmmGlobal BaseReg), PtrHint) ] + vols + ) + stmtC (CmmCall target results temp_args vols) + stmtC (CmmCall (CmmForeignCall resumeThread CCallConv) + [ (CmmGlobal BaseReg, PtrHint) ] + -- Assign the result to BaseReg: we + -- might now have a different + -- Capability! + [ (CmmReg id, PtrHint) ] + vols + ) + emitLoadThreadState + + +suspendThread = CmmLit (CmmLabel (mkRtsCodeLabel SLIT("suspendThread"))) +resumeThread = CmmLit (CmmLabel (mkRtsCodeLabel SLIT("resumeThread"))) + + +-- we might need to load arguments into temporaries before +-- making the call, because certain global registers might +-- overlap with registers that the C calling convention uses +-- for passing arguments. +-- +-- This is a HACK; really it should be done in the back end, but +-- it's easier to generate the temporaries here. +load_args_into_temps args = mapM maybe_assignTemp args + +maybe_assignTemp (e, hint) + | hasNoGlobalRegs e = return (e, hint) + | otherwise = do + -- don't use assignTemp, it uses its own notion of "trivial" + -- expressions, which are wrong here + reg <- newTemp (cmmExprRep e) + stmtC (CmmAssign reg e) + return (CmmReg reg, hint) + +-- ----------------------------------------------------------------------------- +-- Save/restore the thread state in the TSO + +-- This stuff can't be done in suspendThread/resumeThread, because it +-- refers to global registers which aren't available in the C world. 
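-- Connecting this to source programs: only calls declared "safe" go through
-- the suspendThread/resumeThread bracket in emitForeignCall' above, together
-- with the thread-state save/load defined just below; an "unsafe" call is a
-- bare CmmCall with no state saving, which is why it must neither block nor
-- re-enter Haskell. The two imports here are hypothetical examples (they
-- would also need Foreign.C.String/Foreign.C.Types in scope), not part of
-- this module.

foreign import ccall safe "stdlib.h getenv"
  c_getenv_demo :: CString -> IO CString

foreign import ccall unsafe "math.h sin"
  c_sin_demo :: CDouble -> CDouble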
+ +emitSaveThreadState = do + -- CurrentTSO->sp = Sp; + stmtC $ CmmStore (cmmOffset stgCurrentTSO tso_SP) stgSp + emitCloseNursery + -- and save the current cost centre stack in the TSO when profiling: + when opt_SccProfilingOn $ + stmtC (CmmStore (cmmOffset stgCurrentTSO tso_CCCS) curCCS) + + -- CurrentNursery->free = Hp+1; +emitCloseNursery = stmtC $ CmmStore nursery_bdescr_free (cmmOffsetW stgHp 1) + +emitLoadThreadState = do + tso <- newTemp wordRep + stmtsC [ + -- tso = CurrentTSO; + CmmAssign tso stgCurrentTSO, + -- Sp = tso->sp; + CmmAssign sp (CmmLoad (cmmOffset (CmmReg tso) tso_SP) + wordRep), + -- SpLim = tso->stack + RESERVED_STACK_WORDS; + CmmAssign spLim (cmmOffsetW (cmmOffset (CmmReg tso) tso_STACK) + rESERVED_STACK_WORDS) + ] + emitOpenNursery + -- and load the current cost centre stack from the TSO when profiling: + when opt_SccProfilingOn $ + stmtC (CmmStore curCCSAddr + (CmmLoad (cmmOffset (CmmReg tso) tso_CCCS) wordRep)) + +emitOpenNursery = stmtsC [ + -- Hp = CurrentNursery->free - 1; + CmmAssign hp (cmmOffsetW (CmmLoad nursery_bdescr_free wordRep) (-1)), + + -- HpLim = CurrentNursery->start + + -- CurrentNursery->blocks*BLOCK_SIZE_W - 1; + CmmAssign hpLim + (cmmOffsetExpr + (CmmLoad nursery_bdescr_start wordRep) + (cmmOffset + (CmmMachOp mo_wordMul [ + CmmMachOp (MO_S_Conv I32 wordRep) + [CmmLoad nursery_bdescr_blocks I32], + CmmLit (mkIntCLit bLOCK_SIZE) + ]) + (-1) + ) + ) + ] + + +nursery_bdescr_free = cmmOffset stgCurrentNursery oFFSET_bdescr_free +nursery_bdescr_start = cmmOffset stgCurrentNursery oFFSET_bdescr_start +nursery_bdescr_blocks = cmmOffset stgCurrentNursery oFFSET_bdescr_blocks + +tso_SP = tsoFieldB oFFSET_StgTSO_sp +tso_STACK = tsoFieldB oFFSET_StgTSO_stack +tso_CCCS = tsoProfFieldB oFFSET_StgTSO_CCCS + +-- The TSO struct has a variable header, and an optional StgTSOProfInfo in +-- the middle. The fields we're interested in are after the StgTSOProfInfo. +tsoFieldB :: ByteOff -> ByteOff +tsoFieldB off + | opt_SccProfilingOn = off + sIZEOF_StgTSOProfInfo + fixedHdrSize * wORD_SIZE + | otherwise = off + fixedHdrSize * wORD_SIZE + +tsoProfFieldB :: ByteOff -> ByteOff +tsoProfFieldB off = off + fixedHdrSize * wORD_SIZE + +stgSp = CmmReg sp +stgHp = CmmReg hp +stgCurrentTSO = CmmReg currentTSO +stgCurrentNursery = CmmReg currentNursery + +sp = CmmGlobal Sp +spLim = CmmGlobal SpLim +hp = CmmGlobal Hp +hpLim = CmmGlobal HpLim +currentTSO = CmmGlobal CurrentTSO +currentNursery = CmmGlobal CurrentNursery + +-- ----------------------------------------------------------------------------- +-- For certain types passed to foreign calls, we adjust the actual +-- value passed to the call. For ByteArray#/Array# we pass the +-- address of the actual array, not the address of the heap object. 
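-- A concrete (hypothetical) case where this shim matters: with the
-- UnliftedFFITypes extension a primitive array can be handed straight to C,
-- e.g.
--
--   foreign import ccall unsafe "string.h memcmp"
--     c_memcmp :: ByteArray# -> ByteArray# -> CSize -> IO CInt
--
-- and the C function expects a pointer to the first data word, so the code
-- generator offsets the argument past the heap-object header
-- (arrWordsHdrSize for byte arrays, arrPtrsHdrSize for boxed arrays), as
-- shimForeignCallArg below does.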
+ +shimForeignCallArg :: StgArg -> CmmExpr -> CmmExpr +shimForeignCallArg arg expr + | tycon == arrayPrimTyCon || tycon == mutableArrayPrimTyCon + = cmmOffsetB expr arrPtrsHdrSize + + | tycon == byteArrayPrimTyCon || tycon == mutableByteArrayPrimTyCon + = cmmOffsetB expr arrWordsHdrSize + + | otherwise = expr + where + -- should be a tycon app, since this is a foreign call + tycon = tyConAppTyCon (repType (stgArgType arg)) diff --git a/compiler/codeGen/CgHeapery.lhs b/compiler/codeGen/CgHeapery.lhs new file mode 100644 index 0000000000..184af904df --- /dev/null +++ b/compiler/codeGen/CgHeapery.lhs @@ -0,0 +1,588 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +% $Id: CgHeapery.lhs,v 1.47 2005/06/21 10:44:41 simonmar Exp $ +% +\section[CgHeapery]{Heap management functions} + +\begin{code} +module CgHeapery ( + initHeapUsage, getVirtHp, setVirtHp, setRealHp, + getHpRelOffset, hpRel, + + funEntryChecks, thunkEntryChecks, + altHeapCheck, unbxTupleHeapCheck, + hpChkGen, hpChkNodePointsAssignSp0, + stkChkGen, stkChkNodePoints, + + layOutDynConstr, layOutStaticConstr, + mkVirtHeapOffsets, mkStaticClosureFields, mkStaticClosure, + + allocDynClosure, emitSetDynHdr + ) where + +#include "HsVersions.h" + +import StgSyn ( AltType(..) ) +import CLabel ( CLabel, mkRtsCodeLabel ) +import CgUtils ( mkWordCLit, cmmRegOffW, cmmOffsetW, + cmmOffsetExprB ) +import CgMonad +import CgProf ( staticProfHdr, profDynAlloc, dynProfHdr ) +import CgTicky ( staticTickyHdr, tickyDynAlloc, tickyAllocHeap ) +import CgParallel ( staticGranHdr, staticParHdr, doGranAllocate ) +import CgStackery ( getFinalStackHW, getRealSp ) +import CgCallConv ( mkRegLiveness ) +import ClosureInfo ( closureSize, staticClosureNeedsLink, + mkConInfo, closureNeedsUpdSpace, + infoTableLabelFromCI, closureLabelFromCI, + nodeMustPointToIt, closureLFInfo, + ClosureInfo ) +import SMRep ( CgRep(..), cgRepSizeW, separateByPtrFollowness, + WordOff, fixedHdrSize, thunkHdrSize, + isVoidArg, primRepToCgRep ) + +import Cmm ( CmmLit(..), CmmStmt(..), CmmExpr(..), GlobalReg(..), + CmmReg(..), hpReg, nodeReg, spReg ) +import MachOp ( mo_wordULt, mo_wordUGt, mo_wordSub ) +import CmmUtils ( mkIntCLit, CmmStmts, noStmts, oneStmt, plusStmts, + mkStmts ) +import Id ( Id ) +import DataCon ( DataCon ) +import TyCon ( tyConPrimRep ) +import CostCentre ( CostCentreStack ) +import Util ( mapAccumL, filterOut ) +import Constants ( wORD_SIZE ) +import Packages ( HomeModules ) +import Outputable + +\end{code} + + +%************************************************************************ +%* * +\subsection[CgUsages-heapery]{Monad things for fiddling with heap usage} +%* * +%************************************************************************ + +The heap always grows upwards, so hpRel is easy + +\begin{code} +hpRel :: VirtualHpOffset -- virtual offset of Hp + -> VirtualHpOffset -- virtual offset of The Thing + -> WordOff -- integer word offset +hpRel hp off = off - hp +\end{code} + +@initHeapUsage@ applies a function to the amount of heap that it uses. +It initialises the heap usage to zeros, and passes on an unchanged +heap usage. + +It is usually a prelude to performing a GC check, so everything must +be in a tidy and consistent state. 
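The reason a fixed point shows up here can be pictured in miniature: the heap
check emitted at the start of a block has to mention the total amount the
block will allocate, which is only known once the rest of the block has been
generated. A self-contained toy model (invented names, plain Haskell rather
than the FCode monad) of the knot that initHeapUsage and fixC tie below:

\begin{code}
-- Not part of the compiler proper; the up-front check refers lazily to the
-- allocation total of the very block being built.
data BlockSketch = BlockSketch
  { checkWords :: Int     -- words requested by the up-front heap check
  , allocWords :: [Int]   -- words allocated by each statement in the block
  }

mkBlockSketch :: [Int] -> BlockSketch
mkBlockSketch xs = blk
  where
    blk = BlockSketch { checkWords = sum (allocWords blk)  -- the *final* total
                      , allocWords = xs }
\end{code}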
+ +rje: Note the slightly suble fixed point behaviour needed here + +\begin{code} +initHeapUsage :: (VirtualHpOffset -> Code) -> Code +initHeapUsage fcode + = do { orig_hp_usage <- getHpUsage + ; setHpUsage initHpUsage + ; fixC (\heap_usage2 -> do + { fcode (heapHWM heap_usage2) + ; getHpUsage }) + ; setHpUsage orig_hp_usage } + +setVirtHp :: VirtualHpOffset -> Code +setVirtHp new_virtHp + = do { hp_usage <- getHpUsage + ; setHpUsage (hp_usage {virtHp = new_virtHp}) } + +getVirtHp :: FCode VirtualHpOffset +getVirtHp + = do { hp_usage <- getHpUsage + ; return (virtHp hp_usage) } + +setRealHp :: VirtualHpOffset -> Code +setRealHp new_realHp + = do { hp_usage <- getHpUsage + ; setHpUsage (hp_usage {realHp = new_realHp}) } + +getHpRelOffset :: VirtualHpOffset -> FCode CmmExpr +getHpRelOffset virtual_offset + = do { hp_usg <- getHpUsage + ; return (cmmRegOffW hpReg (hpRel (realHp hp_usg) virtual_offset)) } +\end{code} + + +%************************************************************************ +%* * + Layout of heap objects +%* * +%************************************************************************ + +\begin{code} +layOutDynConstr, layOutStaticConstr + :: HomeModules + -> DataCon + -> [(CgRep,a)] + -> (ClosureInfo, + [(a,VirtualHpOffset)]) + +layOutDynConstr = layOutConstr False +layOutStaticConstr = layOutConstr True + +layOutConstr is_static hmods data_con args + = (mkConInfo hmods is_static data_con tot_wds ptr_wds, + things_w_offsets) + where + (tot_wds, -- #ptr_wds + #nonptr_wds + ptr_wds, -- #ptr_wds + things_w_offsets) = mkVirtHeapOffsets False{-not a thunk-} args +\end{code} + +@mkVirtHeapOffsets@ always returns boxed things with smaller offsets +than the unboxed things, and furthermore, the offsets in the result +list + +\begin{code} +mkVirtHeapOffsets + :: Bool -- True <=> is a thunk + -> [(CgRep,a)] -- Things to make offsets for + -> (WordOff, -- _Total_ number of words allocated + WordOff, -- Number of words allocated for *pointers* + [(a, VirtualHpOffset)]) + -- Things with their offsets from start of + -- object in order of increasing offset + +-- First in list gets lowest offset, which is initial offset + 1. + +mkVirtHeapOffsets is_thunk things + = let non_void_things = filterOut (isVoidArg . fst) things + (ptrs, non_ptrs) = separateByPtrFollowness non_void_things + (wds_of_ptrs, ptrs_w_offsets) = mapAccumL computeOffset 0 ptrs + (tot_wds, non_ptrs_w_offsets) = mapAccumL computeOffset wds_of_ptrs non_ptrs + in + (tot_wds, wds_of_ptrs, ptrs_w_offsets ++ non_ptrs_w_offsets) + where + hdr_size | is_thunk = thunkHdrSize + | otherwise = fixedHdrSize + + computeOffset wds_so_far (rep, thing) + = (wds_so_far + cgRepSizeW rep, (thing, hdr_size + wds_so_far)) +\end{code} + + +%************************************************************************ +%* * + Lay out a static closure +%* * +%************************************************************************ + +Make a static closure, adding on any extra padding needed for CAFs, +and adding a static link field if necessary. + +\begin{code} +mkStaticClosureFields + :: ClosureInfo + -> CostCentreStack + -> Bool -- Has CAF refs + -> [CmmLit] -- Payload + -> [CmmLit] -- The full closure +mkStaticClosureFields cl_info ccs caf_refs payload + = mkStaticClosure info_lbl ccs payload padding_wds + static_link_field saved_info_field + where + info_lbl = infoTableLabelFromCI cl_info + + -- CAFs must have consistent layout, regardless of whether they + -- are actually updatable or not. 
The layout of a CAF is: + -- + -- 3 saved_info + -- 2 static_link + -- 1 indirectee + -- 0 info ptr + -- + -- the static_link and saved_info fields must always be in the same + -- place. So we use closureNeedsUpdSpace rather than + -- closureUpdReqd here: + + is_caf = closureNeedsUpdSpace cl_info + + padding_wds + | not is_caf = [] + | otherwise = ASSERT(null payload) [mkIntCLit 0] + + static_link_field + | is_caf || staticClosureNeedsLink cl_info = [static_link_value] + | otherwise = [] + + saved_info_field + | is_caf = [mkIntCLit 0] + | otherwise = [] + + -- for a static constructor which has NoCafRefs, we set the + -- static link field to a non-zero value so the garbage + -- collector will ignore it. + static_link_value + | caf_refs = mkIntCLit 0 + | otherwise = mkIntCLit 1 + + +mkStaticClosure :: CLabel -> CostCentreStack -> [CmmLit] + -> [CmmLit] -> [CmmLit] -> [CmmLit] -> [CmmLit] +mkStaticClosure info_lbl ccs payload padding_wds static_link_field saved_info_field + = [CmmLabel info_lbl] + ++ variable_header_words + ++ payload + ++ padding_wds + ++ static_link_field + ++ saved_info_field + where + variable_header_words + = staticGranHdr + ++ staticParHdr + ++ staticProfHdr ccs + ++ staticTickyHdr +\end{code} + +%************************************************************************ +%* * +\subsection[CgHeapery-heap-overflow]{Heap overflow checking} +%* * +%************************************************************************ + +The new code for heapChecks. For GrAnSim the code for doing a heap check +and doing a context switch has been separated. Especially, the HEAP_CHK +macro only performs a heap check. THREAD_CONTEXT_SWITCH should be used for +doing a context switch. GRAN_FETCH_AND_RESCHEDULE must be put at the +beginning of every slow entry code in order to simulate the fetching of +closures. If fetching is necessary (i.e. current closure is not local) then +an automatic context switch is done. + +-------------------------------------------------------------- +A heap/stack check at a function or thunk entry point. + +\begin{code} +funEntryChecks :: ClosureInfo -> CmmStmts -> Code -> Code +funEntryChecks cl_info reg_save_code code + = hpStkCheck cl_info True reg_save_code code + +thunkEntryChecks :: ClosureInfo -> Code -> Code +thunkEntryChecks cl_info code + = hpStkCheck cl_info False noStmts code + +hpStkCheck :: ClosureInfo -- Function closure + -> Bool -- Is a function? (not a thunk) + -> CmmStmts -- Register saves + -> Code + -> Code + +hpStkCheck cl_info is_fun reg_save_code code + = getFinalStackHW $ \ spHw -> do + { sp <- getRealSp + ; let stk_words = spHw - sp + ; initHeapUsage $ \ hpHw -> do + { -- Emit heap checks, but be sure to do it lazily so + -- that the conditionals on hpHw don't cause a black hole + codeOnly $ do + { do_checks stk_words hpHw full_save_code rts_label + ; tickyAllocHeap hpHw } + ; setRealHp hpHw + ; code } + } + where + node_asst + | nodeMustPointToIt (closureLFInfo cl_info) + = noStmts + | otherwise + = oneStmt (CmmAssign nodeReg (CmmLit (CmmLabel closure_lbl))) + closure_lbl = closureLabelFromCI cl_info + + full_save_code = node_asst `plusStmts` reg_save_code + + rts_label | is_fun = CmmReg (CmmGlobal GCFun) + -- Function entry point + | otherwise = CmmReg (CmmGlobal GCEnter1) + -- Thunk or case return + -- In the thunk/case-return case, R1 points to a closure + -- which should be (re)-entered after GC +\end{code} + +Heap checks in a case alternative are nice and easy, provided this is +a bog-standard algebraic case. 
We have in our hand: + + * one return address, on the stack, + * one return value, in Node. + +the canned code for this heap check failure just pushes Node on the +stack, saying 'EnterGHC' to return. The scheduler will return by +entering the top value on the stack, which in turn will return through +the return address, getting us back to where we were. This is +therefore only valid if the return value is *lifted* (just being +boxed isn't good enough). + +For primitive returns, we have an unlifted value in some register +(either R1 or FloatReg1 or DblReg1). This means using specialised +heap-check code for these cases. + +\begin{code} +altHeapCheck + :: AltType -- PolyAlt, PrimAlt, AlgAlt, but *not* UbxTupAlt + -- (Unboxed tuples are dealt with by ubxTupleHeapCheck) + -> Code -- Continuation + -> Code +altHeapCheck alt_type code + = initHeapUsage $ \ hpHw -> do + { codeOnly $ do + { do_checks 0 {- no stack chk -} hpHw + noStmts {- nothign to save -} + (rts_label alt_type) + ; tickyAllocHeap hpHw } + ; setRealHp hpHw + ; code } + where + rts_label PolyAlt = CmmLit (CmmLabel (mkRtsCodeLabel SLIT( "stg_gc_unpt_r1"))) + -- Do *not* enter R1 after a heap check in + -- a polymorphic case. It might be a function + -- and the entry code for a function (currently) + -- applies it + -- + -- However R1 is guaranteed to be a pointer + + rts_label (AlgAlt tc) = stg_gc_enter1 + -- Enter R1 after the heap check; it's a pointer + + rts_label (PrimAlt tc) + = CmmLit $ CmmLabel $ + case primRepToCgRep (tyConPrimRep tc) of + VoidArg -> mkRtsCodeLabel SLIT( "stg_gc_noregs") + FloatArg -> mkRtsCodeLabel SLIT( "stg_gc_f1") + DoubleArg -> mkRtsCodeLabel SLIT( "stg_gc_d1") + LongArg -> mkRtsCodeLabel SLIT( "stg_gc_l1") + -- R1 is boxed but unlifted: + PtrArg -> mkRtsCodeLabel SLIT( "stg_gc_unpt_r1") + -- R1 is unboxed: + NonPtrArg -> mkRtsCodeLabel SLIT( "stg_gc_unbx_r1") + + rts_label (UbxTupAlt _) = panic "altHeapCheck" +\end{code} + + +Unboxed tuple alternatives and let-no-escapes (the two most annoying +constructs to generate code for!) For unboxed tuple returns, there +are an arbitrary number of possibly unboxed return values, some of +which will be in registers, and the others will be on the stack. We +always organise the stack-resident fields into pointers & +non-pointers, and pass the number of each to the heap check code. + +\begin{code} +unbxTupleHeapCheck + :: [(Id, GlobalReg)] -- Live registers + -> WordOff -- no. of stack slots containing ptrs + -> WordOff -- no. of stack slots containing nonptrs + -> CmmStmts -- code to insert in the failure path + -> Code + -> Code + +unbxTupleHeapCheck regs ptrs nptrs fail_code code + -- We can't manage more than 255 pointers/non-pointers + -- in a generic heap check. + | ptrs > 255 || nptrs > 255 = panic "altHeapCheck" + | otherwise + = initHeapUsage $ \ hpHw -> do + { codeOnly $ do { do_checks 0 {- no stack check -} hpHw + full_fail_code rts_label + ; tickyAllocHeap hpHw } + ; setRealHp hpHw + ; code } + where + full_fail_code = fail_code `plusStmts` oneStmt assign_liveness + assign_liveness = CmmAssign (CmmGlobal (VanillaReg 9)) -- Ho ho ho! + (CmmLit (mkWordCLit liveness)) + liveness = mkRegLiveness regs ptrs nptrs + rts_label = CmmLit (CmmLabel (mkRtsCodeLabel SLIT("stg_gc_ut"))) + +\end{code} + + +%************************************************************************ +%* * + Heap/Stack Checks. 
+%* * +%************************************************************************ + +When failing a check, we save a return address on the stack and +jump to a pre-compiled code fragment that saves the live registers +and returns to the scheduler. + +The return address in most cases will be the beginning of the basic +block in which the check resides, since we need to perform the check +again on re-entry because someone else might have stolen the resource +in the meantime. + +\begin{code} +do_checks :: WordOff -- Stack headroom + -> WordOff -- Heap headroom + -> CmmStmts -- Assignments to perform on failure + -> CmmExpr -- Rts address to jump to on failure + -> Code +do_checks 0 0 _ _ = nopC +do_checks stk hp reg_save_code rts_lbl + = do_checks' (CmmLit (mkIntCLit (stk*wORD_SIZE))) + (CmmLit (mkIntCLit (hp*wORD_SIZE))) + (stk /= 0) (hp /= 0) reg_save_code rts_lbl + +-- The offsets are now in *bytes* +do_checks' stk_expr hp_expr stk_nonzero hp_nonzero reg_save_code rts_lbl + = do { doGranAllocate hp_expr + + -- Emit a block for the heap-check-failure code + ; blk_id <- forkLabelledCode $ do + { whenC hp_nonzero $ + stmtC (CmmAssign (CmmGlobal HpAlloc) hp_expr) + ; emitStmts reg_save_code + ; stmtC (CmmJump rts_lbl []) } + + -- Check for stack overflow *FIRST*; otherwise + -- we might bumping Hp and then failing stack oflo + ; whenC stk_nonzero + (stmtC (CmmCondBranch stk_oflo blk_id)) + + ; whenC hp_nonzero + (stmtsC [CmmAssign hpReg + (cmmOffsetExprB (CmmReg hpReg) hp_expr), + CmmCondBranch hp_oflo blk_id]) + -- Bump heap pointer, and test for heap exhaustion + -- Note that we don't move the heap pointer unless the + -- stack check succeeds. Otherwise we might end up + -- with slop at the end of the current block, which can + -- confuse the LDV profiler. + } + where + -- Stk overflow if (Sp - stk_bytes < SpLim) + stk_oflo = CmmMachOp mo_wordULt + [CmmMachOp mo_wordSub [CmmReg spReg, stk_expr], + CmmReg (CmmGlobal SpLim)] + + -- Hp overflow if (Hpp > HpLim) + -- (Hp has been incremented by now) + -- HpLim points to the LAST WORD of valid allocation space. + hp_oflo = CmmMachOp mo_wordUGt + [CmmReg hpReg, CmmReg (CmmGlobal HpLim)] +\end{code} + +%************************************************************************ +%* * + Generic Heap/Stack Checks - used in the RTS +%* * +%************************************************************************ + +\begin{code} +hpChkGen :: CmmExpr -> CmmExpr -> CmmExpr -> Code +hpChkGen bytes liveness reentry + = do_checks' (CmmLit (mkIntCLit 0)) bytes False True assigns stg_gc_gen + where + assigns = mkStmts [ + CmmAssign (CmmGlobal (VanillaReg 9)) liveness, + CmmAssign (CmmGlobal (VanillaReg 10)) reentry + ] + +-- a heap check where R1 points to the closure to enter on return, and +-- we want to assign to Sp[0] on failure (used in AutoApply.cmm:BUILD_PAP). 
+hpChkNodePointsAssignSp0 :: CmmExpr -> CmmExpr -> Code +hpChkNodePointsAssignSp0 bytes sp0 + = do_checks' (CmmLit (mkIntCLit 0)) bytes False True assign stg_gc_enter1 + where assign = oneStmt (CmmStore (CmmReg spReg) sp0) + +stkChkGen :: CmmExpr -> CmmExpr -> CmmExpr -> Code +stkChkGen bytes liveness reentry + = do_checks' bytes (CmmLit (mkIntCLit 0)) True False assigns stg_gc_gen + where + assigns = mkStmts [ + CmmAssign (CmmGlobal (VanillaReg 9)) liveness, + CmmAssign (CmmGlobal (VanillaReg 10)) reentry + ] + +stkChkNodePoints :: CmmExpr -> Code +stkChkNodePoints bytes + = do_checks' bytes (CmmLit (mkIntCLit 0)) True False noStmts stg_gc_enter1 + +stg_gc_gen = CmmLit (CmmLabel (mkRtsCodeLabel SLIT("stg_gc_gen"))) +stg_gc_enter1 = CmmReg (CmmGlobal GCEnter1) +\end{code} + +%************************************************************************ +%* * +\subsection[initClosure]{Initialise a dynamic closure} +%* * +%************************************************************************ + +@allocDynClosure@ puts the thing in the heap, and modifies the virtual Hp +to account for this. + +\begin{code} +allocDynClosure + :: ClosureInfo + -> CmmExpr -- Cost Centre to stick in the object + -> CmmExpr -- Cost Centre to blame for this alloc + -- (usually the same; sometimes "OVERHEAD") + + -> [(CmmExpr, VirtualHpOffset)] -- Offsets from start of the object + -- ie Info ptr has offset zero. + -> FCode VirtualHpOffset -- Returns virt offset of object + +allocDynClosure cl_info use_cc blame_cc amodes_with_offsets + = do { virt_hp <- getVirtHp + + -- FIND THE OFFSET OF THE INFO-PTR WORD + ; let info_offset = virt_hp + 1 + -- info_offset is the VirtualHpOffset of the first + -- word of the new object + -- Remember, virtHp points to last allocated word, + -- ie 1 *before* the info-ptr word of new object. + + info_ptr = CmmLit (CmmLabel (infoTableLabelFromCI cl_info)) + hdr_w_offsets = initDynHdr info_ptr use_cc `zip` [0..] + + -- SAY WHAT WE ARE ABOUT TO DO + ; profDynAlloc cl_info use_cc + -- ToDo: This is almost certainly wrong + -- We're ignoring blame_cc. But until we've + -- fixed the boxing hack in chooseDynCostCentres etc, + -- we're worried about making things worse by "fixing" + -- this part to use blame_cc! + + ; tickyDynAlloc cl_info + + -- ALLOCATE THE OBJECT + ; base <- getHpRelOffset info_offset + ; hpStore base (hdr_w_offsets ++ amodes_with_offsets) + + -- BUMP THE VIRTUAL HEAP POINTER + ; setVirtHp (virt_hp + closureSize cl_info) + + -- RETURN PTR TO START OF OBJECT + ; returnFC info_offset } + + +initDynHdr :: CmmExpr + -> CmmExpr -- Cost centre to put in object + -> [CmmExpr] +initDynHdr info_ptr cc + = [info_ptr] + -- ToDo: Gransim stuff + -- ToDo: Parallel stuff + ++ dynProfHdr cc + -- No ticky header + +hpStore :: CmmExpr -> [(CmmExpr, VirtualHpOffset)] -> Code +-- Store the item (expr,off) in base[off] +hpStore base es + = stmtsC [ CmmStore (cmmOffsetW base off) val + | (val, off) <- es ] + +emitSetDynHdr :: CmmExpr -> CmmExpr -> CmmExpr -> Code +emitSetDynHdr base info_ptr ccs + = hpStore base (zip (initDynHdr info_ptr ccs) [0..]) +\end{code} diff --git a/compiler/codeGen/CgInfoTbls.hs b/compiler/codeGen/CgInfoTbls.hs new file mode 100644 index 0000000000..b769950d87 --- /dev/null +++ b/compiler/codeGen/CgInfoTbls.hs @@ -0,0 +1,591 @@ +----------------------------------------------------------------------------- +-- +-- Building info tables. 
+-- +-- (c) The University of Glasgow 2004 +-- +----------------------------------------------------------------------------- + +module CgInfoTbls ( + emitClosureCodeAndInfoTable, + emitInfoTableAndCode, + dataConTagZ, + getSRTInfo, + emitDirectReturnTarget, emitAlgReturnTarget, + emitDirectReturnInstr, emitVectoredReturnInstr, + mkRetInfoTable, + mkStdInfoTable, + stdInfoTableSizeB, + mkFunGenInfoExtraBits, + entryCode, closureInfoPtr, + getConstrTag, + infoTable, infoTableClosureType, + infoTablePtrs, infoTableNonPtrs, + funInfoTable, + retVec + ) where + + +#include "HsVersions.h" + +import ClosureInfo ( ClosureInfo, closureTypeDescr, closureName, + infoTableLabelFromCI, Liveness, + closureValDescr, closureSRT, closureSMRep, + closurePtrsSize, closureNonHdrSize, closureFunInfo, + C_SRT(..), needsSRT, isConstrClosure_maybe, + ArgDescr(..) ) +import SMRep ( StgHalfWord, hALF_WORD_SIZE_IN_BITS, hALF_WORD_SIZE, + WordOff, ByteOff, + smRepClosureTypeInt, tablesNextToCode, + rET_BIG, rET_SMALL, rET_VEC_BIG, rET_VEC_SMALL ) +import CgBindery ( getLiveStackSlots ) +import CgCallConv ( isBigLiveness, mkLivenessCLit, buildContLiveness, + argDescrType, getSequelAmode, + CtrlReturnConvention(..) ) +import CgUtils ( mkStringCLit, packHalfWordsCLit, mkWordCLit, + cmmOffsetB, cmmOffsetExprW, cmmLabelOffW, cmmOffsetW, + emitDataLits, emitRODataLits, emitSwitch, cmmNegate, + newTemp ) +import CgMonad + +import CmmUtils ( mkIntCLit, zeroCLit ) +import Cmm ( CmmStmt(..), CmmExpr(..), CmmLit(..), LocalReg, + CmmBasicBlock, nodeReg ) +import MachOp ( MachOp(..), wordRep, halfWordRep ) +import CLabel +import StgSyn ( SRT(..) ) +import Name ( Name ) +import DataCon ( DataCon, dataConTag, fIRST_TAG ) +import Unique ( Uniquable(..) ) +import DynFlags ( DynFlags(..), HscTarget(..) ) +import StaticFlags ( opt_SccProfilingOn ) +import ListSetOps ( assocDefault ) +import Maybes ( isJust ) +import Constants ( wORD_SIZE, sIZEOF_StgFunInfoExtraRev ) +import Outputable + + +------------------------------------------------------------------------- +-- +-- Generating the info table and code for a closure +-- +------------------------------------------------------------------------- + +-- Here we make a concrete info table, represented as a list of CmmAddr +-- (it can't be simply a list of Word, because the SRT field is +-- represented by a label+offset expression). 
+ +-- With tablesNextToCode, the layout is +-- <reversed variable part> +-- <normal forward StgInfoTable, but without +-- an entry point at the front> +-- <code> +-- +-- Without tablesNextToCode, the layout of an info table is +-- <entry label> +-- <normal forward rest of StgInfoTable> +-- <forward variable part> +-- +-- See includes/InfoTables.h + +emitClosureCodeAndInfoTable :: ClosureInfo -> [LocalReg] -> CgStmts -> Code +emitClosureCodeAndInfoTable cl_info args body + = do { ty_descr_lit <- + if opt_SccProfilingOn + then mkStringCLit (closureTypeDescr cl_info) + else return (mkIntCLit 0) + ; cl_descr_lit <- + if opt_SccProfilingOn + then mkStringCLit cl_descr_string + else return (mkIntCLit 0) + ; let std_info = mkStdInfoTable ty_descr_lit cl_descr_lit + cl_type srt_len layout_lit + + ; blks <- cgStmtsToBlocks body + ; emitInfoTableAndCode info_lbl std_info extra_bits args blks } + where + info_lbl = infoTableLabelFromCI cl_info + + cl_descr_string = closureValDescr cl_info + cl_type = smRepClosureTypeInt (closureSMRep cl_info) + + srt = closureSRT cl_info + needs_srt = needsSRT srt + + mb_con = isConstrClosure_maybe cl_info + is_con = isJust mb_con + + (srt_label,srt_len) + = case mb_con of + Just con -> -- Constructors don't have an SRT + -- We keep the *zero-indexed* tag in the srt_len + -- field of the info table. + (mkIntCLit 0, fromIntegral (dataConTagZ con)) + + Nothing -> -- Not a constructor + srtLabelAndLength srt info_lbl + + ptrs = closurePtrsSize cl_info + nptrs = size - ptrs + size = closureNonHdrSize cl_info + layout_lit = packHalfWordsCLit ptrs nptrs + + extra_bits + | is_fun = fun_extra_bits + | is_con = [] + | needs_srt = [srt_label] + | otherwise = [] + + maybe_fun_stuff = closureFunInfo cl_info + is_fun = isJust maybe_fun_stuff + (Just (arity, arg_descr)) = maybe_fun_stuff + + fun_extra_bits + | ArgGen liveness <- arg_descr + = [ fun_amode, + srt_label, + makeRelativeRefTo info_lbl $ mkLivenessCLit liveness, + slow_entry ] + | needs_srt = [fun_amode, srt_label] + | otherwise = [fun_amode] + + slow_entry = makeRelativeRefTo info_lbl (CmmLabel slow_entry_label) + slow_entry_label = mkSlowEntryLabel (closureName cl_info) + + fun_amode = packHalfWordsCLit fun_type arity + fun_type = argDescrType arg_descr + +-- We keep the *zero-indexed* tag in the srt_len field of the info +-- table of a data constructor. +dataConTagZ :: DataCon -> ConTagZ +dataConTagZ con = dataConTag con - fIRST_TAG + +-- A low-level way to generate the variable part of a fun-style info table. +-- (must match fun_extra_bits above). Used by the C-- parser. 
+mkFunGenInfoExtraBits :: Int -> Int -> CmmLit -> CmmLit -> CmmLit -> [CmmLit] +mkFunGenInfoExtraBits fun_type arity srt_label liveness slow_entry + = [ packHalfWordsCLit fun_type arity, + srt_label, + liveness, + slow_entry ] + +------------------------------------------------------------------------- +-- +-- Generating the info table and code for a return point +-- +------------------------------------------------------------------------- + +-- Here's the layout of a return-point info table +-- +-- Tables next to code: +-- +-- <reversed vector table> +-- <srt slot> +-- <standard info table> +-- ret-addr --> <entry code (if any)> +-- +-- Not tables-next-to-code: +-- +-- ret-addr --> <ptr to entry code> +-- <standard info table> +-- <srt slot> +-- <forward vector table> +-- +-- * The vector table is only present for vectored returns +-- +-- * The SRT slot is only there if either +-- (a) there is SRT info to record, OR +-- (b) if the return is vectored +-- The latter (b) is necessary so that the vector is in a +-- predictable place + +vectorSlot :: CmmExpr -> CmmExpr -> CmmExpr +-- Get the vector slot from the info pointer +vectorSlot info_amode zero_indexed_tag + | tablesNextToCode + = cmmOffsetExprW (cmmOffsetW info_amode (- (stdInfoTableSizeW + 2))) + (cmmNegate zero_indexed_tag) + -- The "2" is one for the SRT slot, and one more + -- to get to the first word of the vector + + | otherwise + = cmmOffsetExprW (cmmOffsetW info_amode (stdInfoTableSizeW + 2)) + zero_indexed_tag + -- The "2" is one for the entry-code slot and one for the SRT slot + +retVec :: CmmExpr -> CmmExpr -> CmmExpr +-- Get a return vector from the info pointer +retVec info_amode zero_indexed_tag + = let slot = vectorSlot info_amode zero_indexed_tag + tableEntry = CmmLoad slot wordRep + in if tablesNextToCode + then CmmMachOp (MO_Add wordRep) [tableEntry, info_amode] + else tableEntry + +emitReturnTarget + :: Name + -> CgStmts -- The direct-return code (if any) + -- (empty for vectored returns) + -> [CmmLit] -- Vector of return points + -- (empty for non-vectored returns) + -> SRT + -> FCode CLabel +emitReturnTarget name stmts vector srt + = do { live_slots <- getLiveStackSlots + ; liveness <- buildContLiveness name live_slots + ; srt_info <- getSRTInfo name srt + + ; let + cl_type = case (null vector, isBigLiveness liveness) of + (True, True) -> rET_BIG + (True, False) -> rET_SMALL + (False, True) -> rET_VEC_BIG + (False, False) -> rET_VEC_SMALL + + (std_info, extra_bits) = + mkRetInfoTable info_lbl liveness srt_info cl_type vector + + ; blks <- cgStmtsToBlocks stmts + ; emitInfoTableAndCode info_lbl std_info extra_bits args blks + ; return info_lbl } + where + args = {- trace "emitReturnTarget: missing args" -} [] + uniq = getUnique name + info_lbl = mkReturnInfoLabel uniq + + +mkRetInfoTable + :: CLabel -- info label + -> Liveness -- liveness + -> C_SRT -- SRT Info + -> Int -- type (eg. 
rET_SMALL) + -> [CmmLit] -- vector + -> ([CmmLit],[CmmLit]) +mkRetInfoTable info_lbl liveness srt_info cl_type vector + = (std_info, extra_bits) + where + (srt_label, srt_len) = srtLabelAndLength srt_info info_lbl + + srt_slot | need_srt = [srt_label] + | otherwise = [] + + need_srt = needsSRT srt_info || not (null vector) + -- If there's a vector table then we must allocate + -- an SRT slot, so that the vector table is at a + -- known offset from the info pointer + + liveness_lit = makeRelativeRefTo info_lbl $ mkLivenessCLit liveness + std_info = mkStdInfoTable zeroCLit zeroCLit cl_type srt_len liveness_lit + extra_bits = srt_slot ++ map (makeRelativeRefTo info_lbl) vector + + +emitDirectReturnTarget + :: Name + -> CgStmts -- The direct-return code + -> SRT + -> FCode CLabel +emitDirectReturnTarget name code srt + = emitReturnTarget name code [] srt + +emitAlgReturnTarget + :: Name -- Just for its unique + -> [(ConTagZ, CgStmts)] -- Tagged branches + -> Maybe CgStmts -- Default branch (if any) + -> SRT -- Continuation's SRT + -> CtrlReturnConvention + -> FCode (CLabel, SemiTaggingStuff) + +emitAlgReturnTarget name branches mb_deflt srt ret_conv + = case ret_conv of + UnvectoredReturn fam_sz -> do + { blks <- getCgStmts $ + emitSwitch tag_expr branches mb_deflt 0 (fam_sz - 1) + -- NB: tag_expr is zero-based + ; lbl <- emitDirectReturnTarget name blks srt + ; return (lbl, Nothing) } + -- Nothing: the internal branches in the switch don't have + -- global labels, so we can't use them at the 'call site' + + VectoredReturn fam_sz -> do + { let tagged_lbls = zip (map fst branches) $ + map (CmmLabel . mkAltLabel uniq . fst) branches + deflt_lbl | isJust mb_deflt = CmmLabel $ mkDefaultLabel uniq + | otherwise = mkIntCLit 0 + ; let vector = [ assocDefault deflt_lbl tagged_lbls i + | i <- [0..fam_sz-1]] + ; lbl <- emitReturnTarget name noCgStmts vector srt + ; mapFCs emit_alt branches + ; emit_deflt mb_deflt + ; return (lbl, Just (tagged_lbls, deflt_lbl)) } + where + uniq = getUnique name + tag_expr = getConstrTag (CmmReg nodeReg) + + emit_alt :: (Int, CgStmts) -> FCode (Int, CmmLit) + -- Emit the code for the alternative as a top-level + -- code block returning a label for it + emit_alt (tag, stmts) = do { let lbl = mkAltLabel uniq tag + ; blks <- cgStmtsToBlocks stmts + ; emitProc [] lbl [] blks + ; return (tag, CmmLabel lbl) } + + emit_deflt (Just stmts) = do { let lbl = mkDefaultLabel uniq + ; blks <- cgStmtsToBlocks stmts + ; emitProc [] lbl [] blks + ; return (CmmLabel lbl) } + emit_deflt Nothing = return (mkIntCLit 0) + -- Nothing case: the simplifier might have eliminated a case + -- so we may have e.g. case xs of + -- [] -> e + -- In that situation the default should never be taken, + -- so we just use a NULL pointer + +-------------------------------- +emitDirectReturnInstr :: Code +emitDirectReturnInstr + = do { info_amode <- getSequelAmode + ; stmtC (CmmJump (entryCode info_amode) []) } + +emitVectoredReturnInstr :: CmmExpr -- _Zero-indexed_ constructor tag + -> Code +emitVectoredReturnInstr zero_indexed_tag + = do { info_amode <- getSequelAmode + -- HACK! assign info_amode to a temp, because retVec + -- uses it twice and the NCG doesn't have any CSE yet. + -- Only do this for the NCG, because gcc is too stupid + -- to optimise away the extra tmp (grrr). 
+ ; dflags <- getDynFlags + ; x <- if hscTarget dflags == HscAsm + then do z <- newTemp wordRep + stmtC (CmmAssign z info_amode) + return (CmmReg z) + else + return info_amode + ; let target = retVec x zero_indexed_tag + ; stmtC (CmmJump target []) } + + +------------------------------------------------------------------------- +-- +-- Generating a standard info table +-- +------------------------------------------------------------------------- + +-- The standard bits of an info table. This part of the info table +-- corresponds to the StgInfoTable type defined in InfoTables.h. +-- +-- Its shape varies with ticky/profiling/tables next to code etc +-- so we can't use constant offsets from Constants + +mkStdInfoTable + :: CmmLit -- closure type descr (profiling) + -> CmmLit -- closure descr (profiling) + -> Int -- closure type + -> StgHalfWord -- SRT length + -> CmmLit -- layout field + -> [CmmLit] + +mkStdInfoTable type_descr closure_descr cl_type srt_len layout_lit + = -- Parallel revertible-black hole field + prof_info + -- Ticky info (none at present) + -- Debug info (none at present) + ++ [layout_lit, type_lit] + + where + prof_info + | opt_SccProfilingOn = [type_descr, closure_descr] + | otherwise = [] + + type_lit = packHalfWordsCLit cl_type srt_len + +stdInfoTableSizeW :: WordOff +-- The size of a standard info table varies with profiling/ticky etc, +-- so we can't get it from Constants +-- It must vary in sync with mkStdInfoTable +stdInfoTableSizeW + = size_fixed + size_prof + where + size_fixed = 2 -- layout, type + size_prof | opt_SccProfilingOn = 2 + | otherwise = 0 + +stdInfoTableSizeB = stdInfoTableSizeW * wORD_SIZE :: ByteOff + +stdSrtBitmapOffset :: ByteOff +-- Byte offset of the SRT bitmap half-word which is +-- in the *higher-addressed* part of the type_lit +stdSrtBitmapOffset = stdInfoTableSizeB - hALF_WORD_SIZE + +stdClosureTypeOffset :: ByteOff +-- Byte offset of the closure type half-word +stdClosureTypeOffset = stdInfoTableSizeB - wORD_SIZE + +stdPtrsOffset, stdNonPtrsOffset :: ByteOff +stdPtrsOffset = stdInfoTableSizeB - 2*wORD_SIZE +stdNonPtrsOffset = stdInfoTableSizeB - 2*wORD_SIZE + hALF_WORD_SIZE + +------------------------------------------------------------------------- +-- +-- Accessing fields of an info table +-- +------------------------------------------------------------------------- + +closureInfoPtr :: CmmExpr -> CmmExpr +-- Takes a closure pointer and returns the info table pointer +closureInfoPtr e = CmmLoad e wordRep + +entryCode :: CmmExpr -> CmmExpr +-- Takes an info pointer (the first word of a closure) +-- and returns its entry code +entryCode e | tablesNextToCode = e + | otherwise = CmmLoad e wordRep + +getConstrTag :: CmmExpr -> CmmExpr +-- Takes a closure pointer, and return the *zero-indexed* +-- constructor tag obtained from the info table +-- This lives in the SRT field of the info table +-- (constructors don't need SRTs). 
+getConstrTag closure_ptr + = CmmMachOp (MO_U_Conv halfWordRep wordRep) [infoTableConstrTag info_table] + where + info_table = infoTable (closureInfoPtr closure_ptr) + +infoTable :: CmmExpr -> CmmExpr +-- Takes an info pointer (the first word of a closure) +-- and returns a pointer to the first word of the standard-form +-- info table, excluding the entry-code word (if present) +infoTable info_ptr + | tablesNextToCode = cmmOffsetB info_ptr (- stdInfoTableSizeB) + | otherwise = cmmOffsetW info_ptr 1 -- Past the entry code pointer + +infoTableConstrTag :: CmmExpr -> CmmExpr +-- Takes an info table pointer (from infoTable) and returns the constr tag +-- field of the info table (same as the srt_bitmap field) +infoTableConstrTag = infoTableSrtBitmap + +infoTableSrtBitmap :: CmmExpr -> CmmExpr +-- Takes an info table pointer (from infoTable) and returns the srt_bitmap +-- field of the info table +infoTableSrtBitmap info_tbl + = CmmLoad (cmmOffsetB info_tbl stdSrtBitmapOffset) halfWordRep + +infoTableClosureType :: CmmExpr -> CmmExpr +-- Takes an info table pointer (from infoTable) and returns the closure type +-- field of the info table. +infoTableClosureType info_tbl + = CmmLoad (cmmOffsetB info_tbl stdClosureTypeOffset) halfWordRep + +infoTablePtrs :: CmmExpr -> CmmExpr +infoTablePtrs info_tbl + = CmmLoad (cmmOffsetB info_tbl stdPtrsOffset) halfWordRep + +infoTableNonPtrs :: CmmExpr -> CmmExpr +infoTableNonPtrs info_tbl + = CmmLoad (cmmOffsetB info_tbl stdNonPtrsOffset) halfWordRep + +funInfoTable :: CmmExpr -> CmmExpr +-- Takes the info pointer of a function, +-- and returns a pointer to the first word of the StgFunInfoExtra struct +-- in the info table. +funInfoTable info_ptr + | tablesNextToCode + = cmmOffsetB info_ptr (- stdInfoTableSizeB - sIZEOF_StgFunInfoExtraRev) + | otherwise + = cmmOffsetW info_ptr (1 + stdInfoTableSizeW) + -- Past the entry code pointer + +------------------------------------------------------------------------- +-- +-- Emit the code for a closure (or return address) +-- and its associated info table +-- +------------------------------------------------------------------------- + +-- The complication here concerns whether or not we can +-- put the info table next to the code + +emitInfoTableAndCode + :: CLabel -- Label of info table + -> [CmmLit] -- ...its invariant part + -> [CmmLit] -- ...and its variant part + -> [LocalReg] -- ...args + -> [CmmBasicBlock] -- ...and body + -> Code + +emitInfoTableAndCode info_lbl std_info extra_bits args blocks + | tablesNextToCode -- Reverse the extra_bits; and emit the top-level proc + = emitProc (reverse extra_bits ++ std_info) + entry_lbl args blocks + -- NB: the info_lbl is discarded + + | null blocks -- No actual code; only the info table is significant + = -- Use a zero place-holder in place of the + -- entry-label in the info table + emitRODataLits info_lbl (zeroCLit : std_info ++ extra_bits) + + | otherwise -- Separately emit info table (with the function entry + = -- point as first entry) and the entry code + do { emitDataLits info_lbl (CmmLabel entry_lbl : std_info ++ extra_bits) + ; emitProc [] entry_lbl args blocks } + + where + entry_lbl = infoLblToEntryLbl info_lbl + +------------------------------------------------------------------------- +-- +-- Static reference tables +-- +------------------------------------------------------------------------- + +-- There is just one SRT for each top level binding; all the nested +-- bindings use sub-sections of this SRT. 
The label is passed down to +-- the nested bindings via the monad. + +getSRTInfo :: Name -> SRT -> FCode C_SRT +getSRTInfo id NoSRT = return NoC_SRT +getSRTInfo id (SRT off len bmp) + | len > hALF_WORD_SIZE_IN_BITS || bmp == [fromIntegral srt_escape] + = do { srt_lbl <- getSRTLabel + ; let srt_desc_lbl = mkSRTDescLabel id + ; emitRODataLits srt_desc_lbl + ( cmmLabelOffW srt_lbl off + : mkWordCLit (fromIntegral len) + : map mkWordCLit bmp) + ; return (C_SRT srt_desc_lbl 0 srt_escape) } + + | otherwise + = do { srt_lbl <- getSRTLabel + ; return (C_SRT srt_lbl off (fromIntegral (head bmp))) } + -- The fromIntegral converts to StgHalfWord + +srt_escape = (-1) :: StgHalfWord + +srtLabelAndLength :: C_SRT -> CLabel -> (CmmLit, StgHalfWord) +srtLabelAndLength NoC_SRT _ + = (zeroCLit, 0) +srtLabelAndLength (C_SRT lbl off bitmap) info_lbl + = (makeRelativeRefTo info_lbl $ cmmLabelOffW lbl off, bitmap) + +------------------------------------------------------------------------- +-- +-- Position independent code +-- +------------------------------------------------------------------------- +-- In order to support position independent code, we mustn't put absolute +-- references into read-only space. Info tables in the tablesNextToCode +-- case must be in .text, which is read-only, so we doctor the CmmLits +-- to use relative offsets instead. + +-- Note that this is done even when the -fPIC flag is not specified, +-- as we want to keep binary compatibility between PIC and non-PIC. + +makeRelativeRefTo :: CLabel -> CmmLit -> CmmLit + +makeRelativeRefTo info_lbl (CmmLabel lbl) + | tablesNextToCode + = CmmLabelDiffOff lbl info_lbl 0 +makeRelativeRefTo info_lbl (CmmLabelOff lbl off) + | tablesNextToCode + = CmmLabelDiffOff lbl info_lbl off +makeRelativeRefTo _ lit = lit diff --git a/compiler/codeGen/CgLetNoEscape.lhs b/compiler/codeGen/CgLetNoEscape.lhs new file mode 100644 index 0000000000..39860f4ee0 --- /dev/null +++ b/compiler/codeGen/CgLetNoEscape.lhs @@ -0,0 +1,212 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-1998 +% +% $Id: CgLetNoEscape.lhs,v 1.26 2004/09/30 10:35:47 simonpj Exp $ +% +%******************************************************** +%* * +\section[CgLetNoEscape]{Handling ``let-no-escapes''} +%* * +%******************************************************** + +\begin{code} +module CgLetNoEscape ( cgLetNoEscapeClosure ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} CgExpr ( cgExpr ) + +import StgSyn +import CgMonad + +import CgBindery ( CgIdInfo, letNoEscapeIdInfo, nukeDeadBindings ) +import CgCase ( restoreCurrentCostCentre ) +import CgCon ( bindUnboxedTupleComponents ) +import CgHeapery ( unbxTupleHeapCheck ) +import CgInfoTbls ( emitDirectReturnTarget ) +import CgStackery ( allocStackTop, deAllocStackTop, getSpRelOffset ) +import Cmm ( CmmStmt(..) ) +import CmmUtils ( mkLblExpr, oneStmt ) +import CLabel ( mkReturnInfoLabel ) +import ClosureInfo ( mkLFLetNoEscape ) +import CostCentre ( CostCentreStack ) +import Id ( Id, idName ) +import Var ( idUnique ) +import SMRep ( retAddrSizeW ) +import BasicTypes ( RecFlag(..) ) +import Outputable +\end{code} + +%************************************************************************ +%* * +\subsection[what-is-non-escaping]{What {\em is} a ``non-escaping let''?} +%* * +%************************************************************************ + +[The {\em code} that detects these things is elsewhere.] + +Consider: +\begin{verbatim} + let x = fvs \ args -> e + in + if ... then x else + if ... then x else ... 
+\end{verbatim} +@x@ is used twice (so we probably can't unfold it), but when it is +entered, the stack is deeper than it was when the definition of @x@ +happened. Specifically, if instead of allocating a closure for @x@, +we saved all @x@'s fvs on the stack, and remembered the stack depth at +that moment, then whenever we enter @x@ we can simply set the stack +pointer(s) to these remembered (compile-time-fixed) values, and jump +to the code for @x@. + +All of this is provided x is: +\begin{enumerate} +\item +non-updatable; +\item +guaranteed to be entered before the stack retreats -- ie x is not +buried in a heap-allocated closure, or passed as an argument to something; +\item +all the enters have exactly the right number of arguments, +no more no less; +\item +all the enters are tail calls; that is, they return to the +caller enclosing the definition of @x@. +\end{enumerate} + +Under these circumstances we say that @x@ is {\em non-escaping}. + +An example of when (4) does {\em not} hold: +\begin{verbatim} + let x = ... + in case x of ...alts... +\end{verbatim} + +Here, @x@ is certainly entered only when the stack is deeper than when +@x@ is defined, but here it must return to \tr{...alts...} So we can't +just adjust the stack down to @x@'s recalled points, because that +would lost @alts@' context. + +Things can get a little more complicated. Consider: +\begin{verbatim} + let y = ... + in let x = fvs \ args -> ...y... + in ...x... +\end{verbatim} + +Now, if @x@ is used in a non-escaping way in \tr{...x...}, {\em and} +@y@ is used in a non-escaping way in \tr{...y...}, {\em then} @y@ is +non-escaping. + +@x@ can even be recursive! Eg: +\begin{verbatim} + letrec x = [y] \ [v] -> if v then x True else ... + in + ...(x b)... +\end{verbatim} + + +%************************************************************************ +%* * +\subsection[codeGen-for-non-escaping]{Generating code for a ``non-escaping let''} +%* * +%************************************************************************ + + +Generating code for this is fun. It is all very very similar to what +we do for a case expression. The duality is between +\begin{verbatim} + let-no-escape x = b + in e +\end{verbatim} +and +\begin{verbatim} + case e of ... -> b +\end{verbatim} + +That is, the RHS of @x@ (ie @b@) will execute {\em later}, just like +the alternative of the case; it needs to be compiled in an environment +in which all volatile bindings are forgotten, and the free vars are +bound only to stable things like stack locations.. The @e@ part will +execute {\em next}, just like the scrutinee of a case. + +First, we need to save all @x@'s free vars +on the stack, if they aren't there already. + +\begin{code} +cgLetNoEscapeClosure + :: Id -- binder + -> CostCentreStack -- NB: *** NOT USED *** ToDo (WDP 94/06) + -> StgBinderInfo -- NB: ditto + -> SRT + -> StgLiveVars -- variables live in RHS, including the binders + -- themselves in the case of a recursive group + -> EndOfBlockInfo -- where are we going to? + -> Maybe VirtualSpOffset -- Slot for current cost centre + -> RecFlag -- is the binding recursive? + -> [Id] -- args (as in \ args -> body) + -> StgExpr -- body (as in above) + -> FCode (Id, CgIdInfo) + +-- ToDo: deal with the cost-centre issues + +cgLetNoEscapeClosure + bndr cc binder_info srt full_live_in_rhss + rhs_eob_info cc_slot rec args body + = let + arity = length args + lf_info = mkLFLetNoEscape arity + in + -- saveVolatileVarsAndRegs done earlier in cgExpr. 
+ + do { (vSp, _) <- forkEvalHelp rhs_eob_info + + (do { allocStackTop retAddrSizeW + ; nukeDeadBindings full_live_in_rhss }) + + (do { deAllocStackTop retAddrSizeW + ; abs_c <- forkProc $ cgLetNoEscapeBody bndr cc + cc_slot args body + + -- Ignore the label that comes back from + -- mkRetDirectTarget. It must be conjured up elswhere + ; emitDirectReturnTarget (idName bndr) abs_c srt + ; return () }) + + ; returnFC (bndr, letNoEscapeIdInfo bndr vSp lf_info) } +\end{code} + +\begin{code} +cgLetNoEscapeBody :: Id -- Name of the joint point + -> CostCentreStack + -> Maybe VirtualSpOffset + -> [Id] -- Args + -> StgExpr -- Body + -> Code + +cgLetNoEscapeBody bndr cc cc_slot all_args body = do + { (arg_regs, ptrs, nptrs, ret_slot) <- bindUnboxedTupleComponents all_args + + -- restore the saved cost centre. BUT: we must not free the stack slot + -- containing the cost centre, because it might be needed for a + -- recursive call to this let-no-escape. + ; restoreCurrentCostCentre cc_slot False{-don't free-} + + -- Enter the closures cc, if required + ; -- enterCostCentreCode closure_info cc IsFunction + + -- The "return address" slot doesn't have a return address in it; + -- but the heap-check needs it filled in if the heap-check fails. + -- So we pass code to fill it in to the heap-check macro + ; sp_rel <- getSpRelOffset ret_slot + + ; let lbl = mkReturnInfoLabel (idUnique bndr) + frame_hdr_asst = oneStmt (CmmStore sp_rel (mkLblExpr lbl)) + + -- Do heap check [ToDo: omit for non-recursive case by recording in + -- in envt and absorbing at call site] + ; unbxTupleHeapCheck arg_regs ptrs nptrs frame_hdr_asst + (cgExpr body) + } +\end{code} diff --git a/compiler/codeGen/CgMonad.lhs b/compiler/codeGen/CgMonad.lhs new file mode 100644 index 0000000000..4f95c9b36a --- /dev/null +++ b/compiler/codeGen/CgMonad.lhs @@ -0,0 +1,853 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +% $Id: CgMonad.lhs,v 1.45 2005/06/21 10:44:41 simonmar Exp $ +% +\section[CgMonad]{The code generation monad} + +See the beginning of the top-level @CodeGen@ module, to see how this +monadic stuff fits into the Big Picture. + +\begin{code} +module CgMonad ( + Code, -- type + FCode, -- type + + initC, thenC, thenFC, listCs, listFCs, mapCs, mapFCs, + returnFC, fixC, checkedAbsC, + stmtC, stmtsC, labelC, emitStmts, nopC, whenC, newLabelC, + newUnique, newUniqSupply, + + CgStmts, emitCgStmts, forkCgStmts, cgStmtsToBlocks, + getCgStmts', getCgStmts, + noCgStmts, oneCgStmt, consCgStmt, + + getCmm, + emitData, emitProc, emitSimpleProc, + + forkLabelledCode, + forkClosureBody, forkStatics, forkAlts, forkEval, + forkEvalHelp, forkProc, codeOnly, + SemiTaggingStuff, ConTagZ, + + EndOfBlockInfo(..), + setEndOfBlockInfo, getEndOfBlockInfo, + + setSRTLabel, getSRTLabel, + setTickyCtrLabel, getTickyCtrLabel, + + StackUsage(..), HeapUsage(..), + VirtualSpOffset, VirtualHpOffset, + initStkUsage, initHpUsage, + getHpUsage, setHpUsage, + heapHWM, + + moduleName, + + Sequel(..), -- ToDo: unabstract? + + -- ideally we wouldn't export these, but some other modules access internal state + getState, setState, getInfoDown, getDynFlags, getHomeModules, + + -- more localised access to monad state + getStkUsage, setStkUsage, + getBinds, setBinds, getStaticBinds, + + -- out of general friendliness, we also export ... + CgInfoDownwards(..), CgState(..) 
-- non-abstract + ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} CgBindery ( CgBindings, nukeVolatileBinds ) + +import DynFlags ( DynFlags ) +import Packages ( HomeModules ) +import Cmm +import CmmUtils ( CmmStmts, isNopStmt ) +import CLabel +import SMRep ( WordOff ) +import Module ( Module ) +import Id ( Id ) +import VarEnv +import OrdList +import Unique ( Unique ) +import Util ( mapAccumL ) +import UniqSupply ( UniqSupply, mkSplitUniqSupply, splitUniqSupply, uniqFromSupply ) +import FastString +import Outputable + +import Control.Monad ( liftM ) + +infixr 9 `thenC` -- Right-associative! +infixr 9 `thenFC` +\end{code} + +%************************************************************************ +%* * +\subsection[CgMonad-environment]{Stuff for manipulating environments} +%* * +%************************************************************************ + +This monadery has some information that it only passes {\em +downwards}, as well as some ``state'' which is modified as we go +along. + +\begin{code} +data CgInfoDownwards -- information only passed *downwards* by the monad + = MkCgInfoDown { + cgd_dflags :: DynFlags, + cgd_hmods :: HomeModules, -- Packages we depend on + cgd_mod :: Module, -- Module being compiled + cgd_statics :: CgBindings, -- [Id -> info] : static environment + cgd_srt :: CLabel, -- label of the current SRT + cgd_ticky :: CLabel, -- current destination for ticky counts + cgd_eob :: EndOfBlockInfo -- Info for stuff to do at end of basic block: + } + +initCgInfoDown :: DynFlags -> HomeModules -> Module -> CgInfoDownwards +initCgInfoDown dflags hmods mod + = MkCgInfoDown { cgd_dflags = dflags, + cgd_hmods = hmods, + cgd_mod = mod, + cgd_statics = emptyVarEnv, + cgd_srt = error "initC: srt", + cgd_ticky = mkTopTickyCtrLabel, + cgd_eob = initEobInfo } + +data CgState + = MkCgState { + cgs_stmts :: OrdList CgStmt, -- Current proc + cgs_tops :: OrdList CmmTop, + -- Other procedures and data blocks in this compilation unit + -- Both the latter two are ordered only so that we can + -- reduce forward references, when it's easy to do so + + cgs_binds :: CgBindings, -- [Id -> info] : *local* bindings environment + -- Bindings for top-level things are given in + -- the info-down part + + cgs_stk_usg :: StackUsage, + cgs_hp_usg :: HeapUsage, + + cgs_uniqs :: UniqSupply } + +initCgState :: UniqSupply -> CgState +initCgState uniqs + = MkCgState { cgs_stmts = nilOL, cgs_tops = nilOL, + cgs_binds = emptyVarEnv, + cgs_stk_usg = initStkUsage, + cgs_hp_usg = initHpUsage, + cgs_uniqs = uniqs } +\end{code} + +@EndOfBlockInfo@ tells what to do at the end of this block of code or, +if the expression is a @case@, what to do at the end of each +alternative. + +\begin{code} +data EndOfBlockInfo + = EndOfBlockInfo + VirtualSpOffset -- Args Sp: trim the stack to this point at a + -- return; push arguments starting just + -- above this point on a tail call. + + -- This is therefore the stk ptr as seen + -- by a case alternative. + Sequel + +initEobInfo = EndOfBlockInfo 0 OnStack +\end{code} + +Any addressing modes inside @Sequel@ must be ``robust,'' in the sense +that it must survive stack pointer adjustments at the end of the +block. 
+ +\begin{code} +data Sequel + = OnStack -- Continuation is on the stack + | UpdateCode -- Continuation is update + + | CaseAlts + CLabel -- Jump to this; if the continuation is for a vectored + -- case this might be the label of a return vector + SemiTaggingStuff + Id -- The case binder, only used to see if it's dead + Bool -- True <=> polymorphic, push a SEQ frame too + +type SemiTaggingStuff + = Maybe -- Maybe[1] we don't have any semi-tagging stuff... + ([(ConTagZ, CmmLit)], -- Alternatives + CmmLit) -- Default (will be a can't happen RTS label if can't happen) + +type ConTagZ = Int -- A *zero-indexed* contructor tag + +-- The case branch is executed only from a successful semitagging +-- venture, when a case has looked at a variable, found that it's +-- evaluated, and wants to load up the contents and go to the join +-- point. +\end{code} + +%************************************************************************ +%* * + CgStmt type +%* * +%************************************************************************ + +The CgStmts type is what the code generator outputs: it is a tree of +statements, including in-line labels. The job of flattenCgStmts is to +turn this into a list of basic blocks, each of which ends in a jump +statement (either a local branch or a non-local jump). + +\begin{code} +type CgStmts = OrdList CgStmt + +data CgStmt + = CgStmt CmmStmt + | CgLabel BlockId + | CgFork BlockId CgStmts + +flattenCgStmts :: BlockId -> CgStmts -> [CmmBasicBlock] +flattenCgStmts id stmts = + case flatten (fromOL stmts) of + ([],blocks) -> blocks + (block,blocks) -> BasicBlock id block : blocks + where + flatten [] = ([],[]) + + -- A label at the end of a function or fork: this label must not be reachable, + -- but it might be referred to from another BB that also isn't reachable. + -- Eliminating these has to be done with a dead-code analysis. For now, + -- we just make it into a well-formed block by adding a recursive jump. + flatten [CgLabel id] + = ( [], [BasicBlock id [CmmBranch id]] ) + + -- A jump/branch: throw away all the code up to the next label, because + -- it is unreachable. Be careful to keep forks that we find on the way. + flatten (CgStmt stmt : stmts) + | isJump stmt + = case dropWhile isOrdinaryStmt stmts of + [] -> ( [stmt], [] ) + [CgLabel id] -> ( [stmt], [BasicBlock id [CmmBranch id]]) + (CgLabel id : stmts) -> ( [stmt], BasicBlock id block : blocks ) + where (block,blocks) = flatten stmts + (CgFork fork_id stmts : ss) -> + flatten (CgFork fork_id stmts : CgStmt stmt : ss) + + flatten (s:ss) = + case s of + CgStmt stmt -> (stmt:block,blocks) + CgLabel id -> ([CmmBranch id],BasicBlock id block:blocks) + CgFork fork_id stmts -> + (block, BasicBlock fork_id fork_block : fork_blocks ++ blocks) + where (fork_block, fork_blocks) = flatten (fromOL stmts) + where (block,blocks) = flatten ss + +isJump (CmmJump _ _) = True +isJump (CmmBranch _) = True +isJump _ = False + +isOrdinaryStmt (CgStmt _) = True +isOrdinaryStmt _ = False +\end{code} + +%************************************************************************ +%* * + Stack and heap models +%* * +%************************************************************************ + +\begin{code} +type VirtualHpOffset = WordOff -- Both are in +type VirtualSpOffset = WordOff -- units of words + +data StackUsage + = StackUsage { + virtSp :: VirtualSpOffset, + -- Virtual offset of topmost allocated slot + + frameSp :: VirtualSpOffset, + -- Virtual offset of the return address of the enclosing frame. 
+ -- This RA describes the liveness/pointedness of + -- all the stack from frameSp downwards + -- INVARIANT: less than or equal to virtSp + + freeStk :: [VirtualSpOffset], + -- List of free slots, in *increasing* order + -- INVARIANT: all <= virtSp + -- All slots <= virtSp are taken except these ones + + realSp :: VirtualSpOffset, + -- Virtual offset of real stack pointer register + + hwSp :: VirtualSpOffset + } -- Highest value ever taken by virtSp + +-- INVARIANT: The environment contains no Stable references to +-- stack slots below (lower offset) frameSp +-- It can contain volatile references to this area though. + +data HeapUsage = + HeapUsage { + virtHp :: VirtualHpOffset, -- Virtual offset of highest-allocated word + realHp :: VirtualHpOffset -- realHp: Virtual offset of real heap ptr + } +\end{code} + +The heap high water mark is the larger of virtHp and hwHp. The latter is +only records the high water marks of forked-off branches, so to find the +heap high water mark you have to take the max of virtHp and hwHp. Remember, +virtHp never retreats! + +Note Jan 04: ok, so why do we only look at the virtual Hp?? + +\begin{code} +heapHWM :: HeapUsage -> VirtualHpOffset +heapHWM = virtHp +\end{code} + +Initialisation. + +\begin{code} +initStkUsage :: StackUsage +initStkUsage = StackUsage { + virtSp = 0, + frameSp = 0, + freeStk = [], + realSp = 0, + hwSp = 0 + } + +initHpUsage :: HeapUsage +initHpUsage = HeapUsage { + virtHp = 0, + realHp = 0 + } +\end{code} + +@stateIncUsage@$~e_1~e_2$ incorporates in $e_1$ the stack and heap high water +marks found in $e_2$. + +\begin{code} +stateIncUsage :: CgState -> CgState -> CgState +stateIncUsage s1 s2@(MkCgState { cgs_stk_usg = stk_usg, cgs_hp_usg = hp_usg }) + = s1 { cgs_hp_usg = cgs_hp_usg s1 `maxHpHw` virtHp hp_usg, + cgs_stk_usg = cgs_stk_usg s1 `maxStkHw` hwSp stk_usg } + `addCodeBlocksFrom` s2 + +stateIncUsageEval :: CgState -> CgState -> CgState +stateIncUsageEval s1 s2 + = s1 { cgs_stk_usg = cgs_stk_usg s1 `maxStkHw` hwSp (cgs_stk_usg s2) } + `addCodeBlocksFrom` s2 + -- We don't max the heap high-watermark because stateIncUsageEval is + -- used only in forkEval, which in turn is only used for blocks of code + -- which do their own heap-check. + +addCodeBlocksFrom :: CgState -> CgState -> CgState +-- Add code blocks from the latter to the former +-- (The cgs_stmts will often be empty, but not always; see codeOnly) +s1 `addCodeBlocksFrom` s2 + = s1 { cgs_stmts = cgs_stmts s1 `appOL` cgs_stmts s2, + cgs_tops = cgs_tops s1 `appOL` cgs_tops s2 } + +maxHpHw :: HeapUsage -> VirtualHpOffset -> HeapUsage +hp_usg `maxHpHw` hw = hp_usg { virtHp = virtHp hp_usg `max` hw } + +maxStkHw :: StackUsage -> VirtualSpOffset -> StackUsage +stk_usg `maxStkHw` hw = stk_usg { hwSp = hwSp stk_usg `max` hw } +\end{code} + +%************************************************************************ +%* * + The FCode monad +%* * +%************************************************************************ + +\begin{code} +newtype FCode a = FCode (CgInfoDownwards -> CgState -> (a, CgState)) +type Code = FCode () + +instance Monad FCode where + (>>=) = thenFC + return = returnFC + +{-# INLINE thenC #-} +{-# INLINE thenFC #-} +{-# INLINE returnFC #-} +\end{code} +The Abstract~C is not in the environment so as to improve strictness. 
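
As noted above, @FCode@ is a combined reader/state monad: the
@CgInfoDownwards@ environment is only ever passed downwards, while the
@CgState@ is threaded through and handed back.  Purely as an
illustration of that shape (hypothetical names, not part of this module
and not GHC's real combinators), a cut-down analogue looks like this:

\begin{code}
-- Illustration only: a standalone analogue of FCode's
-- "environment -> state -> (result, new state)" shape.
newtype FC env st a = FC { runFC :: env -> st -> (a, st) }

instance Functor (FC env st) where
    fmap f (FC m) = FC $ \env st -> let (a, st') = m env st in (f a, st')

instance Applicative (FC env st) where
    pure a = FC $ \_ st -> (a, st)
    FC mf <*> FC ma = FC $ \env st ->
        let (f, st')  = mf env st
            (a, st'') = ma env st'
        in  (f a, st'')

instance Monad (FC env st) where
    FC m >>= k = FC $ \env st ->
        let (a, st') = m env st
        in  runFC (k a) env st'

-- The environment can only be consulted (compare getInfoDown)...
askEnv :: FC env st env
askEnv = FC $ \env st -> (env, st)

-- ...whereas the state can be read and replaced (compare getState/setState).
getSt :: FC env st st
getSt = FC $ \_ st -> (st, st)

putSt :: st -> FC env st ()
putSt st = FC $ \_ _ -> ((), st)
\end{code}

For example, @runFC (askEnv >>= putSt) "down" "state"@ evaluates to
@((), "down")@: the downward information is consulted and the state is
replaced, but nothing flows back into the environment.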
+ +\begin{code} +initC :: DynFlags -> HomeModules -> Module -> FCode a -> IO a + +initC dflags hmods mod (FCode code) + = do { uniqs <- mkSplitUniqSupply 'c' + ; case code (initCgInfoDown dflags hmods mod) (initCgState uniqs) of + (res, _) -> return res + } + +returnFC :: a -> FCode a +returnFC val = FCode (\info_down state -> (val, state)) +\end{code} + +\begin{code} +thenC :: Code -> FCode a -> FCode a +thenC (FCode m) (FCode k) = + FCode (\info_down state -> let (_,new_state) = m info_down state in + k info_down new_state) + +listCs :: [Code] -> Code +listCs [] = return () +listCs (fc:fcs) = do + fc + listCs fcs + +mapCs :: (a -> Code) -> [a] -> Code +mapCs = mapM_ +\end{code} + +\begin{code} +thenFC :: FCode a -> (a -> FCode c) -> FCode c +thenFC (FCode m) k = FCode ( + \info_down state -> + let + (m_result, new_state) = m info_down state + (FCode kcode) = k m_result + in + kcode info_down new_state + ) + +listFCs :: [FCode a] -> FCode [a] +listFCs = sequence + +mapFCs :: (a -> FCode b) -> [a] -> FCode [b] +mapFCs = mapM +\end{code} + +And the knot-tying combinator: +\begin{code} +fixC :: (a -> FCode a) -> FCode a +fixC fcode = FCode ( + \info_down state -> + let + FCode fc = fcode v + result@(v,_) = fc info_down state + -- ^--------^ + in + result + ) +\end{code} + +%************************************************************************ +%* * + Operators for getting and setting the state and "info_down". + +%* * +%************************************************************************ + +\begin{code} +getState :: FCode CgState +getState = FCode $ \info_down state -> (state,state) + +setState :: CgState -> FCode () +setState state = FCode $ \info_down _ -> ((),state) + +getStkUsage :: FCode StackUsage +getStkUsage = do + state <- getState + return $ cgs_stk_usg state + +setStkUsage :: StackUsage -> Code +setStkUsage new_stk_usg = do + state <- getState + setState $ state {cgs_stk_usg = new_stk_usg} + +getHpUsage :: FCode HeapUsage +getHpUsage = do + state <- getState + return $ cgs_hp_usg state + +setHpUsage :: HeapUsage -> Code +setHpUsage new_hp_usg = do + state <- getState + setState $ state {cgs_hp_usg = new_hp_usg} + +getBinds :: FCode CgBindings +getBinds = do + state <- getState + return $ cgs_binds state + +setBinds :: CgBindings -> FCode () +setBinds new_binds = do + state <- getState + setState $ state {cgs_binds = new_binds} + +getStaticBinds :: FCode CgBindings +getStaticBinds = do + info <- getInfoDown + return (cgd_statics info) + +withState :: FCode a -> CgState -> FCode (a,CgState) +withState (FCode fcode) newstate = FCode $ \info_down state -> + let (retval, state2) = fcode info_down newstate in ((retval,state2), state) + +newUniqSupply :: FCode UniqSupply +newUniqSupply = do + state <- getState + let (us1, us2) = splitUniqSupply (cgs_uniqs state) + setState $ state { cgs_uniqs = us1 } + return us2 + +newUnique :: FCode Unique +newUnique = do + us <- newUniqSupply + return (uniqFromSupply us) + +------------------ +getInfoDown :: FCode CgInfoDownwards +getInfoDown = FCode $ \info_down state -> (info_down,state) + +getDynFlags :: FCode DynFlags +getDynFlags = liftM cgd_dflags getInfoDown + +getHomeModules :: FCode HomeModules +getHomeModules = liftM cgd_hmods getInfoDown + +withInfoDown :: FCode a -> CgInfoDownwards -> FCode a +withInfoDown (FCode fcode) info_down = FCode $ \_ state -> fcode info_down state + +doFCode :: FCode a -> CgInfoDownwards -> CgState -> (a,CgState) +doFCode (FCode fcode) info_down state = fcode info_down state +\end{code} + + 
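
The knot-tying @fixC@ above feeds a computation's own result back in as
its argument, relying on laziness; it is useful when generated code must
mention information that is only known once that code has been produced.
As a standalone illustration of the general pattern (a sketch using
@mfix@ over the lazy @State@ monad from the @mtl@ package, with made-up
names; it is the idea behind @fixC@, not the @FCode@ code itself):

\begin{code}
import Control.Monad.Fix   (mfix)
import Control.Monad.State (State, runState, modify)

-- mfix ties the knot: the returned list mentions itself, while the
-- state effect (a counter, here) is threaded through as usual.  This
-- is well defined only because (1 : xs) does not force xs.
selfRef :: ([Int], Int)
selfRef = runState (mfix step) 0
  where
    step xs = do
        modify (+ 1)        -- an ordinary state update
        return (1 : xs)     -- lazily mentions the final result

-- take 3 (fst selfRef) == [1,1,1]   and   snd selfRef == 1
\end{code}

The same caveat applies to @fixC@: the fixed-up value must not be forced
before the computation has actually produced it.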
+%************************************************************************ +%* * + Forking +%* * +%************************************************************************ + +@forkClosureBody@ takes a code, $c$, and compiles it in a completely +fresh environment, except that: + - compilation info and statics are passed in unchanged. +The current environment is passed on completely unaltered, except that +abstract C from the fork is incorporated. + +@forkProc@ takes a code and compiles it in the current environment, +returning the basic blocks thus constructed. The current environment +is passed on completely unchanged. It is pretty similar to +@getBlocks@, except that the latter does affect the environment. + +@forkStatics@ $fc$ compiles $fc$ in an environment whose statics come +from the current bindings, but which is otherwise freshly initialised. +The Abstract~C returned is attached to the current state, but the +bindings and usage information is otherwise unchanged. + +\begin{code} +forkClosureBody :: Code -> Code +forkClosureBody body_code + = do { info <- getInfoDown + ; us <- newUniqSupply + ; state <- getState + ; let body_info_down = info { cgd_eob = initEobInfo } + ((),fork_state) = doFCode body_code body_info_down + (initCgState us) + ; ASSERT( isNilOL (cgs_stmts fork_state) ) + setState $ state `addCodeBlocksFrom` fork_state } + +forkStatics :: FCode a -> FCode a +forkStatics body_code + = do { info <- getInfoDown + ; us <- newUniqSupply + ; state <- getState + ; let rhs_info_down = info { cgd_statics = cgs_binds state, + cgd_eob = initEobInfo } + (result, fork_state_out) = doFCode body_code rhs_info_down + (initCgState us) + ; ASSERT( isNilOL (cgs_stmts fork_state_out) ) + setState (state `addCodeBlocksFrom` fork_state_out) + ; return result } + +forkProc :: Code -> FCode CgStmts +forkProc body_code + = do { info_down <- getInfoDown + ; us <- newUniqSupply + ; state <- getState + ; let fork_state_in = (initCgState us) + { cgs_binds = cgs_binds state, + cgs_stk_usg = cgs_stk_usg state, + cgs_hp_usg = cgs_hp_usg state } + -- ToDo: is the hp usage necesary? + (code_blks, fork_state_out) = doFCode (getCgStmts body_code) + info_down fork_state_in + ; setState $ state `stateIncUsageEval` fork_state_out + ; return code_blks } + +codeOnly :: Code -> Code +-- Emit any code from the inner thing into the outer thing +-- Do not affect anything else in the outer state +-- Used in almost-circular code to prevent false loop dependencies +codeOnly body_code + = do { info_down <- getInfoDown + ; us <- newUniqSupply + ; state <- getState + ; let fork_state_in = (initCgState us) { cgs_binds = cgs_binds state, + cgs_stk_usg = cgs_stk_usg state, + cgs_hp_usg = cgs_hp_usg state } + ((), fork_state_out) = doFCode body_code info_down fork_state_in + ; setState $ state `addCodeBlocksFrom` fork_state_out } +\end{code} + +@forkAlts@ $bs~d$ takes fcodes $bs$ for the branches of a @case@, and +an fcode for the default case $d$, and compiles each in the current +environment. 
The current environment is passed on unmodified, except +that + - the worst stack high-water mark is incorporated + - the virtual Hp is moved on to the worst virtual Hp for the branches + +\begin{code} +forkAlts :: [FCode a] -> FCode [a] + +forkAlts branch_fcodes + = do { info_down <- getInfoDown + ; us <- newUniqSupply + ; state <- getState + ; let compile us branch + = (us2, doFCode branch info_down branch_state) + where + (us1,us2) = splitUniqSupply us + branch_state = (initCgState us1) { + cgs_binds = cgs_binds state, + cgs_stk_usg = cgs_stk_usg state, + cgs_hp_usg = cgs_hp_usg state } + + (_us, results) = mapAccumL compile us branch_fcodes + (branch_results, branch_out_states) = unzip results + ; setState $ foldl stateIncUsage state branch_out_states + -- NB foldl. state is the *left* argument to stateIncUsage + ; return branch_results } +\end{code} + +@forkEval@ takes two blocks of code. + + - The first meddles with the environment to set it up as expected by + the alternatives of a @case@ which does an eval (or gc-possible primop). + - The second block is the code for the alternatives. + (plus info for semi-tagging purposes) + +@forkEval@ picks up the virtual stack pointer and returns a suitable +@EndOfBlockInfo@ for the caller to use, together with whatever value +is returned by the second block. + +It uses @initEnvForAlternatives@ to initialise the environment, and +@stateIncUsageAlt@ to incorporate usage; the latter ignores the heap +usage. + +\begin{code} +forkEval :: EndOfBlockInfo -- For the body + -> Code -- Code to set environment + -> FCode Sequel -- Semi-tagging info to store + -> FCode EndOfBlockInfo -- The new end of block info + +forkEval body_eob_info env_code body_code + = do { (v, sequel) <- forkEvalHelp body_eob_info env_code body_code + ; returnFC (EndOfBlockInfo v sequel) } + +forkEvalHelp :: EndOfBlockInfo -- For the body + -> Code -- Code to set environment + -> FCode a -- The code to do after the eval + -> FCode (VirtualSpOffset, -- Sp + a) -- Result of the FCode + -- A disturbingly complicated function +forkEvalHelp body_eob_info env_code body_code + = do { info_down <- getInfoDown + ; us <- newUniqSupply + ; state <- getState + ; let { info_down_for_body = info_down {cgd_eob = body_eob_info} + ; (_, env_state) = doFCode env_code info_down_for_body + (state {cgs_uniqs = us}) + ; state_for_body = (initCgState (cgs_uniqs env_state)) + { cgs_binds = binds_for_body, + cgs_stk_usg = stk_usg_for_body } + ; binds_for_body = nukeVolatileBinds (cgs_binds env_state) + ; stk_usg_from_env = cgs_stk_usg env_state + ; virtSp_from_env = virtSp stk_usg_from_env + ; stk_usg_for_body = stk_usg_from_env {realSp = virtSp_from_env, + hwSp = virtSp_from_env} + ; (value_returned, state_at_end_return) + = doFCode body_code info_down_for_body state_for_body + } + ; ASSERT( isNilOL (cgs_stmts state_at_end_return) ) + -- The code coming back should consist only of nested declarations, + -- notably of the return vector! 
+ setState $ state `stateIncUsageEval` state_at_end_return + ; return (virtSp_from_env, value_returned) } + + +-- ---------------------------------------------------------------------------- +-- Combinators for emitting code + +nopC :: Code +nopC = return () + +whenC :: Bool -> Code -> Code +whenC True code = code +whenC False code = nopC + +stmtC :: CmmStmt -> Code +stmtC stmt = emitCgStmt (CgStmt stmt) + +labelC :: BlockId -> Code +labelC id = emitCgStmt (CgLabel id) + +newLabelC :: FCode BlockId +newLabelC = do { id <- newUnique; return (BlockId id) } + +checkedAbsC :: CmmStmt -> Code +-- Emit code, eliminating no-ops +checkedAbsC stmt = emitStmts (if isNopStmt stmt then nilOL + else unitOL stmt) + +stmtsC :: [CmmStmt] -> Code +stmtsC stmts = emitStmts (toOL stmts) + +-- Emit code; no no-op checking +emitStmts :: CmmStmts -> Code +emitStmts stmts = emitCgStmts (fmap CgStmt stmts) + +-- forkLabelledCode is for emitting a chunk of code with a label, outside +-- of the current instruction stream. +forkLabelledCode :: Code -> FCode BlockId +forkLabelledCode code = getCgStmts code >>= forkCgStmts + +emitCgStmt :: CgStmt -> Code +emitCgStmt stmt + = do { state <- getState + ; setState $ state { cgs_stmts = cgs_stmts state `snocOL` stmt } + } + +emitData :: Section -> [CmmStatic] -> Code +emitData sect lits + = do { state <- getState + ; setState $ state { cgs_tops = cgs_tops state `snocOL` data_block } } + where + data_block = CmmData sect lits + +emitProc :: [CmmLit] -> CLabel -> [LocalReg] -> [CmmBasicBlock] -> Code +emitProc lits lbl args blocks + = do { let proc_block = CmmProc (map CmmStaticLit lits) lbl args blocks + ; state <- getState + ; setState $ state { cgs_tops = cgs_tops state `snocOL` proc_block } } + +emitSimpleProc :: CLabel -> Code -> Code +-- Emit a procedure whose body is the specified code; no info table +emitSimpleProc lbl code + = do { stmts <- getCgStmts code + ; blks <- cgStmtsToBlocks stmts + ; emitProc [] lbl [] blks } + +getCmm :: Code -> FCode Cmm +-- Get all the CmmTops (there should be no stmts) +getCmm code + = do { state1 <- getState + ; ((), state2) <- withState code (state1 { cgs_tops = nilOL }) + ; setState $ state2 { cgs_tops = cgs_tops state1 } + ; return (Cmm (fromOL (cgs_tops state2))) } + +-- ---------------------------------------------------------------------------- +-- CgStmts + +-- These functions deal in terms of CgStmts, which is an abstract type +-- representing the code in the current proc. + + +-- emit CgStmts into the current instruction stream +emitCgStmts :: CgStmts -> Code +emitCgStmts stmts + = do { state <- getState + ; setState $ state { cgs_stmts = cgs_stmts state `appOL` stmts } } + +-- emit CgStmts outside the current instruction stream, and return a label +forkCgStmts :: CgStmts -> FCode BlockId +forkCgStmts stmts + = do { id <- newLabelC + ; emitCgStmt (CgFork id stmts) + ; return id + } + +-- turn CgStmts into [CmmBasicBlock], for making a new proc. 
+cgStmtsToBlocks :: CgStmts -> FCode [CmmBasicBlock] +cgStmtsToBlocks stmts + = do { id <- newLabelC + ; return (flattenCgStmts id stmts) + } + +-- collect the code emitted by an FCode computation +getCgStmts' :: FCode a -> FCode (a, CgStmts) +getCgStmts' fcode + = do { state1 <- getState + ; (a, state2) <- withState fcode (state1 { cgs_stmts = nilOL }) + ; setState $ state2 { cgs_stmts = cgs_stmts state1 } + ; return (a, cgs_stmts state2) } + +getCgStmts :: FCode a -> FCode CgStmts +getCgStmts fcode = do { (_,stmts) <- getCgStmts' fcode; return stmts } + +-- Simple ways to construct CgStmts: +noCgStmts :: CgStmts +noCgStmts = nilOL + +oneCgStmt :: CmmStmt -> CgStmts +oneCgStmt stmt = unitOL (CgStmt stmt) + +consCgStmt :: CmmStmt -> CgStmts -> CgStmts +consCgStmt stmt stmts = CgStmt stmt `consOL` stmts + +-- ---------------------------------------------------------------------------- +-- Get the current module name + +moduleName :: FCode Module +moduleName = do { info <- getInfoDown; return (cgd_mod info) } + +-- ---------------------------------------------------------------------------- +-- Get/set the end-of-block info + +setEndOfBlockInfo :: EndOfBlockInfo -> Code -> Code +setEndOfBlockInfo eob_info code = do + info <- getInfoDown + withInfoDown code (info {cgd_eob = eob_info}) + +getEndOfBlockInfo :: FCode EndOfBlockInfo +getEndOfBlockInfo = do + info <- getInfoDown + return (cgd_eob info) + +-- ---------------------------------------------------------------------------- +-- Get/set the current SRT label + +-- There is just one SRT for each top level binding; all the nested +-- bindings use sub-sections of this SRT. The label is passed down to +-- the nested bindings via the monad. + +getSRTLabel :: FCode CLabel -- Used only by cgPanic +getSRTLabel = do info <- getInfoDown + return (cgd_srt info) + +setSRTLabel :: CLabel -> FCode a -> FCode a +setSRTLabel srt_lbl code + = do info <- getInfoDown + withInfoDown code (info { cgd_srt = srt_lbl}) + +-- ---------------------------------------------------------------------------- +-- Get/set the current ticky counter label + +getTickyCtrLabel :: FCode CLabel +getTickyCtrLabel = do + info <- getInfoDown + return (cgd_ticky info) + +setTickyCtrLabel :: CLabel -> Code -> Code +setTickyCtrLabel ticky code = do + info <- getInfoDown + withInfoDown code (info {cgd_ticky = ticky}) +\end{code} diff --git a/compiler/codeGen/CgParallel.hs b/compiler/codeGen/CgParallel.hs new file mode 100644 index 0000000000..b826a33cba --- /dev/null +++ b/compiler/codeGen/CgParallel.hs @@ -0,0 +1,90 @@ +-- Code generation relaed to GpH +-- (a) parallel +-- (b) GranSim + +module CgParallel( + staticGranHdr,staticParHdr, + granFetchAndReschedule, granYield, + doGranAllocate + ) where + +import CgMonad +import CgCallConv ( mkRegLiveness ) +import Id ( Id ) +import Cmm ( CmmLit, GlobalReg(..), node, CmmExpr ) +import StaticFlags ( opt_GranMacros ) +import Outputable + +staticParHdr :: [CmmLit] +-- Parallel header words in a static closure +staticParHdr = [] + +-------------------------------------------------------- +-- GranSim stuff +-------------------------------------------------------- + +staticGranHdr :: [CmmLit] +-- Gransim header words in a static closure +staticGranHdr = [] + +doGranAllocate :: CmmExpr -> Code +-- macro DO_GRAN_ALLOCATE +doGranAllocate hp + | not opt_GranMacros = nopC + | otherwise = panic "doGranAllocate" + + + +------------------------- +granFetchAndReschedule :: [(Id,GlobalReg)] -- Live registers + -> Bool -- Node reqd? 
+ -> Code +-- Emit code for simulating a fetch and then reschedule. +granFetchAndReschedule regs node_reqd + | opt_GranMacros && (node `elem` map snd regs || node_reqd) + = do { fetch + ; reschedule liveness node_reqd } + | otherwise + = nopC + where + liveness = mkRegLiveness regs 0 0 + +fetch = panic "granFetch" + -- Was: absC (CMacroStmt GRAN_FETCH []) + --HWL: generate GRAN_FETCH macro for GrAnSim + -- currently GRAN_FETCH and GRAN_FETCH_AND_RESCHEDULE are miai + +reschedule liveness node_reqd = panic "granReschedule" + -- Was: absC (CMacroStmt GRAN_RESCHEDULE [ + -- mkIntCLit (I# (word2Int# liveness_mask)), + -- mkIntCLit (if node_reqd then 1 else 0)]) + + +------------------------- +-- The @GRAN_YIELD@ macro is taken from JSM's code for Concurrent Haskell. It +-- allows to context-switch at places where @node@ is not alive (it uses the +-- @Continue@ rather than the @EnterNodeCode@ function in the RTS). We emit +-- this kind of macro at the beginning of the following kinds of basic bocks: +-- \begin{itemize} +-- \item Slow entry code where node is not alive (see @CgClosure.lhs@). Normally +-- we use @fetchAndReschedule@ at a slow entry code. +-- \item Fast entry code (see @CgClosure.lhs@). +-- \item Alternatives in case expressions (@CLabelledCode@ structures), provided +-- that they are not inlined (see @CgCases.lhs@). These alternatives will +-- be turned into separate functions. + +granYield :: [(Id,GlobalReg)] -- Live registers + -> Bool -- Node reqd? + -> Code + +granYield regs node_reqd + | opt_GranMacros && node_reqd = yield liveness + | otherwise = nopC + where + liveness = mkRegLiveness regs 0 0 + +yield liveness = panic "granYield" + -- Was : absC (CMacroStmt GRAN_YIELD + -- [mkIntCLit (I# (word2Int# liveness_mask))]) + + diff --git a/compiler/codeGen/CgPrimOp.hs b/compiler/codeGen/CgPrimOp.hs new file mode 100644 index 0000000000..bc7c9140ed --- /dev/null +++ b/compiler/codeGen/CgPrimOp.hs @@ -0,0 +1,584 @@ +----------------------------------------------------------------------------- +-- +-- Code generation for PrimOps. +-- +-- (c) The University of Glasgow 2004 +-- +----------------------------------------------------------------------------- + +module CgPrimOp ( + cgPrimOp + ) where + +#include "HsVersions.h" + +import ForeignCall ( CCallConv(CCallConv) ) +import StgSyn ( StgLiveVars, StgArg ) +import CgForeignCall ( emitForeignCall' ) +import CgBindery ( getVolatileRegs, getArgAmodes ) +import CgMonad +import CgInfoTbls ( getConstrTag ) +import CgUtils ( cmmOffsetW, cmmOffsetB, cmmLoadIndexW ) +import ForeignCall +import Cmm +import CLabel ( mkMAP_FROZEN_infoLabel, mkMAP_DIRTY_infoLabel, + mkDirty_MUT_VAR_Label, mkRtsCodeLabel ) +import CmmUtils +import MachOp +import SMRep +import PrimOp ( PrimOp(..) 
) +import SMRep ( tablesNextToCode ) +import Constants ( wORD_SIZE, wORD_SIZE_IN_BITS ) +import StaticFlags ( opt_Parallel ) +import Outputable + +-- --------------------------------------------------------------------------- +-- Code generation for PrimOps + +cgPrimOp :: [CmmReg] -- where to put the results + -> PrimOp -- the op + -> [StgArg] -- arguments + -> StgLiveVars -- live vars, in case we need to save them + -> Code + +cgPrimOp results op args live + = do arg_exprs <- getArgAmodes args + let non_void_args = [ e | (r,e) <- arg_exprs, nonVoidArg r ] + emitPrimOp results op non_void_args live + + +emitPrimOp :: [CmmReg] -- where to put the results + -> PrimOp -- the op + -> [CmmExpr] -- arguments + -> StgLiveVars -- live vars, in case we need to save them + -> Code + +-- First we handle various awkward cases specially. The remaining +-- easy cases are then handled by translateOp, defined below. + +emitPrimOp [res_r,res_c] IntAddCOp [aa,bb] live +{- + With some bit-twiddling, we can define int{Add,Sub}Czh portably in + C, and without needing any comparisons. This may not be the + fastest way to do it - if you have better code, please send it! --SDM + + Return : r = a + b, c = 0 if no overflow, 1 on overflow. + + We currently don't make use of the r value if c is != 0 (i.e. + overflow), we just convert to big integers and try again. This + could be improved by making r and c the correct values for + plugging into a new J#. + + { r = ((I_)(a)) + ((I_)(b)); \ + c = ((StgWord)(~(((I_)(a))^((I_)(b))) & (((I_)(a))^r))) \ + >> (BITS_IN (I_) - 1); \ + } + Wading through the mass of bracketry, it seems to reduce to: + c = ( (~(a^b)) & (a^r) ) >>unsigned (BITS_IN(I_)-1) + +-} + = stmtsC [ + CmmAssign res_r (CmmMachOp mo_wordAdd [aa,bb]), + CmmAssign res_c $ + CmmMachOp mo_wordUShr [ + CmmMachOp mo_wordAnd [ + CmmMachOp mo_wordNot [CmmMachOp mo_wordXor [aa,bb]], + CmmMachOp mo_wordXor [aa, CmmReg res_r] + ], + CmmLit (mkIntCLit (wORD_SIZE_IN_BITS - 1)) + ] + ] + + +emitPrimOp [res_r,res_c] IntSubCOp [aa,bb] live +{- Similarly: + #define subIntCzh(r,c,a,b) \ + { r = ((I_)(a)) - ((I_)(b)); \ + c = ((StgWord)((((I_)(a))^((I_)(b))) & (((I_)(a))^r))) \ + >> (BITS_IN (I_) - 1); \ + } + + c = ((a^b) & (a^r)) >>unsigned (BITS_IN(I_)-1) +-} + = stmtsC [ + CmmAssign res_r (CmmMachOp mo_wordSub [aa,bb]), + CmmAssign res_c $ + CmmMachOp mo_wordUShr [ + CmmMachOp mo_wordAnd [ + CmmMachOp mo_wordXor [aa,bb], + CmmMachOp mo_wordXor [aa, CmmReg res_r] + ], + CmmLit (mkIntCLit (wORD_SIZE_IN_BITS - 1)) + ] + ] + + +emitPrimOp [res] ParOp [arg] live + = do + -- for now, just implement this in a C function + -- later, we might want to inline it. 
+ vols <- getVolatileRegs live + emitForeignCall' PlayRisky + [(res,NoHint)] + (CmmForeignCall newspark CCallConv) + [(CmmReg (CmmGlobal BaseReg), PtrHint), (arg,PtrHint)] + (Just vols) + where + newspark = CmmLit (CmmLabel (mkRtsCodeLabel SLIT("newSpark"))) + +emitPrimOp [res] ReadMutVarOp [mutv] live + = stmtC (CmmAssign res (cmmLoadIndexW mutv fixedHdrSize)) + +emitPrimOp [] WriteMutVarOp [mutv,var] live + = do + stmtC (CmmStore (cmmOffsetW mutv fixedHdrSize) var) + vols <- getVolatileRegs live + emitForeignCall' PlayRisky + [{-no results-}] + (CmmForeignCall (CmmLit (CmmLabel mkDirty_MUT_VAR_Label)) + CCallConv) + [(CmmReg (CmmGlobal BaseReg), PtrHint), (mutv,PtrHint)] + (Just vols) + +-- #define sizzeofByteArrayzh(r,a) \ +-- r = (((StgArrWords *)(a))->words * sizeof(W_)) +emitPrimOp [res] SizeofByteArrayOp [arg] live + = stmtC $ + CmmAssign res (CmmMachOp mo_wordMul [ + cmmLoadIndexW arg fixedHdrSize, + CmmLit (mkIntCLit wORD_SIZE) + ]) + +-- #define sizzeofMutableByteArrayzh(r,a) \ +-- r = (((StgArrWords *)(a))->words * sizeof(W_)) +emitPrimOp [res] SizeofMutableByteArrayOp [arg] live + = emitPrimOp [res] SizeofByteArrayOp [arg] live + + +-- #define touchzh(o) /* nothing */ +emitPrimOp [] TouchOp [arg] live + = nopC + +-- #define byteArrayContentszh(r,a) r = BYTE_ARR_CTS(a) +emitPrimOp [res] ByteArrayContents_Char [arg] live + = stmtC (CmmAssign res (cmmOffsetB arg arrWordsHdrSize)) + +-- #define stableNameToIntzh(r,s) (r = ((StgStableName *)s)->sn) +emitPrimOp [res] StableNameToIntOp [arg] live + = stmtC (CmmAssign res (cmmLoadIndexW arg fixedHdrSize)) + +-- #define eqStableNamezh(r,sn1,sn2) \ +-- (r = (((StgStableName *)sn1)->sn == ((StgStableName *)sn2)->sn)) +emitPrimOp [res] EqStableNameOp [arg1,arg2] live + = stmtC (CmmAssign res (CmmMachOp mo_wordEq [ + cmmLoadIndexW arg1 fixedHdrSize, + cmmLoadIndexW arg2 fixedHdrSize + ])) + + +emitPrimOp [res] ReallyUnsafePtrEqualityOp [arg1,arg2] live + = stmtC (CmmAssign res (CmmMachOp mo_wordEq [arg1,arg2])) + +-- #define addrToHValuezh(r,a) r=(P_)a +emitPrimOp [res] AddrToHValueOp [arg] live + = stmtC (CmmAssign res arg) + +-- #define dataToTagzh(r,a) r=(GET_TAG(((StgClosure *)a)->header.info)) +emitPrimOp [res] DataToTagOp [arg] live + = stmtC (CmmAssign res (getConstrTag arg)) + +{- Freezing arrays-of-ptrs requires changing an info table, for the + benefit of the generational collector. It needs to scavenge mutable + objects, even if they are in old space. When they become immutable, + they can be removed from this scavenge list. 
-} + +-- #define unsafeFreezzeArrayzh(r,a) +-- { +-- SET_INFO((StgClosure *)a,&stg_MUT_ARR_PTRS_FROZEN0_info); +-- r = a; +-- } +emitPrimOp [res] UnsafeFreezeArrayOp [arg] live + = stmtsC [ setInfo arg (CmmLit (CmmLabel mkMAP_FROZEN_infoLabel)), + CmmAssign res arg ] + +-- #define unsafeFreezzeByteArrayzh(r,a) r=(a) +emitPrimOp [res] UnsafeFreezeByteArrayOp [arg] live + = stmtC (CmmAssign res arg) + +-- Reading/writing pointer arrays + +emitPrimOp [r] ReadArrayOp [obj,ix] live = doReadPtrArrayOp r obj ix +emitPrimOp [r] IndexArrayOp [obj,ix] live = doReadPtrArrayOp r obj ix +emitPrimOp [] WriteArrayOp [obj,ix,v] live = doWritePtrArrayOp obj ix v + +-- IndexXXXoffAddr + +emitPrimOp res IndexOffAddrOp_Char args live = doIndexOffAddrOp (Just mo_u_8ToWord) I8 res args +emitPrimOp res IndexOffAddrOp_WideChar args live = doIndexOffAddrOp (Just mo_u_32ToWord) I32 res args +emitPrimOp res IndexOffAddrOp_Int args live = doIndexOffAddrOp Nothing wordRep res args +emitPrimOp res IndexOffAddrOp_Word args live = doIndexOffAddrOp Nothing wordRep res args +emitPrimOp res IndexOffAddrOp_Addr args live = doIndexOffAddrOp Nothing wordRep res args +emitPrimOp res IndexOffAddrOp_Float args live = doIndexOffAddrOp Nothing F32 res args +emitPrimOp res IndexOffAddrOp_Double args live = doIndexOffAddrOp Nothing F64 res args +emitPrimOp res IndexOffAddrOp_StablePtr args live = doIndexOffAddrOp Nothing wordRep res args +emitPrimOp res IndexOffAddrOp_Int8 args live = doIndexOffAddrOp (Just mo_s_8ToWord) I8 res args +emitPrimOp res IndexOffAddrOp_Int16 args live = doIndexOffAddrOp (Just mo_s_16ToWord) I16 res args +emitPrimOp res IndexOffAddrOp_Int32 args live = doIndexOffAddrOp (Just mo_s_32ToWord) I32 res args +emitPrimOp res IndexOffAddrOp_Int64 args live = doIndexOffAddrOp Nothing I64 res args +emitPrimOp res IndexOffAddrOp_Word8 args live = doIndexOffAddrOp (Just mo_u_8ToWord) I8 res args +emitPrimOp res IndexOffAddrOp_Word16 args live = doIndexOffAddrOp (Just mo_u_16ToWord) I16 res args +emitPrimOp res IndexOffAddrOp_Word32 args live = doIndexOffAddrOp (Just mo_u_32ToWord) I32 res args +emitPrimOp res IndexOffAddrOp_Word64 args live = doIndexOffAddrOp Nothing I64 res args + +-- ReadXXXoffAddr, which are identical, for our purposes, to IndexXXXoffAddr. 
+ +emitPrimOp res ReadOffAddrOp_Char args live = doIndexOffAddrOp (Just mo_u_8ToWord) I8 res args +emitPrimOp res ReadOffAddrOp_WideChar args live = doIndexOffAddrOp (Just mo_u_32ToWord) I32 res args +emitPrimOp res ReadOffAddrOp_Int args live = doIndexOffAddrOp Nothing wordRep res args +emitPrimOp res ReadOffAddrOp_Word args live = doIndexOffAddrOp Nothing wordRep res args +emitPrimOp res ReadOffAddrOp_Addr args live = doIndexOffAddrOp Nothing wordRep res args +emitPrimOp res ReadOffAddrOp_Float args live = doIndexOffAddrOp Nothing F32 res args +emitPrimOp res ReadOffAddrOp_Double args live = doIndexOffAddrOp Nothing F64 res args +emitPrimOp res ReadOffAddrOp_StablePtr args live = doIndexOffAddrOp Nothing wordRep res args +emitPrimOp res ReadOffAddrOp_Int8 args live = doIndexOffAddrOp (Just mo_s_8ToWord) I8 res args +emitPrimOp res ReadOffAddrOp_Int16 args live = doIndexOffAddrOp (Just mo_s_16ToWord) I16 res args +emitPrimOp res ReadOffAddrOp_Int32 args live = doIndexOffAddrOp (Just mo_s_32ToWord) I32 res args +emitPrimOp res ReadOffAddrOp_Int64 args live = doIndexOffAddrOp Nothing I64 res args +emitPrimOp res ReadOffAddrOp_Word8 args live = doIndexOffAddrOp (Just mo_u_8ToWord) I8 res args +emitPrimOp res ReadOffAddrOp_Word16 args live = doIndexOffAddrOp (Just mo_u_16ToWord) I16 res args +emitPrimOp res ReadOffAddrOp_Word32 args live = doIndexOffAddrOp (Just mo_u_32ToWord) I32 res args +emitPrimOp res ReadOffAddrOp_Word64 args live = doIndexOffAddrOp Nothing I64 res args + +-- IndexXXXArray + +emitPrimOp res IndexByteArrayOp_Char args live = doIndexByteArrayOp (Just mo_u_8ToWord) I8 res args +emitPrimOp res IndexByteArrayOp_WideChar args live = doIndexByteArrayOp (Just mo_u_32ToWord) I32 res args +emitPrimOp res IndexByteArrayOp_Int args live = doIndexByteArrayOp Nothing wordRep res args +emitPrimOp res IndexByteArrayOp_Word args live = doIndexByteArrayOp Nothing wordRep res args +emitPrimOp res IndexByteArrayOp_Addr args live = doIndexByteArrayOp Nothing wordRep res args +emitPrimOp res IndexByteArrayOp_Float args live = doIndexByteArrayOp Nothing F32 res args +emitPrimOp res IndexByteArrayOp_Double args live = doIndexByteArrayOp Nothing F64 res args +emitPrimOp res IndexByteArrayOp_StablePtr args live = doIndexByteArrayOp Nothing wordRep res args +emitPrimOp res IndexByteArrayOp_Int8 args live = doIndexByteArrayOp (Just mo_s_8ToWord) I8 res args +emitPrimOp res IndexByteArrayOp_Int16 args live = doIndexByteArrayOp (Just mo_s_16ToWord) I16 res args +emitPrimOp res IndexByteArrayOp_Int32 args live = doIndexByteArrayOp (Just mo_s_32ToWord) I32 res args +emitPrimOp res IndexByteArrayOp_Int64 args live = doIndexByteArrayOp Nothing I64 res args +emitPrimOp res IndexByteArrayOp_Word8 args live = doIndexByteArrayOp (Just mo_u_8ToWord) I8 res args +emitPrimOp res IndexByteArrayOp_Word16 args live = doIndexByteArrayOp (Just mo_u_16ToWord) I16 res args +emitPrimOp res IndexByteArrayOp_Word32 args live = doIndexByteArrayOp (Just mo_u_32ToWord) I32 res args +emitPrimOp res IndexByteArrayOp_Word64 args live = doIndexByteArrayOp Nothing I64 res args + +-- ReadXXXArray, identical to IndexXXXArray. 
+ +emitPrimOp res ReadByteArrayOp_Char args live = doIndexByteArrayOp (Just mo_u_8ToWord) I8 res args +emitPrimOp res ReadByteArrayOp_WideChar args live = doIndexByteArrayOp (Just mo_u_32ToWord) I32 res args +emitPrimOp res ReadByteArrayOp_Int args live = doIndexByteArrayOp Nothing wordRep res args +emitPrimOp res ReadByteArrayOp_Word args live = doIndexByteArrayOp Nothing wordRep res args +emitPrimOp res ReadByteArrayOp_Addr args live = doIndexByteArrayOp Nothing wordRep res args +emitPrimOp res ReadByteArrayOp_Float args live = doIndexByteArrayOp Nothing F32 res args +emitPrimOp res ReadByteArrayOp_Double args live = doIndexByteArrayOp Nothing F64 res args +emitPrimOp res ReadByteArrayOp_StablePtr args live = doIndexByteArrayOp Nothing wordRep res args +emitPrimOp res ReadByteArrayOp_Int8 args live = doIndexByteArrayOp (Just mo_s_8ToWord) I8 res args +emitPrimOp res ReadByteArrayOp_Int16 args live = doIndexByteArrayOp (Just mo_s_16ToWord) I16 res args +emitPrimOp res ReadByteArrayOp_Int32 args live = doIndexByteArrayOp (Just mo_s_32ToWord) I32 res args +emitPrimOp res ReadByteArrayOp_Int64 args live = doIndexByteArrayOp Nothing I64 res args +emitPrimOp res ReadByteArrayOp_Word8 args live = doIndexByteArrayOp (Just mo_u_8ToWord) I8 res args +emitPrimOp res ReadByteArrayOp_Word16 args live = doIndexByteArrayOp (Just mo_u_16ToWord) I16 res args +emitPrimOp res ReadByteArrayOp_Word32 args live = doIndexByteArrayOp (Just mo_u_32ToWord) I32 res args +emitPrimOp res ReadByteArrayOp_Word64 args live = doIndexByteArrayOp Nothing I64 res args + +-- WriteXXXoffAddr + +emitPrimOp res WriteOffAddrOp_Char args live = doWriteOffAddrOp (Just mo_WordTo8) I8 res args +emitPrimOp res WriteOffAddrOp_WideChar args live = doWriteOffAddrOp (Just mo_WordTo32) I32 res args +emitPrimOp res WriteOffAddrOp_Int args live = doWriteOffAddrOp Nothing wordRep res args +emitPrimOp res WriteOffAddrOp_Word args live = doWriteOffAddrOp Nothing wordRep res args +emitPrimOp res WriteOffAddrOp_Addr args live = doWriteOffAddrOp Nothing wordRep res args +emitPrimOp res WriteOffAddrOp_Float args live = doWriteOffAddrOp Nothing F32 res args +emitPrimOp res WriteOffAddrOp_Double args live = doWriteOffAddrOp Nothing F64 res args +emitPrimOp res WriteOffAddrOp_StablePtr args live = doWriteOffAddrOp Nothing wordRep res args +emitPrimOp res WriteOffAddrOp_Int8 args live = doWriteOffAddrOp (Just mo_WordTo8) I8 res args +emitPrimOp res WriteOffAddrOp_Int16 args live = doWriteOffAddrOp (Just mo_WordTo16) I16 res args +emitPrimOp res WriteOffAddrOp_Int32 args live = doWriteOffAddrOp (Just mo_WordTo32) I32 res args +emitPrimOp res WriteOffAddrOp_Int64 args live = doWriteOffAddrOp Nothing I64 res args +emitPrimOp res WriteOffAddrOp_Word8 args live = doWriteOffAddrOp (Just mo_WordTo8) I8 res args +emitPrimOp res WriteOffAddrOp_Word16 args live = doWriteOffAddrOp (Just mo_WordTo16) I16 res args +emitPrimOp res WriteOffAddrOp_Word32 args live = doWriteOffAddrOp (Just mo_WordTo32) I32 res args +emitPrimOp res WriteOffAddrOp_Word64 args live = doWriteOffAddrOp Nothing I64 res args + +-- WriteXXXArray + +emitPrimOp res WriteByteArrayOp_Char args live = doWriteByteArrayOp (Just mo_WordTo8) I8 res args +emitPrimOp res WriteByteArrayOp_WideChar args live = doWriteByteArrayOp (Just mo_WordTo32) I32 res args +emitPrimOp res WriteByteArrayOp_Int args live = doWriteByteArrayOp Nothing wordRep res args +emitPrimOp res WriteByteArrayOp_Word args live = doWriteByteArrayOp Nothing wordRep res args +emitPrimOp res WriteByteArrayOp_Addr args live = 
doWriteByteArrayOp Nothing wordRep res args +emitPrimOp res WriteByteArrayOp_Float args live = doWriteByteArrayOp Nothing F32 res args +emitPrimOp res WriteByteArrayOp_Double args live = doWriteByteArrayOp Nothing F64 res args +emitPrimOp res WriteByteArrayOp_StablePtr args live = doWriteByteArrayOp Nothing wordRep res args +emitPrimOp res WriteByteArrayOp_Int8 args live = doWriteByteArrayOp (Just mo_WordTo8) I8 res args +emitPrimOp res WriteByteArrayOp_Int16 args live = doWriteByteArrayOp (Just mo_WordTo16) I16 res args +emitPrimOp res WriteByteArrayOp_Int32 args live = doWriteByteArrayOp (Just mo_WordTo32) I32 res args +emitPrimOp res WriteByteArrayOp_Int64 args live = doWriteByteArrayOp Nothing I64 res args +emitPrimOp res WriteByteArrayOp_Word8 args live = doWriteByteArrayOp (Just mo_WordTo8) I8 res args +emitPrimOp res WriteByteArrayOp_Word16 args live = doWriteByteArrayOp (Just mo_WordTo16) I16 res args +emitPrimOp res WriteByteArrayOp_Word32 args live = doWriteByteArrayOp (Just mo_WordTo32) I32 res args +emitPrimOp res WriteByteArrayOp_Word64 args live = doWriteByteArrayOp Nothing I64 res args + + +-- The rest just translate straightforwardly +emitPrimOp [res] op [arg] live + | nopOp op + = stmtC (CmmAssign res arg) + + | Just (mop,rep) <- narrowOp op + = stmtC (CmmAssign res (CmmMachOp (mop rep wordRep) [ + CmmMachOp (mop wordRep rep) [arg]])) + +emitPrimOp [res] op args live + | Just prim <- callishOp op + = do vols <- getVolatileRegs live + emitForeignCall' PlayRisky + [(res,NoHint)] + (CmmPrim prim) + [(a,NoHint) | a<-args] -- ToDo: hints? + (Just vols) + + | Just mop <- translateOp op + = let stmt = CmmAssign res (CmmMachOp mop args) in + stmtC stmt + +emitPrimOp _ op _ _ + = pprPanic "emitPrimOp: can't translate PrimOp" (ppr op) + + +-- These PrimOps are NOPs in Cmm + +nopOp Int2WordOp = True +nopOp Word2IntOp = True +nopOp Int2AddrOp = True +nopOp Addr2IntOp = True +nopOp ChrOp = True -- Int# and Char# are rep'd the same +nopOp OrdOp = True +nopOp _ = False + +-- These PrimOps turn into double casts + +narrowOp Narrow8IntOp = Just (MO_S_Conv, I8) +narrowOp Narrow16IntOp = Just (MO_S_Conv, I16) +narrowOp Narrow32IntOp = Just (MO_S_Conv, I32) +narrowOp Narrow8WordOp = Just (MO_U_Conv, I8) +narrowOp Narrow16WordOp = Just (MO_U_Conv, I16) +narrowOp Narrow32WordOp = Just (MO_U_Conv, I32) +narrowOp _ = Nothing + +-- Native word signless ops + +translateOp IntAddOp = Just mo_wordAdd +translateOp IntSubOp = Just mo_wordSub +translateOp WordAddOp = Just mo_wordAdd +translateOp WordSubOp = Just mo_wordSub +translateOp AddrAddOp = Just mo_wordAdd +translateOp AddrSubOp = Just mo_wordSub + +translateOp IntEqOp = Just mo_wordEq +translateOp IntNeOp = Just mo_wordNe +translateOp WordEqOp = Just mo_wordEq +translateOp WordNeOp = Just mo_wordNe +translateOp AddrEqOp = Just mo_wordEq +translateOp AddrNeOp = Just mo_wordNe + +translateOp AndOp = Just mo_wordAnd +translateOp OrOp = Just mo_wordOr +translateOp XorOp = Just mo_wordXor +translateOp NotOp = Just mo_wordNot +translateOp SllOp = Just mo_wordShl +translateOp SrlOp = Just mo_wordUShr + +translateOp AddrRemOp = Just mo_wordURem + +-- Native word signed ops + +translateOp IntMulOp = Just mo_wordMul +translateOp IntMulMayOfloOp = Just (MO_S_MulMayOflo wordRep) +translateOp IntQuotOp = Just mo_wordSQuot +translateOp IntRemOp = Just mo_wordSRem +translateOp IntNegOp = Just mo_wordSNeg + + +translateOp IntGeOp = Just mo_wordSGe +translateOp IntLeOp = Just mo_wordSLe +translateOp IntGtOp = Just mo_wordSGt +translateOp IntLtOp = Just 
mo_wordSLt + +translateOp ISllOp = Just mo_wordShl +translateOp ISraOp = Just mo_wordSShr +translateOp ISrlOp = Just mo_wordUShr + +-- Native word unsigned ops + +translateOp WordGeOp = Just mo_wordUGe +translateOp WordLeOp = Just mo_wordULe +translateOp WordGtOp = Just mo_wordUGt +translateOp WordLtOp = Just mo_wordULt + +translateOp WordMulOp = Just mo_wordMul +translateOp WordQuotOp = Just mo_wordUQuot +translateOp WordRemOp = Just mo_wordURem + +translateOp AddrGeOp = Just mo_wordUGe +translateOp AddrLeOp = Just mo_wordULe +translateOp AddrGtOp = Just mo_wordUGt +translateOp AddrLtOp = Just mo_wordULt + +-- Char# ops + +translateOp CharEqOp = Just (MO_Eq wordRep) +translateOp CharNeOp = Just (MO_Ne wordRep) +translateOp CharGeOp = Just (MO_U_Ge wordRep) +translateOp CharLeOp = Just (MO_U_Le wordRep) +translateOp CharGtOp = Just (MO_U_Gt wordRep) +translateOp CharLtOp = Just (MO_U_Lt wordRep) + +-- Double ops + +translateOp DoubleEqOp = Just (MO_Eq F64) +translateOp DoubleNeOp = Just (MO_Ne F64) +translateOp DoubleGeOp = Just (MO_S_Ge F64) +translateOp DoubleLeOp = Just (MO_S_Le F64) +translateOp DoubleGtOp = Just (MO_S_Gt F64) +translateOp DoubleLtOp = Just (MO_S_Lt F64) + +translateOp DoubleAddOp = Just (MO_Add F64) +translateOp DoubleSubOp = Just (MO_Sub F64) +translateOp DoubleMulOp = Just (MO_Mul F64) +translateOp DoubleDivOp = Just (MO_S_Quot F64) +translateOp DoubleNegOp = Just (MO_S_Neg F64) + +-- Float ops + +translateOp FloatEqOp = Just (MO_Eq F32) +translateOp FloatNeOp = Just (MO_Ne F32) +translateOp FloatGeOp = Just (MO_S_Ge F32) +translateOp FloatLeOp = Just (MO_S_Le F32) +translateOp FloatGtOp = Just (MO_S_Gt F32) +translateOp FloatLtOp = Just (MO_S_Lt F32) + +translateOp FloatAddOp = Just (MO_Add F32) +translateOp FloatSubOp = Just (MO_Sub F32) +translateOp FloatMulOp = Just (MO_Mul F32) +translateOp FloatDivOp = Just (MO_S_Quot F32) +translateOp FloatNegOp = Just (MO_S_Neg F32) + +-- Conversions + +translateOp Int2DoubleOp = Just (MO_S_Conv wordRep F64) +translateOp Double2IntOp = Just (MO_S_Conv F64 wordRep) + +translateOp Int2FloatOp = Just (MO_S_Conv wordRep F32) +translateOp Float2IntOp = Just (MO_S_Conv F32 wordRep) + +translateOp Float2DoubleOp = Just (MO_S_Conv F32 F64) +translateOp Double2FloatOp = Just (MO_S_Conv F64 F32) + +-- Word comparisons masquerading as more exotic things. + +translateOp SameMutVarOp = Just mo_wordEq +translateOp SameMVarOp = Just mo_wordEq +translateOp SameMutableArrayOp = Just mo_wordEq +translateOp SameMutableByteArrayOp = Just mo_wordEq +translateOp SameTVarOp = Just mo_wordEq +translateOp EqStablePtrOp = Just mo_wordEq + +translateOp _ = Nothing + +-- These primops are implemented by CallishMachOps, because they sometimes +-- turn into foreign calls depending on the backend. 
+ +callishOp DoublePowerOp = Just MO_F64_Pwr +callishOp DoubleSinOp = Just MO_F64_Sin +callishOp DoubleCosOp = Just MO_F64_Cos +callishOp DoubleTanOp = Just MO_F64_Tan +callishOp DoubleSinhOp = Just MO_F64_Sinh +callishOp DoubleCoshOp = Just MO_F64_Cosh +callishOp DoubleTanhOp = Just MO_F64_Tanh +callishOp DoubleAsinOp = Just MO_F64_Asin +callishOp DoubleAcosOp = Just MO_F64_Acos +callishOp DoubleAtanOp = Just MO_F64_Atan +callishOp DoubleLogOp = Just MO_F64_Log +callishOp DoubleExpOp = Just MO_F64_Exp +callishOp DoubleSqrtOp = Just MO_F64_Sqrt + +callishOp FloatPowerOp = Just MO_F32_Pwr +callishOp FloatSinOp = Just MO_F32_Sin +callishOp FloatCosOp = Just MO_F32_Cos +callishOp FloatTanOp = Just MO_F32_Tan +callishOp FloatSinhOp = Just MO_F32_Sinh +callishOp FloatCoshOp = Just MO_F32_Cosh +callishOp FloatTanhOp = Just MO_F32_Tanh +callishOp FloatAsinOp = Just MO_F32_Asin +callishOp FloatAcosOp = Just MO_F32_Acos +callishOp FloatAtanOp = Just MO_F32_Atan +callishOp FloatLogOp = Just MO_F32_Log +callishOp FloatExpOp = Just MO_F32_Exp +callishOp FloatSqrtOp = Just MO_F32_Sqrt + +callishOp _ = Nothing + +------------------------------------------------------------------------------ +-- Helpers for translating various minor variants of array indexing. + +doIndexOffAddrOp maybe_post_read_cast rep [res] [addr,idx] + = mkBasicIndexedRead 0 maybe_post_read_cast rep res addr idx +doIndexOffAddrOp _ _ _ _ + = panic "CgPrimOp: doIndexOffAddrOp" + +doIndexByteArrayOp maybe_post_read_cast rep [res] [addr,idx] + = mkBasicIndexedRead arrWordsHdrSize maybe_post_read_cast rep res addr idx +doIndexByteArrayOp _ _ _ _ + = panic "CgPrimOp: doIndexByteArrayOp" + +doReadPtrArrayOp res addr idx + = mkBasicIndexedRead arrPtrsHdrSize Nothing wordRep res addr idx + + +doWriteOffAddrOp maybe_pre_write_cast rep [] [addr,idx,val] + = mkBasicIndexedWrite 0 maybe_pre_write_cast rep addr idx val +doWriteOffAddrOp _ _ _ _ + = panic "CgPrimOp: doWriteOffAddrOp" + +doWriteByteArrayOp maybe_pre_write_cast rep [] [addr,idx,val] + = mkBasicIndexedWrite arrWordsHdrSize maybe_pre_write_cast rep addr idx val +doWriteByteArrayOp _ _ _ _ + = panic "CgPrimOp: doWriteByteArrayOp" + +doWritePtrArrayOp addr idx val + = do stmtC (setInfo addr (CmmLit (CmmLabel mkMAP_DIRTY_infoLabel))) + mkBasicIndexedWrite arrPtrsHdrSize Nothing wordRep addr idx val + + +mkBasicIndexedRead off Nothing read_rep res base idx + = stmtC (CmmAssign res (cmmLoadIndexOffExpr off read_rep base idx)) +mkBasicIndexedRead off (Just cast) read_rep res base idx + = stmtC (CmmAssign res (CmmMachOp cast [ + cmmLoadIndexOffExpr off read_rep base idx])) + +mkBasicIndexedWrite off Nothing write_rep base idx val + = stmtC (CmmStore (cmmIndexOffExpr off write_rep base idx) val) +mkBasicIndexedWrite off (Just cast) write_rep base idx val + = stmtC (CmmStore (cmmIndexOffExpr off write_rep base idx) (CmmMachOp cast [val])) + +-- ---------------------------------------------------------------------------- +-- Misc utils + +cmmIndexOffExpr :: ByteOff -> MachRep -> CmmExpr -> CmmExpr -> CmmExpr +cmmIndexOffExpr off rep base idx + = cmmIndexExpr rep (cmmOffsetB base off) idx + +cmmLoadIndexOffExpr :: ByteOff -> MachRep -> CmmExpr -> CmmExpr -> CmmExpr +cmmLoadIndexOffExpr off rep base idx + = CmmLoad (cmmIndexOffExpr off rep base idx) rep + +setInfo :: CmmExpr -> CmmExpr -> CmmStmt +setInfo closure_ptr info_ptr = CmmStore closure_ptr info_ptr + diff --git a/compiler/codeGen/CgProf.hs b/compiler/codeGen/CgProf.hs new file mode 100644 index 0000000000..1488e34956 --- /dev/null +++ 
b/compiler/codeGen/CgProf.hs @@ -0,0 +1,478 @@ +----------------------------------------------------------------------------- +-- +-- Code generation for profiling +-- +-- (c) The University of Glasgow 2004 +-- +----------------------------------------------------------------------------- + +module CgProf ( + mkCCostCentre, mkCCostCentreStack, + + -- Cost-centre Profiling + dynProfHdr, profDynAlloc, profAlloc, staticProfHdr, initUpdFrameProf, + enterCostCentre, enterCostCentrePAP, enterCostCentreThunk, + chooseDynCostCentres, + costCentreFrom, + curCCS, curCCSAddr, + emitCostCentreDecl, emitCostCentreStackDecl, + emitRegisterCC, emitRegisterCCS, + emitSetCCC, emitCCS, + + -- Lag/drag/void stuff + ldvEnter, ldvRecordCreate + ) where + +#include "HsVersions.h" +#include "MachDeps.h" + -- For WORD_SIZE_IN_BITS only. +#include "../includes/Constants.h" + -- For LDV_CREATE_MASK, LDV_STATE_USE + -- which are StgWords +#include "../includes/DerivedConstants.h" + -- For REP_xxx constants, which are MachReps + +import ClosureInfo ( ClosureInfo, closureSize, + closureName, isToplevClosure, closureReEntrant, ) +import CgUtils +import CgMonad +import SMRep ( StgWord, profHdrSize ) + +import Cmm +import MachOp +import CmmUtils ( zeroCLit, mkIntCLit, mkLblExpr ) +import CLabel ( mkCCLabel, mkCCSLabel, mkRtsDataLabel ) + +import Module ( moduleString ) +import Id ( Id ) +import CostCentre +import StgSyn ( GenStgExpr(..), StgExpr ) +import StaticFlags ( opt_SccProfilingOn ) +import FastString ( FastString, mkFastString, LitString ) +import Constants -- Lots of field offsets +import Outputable + +import Maybe +import Char ( ord ) +import Monad ( when ) + +----------------------------------------------------------------------------- +-- +-- Cost-centre-stack Profiling +-- +----------------------------------------------------------------------------- + +-- Expression representing the current cost centre stack +curCCS :: CmmExpr +curCCS = CmmLoad curCCSAddr wordRep + +-- Address of current CCS variable, for storing into +curCCSAddr :: CmmExpr +curCCSAddr = CmmLit (CmmLabel (mkRtsDataLabel SLIT("CCCS"))) + +mkCCostCentre :: CostCentre -> CmmLit +mkCCostCentre cc = CmmLabel (mkCCLabel cc) + +mkCCostCentreStack :: CostCentreStack -> CmmLit +mkCCostCentreStack ccs = CmmLabel (mkCCSLabel ccs) + +costCentreFrom :: CmmExpr -- A closure pointer + -> CmmExpr -- The cost centre from that closure +costCentreFrom cl = CmmLoad (cmmOffsetB cl oFFSET_StgHeader_ccs) wordRep + +staticProfHdr :: CostCentreStack -> [CmmLit] +-- The profiling header words in a static closure +-- Was SET_STATIC_PROF_HDR +staticProfHdr ccs = ifProfilingL [mkCCostCentreStack ccs, + staticLdvInit] + +dynProfHdr :: CmmExpr -> [CmmExpr] +-- Profiling header words in a dynamic closure +dynProfHdr ccs = ifProfilingL [ccs, dynLdvInit] + +initUpdFrameProf :: CmmExpr -> Code +-- Initialise the profiling field of an update frame +initUpdFrameProf frame_amode + = ifProfiling $ -- frame->header.prof.ccs = CCCS + stmtC (CmmStore (cmmOffsetB frame_amode oFFSET_StgHeader_ccs) curCCS) + -- frame->header.prof.hp.rs = NULL (or frame-header.prof.hp.ldvw = 0) + -- is unnecessary because it is not used anyhow. + +-- ----------------------------------------------------------------------------- +-- Recording allocation in a cost centre + +-- | Record the allocation of a closure. The CmmExpr is the cost +-- centre stack to which to attribute the allocation. 
+profDynAlloc :: ClosureInfo -> CmmExpr -> Code +profDynAlloc cl_info ccs + = ifProfiling $ + profAlloc (CmmLit (mkIntCLit (closureSize cl_info))) ccs + +-- | Record the allocation of a closure (size is given by a CmmExpr) +-- The size must be in words, because the allocation counter in a CCS counts +-- in words. +profAlloc :: CmmExpr -> CmmExpr -> Code +profAlloc words ccs + = ifProfiling $ + stmtC (addToMemE alloc_rep + (cmmOffsetB ccs oFFSET_CostCentreStack_mem_alloc) + (CmmMachOp (MO_U_Conv wordRep alloc_rep) $ + [CmmMachOp mo_wordSub [words, + CmmLit (mkIntCLit profHdrSize)]])) + -- subtract the "profiling overhead", which is the + -- profiling header in a closure. + where + alloc_rep = REP_CostCentreStack_mem_alloc + +-- ---------------------------------------------------------------------- +-- Setting the cost centre in a new closure + +chooseDynCostCentres :: CostCentreStack + -> [Id] -- Args + -> StgExpr -- Body + -> FCode (CmmExpr, CmmExpr) +-- Called when alllcating a closure +-- Tells which cost centre to put in the object, and which +-- to blame the cost of allocation on +chooseDynCostCentres ccs args body = do + -- Cost-centre we record in the object + use_ccs <- emitCCS ccs + + -- Cost-centre on whom we blame the allocation + let blame_ccs + | null args && isBox body = CmmLit (mkCCostCentreStack overheadCCS) + | otherwise = use_ccs + + return (use_ccs, blame_ccs) + + +-- Some CostCentreStacks are a sequence of pushes on top of CCCS. +-- These pushes must be performed before we can refer to the stack in +-- an expression. +emitCCS :: CostCentreStack -> FCode CmmExpr +emitCCS ccs = push_em (ccsExpr ccs') (reverse cc's) + where + (cc's, ccs') = decomposeCCS ccs + + push_em ccs [] = return ccs + push_em ccs (cc:rest) = do + tmp <- newTemp wordRep + pushCostCentre tmp ccs cc + push_em (CmmReg tmp) rest + +ccsExpr :: CostCentreStack -> CmmExpr +ccsExpr ccs + | isCurrentCCS ccs = curCCS + | otherwise = CmmLit (mkCCostCentreStack ccs) + + +isBox :: StgExpr -> Bool +-- If it's an utterly trivial RHS, then it must be +-- one introduced by boxHigherOrderArgs for profiling, +-- so we charge it to "OVERHEAD". +-- This looks like a GROSS HACK to me --SDM +isBox (StgApp fun []) = True +isBox other = False + + +-- ----------------------------------------------------------------------- +-- Setting the current cost centre on entry to a closure + +-- For lexically scoped profiling we have to load the cost centre from +-- the closure entered, if the costs are not supposed to be inherited. +-- This is done immediately on entering the fast entry point. + +-- Load current cost centre from closure, if not inherited. +-- Node is guaranteed to point to it, if profiling and not inherited. + +enterCostCentre + :: ClosureInfo + -> CostCentreStack + -> StgExpr -- The RHS of the closure + -> Code + +-- We used to have a special case for bindings of form +-- f = g True +-- where g has arity 2. The RHS is a thunk, but we don't +-- need to update it; and we want to subsume costs. +-- We don't have these sort of PAPs any more, so the special +-- case has gone away. 
+ +enterCostCentre closure_info ccs body + = ifProfiling $ + ASSERT2(not (noCCSAttached ccs), ppr (closureName closure_info) <+> ppr ccs) + enter_cost_centre closure_info ccs body + +enter_cost_centre closure_info ccs body + | isSubsumedCCS ccs + = ASSERT(isToplevClosure closure_info) + ASSERT(re_entrant) + enter_ccs_fsub + + | isDerivedFromCurrentCCS ccs + = do { + if re_entrant && not is_box + then + enter_ccs_fun node_ccs + else + stmtC (CmmStore curCCSAddr node_ccs) + + -- don't forget to bump the scc count. This closure might have been + -- of the form let x = _scc_ "x" e in ...x..., which the SCCfinal + -- pass has turned into simply let x = e in ...x... and attached + -- the _scc_ as PushCostCentre(x,CCCS) on the x closure. So that + -- we don't lose the scc counter, bump it in the entry code for x. + -- ToDo: for a multi-push we should really bump the counter for + -- each of the intervening CCSs, not just the top one. + ; when (not (isCurrentCCS ccs)) $ + stmtC (bumpSccCount curCCS) + } + + | isCafCCS ccs + = ASSERT(isToplevClosure closure_info) + ASSERT(not re_entrant) + do { -- This is just a special case of the isDerivedFromCurrentCCS + -- case above. We could delete this, but it's a micro + -- optimisation and saves a bit of code. + stmtC (CmmStore curCCSAddr enc_ccs) + ; stmtC (bumpSccCount node_ccs) + } + + | otherwise + = panic "enterCostCentre" + where + enc_ccs = CmmLit (mkCCostCentreStack ccs) + re_entrant = closureReEntrant closure_info + node_ccs = costCentreFrom (CmmReg nodeReg) + is_box = isBox body + +-- set the current CCS when entering a PAP +enterCostCentrePAP :: CmmExpr -> Code +enterCostCentrePAP closure = + ifProfiling $ do + enter_ccs_fun (costCentreFrom closure) + enteringPAP 1 + +enterCostCentreThunk :: CmmExpr -> Code +enterCostCentreThunk closure = + ifProfiling $ do + stmtC $ CmmStore curCCSAddr (costCentreFrom closure) + +enter_ccs_fun stack = emitRtsCall SLIT("EnterFunCCS") [(stack,PtrHint)] + -- ToDo: vols + +enter_ccs_fsub = enteringPAP 0 + +-- When entering a PAP, EnterFunCCS is called by both the PAP entry +-- code and the function entry code; we don't want the function's +-- entry code to also update CCCS in the event that it was called via +-- a PAP, so we set the flag entering_PAP to indicate that we are +-- entering via a PAP. 
+enteringPAP :: Integer -> Code +enteringPAP n + = stmtC (CmmStore (CmmLit (CmmLabel (mkRtsDataLabel SLIT("entering_PAP")))) + (CmmLit (CmmInt n cIntRep))) + +ifProfiling :: Code -> Code +ifProfiling code + | opt_SccProfilingOn = code + | otherwise = nopC + +ifProfilingL :: [a] -> [a] +ifProfilingL xs + | opt_SccProfilingOn = xs + | otherwise = [] + + +-- --------------------------------------------------------------------------- +-- Initialising Cost Centres & CCSs + +emitCostCentreDecl + :: CostCentre + -> Code +emitCostCentreDecl cc = do + { label <- mkStringCLit (costCentreUserName cc) + ; modl <- mkStringCLit (moduleString (cc_mod cc)) + ; let + lits = [ zero, -- StgInt ccID, + label, -- char *label, + modl, -- char *module, + zero, -- StgWord time_ticks + zero64, -- StgWord64 mem_alloc + subsumed, -- StgInt is_caf + zero -- struct _CostCentre *link + ] + ; emitDataLits (mkCCLabel cc) lits + } + where + subsumed | isCafCC cc = mkIntCLit (ord 'c') -- 'c' == is a CAF + | otherwise = mkIntCLit (ord 'B') -- 'B' == is boring + + +emitCostCentreStackDecl + :: CostCentreStack + -> Code +emitCostCentreStackDecl ccs + | Just cc <- maybeSingletonCCS ccs = do + { let + -- Note: to avoid making any assumptions about how the + -- C compiler (that compiles the RTS, in particular) does + -- layouts of structs containing long-longs, simply + -- pad out the struct with zero words until we hit the + -- size of the overall struct (which we get via DerivedConstants.h) + -- + lits = zero : mkCCostCentre cc : replicate (sizeof_ccs_words - 2) zero + ; emitDataLits (mkCCSLabel ccs) lits + } + | otherwise = pprPanic "emitCostCentreStackDecl" (ppr ccs) + +zero = mkIntCLit 0 +zero64 = CmmInt 0 I64 + +sizeof_ccs_words :: Int +sizeof_ccs_words + -- round up to the next word. 
+ | ms == 0 = ws + | otherwise = ws + 1 + where + (ws,ms) = SIZEOF_CostCentreStack `divMod` wORD_SIZE + +-- --------------------------------------------------------------------------- +-- Registering CCs and CCSs + +-- (cc)->link = CC_LIST; +-- CC_LIST = (cc); +-- (cc)->ccID = CC_ID++; + +emitRegisterCC :: CostCentre -> Code +emitRegisterCC cc = do + { tmp <- newTemp cIntRep + ; stmtsC [ + CmmStore (cmmOffsetB cc_lit oFFSET_CostCentre_link) + (CmmLoad cC_LIST wordRep), + CmmStore cC_LIST cc_lit, + CmmAssign tmp (CmmLoad cC_ID cIntRep), + CmmStore (cmmOffsetB cc_lit oFFSET_CostCentre_ccID) (CmmReg tmp), + CmmStore cC_ID (cmmRegOffB tmp 1) + ] + } + where + cc_lit = CmmLit (CmmLabel (mkCCLabel cc)) + +-- (ccs)->prevStack = CCS_LIST; +-- CCS_LIST = (ccs); +-- (ccs)->ccsID = CCS_ID++; + +emitRegisterCCS :: CostCentreStack -> Code +emitRegisterCCS ccs = do + { tmp <- newTemp cIntRep + ; stmtsC [ + CmmStore (cmmOffsetB ccs_lit oFFSET_CostCentreStack_prevStack) + (CmmLoad cCS_LIST wordRep), + CmmStore cCS_LIST ccs_lit, + CmmAssign tmp (CmmLoad cCS_ID cIntRep), + CmmStore (cmmOffsetB ccs_lit oFFSET_CostCentreStack_ccsID) (CmmReg tmp), + CmmStore cCS_ID (cmmRegOffB tmp 1) + ] + } + where + ccs_lit = CmmLit (CmmLabel (mkCCSLabel ccs)) + + +cC_LIST = CmmLit (CmmLabel (mkRtsDataLabel SLIT("CC_LIST"))) +cC_ID = CmmLit (CmmLabel (mkRtsDataLabel SLIT("CC_ID"))) + +cCS_LIST = CmmLit (CmmLabel (mkRtsDataLabel SLIT("CCS_LIST"))) +cCS_ID = CmmLit (CmmLabel (mkRtsDataLabel SLIT("CCS_ID"))) + +-- --------------------------------------------------------------------------- +-- Set the current cost centre stack + +emitSetCCC :: CostCentre -> Code +emitSetCCC cc + | not opt_SccProfilingOn = nopC + | otherwise = do + tmp <- newTemp wordRep + ASSERT( sccAbleCostCentre cc ) + pushCostCentre tmp curCCS cc + stmtC (CmmStore curCCSAddr (CmmReg tmp)) + when (isSccCountCostCentre cc) $ + stmtC (bumpSccCount curCCS) + +pushCostCentre :: CmmReg -> CmmExpr -> CostCentre -> Code +pushCostCentre result ccs cc + = emitRtsCallWithResult result PtrHint + SLIT("PushCostCentre") [(ccs,PtrHint), + (CmmLit (mkCCostCentre cc), PtrHint)] + +bumpSccCount :: CmmExpr -> CmmStmt +bumpSccCount ccs + = addToMem REP_CostCentreStack_scc_count + (cmmOffsetB ccs oFFSET_CostCentreStack_scc_count) 1 + +----------------------------------------------------------------------------- +-- +-- Lag/drag/void stuff +-- +----------------------------------------------------------------------------- + +-- +-- Initial value for the LDV field in a static closure +-- +staticLdvInit :: CmmLit +staticLdvInit = zeroCLit + +-- +-- Initial value of the LDV field in a dynamic closure +-- +dynLdvInit :: CmmExpr +dynLdvInit = -- (era << LDV_SHIFT) | LDV_STATE_CREATE + CmmMachOp mo_wordOr [ + CmmMachOp mo_wordShl [loadEra, CmmLit (mkIntCLit lDV_SHIFT) ], + CmmLit (mkWordCLit lDV_STATE_CREATE) + ] + +-- +-- Initialise the LDV word of a new closure +-- +ldvRecordCreate :: CmmExpr -> Code +ldvRecordCreate closure = stmtC $ CmmStore (ldvWord closure) dynLdvInit + +-- +-- Called when a closure is entered, marks the closure as having been "used". +-- The closure is not an 'inherently used' one. +-- The closure is not IND or IND_OLDGEN because neither is considered for LDV +-- profiling. 
+-- +ldvEnter :: CmmExpr -> Code +-- Argument is a closure pointer +ldvEnter cl_ptr + = ifProfiling $ + -- if (era > 0) { + -- LDVW((c)) = (LDVW((c)) & LDV_CREATE_MASK) | + -- era | LDV_STATE_USE } + emitIf (CmmMachOp mo_wordUGt [loadEra, CmmLit zeroCLit]) + (stmtC (CmmStore ldv_wd new_ldv_wd)) + where + ldv_wd = ldvWord cl_ptr + new_ldv_wd = cmmOrWord (cmmAndWord (CmmLoad ldv_wd wordRep) + (CmmLit (mkWordCLit lDV_CREATE_MASK))) + (cmmOrWord loadEra (CmmLit (mkWordCLit lDV_STATE_USE))) + +loadEra :: CmmExpr +loadEra = CmmMachOp (MO_U_Conv cIntRep wordRep) + [CmmLoad (mkLblExpr (mkRtsDataLabel SLIT("era"))) cIntRep] + +ldvWord :: CmmExpr -> CmmExpr +-- Takes the address of a closure, and returns +-- the address of the LDV word in the closure +ldvWord closure_ptr = cmmOffsetB closure_ptr oFFSET_StgHeader_ldvw + +-- LDV constants, from ghc/includes/Constants.h +lDV_SHIFT = (LDV_SHIFT :: Int) +--lDV_STATE_MASK = (LDV_STATE_MASK :: StgWord) +lDV_CREATE_MASK = (LDV_CREATE_MASK :: StgWord) +--lDV_LAST_MASK = (LDV_LAST_MASK :: StgWord) +lDV_STATE_CREATE = (LDV_STATE_CREATE :: StgWord) +lDV_STATE_USE = (LDV_STATE_USE :: StgWord) + diff --git a/compiler/codeGen/CgStackery.lhs b/compiler/codeGen/CgStackery.lhs new file mode 100644 index 0000000000..7cb310d521 --- /dev/null +++ b/compiler/codeGen/CgStackery.lhs @@ -0,0 +1,339 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +% $Id: CgStackery.lhs,v 1.27 2004/09/30 10:35:49 simonpj Exp $ +% +\section[CgStackery]{Stack management functions} + +Stack-twiddling operations, which are pretty low-down and grimy. +(This is the module that knows all about stack layouts, etc.) + +\begin{code} +module CgStackery ( + spRel, getVirtSp, getRealSp, setRealSp, + setRealAndVirtualSp, getSpRelOffset, + + allocPrimStack, allocStackTop, deAllocStackTop, + adjustStackHW, getFinalStackHW, + setStackFrame, getStackFrame, + mkVirtStkOffsets, mkStkAmodes, + freeStackSlots, + pushUpdateFrame, emitPushUpdateFrame, + ) where + +#include "HsVersions.h" + +import CgMonad +import CgUtils ( cmmOffsetB, cmmRegOffW ) +import CgProf ( initUpdFrameProf ) +import SMRep +import Cmm +import CmmUtils ( CmmStmts, mkLblExpr ) +import CLabel ( mkUpdInfoLabel ) +import Constants +import Util ( sortLe ) +import FastString ( LitString ) +import OrdList ( toOL ) +import Outputable +\end{code} + +%************************************************************************ +%* * +\subsection[CgUsages-stackery]{Monad things for fiddling with stack usage} +%* * +%************************************************************************ + +spRel is a little function that abstracts the stack direction. Note that most +of the code generator is dependent on the stack direction anyway, so +changing this on its own spells certain doom. ToDo: remove? + + THIS IS DIRECTION SENSITIVE! + +Stack grows down, positive virtual offsets correspond to negative +additions to the stack pointer. + +\begin{code} +spRel :: VirtualSpOffset -- virtual offset of Sp + -> VirtualSpOffset -- virtual offset of The Thing + -> WordOff -- integer offset +spRel sp off = sp - off +\end{code} + +@setRealAndVirtualSp@ sets into the environment the offsets of the +current position of the real and virtual stack pointers in the current +stack frame. The high-water mark is set too. It generates no code. +It is used to initialise things at the beginning of a closure body. 
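+
+As a small, made-up illustration of the offset convention: with the real
+Sp sitting at virtual offset 10, a slot that was allocated at virtual
+offset 4 is addressed @spRel 10 4 = 6@ words away from Sp (towards higher
+addresses, since the stack grows downwards).  The throwaway definitions
+below use plain @Int@s rather than the synonyms and monad used elsewhere
+in this module, and exist purely so the arithmetic can be tried out.
+
+\begin{code}
+exampleSpRel :: Int     -- assumed virtual offset of Sp
+             -> Int     -- assumed virtual offset of the slot
+             -> Int     -- word offset from Sp
+exampleSpRel sp off = sp - off
+
+exampleSpRelOK :: Bool
+exampleSpRelOK = exampleSpRel 10 4 == 6
+\end{code}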
+ +\begin{code} +setRealAndVirtualSp :: VirtualSpOffset -- New real Sp + -> Code + +setRealAndVirtualSp new_sp + = do { stk_usg <- getStkUsage + ; setStkUsage (stk_usg {virtSp = new_sp, + realSp = new_sp, + hwSp = new_sp}) } + +getVirtSp :: FCode VirtualSpOffset +getVirtSp + = do { stk_usg <- getStkUsage + ; return (virtSp stk_usg) } + +getRealSp :: FCode VirtualSpOffset +getRealSp + = do { stk_usg <- getStkUsage + ; return (realSp stk_usg) } + +setRealSp :: VirtualSpOffset -> Code +setRealSp new_real_sp + = do { stk_usg <- getStkUsage + ; setStkUsage (stk_usg {realSp = new_real_sp}) } + +getSpRelOffset :: VirtualSpOffset -> FCode CmmExpr +getSpRelOffset virtual_offset + = do { real_sp <- getRealSp + ; return (cmmRegOffW spReg (spRel real_sp virtual_offset)) } +\end{code} + + +%************************************************************************ +%* * +\subsection[CgStackery-layout]{Laying out a stack frame} +%* * +%************************************************************************ + +'mkVirtStkOffsets' is given a list of arguments. The first argument +gets the /largest/ virtual stack offset (remember, virtual offsets +increase towards the top of stack). + +\begin{code} +mkVirtStkOffsets + :: VirtualSpOffset -- Offset of the last allocated thing + -> [(CgRep,a)] -- things to make offsets for + -> (VirtualSpOffset, -- OUTPUTS: Topmost allocated word + [(a, VirtualSpOffset)]) -- things with offsets (voids filtered out) + +mkVirtStkOffsets init_Sp_offset things + = loop init_Sp_offset [] (reverse things) + where + loop offset offs [] = (offset,offs) + loop offset offs ((VoidArg,t):things) = loop offset offs things + -- ignore Void arguments + loop offset offs ((rep,t):things) + = loop thing_slot ((t,thing_slot):offs) things + where + thing_slot = offset + cgRepSizeW rep + -- offset of thing is offset+size, because we're + -- growing the stack *downwards* as the offsets increase. + +-- | 'mkStkAmodes' is a higher-level version of +-- 'mkVirtStkOffsets'. It starts from the tail-call locations. +-- It returns a single list of addressing modes for the stack +-- locations, and therefore is in the monad. It /doesn't/ adjust the +-- high water mark. + +mkStkAmodes + :: VirtualSpOffset -- Tail call positions + -> [(CgRep,CmmExpr)] -- things to make offsets for + -> FCode (VirtualSpOffset, -- OUTPUTS: Topmost allocated word + CmmStmts) -- Assignments to appropriate stk slots + +mkStkAmodes tail_Sp things + = do { rSp <- getRealSp + ; let (last_Sp_offset, offsets) = mkVirtStkOffsets tail_Sp things + abs_cs = [ CmmStore (cmmRegOffW spReg (spRel rSp offset)) amode + | (amode, offset) <- offsets + ] + ; returnFC (last_Sp_offset, toOL abs_cs) } +\end{code} + +%************************************************************************ +%* * +\subsection[CgStackery-monadery]{Inside-monad functions for stack manipulation} +%* * +%************************************************************************ + +Allocate a virtual offset for something. 
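+
+The interesting part is the search for a contiguous run of free slots,
+done by the @find_block@ helper inside @allocPrimStack@ below.  The
+following throwaway sketch mirrors that search over a plain list of free
+offsets; it is illustrative only and not used by the code generator.
+
+\begin{code}
+exampleFindBlock :: Int          -- number of words wanted
+                 -> [Int]        -- free virtual offsets, in increasing order
+                 -> Maybe Int    -- topmost word of a suitable block, if any
+exampleFindBlock size = go
+  where
+    go [] = Nothing
+    go (slot:slots)
+      | take size (slot:slots) == [slot .. slot+size-1] = Just (slot+size-1)
+      | otherwise                                       = go slots
+
+-- For example, with free slots [3,4,7,9,10,11] a two-word request fits at
+-- offsets 3 and 4 (topmost word 4), while a three-word request has to use
+-- 9, 10 and 11 (topmost word 11).
+exampleFindBlockOK :: Bool
+exampleFindBlockOK = exampleFindBlock 2 [3,4,7,9,10,11] == Just 4
+                  && exampleFindBlock 3 [3,4,7,9,10,11] == Just 11
+\end{code}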
+ +\begin{code} +allocPrimStack :: CgRep -> FCode VirtualSpOffset +allocPrimStack rep + = do { stk_usg <- getStkUsage + ; let free_stk = freeStk stk_usg + ; case find_block free_stk of + Nothing -> do + { let push_virt_sp = virtSp stk_usg + size + ; setStkUsage (stk_usg { virtSp = push_virt_sp, + hwSp = hwSp stk_usg `max` push_virt_sp }) + -- Adjust high water mark + ; return push_virt_sp } + Just slot -> do + { setStkUsage (stk_usg { freeStk = delete_block free_stk slot }) + ; return slot } + } + where + size :: WordOff + size = cgRepSizeW rep + + -- Find_block looks for a contiguous chunk of free slots + -- returning the offset of its topmost word + find_block :: [VirtualSpOffset] -> Maybe VirtualSpOffset + find_block [] = Nothing + find_block (slot:slots) + | take size (slot:slots) == [slot..top_slot] + = Just top_slot + | otherwise + = find_block slots + where -- The stack grows downwards, with increasing virtual offsets. + -- Therefore, the address of a multi-word object is the *highest* + -- virtual offset it occupies (top_slot below). + top_slot = slot+size-1 + + delete_block free_stk slot = [ s | s <- free_stk, + (s<=slot-size) || (s>slot) ] + -- Retain slots which are not in the range + -- slot-size+1..slot +\end{code} + +Allocate a chunk ON TOP OF the stack. + +\begin{code} +allocStackTop :: WordOff -> FCode VirtualSpOffset +allocStackTop size + = do { stk_usg <- getStkUsage + ; let push_virt_sp = virtSp stk_usg + size + ; setStkUsage (stk_usg { virtSp = push_virt_sp, + hwSp = hwSp stk_usg `max` push_virt_sp }) + ; return push_virt_sp } +\end{code} + +Pop some words from the current top of stack. This is used for +de-allocating the return address in a case alternative. + +\begin{code} +deAllocStackTop :: WordOff -> FCode VirtualSpOffset +deAllocStackTop size + = do { stk_usg <- getStkUsage + ; let pop_virt_sp = virtSp stk_usg - size + ; setStkUsage (stk_usg { virtSp = pop_virt_sp }) + ; return pop_virt_sp } +\end{code} + +\begin{code} +adjustStackHW :: VirtualSpOffset -> Code +adjustStackHW offset + = do { stk_usg <- getStkUsage + ; setStkUsage (stk_usg { hwSp = hwSp stk_usg `max` offset }) } +\end{code} + +A knot-tying beast. + +\begin{code} +getFinalStackHW :: (VirtualSpOffset -> Code) -> Code +getFinalStackHW fcode + = do { fixC (\hw_sp -> do + { fcode hw_sp + ; stk_usg <- getStkUsage + ; return (hwSp stk_usg) }) + ; return () } +\end{code} + +\begin{code} +setStackFrame :: VirtualSpOffset -> Code +setStackFrame offset + = do { stk_usg <- getStkUsage + ; setStkUsage (stk_usg { frameSp = offset }) } + +getStackFrame :: FCode VirtualSpOffset +getStackFrame + = do { stk_usg <- getStkUsage + ; return (frameSp stk_usg) } +\end{code} + + +%******************************************************** +%* * +%* Setting up update frames * +%* * +%******************************************************** + +@pushUpdateFrame@ $updatee$ pushes a general update frame which +points to $updatee$ as the thing to be updated. It is only used +when a thunk has just been entered, so the (real) stack pointers +are guaranteed to be nicely aligned with the top of stack. +@pushUpdateFrame@ adjusts the virtual and tail stack pointers +to reflect the frame pushed. 
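+
+As a rough illustration of that bookkeeping (made-up numbers, and plain
+@Int@s instead of the stack-usage record carried by the monad): if the
+virtual Sp is at offset 6 and the frame needs, say, 2 words, then after
+the push both the virtual Sp and the frame pointer sit at offset 8, and
+the high-water mark is at least 8.  The sketch below mirrors the
+@allocStackTop@-then-@setStackFrame@ sequence used by @pushUpdateFrame@:
+
+\begin{code}
+data ExampleStkUsage = ExampleStkUsage
+        { exVirtSp, exHwSp, exFrameSp :: Int }
+        deriving Show
+
+examplePushFrame :: Int -> ExampleStkUsage -> ExampleStkUsage
+examplePushFrame frame_words usg
+  = usg { exVirtSp  = new_sp
+        , exHwSp    = exHwSp usg `max` new_sp
+        , exFrameSp = new_sp }
+  where
+    new_sp = exVirtSp usg + frame_words
+\end{code}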
+ +\begin{code} +pushUpdateFrame :: CmmExpr -> Code -> Code + +pushUpdateFrame updatee code + = do { +#ifdef DEBUG + EndOfBlockInfo _ sequel <- getEndOfBlockInfo ; + ASSERT(case sequel of { OnStack -> True; _ -> False}) +#endif + + allocStackTop (fixedHdrSize + + sIZEOF_StgUpdateFrame_NoHdr `quot` wORD_SIZE) + ; vsp <- getVirtSp + ; setStackFrame vsp + ; frame_addr <- getSpRelOffset vsp + -- The location of the lowest-address + -- word of the update frame itself + + ; setEndOfBlockInfo (EndOfBlockInfo vsp UpdateCode) $ + do { emitPushUpdateFrame frame_addr updatee + ; code } + } + +emitPushUpdateFrame :: CmmExpr -> CmmExpr -> Code +emitPushUpdateFrame frame_addr updatee = do + stmtsC [ -- Set the info word + CmmStore frame_addr (mkLblExpr mkUpdInfoLabel) + , -- And the updatee + CmmStore (cmmOffsetB frame_addr off_updatee) updatee ] + initUpdFrameProf frame_addr + +off_updatee :: ByteOff +off_updatee = fixedHdrSize*wORD_SIZE + oFFSET_StgUpdateFrame_updatee +\end{code} + + +%************************************************************************ +%* * +\subsection[CgStackery-free]{Free stack slots} +%* * +%************************************************************************ + +Explicitly free some stack space. + +\begin{code} +freeStackSlots :: [VirtualSpOffset] -> Code +freeStackSlots extra_free + = do { stk_usg <- getStkUsage + ; let all_free = addFreeSlots (freeStk stk_usg) (sortLe (<=) extra_free) + ; let (new_vsp, new_free) = trim (virtSp stk_usg) all_free + ; setStkUsage (stk_usg { virtSp = new_vsp, freeStk = new_free }) } + +addFreeSlots :: [VirtualSpOffset] -> [VirtualSpOffset] -> [VirtualSpOffset] +-- Merge the two, assuming both are in increasing order +addFreeSlots cs [] = cs +addFreeSlots [] ns = ns +addFreeSlots (c:cs) (n:ns) + | c < n = c : addFreeSlots cs (n:ns) + | otherwise = n : addFreeSlots (c:cs) ns + +trim :: VirtualSpOffset -> [VirtualSpOffset] -> (VirtualSpOffset, [VirtualSpOffset]) +-- Try to trim back the virtual stack pointer, where there is a +-- continuous bunch of free slots at the end of the free list +trim vsp [] = (vsp, []) +trim vsp (slot:slots) + = case trim vsp slots of + (vsp', []) + | vsp' < slot -> pprTrace "trim: strange" (ppr vsp <+> ppr (slot:slots)) + (vsp', []) + | vsp' == slot -> (vsp'-1, []) + | otherwise -> (vsp', [slot]) + (vsp', slots') -> (vsp', slot:slots') +\end{code} diff --git a/compiler/codeGen/CgTailCall.lhs b/compiler/codeGen/CgTailCall.lhs new file mode 100644 index 0000000000..dd7327b745 --- /dev/null +++ b/compiler/codeGen/CgTailCall.lhs @@ -0,0 +1,455 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +% $Id: CgTailCall.lhs,v 1.43 2005/06/21 10:44:41 simonmar Exp $ +% +%******************************************************** +%* * +\section[CgTailCall]{Tail calls: converting @StgApps@} +%* * +%******************************************************** + +\begin{code} +module CgTailCall ( + cgTailCall, performTailCall, + performReturn, performPrimReturn, + emitKnownConReturnCode, emitAlgReturnCode, + returnUnboxedTuple, ccallReturnUnboxedTuple, + pushUnboxedTuple, + tailCallPrimOp, + + pushReturnAddress + ) where + +#include "HsVersions.h" + +import CgMonad +import CgBindery ( getArgAmodes, getCgIdInfo, CgIdInfo, maybeLetNoEscape, + idInfoToAmode, cgIdInfoId, cgIdInfoLF, + cgIdInfoArgRep ) +import CgInfoTbls ( entryCode, emitDirectReturnInstr, dataConTagZ, + emitVectoredReturnInstr, closureInfoPtr ) +import CgCallConv +import CgStackery ( setRealSp, mkStkAmodes, adjustStackHW, + getSpRelOffset ) +import 
CgHeapery ( setRealHp, getHpRelOffset ) +import CgUtils ( emitSimultaneously ) +import CgTicky +import ClosureInfo +import SMRep ( CgRep, isVoidArg, separateByPtrFollowness ) +import Cmm +import CmmUtils +import CLabel ( CLabel, mkRtsPrimOpLabel, mkSeqInfoLabel ) +import Type ( isUnLiftedType ) +import Id ( Id, idName, idUnique, idType ) +import DataCon ( DataCon, dataConTyCon ) +import StgSyn ( StgArg ) +import TyCon ( TyCon ) +import PrimOp ( PrimOp ) +import Outputable + +import Monad ( when ) + +----------------------------------------------------------------------------- +-- Tail Calls + +cgTailCall :: Id -> [StgArg] -> Code + +-- Here's the code we generate for a tail call. (NB there may be no +-- arguments, in which case this boils down to just entering a variable.) +-- +-- * Put args in the top locations of the stack. +-- * Adjust the stack ptr +-- * Make R1 point to the function closure if necessary. +-- * Perform the call. +-- +-- Things to be careful about: +-- +-- * Don't overwrite stack locations before you have finished with +-- them (remember you need the function and the as-yet-unmoved +-- arguments). +-- * Preferably, generate no code to replace x by x on the stack (a +-- common situation in tail-recursion). +-- * Adjust the stack high water mark appropriately. +-- +-- Treat unboxed locals exactly like literals (above) except use the addr +-- mode for the local instead of (CLit lit) in the assignment. + +cgTailCall fun args + = do { fun_info <- getCgIdInfo fun + + ; if isUnLiftedType (idType fun) + then -- Primitive return + ASSERT( null args ) + do { fun_amode <- idInfoToAmode fun_info + ; performPrimReturn (cgIdInfoArgRep fun_info) fun_amode } + + else -- Normal case, fun is boxed + do { arg_amodes <- getArgAmodes args + ; performTailCall fun_info arg_amodes noStmts } + } + + +-- ----------------------------------------------------------------------------- +-- The guts of a tail-call + +performTailCall + :: CgIdInfo -- The function + -> [(CgRep,CmmExpr)] -- Args + -> CmmStmts -- Pending simultaneous assignments + -- *** GUARANTEED to contain only stack assignments. + -> Code + +performTailCall fun_info arg_amodes pending_assts + | Just join_sp <- maybeLetNoEscape fun_info + = -- A let-no-escape is slightly different, because we + -- arrange the stack arguments into pointers and non-pointers + -- to make the heap check easier. The tail-call sequence + -- is very similar to returning an unboxed tuple, so we + -- share some code. + do { (final_sp, arg_assts) <- pushUnboxedTuple join_sp arg_amodes + ; emitSimultaneously (pending_assts `plusStmts` arg_assts) + ; let lbl = enterReturnPtLabel (idUnique (cgIdInfoId fun_info)) + ; doFinalJump final_sp True {- Is LNE -} (jumpToLbl lbl) } + + | otherwise + = do { fun_amode <- idInfoToAmode fun_info + ; let node_asst = oneStmt (CmmAssign nodeReg fun_amode) + opt_node_asst | nodeMustPointToIt lf_info = node_asst + | otherwise = noStmts + ; EndOfBlockInfo sp _ <- getEndOfBlockInfo + ; hmods <- getHomeModules + + ; case (getCallMethod hmods fun_name lf_info (length arg_amodes)) of + + -- Node must always point to things we enter + EnterIt -> do + { emitSimultaneously (node_asst `plusStmts` pending_assts) + ; let target = entryCode (closureInfoPtr (CmmReg nodeReg)) + ; doFinalJump sp False (stmtC (CmmJump target [])) } + + -- A function, but we have zero arguments. It is already in WHNF, + -- so we can just return it. + -- As with any return, Node must point to it. 
+ ReturnIt -> do + { emitSimultaneously (node_asst `plusStmts` pending_assts) + ; doFinalJump sp False emitDirectReturnInstr } + + -- A real constructor. Don't bother entering it, + -- just do the right sort of return instead. + -- As with any return, Node must point to it. + ReturnCon con -> do + { emitSimultaneously (node_asst `plusStmts` pending_assts) + ; doFinalJump sp False (emitKnownConReturnCode con) } + + JumpToIt lbl -> do + { emitSimultaneously (opt_node_asst `plusStmts` pending_assts) + ; doFinalJump sp False (jumpToLbl lbl) } + + -- A slow function call via the RTS apply routines + -- Node must definitely point to the thing + SlowCall -> do + { when (not (null arg_amodes)) $ do + { if (isKnownFun lf_info) + then tickyKnownCallTooFewArgs + else tickyUnknownCall + ; tickySlowCallPat (map fst arg_amodes) + } + + ; let (apply_lbl, args, extra_args) + = constructSlowCall arg_amodes + + ; directCall sp apply_lbl args extra_args + (node_asst `plusStmts` pending_assts) + } + + -- A direct function call (possibly with some left-over arguments) + DirectEntry lbl arity -> do + { if arity == length arg_amodes + then tickyKnownCallExact + else do tickyKnownCallExtraArgs + tickySlowCallPat (map fst (drop arity arg_amodes)) + + ; let + -- The args beyond the arity go straight on the stack + (arity_args, extra_args) = splitAt arity arg_amodes + + ; directCall sp lbl arity_args extra_args + (opt_node_asst `plusStmts` pending_assts) + } + } + where + fun_name = idName (cgIdInfoId fun_info) + lf_info = cgIdInfoLF fun_info + + + +directCall sp lbl args extra_args assts = do + let + -- First chunk of args go in registers + (reg_arg_amodes, stk_args) = assignCallRegs args + + -- Any "extra" arguments are placed in frames on the + -- stack after the other arguments. + slow_stk_args = slowArgs extra_args + + reg_assts = assignToRegs reg_arg_amodes + -- + (final_sp, stk_assts) <- mkStkAmodes sp (stk_args ++ slow_stk_args) + + emitSimultaneously (reg_assts `plusStmts` + stk_assts `plusStmts` + assts) + + doFinalJump final_sp False (jumpToLbl lbl) + +-- ----------------------------------------------------------------------------- +-- The final clean-up before we do a jump at the end of a basic block. +-- This code is shared by tail-calls and returns. + +doFinalJump :: VirtualSpOffset -> Bool -> Code -> Code +doFinalJump final_sp is_let_no_escape jump_code + = do { -- Adjust the high-water mark if necessary + adjustStackHW final_sp + + -- Push a return address if necessary (after the assignments + -- above, in case we clobber a live stack location) + -- + -- DONT push the return address when we're about to jump to a + -- let-no-escape: the final tail call in the let-no-escape + -- will do this. + ; eob <- getEndOfBlockInfo + ; whenC (not is_let_no_escape) (pushReturnAddress eob) + + -- Final adjustment of Sp/Hp + ; adjustSpAndHp final_sp + + -- and do the jump + ; jump_code } + +-- ----------------------------------------------------------------------------- +-- A general return (just a special case of doFinalJump, above) + +performReturn :: Code -- The code to execute to actually do the return + -> Code + +performReturn finish_code + = do { EndOfBlockInfo args_sp sequel <- getEndOfBlockInfo + ; doFinalJump args_sp False{-not a LNE-} finish_code } + +-- ----------------------------------------------------------------------------- +-- Primitive Returns +-- Just load the return value into the right register, and return. 
+ +performPrimReturn :: CgRep -> CmmExpr -- The thing to return + -> Code +performPrimReturn rep amode + = do { whenC (not (isVoidArg rep)) + (stmtC (CmmAssign ret_reg amode)) + ; performReturn emitDirectReturnInstr } + where + ret_reg = dataReturnConvPrim rep + +-- ----------------------------------------------------------------------------- +-- Algebraic constructor returns + +-- Constructor is built on the heap; Node is set. +-- All that remains is to do the right sort of jump. + +emitKnownConReturnCode :: DataCon -> Code +emitKnownConReturnCode con + = emitAlgReturnCode (dataConTyCon con) + (CmmLit (mkIntCLit (dataConTagZ con))) + -- emitAlgReturnCode requires zero-indexed tag + +emitAlgReturnCode :: TyCon -> CmmExpr -> Code +-- emitAlgReturnCode is used both by emitKnownConReturnCode, +-- and by by PrimOps that return enumerated types (i.e. +-- all the comparison operators). +emitAlgReturnCode tycon tag + = do { case ctrlReturnConvAlg tycon of + VectoredReturn fam_sz -> do { tickyVectoredReturn fam_sz + ; emitVectoredReturnInstr tag } + UnvectoredReturn _ -> emitDirectReturnInstr + } + + +-- --------------------------------------------------------------------------- +-- Unboxed tuple returns + +-- These are a bit like a normal tail call, except that: +-- +-- - The tail-call target is an info table on the stack +-- +-- - We separate stack arguments into pointers and non-pointers, +-- to make it easier to leave things in a sane state for a heap check. +-- This is OK because we can never partially-apply an unboxed tuple, +-- unlike a function. The same technique is used when calling +-- let-no-escape functions, because they also can't be partially +-- applied. + +returnUnboxedTuple :: [(CgRep, CmmExpr)] -> Code +returnUnboxedTuple amodes + = do { eob@(EndOfBlockInfo args_sp sequel) <- getEndOfBlockInfo + ; tickyUnboxedTupleReturn (length amodes) + ; (final_sp, assts) <- pushUnboxedTuple args_sp amodes + ; emitSimultaneously assts + ; doFinalJump final_sp False{-not a LNE-} emitDirectReturnInstr } + +pushUnboxedTuple :: VirtualSpOffset -- Sp at which to start pushing + -> [(CgRep, CmmExpr)] -- amodes of the components + -> FCode (VirtualSpOffset, -- final Sp + CmmStmts) -- assignments (regs+stack) + +pushUnboxedTuple sp [] + = return (sp, noStmts) +pushUnboxedTuple sp amodes + = do { let (reg_arg_amodes, stk_arg_amodes) = assignReturnRegs amodes + + -- separate the rest of the args into pointers and non-pointers + (ptr_args, nptr_args) = separateByPtrFollowness stk_arg_amodes + reg_arg_assts = assignToRegs reg_arg_amodes + + -- push ptrs, then nonptrs, on the stack + ; (ptr_sp, ptr_assts) <- mkStkAmodes sp ptr_args + ; (final_sp, nptr_assts) <- mkStkAmodes ptr_sp nptr_args + + ; returnFC (final_sp, + reg_arg_assts `plusStmts` + ptr_assts `plusStmts` nptr_assts) } + + +-- ----------------------------------------------------------------------------- +-- Returning unboxed tuples. This is mainly to support _ccall_GC_, where +-- we want to do things in a slightly different order to normal: +-- +-- - push return address +-- - adjust stack pointer +-- - r = call(args...) +-- - assign regs for unboxed tuple (usually just R1 = r) +-- - return to continuation +-- +-- The return address (i.e. stack frame) must be on the stack before +-- doing the call in case the call ends up in the garbage collector. +-- +-- Sadly, the information about the continuation is lost after we push it +-- (in order to avoid pushing it again), so we end up doing a needless +-- indirect jump (ToDo). 
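+
+-- An aside: a toy illustration (plain Haskell, made-up representation) of
+-- the pointers-first layout performed by pushUnboxedTuple above.  Stack
+-- arguments are split into pointers and non-pointers, and the pointers are
+-- pushed first, so that things are easy to leave in a sane state if a heap
+-- check is needed.
+
+data ToyStkArg = ToyPtr String | ToyNonPtr Int
+  deriving Show
+
+toySeparateByPtrFollowness :: [ToyStkArg] -> ([ToyStkArg], [ToyStkArg])
+toySeparateByPtrFollowness = foldr place ([], [])
+  where
+    place a@(ToyPtr _)    (ps, ns) = (a:ps, ns)
+    place a@(ToyNonPtr _) (ps, ns) = (ps, a:ns)
+
+-- pointers first, then non-pointers, mirroring the two mkStkAmodes calls
+toyUnboxedTupleLayout :: [ToyStkArg] -> [ToyStkArg]
+toyUnboxedTupleLayout args = ps ++ ns
+  where (ps, ns) = toySeparateByPtrFollowness args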
+ +ccallReturnUnboxedTuple :: [(CgRep, CmmExpr)] -> Code -> Code +ccallReturnUnboxedTuple amodes before_jump + = do { eob@(EndOfBlockInfo args_sp _) <- getEndOfBlockInfo + + -- Push a return address if necessary + ; pushReturnAddress eob + ; setEndOfBlockInfo (EndOfBlockInfo args_sp OnStack) + (do { adjustSpAndHp args_sp + ; before_jump + ; returnUnboxedTuple amodes }) + } + +-- ----------------------------------------------------------------------------- +-- Calling an out-of-line primop + +tailCallPrimOp :: PrimOp -> [StgArg] -> Code +tailCallPrimOp op args + = do { -- We're going to perform a normal-looking tail call, + -- except that *all* the arguments will be in registers. + -- Hence the ASSERT( null leftovers ) + arg_amodes <- getArgAmodes args + ; let (arg_regs, leftovers) = assignPrimOpCallRegs arg_amodes + jump_to_primop = jumpToLbl (mkRtsPrimOpLabel op) + + ; ASSERT(null leftovers) -- no stack-resident args + emitSimultaneously (assignToRegs arg_regs) + + ; EndOfBlockInfo args_sp _ <- getEndOfBlockInfo + ; doFinalJump args_sp False{-not a LNE-} jump_to_primop } + +-- ----------------------------------------------------------------------------- +-- Return Addresses + +-- We always push the return address just before performing a tail call +-- or return. The reason we leave it until then is because the stack +-- slot that the return address is to go into might contain something +-- useful. +-- +-- If the end of block info is 'CaseAlts', then we're in the scrutinee of a +-- case expression and the return address is still to be pushed. +-- +-- There are cases where it doesn't look necessary to push the return +-- address: for example, just before doing a return to a known +-- continuation. However, the continuation will expect to find the +-- return address on the stack in case it needs to do a heap check. + +pushReturnAddress :: EndOfBlockInfo -> Code + +pushReturnAddress (EndOfBlockInfo args_sp sequel@(CaseAlts lbl _ _ False)) + = do { sp_rel <- getSpRelOffset args_sp + ; stmtC (CmmStore sp_rel (mkLblExpr lbl)) } + +-- For a polymorphic case, we have two return addresses to push: the case +-- return, and stg_seq_frame_info which turns a possible vectored return +-- into a direct one. +pushReturnAddress (EndOfBlockInfo args_sp sequel@(CaseAlts lbl _ _ True)) + = do { sp_rel <- getSpRelOffset (args_sp-1) + ; stmtC (CmmStore sp_rel (mkLblExpr lbl)) + ; sp_rel <- getSpRelOffset args_sp + ; stmtC (CmmStore sp_rel (CmmLit (CmmLabel mkSeqInfoLabel))) } + +pushReturnAddress _ = nopC + +-- ----------------------------------------------------------------------------- +-- Misc. + +jumpToLbl :: CLabel -> Code +-- Passes no argument to the destination procedure +jumpToLbl lbl = stmtC (CmmJump (CmmLit (CmmLabel lbl)) [{- No args -}]) + +assignToRegs :: [(CmmExpr, GlobalReg)] -> CmmStmts +assignToRegs reg_args + = mkStmts [ CmmAssign (CmmGlobal reg_id) expr + | (expr, reg_id) <- reg_args ] +\end{code} + + +%************************************************************************ +%* * +\subsection[CgStackery-adjust]{Adjusting the stack pointers} +%* * +%************************************************************************ + +This function adjusts the stack and heap pointers just before a tail +call or return. The stack pointer is adjusted to its final position +(i.e. to point to the last argument for a tail call, or the activation +record for a return). 
The heap pointer may be moved backwards, in +cases where we overallocated at the beginning of the basic block (see +CgCase.lhs for discussion). + +These functions {\em do not} deal with high-water-mark adjustment. +That's done by functions which allocate stack space. + +\begin{code} +adjustSpAndHp :: VirtualSpOffset -- New offset for Arg stack ptr + -> Code +adjustSpAndHp newRealSp + = do { -- Adjust stack, if necessary. + -- NB: the conditional on the monad-carried realSp + -- is out of line (via codeOnly), to avoid a black hole + ; new_sp <- getSpRelOffset newRealSp + ; checkedAbsC (CmmAssign spReg new_sp) -- Will generate no code in the case + ; setRealSp newRealSp -- where realSp==newRealSp + + -- Adjust heap. The virtual heap pointer may be less than the real Hp + -- because the latter was advanced to deal with the worst-case branch + -- of the code, and we may be in a better-case branch. In that case, + -- move the real Hp *back* and retract some ticky allocation count. + ; hp_usg <- getHpUsage + ; let rHp = realHp hp_usg + vHp = virtHp hp_usg + ; new_hp <- getHpRelOffset vHp + ; checkedAbsC (CmmAssign hpReg new_hp) -- Generates nothing when vHp==rHp + ; tickyAllocHeap (vHp - rHp) -- ...ditto + ; setRealHp vHp + } +\end{code} diff --git a/compiler/codeGen/CgTicky.hs b/compiler/codeGen/CgTicky.hs new file mode 100644 index 0000000000..3e72981c50 --- /dev/null +++ b/compiler/codeGen/CgTicky.hs @@ -0,0 +1,370 @@ +----------------------------------------------------------------------------- +-- +-- Code generation for ticky-ticky profiling +-- +-- (c) The University of Glasgow 2004 +-- +----------------------------------------------------------------------------- + +module CgTicky ( + emitTickyCounter, + + tickyDynAlloc, + tickyAllocHeap, + tickyAllocPrim, + tickyAllocThunk, + tickyAllocPAP, + + tickyPushUpdateFrame, + tickyUpdateFrameOmitted, + + tickyEnterDynCon, + tickyEnterStaticCon, + tickyEnterViaNode, + + tickyEnterFun, + tickyEnterThunk, + + tickyUpdateBhCaf, + tickyBlackHole, + tickyUnboxedTupleReturn, tickyVectoredReturn, + tickyReturnOldCon, tickyReturnNewCon, + + tickyKnownCallTooFewArgs, tickyKnownCallExact, tickyKnownCallExtraArgs, + tickyUnknownCall, tickySlowCallPat, + + staticTickyHdr, + ) where + +#include "HsVersions.h" +#include "../includes/DerivedConstants.h" + -- For REP_xxx constants, which are MachReps + +import ClosureInfo ( ClosureInfo, closureSize, slopSize, closureSMRep, + closureUpdReqd, closureName, isStaticClosure ) +import CgUtils +import CgMonad +import SMRep ( ClosureType(..), smRepClosureType, CgRep ) + +import Cmm +import MachOp +import CmmUtils ( zeroCLit, mkIntCLit, mkLblExpr, cmmIndexExpr ) +import CLabel ( CLabel, mkRtsDataLabel, mkRednCountsLabel ) + +import Name ( isInternalName ) +import Id ( Id, idType ) +import StaticFlags ( opt_DoTickyProfiling ) +import BasicTypes ( Arity ) +import FastString ( FastString, mkFastString, LitString ) +import Constants -- Lots of field offsets +import Outputable + +-- Turgid imports for showTypeCategory +import PrelNames +import TcType ( Type, isDictTy, tcSplitTyConApp_maybe, + tcSplitFunTy_maybe ) +import TyCon ( isPrimTyCon, isTupleTyCon, isEnumerationTyCon, + maybeTyConSingleCon ) +import Maybe + +----------------------------------------------------------------------------- +-- +-- Ticky-ticky profiling +-- +----------------------------------------------------------------------------- + +staticTickyHdr :: [CmmLit] +-- The ticky header words in a static closure +-- Was SET_STATIC_TICKY_HDR +staticTickyHdr + 
| not opt_DoTickyProfiling = [] + | otherwise = [zeroCLit] + +emitTickyCounter :: ClosureInfo -> [Id] -> Int -> Code +emitTickyCounter cl_info args on_stk + = ifTicky $ + do { mod_name <- moduleName + ; fun_descr_lit <- mkStringCLit (fun_descr mod_name) + ; arg_descr_lit <- mkStringCLit arg_descr + ; emitDataLits ticky_ctr_label -- Must match layout of StgEntCounter + [ CmmInt 0 I16, + CmmInt (fromIntegral (length args)) I16, -- Arity + CmmInt (fromIntegral on_stk) I16, -- Words passed on stack + CmmInt 0 I16, -- 2-byte gap + fun_descr_lit, + arg_descr_lit, + zeroCLit, -- Entry count + zeroCLit, -- Allocs + zeroCLit -- Link + ] } + where + name = closureName cl_info + ticky_ctr_label = mkRednCountsLabel name + arg_descr = map (showTypeCategory . idType) args + fun_descr mod_name = ppr_for_ticky_name mod_name name + +-- When printing the name of a thing in a ticky file, we want to +-- give the module name even for *local* things. We print +-- just "x (M)" rather that "M.x" to distinguish them from the global kind. +ppr_for_ticky_name mod_name name + | isInternalName name = showSDocDebug (ppr name <+> (parens (ppr mod_name))) + | otherwise = showSDocDebug (ppr name) + +-- ----------------------------------------------------------------------------- +-- Ticky stack frames + +tickyPushUpdateFrame = ifTicky $ bumpTickyCounter SLIT("UPDF_PUSHED_ctr") +tickyUpdateFrameOmitted = ifTicky $ bumpTickyCounter SLIT("UPDF_OMITTED_ctr") + +-- ----------------------------------------------------------------------------- +-- Ticky entries + +tickyEnterDynCon = ifTicky $ bumpTickyCounter SLIT("ENT_DYN_CON_ctr") +tickyEnterDynThunk = ifTicky $ bumpTickyCounter SLIT("ENT_DYN_THK_ctr") +tickyEnterStaticCon = ifTicky $ bumpTickyCounter SLIT("ENT_STATIC_CON_ctr") +tickyEnterStaticThunk = ifTicky $ bumpTickyCounter SLIT("ENT_STATIC_THK_ctr") +tickyEnterViaNode = ifTicky $ bumpTickyCounter SLIT("ENT_VIA_NODE_ctr") + +tickyEnterThunk :: ClosureInfo -> Code +tickyEnterThunk cl_info + | isStaticClosure cl_info = tickyEnterStaticThunk + | otherwise = tickyEnterDynThunk + +tickyBlackHole :: Bool{-updatable-} -> Code +tickyBlackHole updatable + = ifTicky (bumpTickyCounter ctr) + where + ctr | updatable = SLIT("UPD_BH_SINGLE_ENTRY_ctr") + | otherwise = SLIT("UPD_BH_UPDATABLE_ctr") + +tickyUpdateBhCaf cl_info + = ifTicky (bumpTickyCounter ctr) + where + ctr | closureUpdReqd cl_info = SLIT("UPD_CAF_BH_SINGLE_ENTRY_ctr") + | otherwise = SLIT("UPD_CAF_BH_UPDATABLE_ctr") + +tickyEnterFun :: ClosureInfo -> Code +tickyEnterFun cl_info + = ifTicky $ + do { bumpTickyCounter ctr + ; fun_ctr_lbl <- getTickyCtrLabel + ; registerTickyCtr fun_ctr_lbl + ; bumpTickyCounter' fun_ctr_lbl } + where + ctr | isStaticClosure cl_info = SLIT("TICK_ENT_STATIC_FUN_DIRECT") + | otherwise = SLIT("TICK_ENT_DYN_FUN_DIRECT") + +registerTickyCtr :: CLabel -> Code +-- Register a ticky counter +-- if ( ! 
f_ct.registeredp ) { +-- f_ct.link = ticky_entry_ctrs; /* hook this one onto the front of the list */ +-- ticky_entry_ctrs = & (f_ct); /* mark it as "registered" */ +-- f_ct.registeredp = 1 } +registerTickyCtr ctr_lbl + = emitIf test (stmtsC register_stmts) + where + test = CmmMachOp (MO_Not I16) + [CmmLoad (CmmLit (cmmLabelOffB ctr_lbl + oFFSET_StgEntCounter_registeredp)) I16] + register_stmts + = [ CmmStore (CmmLit (cmmLabelOffB ctr_lbl oFFSET_StgEntCounter_link)) + (CmmLoad ticky_entry_ctrs wordRep) + , CmmStore ticky_entry_ctrs (mkLblExpr ctr_lbl) + , CmmStore (CmmLit (cmmLabelOffB ctr_lbl + oFFSET_StgEntCounter_registeredp)) + (CmmLit (mkIntCLit 1)) ] + ticky_entry_ctrs = mkLblExpr (mkRtsDataLabel SLIT("ticky_entry_ctrs")) + +tickyReturnOldCon, tickyReturnNewCon :: Arity -> Code +tickyReturnOldCon arity + = ifTicky $ do { bumpTickyCounter SLIT("RET_OLD_ctr") + ; bumpHistogram SLIT("RET_OLD_hst") arity } +tickyReturnNewCon arity + | not opt_DoTickyProfiling = nopC + | otherwise + = ifTicky $ do { bumpTickyCounter SLIT("RET_NEW_ctr") + ; bumpHistogram SLIT("RET_NEW_hst") arity } + +tickyUnboxedTupleReturn :: Int -> Code +tickyUnboxedTupleReturn arity + = ifTicky $ do { bumpTickyCounter SLIT("RET_UNBOXED_TUP_ctr") + ; bumpHistogram SLIT("RET_UNBOXED_TUP_hst") arity } + +tickyVectoredReturn :: Int -> Code +tickyVectoredReturn family_size + = ifTicky $ do { bumpTickyCounter SLIT("VEC_RETURN_ctr") + ; bumpHistogram SLIT("RET_VEC_RETURN_hst") family_size } + +-- ----------------------------------------------------------------------------- +-- Ticky calls + +-- Ticks at a *call site*: +tickyKnownCallTooFewArgs = ifTicky $ bumpTickyCounter SLIT("KNOWN_CALL_TOO_FEW_ARGS_ctr") +tickyKnownCallExact = ifTicky $ bumpTickyCounter SLIT("KNOWN_CALL_ctr") +tickyKnownCallExtraArgs = ifTicky $ bumpTickyCounter SLIT("KNOWN_CALL_EXTRA_ctr") +tickyUnknownCall = ifTicky $ bumpTickyCounter SLIT("UNKNOWN_CALL_ctr") + +-- Tick for the call pattern at slow call site (i.e. in addition to +-- tickyUnknownCall, tickyKnownCallExtraArgs, etc.) +tickySlowCallPat :: [CgRep] -> Code +tickySlowCallPat args = return () +{- LATER: (introduces recursive module dependency now). 
+ case callPattern args of + (str, True) -> bumpTickyCounter' (mkRtsSlowTickyCtrLabel pat) + (str, False) -> bumpTickyCounter SLIT("TICK_SLOW_CALL_OTHER") + +callPattern :: [CgRep] -> (String,Bool) +callPattern reps + | match == length reps = (chars, True) + | otherwise = (chars, False) + where (_,match) = findMatch reps + chars = map argChar reps + +argChar VoidArg = 'v' +argChar PtrArg = 'p' +argChar NonPtrArg = 'n' +argChar LongArg = 'l' +argChar FloatArg = 'f' +argChar DoubleArg = 'd' +-} + +-- ----------------------------------------------------------------------------- +-- Ticky allocation + +tickyDynAlloc :: ClosureInfo -> Code +-- Called when doing a dynamic heap allocation +tickyDynAlloc cl_info + = ifTicky $ + case smRepClosureType (closureSMRep cl_info) of + Constr -> tick_alloc_con + ConstrNoCaf -> tick_alloc_con + Fun -> tick_alloc_fun + Thunk -> tick_alloc_thk + ThunkSelector -> tick_alloc_thk + where + -- will be needed when we fill in stubs + cl_size = closureSize cl_info + slop_size = slopSize cl_info + + tick_alloc_thk + | closureUpdReqd cl_info = tick_alloc_up_thk + | otherwise = tick_alloc_se_thk + + tick_alloc_con = panic "ToDo: tick_alloc" + tick_alloc_fun = panic "ToDo: tick_alloc" + tick_alloc_up_thk = panic "ToDo: tick_alloc" + tick_alloc_se_thk = panic "ToDo: tick_alloc" + +tickyAllocPrim :: CmmExpr -> CmmExpr -> CmmExpr -> Code +tickyAllocPrim hdr goods slop = ifTicky $ panic "ToDo: tickyAllocPrim" + +tickyAllocThunk :: CmmExpr -> CmmExpr -> Code +tickyAllocThunk goods slop = ifTicky $ panic "ToDo: tickyAllocThunk" + +tickyAllocPAP :: CmmExpr -> CmmExpr -> Code +tickyAllocPAP goods slop = ifTicky $ panic "ToDo: tickyAllocPAP" + +tickyAllocHeap :: VirtualHpOffset -> Code +-- Called when doing a heap check [TICK_ALLOC_HEAP] +tickyAllocHeap hp + = ifTicky $ + do { ticky_ctr <- getTickyCtrLabel + ; stmtsC $ + if hp == 0 then [] -- Inside the stmtC to avoid control + else [ -- dependency on the argument + -- Bump the allcoation count in the StgEntCounter + addToMem REP_StgEntCounter_allocs + (CmmLit (cmmLabelOffB ticky_ctr + oFFSET_StgEntCounter_allocs)) hp, + -- Bump ALLOC_HEAP_ctr + addToMemLbl cLongRep (mkRtsDataLabel SLIT("ALLOC_HEAP_ctr")) 1, + -- Bump ALLOC_HEAP_tot + addToMemLbl cLongRep (mkRtsDataLabel SLIT("ALLOC_HEAP_tot")) hp] } + +-- ----------------------------------------------------------------------------- +-- Ticky utils + +ifTicky :: Code -> Code +ifTicky code + | opt_DoTickyProfiling = code + | otherwise = nopC + +addToMemLbl :: MachRep -> CLabel -> Int -> CmmStmt +addToMemLbl rep lbl n = addToMem rep (CmmLit (CmmLabel lbl)) n + +-- All the ticky-ticky counters are declared "unsigned long" in C +bumpTickyCounter :: LitString -> Code +bumpTickyCounter lbl = bumpTickyCounter' (mkRtsDataLabel lbl) + +bumpTickyCounter' :: CLabel -> Code +bumpTickyCounter' lbl = stmtC (addToMemLbl cLongRep lbl 1) + +addToMemLong = addToMem cLongRep + +bumpHistogram :: LitString -> Int -> Code +bumpHistogram lbl n + = bumpHistogramE lbl (CmmLit (CmmInt (fromIntegral n) cLongRep)) + +bumpHistogramE :: LitString -> CmmExpr -> Code +bumpHistogramE lbl n + = do t <- newTemp cLongRep + stmtC (CmmAssign t n) + emitIf (CmmMachOp (MO_U_Le cLongRep) [CmmReg t, eight]) $ + stmtC (CmmAssign t eight) + stmtC (addToMemLong (cmmIndexExpr cLongRep + (CmmLit (CmmLabel (mkRtsDataLabel lbl))) + (CmmReg t)) + 1) + where + eight = CmmLit (CmmInt 8 cLongRep) + +------------------------------------------------------------------ +-- Showing the "type category" for ticky-ticky profiling + 
+showTypeCategory :: Type -> Char + {- {C,I,F,D} char, int, float, double + T tuple + S other single-constructor type + {c,i,f,d} unboxed ditto + t *unpacked* tuple + s *unpacked" single-cons... + + v void# + a primitive array + + E enumeration type + + dictionary, unless it's a ... + L List + > function + M other (multi-constructor) data-con type + . other type + - reserved for others to mark as "uninteresting" + -} +showTypeCategory ty + = if isDictTy ty + then '+' + else + case tcSplitTyConApp_maybe ty of + Nothing -> if isJust (tcSplitFunTy_maybe ty) + then '>' + else '.' + + Just (tycon, _) -> + let utc = getUnique tycon in + if utc == charDataConKey then 'C' + else if utc == intDataConKey then 'I' + else if utc == floatDataConKey then 'F' + else if utc == doubleDataConKey then 'D' + else if utc == smallIntegerDataConKey || + utc == largeIntegerDataConKey then 'J' + else if utc == charPrimTyConKey then 'c' + else if (utc == intPrimTyConKey || utc == wordPrimTyConKey + || utc == addrPrimTyConKey) then 'i' + else if utc == floatPrimTyConKey then 'f' + else if utc == doublePrimTyConKey then 'd' + else if isPrimTyCon tycon {- array, we hope -} then 'A' -- Bogus + else if isEnumerationTyCon tycon then 'E' + else if isTupleTyCon tycon then 'T' + else if isJust (maybeTyConSingleCon tycon) then 'S' + else if utc == listTyConKey then 'L' + else 'M' -- oh, well... diff --git a/compiler/codeGen/CgUsages.hi-boot-5 b/compiler/codeGen/CgUsages.hi-boot-5 new file mode 100644 index 0000000000..abb98cec1a --- /dev/null +++ b/compiler/codeGen/CgUsages.hi-boot-5 @@ -0,0 +1,3 @@ +__interface CgUsages 1 0 where +__export CgUsages getSpRelOffset; +1 getSpRelOffset :: AbsCSyn.VirtualSpOffset -> CgMonad.FCode AbsCSyn.RegRelative ; diff --git a/compiler/codeGen/CgUsages.hi-boot-6 b/compiler/codeGen/CgUsages.hi-boot-6 new file mode 100644 index 0000000000..9640603cfb --- /dev/null +++ b/compiler/codeGen/CgUsages.hi-boot-6 @@ -0,0 +1,3 @@ +module CgUsages where + +getSpRelOffset :: AbsCSyn.VirtualSpOffset -> CgMonad.FCode AbsCSyn.RegRelative diff --git a/compiler/codeGen/CgUtils.hs b/compiler/codeGen/CgUtils.hs new file mode 100644 index 0000000000..2f69927db0 --- /dev/null +++ b/compiler/codeGen/CgUtils.hs @@ -0,0 +1,688 @@ +----------------------------------------------------------------------------- +-- +-- Code generator utilities; mostly monadic +-- +-- (c) The University of Glasgow 2004 +-- +----------------------------------------------------------------------------- + +module CgUtils ( + addIdReps, + cgLit, + emitDataLits, emitRODataLits, emitIf, emitIfThenElse, + emitRtsCall, emitRtsCallWithVols, emitRtsCallWithResult, + assignTemp, newTemp, + emitSimultaneously, + emitSwitch, emitLitSwitch, + tagToClosure, + + cmmAndWord, cmmOrWord, cmmNegate, cmmEqWord, cmmNeWord, + cmmOffsetExprW, cmmOffsetExprB, + cmmRegOffW, cmmRegOffB, + cmmLabelOffW, cmmLabelOffB, + cmmOffsetW, cmmOffsetB, + cmmOffsetLitW, cmmOffsetLitB, + cmmLoadIndexW, + + addToMem, addToMemE, + mkWordCLit, + mkStringCLit, + packHalfWordsCLit, + blankWord + ) where + +#include "HsVersions.h" + +import CgMonad +import TyCon ( TyCon, tyConName ) +import Id ( Id ) +import Constants ( wORD_SIZE ) +import SMRep ( CgRep, StgWord, hALF_WORD_SIZE_IN_BITS, ByteOff, + WordOff, idCgRep ) +import PprCmm ( {- instances -} ) +import Cmm +import CLabel +import CmmUtils +import MachOp ( MachRep(..), wordRep, MachOp(..), MachHint(..), + mo_wordOr, mo_wordAnd, mo_wordNe, mo_wordEq, + mo_wordULt, mo_wordUGt, mo_wordUGe, machRepByteWidth ) +import ForeignCall ( 
CCallConv(..) ) +import Literal ( Literal(..) ) +import CLabel ( CLabel, mkStringLitLabel ) +import Digraph ( SCC(..), stronglyConnComp ) +import ListSetOps ( assocDefault ) +import Util ( filterOut, sortLe ) +import DynFlags ( DynFlags(..), HscTarget(..) ) +import Packages ( HomeModules ) +import FastString ( LitString, FastString, bytesFS ) +import Outputable + +import Char ( ord ) +import DATA_BITS +import DATA_WORD ( Word8 ) +import Maybe ( isNothing ) + +------------------------------------------------------------------------- +-- +-- Random small functions +-- +------------------------------------------------------------------------- + +addIdReps :: [Id] -> [(CgRep, Id)] +addIdReps ids = [(idCgRep id, id) | id <- ids] + +------------------------------------------------------------------------- +-- +-- Literals +-- +------------------------------------------------------------------------- + +cgLit :: Literal -> FCode CmmLit +cgLit (MachStr s) = mkByteStringCLit (bytesFS s) + -- not unpackFS; we want the UTF-8 byte stream. +cgLit other_lit = return (mkSimpleLit other_lit) + +mkSimpleLit :: Literal -> CmmLit +mkSimpleLit (MachChar c) = CmmInt (fromIntegral (ord c)) wordRep +mkSimpleLit MachNullAddr = zeroCLit +mkSimpleLit (MachInt i) = CmmInt i wordRep +mkSimpleLit (MachInt64 i) = CmmInt i I64 +mkSimpleLit (MachWord i) = CmmInt i wordRep +mkSimpleLit (MachWord64 i) = CmmInt i I64 +mkSimpleLit (MachFloat r) = CmmFloat r F32 +mkSimpleLit (MachDouble r) = CmmFloat r F64 +mkSimpleLit (MachLabel fs ms) = CmmLabel (mkForeignLabel fs ms is_dyn) + where + is_dyn = False -- ToDo: fix me + +mkLtOp :: Literal -> MachOp +-- On signed literals we must do a signed comparison +mkLtOp (MachInt _) = MO_S_Lt wordRep +mkLtOp (MachFloat _) = MO_S_Lt F32 +mkLtOp (MachDouble _) = MO_S_Lt F64 +mkLtOp lit = MO_U_Lt (cmmLitRep (mkSimpleLit lit)) + + +--------------------------------------------------- +-- +-- Cmm data type functions +-- +--------------------------------------------------- + +----------------------- +-- The "B" variants take byte offsets +cmmRegOffB :: CmmReg -> ByteOff -> CmmExpr +cmmRegOffB = cmmRegOff + +cmmOffsetB :: CmmExpr -> ByteOff -> CmmExpr +cmmOffsetB = cmmOffset + +cmmOffsetExprB :: CmmExpr -> CmmExpr -> CmmExpr +cmmOffsetExprB = cmmOffsetExpr + +cmmLabelOffB :: CLabel -> ByteOff -> CmmLit +cmmLabelOffB = cmmLabelOff + +cmmOffsetLitB :: CmmLit -> ByteOff -> CmmLit +cmmOffsetLitB = cmmOffsetLit + +----------------------- +-- The "W" variants take word offsets +cmmOffsetExprW :: CmmExpr -> CmmExpr -> CmmExpr +-- The second arg is a *word* offset; need to change it to bytes +cmmOffsetExprW e (CmmLit (CmmInt n _)) = cmmOffsetW e (fromInteger n) +cmmOffsetExprW e wd_off = cmmIndexExpr wordRep e wd_off + +cmmOffsetW :: CmmExpr -> WordOff -> CmmExpr +cmmOffsetW e n = cmmOffsetB e (wORD_SIZE * n) + +cmmRegOffW :: CmmReg -> WordOff -> CmmExpr +cmmRegOffW reg wd_off = cmmRegOffB reg (wd_off * wORD_SIZE) + +cmmOffsetLitW :: CmmLit -> WordOff -> CmmLit +cmmOffsetLitW lit wd_off = cmmOffsetLitB lit (wORD_SIZE * wd_off) + +cmmLabelOffW :: CLabel -> WordOff -> CmmLit +cmmLabelOffW lbl wd_off = cmmLabelOffB lbl (wORD_SIZE * wd_off) + +cmmLoadIndexW :: CmmExpr -> Int -> CmmExpr +cmmLoadIndexW base off + = CmmLoad (cmmOffsetW base off) wordRep + +----------------------- +cmmNeWord, cmmEqWord, cmmOrWord, cmmAndWord :: CmmExpr -> CmmExpr -> CmmExpr +cmmOrWord e1 e2 = CmmMachOp mo_wordOr [e1, e2] +cmmAndWord e1 e2 = CmmMachOp mo_wordAnd [e1, e2] +cmmNeWord e1 e2 = CmmMachOp mo_wordNe [e1, e2] +cmmEqWord 
e1 e2 = CmmMachOp mo_wordEq [e1, e2] +cmmULtWord e1 e2 = CmmMachOp mo_wordULt [e1, e2] +cmmUGeWord e1 e2 = CmmMachOp mo_wordUGe [e1, e2] +cmmUGtWord e1 e2 = CmmMachOp mo_wordUGt [e1, e2] + +cmmNegate :: CmmExpr -> CmmExpr +cmmNegate (CmmLit (CmmInt n rep)) = CmmLit (CmmInt (-n) rep) +cmmNegate e = CmmMachOp (MO_S_Neg (cmmExprRep e)) [e] + +blankWord :: CmmStatic +blankWord = CmmUninitialised wORD_SIZE + +----------------------- +-- Making literals + +mkWordCLit :: StgWord -> CmmLit +mkWordCLit wd = CmmInt (fromIntegral wd) wordRep + +packHalfWordsCLit :: (Integral a, Integral b) => a -> b -> CmmLit +-- Make a single word literal in which the lower_half_word is +-- at the lower address, and the upper_half_word is at the +-- higher address +-- ToDo: consider using half-word lits instead +-- but be careful: that's vulnerable when reversed +packHalfWordsCLit lower_half_word upper_half_word +#ifdef WORDS_BIGENDIAN + = mkWordCLit ((fromIntegral lower_half_word `shiftL` hALF_WORD_SIZE_IN_BITS) + .|. fromIntegral upper_half_word) +#else + = mkWordCLit ((fromIntegral lower_half_word) + .|. (fromIntegral upper_half_word `shiftL` hALF_WORD_SIZE_IN_BITS)) +#endif + +-------------------------------------------------------------------------- +-- +-- Incrementing a memory location +-- +-------------------------------------------------------------------------- + +addToMem :: MachRep -- rep of the counter + -> CmmExpr -- Address + -> Int -- What to add (a word) + -> CmmStmt +addToMem rep ptr n = addToMemE rep ptr (CmmLit (CmmInt (toInteger n) rep)) + +addToMemE :: MachRep -- rep of the counter + -> CmmExpr -- Address + -> CmmExpr -- What to add (a word-typed expression) + -> CmmStmt +addToMemE rep ptr n + = CmmStore ptr (CmmMachOp (MO_Add rep) [CmmLoad ptr rep, n]) + +------------------------------------------------------------------------- +-- +-- Converting a closure tag to a closure for enumeration types +-- (this is the implementation of tagToEnum#). +-- +------------------------------------------------------------------------- + +tagToClosure :: HomeModules -> TyCon -> CmmExpr -> CmmExpr +tagToClosure hmods tycon tag + = CmmLoad (cmmOffsetExprW closure_tbl tag) wordRep + where closure_tbl = CmmLit (CmmLabel lbl) + lbl = mkClosureTableLabel hmods (tyConName tycon) + +------------------------------------------------------------------------- +-- +-- Conditionals and rts calls +-- +------------------------------------------------------------------------- + +emitIf :: CmmExpr -- Boolean + -> Code -- Then part + -> Code +-- Emit (if e then x) +-- ToDo: reverse the condition to avoid the extra branch instruction if possible +-- (some conditionals aren't reversible. eg. floating point comparisons cannot +-- be inverted because there exist some values for which both comparisons +-- return False, such as NaN.) 
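+
+-- That last point can be seen directly in Haskell (illustrative values
+-- only): with a NaN operand, a comparison and its "reversed" form are
+-- both False, so negating one is not the same as reversing it.
+exampleNaNCompare :: (Bool, Bool)
+exampleNaNCompare = (nan < 0, nan >= 0)        -- evaluates to (False, False)
+  where
+    nan = 0/0 :: Double
+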
+emitIf cond then_part + = do { then_id <- newLabelC + ; join_id <- newLabelC + ; stmtC (CmmCondBranch cond then_id) + ; stmtC (CmmBranch join_id) + ; labelC then_id + ; then_part + ; labelC join_id + } + +emitIfThenElse :: CmmExpr -- Boolean + -> Code -- Then part + -> Code -- Else part + -> Code +-- Emit (if e then x else y) +emitIfThenElse cond then_part else_part + = do { then_id <- newLabelC + ; else_id <- newLabelC + ; join_id <- newLabelC + ; stmtC (CmmCondBranch cond then_id) + ; else_part + ; stmtC (CmmBranch join_id) + ; labelC then_id + ; then_part + ; labelC join_id + } + +emitRtsCall :: LitString -> [(CmmExpr,MachHint)] -> Code +emitRtsCall fun args = emitRtsCall' [] fun args Nothing + -- The 'Nothing' says "save all global registers" + +emitRtsCallWithVols :: LitString -> [(CmmExpr,MachHint)] -> [GlobalReg] -> Code +emitRtsCallWithVols fun args vols + = emitRtsCall' [] fun args (Just vols) + +emitRtsCallWithResult :: CmmReg -> MachHint -> LitString + -> [(CmmExpr,MachHint)] -> Code +emitRtsCallWithResult res hint fun args + = emitRtsCall' [(res,hint)] fun args Nothing + +-- Make a call to an RTS C procedure +emitRtsCall' + :: [(CmmReg,MachHint)] + -> LitString + -> [(CmmExpr,MachHint)] + -> Maybe [GlobalReg] + -> Code +emitRtsCall' res fun args vols = stmtC (CmmCall target res args vols) + where + target = CmmForeignCall fun_expr CCallConv + fun_expr = mkLblExpr (mkRtsCodeLabel fun) + + +------------------------------------------------------------------------- +-- +-- Strings gnerate a top-level data block +-- +------------------------------------------------------------------------- + +emitDataLits :: CLabel -> [CmmLit] -> Code +-- Emit a data-segment data block +emitDataLits lbl lits + = emitData Data (CmmDataLabel lbl : map CmmStaticLit lits) + +emitRODataLits :: CLabel -> [CmmLit] -> Code +-- Emit a read-only data block +emitRODataLits lbl lits + = emitData section (CmmDataLabel lbl : map CmmStaticLit lits) + where section | any needsRelocation lits = RelocatableReadOnlyData + | otherwise = ReadOnlyData + needsRelocation (CmmLabel _) = True + needsRelocation (CmmLabelOff _ _) = True + needsRelocation _ = False + +mkStringCLit :: String -> FCode CmmLit +-- Make a global definition for the string, +-- and return its label +mkStringCLit str = mkByteStringCLit (map (fromIntegral.ord) str) + +mkByteStringCLit :: [Word8] -> FCode CmmLit +mkByteStringCLit bytes + = do { uniq <- newUnique + ; let lbl = mkStringLitLabel uniq + ; emitData ReadOnlyData [CmmDataLabel lbl, CmmString bytes] + ; return (CmmLabel lbl) } + +------------------------------------------------------------------------- +-- +-- Assigning expressions to temporaries +-- +------------------------------------------------------------------------- + +assignTemp :: CmmExpr -> FCode CmmExpr +-- For a non-trivial expression, e, create a local +-- variable and assign the expression to it +assignTemp e + | isTrivialCmmExpr e = return e + | otherwise = do { reg <- newTemp (cmmExprRep e) + ; stmtC (CmmAssign reg e) + ; return (CmmReg reg) } + + +newTemp :: MachRep -> FCode CmmReg +newTemp rep = do { uniq <- newUnique; return (CmmLocal (LocalReg uniq rep)) } + + +------------------------------------------------------------------------- +-- +-- Building case analysis +-- +------------------------------------------------------------------------- + +emitSwitch + :: CmmExpr -- Tag to switch on + -> [(ConTagZ, CgStmts)] -- Tagged branches + -> Maybe CgStmts -- Default branch (if any) + -> ConTagZ -> ConTagZ -- Min and Max possible 
values; behaviour + -- outside this range is undefined + -> Code + +-- ONLY A DEFAULT BRANCH: no case analysis to do +emitSwitch tag_expr [] (Just stmts) _ _ + = emitCgStmts stmts + +-- Right, off we go +emitSwitch tag_expr branches mb_deflt lo_tag hi_tag + = -- Just sort the branches before calling mk_sritch + do { mb_deflt_id <- + case mb_deflt of + Nothing -> return Nothing + Just stmts -> do id <- forkCgStmts stmts; return (Just id) + + ; dflags <- getDynFlags + ; let via_C | HscC <- hscTarget dflags = True + | otherwise = False + + ; stmts <- mk_switch tag_expr (sortLe le branches) + mb_deflt_id lo_tag hi_tag via_C + ; emitCgStmts stmts + } + where + (t1,_) `le` (t2,_) = t1 <= t2 + + +mk_switch :: CmmExpr -> [(ConTagZ, CgStmts)] + -> Maybe BlockId -> ConTagZ -> ConTagZ -> Bool + -> FCode CgStmts + +-- SINGLETON TAG RANGE: no case analysis to do +mk_switch tag_expr [(tag,stmts)] _ lo_tag hi_tag via_C + | lo_tag == hi_tag + = ASSERT( tag == lo_tag ) + return stmts + +-- SINGLETON BRANCH, NO DEFUALT: no case analysis to do +mk_switch tag_expr [(tag,stmts)] Nothing lo_tag hi_tag via_C + = return stmts + -- The simplifier might have eliminated a case + -- so we may have e.g. case xs of + -- [] -> e + -- In that situation we can be sure the (:) case + -- can't happen, so no need to test + +-- SINGLETON BRANCH: one equality check to do +mk_switch tag_expr [(tag,stmts)] (Just deflt) lo_tag hi_tag via_C + = return (CmmCondBranch cond deflt `consCgStmt` stmts) + where + cond = cmmNeWord tag_expr (CmmLit (mkIntCLit tag)) + -- We have lo_tag < hi_tag, but there's only one branch, + -- so there must be a default + +-- ToDo: we might want to check for the two branch case, where one of +-- the branches is the tag 0, because comparing '== 0' is likely to be +-- more efficient than other kinds of comparison. + +-- DENSE TAG RANGE: use a switch statment. +-- +-- We also use a switch uncoditionally when compiling via C, because +-- this will get emitted as a C switch statement and the C compiler +-- should do a good job of optimising it. Also, older GCC versions +-- (2.95 in particular) have problems compiling the complicated +-- if-trees generated by this code, so compiling to a switch every +-- time works around that problem. +-- +mk_switch tag_expr branches mb_deflt lo_tag hi_tag via_C + | use_switch -- Use a switch + = do { branch_ids <- mapM forkCgStmts (map snd branches) + ; let + tagged_blk_ids = zip (map fst branches) (map Just branch_ids) + + find_branch :: ConTagZ -> Maybe BlockId + find_branch i = assocDefault mb_deflt tagged_blk_ids i + + -- NB. we have eliminated impossible branches at + -- either end of the range (see below), so the first + -- tag of a real branch is real_lo_tag (not lo_tag). 
+ arms = [ find_branch i | i <- [real_lo_tag..real_hi_tag]] + + switch_stmt = CmmSwitch (cmmOffset tag_expr (- real_lo_tag)) arms + + ; ASSERT(not (all isNothing arms)) + return (oneCgStmt switch_stmt) + } + + -- if we can knock off a bunch of default cases with one if, then do so + | Just deflt <- mb_deflt, (lowest_branch - lo_tag) >= n_branches + = do { (assign_tag, tag_expr') <- assignTemp' tag_expr + ; let cond = cmmULtWord tag_expr' (CmmLit (mkIntCLit lowest_branch)) + branch = CmmCondBranch cond deflt + ; stmts <- mk_switch tag_expr' branches mb_deflt + lowest_branch hi_tag via_C + ; return (assign_tag `consCgStmt` (branch `consCgStmt` stmts)) + } + + | Just deflt <- mb_deflt, (hi_tag - highest_branch) >= n_branches + = do { (assign_tag, tag_expr') <- assignTemp' tag_expr + ; let cond = cmmUGtWord tag_expr' (CmmLit (mkIntCLit highest_branch)) + branch = CmmCondBranch cond deflt + ; stmts <- mk_switch tag_expr' branches mb_deflt + lo_tag highest_branch via_C + ; return (assign_tag `consCgStmt` (branch `consCgStmt` stmts)) + } + + | otherwise -- Use an if-tree + = do { (assign_tag, tag_expr') <- assignTemp' tag_expr + -- To avoid duplication + ; lo_stmts <- mk_switch tag_expr' lo_branches mb_deflt + lo_tag (mid_tag-1) via_C + ; hi_stmts <- mk_switch tag_expr' hi_branches mb_deflt + mid_tag hi_tag via_C + ; hi_id <- forkCgStmts hi_stmts + ; let cond = cmmUGeWord tag_expr' (CmmLit (mkIntCLit mid_tag)) + branch_stmt = CmmCondBranch cond hi_id + ; return (assign_tag `consCgStmt` (branch_stmt `consCgStmt` lo_stmts)) + } + -- we test (e >= mid_tag) rather than (e < mid_tag), because + -- the former works better when e is a comparison, and there + -- are two tags 0 & 1 (mid_tag == 1). In this case, the code + -- generator can reduce the condition to e itself without + -- having to reverse the sense of the comparison: comparisons + -- can't always be easily reversed (eg. floating + -- pt. comparisons). + where + use_switch = {- pprTrace "mk_switch" ( + ppr tag_expr <+> text "n_tags:" <+> int n_tags <+> + text "n_branches:" <+> int n_branches <+> + text "lo_tag: " <+> int lo_tag <+> + text "hi_tag: " <+> int hi_tag <+> + text "real_lo_tag: " <+> int real_lo_tag <+> + text "real_hi_tag: " <+> int real_hi_tag) $ -} + ASSERT( n_branches > 1 && n_tags > 1 ) + n_tags > 2 && (small || dense || via_C) + -- a 2-branch switch always turns into an if. + small = n_tags <= 4 + dense = n_branches > (n_tags `div` 2) + exhaustive = n_tags == n_branches + n_branches = length branches + + -- ignore default slots at each end of the range if there's + -- no default branch defined. + lowest_branch = fst (head branches) + highest_branch = fst (last branches) + + real_lo_tag + | isNothing mb_deflt = lowest_branch + | otherwise = lo_tag + + real_hi_tag + | isNothing mb_deflt = highest_branch + | otherwise = hi_tag + + n_tags = real_hi_tag - real_lo_tag + 1 + + -- INVARIANT: Provided hi_tag > lo_tag (which is true) + -- lo_tag <= mid_tag < hi_tag + -- lo_branches have tags < mid_tag + -- hi_branches have tags >= mid_tag + + (mid_tag,_) = branches !! (n_branches `div` 2) + -- 2 branches => n_branches `div` 2 = 1 + -- => branches !! 
1 give the *second* tag + -- There are always at least 2 branches here + + (lo_branches, hi_branches) = span is_lo branches + is_lo (t,_) = t < mid_tag + + +assignTemp' e + | isTrivialCmmExpr e = return (CmmNop, e) + | otherwise = do { reg <- newTemp (cmmExprRep e) + ; return (CmmAssign reg e, CmmReg reg) } + + +emitLitSwitch :: CmmExpr -- Tag to switch on + -> [(Literal, CgStmts)] -- Tagged branches + -> CgStmts -- Default branch (always) + -> Code -- Emit the code +-- Used for general literals, whose size might not be a word, +-- where there is always a default case, and where we don't know +-- the range of values for certain. For simplicity we always generate a tree. +-- +-- ToDo: for integers we could do better here, perhaps by generalising +-- mk_switch and using that. --SDM 15/09/2004 +emitLitSwitch scrut [] deflt + = emitCgStmts deflt +emitLitSwitch scrut branches deflt_blk + = do { scrut' <- assignTemp scrut + ; deflt_blk_id <- forkCgStmts deflt_blk + ; blk <- mk_lit_switch scrut' deflt_blk_id (sortLe le branches) + ; emitCgStmts blk } + where + le (t1,_) (t2,_) = t1 <= t2 + +mk_lit_switch :: CmmExpr -> BlockId + -> [(Literal,CgStmts)] + -> FCode CgStmts +mk_lit_switch scrut deflt_blk_id [(lit,blk)] + = return (consCgStmt if_stmt blk) + where + cmm_lit = mkSimpleLit lit + rep = cmmLitRep cmm_lit + cond = CmmMachOp (MO_Ne rep) [scrut, CmmLit cmm_lit] + if_stmt = CmmCondBranch cond deflt_blk_id + +mk_lit_switch scrut deflt_blk_id branches + = do { hi_blk <- mk_lit_switch scrut deflt_blk_id hi_branches + ; lo_blk <- mk_lit_switch scrut deflt_blk_id lo_branches + ; lo_blk_id <- forkCgStmts lo_blk + ; let if_stmt = CmmCondBranch cond lo_blk_id + ; return (if_stmt `consCgStmt` hi_blk) } + where + n_branches = length branches + (mid_lit,_) = branches !! (n_branches `div` 2) + -- See notes above re mid_tag + + (lo_branches, hi_branches) = span is_lo branches + is_lo (t,_) = t < mid_lit + + cond = CmmMachOp (mkLtOp mid_lit) + [scrut, CmmLit (mkSimpleLit mid_lit)] + +------------------------------------------------------------------------- +-- +-- Simultaneous assignment +-- +------------------------------------------------------------------------- + + +emitSimultaneously :: CmmStmts -> Code +-- Emit code to perform the assignments in the +-- input simultaneously, using temporary variables when necessary. +-- +-- The Stmts must be: +-- CmmNop, CmmComment, CmmAssign, CmmStore +-- and nothing else + + +-- We use the strongly-connected component algorithm, in which +-- * the vertices are the statements +-- * an edge goes from s1 to s2 iff +-- s1 assigns to something s2 uses +-- that is, if s1 should *follow* s2 in the final order + +type CVertex = (Int, CmmStmt) -- Give each vertex a unique number, + -- for fast comparison + +emitSimultaneously stmts + = codeOnly $ + case filterOut isNopStmt (stmtList stmts) of + -- Remove no-ops + [] -> nopC + [stmt] -> stmtC stmt -- It's often just one stmt + stmt_list -> doSimultaneously1 (zip [(1::Int)..] stmt_list) + +doSimultaneously1 :: [CVertex] -> Code +doSimultaneously1 vertices + = let + edges = [ (vertex, key1, edges_from stmt1) + | vertex@(key1, stmt1) <- vertices + ] + edges_from stmt1 = [ key2 | (key2, stmt2) <- vertices, + stmt1 `mustFollow` stmt2 + ] + components = stronglyConnComp edges + + -- do_components deal with one strongly-connected component + -- Not cyclic, or singleton? Just do it + do_component (AcyclicSCC (n,stmt)) = stmtC stmt + do_component (CyclicSCC [(n,stmt)]) = stmtC stmt + + -- Cyclic? Then go via temporaries. 
Pick one to + -- break the loop and try again with the rest. + do_component (CyclicSCC ((n,first_stmt) : rest)) + = do { from_temp <- go_via_temp first_stmt + ; doSimultaneously1 rest + ; stmtC from_temp } + + go_via_temp (CmmAssign dest src) + = do { tmp <- newTemp (cmmRegRep dest) + ; stmtC (CmmAssign tmp src) + ; return (CmmAssign dest (CmmReg tmp)) } + go_via_temp (CmmStore dest src) + = do { tmp <- newTemp (cmmExprRep src) + ; stmtC (CmmAssign tmp src) + ; return (CmmStore dest (CmmReg tmp)) } + in + mapCs do_component components + +mustFollow :: CmmStmt -> CmmStmt -> Bool +CmmAssign reg _ `mustFollow` stmt = anySrc (reg `regUsedIn`) stmt +CmmStore loc e `mustFollow` stmt = anySrc (locUsedIn loc (cmmExprRep e)) stmt +CmmNop `mustFollow` stmt = False +CmmComment _ `mustFollow` stmt = False + + +anySrc :: (CmmExpr -> Bool) -> CmmStmt -> Bool +-- True if the fn is true of any input of the stmt +anySrc p (CmmAssign _ e) = p e +anySrc p (CmmStore e1 e2) = p e1 || p e2 -- Might be used in either side +anySrc p (CmmComment _) = False +anySrc p CmmNop = False +anySrc p other = True -- Conservative + +regUsedIn :: CmmReg -> CmmExpr -> Bool +reg `regUsedIn` CmmLit _ = False +reg `regUsedIn` CmmLoad e _ = reg `regUsedIn` e +reg `regUsedIn` CmmReg reg' = reg == reg' +reg `regUsedIn` CmmRegOff reg' _ = reg == reg' +reg `regUsedIn` CmmMachOp _ es = any (reg `regUsedIn`) es + +locUsedIn :: CmmExpr -> MachRep -> CmmExpr -> Bool +-- (locUsedIn a r e) checks whether writing to r[a] could affect the value of +-- 'e'. Returns True if it's not sure. +locUsedIn loc rep (CmmLit _) = False +locUsedIn loc rep (CmmLoad e ld_rep) = possiblySameLoc loc rep e ld_rep +locUsedIn loc rep (CmmReg reg') = False +locUsedIn loc rep (CmmRegOff reg' _) = False +locUsedIn loc rep (CmmMachOp _ es) = any (locUsedIn loc rep) es + +possiblySameLoc :: CmmExpr -> MachRep -> CmmExpr -> MachRep -> Bool +-- Assumes that distinct registers (eg Hp, Sp) do not +-- point to the same location, nor any offset thereof. 
+possiblySameLoc (CmmReg r1) rep1 (CmmReg r2) rep2 = r1==r2 +possiblySameLoc (CmmReg r1) rep1 (CmmRegOff r2 0) rep2 = r1==r2 +possiblySameLoc (CmmRegOff r1 0) rep1 (CmmReg r2) rep2 = r1==r2 +possiblySameLoc (CmmRegOff r1 start1) rep1 (CmmRegOff r2 start2) rep2 + = r1==r2 && end1 > start2 && end2 > start1 + where + end1 = start1 + machRepByteWidth rep1 + end2 = start2 + machRepByteWidth rep2 + +possiblySameLoc l1 rep1 (CmmLit _) rep2 = False +possiblySameLoc l1 rep1 l2 rep2 = True -- Conservative diff --git a/compiler/codeGen/ClosureInfo.hi-boot-5 b/compiler/codeGen/ClosureInfo.hi-boot-5 new file mode 100644 index 0000000000..2291f93cc6 --- /dev/null +++ b/compiler/codeGen/ClosureInfo.hi-boot-5 @@ -0,0 +1,4 @@ +__interface ClosureInfo 1 0 where +__export ClosureInfo ClosureInfo LambdaFormInfo; +1 data LambdaFormInfo; +1 data ClosureInfo; diff --git a/compiler/codeGen/ClosureInfo.hi-boot-6 b/compiler/codeGen/ClosureInfo.hi-boot-6 new file mode 100644 index 0000000000..d313ccde80 --- /dev/null +++ b/compiler/codeGen/ClosureInfo.hi-boot-6 @@ -0,0 +1,4 @@ +module ClosureInfo where + +data LambdaFormInfo +data ClosureInfo diff --git a/compiler/codeGen/ClosureInfo.lhs b/compiler/codeGen/ClosureInfo.lhs new file mode 100644 index 0000000000..84d9dd95ef --- /dev/null +++ b/compiler/codeGen/ClosureInfo.lhs @@ -0,0 +1,951 @@ +% +% (c) The Univserity of Glasgow 1992-2004 +% + + Data structures which describe closures, and + operations over those data structures + + Nothing monadic in here + +Much of the rationale for these things is in the ``details'' part of +the STG paper. + +\begin{code} +module ClosureInfo ( + ClosureInfo, LambdaFormInfo, SMRep, -- all abstract + StandardFormInfo, + + ArgDescr(..), Liveness(..), + C_SRT(..), needsSRT, + + mkLFThunk, mkLFReEntrant, mkConLFInfo, mkSelectorLFInfo, + mkApLFInfo, mkLFImported, mkLFArgument, mkLFLetNoEscape, + + mkClosureInfo, mkConInfo, + + closureSize, closureNonHdrSize, + closureGoodStuffSize, closurePtrsSize, + slopSize, + + closureName, infoTableLabelFromCI, + closureLabelFromCI, closureSRT, + closureLFInfo, isLFThunk,closureSMRep, closureUpdReqd, + closureNeedsUpdSpace, closureIsThunk, + closureSingleEntry, closureReEntrant, isConstrClosure_maybe, + closureFunInfo, isStandardFormThunk, isKnownFun, + + enterIdLabel, enterLocalIdLabel, enterReturnPtLabel, + + nodeMustPointToIt, + CallMethod(..), getCallMethod, + + blackHoleOnEntry, + + staticClosureRequired, + getClosureType, + + isToplevClosure, + closureValDescr, closureTypeDescr, -- profiling + + isStaticClosure, + cafBlackHoleClosureInfo, seCafBlackHoleClosureInfo, + + staticClosureNeedsLink, + ) where + +#include "../includes/MachDeps.h" +#include "HsVersions.h" + +import StgSyn +import SMRep -- all of it + +import CLabel + +import Constants ( mIN_PAYLOAD_SIZE ) +import Packages ( isDllName, HomeModules ) +import StaticFlags ( opt_SccProfilingOn, opt_OmitBlackHoling, + opt_Parallel, opt_DoTickyProfiling ) +import Id ( Id, idType, idArity, idName ) +import DataCon ( DataCon, dataConTyCon, isNullaryRepDataCon, dataConName ) +import Name ( Name, nameUnique, getOccName, getOccString ) +import OccName ( occNameString ) +import Type ( isUnLiftedType, Type, repType, splitTyConApp_maybe ) +import TcType ( tcSplitSigmaTy ) +import TyCon ( isFunTyCon, isAbstractTyCon ) +import BasicTypes ( TopLevelFlag(..), isNotTopLevel, isTopLevel, ipNameName ) +import FastString +import Outputable +import Constants + +import TypeRep -- TEMP +\end{code} + + 
+%************************************************************************ +%* * +\subsection[ClosureInfo-datatypes]{Data types for closure information} +%* * +%************************************************************************ + +Information about a closure, from the code generator's point of view. + +A ClosureInfo decribes the info pointer of a closure. It has +enough information + a) to construct the info table itself + b) to allocate a closure containing that info pointer (i.e. + it knows the info table label) + +We make a ClosureInfo for + - each let binding (both top level and not) + - each data constructor (for its shared static and + dynamic info tables) + +\begin{code} +data ClosureInfo + = ClosureInfo { + closureName :: !Name, -- The thing bound to this closure + closureLFInfo :: !LambdaFormInfo, -- NOTE: not an LFCon (see below) + closureSMRep :: !SMRep, -- representation used by storage mgr + closureSRT :: !C_SRT, -- What SRT applies to this closure + closureType :: !Type, -- Type of closure (ToDo: remove) + closureDescr :: !String -- closure description (for profiling) + } + + -- Constructor closures don't have a unique info table label (they use + -- the constructor's info table), and they don't have an SRT. + | ConInfo { + closureCon :: !DataCon, + closureSMRep :: !SMRep, + closureDllCon :: !Bool -- is in a separate DLL + } + +-- C_SRT is what StgSyn.SRT gets translated to... +-- we add a label for the table, and expect only the 'offset/length' form + +data C_SRT = NoC_SRT + | C_SRT !CLabel !WordOff !StgHalfWord {-bitmap or escape-} + +needsSRT :: C_SRT -> Bool +needsSRT NoC_SRT = False +needsSRT (C_SRT _ _ _) = True +\end{code} + +%************************************************************************ +%* * +\subsubsection[LambdaFormInfo-datatype]{@LambdaFormInfo@: source-derivable info} +%* * +%************************************************************************ + +Information about an identifier, from the code generator's point of +view. Every identifier is bound to a LambdaFormInfo in the +environment, which gives the code generator enough info to be able to +tail call or return that identifier. + +Note that a closure is usually bound to an identifier, so a +ClosureInfo contains a LambdaFormInfo. + +\begin{code} +data LambdaFormInfo + = LFReEntrant -- Reentrant closure (a function) + TopLevelFlag -- True if top level + !Int -- Arity. Invariant: always > 0 + !Bool -- True <=> no fvs + ArgDescr -- Argument descriptor (should reall be in ClosureInfo) + + | LFCon -- A saturated constructor application + DataCon -- The constructor + + | LFThunk -- Thunk (zero arity) + TopLevelFlag + !Bool -- True <=> no free vars + !Bool -- True <=> updatable (i.e., *not* single-entry) + StandardFormInfo + !Bool -- True <=> *might* be a function type + + | LFUnknown -- Used for function arguments and imported things. + -- We know nothing about this closure. Treat like + -- updatable "LFThunk"... + -- Imported things which we do know something about use + -- one of the other LF constructors (eg LFReEntrant for + -- known functions) + !Bool -- True <=> *might* be a function type + + | LFLetNoEscape -- See LetNoEscape module for precise description of + -- these "lets". + !Int -- arity; + + | LFBlackHole -- Used for the closures allocated to hold the result + -- of a CAF. We want the target of the update frame to + -- be in the heap, so we make a black hole to hold it. + CLabel -- Flavour (info label, eg CAF_BLACKHOLE_info). 
+ + +------------------------- +-- An ArgDsecr describes the argument pattern of a function + +data ArgDescr + = ArgSpec -- Fits one of the standard patterns + !Int -- RTS type identifier ARG_P, ARG_N, ... + + | ArgGen -- General case + Liveness -- Details about the arguments + + +------------------------- +-- We represent liveness bitmaps as a Bitmap (whose internal +-- representation really is a bitmap). These are pinned onto case return +-- vectors to indicate the state of the stack for the garbage collector. +-- +-- In the compiled program, liveness bitmaps that fit inside a single +-- word (StgWord) are stored as a single word, while larger bitmaps are +-- stored as a pointer to an array of words. + +data Liveness + = SmallLiveness -- Liveness info that fits in one word + StgWord -- Here's the bitmap + + | BigLiveness -- Liveness info witha a multi-word bitmap + CLabel -- Label for the bitmap + + +------------------------- +-- StandardFormInfo tells whether this thunk has one of +-- a small number of standard forms + +data StandardFormInfo + = NonStandardThunk + -- Not of of the standard forms + + | SelectorThunk + -- A SelectorThunk is of form + -- case x of + -- con a1,..,an -> ak + -- and the constructor is from a single-constr type. + WordOff -- 0-origin offset of ak within the "goods" of + -- constructor (Recall that the a1,...,an may be laid + -- out in the heap in a non-obvious order.) + + | ApThunk + -- An ApThunk is of form + -- x1 ... xn + -- The code for the thunk just pushes x2..xn on the stack and enters x1. + -- There are a few of these (for 1 <= n <= MAX_SPEC_AP_SIZE) pre-compiled + -- in the RTS to save space. + Int -- Arity, n +\end{code} + +%************************************************************************ +%* * +\subsection[ClosureInfo-construction]{Functions which build LFInfos} +%* * +%************************************************************************ + +\begin{code} +mkLFReEntrant :: TopLevelFlag -- True of top level + -> [Id] -- Free vars + -> [Id] -- Args + -> ArgDescr -- Argument descriptor + -> LambdaFormInfo + +mkLFReEntrant top fvs args arg_descr + = LFReEntrant top (length args) (null fvs) arg_descr + +mkLFThunk thunk_ty top fvs upd_flag + = ASSERT( not (isUpdatable upd_flag) || not (isUnLiftedType thunk_ty) ) + LFThunk top (null fvs) + (isUpdatable upd_flag) + NonStandardThunk + (might_be_a_function thunk_ty) + +might_be_a_function :: Type -> Bool +might_be_a_function ty + | Just (tc,_) <- splitTyConApp_maybe (repType ty), + not (isFunTyCon tc) && not (isAbstractTyCon tc) = False + -- don't forget to check for abstract types, which might + -- be functions too. + | otherwise = True +\end{code} + +@mkConLFInfo@ is similar, for constructors. + +\begin{code} +mkConLFInfo :: DataCon -> LambdaFormInfo +mkConLFInfo con = LFCon con + +mkSelectorLFInfo id offset updatable + = LFThunk NotTopLevel False updatable (SelectorThunk offset) + (might_be_a_function (idType id)) + +mkApLFInfo id upd_flag arity + = LFThunk NotTopLevel (arity == 0) (isUpdatable upd_flag) (ApThunk arity) + (might_be_a_function (idType id)) +\end{code} + +Miscellaneous LF-infos. 
+ +\begin{code} +mkLFArgument id = LFUnknown (might_be_a_function (idType id)) + +mkLFLetNoEscape = LFLetNoEscape + +mkLFImported :: Id -> LambdaFormInfo +mkLFImported id + = case idArity id of + n | n > 0 -> LFReEntrant TopLevel n True (panic "arg_descr") -- n > 0 + other -> mkLFArgument id -- Not sure of exact arity +\end{code} + +\begin{code} +isLFThunk :: LambdaFormInfo -> Bool +isLFThunk (LFThunk _ _ _ _ _) = True +isLFThunk (LFBlackHole _) = True + -- return True for a blackhole: this function is used to determine + -- whether to use the thunk header in SMP mode, and a blackhole + -- must have one. +isLFThunk _ = False +\end{code} + +%************************************************************************ +%* * + Building ClosureInfos +%* * +%************************************************************************ + +\begin{code} +mkClosureInfo :: Bool -- Is static + -> Id + -> LambdaFormInfo + -> Int -> Int -- Total and pointer words + -> C_SRT + -> String -- String descriptor + -> ClosureInfo +mkClosureInfo is_static id lf_info tot_wds ptr_wds srt_info descr + = ClosureInfo { closureName = name, + closureLFInfo = lf_info, + closureSMRep = sm_rep, + closureSRT = srt_info, + closureType = idType id, + closureDescr = descr } + where + name = idName id + sm_rep = chooseSMRep is_static lf_info tot_wds ptr_wds + +mkConInfo :: HomeModules + -> Bool -- Is static + -> DataCon + -> Int -> Int -- Total and pointer words + -> ClosureInfo +mkConInfo hmods is_static data_con tot_wds ptr_wds + = ConInfo { closureSMRep = sm_rep, + closureCon = data_con, + closureDllCon = isDllName hmods (dataConName data_con) } + where + sm_rep = chooseSMRep is_static (mkConLFInfo data_con) tot_wds ptr_wds +\end{code} + +%************************************************************************ +%* * +\subsection[ClosureInfo-sizes]{Functions about closure {\em sizes}} +%* * +%************************************************************************ + +\begin{code} +closureSize :: ClosureInfo -> WordOff +closureSize cl_info = hdr_size + closureNonHdrSize cl_info + where hdr_size | closureIsThunk cl_info = thunkHdrSize + | otherwise = fixedHdrSize + -- All thunks use thunkHdrSize, even if they are non-updatable. + -- this is because we don't have separate closure types for + -- updatable vs. non-updatable thunks, so the GC can't tell the + -- difference. If we ever have significant numbers of non- + -- updatable thunks, it might be worth fixing this. + +closureNonHdrSize :: ClosureInfo -> WordOff +closureNonHdrSize cl_info + = tot_wds + computeSlopSize tot_wds cl_info + where + tot_wds = closureGoodStuffSize cl_info + +closureGoodStuffSize :: ClosureInfo -> WordOff +closureGoodStuffSize cl_info + = let (ptrs, nonptrs) = sizes_from_SMRep (closureSMRep cl_info) + in ptrs + nonptrs + +closurePtrsSize :: ClosureInfo -> WordOff +closurePtrsSize cl_info + = let (ptrs, _) = sizes_from_SMRep (closureSMRep cl_info) + in ptrs + +-- not exported: +sizes_from_SMRep :: SMRep -> (WordOff,WordOff) +sizes_from_SMRep (GenericRep _ ptrs nonptrs _) = (ptrs, nonptrs) +sizes_from_SMRep BlackHoleRep = (0, 0) +\end{code} + +Computing slop size. WARNING: this looks dodgy --- it has deep +knowledge of what the storage manager does with the various +representations... + +Slop Requirements: every thunk gets an extra padding word in the +header, which takes the the updated value. 
+ +\begin{code} +slopSize cl_info = computeSlopSize payload_size cl_info + where payload_size = closureGoodStuffSize cl_info + +computeSlopSize :: WordOff -> ClosureInfo -> WordOff +computeSlopSize payload_size cl_info + = max 0 (minPayloadSize smrep updatable - payload_size) + where + smrep = closureSMRep cl_info + updatable = closureNeedsUpdSpace cl_info + +-- we leave space for an update if either (a) the closure is updatable +-- or (b) it is a static thunk. This is because a static thunk needs +-- a static link field in a predictable place (after the slop), regardless +-- of whether it is updatable or not. +closureNeedsUpdSpace (ClosureInfo { closureLFInfo = + LFThunk TopLevel _ _ _ _ }) = True +closureNeedsUpdSpace cl_info = closureUpdReqd cl_info + +minPayloadSize :: SMRep -> Bool -> WordOff +minPayloadSize smrep updatable + = case smrep of + BlackHoleRep -> min_upd_size + GenericRep _ _ _ _ | updatable -> min_upd_size + GenericRep True _ _ _ -> 0 -- static + GenericRep False _ _ _ -> mIN_PAYLOAD_SIZE + -- ^^^^^___ dynamic + where + min_upd_size = + ASSERT(mIN_PAYLOAD_SIZE <= sIZEOF_StgSMPThunkHeader) + 0 -- check that we already have enough + -- room for mIN_SIZE_NonUpdHeapObject, + -- due to the extra header word in SMP +\end{code} + +%************************************************************************ +%* * +\subsection[SMreps]{Choosing SM reps} +%* * +%************************************************************************ + +\begin{code} +chooseSMRep + :: Bool -- True <=> static closure + -> LambdaFormInfo + -> WordOff -> WordOff -- Tot wds, ptr wds + -> SMRep + +chooseSMRep is_static lf_info tot_wds ptr_wds + = let + nonptr_wds = tot_wds - ptr_wds + closure_type = getClosureType is_static ptr_wds lf_info + in + GenericRep is_static ptr_wds nonptr_wds closure_type + +-- We *do* get non-updatable top-level thunks sometimes. eg. f = g +-- gets compiled to a jump to g (if g has non-zero arity), instead of +-- messing around with update frames and PAPs. We set the closure type +-- to FUN_STATIC in this case. + +getClosureType :: Bool -> WordOff -> LambdaFormInfo -> ClosureType +getClosureType is_static ptr_wds lf_info + = case lf_info of + LFCon con | is_static && ptr_wds == 0 -> ConstrNoCaf + | otherwise -> Constr + LFReEntrant _ _ _ _ -> Fun + LFThunk _ _ _ (SelectorThunk _) _ -> ThunkSelector + LFThunk _ _ _ _ _ -> Thunk + _ -> panic "getClosureType" +\end{code} + +%************************************************************************ +%* * +\subsection[ClosureInfo-4-questions]{Four major questions about @ClosureInfo@} +%* * +%************************************************************************ + +Be sure to see the stg-details notes about these... + +\begin{code} +nodeMustPointToIt :: LambdaFormInfo -> Bool +nodeMustPointToIt (LFReEntrant top _ no_fvs _) + = not no_fvs || -- Certainly if it has fvs we need to point to it + isNotTopLevel top + -- If it is not top level we will point to it + -- We can have a \r closure with no_fvs which + -- is not top level as special case cgRhsClosure + -- has been dissabled in favour of let floating + + -- For lex_profiling we also access the cost centre for a + -- non-inherited function i.e. not top level + -- the not top case above ensures this is ok. + +nodeMustPointToIt (LFCon _) = True + + -- Strictly speaking, the above two don't need Node to point + -- to it if the arity = 0. But this is a *really* unlikely + -- situation. If we know it's nil (say) and we are entering + -- it. 
Eg: let x = [] in x then we will certainly have inlined + -- x, since nil is a simple atom. So we gain little by not + -- having Node point to known zero-arity things. On the other + -- hand, we do lose something; Patrick's code for figuring out + -- when something has been updated but not entered relies on + -- having Node point to the result of an update. SLPJ + -- 27/11/92. + +nodeMustPointToIt (LFThunk _ no_fvs updatable NonStandardThunk _) + = updatable || not no_fvs || opt_SccProfilingOn + -- For the non-updatable (single-entry case): + -- + -- True if has fvs (in which case we need access to them, and we + -- should black-hole it) + -- or profiling (in which case we need to recover the cost centre + -- from inside it) + +nodeMustPointToIt (LFThunk _ no_fvs updatable some_standard_form_thunk _) + = True -- Node must point to any standard-form thunk + +nodeMustPointToIt (LFUnknown _) = True +nodeMustPointToIt (LFBlackHole _) = True -- BH entry may require Node to point +nodeMustPointToIt (LFLetNoEscape _) = False +\end{code} + +The entry conventions depend on the type of closure being entered, +whether or not it has free variables, and whether we're running +sequentially or in parallel. + +\begin{tabular}{lllll} +Closure Characteristics & Parallel & Node Req'd & Argument Passing & Enter Via \\ +Unknown & no & yes & stack & node \\ +Known fun ($\ge$ 1 arg), no fvs & no & no & registers & fast entry (enough args) \\ +\ & \ & \ & \ & slow entry (otherwise) \\ +Known fun ($\ge$ 1 arg), fvs & no & yes & registers & fast entry (enough args) \\ +0 arg, no fvs @\r,\s@ & no & no & n/a & direct entry \\ +0 arg, no fvs @\u@ & no & yes & n/a & node \\ +0 arg, fvs @\r,\s@ & no & yes & n/a & direct entry \\ +0 arg, fvs @\u@ & no & yes & n/a & node \\ + +Unknown & yes & yes & stack & node \\ +Known fun ($\ge$ 1 arg), no fvs & yes & no & registers & fast entry (enough args) \\ +\ & \ & \ & \ & slow entry (otherwise) \\ +Known fun ($\ge$ 1 arg), fvs & yes & yes & registers & node \\ +0 arg, no fvs @\r,\s@ & yes & no & n/a & direct entry \\ +0 arg, no fvs @\u@ & yes & yes & n/a & node \\ +0 arg, fvs @\r,\s@ & yes & yes & n/a & node \\ +0 arg, fvs @\u@ & yes & yes & n/a & node\\ +\end{tabular} + +When black-holing, single-entry closures could also be entered via node +(rather than directly) to catch double-entry. + +\begin{code} +data CallMethod + = EnterIt -- no args, not a function + + | JumpToIt CLabel -- no args, not a function, but we + -- know what its entry code is + + | ReturnIt -- it's a function, but we have + -- zero args to apply to it, so just + -- return it. + + | ReturnCon DataCon -- It's a data constructor, just return it + + | SlowCall -- Unknown fun, or known fun with + -- too few args. + + | DirectEntry -- Jump directly, with args in regs + CLabel -- The code label + Int -- Its arity + +getCallMethod :: HomeModules + -> Name -- Function being applied + -> LambdaFormInfo -- Its info + -> Int -- Number of available arguments + -> CallMethod + +getCallMethod hmods name lf_info n_args + | nodeMustPointToIt lf_info && opt_Parallel + = -- If we're parallel, then we must always enter via node. + -- The reason is that the closure may have been + -- fetched since we allocated it. 
+ EnterIt + +getCallMethod hmods name (LFReEntrant _ arity _ _) n_args + | n_args == 0 = ASSERT( arity /= 0 ) + ReturnIt -- No args at all + | n_args < arity = SlowCall -- Not enough args + | otherwise = DirectEntry (enterIdLabel hmods name) arity + +getCallMethod hmods name (LFCon con) n_args + = ASSERT( n_args == 0 ) + ReturnCon con + +getCallMethod hmods name (LFThunk _ _ updatable std_form_info is_fun) n_args + | is_fun -- Must always "call" a function-typed + = SlowCall -- thing, cannot just enter it [in eval/apply, the entry code + -- is the fast-entry code] + + | updatable || opt_DoTickyProfiling -- to catch double entry + {- OLD: || opt_SMP + I decided to remove this, because in SMP mode it doesn't matter + if we enter the same thunk multiple times, so the optimisation + of jumping directly to the entry code is still valid. --SDM + -} + = ASSERT( n_args == 0 ) EnterIt + + | otherwise -- Jump direct to code for single-entry thunks + = ASSERT( n_args == 0 ) + JumpToIt (thunkEntryLabel hmods name std_form_info updatable) + +getCallMethod hmods name (LFUnknown True) n_args + = SlowCall -- might be a function + +getCallMethod hmods name (LFUnknown False) n_args + = ASSERT2 ( n_args == 0, ppr name <+> ppr n_args ) + EnterIt -- Not a function + +getCallMethod hmods name (LFBlackHole _) n_args + = SlowCall -- Presumably the black hole has by now + -- been updated, but we don't know with + -- what, so we slow call it + +getCallMethod hmods name (LFLetNoEscape 0) n_args + = JumpToIt (enterReturnPtLabel (nameUnique name)) + +getCallMethod hmods name (LFLetNoEscape arity) n_args + | n_args == arity = DirectEntry (enterReturnPtLabel (nameUnique name)) arity + | otherwise = pprPanic "let-no-escape: " (ppr name <+> ppr arity) + +blackHoleOnEntry :: ClosureInfo -> Bool +-- Static closures are never themselves black-holed. +-- Updatable ones will be overwritten with a CAFList cell, which points to a +-- black hole; +-- Single-entry ones have no fvs to plug, and we trust they don't form part +-- of a loop. + +blackHoleOnEntry ConInfo{} = False +blackHoleOnEntry (ClosureInfo { closureLFInfo = lf_info, closureSMRep = rep }) + | isStaticRep rep + = False -- Never black-hole a static closure + + | otherwise + = case lf_info of + LFReEntrant _ _ _ _ -> False + LFLetNoEscape _ -> False + LFThunk _ no_fvs updatable _ _ + -> if updatable + then not opt_OmitBlackHoling + else opt_DoTickyProfiling || not no_fvs + -- the former to catch double entry, + -- and the latter to plug space-leaks. KSW/SDM 1999-04. + + other -> panic "blackHoleOnEntry" -- Should never happen + +isStandardFormThunk :: LambdaFormInfo -> Bool +isStandardFormThunk (LFThunk _ _ _ (SelectorThunk _) _) = True +isStandardFormThunk (LFThunk _ _ _ (ApThunk _) _) = True +isStandardFormThunk other_lf_info = False + +isKnownFun :: LambdaFormInfo -> Bool +isKnownFun (LFReEntrant _ _ _ _) = True +isKnownFun (LFLetNoEscape _) = True +isKnownFun _ = False +\end{code} + +----------------------------------------------------------------------------- +SRT-related stuff + +\begin{code} +staticClosureNeedsLink :: ClosureInfo -> Bool +-- A static closure needs a link field to aid the GC when traversing +-- the static closure graph. But it only needs such a field if either +-- a) it has an SRT +-- b) it's a constructor with one or more pointer fields +-- In case (b), the constructor's fields themselves play the role +-- of the SRT. 
+staticClosureNeedsLink (ClosureInfo { closureSRT = srt }) + = needsSRT srt +staticClosureNeedsLink (ConInfo { closureSMRep = sm_rep, closureCon = con }) + = not (isNullaryRepDataCon con) && not_nocaf_constr + where + not_nocaf_constr = + case sm_rep of + GenericRep _ _ _ ConstrNoCaf -> False + _other -> True +\end{code} + +Avoiding generating entries and info tables +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +At present, for every function we generate all of the following, +just in case. But they aren't always all needed, as noted below: + +[NB1: all of this applies only to *functions*. Thunks always +have closure, info table, and entry code.] + +[NB2: All are needed if the function is *exported*, just to play safe.] + + +* Fast-entry code ALWAYS NEEDED + +* Slow-entry code + Needed iff (a) we have any un-saturated calls to the function + OR (b) the function is passed as an arg + OR (c) we're in the parallel world and the function has free vars + [Reason: in parallel world, we always enter functions + with free vars via the closure.] + +* The function closure + Needed iff (a) we have any un-saturated calls to the function + OR (b) the function is passed as an arg + OR (c) if the function has free vars (ie not top level) + + Why case (a) here? Because if the arg-satis check fails, + UpdatePAP stuffs a pointer to the function closure in the PAP. + [Could be changed; UpdatePAP could stuff in a code ptr instead, + but doesn't seem worth it.] + + [NB: these conditions imply that we might need the closure + without the slow-entry code. Here's how. + + f x y = let g w = ...x..y..w... + in + ...(g t)... + + Here we need a closure for g which contains x and y, + but since the calls are all saturated we just jump to the + fast entry point for g, with R1 pointing to the closure for g.] + + +* Standard info table + Needed iff (a) we have any un-saturated calls to the function + OR (b) the function is passed as an arg + OR (c) the function has free vars (ie not top level) + + NB. In the sequential world, (c) is only required so that the function closure has + an info table to point to, to keep the storage manager happy. + If (c) alone is true we could fake up an info table by choosing + one of a standard family of info tables, whose entry code just + bombs out. + + [NB In the parallel world (c) is needed regardless because + we enter functions with free vars via the closure.] + + If (c) is retained, then we'll sometimes generate an info table + (for storage mgr purposes) without slow-entry code. Then we need + to use an error label in the info table to substitute for the absent + slow entry code. 
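
Summarising the note: the decisions above boil down to three predicates over how a function is used. As a standalone sketch (the Usage record and all of the names below are invented for this summary, not definitions from this commit):

data Usage = Usage
  { unsatCalls  :: Bool   -- (a) there are un-saturated calls to the function
  , passedAsArg :: Bool   -- (b) the function is passed as an argument
  , hasFreeVars :: Bool   -- (c) the function has free variables (not top level)
  , parallel    :: Bool   -- compiling for the parallel world?
  }

needsSlowEntry, needsClosure, needsInfoTable :: Usage -> Bool
needsSlowEntry u = unsatCalls u || passedAsArg u || (parallel u && hasFreeVars u)
needsClosure   u = unsatCalls u || passedAsArg u || hasFreeVars u
needsInfoTable u = unsatCalls u || passedAsArg u || hasFreeVars u
-- Fast-entry code is always needed, so it gets no predicate here.
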
+ +\begin{code} +staticClosureRequired + :: Name + -> StgBinderInfo + -> LambdaFormInfo + -> Bool +staticClosureRequired binder bndr_info + (LFReEntrant top_level _ _ _) -- It's a function + = ASSERT( isTopLevel top_level ) + -- Assumption: it's a top-level, no-free-var binding + not (satCallsOnly bndr_info) + +staticClosureRequired binder other_binder_info other_lf_info = True +\end{code} + +%************************************************************************ +%* * +\subsection[ClosureInfo-misc-funs]{Misc functions about @ClosureInfo@, etc.} +%* * +%************************************************************************ + +\begin{code} + +isStaticClosure :: ClosureInfo -> Bool +isStaticClosure cl_info = isStaticRep (closureSMRep cl_info) + +closureUpdReqd :: ClosureInfo -> Bool +closureUpdReqd ClosureInfo{ closureLFInfo = lf_info } = lfUpdatable lf_info +closureUpdReqd ConInfo{} = False + +lfUpdatable :: LambdaFormInfo -> Bool +lfUpdatable (LFThunk _ _ upd _ _) = upd +lfUpdatable (LFBlackHole _) = True + -- Black-hole closures are allocated to receive the results of an + -- alg case with a named default... so they need to be updated. +lfUpdatable _ = False + +closureIsThunk :: ClosureInfo -> Bool +closureIsThunk ClosureInfo{ closureLFInfo = lf_info } = isLFThunk lf_info +closureIsThunk ConInfo{} = False + +closureSingleEntry :: ClosureInfo -> Bool +closureSingleEntry (ClosureInfo { closureLFInfo = LFThunk _ _ upd _ _}) = not upd +closureSingleEntry other_closure = False + +closureReEntrant :: ClosureInfo -> Bool +closureReEntrant (ClosureInfo { closureLFInfo = LFReEntrant _ _ _ _ }) = True +closureReEntrant other_closure = False + +isConstrClosure_maybe :: ClosureInfo -> Maybe DataCon +isConstrClosure_maybe (ConInfo { closureCon = data_con }) = Just data_con +isConstrClosure_maybe _ = Nothing + +closureFunInfo :: ClosureInfo -> Maybe (Int, ArgDescr) +closureFunInfo (ClosureInfo { closureLFInfo = LFReEntrant _ arity _ arg_desc}) + = Just (arity, arg_desc) +closureFunInfo _ + = Nothing +\end{code} + +\begin{code} +isToplevClosure :: ClosureInfo -> Bool +isToplevClosure (ClosureInfo { closureLFInfo = lf_info }) + = case lf_info of + LFReEntrant TopLevel _ _ _ -> True + LFThunk TopLevel _ _ _ _ -> True + other -> False +isToplevClosure _ = False +\end{code} + +Label generation. + +\begin{code} +infoTableLabelFromCI :: ClosureInfo -> CLabel +infoTableLabelFromCI (ClosureInfo { closureName = name, + closureLFInfo = lf_info, + closureSMRep = rep }) + = case lf_info of + LFBlackHole info -> info + + LFThunk _ _ upd_flag (SelectorThunk offset) _ -> + mkSelectorInfoLabel upd_flag offset + + LFThunk _ _ upd_flag (ApThunk arity) _ -> + mkApInfoTableLabel upd_flag arity + + LFThunk{} -> mkLocalInfoTableLabel name + + LFReEntrant _ _ _ _ -> mkLocalInfoTableLabel name + + other -> panic "infoTableLabelFromCI" + +infoTableLabelFromCI (ConInfo { closureCon = con, + closureSMRep = rep, + closureDllCon = dll }) + | isStaticRep rep = mkStaticInfoTableLabel name dll + | otherwise = mkConInfoTableLabel name dll + where + name = dataConName con + +-- ClosureInfo for a closure (as opposed to a constructor) is always local +closureLabelFromCI (ClosureInfo { closureName = nm }) = mkLocalClosureLabel nm +closureLabelFromCI _ = panic "closureLabelFromCI" + +-- thunkEntryLabel is a local help function, not exported. It's used from both +-- entryLabelFromCI and getCallMethod. 
+ +thunkEntryLabel hmods thunk_id (ApThunk arity) is_updatable + = enterApLabel is_updatable arity +thunkEntryLabel hmods thunk_id (SelectorThunk offset) upd_flag + = enterSelectorLabel upd_flag offset +thunkEntryLabel hmods thunk_id _ is_updatable + = enterIdLabel hmods thunk_id + +enterApLabel is_updatable arity + | tablesNextToCode = mkApInfoTableLabel is_updatable arity + | otherwise = mkApEntryLabel is_updatable arity + +enterSelectorLabel upd_flag offset + | tablesNextToCode = mkSelectorInfoLabel upd_flag offset + | otherwise = mkSelectorEntryLabel upd_flag offset + +enterIdLabel hmods id + | tablesNextToCode = mkInfoTableLabel hmods id + | otherwise = mkEntryLabel hmods id + +enterLocalIdLabel id + | tablesNextToCode = mkLocalInfoTableLabel id + | otherwise = mkLocalEntryLabel id + +enterReturnPtLabel name + | tablesNextToCode = mkReturnInfoLabel name + | otherwise = mkReturnPtLabel name +\end{code} + + +We need a black-hole closure info to pass to @allocDynClosure@ when we +want to allocate the black hole on entry to a CAF. These are the only +ways to build an LFBlackHole, maintaining the invariant that it really +is a black hole and not something else. + +\begin{code} +cafBlackHoleClosureInfo (ClosureInfo { closureName = nm, + closureType = ty }) + = ClosureInfo { closureName = nm, + closureLFInfo = LFBlackHole mkCAFBlackHoleInfoTableLabel, + closureSMRep = BlackHoleRep, + closureSRT = NoC_SRT, + closureType = ty, + closureDescr = "" } +cafBlackHoleClosureInfo _ = panic "cafBlackHoleClosureInfo" + +seCafBlackHoleClosureInfo (ClosureInfo { closureName = nm, + closureType = ty }) + = ClosureInfo { closureName = nm, + closureLFInfo = LFBlackHole mkSECAFBlackHoleInfoTableLabel, + closureSMRep = BlackHoleRep, + closureSRT = NoC_SRT, + closureType = ty, + closureDescr = "" } +seCafBlackHoleClosureInfo _ = panic "seCafBlackHoleClosureInfo" +\end{code} + +%************************************************************************ +%* * +\subsection[ClosureInfo-Profiling-funs]{Misc functions about for profiling info.} +%* * +%************************************************************************ + +Profiling requires two pieces of information to be determined for +each closure's info table --- description and type. + +The description is stored directly in the @CClosureInfoTable@ when the +info table is built. + +The type is determined from the type information stored with the @Id@ +in the closure info using @closureTypeDescr@. 
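
For a concrete feel for the descriptor strings that @getTyDescription@ (below) produces, here is a tiny standalone model; the ToyType syntax and the names are invented for illustration and are not part of this commit. A type variable prints as "*", a function type prints a leading '-' and then one '>' per argument, and a type constructor prints its name.

-- A toy model of the type-descriptor strings used for profiling.
data ToyType = TyVar String | TyCon String [ToyType] | Fun ToyType ToyType

describe :: ToyType -> String
describe (TyVar _)   = "*"
describe (TyCon n _) = n
describe (Fun _ res) = '-' : '>' : funResult res
  where
    funResult (Fun _ r) = '>' : funResult r
    funResult other     = describe other

-- describe (Fun (TyCon "Int" []) (Fun (TyCon "Bool" []) (TyCon "Int" [])))
--   == "->>Int"
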
+ +\begin{code} +closureValDescr, closureTypeDescr :: ClosureInfo -> String +closureValDescr (ClosureInfo {closureDescr = descr}) + = descr +closureValDescr (ConInfo {closureCon = con}) + = occNameString (getOccName con) + +closureTypeDescr (ClosureInfo { closureType = ty }) + = getTyDescription ty +closureTypeDescr (ConInfo { closureCon = data_con }) + = occNameString (getOccName (dataConTyCon data_con)) + +getTyDescription :: Type -> String +getTyDescription ty + = case (tcSplitSigmaTy ty) of { (_, _, tau_ty) -> + case tau_ty of + TyVarTy _ -> "*" + AppTy fun _ -> getTyDescription fun + FunTy _ res -> '-' : '>' : fun_result res + TyConApp tycon _ -> getOccString tycon + NoteTy (FTVNote _) ty -> getTyDescription ty + PredTy sty -> getPredTyDescription sty + ForAllTy _ ty -> getTyDescription ty + } + where + fun_result (FunTy _ res) = '>' : fun_result res + fun_result other = getTyDescription other + +getPredTyDescription (ClassP cl tys) = getOccString cl +getPredTyDescription (IParam ip ty) = getOccString (ipNameName ip) +\end{code} + + diff --git a/compiler/codeGen/ClosureInfo.lhs-boot b/compiler/codeGen/ClosureInfo.lhs-boot new file mode 100644 index 0000000000..b069905d3e --- /dev/null +++ b/compiler/codeGen/ClosureInfo.lhs-boot @@ -0,0 +1,6 @@ +\begin{code} +module ClosureInfo where + +data LambdaFormInfo +data ClosureInfo +\end{code}
\ No newline at end of file diff --git a/compiler/codeGen/CodeGen.lhs b/compiler/codeGen/CodeGen.lhs new file mode 100644 index 0000000000..e8d83a5a43 --- /dev/null +++ b/compiler/codeGen/CodeGen.lhs @@ -0,0 +1,343 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[CodeGen]{@CodeGen@: main module of the code generator} + +This module says how things get going at the top level. + +@codeGen@ is the interface to the outside world. The \tr{cgTop*} +functions drive the mangling of top-level bindings. + +%************************************************************************ +%* * +\subsection[codeGen-outside-interface]{The code generator's offering to the world} +%* * +%************************************************************************ + +\begin{code} +module CodeGen ( codeGen ) where + +#include "HsVersions.h" + +-- Kludge (??) so that CgExpr is reached via at least one non-SOURCE +-- import. Before, that wasn't the case, and CM therefore didn't +-- bother to compile it. +import CgExpr ( {-NOTHING!-} ) -- DO NOT DELETE THIS IMPORT +import CgProf +import CgMonad +import CgBindery ( CgIdInfo, addBindC, addBindsC, getCgIdInfo, + cgIdInfoId ) +import CgClosure ( cgTopRhsClosure ) +import CgCon ( cgTopRhsCon, cgTyCon ) +import CgUtils ( cmmRegOffW, emitRODataLits, cmmNeWord ) + +import CLabel +import Cmm +import CmmUtils ( zeroCLit, mkIntCLit, mkLblExpr ) +import PprCmm ( pprCmms ) +import MachOp ( wordRep, MachHint(..) ) + +import StgSyn +import PrelNames ( gHC_PRIM, rOOT_MAIN, mAIN, pREL_TOP_HANDLER ) +import Packages ( HomeModules ) +import DynFlags ( DynFlags(..), DynFlag(..), dopt ) +import StaticFlags ( opt_SccProfilingOn ) + +import HscTypes ( ForeignStubs(..), TypeEnv, typeEnvTyCons ) +import CostCentre ( CollectedCCs ) +import Id ( Id, idName, setIdName ) +import Name ( nameSrcLoc, nameOccName, nameUnique, isInternalName, mkExternalName ) +import OccName ( mkLocalOcc ) +import TyCon ( TyCon ) +import Module ( Module, mkModule ) +import ErrUtils ( dumpIfSet_dyn, showPass ) +import Panic ( assertPanic ) + +#ifdef DEBUG +import Outputable +#endif +\end{code} + +\begin{code} +codeGen :: DynFlags + -> HomeModules + -> Module + -> [TyCon] + -> ForeignStubs + -> [Module] -- directly-imported modules + -> CollectedCCs -- (Local/global) cost-centres needing declaring/registering. + -> [(StgBinding,[(Id,[Id])])] -- Bindings to convert, with SRTs + -> IO [Cmm] -- Output + +codeGen dflags hmods this_mod data_tycons foreign_stubs imported_mods + cost_centre_info stg_binds + = do + { showPass dflags "CodeGen" + ; let way = buildTag dflags + main_mod = mainModIs dflags + +-- Why? +-- ; mapM_ (\x -> seq x (return ())) data_tycons + + ; code_stuff <- initC dflags hmods this_mod $ do + { cmm_binds <- mapM (getCmm . 
cgTopBinding dflags hmods) stg_binds + ; cmm_tycons <- mapM cgTyCon data_tycons + ; cmm_init <- getCmm (mkModuleInit dflags hmods way cost_centre_info + this_mod main_mod + foreign_stubs imported_mods) + ; return (cmm_binds ++ concat cmm_tycons ++ [cmm_init]) + } + -- Put datatype_stuff after code_stuff, because the + -- datatype closure table (for enumeration types) to + -- (say) PrelBase_True_closure, which is defined in + -- code_stuff + + ; dumpIfSet_dyn dflags Opt_D_dump_cmm "Cmm" (pprCmms code_stuff) + + ; return code_stuff } +\end{code} + +%************************************************************************ +%* * +\subsection[codegen-init]{Module initialisation code} +%* * +%************************************************************************ + +/* ----------------------------------------------------------------------------- + Module initialisation + + The module initialisation code looks like this, roughly: + + FN(__stginit_Foo) { + JMP_(__stginit_Foo_1_p) + } + + FN(__stginit_Foo_1_p) { + ... + } + + We have one version of the init code with a module version and the + 'way' attached to it. The version number helps to catch cases + where modules are not compiled in dependency order before being + linked: if a module has been compiled since any modules which depend on + it, then the latter modules will refer to a different version in their + init blocks and a link error will ensue. + + The 'way' suffix helps to catch cases where modules compiled in different + ways are linked together (eg. profiled and non-profiled). + + We provide a plain, unadorned, version of the module init code + which just jumps to the version with the label and way attached. The + reason for this is that when using foreign exports, the caller of + startupHaskell() must supply the name of the init function for the "top" + module in the program, and we don't want to require that this name + has the version and way info appended to it. + -------------------------------------------------------------------------- */ + +We initialise the module tree by keeping a work-stack, + * pointed to by Sp + * that grows downward + * Sp points to the last occupied slot + + +\begin{code} +mkModuleInit + :: DynFlags + -> HomeModules + -> String -- the "way" + -> CollectedCCs -- cost centre info + -> Module + -> Module -- name of the Main module + -> ForeignStubs + -> [Module] + -> Code +mkModuleInit dflags hmods way cost_centre_info this_mod main_mod foreign_stubs imported_mods + = do { + if opt_SccProfilingOn + then do { -- Allocate the static boolean that records if this + -- module has been registered already + emitData Data [CmmDataLabel moduleRegdLabel, + CmmStaticLit zeroCLit] + + ; emitSimpleProc real_init_lbl $ do + { ret_blk <- forkLabelledCode ret_code + + ; init_blk <- forkLabelledCode $ do + { mod_init_code; stmtC (CmmBranch ret_blk) } + + ; stmtC (CmmCondBranch (cmmNeWord (CmmLit zeroCLit) mod_reg_val) + ret_blk) + ; stmtC (CmmBranch init_blk) + } + } + else emitSimpleProc real_init_lbl ret_code + + -- Make the "plain" procedure jump to the "real" init procedure + ; emitSimpleProc plain_init_lbl jump_to_init + + -- When compiling the module in which the 'main' function lives, + -- (that is, this_mod == main_mod) + -- we inject an extra stg_init procedure for stg_init_ZCMain, for the + -- RTS to invoke. 
We must consult the -main-is flag in case the + -- user specified a different function to Main.main + ; whenC (this_mod == main_mod) + (emitSimpleProc plain_main_init_lbl jump_to_init) + } + where + plain_init_lbl = mkPlainModuleInitLabel hmods this_mod + real_init_lbl = mkModuleInitLabel hmods this_mod way + plain_main_init_lbl = mkPlainModuleInitLabel hmods rOOT_MAIN + + jump_to_init = stmtC (CmmJump (mkLblExpr real_init_lbl) []) + + mod_reg_val = CmmLoad (mkLblExpr moduleRegdLabel) wordRep + + -- Main refers to GHC.TopHandler.runIO, so make sure we call the + -- init function for GHC.TopHandler. + extra_imported_mods + | this_mod == main_mod = [pREL_TOP_HANDLER] + | otherwise = [] + + mod_init_code = do + { -- Set mod_reg to 1 to record that we've been here + stmtC (CmmStore (mkLblExpr moduleRegdLabel) (CmmLit (mkIntCLit 1))) + + -- Now do local stuff + ; initCostCentres cost_centre_info + ; mapCs (registerModuleImport hmods way) + (imported_mods++extra_imported_mods) + } + + -- The return-code pops the work stack by + -- incrementing Sp, and then jumpd to the popped item + ret_code = stmtsC [ CmmAssign spReg (cmmRegOffW spReg 1) + , CmmJump (CmmLoad (cmmRegOffW spReg (-1)) wordRep) [] ] + +----------------------- +registerModuleImport :: HomeModules -> String -> Module -> Code +registerModuleImport hmods way mod + | mod == gHC_PRIM + = nopC + | otherwise -- Push the init procedure onto the work stack + = stmtsC [ CmmAssign spReg (cmmRegOffW spReg (-1)) + , CmmStore (CmmReg spReg) (mkLblExpr (mkModuleInitLabel hmods mod way)) ] +\end{code} + + + +Cost-centre profiling: Besides the usual stuff, we must produce +declarations for the cost-centres defined in this module; + +(The local cost-centres involved in this are passed into the +code-generator.) + +\begin{code} +initCostCentres :: CollectedCCs -> Code +-- Emit the declarations, and return code to register them +initCostCentres (local_CCs, ___extern_CCs, singleton_CCSs) + | not opt_SccProfilingOn = nopC + | otherwise + = do { mapM_ emitCostCentreDecl local_CCs + ; mapM_ emitCostCentreStackDecl singleton_CCSs + ; mapM_ emitRegisterCC local_CCs + ; mapM_ emitRegisterCCS singleton_CCSs + } +\end{code} + +%************************************************************************ +%* * +\subsection[codegen-top-bindings]{Converting top-level STG bindings} +%* * +%************************************************************************ + +@cgTopBinding@ is only used for top-level bindings, since they need +to be allocated statically (not in the heap) and need to be labelled. +No unboxed bindings can happen at top level. + +In the code below, the static bindings are accumulated in the +@MkCgState@, and transferred into the ``statics'' slot by @forkStatics@. +This is so that we can write the top level processing in a compositional +style, with the increasing static environment being plumbed as a state +variable. 
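
The recursive case below ties a knot: fixC feeds the final bindings back into the very computation that produces them, so every right-hand side is compiled against an environment that already contains the whole group. The same idea in a standalone sketch, using mfix over a lazy State monad (the names and the toy notion of "compiling" are invented for illustration; this is not the code generator's own monad):

import Control.Monad.Fix        (mfix)
import Control.Monad.State.Lazy (State, evalState, get, modify)

type Env = [(String, Int)]

-- "Compile" a recursive group: extend the environment with the final
-- results before they have been computed, then compute them.
compileRecGroup :: [(String, String)] -> State Env [(String, Int)]
compileRecGroup pairs = mfix $ \new_binds -> do
    modify (new_binds ++)
    mapM compileOne pairs
  where
    -- Only the *presence* of the dependency is consulted, never its value,
    -- so the circularity is benign.
    compileOne (name, dep) = do
        env <- get
        return (name, if any ((== dep) . fst) env then 1 else 0)

-- evalState (compileRecGroup [("f","g"), ("g","f")]) []
--   == [("f",1), ("g",1)]
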
+ +\begin{code} +cgTopBinding :: DynFlags -> HomeModules -> (StgBinding,[(Id,[Id])]) -> Code +cgTopBinding dflags hmods (StgNonRec id rhs, srts) + = do { id' <- maybeExternaliseId dflags id + ; mapM_ (mkSRT hmods [id']) srts + ; (id,info) <- cgTopRhs id' rhs + ; addBindC id info -- Add the *un-externalised* Id to the envt, + -- so we find it when we look up occurrences + } + +cgTopBinding dflags hmods (StgRec pairs, srts) + = do { let (bndrs, rhss) = unzip pairs + ; bndrs' <- mapFCs (maybeExternaliseId dflags) bndrs + ; let pairs' = zip bndrs' rhss + ; mapM_ (mkSRT hmods bndrs') srts + ; _new_binds <- fixC (\ new_binds -> do + { addBindsC new_binds + ; mapFCs ( \ (b,e) -> cgTopRhs b e ) pairs' }) + ; nopC } + +mkSRT :: HomeModules -> [Id] -> (Id,[Id]) -> Code +mkSRT hmods these (id,[]) = nopC +mkSRT hmods these (id,ids) + = do { ids <- mapFCs remap ids + ; id <- remap id + ; emitRODataLits (mkSRTLabel (idName id)) + (map (CmmLabel . mkClosureLabel hmods . idName) ids) + } + where + -- Sigh, better map all the ids against the environment in + -- case they've been externalised (see maybeExternaliseId below). + remap id = case filter (==id) these of + (id':_) -> returnFC id' + [] -> do { info <- getCgIdInfo id; return (cgIdInfoId info) } + +-- Urgh! I tried moving the forkStatics call from the rhss of cgTopRhs +-- to enclose the listFCs in cgTopBinding, but that tickled the +-- statics "error" call in initC. I DON'T UNDERSTAND WHY! + +cgTopRhs :: Id -> StgRhs -> FCode (Id, CgIdInfo) + -- The Id is passed along for setting up a binding... + -- It's already been externalised if necessary + +cgTopRhs bndr (StgRhsCon cc con args) + = forkStatics (cgTopRhsCon bndr con args) + +cgTopRhs bndr (StgRhsClosure cc bi fvs upd_flag srt args body) + = ASSERT(null fvs) -- There should be no free variables + setSRTLabel (mkSRTLabel (idName bndr)) $ + forkStatics (cgTopRhsClosure bndr cc bi srt upd_flag args body) +\end{code} + + +%************************************************************************ +%* * +\subsection{Stuff to support splitting} +%* * +%************************************************************************ + +If we're splitting the object, we need to externalise all the top-level names +(and then make sure we only use the externalised one in any C label we use +which refers to this name). + +\begin{code} +maybeExternaliseId :: DynFlags -> Id -> FCode Id +maybeExternaliseId dflags id + | dopt Opt_SplitObjs dflags, -- Externalise the name for -split-objs + isInternalName name = do { mod <- moduleName + ; returnFC (setIdName id (externalise mod)) } + | otherwise = returnFC id + where + externalise mod = mkExternalName uniq mod new_occ Nothing loc + name = idName id + uniq = nameUnique name + new_occ = mkLocalOcc uniq (nameOccName name) + loc = nameSrcLoc name + -- We want to conjure up a name that can't clash with any + -- existing name. So we generate + -- Mod_$L243foo + -- where 243 is the unique. +\end{code} diff --git a/compiler/codeGen/SMRep.lhs b/compiler/codeGen/SMRep.lhs new file mode 100644 index 0000000000..c807703b13 --- /dev/null +++ b/compiler/codeGen/SMRep.lhs @@ -0,0 +1,361 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[SMRep]{Storage manager representations of closure} + +This is here, rather than in ClosureInfo, just to keep nhc happy. +Other modules should access this info through ClosureInfo. 
+ +\begin{code} +module SMRep ( + -- Words and bytes + StgWord, StgHalfWord, + hALF_WORD_SIZE, hALF_WORD_SIZE_IN_BITS, + WordOff, ByteOff, + + -- Argument/return representations + CgRep(..), nonVoidArg, + argMachRep, primRepToCgRep, primRepHint, + isFollowableArg, isVoidArg, + isFloatingArg, isNonPtrArg, is64BitArg, + separateByPtrFollowness, + cgRepSizeW, cgRepSizeB, + retAddrSizeW, + + typeCgRep, idCgRep, tyConCgRep, typeHint, + + -- Closure repesentation + SMRep(..), ClosureType(..), + isStaticRep, + fixedHdrSize, arrWordsHdrSize, arrPtrsHdrSize, + profHdrSize, thunkHdrSize, + tablesNextToCode, + smRepClosureType, smRepClosureTypeInt, + + rET_SMALL, rET_VEC_SMALL, rET_BIG, rET_VEC_BIG + ) where + +#include "HsVersions.h" +#include "../includes/MachDeps.h" + +import Id ( Id, idType ) +import Type ( Type, typePrimRep, PrimRep(..) ) +import TyCon ( TyCon, tyConPrimRep ) +import MachOp-- ( MachRep(..), MachHint(..), wordRep ) +import StaticFlags ( opt_SccProfilingOn, opt_GranMacros, + opt_Unregisterised ) +import Constants +import Outputable + +import DATA_WORD +\end{code} + + +%************************************************************************ +%* * + Words and bytes +%* * +%************************************************************************ + +\begin{code} +type WordOff = Int -- Word offset, or word count +type ByteOff = Int -- Byte offset, or byte count +\end{code} + +StgWord is a type representing an StgWord on the target platform. + +\begin{code} +#if SIZEOF_HSWORD == 4 +type StgWord = Word32 +type StgHalfWord = Word16 +hALF_WORD_SIZE = 2 :: ByteOff +hALF_WORD_SIZE_IN_BITS = 16 :: Int +#elif SIZEOF_HSWORD == 8 +type StgWord = Word64 +type StgHalfWord = Word32 +hALF_WORD_SIZE = 4 :: ByteOff +hALF_WORD_SIZE_IN_BITS = 32 :: Int +#else +#error unknown SIZEOF_HSWORD +#endif +\end{code} + + +%************************************************************************ +%* * + CgRep +%* * +%************************************************************************ + +An CgRep is an abstraction of a Type which tells the code generator +all it needs to know about the calling convention for arguments (and +results) of that type. In particular, the ArgReps of a function's +arguments are used to decide which of the RTS's generic apply +functions to call when applying an unknown function. + +It contains more information than the back-end data type MachRep, +so one can easily convert from CgRep -> MachRep. (Except that +there's no MachRep for a VoidRep.) + +It distinguishes + pointers from non-pointers (we sort the pointers together + when building closures) + + void from other types: a void argument is different from no argument + +All 64-bit types map to the same CgRep, because they're passed in the +same register, but a PtrArg is still different from an NonPtrArg +because the function's entry convention has to take into account the +pointer-hood of arguments for the purposes of describing the stack on +entry to the garbage collector. 
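As a standalone sketch of that idea (not GHC code: the ArgRep type and the "stg_ap_<tags>" naming below are illustrative stand-ins for CgRep and for the RTS's fixed family of generic apply entry points), classifying each argument by representation is enough to pick an apply entry for a call to an unknown function.

\begin{code}
-- Standalone sketch only; the names are invented for illustration.
data ArgRep = P | N | L | F | D | V   -- pointer, non-pointer, long, float, double, void
  deriving (Eq, Show)

argTag :: ArgRep -> String
argTag P = "p"
argTag N = "n"
argTag L = "l"
argTag F = "f"
argTag D = "d"
argTag V = "v"

-- Name of the generic-apply entry for a call with these argument reps
-- (void arguments contribute nothing to the calling convention).
genericApplyName :: [ArgRep] -> String
genericApplyName reps = "stg_ap_" ++ concatMap argTag (filter (/= V) reps)

-- e.g. genericApplyName [P, P] == "stg_ap_pp"
\end{code}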
+ +\begin{code} +data CgRep + = VoidArg -- Void + | PtrArg -- Word-sized Ptr + | NonPtrArg -- Word-sized non-pointer + | LongArg -- 64-bit non-pointer + | FloatArg -- 32-bit float + | DoubleArg -- 64-bit float + deriving Eq + +instance Outputable CgRep where + ppr VoidArg = ptext SLIT("V_") + ppr PtrArg = ptext SLIT("P_") + ppr NonPtrArg = ptext SLIT("I_") + ppr LongArg = ptext SLIT("L_") + ppr FloatArg = ptext SLIT("F_") + ppr DoubleArg = ptext SLIT("D_") + +argMachRep :: CgRep -> MachRep +argMachRep PtrArg = wordRep +argMachRep NonPtrArg = wordRep +argMachRep LongArg = I64 +argMachRep FloatArg = F32 +argMachRep DoubleArg = F64 +argMachRep VoidArg = panic "argMachRep:VoidRep" + +primRepToCgRep :: PrimRep -> CgRep +primRepToCgRep VoidRep = VoidArg +primRepToCgRep PtrRep = PtrArg +primRepToCgRep IntRep = NonPtrArg +primRepToCgRep WordRep = NonPtrArg +primRepToCgRep Int64Rep = LongArg +primRepToCgRep Word64Rep = LongArg +primRepToCgRep AddrRep = NonPtrArg +primRepToCgRep FloatRep = FloatArg +primRepToCgRep DoubleRep = DoubleArg + +primRepHint :: PrimRep -> MachHint +primRepHint VoidRep = panic "primRepHint:VoidRep" +primRepHint PtrRep = PtrHint +primRepHint IntRep = SignedHint +primRepHint WordRep = NoHint +primRepHint Int64Rep = SignedHint +primRepHint Word64Rep = NoHint +primRepHint AddrRep = PtrHint -- NB! PtrHint, but NonPtrArg +primRepHint FloatRep = FloatHint +primRepHint DoubleRep = FloatHint + +idCgRep :: Id -> CgRep +idCgRep = typeCgRep . idType + +tyConCgRep :: TyCon -> CgRep +tyConCgRep = primRepToCgRep . tyConPrimRep + +typeCgRep :: Type -> CgRep +typeCgRep = primRepToCgRep . typePrimRep + +typeHint :: Type -> MachHint +typeHint = primRepHint . typePrimRep +\end{code} + +Whether or not the thing is a pointer that the garbage-collector +should follow. Or, to put it another (less confusing) way, whether +the object in question is a heap object. + +Depending on the outcome, this predicate determines what stack +the pointer/object possibly will have to be saved onto, and the +computation of GC liveness info. + +\begin{code} +isFollowableArg :: CgRep -> Bool -- True <=> points to a heap object +isFollowableArg PtrArg = True +isFollowableArg other = False + +isVoidArg :: CgRep -> Bool +isVoidArg VoidArg = True +isVoidArg other = False + +nonVoidArg :: CgRep -> Bool +nonVoidArg VoidArg = False +nonVoidArg other = True + +-- isFloatingArg is used to distinguish @Double@ and @Float@ which +-- cause inadvertent numeric conversions if you aren't jolly careful. +-- See codeGen/CgCon:cgTopRhsCon. + +isFloatingArg :: CgRep -> Bool +isFloatingArg DoubleArg = True +isFloatingArg FloatArg = True +isFloatingArg _ = False + +isNonPtrArg :: CgRep -> Bool +-- Identify anything which is one word large and not a pointer. +isNonPtrArg NonPtrArg = True +isNonPtrArg other = False + +is64BitArg :: CgRep -> Bool +is64BitArg LongArg = True +is64BitArg _ = False +\end{code} + +\begin{code} +separateByPtrFollowness :: [(CgRep,a)] -> ([(CgRep,a)], [(CgRep,a)]) +-- Returns (ptrs, non-ptrs) +separateByPtrFollowness things + = sep_things things [] [] + -- accumulating params for follow-able and don't-follow things... 
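  -- For example (illustrative note), the split is stable and puts pointers first:
  --   separateByPtrFollowness [(PtrArg,"x"), (NonPtrArg,"y"), (PtrArg,"z")]
  --     == ([(PtrArg,"x"), (PtrArg,"z")], [(NonPtrArg,"y")])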
+ where + sep_things [] bs us = (reverse bs, reverse us) + sep_things ((PtrArg,a):ts) bs us = sep_things ts ((PtrArg,a):bs) us + sep_things (t :ts) bs us = sep_things ts bs (t:us) +\end{code} + +\begin{code} +cgRepSizeB :: CgRep -> ByteOff +cgRepSizeB DoubleArg = dOUBLE_SIZE +cgRepSizeB LongArg = wORD64_SIZE +cgRepSizeB VoidArg = 0 +cgRepSizeB _ = wORD_SIZE + +cgRepSizeW :: CgRep -> ByteOff +cgRepSizeW DoubleArg = dOUBLE_SIZE `quot` wORD_SIZE +cgRepSizeW LongArg = wORD64_SIZE `quot` wORD_SIZE +cgRepSizeW VoidArg = 0 +cgRepSizeW _ = 1 + +retAddrSizeW :: WordOff +retAddrSizeW = 1 -- One word +\end{code} + +%************************************************************************ +%* * +\subsubsection[SMRep-datatype]{@SMRep@---storage manager representation} +%* * +%************************************************************************ + +\begin{code} +data SMRep + -- static closure have an extra static link field at the end. + = GenericRep -- GC routines consult sizes in info tbl + Bool -- True <=> This is a static closure. Affects how + -- we garbage-collect it + !Int -- # ptr words + !Int -- # non-ptr words + ClosureType -- closure type + + | BlackHoleRep + +data ClosureType -- Corresponds 1-1 with the varieties of closures + -- implemented by the RTS. Compare with ghc/includes/ClosureTypes.h + = Constr + | ConstrNoCaf + | Fun + | Thunk + | ThunkSelector +\end{code} + +Size of a closure header. + +\begin{code} +fixedHdrSize :: WordOff +fixedHdrSize = sTD_HDR_SIZE + profHdrSize + granHdrSize + +profHdrSize :: WordOff +profHdrSize | opt_SccProfilingOn = pROF_HDR_SIZE + | otherwise = 0 + +granHdrSize :: WordOff +granHdrSize | opt_GranMacros = gRAN_HDR_SIZE + | otherwise = 0 + +arrWordsHdrSize :: ByteOff +arrWordsHdrSize = fixedHdrSize*wORD_SIZE + sIZEOF_StgArrWords_NoHdr + +arrPtrsHdrSize :: ByteOff +arrPtrsHdrSize = fixedHdrSize*wORD_SIZE + sIZEOF_StgMutArrPtrs_NoHdr + +-- Thunks have an extra header word on SMP, so the update doesn't +-- splat the payload. 
+thunkHdrSize :: WordOff +thunkHdrSize = fixedHdrSize + smp_hdr + where smp_hdr = sIZEOF_StgSMPThunkHeader `quot` wORD_SIZE +\end{code} + +\begin{code} +-- IA64 mangler doesn't place tables next to code +tablesNextToCode :: Bool +#if defined(ia64_TARGET_ARCH) || defined(powerpc64_TARGET_ARCH) +tablesNextToCode = False +#else +tablesNextToCode = not opt_Unregisterised +#endif +\end{code} + +\begin{code} +isStaticRep :: SMRep -> Bool +isStaticRep (GenericRep is_static _ _ _) = is_static +isStaticRep BlackHoleRep = False +\end{code} + +\begin{code} +#include "../includes/ClosureTypes.h" +-- Defines CONSTR, CONSTR_1_0 etc + + +smRepClosureType :: SMRep -> ClosureType +smRepClosureType (GenericRep _ _ _ ty) = ty +smRepClosureType BlackHoleRep = panic "smRepClosureType: black hole" + +smRepClosureTypeInt :: SMRep -> Int +smRepClosureTypeInt (GenericRep False 1 0 Constr) = CONSTR_1_0 +smRepClosureTypeInt (GenericRep False 0 1 Constr) = CONSTR_0_1 +smRepClosureTypeInt (GenericRep False 2 0 Constr) = CONSTR_2_0 +smRepClosureTypeInt (GenericRep False 1 1 Constr) = CONSTR_1_1 +smRepClosureTypeInt (GenericRep False 0 2 Constr) = CONSTR_0_2 +smRepClosureTypeInt (GenericRep False _ _ Constr) = CONSTR + +smRepClosureTypeInt (GenericRep False 1 0 Fun) = FUN_1_0 +smRepClosureTypeInt (GenericRep False 0 1 Fun) = FUN_0_1 +smRepClosureTypeInt (GenericRep False 2 0 Fun) = FUN_2_0 +smRepClosureTypeInt (GenericRep False 1 1 Fun) = FUN_1_1 +smRepClosureTypeInt (GenericRep False 0 2 Fun) = FUN_0_2 +smRepClosureTypeInt (GenericRep False _ _ Fun) = FUN + +smRepClosureTypeInt (GenericRep False 1 0 Thunk) = THUNK_1_0 +smRepClosureTypeInt (GenericRep False 0 1 Thunk) = THUNK_0_1 +smRepClosureTypeInt (GenericRep False 2 0 Thunk) = THUNK_2_0 +smRepClosureTypeInt (GenericRep False 1 1 Thunk) = THUNK_1_1 +smRepClosureTypeInt (GenericRep False 0 2 Thunk) = THUNK_0_2 +smRepClosureTypeInt (GenericRep False _ _ Thunk) = THUNK + +smRepClosureTypeInt (GenericRep False _ _ ThunkSelector) = THUNK_SELECTOR + +smRepClosureTypeInt (GenericRep True _ _ Constr) = CONSTR_STATIC +smRepClosureTypeInt (GenericRep True _ _ ConstrNoCaf) = CONSTR_NOCAF_STATIC +smRepClosureTypeInt (GenericRep True _ _ Fun) = FUN_STATIC +smRepClosureTypeInt (GenericRep True _ _ Thunk) = THUNK_STATIC + +smRepClosureTypeInt BlackHoleRep = BLACKHOLE + +smRepClosureTypeInt rep = panic "smRepClosuretypeint" + + +-- We export these ones +rET_SMALL = (RET_SMALL :: Int) +rET_VEC_SMALL = (RET_VEC_SMALL :: Int) +rET_BIG = (RET_BIG :: Int) +rET_VEC_BIG = (RET_VEC_BIG :: Int) +\end{code} + diff --git a/compiler/coreSyn/CoreFVs.lhs b/compiler/coreSyn/CoreFVs.lhs new file mode 100644 index 0000000000..fb6017eabf --- /dev/null +++ b/compiler/coreSyn/CoreFVs.lhs @@ -0,0 +1,415 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +Taken quite directly from the Peyton Jones/Lester paper. 
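A standalone miniature of that idea (not GHC code; the toy Expr and AnnExpr types below are invented for illustration): annotate every node, bottom-up, with the free variables of the subtree it roots, which is exactly the shape of the freeVars/CoreExprWithFVs machinery defined later in this module.

\begin{code}
-- Standalone sketch only; names are invented for illustration.
import qualified Data.Set as Set

type Name = String

data Expr = Var Name | App Expr Expr | Lam Name Expr

-- Every node paired with the free variables of the subtree it roots.
type AnnExpr = (Set.Set Name, AnnExpr')
data AnnExpr' = AVar Name | AApp AnnExpr AnnExpr | ALam Name AnnExpr

freeVarsOf :: AnnExpr -> Set.Set Name
freeVarsOf = fst

annotate :: Expr -> AnnExpr
annotate (Var v)   = (Set.singleton v, AVar v)
annotate (App f a) = (freeVarsOf f' `Set.union` freeVarsOf a', AApp f' a')
  where f' = annotate f
        a' = annotate a
annotate (Lam b e) = (Set.delete b (freeVarsOf e'), ALam b e')
  where e' = annotate e

-- e.g. freeVarsOf (annotate (Lam "x" (App (Var "f") (Var "x"))))
--        == Set.fromList ["f"]
\end{code}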
+ +\begin{code} +module CoreFVs ( + exprFreeVars, -- CoreExpr -> VarSet -- Find all locally-defined free Ids or tyvars + exprsFreeVars, -- [CoreExpr] -> VarSet + + exprSomeFreeVars, exprsSomeFreeVars, + exprFreeNames, exprsFreeNames, + + idRuleVars, idFreeVars, idFreeTyVars, + ruleRhsFreeVars, rulesRhsFreeVars, + ruleLhsFreeNames, ruleLhsFreeIds, + + CoreExprWithFVs, -- = AnnExpr Id VarSet + CoreBindWithFVs, -- = AnnBind Id VarSet + freeVars, -- CoreExpr -> CoreExprWithFVs + freeVarsOf -- CoreExprWithFVs -> IdSet + ) where + +#include "HsVersions.h" + +import CoreSyn +import Id ( Id, idType, idSpecialisation, isLocalId ) +import IdInfo ( specInfoFreeVars ) +import NameSet +import UniqFM ( delFromUFM ) +import Name ( isExternalName ) +import VarSet +import Var ( Var, isId, isLocalVar, varName ) +import Type ( tyVarsOfType ) +import TcType ( tyClsNamesOfType ) +import Util ( mapAndUnzip ) +import Outputable +\end{code} + + +%************************************************************************ +%* * +\section{Finding the free variables of an expression} +%* * +%************************************************************************ + +This function simply finds the free variables of an expression. +So far as type variables are concerned, it only finds tyvars that are + + * free in type arguments, + * free in the type of a binder, + +but not those that are free in the type of variable occurrence. + +\begin{code} +exprFreeVars :: CoreExpr -> VarSet -- Find all locally-defined free Ids or tyvars +exprFreeVars = exprSomeFreeVars isLocalVar + +exprsFreeVars :: [CoreExpr] -> VarSet +exprsFreeVars = foldr (unionVarSet . exprFreeVars) emptyVarSet + +exprSomeFreeVars :: InterestingVarFun -- Says which Vars are interesting + -> CoreExpr + -> VarSet +exprSomeFreeVars fv_cand e = expr_fvs e fv_cand emptyVarSet + +exprsSomeFreeVars :: InterestingVarFun -- Says which Vars are interesting + -> [CoreExpr] + -> VarSet +exprsSomeFreeVars fv_cand = foldr (unionVarSet . exprSomeFreeVars fv_cand) emptyVarSet + +type InterestingVarFun = Var -> Bool -- True <=> interesting +\end{code} + + +\begin{code} +type FV = InterestingVarFun + -> VarSet -- In scope + -> VarSet -- Free vars + +union :: FV -> FV -> FV +union fv1 fv2 fv_cand in_scope = fv1 fv_cand in_scope `unionVarSet` fv2 fv_cand in_scope + +noVars :: FV +noVars fv_cand in_scope = emptyVarSet + +-- Comment about obselete code +-- We used to gather the free variables the RULES at a variable occurrence +-- with the following cryptic comment: +-- "At a variable occurrence, add in any free variables of its rule rhss +-- Curiously, we gather the Id's free *type* variables from its binding +-- site, but its free *rule-rhs* variables from its usage sites. This +-- is a little weird. The reason is that the former is more efficient, +-- but the latter is more fine grained, and a makes a difference when +-- a variable mentions itself one of its own rule RHSs" +-- Not only is this "weird", but it's also pretty bad because it can make +-- a function seem more recursive than it is. Suppose +-- f = ...g... +-- g = ... +-- RULE g x = ...f... +-- Then f is not mentioned in its own RHS, and needn't be a loop breaker +-- (though g may be). But if we collect the rule fvs from g's occurrence, +-- it looks as if f mentions itself. (This bites in the eftInt/eftIntFB +-- code in GHC.Enum.) +-- +-- Anyway, it seems plain wrong. The RULE is like an extra RHS for the +-- function, so its free variables belong at the definition site. 
+-- +-- Deleted code looked like +-- foldVarSet add_rule_var var_itself_set (idRuleVars var) +-- add_rule_var var set | keep_it fv_cand in_scope var = extendVarSet set var +-- | otherwise = set +-- SLPJ Feb06 + +oneVar :: Id -> FV +oneVar var fv_cand in_scope + = ASSERT( isId var ) + if keep_it fv_cand in_scope var + then unitVarSet var + else emptyVarSet + +someVars :: VarSet -> FV +someVars vars fv_cand in_scope + = filterVarSet (keep_it fv_cand in_scope) vars + +keep_it fv_cand in_scope var + | var `elemVarSet` in_scope = False + | fv_cand var = True + | otherwise = False + + +addBndr :: CoreBndr -> FV -> FV +addBndr bndr fv fv_cand in_scope + | isId bndr = inside_fvs `unionVarSet` someVars (idFreeTyVars bndr) fv_cand in_scope + | otherwise = inside_fvs + where + inside_fvs = fv fv_cand (in_scope `extendVarSet` bndr) + +addBndrs :: [CoreBndr] -> FV -> FV +addBndrs bndrs fv = foldr addBndr fv bndrs +\end{code} + + +\begin{code} +expr_fvs :: CoreExpr -> FV + +expr_fvs (Type ty) = someVars (tyVarsOfType ty) +expr_fvs (Var var) = oneVar var +expr_fvs (Lit lit) = noVars +expr_fvs (Note _ expr) = expr_fvs expr +expr_fvs (App fun arg) = expr_fvs fun `union` expr_fvs arg +expr_fvs (Lam bndr body) = addBndr bndr (expr_fvs body) + +expr_fvs (Case scrut bndr ty alts) + = expr_fvs scrut `union` someVars (tyVarsOfType ty) `union` addBndr bndr + (foldr (union . alt_fvs) noVars alts) + where + alt_fvs (con, bndrs, rhs) = addBndrs bndrs (expr_fvs rhs) + +expr_fvs (Let (NonRec bndr rhs) body) + = rhs_fvs (bndr, rhs) `union` addBndr bndr (expr_fvs body) + +expr_fvs (Let (Rec pairs) body) + = addBndrs (map fst pairs) + (foldr (union . rhs_fvs) (expr_fvs body) pairs) + +--------- +rhs_fvs (bndr, rhs) = expr_fvs rhs `union` someVars (idRuleVars bndr) + -- Treat any RULES as extra RHSs of the binding + +--------- +exprs_fvs exprs = foldr (union . expr_fvs) noVars exprs +\end{code} + + +%************************************************************************ +%* * +\section{Free names} +%* * +%************************************************************************ + +exprFreeNames finds the free *external* *names* of an expression, notably +including the names of type constructors (which of course do not show +up in exprFreeVars). Similarly ruleLhsFreeNames. The latter is used +when deciding whether a rule is an orphan. In particular, suppose that +T is defined in this module; we want to avoid declaring that a rule like + fromIntegral T = fromIntegral_T +is an orphan. Of course it isn't, an declaring it an orphan would +make the whole module an orphan module, which is bad. + +There's no need to delete local binders, because they will all +be *internal* names. 
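As a small standalone sketch of that orphan test (not GHC code; isOrphanRule and the string-set representation are invented here): roughly, a rule is an orphan unless its LHS mentions at least one name defined in the module the rule lives in.

\begin{code}
-- Standalone sketch only.
import qualified Data.Set as Set

isOrphanRule :: Set.Set String    -- names defined in this module
             -> Set.Set String    -- free names of the rule's LHS (head function included)
             -> Bool
isOrphanRule definedHere lhsNames =
  Set.null (definedHere `Set.intersection` lhsNames)

-- e.g. with T defined in this module, the rule  fromIntegral T = fromIntegral_T
-- mentions T on its LHS, so:
--   isOrphanRule (Set.fromList ["T"]) (Set.fromList ["fromIntegral", "T"]) == False
\end{code}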
+ +\begin{code} +ruleLhsFreeNames :: CoreRule -> NameSet +ruleLhsFreeNames (BuiltinRule { ru_fn = fn }) = unitNameSet fn +ruleLhsFreeNames (Rule { ru_fn = fn, ru_bndrs = tpl_vars, ru_args = tpl_args }) + = addOneToNameSet (exprsFreeNames tpl_args) fn + +exprFreeNames :: CoreExpr -> NameSet +-- Find the free *external* names of an expression +exprFreeNames e + = go e + where + go (Var v) + | isExternalName n = unitNameSet n + | otherwise = emptyNameSet + where n = varName v + go (Lit _) = emptyNameSet + go (Type ty) = tyClsNamesOfType ty -- Don't need free tyvars + go (App e1 e2) = go e1 `unionNameSets` go e2 + go (Lam v e) = go e `delFromNameSet` varName v + go (Note n e) = go e + go (Let (NonRec b r) e) = go e `unionNameSets` go r + go (Let (Rec prs) e) = exprsFreeNames (map snd prs) `unionNameSets` go e + go (Case e b ty as) = go e `unionNameSets` tyClsNamesOfType ty + `unionNameSets` unionManyNameSets (map go_alt as) + + go_alt (_,_,r) = go r + +exprsFreeNames es = foldr (unionNameSets . exprFreeNames) emptyNameSet es +\end{code} + +%************************************************************************ +%* * +\section[freevars-everywhere]{Attaching free variables to every sub-expression} +%* * +%************************************************************************ + + +\begin{code} +ruleRhsFreeVars :: CoreRule -> VarSet +ruleRhsFreeVars (BuiltinRule {}) = noFVs +ruleRhsFreeVars (Rule { ru_fn = fn, ru_bndrs = bndrs, ru_rhs = rhs }) + = delFromUFM fvs fn + -- Hack alert! + -- Don't include the Id in its own rhs free-var set. + -- Otherwise the occurrence analyser makes bindings recursive + -- that shoudn't be. E.g. + -- RULE: f (f x y) z ==> f x (f y z) + where + fvs = addBndrs bndrs (expr_fvs rhs) isLocalVar emptyVarSet + +rulesRhsFreeVars :: [CoreRule] -> VarSet +rulesRhsFreeVars rules + = foldr (unionVarSet . ruleRhsFreeVars) emptyVarSet rules + +ruleLhsFreeIds :: CoreRule -> VarSet +-- This finds all locally-defined free Ids on the LHS of the rule +ruleLhsFreeIds (BuiltinRule {}) = noFVs +ruleLhsFreeIds (Rule { ru_bndrs = bndrs, ru_args = args }) + = addBndrs bndrs (exprs_fvs args) isLocalId emptyVarSet +\end{code} + + +%************************************************************************ +%* * +\section[freevars-everywhere]{Attaching free variables to every sub-expression} +%* * +%************************************************************************ + +The free variable pass annotates every node in the expression with its +NON-GLOBAL free variables and type variables. + +\begin{code} +type CoreBindWithFVs = AnnBind Id VarSet +type CoreExprWithFVs = AnnExpr Id VarSet + -- Every node annotated with its free variables, + -- both Ids and TyVars + +freeVarsOf :: CoreExprWithFVs -> IdSet +freeVarsOf (free_vars, _) = free_vars + +noFVs = emptyVarSet +aFreeVar = unitVarSet +unionFVs = unionVarSet + +delBindersFV :: [Var] -> VarSet -> VarSet +delBindersFV bs fvs = foldr delBinderFV fvs bs + +delBinderFV :: Var -> VarSet -> VarSet +-- This way round, so we can do it multiple times using foldr + +-- (b `delBinderFV` s) removes the binder b from the free variable set s, +-- but *adds* to s +-- (a) the free variables of b's type +-- (b) the idSpecVars of b +-- +-- This is really important for some lambdas: +-- In (\x::a -> x) the only mention of "a" is in the binder. +-- +-- Also in +-- let x::a = b in ... +-- we should really note that "a" is free in this expression. 
+-- It'll be pinned inside the /\a by the binding for b, but +-- it seems cleaner to make sure that a is in the free-var set +-- when it is mentioned. +-- +-- This also shows up in recursive bindings. Consider: +-- /\a -> letrec x::a = x in E +-- Now, there are no explicit free type variables in the RHS of x, +-- but nevertheless "a" is free in its definition. So we add in +-- the free tyvars of the types of the binders, and include these in the +-- free vars of the group, attached to the top level of each RHS. +-- +-- This actually happened in the defn of errorIO in IOBase.lhs: +-- errorIO (ST io) = case (errorIO# io) of +-- _ -> bottom +-- where +-- bottom = bottom -- Never evaluated + +delBinderFV b s | isId b = (s `delVarSet` b) `unionFVs` idFreeVars b + | otherwise = s `delVarSet` b + +idFreeVars :: Id -> VarSet +idFreeVars id = ASSERT( isId id) idRuleVars id `unionVarSet` idFreeTyVars id + +idFreeTyVars :: Id -> TyVarSet +-- Only local Ids conjured up locally, can have free type variables. +-- (During type checking top-level Ids can have free tyvars) +idFreeTyVars id = tyVarsOfType (idType id) +-- | isLocalId id = tyVarsOfType (idType id) +-- | otherwise = emptyVarSet + +idRuleVars ::Id -> VarSet +idRuleVars id = ASSERT( isId id) specInfoFreeVars (idSpecialisation id) +\end{code} + + +%************************************************************************ +%* * +\subsection{Free variables (and types)} +%* * +%************************************************************************ + +\begin{code} +freeVars :: CoreExpr -> CoreExprWithFVs + +freeVars (Var v) + = (fvs, AnnVar v) + where + -- ToDo: insert motivating example for why we *need* + -- to include the idSpecVars in the FV list. + -- Actually [June 98] I don't think it's necessary + -- fvs = fvs_v `unionVarSet` idSpecVars v + + fvs | isLocalVar v = aFreeVar v + | otherwise = noFVs + +freeVars (Lit lit) = (noFVs, AnnLit lit) +freeVars (Lam b body) + = (b `delBinderFV` freeVarsOf body', AnnLam b body') + where + body' = freeVars body + +freeVars (App fun arg) + = (freeVarsOf fun2 `unionFVs` freeVarsOf arg2, AnnApp fun2 arg2) + where + fun2 = freeVars fun + arg2 = freeVars arg + +freeVars (Case scrut bndr ty alts) +-- gaw 2004 + = ((bndr `delBinderFV` alts_fvs) `unionFVs` freeVarsOf scrut2 `unionFVs` tyVarsOfType ty, + AnnCase scrut2 bndr ty alts2) + where + scrut2 = freeVars scrut + + (alts_fvs_s, alts2) = mapAndUnzip fv_alt alts + alts_fvs = foldr1 unionFVs alts_fvs_s + + fv_alt (con,args,rhs) = (delBindersFV args (freeVarsOf rhs2), + (con, args, rhs2)) + where + rhs2 = freeVars rhs + +freeVars (Let (NonRec binder rhs) body) + = (freeVarsOf rhs2 `unionFVs` body_fvs, + AnnLet (AnnNonRec binder rhs2) body2) + where + rhs2 = freeVars rhs + body2 = freeVars body + body_fvs = binder `delBinderFV` freeVarsOf body2 + +freeVars (Let (Rec binds) body) + = (foldl delVarSet group_fvs binders, + -- The "delBinderFV" part may have added one of the binders + -- via the idSpecVars part, so we must delete it again + AnnLet (AnnRec (binders `zip` rhss2)) body2) + where + (binders, rhss) = unzip binds + + rhss2 = map freeVars rhss + all_fvs = foldr (unionFVs . 
fst) body_fvs rhss2 + group_fvs = delBindersFV binders all_fvs + + body2 = freeVars body + body_fvs = freeVarsOf body2 + +freeVars (Note (Coerce to_ty from_ty) expr) + = (freeVarsOf expr2 `unionFVs` tfvs1 `unionFVs` tfvs2, + AnnNote (Coerce to_ty from_ty) expr2) + where + expr2 = freeVars expr + tfvs1 = tyVarsOfType from_ty + tfvs2 = tyVarsOfType to_ty + +freeVars (Note other_note expr) + = (freeVarsOf expr2, AnnNote other_note expr2) + where + expr2 = freeVars expr + +freeVars (Type ty) = (tyVarsOfType ty, AnnType ty) +\end{code} + diff --git a/compiler/coreSyn/CoreLint.lhs b/compiler/coreSyn/CoreLint.lhs new file mode 100644 index 0000000000..dda8290bf4 --- /dev/null +++ b/compiler/coreSyn/CoreLint.lhs @@ -0,0 +1,785 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-1998 +% +\section[CoreLint]{A ``lint'' pass to check for Core correctness} + +\begin{code} +module CoreLint ( + lintCoreBindings, + lintUnfolding, + showPass, endPass + ) where + +#include "HsVersions.h" + +import CoreSyn +import CoreFVs ( idFreeVars ) +import CoreUtils ( findDefault, exprOkForSpeculation, coreBindsSize ) +import Unify ( coreRefineTys ) +import Bag +import Literal ( literalType ) +import DataCon ( dataConRepType, isVanillaDataCon, dataConTyCon, dataConWorkId ) +import TysWiredIn ( tupleCon ) +import Var ( Var, Id, TyVar, idType, tyVarKind, mustHaveLocalBinding ) +import VarSet +import Name ( getSrcLoc ) +import PprCore +import ErrUtils ( dumpIfSet_core, ghcExit, Message, showPass, + mkLocMessage, debugTraceMsg ) +import SrcLoc ( SrcLoc, noSrcLoc, mkSrcSpan ) +import Type ( Type, tyVarsOfType, coreEqType, + splitFunTy_maybe, mkTyVarTys, + splitForAllTy_maybe, splitTyConApp_maybe, + isUnLiftedType, typeKind, mkForAllTy, mkFunTy, + isUnboxedTupleType, isSubKind, + substTyWith, emptyTvSubst, extendTvInScope, + TvSubst, TvSubstEnv, mkTvSubst, setTvSubstEnv, substTy, + extendTvSubst, composeTvSubst, isInScope, + getTvSubstEnv, getTvInScope ) +import TyCon ( isPrimTyCon ) +import BasicTypes ( RecFlag(..), Boxity(..), isNonRec ) +import StaticFlags ( opt_PprStyle_Debug ) +import DynFlags ( DynFlags, DynFlag(..), dopt ) +import Outputable + +#ifdef DEBUG +import Util ( notNull ) +#endif + +import Maybe + +\end{code} + +%************************************************************************ +%* * +\subsection{End pass} +%* * +%************************************************************************ + +@showPass@ and @endPass@ don't really belong here, but it makes a convenient +place for them. They print out stuff before and after core passes, +and do Core Lint when necessary. + +\begin{code} +endPass :: DynFlags -> String -> DynFlag -> [CoreBind] -> IO [CoreBind] +endPass dflags pass_name dump_flag binds + = do + -- Report result size if required + -- This has the side effect of forcing the intermediate to be evaluated + debugTraceMsg dflags 2 $ + (text " Result size =" <+> int (coreBindsSize binds)) + + -- Report verbosely, if required + dumpIfSet_core dflags dump_flag pass_name (pprCoreBindings binds) + + -- Type check + lintCoreBindings dflags pass_name binds + + return binds +\end{code} + + +%************************************************************************ +%* * +\subsection[lintCoreBindings]{@lintCoreBindings@: Top-level interface} +%* * +%************************************************************************ + +Checks that a set of core bindings is well-formed. The PprStyle and String +just control what we print in the event of an error. 
The Bool value +indicates whether we have done any specialisation yet (in which case we do +some extra checks). + +We check for + (a) type errors + (b) Out-of-scope type variables + (c) Out-of-scope local variables + (d) Ill-kinded types + +If we have done specialisation the we check that there are + (a) No top-level bindings of primitive (unboxed type) + +Outstanding issues: + + -- + -- Things are *not* OK if: + -- + -- * Unsaturated type app before specialisation has been done; + -- + -- * Oversaturated type app after specialisation (eta reduction + -- may well be happening...); + +\begin{code} +lintCoreBindings :: DynFlags -> String -> [CoreBind] -> IO () + +lintCoreBindings dflags whoDunnit binds + | not (dopt Opt_DoCoreLinting dflags) + = return () + +lintCoreBindings dflags whoDunnit binds + = case (initL (lint_binds binds)) of + Nothing -> showPass dflags ("Core Linted result of " ++ whoDunnit) + Just bad_news -> printDump (display bad_news) >> + ghcExit dflags 1 + where + -- Put all the top-level binders in scope at the start + -- This is because transformation rules can bring something + -- into use 'unexpectedly' + lint_binds binds = addInScopeVars (bindersOfBinds binds) $ + mapM lint_bind binds + + lint_bind (Rec prs) = mapM_ (lintSingleBinding Recursive) prs + lint_bind (NonRec bndr rhs) = lintSingleBinding NonRecursive (bndr,rhs) + + display bad_news + = vcat [ text ("*** Core Lint Errors: in result of " ++ whoDunnit ++ " ***"), + bad_news, + ptext SLIT("*** Offending Program ***"), + pprCoreBindings binds, + ptext SLIT("*** End of Offense ***") + ] +\end{code} + +%************************************************************************ +%* * +\subsection[lintUnfolding]{lintUnfolding} +%* * +%************************************************************************ + +We use this to check all unfoldings that come in from interfaces +(it is very painful to catch errors otherwise): + +\begin{code} +lintUnfolding :: SrcLoc + -> [Var] -- Treat these as in scope + -> CoreExpr + -> Maybe Message -- Nothing => OK + +lintUnfolding locn vars expr + = initL (addLoc (ImportedUnfolding locn) $ + addInScopeVars vars $ + lintCoreExpr expr) +\end{code} + +%************************************************************************ +%* * +\subsection[lintCoreBinding]{lintCoreBinding} +%* * +%************************************************************************ + +Check a core binding, returning the list of variables bound. + +\begin{code} +lintSingleBinding rec_flag (binder,rhs) + = addLoc (RhsOf binder) $ + -- Check the rhs + do { ty <- lintCoreExpr rhs + ; lintBinder binder -- Check match to RHS type + ; binder_ty <- applySubst binder_ty + ; checkTys binder_ty ty (mkRhsMsg binder ty) + -- Check (not isUnLiftedType) (also checks for bogus unboxed tuples) + ; checkL (not (isUnLiftedType binder_ty) + || (isNonRec rec_flag && exprOkForSpeculation rhs)) + (mkRhsPrimMsg binder rhs) + -- Check whether binder's specialisations contain any out-of-scope variables + ; mapM_ (checkBndrIdInScope binder) bndr_vars } + + -- We should check the unfolding, if any, but this is tricky because + -- the unfolding is a SimplifiableCoreExpr. Give up for now. 
+ where + binder_ty = idType binder + bndr_vars = varSetElems (idFreeVars binder) +\end{code} + +%************************************************************************ +%* * +\subsection[lintCoreExpr]{lintCoreExpr} +%* * +%************************************************************************ + +\begin{code} +type InType = Type -- Substitution not yet applied +type OutType = Type -- Substitution has been applied to this + +lintCoreExpr :: CoreExpr -> LintM OutType +-- The returned type has the substitution from the monad +-- already applied to it: +-- lintCoreExpr e subst = exprType (subst e) + +lintCoreExpr (Var var) + = do { checkIdInScope var + ; applySubst (idType var) } + +lintCoreExpr (Lit lit) + = return (literalType lit) + +lintCoreExpr (Note (Coerce to_ty from_ty) expr) + = do { expr_ty <- lintCoreExpr expr + ; to_ty <- lintTy to_ty + ; from_ty <- lintTy from_ty + ; checkTys from_ty expr_ty (mkCoerceErr from_ty expr_ty) + ; return to_ty } + +lintCoreExpr (Note other_note expr) + = lintCoreExpr expr + +lintCoreExpr (Let (NonRec bndr rhs) body) + = do { lintSingleBinding NonRecursive (bndr,rhs) + ; addLoc (BodyOfLetRec [bndr]) + (addInScopeVars [bndr] (lintCoreExpr body)) } + +lintCoreExpr (Let (Rec pairs) body) + = addInScopeVars bndrs $ + do { mapM (lintSingleBinding Recursive) pairs + ; addLoc (BodyOfLetRec bndrs) (lintCoreExpr body) } + where + bndrs = map fst pairs + +lintCoreExpr e@(App fun (Type ty)) +-- This is like 'let' for types +-- It's needed when dealing with desugarer output for GADTs. Consider +-- data T = forall a. T a (a->Int) Bool +-- f :: T -> ... -> +-- f (T x f True) = <e1> +-- f (T y g False) = <e2> +-- After desugaring we get +-- f t b = case t of +-- T a (x::a) (f::a->Int) (b:Bool) -> +-- case b of +-- True -> <e1> +-- False -> (/\b. let y=x; g=f in <e2>) a +-- And for a reason I now forget, the ...<e2>... can mention a; so +-- we want Lint to know that b=a. Ugh. +-- +-- I tried quite hard to make the necessity for this go away, by changing the +-- desugarer, but the fundamental problem is this: +-- +-- T a (x::a) (y::Int) -> let fail::a = ... +-- in (/\b. ...(case ... of +-- True -> x::b +-- False -> fail) +-- ) a +-- Now the inner case look as though it has incompatible branches. 
+ = addLoc (AnExpr e) $ + go fun [ty] + where + go (App fun (Type ty)) tys + = do { go fun (ty:tys) } + go (Lam tv body) (ty:tys) + = do { checkL (isTyVar tv) (mkKindErrMsg tv ty) -- Not quite accurate + ; ty' <- lintTy ty; + ; checkKinds tv ty' + -- Now extend the substitution so we + -- take advantage of it in the body + ; addInScopeVars [tv] $ + extendSubstL tv ty' $ + go body tys } + go fun tys + = do { fun_ty <- lintCoreExpr fun + ; lintCoreArgs fun_ty (map Type tys) } + +lintCoreExpr e@(App fun arg) + = do { fun_ty <- lintCoreExpr fun + ; addLoc (AnExpr e) $ + lintCoreArg fun_ty arg } + +lintCoreExpr (Lam var expr) + = addLoc (LambdaBodyOf var) $ + do { body_ty <- addInScopeVars [var] $ + lintCoreExpr expr + ; if isId var then do + { var_ty <- lintId var + ; return (mkFunTy var_ty body_ty) } + else + return (mkForAllTy var body_ty) + } + -- The applySubst is needed to apply the subst to var + +lintCoreExpr e@(Case scrut var alt_ty alts) = + -- Check the scrutinee + do { scrut_ty <- lintCoreExpr scrut + ; alt_ty <- lintTy alt_ty + ; var_ty <- lintTy (idType var) + -- Don't use lintId on var, because unboxed tuple is legitimate + + ; checkTys var_ty scrut_ty (mkScrutMsg var scrut_ty) + + -- If the binder is an unboxed tuple type, don't put it in scope + ; let vars = if (isUnboxedTupleType (idType var)) then [] else [var] + ; addInScopeVars vars $ + do { -- Check the alternatives + checkCaseAlts e scrut_ty alts + ; mapM (lintCoreAlt scrut_ty alt_ty) alts + ; return alt_ty } } + +lintCoreExpr e@(Type ty) + = addErrL (mkStrangeTyMsg e) +\end{code} + +%************************************************************************ +%* * +\subsection[lintCoreArgs]{lintCoreArgs} +%* * +%************************************************************************ + +The basic version of these functions checks that the argument is a +subtype of the required type, as one would expect. + +\begin{code} +lintCoreArgs :: Type -> [CoreArg] -> LintM Type +lintCoreArg :: Type -> CoreArg -> LintM Type +-- First argument has already had substitution applied to it +\end{code} + +\begin{code} +lintCoreArgs ty [] = return ty +lintCoreArgs ty (a : args) = + do { res <- lintCoreArg ty a + ; lintCoreArgs res args } + +lintCoreArg fun_ty a@(Type arg_ty) = + do { arg_ty <- lintTy arg_ty + ; lintTyApp fun_ty arg_ty } + +lintCoreArg fun_ty arg = + -- Make sure function type matches argument + do { arg_ty <- lintCoreExpr arg + ; let err = mkAppMsg fun_ty arg_ty arg + ; case splitFunTy_maybe fun_ty of + Just (arg,res) -> + do { checkTys arg arg_ty err + ; return res } + _ -> addErrL err } +\end{code} + +\begin{code} +-- Both args have had substitution applied +lintTyApp ty arg_ty + = case splitForAllTy_maybe ty of + Nothing -> addErrL (mkTyAppMsg ty arg_ty) + + Just (tyvar,body) + -> do { checkL (isTyVar tyvar) (mkTyAppMsg ty arg_ty) + ; checkKinds tyvar arg_ty + ; return (substTyWith [tyvar] [arg_ty] body) } + +lintTyApps fun_ty [] = return fun_ty + +lintTyApps fun_ty (arg_ty : arg_tys) = + do { fun_ty' <- lintTyApp fun_ty arg_ty + ; lintTyApps fun_ty' arg_tys } + +checkKinds tyvar arg_ty + -- Arg type might be boxed for a function with an uncommitted + -- tyvar; notably this is used so that we can give + -- error :: forall a:*. String -> a + -- and then apply it to both boxed and unboxed types. 
+ = checkL (argty_kind `isSubKind` tyvar_kind) + (mkKindErrMsg tyvar arg_ty) + where + tyvar_kind = tyVarKind tyvar + argty_kind = typeKind arg_ty +\end{code} + + +%************************************************************************ +%* * +\subsection[lintCoreAlts]{lintCoreAlts} +%* * +%************************************************************************ + +\begin{code} +checkCaseAlts :: CoreExpr -> OutType -> [CoreAlt] -> LintM () +-- a) Check that the alts are non-empty +-- b1) Check that the DEFAULT comes first, if it exists +-- b2) Check that the others are in increasing order +-- c) Check that there's a default for infinite types +-- NB: Algebraic cases are not necessarily exhaustive, because +-- the simplifer correctly eliminates case that can't +-- possibly match. + +checkCaseAlts e ty [] + = addErrL (mkNullAltsMsg e) + +checkCaseAlts e ty alts = + do { checkL (all non_deflt con_alts) (mkNonDefltMsg e) + ; checkL (increasing_tag con_alts) (mkNonIncreasingAltsMsg e) + ; checkL (isJust maybe_deflt || not is_infinite_ty) + (nonExhaustiveAltsMsg e) } + where + (con_alts, maybe_deflt) = findDefault alts + + -- Check that successive alternatives have increasing tags + increasing_tag (alt1 : rest@( alt2 : _)) = alt1 `ltAlt` alt2 && increasing_tag rest + increasing_tag other = True + + non_deflt (DEFAULT, _, _) = False + non_deflt alt = True + + is_infinite_ty = case splitTyConApp_maybe ty of + Nothing -> False + Just (tycon, tycon_arg_tys) -> isPrimTyCon tycon +\end{code} + +\begin{code} +checkAltExpr :: CoreExpr -> OutType -> LintM () +checkAltExpr expr ann_ty + = do { actual_ty <- lintCoreExpr expr + ; checkTys actual_ty ann_ty (mkCaseAltMsg expr actual_ty ann_ty) } + +lintCoreAlt :: OutType -- Type of scrutinee + -> OutType -- Type of the alternative + -> CoreAlt + -> LintM () + +lintCoreAlt scrut_ty alt_ty alt@(DEFAULT, args, rhs) = + do { checkL (null args) (mkDefaultArgsMsg args) + ; checkAltExpr rhs alt_ty } + +lintCoreAlt scrut_ty alt_ty alt@(LitAlt lit, args, rhs) = + do { checkL (null args) (mkDefaultArgsMsg args) + ; checkTys lit_ty scrut_ty (mkBadPatMsg lit_ty scrut_ty) + ; checkAltExpr rhs alt_ty } + where + lit_ty = literalType lit + +lintCoreAlt scrut_ty alt_ty alt@(DataAlt con, args, rhs) + | Just (tycon, tycon_arg_tys) <- splitTyConApp_maybe scrut_ty, + tycon == dataConTyCon con + = addLoc (CaseAlt alt) $ + addInScopeVars args $ -- Put the args in scope before lintBinder, + -- because the Ids mention the type variables + if isVanillaDataCon con then + do { addLoc (CasePat alt) $ do + { mapM lintBinder args + -- FIX! Add check that all args are Ids. + -- Check the pattern + -- Scrutinee type must be a tycon applicn; checked by caller + -- This code is remarkably compact considering what it does! + -- NB: args must be in scope here so that the lintCoreArgs line works. 
+ -- NB: relies on existential type args coming *after* ordinary type args + + ; con_type <- lintTyApps (dataConRepType con) tycon_arg_tys + -- Can just map Var as we know that this is a vanilla datacon + ; con_result_ty <- lintCoreArgs con_type (map Var args) + ; checkTys con_result_ty scrut_ty (mkBadPatMsg con_result_ty scrut_ty) + } + -- Check the RHS + ; checkAltExpr rhs alt_ty } + + else -- GADT + do { let (tvs,ids) = span isTyVar args + ; subst <- getTvSubst + ; let in_scope = getTvInScope subst + subst_env = getTvSubstEnv subst + ; case coreRefineTys con tvs scrut_ty of { + Nothing -> return () ; -- Alternative is dead code + Just (refine, _) -> updateTvSubstEnv (composeTvSubst in_scope refine subst_env) $ + do { addLoc (CasePat alt) $ do + { tvs' <- mapM lintTy (mkTyVarTys tvs) + ; con_type <- lintTyApps (dataConRepType con) tvs' + ; mapM lintBinder ids -- Lint Ids in the refined world + ; lintCoreArgs con_type (map Var ids) + } + + ; let refined_alt_ty = substTy (mkTvSubst in_scope refine) alt_ty + -- alt_ty is already an OutType, so don't re-apply + -- the current substitution. But we must apply the + -- refinement so that the check in checkAltExpr is ok + ; checkAltExpr rhs refined_alt_ty + } } } + + | otherwise -- Scrut-ty is wrong shape + = addErrL (mkBadAltMsg scrut_ty alt) +\end{code} + +%************************************************************************ +%* * +\subsection[lint-types]{Types} +%* * +%************************************************************************ + +\begin{code} +lintBinder :: Var -> LintM () +lintBinder var | isId var = lintId var >> return () + | otherwise = return () + +lintId :: Var -> LintM OutType +-- ToDo: lint its rules +lintId id + = do { checkL (not (isUnboxedTupleType (idType id))) + (mkUnboxedTupleMsg id) + -- No variable can be bound to an unboxed tuple. 
+ ; lintTy (idType id) } + +lintTy :: InType -> LintM OutType +-- Check the type, and apply the substitution to it +-- ToDo: check the kind structure of the type +lintTy ty + = do { ty' <- applySubst ty + ; mapM_ checkIdInScope (varSetElems (tyVarsOfType ty')) + ; return ty' } +\end{code} + + +%************************************************************************ +%* * +\subsection[lint-monad]{The Lint monad} +%* * +%************************************************************************ + +\begin{code} +newtype LintM a = + LintM { unLintM :: + [LintLocInfo] -> -- Locations + TvSubst -> -- Current type substitution; we also use this + -- to keep track of all the variables in scope, + -- both Ids and TyVars + Bag Message -> -- Error messages so far + (Maybe a, Bag Message) } -- Result and error messages (if any) + +instance Monad LintM where + return x = LintM (\ loc subst errs -> (Just x, errs)) + fail err = LintM (\ loc subst errs -> (Nothing, addErr subst errs (text err) loc)) + m >>= k = LintM (\ loc subst errs -> + let (res, errs') = unLintM m loc subst errs in + case res of + Just r -> unLintM (k r) loc subst errs' + Nothing -> (Nothing, errs')) + +data LintLocInfo + = RhsOf Id -- The variable bound + | LambdaBodyOf Id -- The lambda-binder + | BodyOfLetRec [Id] -- One of the binders + | CaseAlt CoreAlt -- Case alternative + | CasePat CoreAlt -- *Pattern* of the case alternative + | AnExpr CoreExpr -- Some expression + | ImportedUnfolding SrcLoc -- Some imported unfolding (ToDo: say which) +\end{code} + + +\begin{code} +initL :: LintM a -> Maybe Message {- errors -} +initL m + = case unLintM m [] emptyTvSubst emptyBag of + (_, errs) | isEmptyBag errs -> Nothing + | otherwise -> Just (vcat (punctuate (text "") (bagToList errs))) +\end{code} + +\begin{code} +checkL :: Bool -> Message -> LintM () +checkL True msg = return () +checkL False msg = addErrL msg + +addErrL :: Message -> LintM a +addErrL msg = LintM (\ loc subst errs -> (Nothing, addErr subst errs msg loc)) + +addErr :: TvSubst -> Bag Message -> Message -> [LintLocInfo] -> Bag Message +addErr subst errs_so_far msg locs + = ASSERT( notNull locs ) + errs_so_far `snocBag` mk_msg msg + where + (loc, cxt1) = dumpLoc (head locs) + cxts = [snd (dumpLoc loc) | loc <- locs] + context | opt_PprStyle_Debug = vcat (reverse cxts) $$ cxt1 $$ + ptext SLIT("Substitution:") <+> ppr subst + | otherwise = cxt1 + + mk_msg msg = mkLocMessage (mkSrcSpan loc loc) (context $$ msg) + +addLoc :: LintLocInfo -> LintM a -> LintM a +addLoc extra_loc m = + LintM (\ loc subst errs -> unLintM m (extra_loc:loc) subst errs) + +addInScopeVars :: [Var] -> LintM a -> LintM a +addInScopeVars vars m = + LintM (\ loc subst errs -> unLintM m loc (extendTvInScope subst vars) errs) + +updateTvSubstEnv :: TvSubstEnv -> LintM a -> LintM a +updateTvSubstEnv substenv m = + LintM (\ loc subst errs -> unLintM m loc (setTvSubstEnv subst substenv) errs) + +getTvSubst :: LintM TvSubst +getTvSubst = LintM (\ loc subst errs -> (Just subst, errs)) + +applySubst :: Type -> LintM Type +applySubst ty = do { subst <- getTvSubst; return (substTy subst ty) } + +extendSubstL :: TyVar -> Type -> LintM a -> LintM a +extendSubstL tv ty m + = LintM (\ loc subst errs -> unLintM m loc (extendTvSubst subst tv ty) errs) +\end{code} + +\begin{code} +checkIdInScope :: Var -> LintM () +checkIdInScope id + = do { checkL (not (id == oneTupleDataConId)) + (ptext SLIT("Illegal one-tuple")) + ; checkInScope (ptext SLIT("is out of scope")) id } + +oneTupleDataConId :: Id -- Should not happen 
+oneTupleDataConId = dataConWorkId (tupleCon Boxed 1) + +checkBndrIdInScope :: Var -> Var -> LintM () +checkBndrIdInScope binder id + = checkInScope msg id + where + msg = ptext SLIT("is out of scope inside info for") <+> + ppr binder + +checkInScope :: SDoc -> Var -> LintM () +checkInScope loc_msg var = + do { subst <- getTvSubst + ; checkL (not (mustHaveLocalBinding var) || (var `isInScope` subst)) + (hsep [ppr var, loc_msg]) } + +checkTys :: Type -> Type -> Message -> LintM () +-- check ty2 is subtype of ty1 (ie, has same structure but usage +-- annotations need only be consistent, not equal) +-- Assumes ty1,ty2 are have alrady had the substitution applied +checkTys ty1 ty2 msg = checkL (ty1 `coreEqType` ty2) msg +\end{code} + +%************************************************************************ +%* * +\subsection{Error messages} +%* * +%************************************************************************ + +\begin{code} +dumpLoc (RhsOf v) + = (getSrcLoc v, brackets (ptext SLIT("RHS of") <+> pp_binders [v])) + +dumpLoc (LambdaBodyOf b) + = (getSrcLoc b, brackets (ptext SLIT("in body of lambda with binder") <+> pp_binder b)) + +dumpLoc (BodyOfLetRec []) + = (noSrcLoc, brackets (ptext SLIT("In body of a letrec with no binders"))) + +dumpLoc (BodyOfLetRec bs@(_:_)) + = ( getSrcLoc (head bs), brackets (ptext SLIT("in body of letrec with binders") <+> pp_binders bs)) + +dumpLoc (AnExpr e) + = (noSrcLoc, text "In the expression:" <+> ppr e) + +dumpLoc (CaseAlt (con, args, rhs)) + = (noSrcLoc, text "In a case alternative:" <+> parens (ppr con <+> pp_binders args)) + +dumpLoc (CasePat (con, args, rhs)) + = (noSrcLoc, text "In the pattern of a case alternative:" <+> parens (ppr con <+> pp_binders args)) + +dumpLoc (ImportedUnfolding locn) + = (locn, brackets (ptext SLIT("in an imported unfolding"))) + +pp_binders :: [Var] -> SDoc +pp_binders bs = sep (punctuate comma (map pp_binder bs)) + +pp_binder :: Var -> SDoc +pp_binder b | isId b = hsep [ppr b, dcolon, ppr (idType b)] + | isTyVar b = hsep [ppr b, dcolon, ppr (tyVarKind b)] +\end{code} + +\begin{code} +------------------------------------------------------ +-- Messages for case expressions + +mkNullAltsMsg :: CoreExpr -> Message +mkNullAltsMsg e + = hang (text "Case expression with no alternatives:") + 4 (ppr e) + +mkDefaultArgsMsg :: [Var] -> Message +mkDefaultArgsMsg args + = hang (text "DEFAULT case with binders") + 4 (ppr args) + +mkCaseAltMsg :: CoreExpr -> Type -> Type -> Message +mkCaseAltMsg e ty1 ty2 + = hang (text "Type of case alternatives not the same as the annotation on case:") + 4 (vcat [ppr ty1, ppr ty2, ppr e]) + +mkScrutMsg :: Id -> Type -> Message +mkScrutMsg var scrut_ty + = vcat [text "Result binder in case doesn't match scrutinee:" <+> ppr var, + text "Result binder type:" <+> ppr (idType var), + text "Scrutinee type:" <+> ppr scrut_ty] + + +mkNonDefltMsg e + = hang (text "Case expression with DEFAULT not at the beginnning") 4 (ppr e) +mkNonIncreasingAltsMsg e + = hang (text "Case expression with badly-ordered alternatives") 4 (ppr e) + +nonExhaustiveAltsMsg :: CoreExpr -> Message +nonExhaustiveAltsMsg e + = hang (text "Case expression with non-exhaustive alternatives") 4 (ppr e) + +mkBadPatMsg :: Type -> Type -> Message +mkBadPatMsg con_result_ty scrut_ty + = vcat [ + text "In a case alternative, pattern result type doesn't match scrutinee type:", + text "Pattern result type:" <+> ppr con_result_ty, + text "Scrutinee type:" <+> ppr scrut_ty + ] + +mkBadAltMsg :: Type -> CoreAlt -> Message +mkBadAltMsg 
scrut_ty alt + = vcat [ text "Data alternative when scrutinee is not a tycon application", + text "Scrutinee type:" <+> ppr scrut_ty, + text "Alternative:" <+> pprCoreAlt alt ] + +------------------------------------------------------ +-- Other error messages + +mkAppMsg :: Type -> Type -> CoreExpr -> Message +mkAppMsg fun_ty arg_ty arg + = vcat [ptext SLIT("Argument value doesn't match argument type:"), + hang (ptext SLIT("Fun type:")) 4 (ppr fun_ty), + hang (ptext SLIT("Arg type:")) 4 (ppr arg_ty), + hang (ptext SLIT("Arg:")) 4 (ppr arg)] + +mkKindErrMsg :: TyVar -> Type -> Message +mkKindErrMsg tyvar arg_ty + = vcat [ptext SLIT("Kinds don't match in type application:"), + hang (ptext SLIT("Type variable:")) + 4 (ppr tyvar <+> dcolon <+> ppr (tyVarKind tyvar)), + hang (ptext SLIT("Arg type:")) + 4 (ppr arg_ty <+> dcolon <+> ppr (typeKind arg_ty))] + +mkTyAppMsg :: Type -> Type -> Message +mkTyAppMsg ty arg_ty + = vcat [text "Illegal type application:", + hang (ptext SLIT("Exp type:")) + 4 (ppr ty <+> dcolon <+> ppr (typeKind ty)), + hang (ptext SLIT("Arg type:")) + 4 (ppr arg_ty <+> dcolon <+> ppr (typeKind arg_ty))] + +mkRhsMsg :: Id -> Type -> Message +mkRhsMsg binder ty + = vcat + [hsep [ptext SLIT("The type of this binder doesn't match the type of its RHS:"), + ppr binder], + hsep [ptext SLIT("Binder's type:"), ppr (idType binder)], + hsep [ptext SLIT("Rhs type:"), ppr ty]] + +mkRhsPrimMsg :: Id -> CoreExpr -> Message +mkRhsPrimMsg binder rhs + = vcat [hsep [ptext SLIT("The type of this binder is primitive:"), + ppr binder], + hsep [ptext SLIT("Binder's type:"), ppr (idType binder)] + ] + +mkUnboxedTupleMsg :: Id -> Message +mkUnboxedTupleMsg binder + = vcat [hsep [ptext SLIT("A variable has unboxed tuple type:"), ppr binder], + hsep [ptext SLIT("Binder's type:"), ppr (idType binder)]] + +mkCoerceErr from_ty expr_ty + = vcat [ptext SLIT("From-type of Coerce differs from type of enclosed expression"), + ptext SLIT("From-type:") <+> ppr from_ty, + ptext SLIT("Type of enclosed expr:") <+> ppr expr_ty + ] + +mkStrangeTyMsg e + = ptext SLIT("Type where expression expected:") <+> ppr e +\end{code} diff --git a/compiler/coreSyn/CorePrep.lhs b/compiler/coreSyn/CorePrep.lhs new file mode 100644 index 0000000000..e5165f0ebe --- /dev/null +++ b/compiler/coreSyn/CorePrep.lhs @@ -0,0 +1,859 @@ +% +% (c) The University of Glasgow, 1994-2000 +% +\section{Core pass to saturate constructors and PrimOps} + +\begin{code} +module CorePrep ( + corePrepPgm, corePrepExpr + ) where + +#include "HsVersions.h" + +import CoreUtils( exprType, exprIsHNF, etaExpand, exprArity, exprOkForSpeculation ) +import CoreFVs ( exprFreeVars ) +import CoreLint ( endPass ) +import CoreSyn +import Type ( Type, applyTy, splitFunTy_maybe, + isUnLiftedType, isUnboxedTupleType, seqType ) +import TyCon ( TyCon, tyConDataCons ) +import NewDemand ( Demand, isStrictDmd, lazyDmd, StrictSig(..), DmdType(..) 
) +import Var ( Var, Id, setVarUnique ) +import VarSet +import VarEnv +import Id ( mkSysLocal, idType, idNewDemandInfo, idArity, setIdUnfolding, setIdType, + isFCallId, isGlobalId, + isLocalId, hasNoBinding, idNewStrictness, + isPrimOpId_maybe + ) +import DataCon ( isVanillaDataCon, dataConWorkId ) +import PrimOp ( PrimOp( DataToTagOp ) ) +import BasicTypes ( TopLevelFlag(..), isTopLevel, isNotTopLevel, + RecFlag(..), isNonRec + ) +import UniqSupply +import Maybes +import OrdList +import ErrUtils +import DynFlags +import Util ( listLengthCmp ) +import Outputable +\end{code} + +-- --------------------------------------------------------------------------- +-- Overview +-- --------------------------------------------------------------------------- + +The goal of this pass is to prepare for code generation. + +1. Saturate constructor and primop applications. + +2. Convert to A-normal form: + + * Use case for strict arguments: + f E ==> case E of x -> f x + (where f is strict) + + * Use let for non-trivial lazy arguments + f E ==> let x = E in f x + (were f is lazy and x is non-trivial) + +3. Similarly, convert any unboxed lets into cases. + [I'm experimenting with leaving 'ok-for-speculation' + rhss in let-form right up to this point.] + +4. Ensure that lambdas only occur as the RHS of a binding + (The code generator can't deal with anything else.) + +5. [Not any more; nuked Jun 2002] Do the seq/par munging. + +6. Clone all local Ids. + This means that all such Ids are unique, rather than the + weaker guarantee of no clashes which the simplifier provides. + And that is what the code generator needs. + + We don't clone TyVars. The code gen doesn't need that, + and doing so would be tiresome because then we'd need + to substitute in types. + + +7. Give each dynamic CCall occurrence a fresh unique; this is + rather like the cloning step above. + +8. Inject bindings for the "implicit" Ids: + * Constructor wrappers + * Constructor workers + * Record selectors + We want curried definitions for all of these in case they + aren't inlined by some caller. + +This is all done modulo type applications and abstractions, so that +when type erasure is done for conversion to STG, we don't end up with +any trivial or useless bindings. 
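As a standalone sketch of step 2 above (not GHC code: the toy Expr type and the fresh/anf functions are invented, every argument is treated as lazy here so the case-binding for strict arguments is omitted, and only variables and literals count as trivial), A-normalisation just names every non-trivial argument with a let so that arguments become atomic.

\begin{code}
-- Standalone sketch only; names are invented for illustration.
import Control.Monad.State (State, evalState, get, put)

data Expr
  = Var String
  | Lit Int
  | App Expr Expr
  | Let String Expr Expr
  deriving Show

-- A simple supply of fresh binder names.
fresh :: State Int String
fresh = do { n <- get; put (n + 1); return ("x" ++ show n) }

trivial :: Expr -> Bool
trivial (Var _) = True
trivial (Lit _) = True
trivial _       = False

-- Let-bind every non-trivial argument:  f E  ==>  let x = E in f x
anf :: Expr -> State Int Expr
anf (App f a) = do
  f' <- anf f
  a' <- anf a
  if trivial a'
    then return (App f' a')
    else do { v <- fresh; return (Let v a' (App f' (Var v))) }
anf (Let b r e) = Let b <$> anf r <*> anf e
anf e           = return e

-- e.g. evalState (anf (App (Var "f") (App (Var "g") (Lit 3)))) 0
-- gives   Let "x0" (App (Var "g") (Lit 3)) (App (Var "f") (Var "x0"))
\end{code}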
+ + + +-- ----------------------------------------------------------------------------- +-- Top level stuff +-- ----------------------------------------------------------------------------- + +\begin{code} +corePrepPgm :: DynFlags -> [CoreBind] -> [TyCon] -> IO [CoreBind] +corePrepPgm dflags binds data_tycons + = do showPass dflags "CorePrep" + us <- mkSplitUniqSupply 's' + + let implicit_binds = mkDataConWorkers data_tycons + -- NB: we must feed mkImplicitBinds through corePrep too + -- so that they are suitably cloned and eta-expanded + + binds_out = initUs_ us ( + corePrepTopBinds binds `thenUs` \ floats1 -> + corePrepTopBinds implicit_binds `thenUs` \ floats2 -> + returnUs (deFloatTop (floats1 `appendFloats` floats2)) + ) + + endPass dflags "CorePrep" Opt_D_dump_prep binds_out + return binds_out + +corePrepExpr :: DynFlags -> CoreExpr -> IO CoreExpr +corePrepExpr dflags expr + = do showPass dflags "CorePrep" + us <- mkSplitUniqSupply 's' + let new_expr = initUs_ us (corePrepAnExpr emptyCorePrepEnv expr) + dumpIfSet_dyn dflags Opt_D_dump_prep "CorePrep" + (ppr new_expr) + return new_expr +\end{code} + +-- ----------------------------------------------------------------------------- +-- Implicit bindings +-- ----------------------------------------------------------------------------- + +Create any necessary "implicit" bindings for data con workers. We +create the rather strange (non-recursive!) binding + + $wC = \x y -> $wC x y + +i.e. a curried constructor that allocates. This means that we can +treat the worker for a constructor like any other function in the rest +of the compiler. The point here is that CoreToStg will generate a +StgConApp for the RHS, rather than a call to the worker (which would +give a loop). As Lennart says: the ice is thin here, but it works. + +Hmm. Should we create bindings for dictionary constructors? They are +always fully applied, and the bindings are just there to support +partial applications. But it's easier to let them through. + +\begin{code} +mkDataConWorkers data_tycons + = [ NonRec id (Var id) -- The ice is thin here, but it works + | tycon <- data_tycons, -- CorePrep will eta-expand it + data_con <- tyConDataCons tycon, + let id = dataConWorkId data_con ] +\end{code} + + +\begin{code} +-- --------------------------------------------------------------------------- +-- Dealing with bindings +-- --------------------------------------------------------------------------- + +data FloatingBind = FloatLet CoreBind + | FloatCase Id CoreExpr Bool + -- The bool indicates "ok-for-speculation" + +data Floats = Floats OkToSpec (OrdList FloatingBind) + +-- Can we float these binds out of the rhs of a let? We cache this decision +-- to avoid having to recompute it in a non-linear way when there are +-- deeply nested lets. 
+data OkToSpec + = NotOkToSpec -- definitely not + | OkToSpec -- yes + | IfUnboxedOk -- only if floating an unboxed binding is ok + +emptyFloats :: Floats +emptyFloats = Floats OkToSpec nilOL + +addFloat :: Floats -> FloatingBind -> Floats +addFloat (Floats ok_to_spec floats) new_float + = Floats (combine ok_to_spec (check new_float)) (floats `snocOL` new_float) + where + check (FloatLet _) = OkToSpec + check (FloatCase _ _ ok_for_spec) + | ok_for_spec = IfUnboxedOk + | otherwise = NotOkToSpec + -- The ok-for-speculation flag says that it's safe to + -- float this Case out of a let, and thereby do it more eagerly + -- We need the top-level flag because it's never ok to float + -- an unboxed binding to the top level + +unitFloat :: FloatingBind -> Floats +unitFloat = addFloat emptyFloats + +appendFloats :: Floats -> Floats -> Floats +appendFloats (Floats spec1 floats1) (Floats spec2 floats2) + = Floats (combine spec1 spec2) (floats1 `appOL` floats2) + +concatFloats :: [Floats] -> Floats +concatFloats = foldr appendFloats emptyFloats + +combine NotOkToSpec _ = NotOkToSpec +combine _ NotOkToSpec = NotOkToSpec +combine IfUnboxedOk _ = IfUnboxedOk +combine _ IfUnboxedOk = IfUnboxedOk +combine _ _ = OkToSpec + +instance Outputable FloatingBind where + ppr (FloatLet bind) = text "FloatLet" <+> ppr bind + ppr (FloatCase b rhs spec) = text "FloatCase" <+> ppr b <+> ppr spec <+> equals <+> ppr rhs + +deFloatTop :: Floats -> [CoreBind] +-- For top level only; we don't expect any FloatCases +deFloatTop (Floats _ floats) + = foldrOL get [] floats + where + get (FloatLet b) bs = b:bs + get b bs = pprPanic "corePrepPgm" (ppr b) + +allLazy :: TopLevelFlag -> RecFlag -> Floats -> Bool +allLazy top_lvl is_rec (Floats ok_to_spec _) + = case ok_to_spec of + OkToSpec -> True + NotOkToSpec -> False + IfUnboxedOk -> isNotTopLevel top_lvl && isNonRec is_rec + +-- --------------------------------------------------------------------------- +-- Bindings +-- --------------------------------------------------------------------------- + +corePrepTopBinds :: [CoreBind] -> UniqSM Floats +corePrepTopBinds binds + = go emptyCorePrepEnv binds + where + go env [] = returnUs emptyFloats + go env (bind : binds) = corePrepTopBind env bind `thenUs` \ (env', bind') -> + go env' binds `thenUs` \ binds' -> + returnUs (bind' `appendFloats` binds') + +-- NB: we do need to float out of top-level bindings +-- Consider x = length [True,False] +-- We want to get +-- s1 = False : [] +-- s2 = True : s1 +-- x = length s2 + +-- We return a *list* of bindings, because we may start with +-- x* = f (g y) +-- where x is demanded, in which case we want to finish with +-- a = g y +-- x* = f a +-- And then x will actually end up case-bound +-- +-- What happens to the CafInfo on the floated bindings? By +-- default, all the CafInfos will be set to MayHaveCafRefs, +-- which is safe. +-- +-- This might be pessimistic, because eg. s1 & s2 +-- might not refer to any CAFs and the GC will end up doing +-- more traversal than is necessary, but it's still better +-- than not floating the bindings at all, because then +-- the GC would have to traverse the structure in the heap +-- instead. Given this, we decided not to try to get +-- the CafInfo on the floated bindings correct, because +-- it looks difficult. 
+ +-------------------------------- +corePrepTopBind :: CorePrepEnv -> CoreBind -> UniqSM (CorePrepEnv, Floats) +corePrepTopBind env (NonRec bndr rhs) + = cloneBndr env bndr `thenUs` \ (env', bndr') -> + corePrepRhs TopLevel NonRecursive env (bndr, rhs) `thenUs` \ (floats, rhs') -> + returnUs (env', addFloat floats (FloatLet (NonRec bndr' rhs'))) + +corePrepTopBind env (Rec pairs) = corePrepRecPairs TopLevel env pairs + +-------------------------------- +corePrepBind :: CorePrepEnv -> CoreBind -> UniqSM (CorePrepEnv, Floats) + -- This one is used for *local* bindings +corePrepBind env (NonRec bndr rhs) + = etaExpandRhs bndr rhs `thenUs` \ rhs1 -> + corePrepExprFloat env rhs1 `thenUs` \ (floats, rhs2) -> + cloneBndr env bndr `thenUs` \ (_, bndr') -> + mkLocalNonRec bndr' (bdrDem bndr) floats rhs2 `thenUs` \ (floats', bndr'') -> + -- We want bndr'' in the envt, because it records + -- the evaluated-ness of the binder + returnUs (extendCorePrepEnv env bndr bndr'', floats') + +corePrepBind env (Rec pairs) = corePrepRecPairs NotTopLevel env pairs + +-------------------------------- +corePrepRecPairs :: TopLevelFlag -> CorePrepEnv + -> [(Id,CoreExpr)] -- Recursive bindings + -> UniqSM (CorePrepEnv, Floats) +-- Used for all recursive bindings, top level and otherwise +corePrepRecPairs lvl env pairs + = cloneBndrs env (map fst pairs) `thenUs` \ (env', bndrs') -> + mapAndUnzipUs (corePrepRhs lvl Recursive env') pairs `thenUs` \ (floats_s, rhss') -> + returnUs (env', unitFloat (FloatLet (Rec (flatten (concatFloats floats_s) bndrs' rhss')))) + where + -- Flatten all the floats, and the currrent + -- group into a single giant Rec + flatten (Floats _ floats) bndrs rhss = foldrOL get (bndrs `zip` rhss) floats + + get (FloatLet (NonRec b r)) prs2 = (b,r) : prs2 + get (FloatLet (Rec prs1)) prs2 = prs1 ++ prs2 + get b prs2 = pprPanic "corePrepRecPairs" (ppr b) + +-------------------------------- +corePrepRhs :: TopLevelFlag -> RecFlag + -> CorePrepEnv -> (Id, CoreExpr) + -> UniqSM (Floats, CoreExpr) +-- Used for top-level bindings, and local recursive bindings +corePrepRhs top_lvl is_rec env (bndr, rhs) + = etaExpandRhs bndr rhs `thenUs` \ rhs' -> + corePrepExprFloat env rhs' `thenUs` \ floats_w_rhs -> + floatRhs top_lvl is_rec bndr floats_w_rhs + + +-- --------------------------------------------------------------------------- +-- Making arguments atomic (function args & constructor args) +-- --------------------------------------------------------------------------- + +-- This is where we arrange that a non-trivial argument is let-bound +corePrepArg :: CorePrepEnv -> CoreArg -> RhsDemand + -> UniqSM (Floats, CoreArg) +corePrepArg env arg dem + = corePrepExprFloat env arg `thenUs` \ (floats, arg') -> + if exprIsTrivial arg' + then returnUs (floats, arg') + else newVar (exprType arg') `thenUs` \ v -> + mkLocalNonRec v dem floats arg' `thenUs` \ (floats', v') -> + returnUs (floats', Var v') + +-- version that doesn't consider an scc annotation to be trivial. 
+exprIsTrivial (Var v) = True +exprIsTrivial (Type _) = True +exprIsTrivial (Lit lit) = True +exprIsTrivial (App e arg) = isTypeArg arg && exprIsTrivial e +exprIsTrivial (Note (SCC _) e) = False +exprIsTrivial (Note _ e) = exprIsTrivial e +exprIsTrivial (Lam b body) | isTyVar b = exprIsTrivial body +exprIsTrivial other = False + +-- --------------------------------------------------------------------------- +-- Dealing with expressions +-- --------------------------------------------------------------------------- + +corePrepAnExpr :: CorePrepEnv -> CoreExpr -> UniqSM CoreExpr +corePrepAnExpr env expr + = corePrepExprFloat env expr `thenUs` \ (floats, expr) -> + mkBinds floats expr + + +corePrepExprFloat :: CorePrepEnv -> CoreExpr -> UniqSM (Floats, CoreExpr) +-- If +-- e ===> (bs, e') +-- then +-- e = let bs in e' (semantically, that is!) +-- +-- For example +-- f (g x) ===> ([v = g x], f v) + +corePrepExprFloat env (Var v) + = fiddleCCall v `thenUs` \ v1 -> + let + v2 = lookupCorePrepEnv env v1 + in + maybeSaturate v2 (Var v2) 0 emptyFloats (idType v2) + +corePrepExprFloat env expr@(Type _) + = returnUs (emptyFloats, expr) + +corePrepExprFloat env expr@(Lit lit) + = returnUs (emptyFloats, expr) + +corePrepExprFloat env (Let bind body) + = corePrepBind env bind `thenUs` \ (env', new_binds) -> + corePrepExprFloat env' body `thenUs` \ (floats, new_body) -> + returnUs (new_binds `appendFloats` floats, new_body) + +corePrepExprFloat env (Note n@(SCC _) expr) + = corePrepAnExpr env expr `thenUs` \ expr1 -> + deLamFloat expr1 `thenUs` \ (floats, expr2) -> + returnUs (floats, Note n expr2) + +corePrepExprFloat env (Note other_note expr) + = corePrepExprFloat env expr `thenUs` \ (floats, expr') -> + returnUs (floats, Note other_note expr') + +corePrepExprFloat env expr@(Lam _ _) + = cloneBndrs env bndrs `thenUs` \ (env', bndrs') -> + corePrepAnExpr env' body `thenUs` \ body' -> + returnUs (emptyFloats, mkLams bndrs' body') + where + (bndrs,body) = collectBinders expr + +corePrepExprFloat env (Case scrut bndr ty alts) + = corePrepExprFloat env scrut `thenUs` \ (floats1, scrut1) -> + deLamFloat scrut1 `thenUs` \ (floats2, scrut2) -> + let + bndr1 = bndr `setIdUnfolding` evaldUnfolding + -- Record that the case binder is evaluated in the alternatives + in + cloneBndr env bndr1 `thenUs` \ (env', bndr2) -> + mapUs (sat_alt env') alts `thenUs` \ alts' -> + returnUs (floats1 `appendFloats` floats2 , Case scrut2 bndr2 ty alts') + where + sat_alt env (con, bs, rhs) + = let + env1 = setGadt env con + in + cloneBndrs env1 bs `thenUs` \ (env2, bs') -> + corePrepAnExpr env2 rhs `thenUs` \ rhs1 -> + deLam rhs1 `thenUs` \ rhs2 -> + returnUs (con, bs', rhs2) + +corePrepExprFloat env expr@(App _ _) + = collect_args expr 0 `thenUs` \ (app, (head,depth), ty, floats, ss) -> + ASSERT(null ss) -- make sure we used all the strictness info + + -- Now deal with the function + case head of + Var fn_id -> maybeSaturate fn_id app depth floats ty + _other -> returnUs (floats, app) + + where + + -- Deconstruct and rebuild the application, floating any non-atomic + -- arguments to the outside. We collect the type of the expression, + -- the head of the application, and the number of actual value arguments, + -- all of which are used to possibly saturate this application if it + -- has a constructor or primop at the head. + + collect_args + :: CoreExpr + -> Int -- current app depth + -> UniqSM (CoreExpr, -- the rebuilt expression + (CoreExpr,Int), -- the head of the application, + -- and no. 
of args it was applied to + Type, -- type of the whole expr + Floats, -- any floats we pulled out + [Demand]) -- remaining argument demands + + collect_args (App fun arg@(Type arg_ty)) depth + = collect_args fun depth `thenUs` \ (fun',hd,fun_ty,floats,ss) -> + returnUs (App fun' arg, hd, applyTy fun_ty arg_ty, floats, ss) + + collect_args (App fun arg) depth + = collect_args fun (depth+1) `thenUs` \ (fun',hd,fun_ty,floats,ss) -> + let + (ss1, ss_rest) = case ss of + (ss1:ss_rest) -> (ss1, ss_rest) + [] -> (lazyDmd, []) + (arg_ty, res_ty) = expectJust "corePrepExprFloat:collect_args" $ + splitFunTy_maybe fun_ty + in + corePrepArg env arg (mkDemTy ss1 arg_ty) `thenUs` \ (fs, arg') -> + returnUs (App fun' arg', hd, res_ty, fs `appendFloats` floats, ss_rest) + + collect_args (Var v) depth + = fiddleCCall v `thenUs` \ v1 -> + let + v2 = lookupCorePrepEnv env v1 + in + returnUs (Var v2, (Var v2, depth), idType v2, emptyFloats, stricts) + where + stricts = case idNewStrictness v of + StrictSig (DmdType _ demands _) + | listLengthCmp demands depth /= GT -> demands + -- length demands <= depth + | otherwise -> [] + -- If depth < length demands, then we have too few args to + -- satisfy strictness info so we have to ignore all the + -- strictness info, e.g. + (error "urk") + -- Here, we can't evaluate the arg strictly, because this + -- partial application might be seq'd + + + collect_args (Note (Coerce ty1 ty2) fun) depth + = collect_args fun depth `thenUs` \ (fun', hd, fun_ty, floats, ss) -> + returnUs (Note (Coerce ty1 ty2) fun', hd, ty1, floats, ss) + + collect_args (Note note fun) depth + | ignore_note note -- Drop these notes altogether + -- They aren't used by the code generator + = collect_args fun depth `thenUs` \ (fun', hd, fun_ty, floats, ss) -> + returnUs (fun', hd, fun_ty, floats, ss) + + -- N-variable fun, better let-bind it + -- ToDo: perhaps we can case-bind rather than let-bind this closure, + -- since it is sure to be evaluated. + collect_args fun depth + = corePrepExprFloat env fun `thenUs` \ (fun_floats, fun') -> + newVar ty `thenUs` \ fn_id -> + mkLocalNonRec fn_id onceDem fun_floats fun' `thenUs` \ (floats, fn_id') -> + returnUs (Var fn_id', (Var fn_id', depth), ty, floats, []) + where + ty = exprType fun + + ignore_note (CoreNote _) = True + ignore_note InlineCall = True + ignore_note InlineMe = True + ignore_note _other = False + -- We don't ignore SCCs, since they require some code generation + +------------------------------------------------------------------------------ +-- Building the saturated syntax +-- --------------------------------------------------------------------------- + +-- maybeSaturate deals with saturating primops and constructors +-- The type is the type of the entire application +maybeSaturate :: Id -> CoreExpr -> Int -> Floats -> Type -> UniqSM (Floats, CoreExpr) +maybeSaturate fn expr n_args floats ty + | Just DataToTagOp <- isPrimOpId_maybe fn -- DataToTag must have an evaluated arg + -- A gruesome special case + = saturate_it `thenUs` \ sat_expr -> + + -- OK, now ensure that the arg is evaluated. 
+ -- But (sigh) take into account the lambdas we've now introduced + let + (eta_bndrs, eta_body) = collectBinders sat_expr + in + eval_data2tag_arg eta_body `thenUs` \ (eta_floats, eta_body') -> + if null eta_bndrs then + returnUs (floats `appendFloats` eta_floats, eta_body') + else + mkBinds eta_floats eta_body' `thenUs` \ eta_body'' -> + returnUs (floats, mkLams eta_bndrs eta_body'') + + | hasNoBinding fn = saturate_it `thenUs` \ sat_expr -> + returnUs (floats, sat_expr) + + | otherwise = returnUs (floats, expr) + + where + fn_arity = idArity fn + excess_arity = fn_arity - n_args + + saturate_it :: UniqSM CoreExpr + saturate_it | excess_arity == 0 = returnUs expr + | otherwise = getUniquesUs `thenUs` \ us -> + returnUs (etaExpand excess_arity us expr ty) + + -- Ensure that the argument of DataToTagOp is evaluated + eval_data2tag_arg :: CoreExpr -> UniqSM (Floats, CoreExpr) + eval_data2tag_arg app@(fun `App` arg) + | exprIsHNF arg -- Includes nullary constructors + = returnUs (emptyFloats, app) -- The arg is evaluated + | otherwise -- Arg not evaluated, so evaluate it + = newVar (exprType arg) `thenUs` \ arg_id -> + let + arg_id1 = setIdUnfolding arg_id evaldUnfolding + in + returnUs (unitFloat (FloatCase arg_id1 arg False ), + fun `App` Var arg_id1) + + eval_data2tag_arg (Note note app) -- Scc notes can appear + = eval_data2tag_arg app `thenUs` \ (floats, app') -> + returnUs (floats, Note note app') + + eval_data2tag_arg other -- Should not happen + = pprPanic "eval_data2tag" (ppr other) + + +-- --------------------------------------------------------------------------- +-- Precipitating the floating bindings +-- --------------------------------------------------------------------------- + +floatRhs :: TopLevelFlag -> RecFlag + -> Id + -> (Floats, CoreExpr) -- Rhs: let binds in body + -> UniqSM (Floats, -- Floats out of this bind + CoreExpr) -- Final Rhs + +floatRhs top_lvl is_rec bndr (floats, rhs) + | isTopLevel top_lvl || exprIsHNF rhs, -- Float to expose value or + allLazy top_lvl is_rec floats -- at top level + = -- Why the test for allLazy? + -- v = f (x `divInt#` y) + -- we don't want to float the case, even if f has arity 2, + -- because floating the case would make it evaluated too early + returnUs (floats, rhs) + + | otherwise + -- Don't float; the RHS isn't a value + = mkBinds floats rhs `thenUs` \ rhs' -> + returnUs (emptyFloats, rhs') + +-- mkLocalNonRec is used only for *nested*, *non-recursive* bindings +mkLocalNonRec :: Id -> RhsDemand -- Lhs: id with demand + -> Floats -> CoreExpr -- Rhs: let binds in body + -> UniqSM (Floats, Id) -- The new Id may have an evaldUnfolding, + -- to record that it's been evaluated + +mkLocalNonRec bndr dem floats rhs + | isUnLiftedType (idType bndr) + -- If this is an unlifted binding, we always make a case for it. + = ASSERT( not (isUnboxedTupleType (idType bndr)) ) + let + float = FloatCase bndr rhs (exprOkForSpeculation rhs) + in + returnUs (addFloat floats float, evald_bndr) + + | isStrict dem + -- It's a strict let so we definitely float all the bindings + = let -- Don't make a case for a value binding, + -- even if it's strict. Otherwise we get + -- case (\x -> e) of ...! 
+ float | exprIsHNF rhs = FloatLet (NonRec bndr rhs) + | otherwise = FloatCase bndr rhs (exprOkForSpeculation rhs) + in + returnUs (addFloat floats float, evald_bndr) + + | otherwise + = floatRhs NotTopLevel NonRecursive bndr (floats, rhs) `thenUs` \ (floats', rhs') -> + returnUs (addFloat floats' (FloatLet (NonRec bndr rhs')), + if exprIsHNF rhs' then evald_bndr else bndr) + + where + evald_bndr = bndr `setIdUnfolding` evaldUnfolding + -- Record if the binder is evaluated + + +mkBinds :: Floats -> CoreExpr -> UniqSM CoreExpr +mkBinds (Floats _ binds) body + | isNilOL binds = returnUs body + | otherwise = deLam body `thenUs` \ body' -> + -- Lambdas are not allowed as the body of a 'let' + returnUs (foldrOL mk_bind body' binds) + where + mk_bind (FloatCase bndr rhs _) body = Case rhs bndr (exprType body) [(DEFAULT, [], body)] + mk_bind (FloatLet bind) body = Let bind body + +etaExpandRhs bndr rhs + = -- Eta expand to match the arity claimed by the binder + -- Remember, after CorePrep we must not change arity + -- + -- Eta expansion might not have happened already, + -- because it is done by the simplifier only when + -- there at least one lambda already. + -- + -- NB1:we could refrain when the RHS is trivial (which can happen + -- for exported things). This would reduce the amount of code + -- generated (a little) and make things a little words for + -- code compiled without -O. The case in point is data constructor + -- wrappers. + -- + -- NB2: we have to be careful that the result of etaExpand doesn't + -- invalidate any of the assumptions that CorePrep is attempting + -- to establish. One possible cause is eta expanding inside of + -- an SCC note - we're now careful in etaExpand to make sure the + -- SCC is pushed inside any new lambdas that are generated. + -- + -- NB3: It's important to do eta expansion, and *then* ANF-ising + -- f = /\a -> g (h 3) -- h has arity 2 + -- If we ANF first we get + -- f = /\a -> let s = h 3 in g s + -- and now eta expansion gives + -- f = /\a -> \ y -> (let s = h 3 in g s) y + -- which is horrible. + -- Eta expanding first gives + -- f = /\a -> \y -> let s = h 3 in g s y + -- + getUniquesUs `thenUs` \ us -> + returnUs (etaExpand arity us rhs (idType bndr)) + where + -- For a GlobalId, take the Arity from the Id. + -- It was set in CoreTidy and must not change + -- For all others, just expand at will + arity | isGlobalId bndr = idArity bndr + | otherwise = exprArity rhs + +-- --------------------------------------------------------------------------- +-- Eliminate Lam as a non-rhs (STG doesn't have such a thing) +-- We arrange that they only show up as the RHS of a let(rec) +-- --------------------------------------------------------------------------- + +deLam :: CoreExpr -> UniqSM CoreExpr +deLam expr = + deLamFloat expr `thenUs` \ (floats, expr) -> + mkBinds floats expr + + +deLamFloat :: CoreExpr -> UniqSM (Floats, CoreExpr) +-- Remove top level lambdas by let-bindinig + +deLamFloat (Note n expr) + = -- You can get things like + -- case e of { p -> coerce t (\s -> ...) } + deLamFloat expr `thenUs` \ (floats, expr') -> + returnUs (floats, Note n expr') + +deLamFloat expr + | null bndrs = returnUs (emptyFloats, expr) + | otherwise + = case tryEta bndrs body of + Just no_lam_result -> returnUs (emptyFloats, no_lam_result) + Nothing -> newVar (exprType expr) `thenUs` \ fn -> + returnUs (unitFloat (FloatLet (NonRec fn expr)), + Var fn) + where + (bndrs,body) = collectBinders expr + +-- Why try eta reduction? Hasn't the simplifier already done eta? 
+-- But the simplifier only eta reduces if that leaves something +-- trivial (like f, or f Int). But for deLam it would be enough to +-- get to a partial application, like (map f). + +tryEta bndrs expr@(App _ _) + | ok_to_eta_reduce f && + n_remaining >= 0 && + and (zipWith ok bndrs last_args) && + not (any (`elemVarSet` fvs_remaining) bndrs) + = Just remaining_expr + where + (f, args) = collectArgs expr + remaining_expr = mkApps f remaining_args + fvs_remaining = exprFreeVars remaining_expr + (remaining_args, last_args) = splitAt n_remaining args + n_remaining = length args - length bndrs + + ok bndr (Var arg) = bndr == arg + ok bndr other = False + + -- we can't eta reduce something which must be saturated. + ok_to_eta_reduce (Var f) = not (hasNoBinding f) + ok_to_eta_reduce _ = False --safe. ToDo: generalise + +tryEta bndrs (Let bind@(NonRec b r) body) + | not (any (`elemVarSet` fvs) bndrs) + = case tryEta bndrs body of + Just e -> Just (Let bind e) + Nothing -> Nothing + where + fvs = exprFreeVars r + +tryEta bndrs _ = Nothing +\end{code} + + +-- ----------------------------------------------------------------------------- +-- Demands +-- ----------------------------------------------------------------------------- + +\begin{code} +data RhsDemand + = RhsDemand { isStrict :: Bool, -- True => used at least once + isOnceDem :: Bool -- True => used at most once + } + +mkDem :: Demand -> Bool -> RhsDemand +mkDem strict once = RhsDemand (isStrictDmd strict) once + +mkDemTy :: Demand -> Type -> RhsDemand +mkDemTy strict ty = RhsDemand (isStrictDmd strict) + False {- For now -} + +bdrDem :: Id -> RhsDemand +bdrDem id = mkDem (idNewDemandInfo id) + False {- For now -} + +-- safeDem :: RhsDemand +-- safeDem = RhsDemand False False -- always safe to use this + +onceDem :: RhsDemand +onceDem = RhsDemand False True -- used at most once +\end{code} + + + + +%************************************************************************ +%* * +\subsection{Cloning} +%* * +%************************************************************************ + +\begin{code} +-- --------------------------------------------------------------------------- +-- The environment +-- --------------------------------------------------------------------------- + +data CorePrepEnv = CPE (IdEnv Id) -- Clone local Ids + Bool -- True <=> inside a GADT case; see Note [GADT] + +-- Note [GADT] +-- +-- Be careful with cloning inside GADTs. For example, +-- /\a. \f::a. \x::T a. case x of { T -> f True; ... } +-- The case on x may refine the type of f to be a function type. +-- Without this type refinement, exprType (f True) may simply fail, +-- which is bad. 
+-- +-- Solution: remember when we are inside a potentially-type-refining case, +-- and in that situation use the type from the old occurrence +-- when looking up occurrences + +emptyCorePrepEnv :: CorePrepEnv +emptyCorePrepEnv = CPE emptyVarEnv False + +extendCorePrepEnv :: CorePrepEnv -> Id -> Id -> CorePrepEnv +extendCorePrepEnv (CPE env gadt) id id' = CPE (extendVarEnv env id id') gadt + +lookupCorePrepEnv :: CorePrepEnv -> Id -> Id +-- See Note [GADT] above +lookupCorePrepEnv (CPE env gadt) id + = case lookupVarEnv env id of + Nothing -> id + Just id' | gadt -> setIdType id' (idType id) + | otherwise -> id' + +setGadt :: CorePrepEnv -> AltCon -> CorePrepEnv +setGadt env@(CPE id_env _) (DataAlt data_con) | not (isVanillaDataCon data_con) = CPE id_env True +setGadt env other = env + + +------------------------------------------------------------------------------ +-- Cloning binders +-- --------------------------------------------------------------------------- + +cloneBndrs :: CorePrepEnv -> [Var] -> UniqSM (CorePrepEnv, [Var]) +cloneBndrs env bs = mapAccumLUs cloneBndr env bs + +cloneBndr :: CorePrepEnv -> Var -> UniqSM (CorePrepEnv, Var) +cloneBndr env bndr + | isLocalId bndr + = getUniqueUs `thenUs` \ uniq -> + let + bndr' = setVarUnique bndr uniq + in + returnUs (extendCorePrepEnv env bndr bndr', bndr') + + | otherwise -- Top level things, which we don't want + -- to clone, have become GlobalIds by now + -- And we don't clone tyvars + = returnUs (env, bndr) + + +------------------------------------------------------------------------------ +-- Cloning ccall Ids; each must have a unique name, +-- to give the code generator a handle to hang it on +-- --------------------------------------------------------------------------- + +fiddleCCall :: Id -> UniqSM Id +fiddleCCall id + | isFCallId id = getUniqueUs `thenUs` \ uniq -> + returnUs (id `setVarUnique` uniq) + | otherwise = returnUs id + +------------------------------------------------------------------------------ +-- Generating new binders +-- --------------------------------------------------------------------------- + +newVar :: Type -> UniqSM Id +newVar ty + = seqType ty `seq` + getUniqueUs `thenUs` \ uniq -> + returnUs (mkSysLocal FSLIT("sat") uniq ty) +\end{code} diff --git a/compiler/coreSyn/CoreSubst.lhs b/compiler/coreSyn/CoreSubst.lhs new file mode 100644 index 0000000000..c432d55f94 --- /dev/null +++ b/compiler/coreSyn/CoreSubst.lhs @@ -0,0 +1,393 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[CoreUtils]{Utility functions on @Core@ syntax} + +\begin{code} +module CoreSubst ( + -- Substitution stuff + Subst, TvSubstEnv, IdSubstEnv, InScopeSet, + + substTy, substExpr, substSpec, substWorker, + lookupIdSubst, lookupTvSubst, + + emptySubst, mkEmptySubst, mkSubst, substInScope, isEmptySubst, + extendIdSubst, extendIdSubstList, extendTvSubst, extendTvSubstList, + extendInScope, extendInScopeIds, + isInScope, + + -- Binders + substBndr, substBndrs, substRecBndrs, + cloneIdBndr, cloneIdBndrs, cloneRecIdBndrs + ) where + +#include "HsVersions.h" + +import CoreSyn ( Expr(..), Bind(..), Note(..), CoreExpr, + CoreRule(..), hasUnfolding, noUnfolding + ) +import CoreFVs ( exprFreeVars ) +import CoreUtils ( exprIsTrivial ) + +import qualified Type ( substTy, substTyVarBndr ) +import Type ( Type, tyVarsOfType, TvSubstEnv, TvSubst(..), mkTyVarTy ) +import VarSet +import VarEnv +import Var ( setVarUnique, isId ) +import Id ( idType, setIdType, maybeModifyIdInfo, isLocalId ) +import IdInfo ( IdInfo, 
SpecInfo(..), specInfo, setSpecInfo, isEmptySpecInfo, + unfoldingInfo, setUnfoldingInfo, seqSpecInfo, + WorkerInfo(..), workerExists, workerInfo, setWorkerInfo + ) +import Unique ( Unique ) +import UniqSupply ( UniqSupply, uniqFromSupply, uniqsFromSupply ) +import Var ( Var, Id, TyVar, isTyVar ) +import Maybes ( orElse ) +import Outputable +import PprCore () -- Instances +import Util ( mapAccumL ) +import FastTypes +\end{code} + + +%************************************************************************ +%* * +\subsection{Substitutions} +%* * +%************************************************************************ + +\begin{code} +data Subst + = Subst InScopeSet -- Variables in in scope (both Ids and TyVars) + IdSubstEnv -- Substitution for Ids + TvSubstEnv -- Substitution for TyVars + + -- INVARIANT 1: The (domain of the) in-scope set is a superset + -- of the free vars of the range of the substitution + -- that might possibly clash with locally-bound variables + -- in the thing being substituted in. + -- This is what lets us deal with name capture properly + -- It's a hard invariant to check... + -- There are various ways of causing it to happen: + -- - arrange that the in-scope set really is all the things in scope + -- - arrange that it's the free vars of the range of the substitution + -- - make it empty because all the free vars of the subst are fresh, + -- and hence can't possibly clash.a + -- + -- INVARIANT 2: The substitution is apply-once; see notes with + -- Types.TvSubstEnv + +type IdSubstEnv = IdEnv CoreExpr + +---------------------------- +isEmptySubst :: Subst -> Bool +isEmptySubst (Subst _ id_env tv_env) = isEmptyVarEnv id_env && isEmptyVarEnv tv_env + +emptySubst :: Subst +emptySubst = Subst emptyInScopeSet emptyVarEnv emptyVarEnv + +mkEmptySubst :: InScopeSet -> Subst +mkEmptySubst in_scope = Subst in_scope emptyVarEnv emptyVarEnv + +mkSubst :: InScopeSet -> TvSubstEnv -> IdSubstEnv -> Subst +mkSubst in_scope tvs ids = Subst in_scope ids tvs + +-- getTvSubst :: Subst -> TvSubst +-- getTvSubst (Subst in_scope _ tv_env) = TvSubst in_scope tv_env + +-- getTvSubstEnv :: Subst -> TvSubstEnv +-- getTvSubstEnv (Subst _ _ tv_env) = tv_env +-- +-- setTvSubstEnv :: Subst -> TvSubstEnv -> Subst +-- setTvSubstEnv (Subst in_scope ids _) tvs = Subst in_scope ids tvs + +substInScope :: Subst -> InScopeSet +substInScope (Subst in_scope _ _) = in_scope + +-- zapSubstEnv :: Subst -> Subst +-- zapSubstEnv (Subst in_scope _ _) = Subst in_scope emptyVarEnv emptyVarEnv + +-- ToDo: add an ASSERT that fvs(subst-result) is already in the in-scope set +extendIdSubst :: Subst -> Id -> CoreExpr -> Subst +extendIdSubst (Subst in_scope ids tvs) v r = Subst in_scope (extendVarEnv ids v r) tvs + +extendIdSubstList :: Subst -> [(Id, CoreExpr)] -> Subst +extendIdSubstList (Subst in_scope ids tvs) prs = Subst in_scope (extendVarEnvList ids prs) tvs + +extendTvSubst :: Subst -> TyVar -> Type -> Subst +extendTvSubst (Subst in_scope ids tvs) v r = Subst in_scope ids (extendVarEnv tvs v r) + +extendTvSubstList :: Subst -> [(TyVar,Type)] -> Subst +extendTvSubstList (Subst in_scope ids tvs) prs = Subst in_scope ids (extendVarEnvList tvs prs) + +lookupIdSubst :: Subst -> Id -> CoreExpr +lookupIdSubst (Subst in_scope ids tvs) v + | not (isLocalId v) = Var v + | otherwise + = case lookupVarEnv ids v of { + Just e -> e ; + Nothing -> + case lookupInScope in_scope v of { + -- Watch out! 
Must get the Id from the in-scope set, + -- because its type there may differ + Just v -> Var v ; + Nothing -> WARN( True, ptext SLIT("CoreSubst.lookupIdSubst") <+> ppr v ) + Var v + }} + +lookupTvSubst :: Subst -> TyVar -> Type +lookupTvSubst (Subst _ ids tvs) v = lookupVarEnv tvs v `orElse` mkTyVarTy v + +------------------------------ +isInScope :: Var -> Subst -> Bool +isInScope v (Subst in_scope _ _) = v `elemInScopeSet` in_scope + +extendInScope :: Subst -> Var -> Subst +extendInScope (Subst in_scope ids tvs) v + = Subst (in_scope `extendInScopeSet` v) + (ids `delVarEnv` v) (tvs `delVarEnv` v) + +extendInScopeIds :: Subst -> [Id] -> Subst +extendInScopeIds (Subst in_scope ids tvs) vs + = Subst (in_scope `extendInScopeSetList` vs) + (ids `delVarEnvList` vs) tvs +\end{code} + +Pretty printing, for debugging only + +\begin{code} +instance Outputable Subst where + ppr (Subst in_scope ids tvs) + = ptext SLIT("<InScope =") <+> braces (fsep (map ppr (varEnvElts (getInScopeVars in_scope)))) + $$ ptext SLIT(" IdSubst =") <+> ppr ids + $$ ptext SLIT(" TvSubst =") <+> ppr tvs + <> char '>' +\end{code} + + +%************************************************************************ +%* * + Substituting expressions +%* * +%************************************************************************ + +\begin{code} +substExpr :: Subst -> CoreExpr -> CoreExpr +substExpr subst expr + = go expr + where + go (Var v) = lookupIdSubst subst v + go (Type ty) = Type (substTy subst ty) + go (Lit lit) = Lit lit + go (App fun arg) = App (go fun) (go arg) + go (Note note e) = Note (go_note note) (go e) + go (Lam bndr body) = Lam bndr' (substExpr subst' body) + where + (subst', bndr') = substBndr subst bndr + + go (Let (NonRec bndr rhs) body) = Let (NonRec bndr' (go rhs)) (substExpr subst' body) + where + (subst', bndr') = substBndr subst bndr + + go (Let (Rec pairs) body) = Let (Rec pairs') (substExpr subst' body) + where + (subst', bndrs') = substRecBndrs subst (map fst pairs) + pairs' = bndrs' `zip` rhss' + rhss' = map (substExpr subst' . snd) pairs + + go (Case scrut bndr ty alts) = Case (go scrut) bndr' (substTy subst ty) (map (go_alt subst') alts) + where + (subst', bndr') = substBndr subst bndr + + go_alt subst (con, bndrs, rhs) = (con, bndrs', substExpr subst' rhs) + where + (subst', bndrs') = substBndrs subst bndrs + + go_note (Coerce ty1 ty2) = Coerce (substTy subst ty1) (substTy subst ty2) + go_note note = note +\end{code} + + +%************************************************************************ +%* * + Substituting binders +%* * +%************************************************************************ + +Remember that substBndr and friends are used when doing expression +substitution only. Their only business is substitution, so they +preserve all IdInfo (suitably substituted). For example, we *want* to +preserve occ info in rules. 
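The capture-avoidance half of the job, driven by the in-scope set, is easiest
to see in a small standalone model (toy strings rather than Vars, and nothing
of the real Subst API; as INVARIANT 1 above requires, the caller seeds the
in-scope set with at least the free variables of the substitution's range):

    import qualified Data.Map as M
    import qualified Data.Set as S

    data E = V String | A E E | L String E  deriving Show

    -- Substitute env throughout an expression, renaming any binder that
    -- clashes with a name already in scope
    substE :: M.Map String E -> S.Set String -> E -> E
    substE env in_scope e = case e of
      V x      -> M.findWithDefault (V x) x env
      A f a    -> A (substE env in_scope f) (substE env in_scope a)
      L x body -> L x' (substE env' (S.insert x' in_scope) body)
        where
          x'   | x `S.member` in_scope = freshen x in_scope
               | otherwise             = x
          env' | x' == x               = M.delete x env        -- shadowing
               | otherwise             = M.insert x (V x') env -- record rename

    freshen :: String -> S.Set String -> String
    freshen x used = head [ x ++ show i | i <- [(1::Int)..]
                                        , (x ++ show i) `S.notMember` used ]

    -- substE (M.fromList [("y", V "x")]) (S.fromList ["x"]) (L "x" (V "y"))
    --   ==> L "x1" (V "x")      -- the binder is renamed, so no capture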
+ +\begin{code} +substBndr :: Subst -> Var -> (Subst, Var) +substBndr subst bndr + | isTyVar bndr = substTyVarBndr subst bndr + | otherwise = substIdBndr subst subst bndr + +substBndrs :: Subst -> [Var] -> (Subst, [Var]) +substBndrs subst bndrs = mapAccumL substBndr subst bndrs + +substRecBndrs :: Subst -> [Id] -> (Subst, [Id]) +-- Substitute a mutually recursive group +substRecBndrs subst bndrs + = (new_subst, new_bndrs) + where -- Here's the reason we need to pass rec_subst to subst_id + (new_subst, new_bndrs) = mapAccumL (substIdBndr new_subst) subst bndrs +\end{code} + + +\begin{code} +substIdBndr :: Subst -- Substitution to use for the IdInfo + -> Subst -> Id -- Substitition and Id to transform + -> (Subst, Id) -- Transformed pair + +substIdBndr rec_subst subst@(Subst in_scope env tvs) old_id + = (Subst (in_scope `extendInScopeSet` new_id) new_env tvs, new_id) + where + id1 = uniqAway in_scope old_id -- id1 is cloned if necessary + id2 = substIdType subst id1 -- id2 has its type zapped + + -- new_id has the right IdInfo + -- The lazy-set is because we're in a loop here, with + -- rec_subst, when dealing with a mutually-recursive group + new_id = maybeModifyIdInfo (substIdInfo rec_subst) id2 + + -- Extend the substitution if the unique has changed + -- See the notes with substTyVarBndr for the delVarEnv + new_env | new_id /= old_id = extendVarEnv env old_id (Var new_id) + | otherwise = delVarEnv env old_id +\end{code} + +Now a variant that unconditionally allocates a new unique. +It also unconditionally zaps the OccInfo. + +\begin{code} +cloneIdBndr :: Subst -> UniqSupply -> Id -> (Subst, Id) +cloneIdBndr subst us old_id + = clone_id subst subst (old_id, uniqFromSupply us) + +cloneIdBndrs :: Subst -> UniqSupply -> [Id] -> (Subst, [Id]) +cloneIdBndrs subst us ids + = mapAccumL (clone_id subst) subst (ids `zip` uniqsFromSupply us) + +cloneRecIdBndrs :: Subst -> UniqSupply -> [Id] -> (Subst, [Id]) +cloneRecIdBndrs subst us ids + = (subst', ids') + where + (subst', ids') = mapAccumL (clone_id subst') subst + (ids `zip` uniqsFromSupply us) + +-- Just like substIdBndr, except that it always makes a new unique +-- It is given the unique to use +clone_id :: Subst -- Substitution for the IdInfo + -> Subst -> (Id, Unique) -- Substitition and Id to transform + -> (Subst, Id) -- Transformed pair + +clone_id rec_subst subst@(Subst in_scope env tvs) (old_id, uniq) + = (Subst (in_scope `extendInScopeSet` new_id) new_env tvs, new_id) + where + id1 = setVarUnique old_id uniq + id2 = substIdType subst id1 + new_id = maybeModifyIdInfo (substIdInfo rec_subst) id2 + new_env = extendVarEnv env old_id (Var new_id) +\end{code} + + +%************************************************************************ +%* * + Types +%* * +%************************************************************************ + +For types we just call the corresponding function in Type, but we have +to repackage the substitution, from a Subst to a TvSubst + +\begin{code} +substTyVarBndr :: Subst -> TyVar -> (Subst, TyVar) +substTyVarBndr (Subst in_scope id_env tv_env) tv + = case Type.substTyVarBndr (TvSubst in_scope tv_env) tv of + (TvSubst in_scope' tv_env', tv') + -> (Subst in_scope' id_env tv_env', tv') + +substTy :: Subst -> Type -> Type +substTy (Subst in_scope id_env tv_env) ty + = Type.substTy (TvSubst in_scope tv_env) ty +\end{code} + + +%************************************************************************ +%* * +\section{IdInfo substitution} +%* * +%************************************************************************ + 
+\begin{code} +substIdType :: Subst -> Id -> Id +substIdType subst@(Subst in_scope id_env tv_env) id + | isEmptyVarEnv tv_env || isEmptyVarSet (tyVarsOfType old_ty) = id + | otherwise = setIdType id (substTy subst old_ty) + -- The tyVarsOfType is cheaper than it looks + -- because we cache the free tyvars of the type + -- in a Note in the id's type itself + where + old_ty = idType id + +------------------ +substIdInfo :: Subst -> IdInfo -> Maybe IdInfo +-- Always zaps the unfolding, to save substitution work +substIdInfo subst info + | nothing_to_do = Nothing + | otherwise = Just (info `setSpecInfo` substSpec subst old_rules + `setWorkerInfo` substWorker subst old_wrkr + `setUnfoldingInfo` noUnfolding) + where + old_rules = specInfo info + old_wrkr = workerInfo info + nothing_to_do = isEmptySpecInfo old_rules && + not (workerExists old_wrkr) && + not (hasUnfolding (unfoldingInfo info)) + + +------------------ +substWorker :: Subst -> WorkerInfo -> WorkerInfo + -- Seq'ing on the returned WorkerInfo is enough to cause all the + -- substitutions to happen completely + +substWorker subst NoWorker + = NoWorker +substWorker subst (HasWorker w a) + = case lookupIdSubst subst w of + Var w1 -> HasWorker w1 a + other -> WARN( not (exprIsTrivial other), text "CoreSubst.substWorker:" <+> ppr w ) + NoWorker -- Worker has got substituted away altogether + -- (This can happen if it's trivial, + -- via postInlineUnconditionally, hence warning) + +------------------ +substSpec :: Subst -> SpecInfo -> SpecInfo + +substSpec subst spec@(SpecInfo rules rhs_fvs) + | isEmptySubst subst + = spec + | otherwise + = seqSpecInfo new_rules `seq` new_rules + where + new_rules = SpecInfo (map do_subst rules) (substVarSet subst rhs_fvs) + + do_subst rule@(BuiltinRule {}) = rule + do_subst rule@(Rule { ru_bndrs = bndrs, ru_args = args, ru_rhs = rhs }) + = rule { ru_bndrs = bndrs', + ru_args = map (substExpr subst') args, + ru_rhs = substExpr subst' rhs } + where + (subst', bndrs') = substBndrs subst bndrs + +------------------ +substVarSet subst fvs + = foldVarSet (unionVarSet . 
subst_fv subst) emptyVarSet fvs + where + subst_fv subst fv + | isId fv = exprFreeVars (lookupIdSubst subst fv) + | otherwise = tyVarsOfType (lookupTvSubst subst fv) +\end{code} diff --git a/compiler/coreSyn/CoreSyn.hi-boot-5 b/compiler/coreSyn/CoreSyn.hi-boot-5 new file mode 100644 index 0000000000..6031131f33 --- /dev/null +++ b/compiler/coreSyn/CoreSyn.hi-boot-5 @@ -0,0 +1,6 @@ +__interface CoreSyn 1 0 where +__export CoreSyn CoreExpr ; + +-- Needed by Var.lhs +1 type CoreExpr = Expr Var.Var; +1 data Expr b ; diff --git a/compiler/coreSyn/CoreSyn.hi-boot-6 b/compiler/coreSyn/CoreSyn.hi-boot-6 new file mode 100644 index 0000000000..38dc8c7f7e --- /dev/null +++ b/compiler/coreSyn/CoreSyn.hi-boot-6 @@ -0,0 +1,5 @@ +module CoreSyn where + +-- Needed by Var.lhs +data Expr b +type CoreExpr = Expr Var.Var diff --git a/compiler/coreSyn/CoreSyn.lhs b/compiler/coreSyn/CoreSyn.lhs new file mode 100644 index 0000000000..201d866834 --- /dev/null +++ b/compiler/coreSyn/CoreSyn.lhs @@ -0,0 +1,695 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[CoreSyn]{A data type for the Haskell compiler midsection} + +\begin{code} +module CoreSyn ( + Expr(..), Alt, Bind(..), AltCon(..), Arg, Note(..), + CoreExpr, CoreAlt, CoreBind, CoreArg, CoreBndr, + TaggedExpr, TaggedAlt, TaggedBind, TaggedArg, TaggedBndr(..), + + mkLets, mkLams, + mkApps, mkTyApps, mkValApps, mkVarApps, + mkLit, mkIntLitInt, mkIntLit, + mkConApp, + varToCoreExpr, + + isTyVar, isId, cmpAltCon, cmpAlt, ltAlt, + bindersOf, bindersOfBinds, rhssOfBind, rhssOfAlts, + collectBinders, collectTyBinders, collectValBinders, collectTyAndValBinders, + collectArgs, + coreExprCc, + flattenBinds, + + isValArg, isTypeArg, valArgCount, valBndrCount, isRuntimeArg, isRuntimeVar, + + -- Unfoldings + Unfolding(..), UnfoldingGuidance(..), -- Both abstract everywhere but in CoreUnfold.lhs + noUnfolding, evaldUnfolding, mkOtherCon, + unfoldingTemplate, maybeUnfoldingTemplate, otherCons, + isValueUnfolding, isEvaldUnfolding, isCheapUnfolding, isCompulsoryUnfolding, + hasUnfolding, hasSomeUnfolding, neverUnfold, + + -- Seq stuff + seqExpr, seqExprs, seqUnfolding, + + -- Annotated expressions + AnnExpr, AnnExpr'(..), AnnBind(..), AnnAlt, + deAnnotate, deAnnotate', deAnnAlt, collectAnnBndrs, + + -- Core rules + CoreRule(..), -- CoreSubst, CoreTidy, CoreFVs, PprCore only + RuleName, seqRules, + isBuiltinRule, ruleName, isLocalRule, ruleIdName + ) where + +#include "HsVersions.h" + +import StaticFlags ( opt_RuntimeTypes ) +import CostCentre ( CostCentre, noCostCentre ) +import Var ( Var, Id, TyVar, isTyVar, isId ) +import Type ( Type, mkTyVarTy, seqType ) +import Name ( Name ) +import OccName ( OccName ) +import Literal ( Literal, mkMachInt ) +import DataCon ( DataCon, dataConWorkId, dataConTag ) +import BasicTypes ( Activation ) +import FastString +import Outputable +\end{code} + +%************************************************************************ +%* * +\subsection{The main data types} +%* * +%************************************************************************ + +These data types are the heart of the compiler + +\begin{code} +infixl 8 `App` -- App brackets to the left + +data Expr b -- "b" for the type of binders, + = Var Id + | Lit Literal + | App (Expr b) (Arg b) + | Lam b (Expr b) + | Let (Bind b) (Expr b) + | Case (Expr b) b Type [Alt b] -- Binder gets bound to value of scrutinee + -- Invariant: The list of alternatives is ALWAYS EXHAUSTIVE, + -- meaning that it covers all cases that can occur + -- See the example below + -- + 
-- Invariant: The DEFAULT case must be *first*, if it occurs at all + -- Invariant: The remaining cases are in order of increasing + -- tag (for DataAlts) + -- lit (for LitAlts) + -- This makes finding the relevant constructor easy, + -- and makes comparison easier too + | Note Note (Expr b) + | Type Type -- This should only show up at the top + -- level of an Arg + +-- An "exhausive" case does not necessarily mention all constructors: +-- data Foo = Red | Green | Blue +-- +-- ...case x of +-- Red -> True +-- other -> f (case x of +-- Green -> ... +-- Blue -> ... ) +-- The inner case does not need a Red alternative, because x can't be Red at +-- that program point. + + +type Arg b = Expr b -- Can be a Type + +type Alt b = (AltCon, [b], Expr b) -- (DEFAULT, [], rhs) is the default alternative + +data AltCon = DataAlt DataCon + | LitAlt Literal + | DEFAULT + deriving (Eq, Ord) + + +data Bind b = NonRec b (Expr b) + | Rec [(b, (Expr b))] + +data Note + = SCC CostCentre + + | Coerce + Type -- The to-type: type of whole coerce expression + Type -- The from-type: type of enclosed expression + + | InlineCall -- Instructs simplifier to inline + -- the enclosed call + + | InlineMe -- Instructs simplifer to treat the enclosed expression + -- as very small, and inline it at its call sites + + | CoreNote String -- A generic core annotation, propagated but not used by GHC + +-- NOTE: we also treat expressions wrapped in InlineMe as +-- 'cheap' and 'dupable' (in the sense of exprIsCheap, exprIsDupable) +-- What this means is that we obediently inline even things that don't +-- look like valuse. This is sometimes important: +-- {-# INLINE f #-} +-- f = g . h +-- Here, f looks like a redex, and we aren't going to inline (.) because it's +-- inside an INLINE, so it'll stay looking like a redex. Nevertheless, we +-- should inline f even inside lambdas. In effect, we should trust the programmer. +\end{code} + +INVARIANTS: + +* The RHS of a letrec, and the RHSs of all top-level lets, + must be of LIFTED type. + +* The RHS of a let, may be of UNLIFTED type, but only if the expression + is ok-for-speculation. This means that the let can be floated around + without difficulty. e.g. + y::Int# = x +# 1# ok + y::Int# = fac 4# not ok [use case instead] + +* The argument of an App can be of any type. + +* The simplifier tries to ensure that if the RHS of a let is a constructor + application, its arguments are trivial, so that the constructor can be + inlined vigorously. + + +%************************************************************************ +%* * +\subsection{Transformation rules} +%* * +%************************************************************************ + +The CoreRule type and its friends are dealt with mainly in CoreRules, +but CoreFVs, Subst, PprCore, CoreTidy also inspect the representation. + +A Rule is + + "local" if the function it is a rule for is defined in the + same module as the rule itself. 
+ + "orphan" if nothing on the LHS is defined in the same module + as the rule itself + +\begin{code} +type RuleName = FastString + +data CoreRule + = Rule { + ru_name :: RuleName, + ru_act :: Activation, -- When the rule is active + + -- Rough-matching stuff + -- see comments with InstEnv.Instance( is_cls, is_rough ) + ru_fn :: Name, -- Name of the Id at the head of this rule + ru_rough :: [Maybe Name], -- Name at the head of each argument + + -- Proper-matching stuff + -- see comments with InstEnv.Instance( is_tvs, is_tys ) + ru_bndrs :: [CoreBndr], -- Forall'd variables + ru_args :: [CoreExpr], -- LHS args + + -- And the right-hand side + ru_rhs :: CoreExpr, + + -- Locality + ru_local :: Bool, -- The fn at the head of the rule is + -- defined in the same module as the rule + + -- Orphan-hood; see comments is InstEnv.Instance( is_orph ) + ru_orph :: Maybe OccName } + + | BuiltinRule { -- Built-in rules are used for constant folding + ru_name :: RuleName, -- and suchlike. It has no free variables. + ru_fn :: Name, -- Name of the Id at + -- the head of this rule + ru_try :: [CoreExpr] -> Maybe CoreExpr } + +isBuiltinRule (BuiltinRule {}) = True +isBuiltinRule _ = False + +ruleName :: CoreRule -> RuleName +ruleName = ru_name + +ruleIdName :: CoreRule -> Name +ruleIdName = ru_fn + +isLocalRule :: CoreRule -> Bool +isLocalRule = ru_local +\end{code} + + +%************************************************************************ +%* * + Unfoldings +%* * +%************************************************************************ + +The @Unfolding@ type is declared here to avoid numerous loops, but it +should be abstract everywhere except in CoreUnfold.lhs + +\begin{code} +data Unfolding + = NoUnfolding + + | OtherCon [AltCon] -- It ain't one of these + -- (OtherCon xs) also indicates that something has been evaluated + -- and hence there's no point in re-evaluating it. + -- OtherCon [] is used even for non-data-type values + -- to indicated evaluated-ness. Notably: + -- data C = C !(Int -> Int) + -- case x of { C f -> ... } + -- Here, f gets an OtherCon [] unfolding. + + | CompulsoryUnfolding CoreExpr -- There is no "original" definition, + -- so you'd better unfold. + + | CoreUnfolding -- An unfolding with redundant cached information + CoreExpr -- Template; binder-info is correct + Bool -- True <=> top level binding + Bool -- exprIsHNF template (cached); it is ok to discard a `seq` on + -- this variable + Bool -- True <=> doesn't waste (much) work to expand inside an inlining + -- Basically it's exprIsCheap + UnfoldingGuidance -- Tells about the *size* of the template. + + +data UnfoldingGuidance + = UnfoldNever + | UnfoldIfGoodArgs Int -- and "n" value args + + [Int] -- Discount if the argument is evaluated. + -- (i.e., a simplification will definitely + -- be possible). One elt of the list per *value* arg. + + Int -- The "size" of the unfolding; to be elaborated + -- later. ToDo + + Int -- Scrutinee discount: the discount to substract if the thing is in + -- a context (case (thing args) of ...), + -- (where there are the right number of arguments.) 
+ +noUnfolding = NoUnfolding +evaldUnfolding = OtherCon [] + +mkOtherCon = OtherCon + +seqUnfolding :: Unfolding -> () +seqUnfolding (CoreUnfolding e top b1 b2 g) + = seqExpr e `seq` top `seq` b1 `seq` b2 `seq` seqGuidance g +seqUnfolding other = () + +seqGuidance (UnfoldIfGoodArgs n ns a b) = n `seq` sum ns `seq` a `seq` b `seq` () +seqGuidance other = () +\end{code} + +\begin{code} +unfoldingTemplate :: Unfolding -> CoreExpr +unfoldingTemplate (CoreUnfolding expr _ _ _ _) = expr +unfoldingTemplate (CompulsoryUnfolding expr) = expr +unfoldingTemplate other = panic "getUnfoldingTemplate" + +maybeUnfoldingTemplate :: Unfolding -> Maybe CoreExpr +maybeUnfoldingTemplate (CoreUnfolding expr _ _ _ _) = Just expr +maybeUnfoldingTemplate (CompulsoryUnfolding expr) = Just expr +maybeUnfoldingTemplate other = Nothing + +otherCons :: Unfolding -> [AltCon] +otherCons (OtherCon cons) = cons +otherCons other = [] + +isValueUnfolding :: Unfolding -> Bool + -- Returns False for OtherCon +isValueUnfolding (CoreUnfolding _ _ is_evald _ _) = is_evald +isValueUnfolding other = False + +isEvaldUnfolding :: Unfolding -> Bool + -- Returns True for OtherCon +isEvaldUnfolding (OtherCon _) = True +isEvaldUnfolding (CoreUnfolding _ _ is_evald _ _) = is_evald +isEvaldUnfolding other = False + +isCheapUnfolding :: Unfolding -> Bool +isCheapUnfolding (CoreUnfolding _ _ _ is_cheap _) = is_cheap +isCheapUnfolding other = False + +isCompulsoryUnfolding :: Unfolding -> Bool +isCompulsoryUnfolding (CompulsoryUnfolding _) = True +isCompulsoryUnfolding other = False + +hasUnfolding :: Unfolding -> Bool +hasUnfolding (CoreUnfolding _ _ _ _ _) = True +hasUnfolding (CompulsoryUnfolding _) = True +hasUnfolding other = False + +hasSomeUnfolding :: Unfolding -> Bool +hasSomeUnfolding NoUnfolding = False +hasSomeUnfolding other = True + +neverUnfold :: Unfolding -> Bool +neverUnfold NoUnfolding = True +neverUnfold (OtherCon _) = True +neverUnfold (CoreUnfolding _ _ _ _ UnfoldNever) = True +neverUnfold other = False +\end{code} + + +%************************************************************************ +%* * +\subsection{The main data type} +%* * +%************************************************************************ + +\begin{code} +-- The Ord is needed for the FiniteMap used in the lookForConstructor +-- in SimplEnv. If you declared that lookForConstructor *ignores* +-- constructor-applications with LitArg args, then you could get +-- rid of this Ord. 
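-- cmpAltCon below realises the ordering demanded by the invariant on Case
-- alternatives (see the comments on the Expr type): for, say,
--      data Colour = Red | Green | Blue
-- a case with a default would list its alternatives as
--      [ (DEFAULT,      [], ...)     -- DEFAULT, if present, comes first
--      , (DataAlt Red,  [], ...)     -- then DataAlts in increasing tag
--      , (DataAlt Blue, [], ...) ]   --   order, i.e. declaration order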
+ +instance Outputable AltCon where + ppr (DataAlt dc) = ppr dc + ppr (LitAlt lit) = ppr lit + ppr DEFAULT = ptext SLIT("__DEFAULT") + +instance Show AltCon where + showsPrec p con = showsPrecSDoc p (ppr con) + +cmpAlt :: Alt b -> Alt b -> Ordering +cmpAlt (con1, _, _) (con2, _, _) = con1 `cmpAltCon` con2 + +ltAlt :: Alt b -> Alt b -> Bool +ltAlt a1 a2 = case a1 `cmpAlt` a2 of { LT -> True; other -> False } + +cmpAltCon :: AltCon -> AltCon -> Ordering +-- Compares AltCons within a single list of alternatives +cmpAltCon DEFAULT DEFAULT = EQ +cmpAltCon DEFAULT con = LT + +cmpAltCon (DataAlt d1) (DataAlt d2) = dataConTag d1 `compare` dataConTag d2 +cmpAltCon (DataAlt _) DEFAULT = GT +cmpAltCon (LitAlt l1) (LitAlt l2) = l1 `compare` l2 +cmpAltCon (LitAlt _) DEFAULT = GT + +cmpAltCon con1 con2 = WARN( True, text "Comparing incomparable AltCons" <+> + ppr con1 <+> ppr con2 ) + LT +\end{code} + + +%************************************************************************ +%* * +\subsection{Useful synonyms} +%* * +%************************************************************************ + +The common case + +\begin{code} +type CoreBndr = Var +type CoreExpr = Expr CoreBndr +type CoreArg = Arg CoreBndr +type CoreBind = Bind CoreBndr +type CoreAlt = Alt CoreBndr +\end{code} + +Binders are ``tagged'' with a \tr{t}: + +\begin{code} +data TaggedBndr t = TB CoreBndr t -- TB for "tagged binder" + +type TaggedBind t = Bind (TaggedBndr t) +type TaggedExpr t = Expr (TaggedBndr t) +type TaggedArg t = Arg (TaggedBndr t) +type TaggedAlt t = Alt (TaggedBndr t) + +instance Outputable b => Outputable (TaggedBndr b) where + ppr (TB b l) = char '<' <> ppr b <> comma <> ppr l <> char '>' + +instance Outputable b => OutputableBndr (TaggedBndr b) where + pprBndr _ b = ppr b -- Simple +\end{code} + + +%************************************************************************ +%* * +\subsection{Core-constructing functions with checking} +%* * +%************************************************************************ + +\begin{code} +mkApps :: Expr b -> [Arg b] -> Expr b +mkTyApps :: Expr b -> [Type] -> Expr b +mkValApps :: Expr b -> [Expr b] -> Expr b +mkVarApps :: Expr b -> [Var] -> Expr b + +mkApps f args = foldl App f args +mkTyApps f args = foldl (\ e a -> App e (Type a)) f args +mkValApps f args = foldl (\ e a -> App e a) f args +mkVarApps f vars = foldl (\ e a -> App e (varToCoreExpr a)) f vars + +mkLit :: Literal -> Expr b +mkIntLit :: Integer -> Expr b +mkIntLitInt :: Int -> Expr b +mkConApp :: DataCon -> [Arg b] -> Expr b +mkLets :: [Bind b] -> Expr b -> Expr b +mkLams :: [b] -> Expr b -> Expr b + +mkLit lit = Lit lit +mkConApp con args = mkApps (Var (dataConWorkId con)) args + +mkLams binders body = foldr Lam body binders +mkLets binds body = foldr Let body binds + +mkIntLit n = Lit (mkMachInt n) +mkIntLitInt n = Lit (mkMachInt (toInteger n)) + +varToCoreExpr :: CoreBndr -> Expr b +varToCoreExpr v | isId v = Var v + | otherwise = Type (mkTyVarTy v) +\end{code} + + +%************************************************************************ +%* * +\subsection{Simple access functions} +%* * +%************************************************************************ + +\begin{code} +bindersOf :: Bind b -> [b] +bindersOf (NonRec binder _) = [binder] +bindersOf (Rec pairs) = [binder | (binder, _) <- pairs] + +bindersOfBinds :: [Bind b] -> [b] +bindersOfBinds binds = foldr ((++) . 
bindersOf) [] binds + +rhssOfBind :: Bind b -> [Expr b] +rhssOfBind (NonRec _ rhs) = [rhs] +rhssOfBind (Rec pairs) = [rhs | (_,rhs) <- pairs] + +rhssOfAlts :: [Alt b] -> [Expr b] +rhssOfAlts alts = [e | (_,_,e) <- alts] + +flattenBinds :: [Bind b] -> [(b, Expr b)] -- Get all the lhs/rhs pairs +flattenBinds (NonRec b r : binds) = (b,r) : flattenBinds binds +flattenBinds (Rec prs1 : binds) = prs1 ++ flattenBinds binds +flattenBinds [] = [] +\end{code} + +We often want to strip off leading lambdas before getting down to +business. @collectBinders@ is your friend. + +We expect (by convention) type-, and value- lambdas in that +order. + +\begin{code} +collectBinders :: Expr b -> ([b], Expr b) +collectTyBinders :: CoreExpr -> ([TyVar], CoreExpr) +collectValBinders :: CoreExpr -> ([Id], CoreExpr) +collectTyAndValBinders :: CoreExpr -> ([TyVar], [Id], CoreExpr) + +collectBinders expr + = go [] expr + where + go bs (Lam b e) = go (b:bs) e + go bs e = (reverse bs, e) + +collectTyAndValBinders expr + = (tvs, ids, body) + where + (tvs, body1) = collectTyBinders expr + (ids, body) = collectValBinders body1 + +collectTyBinders expr + = go [] expr + where + go tvs (Lam b e) | isTyVar b = go (b:tvs) e + go tvs e = (reverse tvs, e) + +collectValBinders expr + = go [] expr + where + go ids (Lam b e) | isId b = go (b:ids) e + go ids body = (reverse ids, body) +\end{code} + + +@collectArgs@ takes an application expression, returning the function +and the arguments to which it is applied. + +\begin{code} +collectArgs :: Expr b -> (Expr b, [Arg b]) +collectArgs expr + = go expr [] + where + go (App f a) as = go f (a:as) + go e as = (e, as) +\end{code} + +coreExprCc gets the cost centre enclosing an expression, if any. +It looks inside lambdas because (scc "foo" \x.e) = \x.scc "foo" e + +\begin{code} +coreExprCc :: Expr b -> CostCentre +coreExprCc (Note (SCC cc) e) = cc +coreExprCc (Note other_note e) = coreExprCc e +coreExprCc (Lam _ e) = coreExprCc e +coreExprCc other = noCostCentre +\end{code} + + + +%************************************************************************ +%* * +\subsection{Predicates} +%* * +%************************************************************************ + +@isRuntimeVar v@ returns if (Lam v _) really becomes a lambda at runtime, +i.e. if type applications are actual lambdas because types are kept around +at runtime. + +Similarly isRuntimeArg. 
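For instance, with opt_RuntimeTypes off, in

    mkLams [a, x] body      -- a a TyVar, x an Id

only the \x lambda survives to runtime: isRuntimeVar answers False for a and
True for x, and a Type argument satisfies isRuntimeArg only when
opt_RuntimeTypes is on.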
+ +\begin{code} +isRuntimeVar :: Var -> Bool +isRuntimeVar | opt_RuntimeTypes = \v -> True + | otherwise = \v -> isId v + +isRuntimeArg :: CoreExpr -> Bool +isRuntimeArg | opt_RuntimeTypes = \e -> True + | otherwise = \e -> isValArg e +\end{code} + +\begin{code} +isValArg (Type _) = False +isValArg other = True + +isTypeArg (Type _) = True +isTypeArg other = False + +valBndrCount :: [CoreBndr] -> Int +valBndrCount [] = 0 +valBndrCount (b : bs) | isId b = 1 + valBndrCount bs + | otherwise = valBndrCount bs + +valArgCount :: [Arg b] -> Int +valArgCount [] = 0 +valArgCount (Type _ : args) = valArgCount args +valArgCount (other : args) = 1 + valArgCount args +\end{code} + + +%************************************************************************ +%* * +\subsection{Seq stuff} +%* * +%************************************************************************ + +\begin{code} +seqExpr :: CoreExpr -> () +seqExpr (Var v) = v `seq` () +seqExpr (Lit lit) = lit `seq` () +seqExpr (App f a) = seqExpr f `seq` seqExpr a +seqExpr (Lam b e) = seqBndr b `seq` seqExpr e +seqExpr (Let b e) = seqBind b `seq` seqExpr e +-- gaw 2004 +seqExpr (Case e b t as) = seqExpr e `seq` seqBndr b `seq` seqType t `seq` seqAlts as +seqExpr (Note n e) = seqNote n `seq` seqExpr e +seqExpr (Type t) = seqType t + +seqExprs [] = () +seqExprs (e:es) = seqExpr e `seq` seqExprs es + +seqNote (Coerce t1 t2) = seqType t1 `seq` seqType t2 +seqNote (CoreNote s) = s `seq` () +seqNote other = () + +seqBndr b = b `seq` () + +seqBndrs [] = () +seqBndrs (b:bs) = seqBndr b `seq` seqBndrs bs + +seqBind (NonRec b e) = seqBndr b `seq` seqExpr e +seqBind (Rec prs) = seqPairs prs + +seqPairs [] = () +seqPairs ((b,e):prs) = seqBndr b `seq` seqExpr e `seq` seqPairs prs + +seqAlts [] = () +seqAlts ((c,bs,e):alts) = seqBndrs bs `seq` seqExpr e `seq` seqAlts alts + +seqRules [] = () +seqRules (Rule { ru_bndrs = bndrs, ru_args = args, ru_rhs = rhs } : rules) + = seqBndrs bndrs `seq` seqExprs (rhs:args) `seq` seqRules rules +seqRules (BuiltinRule {} : rules) = seqRules rules +\end{code} + + + +%************************************************************************ +%* * +\subsection{Annotated core; annotation at every node in the tree} +%* * +%************************************************************************ + +\begin{code} +type AnnExpr bndr annot = (annot, AnnExpr' bndr annot) + +data AnnExpr' bndr annot + = AnnVar Id + | AnnLit Literal + | AnnLam bndr (AnnExpr bndr annot) + | AnnApp (AnnExpr bndr annot) (AnnExpr bndr annot) +-- gaw 2004 + | AnnCase (AnnExpr bndr annot) bndr Type [AnnAlt bndr annot] + | AnnLet (AnnBind bndr annot) (AnnExpr bndr annot) + | AnnNote Note (AnnExpr bndr annot) + | AnnType Type + +type AnnAlt bndr annot = (AltCon, [bndr], AnnExpr bndr annot) + +data AnnBind bndr annot + = AnnNonRec bndr (AnnExpr bndr annot) + | AnnRec [(bndr, AnnExpr bndr annot)] +\end{code} + +\begin{code} +deAnnotate :: AnnExpr bndr annot -> Expr bndr +deAnnotate (_, e) = deAnnotate' e + +deAnnotate' (AnnType t) = Type t +deAnnotate' (AnnVar v) = Var v +deAnnotate' (AnnLit lit) = Lit lit +deAnnotate' (AnnLam binder body) = Lam binder (deAnnotate body) +deAnnotate' (AnnApp fun arg) = App (deAnnotate fun) (deAnnotate arg) +deAnnotate' (AnnNote note body) = Note note (deAnnotate body) + +deAnnotate' (AnnLet bind body) + = Let (deAnnBind bind) (deAnnotate body) + where + deAnnBind (AnnNonRec var rhs) = NonRec var (deAnnotate rhs) + deAnnBind (AnnRec pairs) = Rec [(v,deAnnotate rhs) | (v,rhs) <- pairs] + +-- gaw 2004 +deAnnotate' (AnnCase scrut v t alts) 
+ = Case (deAnnotate scrut) v t (map deAnnAlt alts) + +deAnnAlt :: AnnAlt bndr annot -> Alt bndr +deAnnAlt (con,args,rhs) = (con,args,deAnnotate rhs) +\end{code} + +\begin{code} +collectAnnBndrs :: AnnExpr bndr annot -> ([bndr], AnnExpr bndr annot) +collectAnnBndrs e + = collect [] e + where + collect bs (_, AnnLam b body) = collect (b:bs) body + collect bs body = (reverse bs, body) +\end{code} diff --git a/compiler/coreSyn/CoreTidy.lhs b/compiler/coreSyn/CoreTidy.lhs new file mode 100644 index 0000000000..ba604667e7 --- /dev/null +++ b/compiler/coreSyn/CoreTidy.lhs @@ -0,0 +1,221 @@ +% +% (c) The AQUA Project, Glasgow University, 1996-1998 +% + +\begin{code} +module CoreTidy ( + tidyExpr, tidyVarOcc, tidyRule, tidyRules + ) where + +#include "HsVersions.h" + +import CoreSyn +import CoreUtils ( exprArity ) +import Unify ( coreRefineTys ) +import DataCon ( DataCon, isVanillaDataCon ) +import Id ( Id, mkUserLocal, idInfo, setIdInfo, idUnique, + idType, setIdType ) +import IdInfo ( setArityInfo, vanillaIdInfo, + newStrictnessInfo, setAllStrictnessInfo, + newDemandInfo, setNewDemandInfo ) +import Type ( Type, tidyType, tidyTyVarBndr, substTy, mkOpenTvSubst ) +import Var ( Var, TyVar, varName ) +import VarEnv +import UniqFM ( lookupUFM ) +import Name ( Name, getOccName ) +import OccName ( tidyOccName ) +import SrcLoc ( noSrcLoc ) +import Maybes ( orElse ) +import Outputable +import Util ( mapAccumL ) +\end{code} + + +This module contains "tidying" code for *nested* expressions, bindings, rules. +The code for *top-level* bindings is in TidyPgm. + +%************************************************************************ +%* * +\subsection{Tidying expressions, rules} +%* * +%************************************************************************ + +\begin{code} +tidyBind :: TidyEnv + -> CoreBind + -> (TidyEnv, CoreBind) + +tidyBind env (NonRec bndr rhs) + = tidyLetBndr env (bndr,rhs) =: \ (env', bndr') -> + (env', NonRec bndr' (tidyExpr env' rhs)) + +tidyBind env (Rec prs) + = mapAccumL tidyLetBndr env prs =: \ (env', bndrs') -> + map (tidyExpr env') (map snd prs) =: \ rhss' -> + (env', Rec (zip bndrs' rhss')) + + +------------ Expressions -------------- +tidyExpr :: TidyEnv -> CoreExpr -> CoreExpr +tidyExpr env (Var v) = Var (tidyVarOcc env v) +tidyExpr env (Type ty) = Type (tidyType env ty) +tidyExpr env (Lit lit) = Lit lit +tidyExpr env (App f a) = App (tidyExpr env f) (tidyExpr env a) +tidyExpr env (Note n e) = Note (tidyNote env n) (tidyExpr env e) + +tidyExpr env (Let b e) + = tidyBind env b =: \ (env', b') -> + Let b' (tidyExpr env' e) + +tidyExpr env (Case e b ty alts) + = tidyBndr env b =: \ (env', b) -> + Case (tidyExpr env e) b (tidyType env ty) + (map (tidyAlt b env') alts) + +tidyExpr env (Lam b e) + = tidyBndr env b =: \ (env', b) -> + Lam b (tidyExpr env' e) + +------------ Case alternatives -------------- +tidyAlt case_bndr env (DataAlt con, vs, rhs) + | not (isVanillaDataCon con) -- GADT case + = tidyBndrs env tvs =: \ (env1, tvs') -> + let + env2 = refineTidyEnv env con tvs' scrut_ty + in + tidyBndrs env2 ids =: \ (env3, ids') -> + (DataAlt con, tvs' ++ ids', tidyExpr env3 rhs) + where + (tvs, ids) = span isTyVar vs + scrut_ty = idType case_bndr + +tidyAlt case_bndr env (con, vs, rhs) + = tidyBndrs env vs =: \ (env', vs) -> + (con, vs, tidyExpr env' rhs) + +refineTidyEnv :: TidyEnv -> DataCon -> [TyVar] -> Type -> TidyEnv +-- Refine the TidyEnv in the light of the type refinement from coreRefineTys +refineTidyEnv tidy_env@(occ_env, var_env) con tvs scrut_ty + = case 
coreRefineTys con tvs scrut_ty of + Nothing -> tidy_env + Just (tv_subst, all_bound_here) + | all_bound_here -- Local type refinement only + -> tidy_env + | otherwise -- Apply the refining subst to the tidy env + -- This ensures that occurences have the most refined type + -- And that means that exprType will work right everywhere + -> (occ_env, mapVarEnv (refine subst) var_env) + where + subst = mkOpenTvSubst tv_subst + where + refine subst var | isId var = setIdType var (substTy subst (idType var)) + | otherwise = var + +------------ Notes -------------- +tidyNote env (Coerce t1 t2) = Coerce (tidyType env t1) (tidyType env t2) +tidyNote env note = note + +------------ Rules -------------- +tidyRules :: TidyEnv -> [CoreRule] -> [CoreRule] +tidyRules env [] = [] +tidyRules env (rule : rules) + = tidyRule env rule =: \ rule -> + tidyRules env rules =: \ rules -> + (rule : rules) + +tidyRule :: TidyEnv -> CoreRule -> CoreRule +tidyRule env rule@(BuiltinRule {}) = rule +tidyRule env rule@(Rule { ru_bndrs = bndrs, ru_args = args, ru_rhs = rhs, + ru_fn = fn, ru_rough = mb_ns }) + = tidyBndrs env bndrs =: \ (env', bndrs) -> + map (tidyExpr env') args =: \ args -> + rule { ru_bndrs = bndrs, ru_args = args, + ru_rhs = tidyExpr env' rhs, + ru_fn = tidyNameOcc env fn, + ru_rough = map (fmap (tidyNameOcc env')) mb_ns } +\end{code} + + +%************************************************************************ +%* * +\subsection{Tidying non-top-level binders} +%* * +%************************************************************************ + +\begin{code} +tidyNameOcc :: TidyEnv -> Name -> Name +-- In rules and instances, we have Names, and we must tidy them too +-- Fortunately, we can lookup in the VarEnv with a name +tidyNameOcc (_, var_env) n = case lookupUFM var_env n of + Nothing -> n + Just v -> varName v + +tidyVarOcc :: TidyEnv -> Var -> Var +tidyVarOcc (_, var_env) v = lookupVarEnv var_env v `orElse` v + +-- tidyBndr is used for lambda and case binders +tidyBndr :: TidyEnv -> Var -> (TidyEnv, Var) +tidyBndr env var + | isTyVar var = tidyTyVarBndr env var + | otherwise = tidyIdBndr env var + +tidyBndrs :: TidyEnv -> [Var] -> (TidyEnv, [Var]) +tidyBndrs env vars = mapAccumL tidyBndr env vars + +tidyLetBndr :: TidyEnv -> (Id, CoreExpr) -> (TidyEnv, Var) +-- Used for local (non-top-level) let(rec)s +tidyLetBndr env (id,rhs) + = ((tidy_env,new_var_env), final_id) + where + ((tidy_env,var_env), new_id) = tidyIdBndr env id + + -- We need to keep around any interesting strictness and + -- demand info because later on we may need to use it when + -- converting to A-normal form. + -- eg. + -- f (g x), where f is strict in its argument, will be converted + -- into case (g x) of z -> f z by CorePrep, but only if f still + -- has its strictness info. + -- + -- Similarly for the demand info - on a let binder, this tells + -- CorePrep to turn the let into a case. + -- + -- Similarly arity info for eta expansion in CorePrep + -- + final_id = new_id `setIdInfo` new_info + idinfo = idInfo id + new_info = vanillaIdInfo + `setArityInfo` exprArity rhs + `setAllStrictnessInfo` newStrictnessInfo idinfo + `setNewDemandInfo` newDemandInfo idinfo + + -- Override the env we get back from tidyId with the new IdInfo + -- so it gets propagated to the usage sites. + new_var_env = extendVarEnv var_env id final_id + +-- Non-top-level variables +tidyIdBndr :: TidyEnv -> Id -> (TidyEnv, Id) +tidyIdBndr env@(tidy_env, var_env) id + = -- do this pattern match strictly, otherwise we end up holding on to + -- stuff in the OccName. 
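    -- (A hypothetical illustration of the retention problem just mentioned:
    --  binding the result lazily, as in
    --      let (tidy_env', occ') = tidyOccName tidy_env (getOccName id) in ...
    --  would leave thunks that hold on to the un-forced pair, and through it
    --  to the OccName machinery, until both components happen to be demanded.
    --  The explicit case below forces the pair immediately, so nothing is
    --  retained longer than necessary.)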
+ case tidyOccName tidy_env (getOccName id) of { (tidy_env', occ') -> + let + -- Give the Id a fresh print-name, *and* rename its type + -- The SrcLoc isn't important now, + -- though we could extract it from the Id + -- + -- All nested Ids now have the same IdInfo, namely vanillaIdInfo, + -- which should save some space. + -- But note that tidyLetBndr puts some of it back. + ty' = tidyType env (idType id) + id' = mkUserLocal occ' (idUnique id) ty' noSrcLoc + `setIdInfo` vanillaIdInfo + var_env' = extendVarEnv var_env id id' + in + ((tidy_env', var_env'), id') + } +\end{code} + +\begin{code} +m =: k = m `seq` k m +\end{code} diff --git a/compiler/coreSyn/CoreUnfold.lhs b/compiler/coreSyn/CoreUnfold.lhs new file mode 100644 index 0000000000..d57f1886fc --- /dev/null +++ b/compiler/coreSyn/CoreUnfold.lhs @@ -0,0 +1,632 @@ +% +% (c) The AQUA Project, Glasgow University, 1994-1998 +% +\section[CoreUnfold]{Core-syntax unfoldings} + +Unfoldings (which can travel across module boundaries) are in Core +syntax (namely @CoreExpr@s). + +The type @Unfolding@ sits ``above'' simply-Core-expressions +unfoldings, capturing ``higher-level'' things we know about a binding, +usually things that the simplifier found out (e.g., ``it's a +literal''). In the corner of a @CoreUnfolding@ unfolding, you will +find, unsurprisingly, a Core expression. + +\begin{code} +module CoreUnfold ( + Unfolding, UnfoldingGuidance, -- Abstract types + + noUnfolding, mkTopUnfolding, mkUnfolding, mkCompulsoryUnfolding, seqUnfolding, + evaldUnfolding, mkOtherCon, otherCons, + unfoldingTemplate, maybeUnfoldingTemplate, + isEvaldUnfolding, isValueUnfolding, isCheapUnfolding, isCompulsoryUnfolding, + hasUnfolding, hasSomeUnfolding, neverUnfold, + + couldBeSmallEnoughToInline, + certainlyWillInline, smallEnoughToInline, + + callSiteInline + ) where + +#include "HsVersions.h" + +import StaticFlags ( opt_UF_CreationThreshold, opt_UF_UseThreshold, + opt_UF_FunAppDiscount, opt_UF_KeenessFactor, + opt_UF_DearOp, + ) +import DynFlags ( DynFlags, DynFlag(..), dopt ) +import CoreSyn +import PprCore ( pprCoreExpr ) +import OccurAnal ( occurAnalyseExpr ) +import CoreUtils ( exprIsHNF, exprIsCheap, exprIsTrivial ) +import Id ( Id, idType, isId, + idUnfolding, globalIdDetails + ) +import DataCon ( isUnboxedTupleCon ) +import Literal ( litSize ) +import PrimOp ( primOpIsDupable, primOpOutOfLine ) +import IdInfo ( OccInfo(..), GlobalIdDetails(..) ) +import Type ( isUnLiftedType ) +import PrelNames ( hasKey, buildIdKey, augmentIdKey ) +import Bag +import FastTypes +import Outputable + +#if __GLASGOW_HASKELL__ >= 404 +import GLAEXTS ( Int# ) +#endif +\end{code} + + +%************************************************************************ +%* * +\subsection{Making unfoldings} +%* * +%************************************************************************ + +\begin{code} +mkTopUnfolding expr = mkUnfolding True {- Top level -} expr + +mkUnfolding top_lvl expr + = CoreUnfolding (occurAnalyseExpr expr) + top_lvl + + (exprIsHNF expr) + -- Already evaluated + + (exprIsCheap expr) + -- OK to inline inside a lambda + + (calcUnfoldingGuidance opt_UF_CreationThreshold expr) + -- Sometimes during simplification, there's a large let-bound thing + -- which has been substituted, and so is now dead; so 'expr' contains + -- two copies of the thing while the occurrence-analysed expression doesn't + -- Nevertheless, we don't occ-analyse before computing the size because the + -- size computation bales out after a while, whereas occurrence analysis does not. 
+ -- + -- This can occasionally mean that the guidance is very pessimistic; + -- it gets fixed up next round + +mkCompulsoryUnfolding expr -- Used for things that absolutely must be unfolded + = CompulsoryUnfolding (occurAnalyseExpr expr) +\end{code} + + +%************************************************************************ +%* * +\subsection{The UnfoldingGuidance type} +%* * +%************************************************************************ + +\begin{code} +instance Outputable UnfoldingGuidance where + ppr UnfoldNever = ptext SLIT("NEVER") + ppr (UnfoldIfGoodArgs v cs size discount) + = hsep [ ptext SLIT("IF_ARGS"), int v, + brackets (hsep (map int cs)), + int size, + int discount ] +\end{code} + + +\begin{code} +calcUnfoldingGuidance + :: Int -- bomb out if size gets bigger than this + -> CoreExpr -- expression to look at + -> UnfoldingGuidance +calcUnfoldingGuidance bOMB_OUT_SIZE expr + = case collect_val_bndrs expr of { (inline, val_binders, body) -> + let + n_val_binders = length val_binders + + max_inline_size = n_val_binders+2 + -- The idea is that if there is an INLINE pragma (inline is True) + -- and there's a big body, we give a size of n_val_binders+2. This + -- This is just enough to fail the no-size-increase test in callSiteInline, + -- so that INLINE things don't get inlined into entirely boring contexts, + -- but no more. + + in + case (sizeExpr (iUnbox bOMB_OUT_SIZE) val_binders body) of + + TooBig + | not inline -> UnfoldNever + -- A big function with an INLINE pragma must + -- have an UnfoldIfGoodArgs guidance + | otherwise -> UnfoldIfGoodArgs n_val_binders + (map (const 0) val_binders) + max_inline_size 0 + + SizeIs size cased_args scrut_discount + -> UnfoldIfGoodArgs + n_val_binders + (map discount_for val_binders) + final_size + (iBox scrut_discount) + where + boxed_size = iBox size + + final_size | inline = boxed_size `min` max_inline_size + | otherwise = boxed_size + + -- Sometimes an INLINE thing is smaller than n_val_binders+2. + -- A particular case in point is a constructor, which has size 1. + -- We want to inline this regardless, hence the `min` + + discount_for b = foldlBag (\acc (b',n) -> if b==b' then acc+n else acc) + 0 cased_args + } + where + collect_val_bndrs e = go False [] e + -- We need to be a bit careful about how we collect the + -- value binders. In ptic, if we see + -- __inline_me (\x y -> e) + -- We want to say "2 value binders". Why? So that + -- we take account of information given for the arguments + + go inline rev_vbs (Note InlineMe e) = go True rev_vbs e + go inline rev_vbs (Lam b e) | isId b = go inline (b:rev_vbs) e + | otherwise = go inline rev_vbs e + go inline rev_vbs e = (inline, reverse rev_vbs, e) +\end{code} + +\begin{code} +sizeExpr :: Int# -- Bomb out if it gets bigger than this + -> [Id] -- Arguments; we're interested in which of these + -- get case'd + -> CoreExpr + -> ExprSize + +sizeExpr bOMB_OUT_SIZE top_args expr + = size_up expr + where + size_up (Type t) = sizeZero -- Types cost nothing + size_up (Var v) = sizeOne + + size_up (Note InlineMe body) = sizeOne -- Inline notes make it look very small + -- This can be important. If you have an instance decl like this: + -- instance Foo a => Foo [a] where + -- {-# INLINE op1, op2 #-} + -- op1 = ... + -- op2 = ... 
+ -- then we'll get a dfun which is a pair of two INLINE lambdas + + size_up (Note _ body) = size_up body -- Other notes cost nothing + + size_up (App fun (Type t)) = size_up fun + size_up (App fun arg) = size_up_app fun [arg] + + size_up (Lit lit) = sizeN (litSize lit) + + size_up (Lam b e) | isId b = lamScrutDiscount (size_up e `addSizeN` 1) + | otherwise = size_up e + + size_up (Let (NonRec binder rhs) body) + = nukeScrutDiscount (size_up rhs) `addSize` + size_up body `addSizeN` + (if isUnLiftedType (idType binder) then 0 else 1) + -- For the allocation + -- If the binder has an unlifted type there is no allocation + + size_up (Let (Rec pairs) body) + = nukeScrutDiscount rhs_size `addSize` + size_up body `addSizeN` + length pairs -- For the allocation + where + rhs_size = foldr (addSize . size_up . snd) sizeZero pairs + + size_up (Case (Var v) _ _ alts) + | v `elem` top_args -- We are scrutinising an argument variable + = +{- I'm nuking this special case; BUT see the comment with case alternatives. + + (a) It's too eager. We don't want to inline a wrapper into a + context with no benefit. + E.g. \ x. f (x+x) no point in inlining (+) here! + + (b) It's ineffective. Once g's wrapper is inlined, its case-expressions + aren't scrutinising arguments any more + + case alts of + + [alt] -> size_up_alt alt `addSize` SizeIs 0# (unitBag (v, 1)) 0# + -- We want to make wrapper-style evaluation look cheap, so that + -- when we inline a wrapper it doesn't make call site (much) bigger + -- Otherwise we get nasty phase ordering stuff: + -- f x = g x x + -- h y = ...(f e)... + -- If we inline g's wrapper, f looks big, and doesn't get inlined + -- into h; if we inline f first, while it looks small, then g's + -- wrapper will get inlined later anyway. To avoid this nasty + -- ordering difference, we make (case a of (x,y) -> ...), + -- *where a is one of the arguments* look free. + + other -> +-} + alts_size (foldr addSize sizeOne alt_sizes) -- The 1 is for the scrutinee + (foldr1 maxSize alt_sizes) + + -- Good to inline if an arg is scrutinised, because + -- that may eliminate allocation in the caller + -- And it eliminates the case itself + + where + alt_sizes = map size_up_alt alts + + -- alts_size tries to compute a good discount for + -- the case when we are scrutinising an argument variable + alts_size (SizeIs tot tot_disc tot_scrut) -- Size of all alternatives + (SizeIs max max_disc max_scrut) -- Size of biggest alternative + = SizeIs tot (unitBag (v, iBox (_ILIT 1 +# tot -# max)) `unionBags` max_disc) max_scrut + -- If the variable is known, we produce a discount that + -- will take us back to 'max', the size of rh largest alternative + -- The 1+ is a little discount for reduced allocation in the caller + alts_size tot_size _ = tot_size + +-- gaw 2004 + size_up (Case e _ _ alts) = nukeScrutDiscount (size_up e) `addSize` + foldr (addSize . size_up_alt) sizeZero alts + -- We don't charge for the case itself + -- It's a strict thing, and the price of the call + -- is paid by scrut. Also consider + -- case f x of DEFAULT -> e + -- This is just ';'! Don't charge for it. + + ------------ + size_up_app (App fun arg) args + | isTypeArg arg = size_up_app fun args + | otherwise = size_up_app fun (arg:args) + size_up_app fun args = foldr (addSize . nukeScrutDiscount . 
size_up) + (size_up_fun fun args) + args + + -- A function application with at least one value argument + -- so if the function is an argument give it an arg-discount + -- + -- Also behave specially if the function is a build + -- + -- Also if the function is a constant Id (constr or primop) + -- compute discounts specially + size_up_fun (Var fun) args + | fun `hasKey` buildIdKey = buildSize + | fun `hasKey` augmentIdKey = augmentSize + | otherwise + = case globalIdDetails fun of + DataConWorkId dc -> conSizeN dc (valArgCount args) + + FCallId fc -> sizeN opt_UF_DearOp + PrimOpId op -> primOpSize op (valArgCount args) + -- foldr addSize (primOpSize op) (map arg_discount args) + -- At one time I tried giving an arg-discount if a primop + -- is applied to one of the function's arguments, but it's + -- not good. At the moment, any unlifted-type arg gets a + -- 'True' for 'yes I'm evald', so we collect the discount even + -- if we know nothing about it. And just having it in a primop + -- doesn't help at all if we don't know something more. + + other -> fun_discount fun `addSizeN` + (1 + length (filter (not . exprIsTrivial) args)) + -- The 1+ is for the function itself + -- Add 1 for each non-trivial arg; + -- the allocation cost, as in let(rec) + -- Slight hack here: for constructors the args are almost always + -- trivial; and for primops they are almost always prim typed + -- We should really only count for non-prim-typed args in the + -- general case, but that seems too much like hard work + + size_up_fun other args = size_up other + + ------------ + size_up_alt (con, bndrs, rhs) = size_up rhs + -- Don't charge for args, so that wrappers look cheap + -- (See comments about wrappers with Case) + + ------------ + -- We want to record if we're case'ing, or applying, an argument + fun_discount v | v `elem` top_args = SizeIs 0# (unitBag (v, opt_UF_FunAppDiscount)) 0# + fun_discount other = sizeZero + + ------------ + -- These addSize things have to be here because + -- I don't want to give them bOMB_OUT_SIZE as an argument + + addSizeN TooBig _ = TooBig + addSizeN (SizeIs n xs d) m = mkSizeIs bOMB_OUT_SIZE (n +# iUnbox m) xs d + + addSize TooBig _ = TooBig + addSize _ TooBig = TooBig + addSize (SizeIs n1 xs d1) (SizeIs n2 ys d2) + = mkSizeIs bOMB_OUT_SIZE (n1 +# n2) (xs `unionBags` ys) (d1 +# d2) +\end{code} + +Code for manipulating sizes + +\begin{code} +data ExprSize = TooBig + | SizeIs FastInt -- Size found + (Bag (Id,Int)) -- Arguments cased herein, and discount for each such + FastInt -- Size to subtract if result is scrutinised + -- by a case expression + +-- subtract the discount before deciding whether to bale out. eg. we +-- want to inline a large constructor application into a selector: +-- tup = (a_1, ..., a_99) +-- x = case tup of ... +-- +mkSizeIs max n xs d | (n -# d) ># max = TooBig + | otherwise = SizeIs n xs d + +maxSize TooBig _ = TooBig +maxSize _ TooBig = TooBig +maxSize s1@(SizeIs n1 _ _) s2@(SizeIs n2 _ _) | n1 ># n2 = s1 + | otherwise = s2 + +sizeZero = SizeIs (_ILIT 0) emptyBag (_ILIT 0) +sizeOne = SizeIs (_ILIT 1) emptyBag (_ILIT 0) +sizeN n = SizeIs (iUnbox n) emptyBag (_ILIT 0) +conSizeN dc n + | isUnboxedTupleCon dc = SizeIs (_ILIT 0) emptyBag (iUnbox n +# _ILIT 1) + | otherwise = SizeIs (_ILIT 1) emptyBag (iUnbox n +# _ILIT 1) + -- Treat constructors as size 1; we are keen to expose them + -- (and we charge separately for their args). 
We can't treat + -- them as size zero, else we find that (iBox x) has size 1, + -- which is the same as a lone variable; and hence 'v' will + -- always be replaced by (iBox x), where v is bound to iBox x. + -- + -- However, unboxed tuples count as size zero + -- I found occasions where we had + -- f x y z = case op# x y z of { s -> (# s, () #) } + -- and f wasn't getting inlined + +primOpSize op n_args + | not (primOpIsDupable op) = sizeN opt_UF_DearOp + | not (primOpOutOfLine op) = sizeN (2 - n_args) + -- Be very keen to inline simple primops. + -- We give a discount of 1 for each arg so that (op# x y z) costs 2. + -- We can't make it cost 1, else we'll inline let v = (op# x y z) + -- at every use of v, which is excessive. + -- + -- A good example is: + -- let x = +# p q in C {x} + -- Even though x get's an occurrence of 'many', its RHS looks cheap, + -- and there's a good chance it'll get inlined back into C's RHS. Urgh! + | otherwise = sizeOne + +buildSize = SizeIs (-2#) emptyBag 4# + -- We really want to inline applications of build + -- build t (\cn -> e) should cost only the cost of e (because build will be inlined later) + -- Indeed, we should add a result_discount becuause build is + -- very like a constructor. We don't bother to check that the + -- build is saturated (it usually is). The "-2" discounts for the \c n, + -- The "4" is rather arbitrary. + +augmentSize = SizeIs (-2#) emptyBag 4# + -- Ditto (augment t (\cn -> e) ys) should cost only the cost of + -- e plus ys. The -2 accounts for the \cn + +nukeScrutDiscount (SizeIs n vs d) = SizeIs n vs 0# +nukeScrutDiscount TooBig = TooBig + +-- When we return a lambda, give a discount if it's used (applied) +lamScrutDiscount (SizeIs n vs d) = case opt_UF_FunAppDiscount of { d -> SizeIs n vs (iUnbox d) } +lamScrutDiscount TooBig = TooBig +\end{code} + + +%************************************************************************ +%* * +\subsection[considerUnfolding]{Given all the info, do (not) do the unfolding} +%* * +%************************************************************************ + +We have very limited information about an unfolding expression: (1)~so +many type arguments and so many value arguments expected---for our +purposes here, we assume we've got those. (2)~A ``size'' or ``cost,'' +a single integer. (3)~An ``argument info'' vector. For this, what we +have at the moment is a Boolean per argument position that says, ``I +will look with great favour on an explicit constructor in this +position.'' (4)~The ``discount'' to subtract if the expression +is being scrutinised. + +Assuming we have enough type- and value arguments (if not, we give up +immediately), then we see if the ``discounted size'' is below some +(semi-arbitrary) threshold. It works like this: for every argument +position where we're looking for a constructor AND WE HAVE ONE in our +hands, we get a (again, semi-arbitrary) discount [proportion to the +number of constructors in the type being scrutinized]. + +If we're in the context of a scrutinee ( \tr{(case <expr > of A .. -> ...;.. )}) +and the expression in question will evaluate to a constructor, we use +the computed discount size *for the result only* rather than +computing the argument discounts. Since we know the result of +the expression is going to be taken apart, discounting its size +is more accurate (see @sizeExpr@ above for how this discount size +is computed). + +We use this one to avoid exporting inlinings that we ``couldn't possibly +use'' on the other side. Can be overridden w/ flaggery. 
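
The decision described above can be written down as a tiny standalone
function.  The sketch below is purely illustrative -- the names, the
threshold and the discount figures are invented -- and the real calculation,
including the keenness factor and the per-call-site information, is what
callSiteInline and computeDiscount do later in this module.

\begin{code}
-- Purely illustrative; not part of CoreUnfold.
-- Would an unfolding of this size fit under the threshold at a call site
-- where 'arg_is_con' records which argument positions have a manifest
-- constructor, and 'scrutinised' says whether the result feeds a case?
wouldInlineSketch :: Int      -- threshold
                  -> Int      -- size of the unfolding
                  -> [Int]    -- discount per argument position
                  -> Int      -- discount if the result is scrutinised
                  -> [Bool]   -- constructor actually supplied here?
                  -> Bool     -- result scrutinised at this call site?
                  -> Bool
wouldInlineSketch threshold size arg_discounts res_discount arg_is_con scrutinised
  = size - discount <= threshold
  where
    discount = sum (zipWith pick arg_discounts arg_is_con)
             + (if scrutinised then res_discount else 0)
    pick d True  = d
    pick _ False = 0

-- For example, a size-12 unfolding against a threshold of 8 is rejected on
-- its own, but accepted once its first argument is a constructor worth a
-- discount of 6:
--   wouldInlineSketch 8 12 [6,2] 4 [True,False] False  ==  True
\end{code}
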
+Just the same as smallEnoughToInline, except that it has no actual arguments. + +\begin{code} +couldBeSmallEnoughToInline :: Int -> CoreExpr -> Bool +couldBeSmallEnoughToInline threshold rhs = case calcUnfoldingGuidance threshold rhs of + UnfoldNever -> False + other -> True + +certainlyWillInline :: Unfolding -> Bool + -- Sees if the unfolding is pretty certain to inline +certainlyWillInline (CoreUnfolding _ _ _ is_cheap (UnfoldIfGoodArgs n_vals _ size _)) + = is_cheap && size - (n_vals +1) <= opt_UF_UseThreshold +certainlyWillInline other + = False + +smallEnoughToInline :: Unfolding -> Bool +smallEnoughToInline (CoreUnfolding _ _ _ _ (UnfoldIfGoodArgs _ _ size _)) + = size <= opt_UF_UseThreshold +smallEnoughToInline other + = False +\end{code} + +%************************************************************************ +%* * +\subsection{callSiteInline} +%* * +%************************************************************************ + +This is the key function. It decides whether to inline a variable at a call site + +callSiteInline is used at call sites, so it is a bit more generous. +It's a very important function that embodies lots of heuristics. +A non-WHNF can be inlined if it doesn't occur inside a lambda, +and occurs exactly once or + occurs once in each branch of a case and is small + +If the thing is in WHNF, there's no danger of duplicating work, +so we can inline if it occurs once, or is small + +NOTE: we don't want to inline top-level functions that always diverge. +It just makes the code bigger. Tt turns out that the convenient way to prevent +them inlining is to give them a NOINLINE pragma, which we do in +StrictAnal.addStrictnessInfoToTopId + +\begin{code} +callSiteInline :: DynFlags + -> Bool -- True <=> the Id can be inlined + -> Bool -- 'inline' note at call site + -> OccInfo + -> Id -- The Id + -> [Bool] -- One for each value arg; True if it is interesting + -> Bool -- True <=> continuation is interesting + -> Maybe CoreExpr -- Unfolding, if any + + +callSiteInline dflags active_inline inline_call occ id arg_infos interesting_cont + = case idUnfolding id of { + NoUnfolding -> Nothing ; + OtherCon cs -> Nothing ; + + CompulsoryUnfolding unf_template -> Just unf_template ; + -- CompulsoryUnfolding => there is no top-level binding + -- for these things, so we must inline it. + -- Only a couple of primop-like things have + -- compulsory unfoldings (see MkId.lhs). + -- We don't allow them to be inactive + + CoreUnfolding unf_template is_top is_value is_cheap guidance -> + + let + result | yes_or_no = Just unf_template + | otherwise = Nothing + + n_val_args = length arg_infos + + yes_or_no + | not active_inline = False + | otherwise = case occ of + IAmDead -> pprTrace "callSiteInline: dead" (ppr id) False + IAmALoopBreaker -> False + --OneOcc in_lam _ _ -> (not in_lam || is_cheap) && consider_safe True + other -> is_cheap && consider_safe False + -- we consider even the once-in-one-branch + -- occurrences, because they won't all have been + -- caught by preInlineUnconditionally. In particular, + -- if the occurrence is once inside a lambda, and the + -- rhs is cheap but not a manifest lambda, then + -- pre-inline will not have inlined it for fear of + -- invalidating the occurrence info in the rhs. + + consider_safe once + -- consider_safe decides whether it's a good idea to + -- inline something, given that there's no + -- work-duplication issue (the caller checks that). 
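      -- (A hypothetical worked instance of the first guard below, with
      --  invented numbers: if n_vals_wanted is 2, a saturated call
      --  "f x y" already occupies size 3 -- one for the function plus
      --  one per value argument -- so an unfolding of size 3 or smaller
      --  is accepted outright, before any discounts are even considered.)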
+ | inline_call = True + + | otherwise + = case guidance of + UnfoldNever -> False + UnfoldIfGoodArgs n_vals_wanted arg_discounts size res_discount + + | enough_args && size <= (n_vals_wanted + 1) + -- Inline unconditionally if there no size increase + -- Size of call is n_vals_wanted (+1 for the function) + -> True + + | otherwise + -> some_benefit && small_enough + + where + some_benefit = or arg_infos || really_interesting_cont || + (not is_top && ({- once || -} (n_vals_wanted > 0 && enough_args))) + -- [was (once && not in_lam)] + -- If it occurs more than once, there must be + -- something interesting about some argument, or the + -- result context, to make it worth inlining + -- + -- If a function has a nested defn we also record + -- some-benefit, on the grounds that we are often able + -- to eliminate the binding, and hence the allocation, + -- for the function altogether; this is good for join + -- points. But this only makes sense for *functions*; + -- inlining a constructor doesn't help allocation + -- unless the result is scrutinised. UNLESS the + -- constructor occurs just once, albeit possibly in + -- multiple case branches. Then inlining it doesn't + -- increase allocation, but it does increase the + -- chance that the constructor won't be allocated at + -- all in the branches that don't use it. + + enough_args = n_val_args >= n_vals_wanted + really_interesting_cont | n_val_args < n_vals_wanted = False -- Too few args + | n_val_args == n_vals_wanted = interesting_cont + | otherwise = True -- Extra args + -- really_interesting_cont tells if the result of the + -- call is in an interesting context. + + small_enough = (size - discount) <= opt_UF_UseThreshold + discount = computeDiscount n_vals_wanted arg_discounts res_discount + arg_infos really_interesting_cont + + in + if dopt Opt_D_dump_inlinings dflags then + pprTrace "Considering inlining" + (ppr id <+> vcat [text "active:" <+> ppr active_inline, + text "occ info:" <+> ppr occ, + text "arg infos" <+> ppr arg_infos, + text "interesting continuation" <+> ppr interesting_cont, + text "is value:" <+> ppr is_value, + text "is cheap:" <+> ppr is_cheap, + text "guidance" <+> ppr guidance, + text "ANSWER =" <+> if yes_or_no then text "YES" else text "NO"]) + result + else + result + } + +computeDiscount :: Int -> [Int] -> Int -> [Bool] -> Bool -> Int +computeDiscount n_vals_wanted arg_discounts res_discount arg_infos result_used + -- We multiple the raw discounts (args_discount and result_discount) + -- ty opt_UnfoldingKeenessFactor because the former have to do with + -- *size* whereas the discounts imply that there's some extra + -- *efficiency* to be gained (e.g. beta reductions, case reductions) + -- by inlining. + + -- we also discount 1 for each argument passed, because these will + -- reduce with the lambdas in the function (we count 1 for a lambda + -- in size_up). 
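    -- A worked instance with invented numbers: with n_vals_wanted = 2,
    -- arg_discounts = [10,2], res_discount = 4, arg_infos = [True,False],
    -- result_used = True and a keenness factor of 1.5, the evaluated first
    -- argument earns 10, the unused second earns nothing, the result earns 4,
    -- and the whole discount comes to
    --     1 + 2 + round (1.5 * fromIntegral (10 + 4)) = 24
    -- which callSiteInline subtracts from the unfolding's size before the
    -- comparison against opt_UF_UseThreshold.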
+ = 1 + -- Discount of 1 because the result replaces the call + -- so we count 1 for the function itself + length (take n_vals_wanted arg_infos) + + -- Discount of 1 for each arg supplied, because the + -- result replaces the call + round (opt_UF_KeenessFactor * + fromIntegral (arg_discount + result_discount)) + where + arg_discount = sum (zipWith mk_arg_discount arg_discounts arg_infos) + + mk_arg_discount discount is_evald | is_evald = discount + | otherwise = 0 + + -- Don't give a result discount unless there are enough args + result_discount | result_used = res_discount -- Over-applied, or case scrut + | otherwise = 0 +\end{code} diff --git a/compiler/coreSyn/CoreUtils.lhs b/compiler/coreSyn/CoreUtils.lhs new file mode 100644 index 0000000000..e358be4439 --- /dev/null +++ b/compiler/coreSyn/CoreUtils.lhs @@ -0,0 +1,1316 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[CoreUtils]{Utility functions on @Core@ syntax} + +\begin{code} +module CoreUtils ( + -- Construction + mkInlineMe, mkSCC, mkCoerce, mkCoerce2, + bindNonRec, needsCaseBinding, + mkIfThenElse, mkAltExpr, mkPiType, mkPiTypes, + + -- Taking expressions apart + findDefault, findAlt, isDefaultAlt, + + -- Properties of expressions + exprType, coreAltType, + exprIsDupable, exprIsTrivial, exprIsCheap, + exprIsHNF,exprOkForSpeculation, exprIsBig, + exprIsConApp_maybe, exprIsBottom, + rhsIsStatic, + + -- Arity and eta expansion + manifestArity, exprArity, + exprEtaExpandArity, etaExpand, + + -- Size + coreBindsSize, + + -- Hashing + hashExpr, + + -- Equality + cheapEqExpr, tcEqExpr, tcEqExprX, applyTypeToArgs, applyTypeToArg + ) where + +#include "HsVersions.h" + + +import GLAEXTS -- For `xori` + +import CoreSyn +import CoreFVs ( exprFreeVars ) +import PprCore ( pprCoreExpr ) +import Var ( Var ) +import VarSet ( unionVarSet ) +import VarEnv +import Name ( hashName ) +import Packages ( HomeModules ) +#if mingw32_TARGET_OS +import Packages ( isDllName ) +#endif +import Literal ( hashLiteral, literalType, litIsDupable, + litIsTrivial, isZeroLit, Literal( MachLabel ) ) +import DataCon ( DataCon, dataConRepArity, dataConInstArgTys, + isVanillaDataCon, dataConTyCon ) +import PrimOp ( PrimOp(..), primOpOkForSpeculation, primOpIsCheap ) +import Id ( Id, idType, globalIdDetails, idNewStrictness, + mkWildId, idArity, idName, idUnfolding, idInfo, + isOneShotBndr, isStateHackType, isDataConWorkId_maybe, mkSysLocal, + isDataConWorkId, isBottomingId + ) +import IdInfo ( GlobalIdDetails(..), megaSeqIdInfo ) +import NewDemand ( appIsBottom ) +import Type ( Type, mkFunTy, mkForAllTy, splitFunTy_maybe, + splitFunTy, tcEqTypeX, + applyTys, isUnLiftedType, seqType, mkTyVarTy, + splitForAllTy_maybe, isForAllTy, splitRecNewType_maybe, + splitTyConApp_maybe, coreEqType, funResultTy, applyTy + ) +import TyCon ( tyConArity ) +import TysWiredIn ( boolTy, trueDataCon, falseDataCon ) +import CostCentre ( CostCentre ) +import BasicTypes ( Arity ) +import Unique ( Unique ) +import Outputable +import TysPrim ( alphaTy ) -- Debugging only +import Util ( equalLength, lengthAtLeast, foldl2 ) +\end{code} + + +%************************************************************************ +%* * +\subsection{Find the type of a Core atom/expression} +%* * +%************************************************************************ + +\begin{code} +exprType :: CoreExpr -> Type + +exprType (Var var) = idType var +exprType (Lit lit) = literalType lit +exprType (Let _ body) = exprType body +exprType (Case _ _ ty alts) = ty +exprType (Note (Coerce ty 
_) e) = ty -- **! should take usage from e +exprType (Note other_note e) = exprType e +exprType (Lam binder expr) = mkPiType binder (exprType expr) +exprType e@(App _ _) + = case collectArgs e of + (fun, args) -> applyTypeToArgs e (exprType fun) args + +exprType other = pprTrace "exprType" (pprCoreExpr other) alphaTy + +coreAltType :: CoreAlt -> Type +coreAltType (_,_,rhs) = exprType rhs +\end{code} + +@mkPiType@ makes a (->) type or a forall type, depending on whether +it is given a type variable or a term variable. We cleverly use the +lbvarinfo field to figure out the right annotation for the arrove in +case of a term variable. + +\begin{code} +mkPiType :: Var -> Type -> Type -- The more polymorphic version +mkPiTypes :: [Var] -> Type -> Type -- doesn't work... + +mkPiTypes vs ty = foldr mkPiType ty vs + +mkPiType v ty + | isId v = mkFunTy (idType v) ty + | otherwise = mkForAllTy v ty +\end{code} + +\begin{code} +applyTypeToArg :: Type -> CoreExpr -> Type +applyTypeToArg fun_ty (Type arg_ty) = applyTy fun_ty arg_ty +applyTypeToArg fun_ty other_arg = funResultTy fun_ty + +applyTypeToArgs :: CoreExpr -> Type -> [CoreExpr] -> Type +-- A more efficient version of applyTypeToArg +-- when we have several args +-- The first argument is just for debugging +applyTypeToArgs e op_ty [] = op_ty + +applyTypeToArgs e op_ty (Type ty : args) + = -- Accumulate type arguments so we can instantiate all at once + go [ty] args + where + go rev_tys (Type ty : args) = go (ty:rev_tys) args + go rev_tys rest_args = applyTypeToArgs e op_ty' rest_args + where + op_ty' = applyTys op_ty (reverse rev_tys) + +applyTypeToArgs e op_ty (other_arg : args) + = case (splitFunTy_maybe op_ty) of + Just (_, res_ty) -> applyTypeToArgs e res_ty args + Nothing -> pprPanic "applyTypeToArgs" (pprCoreExpr e) +\end{code} + + + +%************************************************************************ +%* * +\subsection{Attaching notes} +%* * +%************************************************************************ + +mkNote removes redundant coercions, and SCCs where possible + +\begin{code} +#ifdef UNUSED +mkNote :: Note -> CoreExpr -> CoreExpr +mkNote (Coerce to_ty from_ty) expr = mkCoerce2 to_ty from_ty expr +mkNote (SCC cc) expr = mkSCC cc expr +mkNote InlineMe expr = mkInlineMe expr +mkNote note expr = Note note expr +#endif + +-- Slide InlineCall in around the function +-- No longer necessary I think (SLPJ Apr 99) +-- mkNote InlineCall (App f a) = App (mkNote InlineCall f) a +-- mkNote InlineCall (Var v) = Note InlineCall (Var v) +-- mkNote InlineCall expr = expr +\end{code} + +Drop trivial InlineMe's. This is somewhat important, because if we have an unfolding +that looks like (Note InlineMe (Var v)), the InlineMe doesn't go away because it may +not be *applied* to anything. + +We don't use exprIsTrivial here, though, because we sometimes generate worker/wrapper +bindings like + fw = ... + f = inline_me (coerce t fw) +As usual, the inline_me prevents the worker from getting inlined back into the wrapper. +We want the split, so that the coerces can cancel at the call site. + +However, we can get left with tiresome type applications. Notably, consider + f = /\ a -> let t = e in (t, w) +Then lifting the let out of the big lambda gives + t' = /\a -> e + f = /\ a -> let t = inline_me (t' a) in (t, w) +The inline_me is to stop the simplifier inlining t' right back +into t's RHS. In the next phase we'll substitute for t (since +its rhs is trivial) and *then* we could get rid of the inline_me. 
+But it hardly seems worth it, so I don't bother. + +\begin{code} +mkInlineMe (Var v) = Var v +mkInlineMe e = Note InlineMe e +\end{code} + + + +\begin{code} +mkCoerce :: Type -> CoreExpr -> CoreExpr +mkCoerce to_ty expr = mkCoerce2 to_ty (exprType expr) expr + +mkCoerce2 :: Type -> Type -> CoreExpr -> CoreExpr +mkCoerce2 to_ty from_ty (Note (Coerce to_ty2 from_ty2) expr) + = ASSERT( from_ty `coreEqType` to_ty2 ) + mkCoerce2 to_ty from_ty2 expr + +mkCoerce2 to_ty from_ty expr + | to_ty `coreEqType` from_ty = expr + | otherwise = ASSERT( from_ty `coreEqType` exprType expr ) + Note (Coerce to_ty from_ty) expr +\end{code} + +\begin{code} +mkSCC :: CostCentre -> Expr b -> Expr b + -- Note: Nested SCC's *are* preserved for the benefit of + -- cost centre stack profiling +mkSCC cc (Lit lit) = Lit lit +mkSCC cc (Lam x e) = Lam x (mkSCC cc e) -- Move _scc_ inside lambda +mkSCC cc (Note (SCC cc') e) = Note (SCC cc) (Note (SCC cc') e) +mkSCC cc (Note n e) = Note n (mkSCC cc e) -- Move _scc_ inside notes +mkSCC cc expr = Note (SCC cc) expr +\end{code} + + +%************************************************************************ +%* * +\subsection{Other expression construction} +%* * +%************************************************************************ + +\begin{code} +bindNonRec :: Id -> CoreExpr -> CoreExpr -> CoreExpr +-- (bindNonRec x r b) produces either +-- let x = r in b +-- or +-- case r of x { _DEFAULT_ -> b } +-- +-- depending on whether x is unlifted or not +-- It's used by the desugarer to avoid building bindings +-- that give Core Lint a heart attack. Actually the simplifier +-- deals with them perfectly well. + +bindNonRec bndr rhs body + | needsCaseBinding (idType bndr) rhs = Case rhs bndr (exprType body) [(DEFAULT,[],body)] + | otherwise = Let (NonRec bndr rhs) body + +needsCaseBinding ty rhs = isUnLiftedType ty && not (exprOkForSpeculation rhs) + -- Make a case expression instead of a let + -- These can arise either from the desugarer, + -- or from beta reductions: (\x.e) (x +# y) +\end{code} + +\begin{code} +mkAltExpr :: AltCon -> [CoreBndr] -> [Type] -> CoreExpr + -- This guy constructs the value that the scrutinee must have + -- when you are in one particular branch of a case +mkAltExpr (DataAlt con) args inst_tys + = mkConApp con (map Type inst_tys ++ map varToCoreExpr args) +mkAltExpr (LitAlt lit) [] [] + = Lit lit + +mkIfThenElse :: CoreExpr -> CoreExpr -> CoreExpr -> CoreExpr +mkIfThenElse guard then_expr else_expr +-- Not going to be refining, so okay to take the type of the "then" clause + = Case guard (mkWildId boolTy) (exprType then_expr) + [ (DataAlt falseDataCon, [], else_expr), -- Increasing order of tag! + (DataAlt trueDataCon, [], then_expr) ] +\end{code} + + +%************************************************************************ +%* * +\subsection{Taking expressions apart} +%* * +%************************************************************************ + +The default alternative must be first, if it exists at all. +This makes it easy to find, though it makes matching marginally harder. 
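
Purely as an illustration (with invented names and a simplified key type),
the ordering buys the usual sorted-association-list trick: a lookup can fall
back on the default as soon as it walks past the point where the key would
have appeared.  findAlt below does exactly this, using cmpAltCon.

\begin{code}
-- Illustration only; 'lookupOrdered' is not part of CoreUtils.
lookupOrdered :: Ord k => k -> v -> [(k, v)] -> v
lookupOrdered key deflt = go
  where
    go [] = deflt
    go ((k, v) : rest)
      | key < k   = deflt           -- walked past it: it cannot occur later
      | key == k  = v
      | otherwise = go rest

-- lookupOrdered 3 "dflt" [(1,"a"),(3,"b"),(7,"c")] == "b"
-- lookupOrdered 5 "dflt" [(1,"a"),(3,"b"),(7,"c")] == "dflt"   -- stops at 7
\end{code}
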
+ +\begin{code} +findDefault :: [CoreAlt] -> ([CoreAlt], Maybe CoreExpr) +findDefault ((DEFAULT,args,rhs) : alts) = ASSERT( null args ) (alts, Just rhs) +findDefault alts = (alts, Nothing) + +findAlt :: AltCon -> [CoreAlt] -> CoreAlt +findAlt con alts + = case alts of + (deflt@(DEFAULT,_,_):alts) -> go alts deflt + other -> go alts panic_deflt + where + panic_deflt = pprPanic "Missing alternative" (ppr con $$ vcat (map ppr alts)) + + go [] deflt = deflt + go (alt@(con1,_,_) : alts) deflt + = case con `cmpAltCon` con1 of + LT -> deflt -- Missed it already; the alts are in increasing order + EQ -> alt + GT -> ASSERT( not (con1 == DEFAULT) ) go alts deflt + +isDefaultAlt :: CoreAlt -> Bool +isDefaultAlt (DEFAULT, _, _) = True +isDefaultAlt other = False +\end{code} + + +%************************************************************************ +%* * +\subsection{Figuring out things about expressions} +%* * +%************************************************************************ + +@exprIsTrivial@ is true of expressions we are unconditionally happy to + duplicate; simple variables and constants, and type + applications. Note that primop Ids aren't considered + trivial unless + +@exprIsBottom@ is true of expressions that are guaranteed to diverge + + +There used to be a gruesome test for (hasNoBinding v) in the +Var case: + exprIsTrivial (Var v) | hasNoBinding v = idArity v == 0 +The idea here is that a constructor worker, like $wJust, is +really short for (\x -> $wJust x), becuase $wJust has no binding. +So it should be treated like a lambda. Ditto unsaturated primops. +But now constructor workers are not "have-no-binding" Ids. And +completely un-applied primops and foreign-call Ids are sufficiently +rare that I plan to allow them to be duplicated and put up with +saturating them. + +SCC notes. We do not treat (_scc_ "foo" x) as trivial, because + a) it really generates code, (and a heap object when it's + a function arg) to capture the cost centre + b) see the note [SCC-and-exprIsTrivial] in Simplify.simplLazyBind + +\begin{code} +exprIsTrivial (Var v) = True -- See notes above +exprIsTrivial (Type _) = True +exprIsTrivial (Lit lit) = litIsTrivial lit +exprIsTrivial (App e arg) = not (isRuntimeArg arg) && exprIsTrivial e +exprIsTrivial (Note (SCC _) e) = False -- See notes above +exprIsTrivial (Note _ e) = exprIsTrivial e +exprIsTrivial (Lam b body) = not (isRuntimeVar b) && exprIsTrivial body +exprIsTrivial other = False +\end{code} + + +@exprIsDupable@ is true of expressions that can be duplicated at a modest + cost in code size. This will only happen in different case + branches, so there's no issue about duplicating work. + + That is, exprIsDupable returns True of (f x) even if + f is very very expensive to call. + + Its only purpose is to avoid fruitless let-binding + and then inlining of case join points + + +\begin{code} +exprIsDupable (Type _) = True +exprIsDupable (Var v) = True +exprIsDupable (Lit lit) = litIsDupable lit +exprIsDupable (Note InlineMe e) = True +exprIsDupable (Note _ e) = exprIsDupable e +exprIsDupable expr + = go expr 0 + where + go (Var v) n_args = True + go (App f a) n_args = n_args < dupAppSize + && exprIsDupable a + && go f (n_args+1) + go other n_args = False + +dupAppSize :: Int +dupAppSize = 4 -- Size of application we are prepared to duplicate +\end{code} + +@exprIsCheap@ looks at a Core expression and returns \tr{True} if +it is obviously in weak head normal form, or is cheap to get to WHNF. 
+[Note that that's not the same as exprIsDupable; an expression might be +big, and hence not dupable, but still cheap.] + +By ``cheap'' we mean a computation we're willing to: + push inside a lambda, or + inline at more than one place +That might mean it gets evaluated more than once, instead of being +shared. The main examples of things which aren't WHNF but are +``cheap'' are: + + * case e of + pi -> ei + (where e, and all the ei are cheap) + + * let x = e in b + (where e and b are cheap) + + * op x1 ... xn + (where op is a cheap primitive operator) + + * error "foo" + (because we are happy to substitute it inside a lambda) + +Notice that a variable is considered 'cheap': we can push it inside a lambda, +because sharing will make sure it is only evaluated once. + +\begin{code} +exprIsCheap :: CoreExpr -> Bool +exprIsCheap (Lit lit) = True +exprIsCheap (Type _) = True +exprIsCheap (Var _) = True +exprIsCheap (Note InlineMe e) = True +exprIsCheap (Note _ e) = exprIsCheap e +exprIsCheap (Lam x e) = isRuntimeVar x || exprIsCheap e +exprIsCheap (Case e _ _ alts) = exprIsCheap e && + and [exprIsCheap rhs | (_,_,rhs) <- alts] + -- Experimentally, treat (case x of ...) as cheap + -- (and case __coerce x etc.) + -- This improves arities of overloaded functions where + -- there is only dictionary selection (no construction) involved +exprIsCheap (Let (NonRec x _) e) + | isUnLiftedType (idType x) = exprIsCheap e + | otherwise = False + -- strict lets always have cheap right hand sides, and + -- do no allocation. + +exprIsCheap other_expr + = go other_expr 0 True + where + go (Var f) n_args args_cheap + = (idAppIsCheap f n_args && args_cheap) + -- A constructor, cheap primop, or partial application + + || idAppIsBottom f n_args + -- Application of a function which + -- always gives bottom; we treat this as cheap + -- because it certainly doesn't need to be shared! + + go (App f a) n_args args_cheap + | not (isRuntimeArg a) = go f n_args args_cheap + | otherwise = go f (n_args + 1) (exprIsCheap a && args_cheap) + + go other n_args args_cheap = False + +idAppIsCheap :: Id -> Int -> Bool +idAppIsCheap id n_val_args + | n_val_args == 0 = True -- Just a type application of + -- a variable (f t1 t2 t3) + -- counts as WHNF + | otherwise + = case globalIdDetails id of + DataConWorkId _ -> True + RecordSelId {} -> n_val_args == 1 -- I'm experimenting with making record selection + ClassOpId _ -> n_val_args == 1 -- look cheap, so we will substitute it inside a + -- lambda. Particularly for dictionary field selection. + -- BUT: Take care with (sel d x)! The (sel d) might be cheap, but + -- there's no guarantee that (sel d x) will be too. Hence (n_val_args == 1) + + PrimOpId op -> primOpIsCheap op -- In principle we should worry about primops + -- that return a type variable, since the result + -- might be applied to something, but I'm not going + -- to bother to check the number of args + other -> n_val_args < idArity id +\end{code} + +exprOkForSpeculation returns True of an expression that it is + + * safe to evaluate even if normal order eval might not + evaluate the expression at all, or + + * safe *not* to evaluate even if normal order would do so + +It returns True iff + + the expression guarantees to terminate, + soon, + without raising an exception, + without causing a side effect (e.g. writing a mutable variable) + +E.G. 
+ let x = case y# +# 1# of { r# -> I# r# } + in E +==> + case y# +# 1# of { r# -> + let x = I# r# + in E + } + +We can only do this if the (y+1) is ok for speculation: it has no +side effects, and can't diverge or raise an exception. + +\begin{code} +exprOkForSpeculation :: CoreExpr -> Bool +exprOkForSpeculation (Lit _) = True +exprOkForSpeculation (Type _) = True +exprOkForSpeculation (Var v) = isUnLiftedType (idType v) +exprOkForSpeculation (Note _ e) = exprOkForSpeculation e +exprOkForSpeculation other_expr + = case collectArgs other_expr of + (Var f, args) -> spec_ok (globalIdDetails f) args + other -> False + + where + spec_ok (DataConWorkId _) args + = True -- The strictness of the constructor has already + -- been expressed by its "wrapper", so we don't need + -- to take the arguments into account + + spec_ok (PrimOpId op) args + | isDivOp op, -- Special case for dividing operations that fail + [arg1, Lit lit] <- args -- only if the divisor is zero + = not (isZeroLit lit) && exprOkForSpeculation arg1 + -- Often there is a literal divisor, and this + -- can get rid of a thunk in an inner looop + + | otherwise + = primOpOkForSpeculation op && + all exprOkForSpeculation args + -- A bit conservative: we don't really need + -- to care about lazy arguments, but this is easy + + spec_ok other args = False + +isDivOp :: PrimOp -> Bool +-- True of dyadic operators that can fail +-- only if the second arg is zero +-- This function probably belongs in PrimOp, or even in +-- an automagically generated file.. but it's such a +-- special case I thought I'd leave it here for now. +isDivOp IntQuotOp = True +isDivOp IntRemOp = True +isDivOp WordQuotOp = True +isDivOp WordRemOp = True +isDivOp IntegerQuotRemOp = True +isDivOp IntegerDivModOp = True +isDivOp FloatDivOp = True +isDivOp DoubleDivOp = True +isDivOp other = False +\end{code} + + +\begin{code} +exprIsBottom :: CoreExpr -> Bool -- True => definitely bottom +exprIsBottom e = go 0 e + where + -- n is the number of args + go n (Note _ e) = go n e + go n (Let _ e) = go n e + go n (Case e _ _ _) = go 0 e -- Just check the scrut + go n (App e _) = go (n+1) e + go n (Var v) = idAppIsBottom v n + go n (Lit _) = False + go n (Lam _ _) = False + go n (Type _) = False + +idAppIsBottom :: Id -> Int -> Bool +idAppIsBottom id n_val_args = appIsBottom (idNewStrictness id) n_val_args +\end{code} + +@exprIsHNF@ returns true for expressions that are certainly *already* +evaluated to *head* normal form. This is used to decide whether it's ok +to change + + case x of _ -> e ===> e + +and to decide whether it's safe to discard a `seq` + +So, it does *not* treat variables as evaluated, unless they say they are. + +But it *does* treat partial applications and constructor applications +as values, even if their arguments are non-trivial, provided the argument +type is lifted; + e.g. (:) (f x) (map f xs) is a value + map (...redex...) is a value +Because `seq` on such things completes immediately + +For unlifted argument types, we have to be careful: + C (f x :: Int#) +Suppose (f x) diverges; then C (f x) is not a value. True, but +this form is illegal (see the invariants in CoreSyn). Args of unboxed +type must be ok-for-speculation (or trivial). + +\begin{code} +exprIsHNF :: CoreExpr -> Bool -- True => Value-lambda, constructor, PAP +exprIsHNF (Var v) -- NB: There are no value args at this point + = isDataConWorkId v -- Catches nullary constructors, + -- so that [] and () are values, for example + || idArity v > 0 -- Catches (e.g.) 
primops that don't have unfoldings + || isEvaldUnfolding (idUnfolding v) + -- Check the thing's unfolding; it might be bound to a value + -- A worry: what if an Id's unfolding is just itself: + -- then we could get an infinite loop... + +exprIsHNF (Lit l) = True +exprIsHNF (Type ty) = True -- Types are honorary Values; + -- we don't mind copying them +exprIsHNF (Lam b e) = isRuntimeVar b || exprIsHNF e +exprIsHNF (Note _ e) = exprIsHNF e +exprIsHNF (App e (Type _)) = exprIsHNF e +exprIsHNF (App e a) = app_is_value e [a] +exprIsHNF other = False + +-- There is at least one value argument +app_is_value (Var fun) args + | isDataConWorkId fun -- Constructor apps are values + || idArity fun > valArgCount args -- Under-applied function + = check_args (idType fun) args +app_is_value (App f a) as = app_is_value f (a:as) +app_is_value other as = False + + -- 'check_args' checks that unlifted-type args + -- are in fact guaranteed non-divergent +check_args fun_ty [] = True +check_args fun_ty (Type _ : args) = case splitForAllTy_maybe fun_ty of + Just (_, ty) -> check_args ty args +check_args fun_ty (arg : args) + | isUnLiftedType arg_ty = exprOkForSpeculation arg + | otherwise = check_args res_ty args + where + (arg_ty, res_ty) = splitFunTy fun_ty +\end{code} + +\begin{code} +exprIsConApp_maybe :: CoreExpr -> Maybe (DataCon, [CoreExpr]) +exprIsConApp_maybe (Note (Coerce to_ty from_ty) expr) + = -- Maybe this is over the top, but here we try to turn + -- coerce (S,T) ( x, y ) + -- effectively into + -- ( coerce S x, coerce T y ) + -- This happens in anger in PrelArrExts which has a coerce + -- case coerce memcpy a b of + -- (# r, s #) -> ... + -- where the memcpy is in the IO monad, but the call is in + -- the (ST s) monad + case exprIsConApp_maybe expr of { + Nothing -> Nothing ; + Just (dc, args) -> + + case splitTyConApp_maybe to_ty of { + Nothing -> Nothing ; + Just (tc, tc_arg_tys) | tc /= dataConTyCon dc -> Nothing + | not (isVanillaDataCon dc) -> Nothing + | otherwise -> + -- Type constructor must match + -- We knock out existentials to keep matters simple(r) + let + arity = tyConArity tc + val_args = drop arity args + to_arg_tys = dataConInstArgTys dc tc_arg_tys + mk_coerce ty arg = mkCoerce ty arg + new_val_args = zipWith mk_coerce to_arg_tys val_args + in + ASSERT( all isTypeArg (take arity args) ) + ASSERT( equalLength val_args to_arg_tys ) + Just (dc, map Type tc_arg_tys ++ new_val_args) + }} + +exprIsConApp_maybe (Note _ expr) + = exprIsConApp_maybe expr + -- We ignore InlineMe notes in case we have + -- x = __inline_me__ (a,b) + -- All part of making sure that INLINE pragmas never hurt + -- Marcin tripped on this one when making dictionaries more inlinable + -- + -- In fact, we ignore all notes. For example, + -- case _scc_ "foo" (C a b) of + -- C a b -> e + -- should be optimised away, but it will be only if we look + -- through the SCC note. 
+ +exprIsConApp_maybe expr = analyse (collectArgs expr) + where + analyse (Var fun, args) + | Just con <- isDataConWorkId_maybe fun, + args `lengthAtLeast` dataConRepArity con + -- Might be > because the arity excludes type args + = Just (con,args) + + -- Look through unfoldings, but only cheap ones, because + -- we are effectively duplicating the unfolding + analyse (Var fun, []) + | let unf = idUnfolding fun, + isCheapUnfolding unf + = exprIsConApp_maybe (unfoldingTemplate unf) + + analyse other = Nothing +\end{code} + + + +%************************************************************************ +%* * +\subsection{Eta reduction and expansion} +%* * +%************************************************************************ + +\begin{code} +exprEtaExpandArity :: CoreExpr -> Arity +{- The Arity returned is the number of value args the + thing can be applied to without doing much work + +exprEtaExpandArity is used when eta expanding + e ==> \xy -> e x y + +It returns 1 (or more) to: + case x of p -> \s -> ... +because for I/O ish things we really want to get that \s to the top. +We are prepared to evaluate x each time round the loop in order to get that + +It's all a bit more subtle than it looks: + +1. One-shot lambdas + +Consider one-shot lambdas + let x = expensive in \y z -> E +We want this to have arity 2 if the \y-abstraction is a 1-shot lambda +Hence the ArityType returned by arityType + +2. The state-transformer hack + +The one-shot lambda special cause is particularly important/useful for +IO state transformers, where we often get + let x = E in \ s -> ... + +and the \s is a real-world state token abstraction. Such abstractions +are almost invariably 1-shot, so we want to pull the \s out, past the +let x=E, even if E is expensive. So we treat state-token lambdas as +one-shot even if they aren't really. The hack is in Id.isOneShotBndr. + +3. Dealing with bottom + +Consider also + f = \x -> error "foo" +Here, arity 1 is fine. But if it is + f = \x -> case x of + True -> error "foo" + False -> \y -> x+y +then we want to get arity 2. Tecnically, this isn't quite right, because + (f True) `seq` 1 +should diverge, but it'll converge if we eta-expand f. Nevertheless, we +do so; it improves some programs significantly, and increasing convergence +isn't a bad thing. Hence the ABot/ATop in ArityType. + +Actually, the situation is worse. Consider + f = \x -> case x of + True -> \y -> x+y + False -> \y -> x-y +Can we eta-expand here? At first the answer looks like "yes of course", but +consider + (f bot) `seq` 1 +This should diverge! But if we eta-expand, it won't. Again, we ignore this +"problem", because being scrupulous would lose an important transformation for +many programs. + + +4. Newtypes + +Non-recursive newtypes are transparent, and should not get in the way. +We do (currently) eta-expand recursive newtypes too. So if we have, say + + newtype T = MkT ([T] -> Int) + +Suppose we have + e = coerce T f +where f has arity 1. Then: etaExpandArity e = 1; +that is, etaExpandArity looks through the coerce. + +When we eta-expand e to arity 1: eta_expand 1 e T +we want to get: coerce T (\x::[T] -> (coerce ([T]->Int) e) x) + +HOWEVER, note that if you use coerce bogusly you can ge + coerce Int negate +And since negate has arity 2, you might try to eta expand. But you can't +decopose Int to a function type. Hence the final case in eta_expand. 
+-} + + +exprEtaExpandArity e = arityDepth (arityType e) + +-- A limited sort of function type +data ArityType = AFun Bool ArityType -- True <=> one-shot + | ATop -- Know nothing + | ABot -- Diverges + +arityDepth :: ArityType -> Arity +arityDepth (AFun _ ty) = 1 + arityDepth ty +arityDepth ty = 0 + +andArityType ABot at2 = at2 +andArityType ATop at2 = ATop +andArityType (AFun t1 at1) (AFun t2 at2) = AFun (t1 && t2) (andArityType at1 at2) +andArityType at1 at2 = andArityType at2 at1 + +arityType :: CoreExpr -> ArityType + -- (go1 e) = [b1,..,bn] + -- means expression can be rewritten \x_b1 -> ... \x_bn -> body + -- where bi is True <=> the lambda is one-shot + +arityType (Note n e) = arityType e +-- Not needed any more: etaExpand is cleverer +-- | ok_note n = arityType e +-- | otherwise = ATop + +arityType (Var v) + = mk (idArity v) (arg_tys (idType v)) + where + mk :: Arity -> [Type] -> ArityType + -- The argument types are only to steer the "state hack" + -- Consider case x of + -- True -> foo + -- False -> \(s:RealWorld) -> e + -- where foo has arity 1. Then we want the state hack to + -- apply to foo too, so we can eta expand the case. + mk 0 tys | isBottomingId v = ABot + | otherwise = ATop + mk n (ty:tys) = AFun (isStateHackType ty) (mk (n-1) tys) + mk n [] = AFun False (mk (n-1) []) + + arg_tys :: Type -> [Type] -- Ignore for-alls + arg_tys ty + | Just (_, ty') <- splitForAllTy_maybe ty = arg_tys ty' + | Just (arg,res) <- splitFunTy_maybe ty = arg : arg_tys res + | otherwise = [] + + -- Lambdas; increase arity +arityType (Lam x e) | isId x = AFun (isOneShotBndr x) (arityType e) + | otherwise = arityType e + + -- Applications; decrease arity +arityType (App f (Type _)) = arityType f +arityType (App f a) = case arityType f of + AFun one_shot xs | exprIsCheap a -> xs + other -> ATop + + -- Case/Let; keep arity if either the expression is cheap + -- or it's a 1-shot lambda + -- The former is not really right for Haskell + -- f x = case x of { (a,b) -> \y. e } + -- ===> + -- f x y = case x of { (a,b) -> e } + -- The difference is observable using 'seq' +arityType (Case scrut _ _ alts) = case foldr1 andArityType [arityType rhs | (_,_,rhs) <- alts] of + xs@(AFun one_shot _) | one_shot -> xs + xs | exprIsCheap scrut -> xs + | otherwise -> ATop + +arityType (Let b e) = case arityType e of + xs@(AFun one_shot _) | one_shot -> xs + xs | all exprIsCheap (rhssOfBind b) -> xs + | otherwise -> ATop + +arityType other = ATop + +{- NOT NEEDED ANY MORE: etaExpand is cleverer +ok_note InlineMe = False +ok_note other = True + -- Notice that we do not look through __inline_me__ + -- This may seem surprising, but consider + -- f = _inline_me (\x -> e) + -- We DO NOT want to eta expand this to + -- f = \x -> (_inline_me (\x -> e)) x + -- because the _inline_me gets dropped now it is applied, + -- giving just + -- f = \x -> e + -- A Bad Idea +-} +\end{code} + + +\begin{code} +etaExpand :: Arity -- Result should have this number of value args + -> [Unique] + -> CoreExpr -> Type -- Expression and its type + -> CoreExpr +-- (etaExpand n us e ty) returns an expression with +-- the same meaning as 'e', but with arity 'n'. +-- +-- Given e' = etaExpand n us e ty +-- We should have +-- ty = exprType e = exprType e' +-- +-- Note that SCCs are not treated specially. 
If we have +-- etaExpand 2 (\x -> scc "foo" e) +-- = (\xy -> (scc "foo" e) y) +-- So the costs of evaluating 'e' (not 'e y') are attributed to "foo" + +etaExpand n us expr ty + | manifestArity expr >= n = expr -- The no-op case + | otherwise = eta_expand n us expr ty + where + +-- manifestArity sees how many leading value lambdas there are +manifestArity :: CoreExpr -> Arity +manifestArity (Lam v e) | isId v = 1 + manifestArity e + | otherwise = manifestArity e +manifestArity (Note _ e) = manifestArity e +manifestArity e = 0 + +-- etaExpand deals with for-alls. For example: +-- etaExpand 1 E +-- where E :: forall a. a -> a +-- would return +-- (/\b. \y::a -> E b y) +-- +-- It deals with coerces too, though they are now rare +-- so perhaps the extra code isn't worth it + +eta_expand n us expr ty + | n == 0 && + -- The ILX code generator requires eta expansion for type arguments + -- too, but alas the 'n' doesn't tell us how many of them there + -- may be. So we eagerly eta expand any big lambdas, and just + -- cross our fingers about possible loss of sharing in the ILX case. + -- The Right Thing is probably to make 'arity' include + -- type variables throughout the compiler. (ToDo.) + not (isForAllTy ty) + -- Saturated, so nothing to do + = expr + + -- Short cut for the case where there already + -- is a lambda; no point in gratuitously adding more +eta_expand n us (Lam v body) ty + | isTyVar v + = Lam v (eta_expand n us body (applyTy ty (mkTyVarTy v))) + + | otherwise + = Lam v (eta_expand (n-1) us body (funResultTy ty)) + +-- We used to have a special case that stepped inside Coerces here, +-- thus: eta_expand n us (Note note@(Coerce _ ty) e) _ +-- = Note note (eta_expand n us e ty) +-- BUT this led to an infinite loop +-- Example: newtype T = MkT (Int -> Int) +-- eta_expand 1 (coerce (Int->Int) e) +-- --> coerce (Int->Int) (eta_expand 1 T e) +-- by the bogus eqn +-- --> coerce (Int->Int) (coerce T +-- (\x::Int -> eta_expand 1 (coerce (Int->Int) e))) +-- by the splitNewType_maybe case below +-- and round we go + +eta_expand n us expr ty + = case splitForAllTy_maybe ty of { + Just (tv,ty') -> Lam tv (eta_expand n us (App expr (Type (mkTyVarTy tv))) ty') + + ; Nothing -> + + case splitFunTy_maybe ty of { + Just (arg_ty, res_ty) -> Lam arg1 (eta_expand (n-1) us2 (App expr (Var arg1)) res_ty) + where + arg1 = mkSysLocal FSLIT("eta") uniq arg_ty + (uniq:us2) = us + + ; Nothing -> + + -- Given this: + -- newtype T = MkT ([T] -> Int) + -- Consider eta-expanding this + -- eta_expand 1 e T + -- We want to get + -- coerce T (\x::[T] -> (coerce ([T]->Int) e) x) + -- Only try this for recursive newtypes; the non-recursive kind + -- are transparent anyway + + case splitRecNewType_maybe ty of { + Just ty' -> mkCoerce2 ty ty' (eta_expand n us (mkCoerce2 ty' ty expr) ty') ; + Nothing -> + + -- We have an expression of arity > 0, but its type isn't a function + -- This *can* legitmately happen: e.g. coerce Int (\x. x) + -- Essentially the programmer is playing fast and loose with types + -- (Happy does this a lot). So we simply decline to eta-expand. + expr + }}} +\end{code} + +exprArity is a cheap-and-cheerful version of exprEtaExpandArity. +It tells how many things the expression can be applied to before doing +any work. It doesn't look inside cases, lets, etc. The idea is that +exprEtaExpandArity will do the hard work, leaving something that's easy +for exprArity to grapple with. In particular, Simplify uses exprArity to +compute the ArityInfo for the Id. 
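+
+For instance (illustration only), exprArity gives 0 for
+
+    case x of (a,b) -> \y -> e
+
+because it does not look inside the case, whereas exprEtaExpandArity
+can return 1 for the same expression (when x is cheap, or when the \y
+is a one-shot lambda).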
+ +Originally I thought that it was enough just to look for top-level lambdas, but +it isn't. I've seen this + + foo = PrelBase.timesInt + +We want foo to get arity 2 even though the eta-expander will leave it +unchanged, in the expectation that it'll be inlined. But occasionally it +isn't, because foo is blacklisted (used in a rule). + +Similarly, see the ok_note check in exprEtaExpandArity. So + f = __inline_me (\x -> e) +won't be eta-expanded. + +And in any case it seems more robust to have exprArity be a bit more intelligent. +But note that (\x y z -> f x y z) +should have arity 3, regardless of f's arity. + +\begin{code} +exprArity :: CoreExpr -> Arity +exprArity e = go e + where + go (Var v) = idArity v + go (Lam x e) | isId x = go e + 1 + | otherwise = go e + go (Note n e) = go e + go (App e (Type t)) = go e + go (App f a) | exprIsCheap a = (go f - 1) `max` 0 + -- NB: exprIsCheap a! + -- f (fac x) does not have arity 2, + -- even if f has arity 3! + -- NB: `max 0`! (\x y -> f x) has arity 2, even if f is + -- unknown, hence arity 0 + go _ = 0 +\end{code} + +%************************************************************************ +%* * +\subsection{Equality} +%* * +%************************************************************************ + +@cheapEqExpr@ is a cheap equality test which bales out fast! + True => definitely equal + False => may or may not be equal + +\begin{code} +cheapEqExpr :: Expr b -> Expr b -> Bool + +cheapEqExpr (Var v1) (Var v2) = v1==v2 +cheapEqExpr (Lit lit1) (Lit lit2) = lit1 == lit2 +cheapEqExpr (Type t1) (Type t2) = t1 `coreEqType` t2 + +cheapEqExpr (App f1 a1) (App f2 a2) + = f1 `cheapEqExpr` f2 && a1 `cheapEqExpr` a2 + +cheapEqExpr _ _ = False + +exprIsBig :: Expr b -> Bool +-- Returns True of expressions that are too big to be compared by cheapEqExpr +exprIsBig (Lit _) = False +exprIsBig (Var v) = False +exprIsBig (Type t) = False +exprIsBig (App f a) = exprIsBig f || exprIsBig a +exprIsBig other = True +\end{code} + + +\begin{code} +tcEqExpr :: CoreExpr -> CoreExpr -> Bool +-- Used in rule matching, so does *not* look through +-- newtypes, predicate types; hence tcEqExpr + +tcEqExpr e1 e2 = tcEqExprX rn_env e1 e2 + where + rn_env = mkRnEnv2 (mkInScopeSet (exprFreeVars e1 `unionVarSet` exprFreeVars e2)) + +tcEqExprX :: RnEnv2 -> CoreExpr -> CoreExpr -> Bool +tcEqExprX env (Var v1) (Var v2) = rnOccL env v1 == rnOccR env v2 +tcEqExprX env (Lit lit1) (Lit lit2) = lit1 == lit2 +tcEqExprX env (App f1 a1) (App f2 a2) = tcEqExprX env f1 f2 && tcEqExprX env a1 a2 +tcEqExprX env (Lam v1 e1) (Lam v2 e2) = tcEqExprX (rnBndr2 env v1 v2) e1 e2 +tcEqExprX env (Let (NonRec v1 r1) e1) + (Let (NonRec v2 r2) e2) = tcEqExprX env r1 r2 + && tcEqExprX (rnBndr2 env v1 v2) e1 e2 +tcEqExprX env (Let (Rec ps1) e1) + (Let (Rec ps2) e2) = equalLength ps1 ps2 + && and (zipWith eq_rhs ps1 ps2) + && tcEqExprX env' e1 e2 + where + env' = foldl2 rn_bndr2 env ps2 ps2 + rn_bndr2 env (b1,_) (b2,_) = rnBndr2 env b1 b2 + eq_rhs (_,r1) (_,r2) = tcEqExprX env' r1 r2 +tcEqExprX env (Case e1 v1 t1 a1) + (Case e2 v2 t2 a2) = tcEqExprX env e1 e2 + && tcEqTypeX env t1 t2 + && equalLength a1 a2 + && and (zipWith (eq_alt env') a1 a2) + where + env' = rnBndr2 env v1 v2 + +tcEqExprX env (Note n1 e1) (Note n2 e2) = eq_note env n1 n2 && tcEqExprX env e1 e2 +tcEqExprX env (Type t1) (Type t2) = tcEqTypeX env t1 t2 +tcEqExprX env e1 e2 = False + +eq_alt env (c1,vs1,r1) (c2,vs2,r2) = c1==c2 && tcEqExprX (rnBndrs2 env vs1 vs2) r1 r2 + +eq_note env (SCC cc1) (SCC cc2) = cc1 == cc2 +eq_note env (Coerce t1 f1) 
(Coerce t2 f2) = tcEqTypeX env t1 t2 && tcEqTypeX env f1 f2 +eq_note env InlineCall InlineCall = True +eq_note env (CoreNote s1) (CoreNote s2) = s1 == s2 +eq_note env other1 other2 = False +\end{code} + + +%************************************************************************ +%* * +\subsection{The size of an expression} +%* * +%************************************************************************ + +\begin{code} +coreBindsSize :: [CoreBind] -> Int +coreBindsSize bs = foldr ((+) . bindSize) 0 bs + +exprSize :: CoreExpr -> Int + -- A measure of the size of the expressions + -- It also forces the expression pretty drastically as a side effect +exprSize (Var v) = v `seq` 1 +exprSize (Lit lit) = lit `seq` 1 +exprSize (App f a) = exprSize f + exprSize a +exprSize (Lam b e) = varSize b + exprSize e +exprSize (Let b e) = bindSize b + exprSize e +exprSize (Case e b t as) = seqType t `seq` exprSize e + varSize b + 1 + foldr ((+) . altSize) 0 as +exprSize (Note n e) = noteSize n + exprSize e +exprSize (Type t) = seqType t `seq` 1 + +noteSize (SCC cc) = cc `seq` 1 +noteSize (Coerce t1 t2) = seqType t1 `seq` seqType t2 `seq` 1 +noteSize InlineCall = 1 +noteSize InlineMe = 1 +noteSize (CoreNote s) = s `seq` 1 -- hdaume: core annotations + +varSize :: Var -> Int +varSize b | isTyVar b = 1 + | otherwise = seqType (idType b) `seq` + megaSeqIdInfo (idInfo b) `seq` + 1 + +varsSize = foldr ((+) . varSize) 0 + +bindSize (NonRec b e) = varSize b + exprSize e +bindSize (Rec prs) = foldr ((+) . pairSize) 0 prs + +pairSize (b,e) = varSize b + exprSize e + +altSize (c,bs,e) = c `seq` varsSize bs + exprSize e +\end{code} + + +%************************************************************************ +%* * +\subsection{Hashing} +%* * +%************************************************************************ + +\begin{code} +hashExpr :: CoreExpr -> Int +hashExpr e | hash < 0 = 77 -- Just in case we hit -maxInt + | otherwise = hash + where + hash = abs (hash_expr e) -- Negative numbers kill UniqFM + +hash_expr (Note _ e) = hash_expr e +hash_expr (Let (NonRec b r) e) = hashId b +hash_expr (Let (Rec ((b,r):_)) e) = hashId b +hash_expr (Case _ b _ _) = hashId b +hash_expr (App f e) = hash_expr f * fast_hash_expr e +hash_expr (Var v) = hashId v +hash_expr (Lit lit) = hashLiteral lit +hash_expr (Lam b _) = hashId b +hash_expr (Type t) = trace "hash_expr: type" 1 -- Shouldn't happen + +fast_hash_expr (Var v) = hashId v +fast_hash_expr (Lit lit) = hashLiteral lit +fast_hash_expr (App f (Type _)) = fast_hash_expr f +fast_hash_expr (App f a) = fast_hash_expr a +fast_hash_expr (Lam b _) = hashId b +fast_hash_expr other = 1 + +hashId :: Id -> Int +hashId id = hashName (idName id) +\end{code} + +%************************************************************************ +%* * +\subsection{Determining non-updatable right-hand-sides} +%* * +%************************************************************************ + +Top-level constructor applications can usually be allocated +statically, but they can't if the constructor, or any of the +arguments, come from another DLL (because we can't refer to static +labels in other DLLs). + +If this happens we simply make the RHS into an updatable thunk, +and 'exectute' it rather than allocating it statically. + +\begin{code} +rhsIsStatic :: HomeModules -> CoreExpr -> Bool +-- This function is called only on *top-level* right-hand sides +-- Returns True if the RHS can be allocated statically, with +-- no thunks involved at all. 
+-- +-- It's called (i) in TidyPgm.hasCafRefs to decide if the rhs is, or +-- refers to, CAFs; and (ii) in CoreToStg to decide whether to put an +-- update flag on it. +-- +-- The basic idea is that rhsIsStatic returns True only if the RHS is +-- (a) a value lambda +-- (b) a saturated constructor application with static args +-- +-- BUT watch out for +-- (i) Any cross-DLL references kill static-ness completely +-- because they must be 'executed' not statically allocated +-- ("DLL" here really only refers to Windows DLLs, on other platforms, +-- this is not necessary) +-- +-- (ii) We treat partial applications as redexes, because in fact we +-- make a thunk for them that runs and builds a PAP +-- at run-time. The only appliations that are treated as +-- static are *saturated* applications of constructors. + +-- We used to try to be clever with nested structures like this: +-- ys = (:) w ((:) w []) +-- on the grounds that CorePrep will flatten ANF-ise it later. +-- But supporting this special case made the function much more +-- complicated, because the special case only applies if there are no +-- enclosing type lambdas: +-- ys = /\ a -> Foo (Baz ([] a)) +-- Here the nested (Baz []) won't float out to top level in CorePrep. +-- +-- But in fact, even without -O, nested structures at top level are +-- flattened by the simplifier, so we don't need to be super-clever here. +-- +-- Examples +-- +-- f = \x::Int. x+7 TRUE +-- p = (True,False) TRUE +-- +-- d = (fst p, False) FALSE because there's a redex inside +-- (this particular one doesn't happen but...) +-- +-- h = D# (1.0## /## 2.0##) FALSE (redex again) +-- n = /\a. Nil a TRUE +-- +-- t = /\a. (:) (case w a of ...) (Nil a) FALSE (redex) +-- +-- +-- This is a bit like CoreUtils.exprIsHNF, with the following differences: +-- a) scc "foo" (\x -> ...) is updatable (so we catch the right SCC) +-- +-- b) (C x xs), where C is a contructors is updatable if the application is +-- dynamic +-- +-- c) don't look through unfolding of f in (f x). +-- +-- When opt_RuntimeTypes is on, we keep type lambdas and treat +-- them as making the RHS re-entrant (non-updatable). + +rhsIsStatic hmods rhs = is_static False rhs + where + is_static :: Bool -- True <=> in a constructor argument; must be atomic + -> CoreExpr -> Bool + + is_static False (Lam b e) = isRuntimeVar b || is_static False e + + is_static in_arg (Note (SCC _) e) = False + is_static in_arg (Note _ e) = is_static in_arg e + + is_static in_arg (Lit lit) + = case lit of + MachLabel _ _ -> False + other -> True + -- A MachLabel (foreign import "&foo") in an argument + -- prevents a constructor application from being static. The + -- reason is that it might give rise to unresolvable symbols + -- in the object file: under Linux, references to "weak" + -- symbols from the data segment give rise to "unresolvable + -- relocation" errors at link time This might be due to a bug + -- in the linker, but we'll work around it here anyway. + -- SDM 24/2/2004 + + is_static in_arg other_expr = go other_expr 0 + where + go (Var f) n_val_args +#if mingw32_TARGET_OS + | not (isDllName hmods (idName f)) +#endif + = saturated_data_con f n_val_args + || (in_arg && n_val_args == 0) + -- A naked un-applied variable is *not* deemed a static RHS + -- E.g. f = g + -- Reason: better to update so that the indirection gets shorted + -- out, and the true value will be seen + -- NB: if you change this, you'll break the invariant that THUNK_STATICs + -- are always updatable. 
If you do so, make sure that non-updatable + -- ones have enough space for their static link field! + + go (App f a) n_val_args + | isTypeArg a = go f n_val_args + | not in_arg && is_static True a = go f (n_val_args + 1) + -- The (not in_arg) checks that we aren't in a constructor argument; + -- if we are, we don't allow (value) applications of any sort + -- + -- NB. In case you wonder, args are sometimes not atomic. eg. + -- x = D# (1.0## /## 2.0##) + -- can't float because /## can fail. + + go (Note (SCC _) f) n_val_args = False + go (Note _ f) n_val_args = go f n_val_args + + go other n_val_args = False + + saturated_data_con f n_val_args + = case isDataConWorkId_maybe f of + Just dc -> n_val_args == dataConRepArity dc + Nothing -> False +\end{code} diff --git a/compiler/coreSyn/ExternalCore.lhs b/compiler/coreSyn/ExternalCore.lhs new file mode 100644 index 0000000000..09a6e7f7da --- /dev/null +++ b/compiler/coreSyn/ExternalCore.lhs @@ -0,0 +1,89 @@ +% +% (c) The University of Glasgow 2001 +% +\begin{code} + +module ExternalCore where + + +data Module + = Module Mname [Tdef] [Vdefg] + +data Tdef + = Data (Qual Tcon) [Tbind] [Cdef] + | Newtype (Qual Tcon) [Tbind] (Maybe Ty) + +data Cdef + = Constr Dcon [Tbind] [Ty] + | GadtConstr Dcon Ty + +data Vdefg + = Rec [Vdef] + | Nonrec Vdef + +type Vdef = (Var,Ty,Exp) -- Top level bindings are unqualified now + +data Exp + = Var (Qual Var) + | Dcon (Qual Dcon) + | Lit Lit + | App Exp Exp + | Appt Exp Ty + | Lam Bind Exp + | Let Vdefg Exp + | Case Exp Vbind Ty [Alt] {- non-empty list -} + | Coerce Ty Exp + | Note String Exp + | External String Ty + +data Bind + = Vb Vbind + | Tb Tbind + +data Alt + = Acon (Qual Dcon) [Tbind] [Vbind] Exp + | Alit Lit Exp + | Adefault Exp + +type Vbind = (Var,Ty) +type Tbind = (Tvar,Kind) + +data Ty + = Tvar Tvar + | Tcon (Qual Tcon) + | Tapp Ty Ty + | Tforall Tbind Ty + +data Kind + = Klifted + | Kunlifted + | Kopen + | Karrow Kind Kind + +data Lit + = Lint Integer Ty + | Lrational Rational Ty + | Lchar Char Ty + | Lstring String Ty + + +type Mname = Id +type Var = Id +type Tvar = Id +type Tcon = Id +type Dcon = Id + +type Qual t = (Mname,t) + +type Id = String + +primMname = "GHCziPrim" + +tcArrow :: Qual Tcon +tcArrow = (primMname, "ZLzmzgZR") + +\end{code} + + + + diff --git a/compiler/coreSyn/MkExternalCore.lhs b/compiler/coreSyn/MkExternalCore.lhs new file mode 100644 index 0000000000..291b16e823 --- /dev/null +++ b/compiler/coreSyn/MkExternalCore.lhs @@ -0,0 +1,222 @@ +% +% (c) The University of Glasgow 2001 +% +\begin{code} + +module MkExternalCore ( + emitExternalCore +) where + +#include "HsVersions.h" + +import qualified ExternalCore as C +import Char +import Module +import CoreSyn +import HscTypes +import TyCon +import TypeRep +import Type +import PprExternalCore -- Instances +import DataCon ( DataCon, dataConTyVars, dataConRepArgTys, + dataConName, dataConTyCon ) +import CoreSyn +import Var +import IdInfo +import Kind +import Literal +import Name +import Outputable +import ForeignCall +import DynFlags ( DynFlags(..) 
) +import StaticFlags ( opt_EmitExternalCore ) +import IO +import FastString + +emitExternalCore :: DynFlags -> CgGuts -> IO () +emitExternalCore dflags cg_guts + | opt_EmitExternalCore + = (do handle <- openFile corename WriteMode + hPutStrLn handle (show (mkExternalCore cg_guts)) + hClose handle) + `catch` (\err -> pprPanic "Failed to open or write external core output file" + (text corename)) + where corename = extCoreName dflags +emitExternalCore _ _ + | otherwise + = return () + + +mkExternalCore :: CgGuts -> C.Module +-- The ModGuts has been tidied, but the implicit bindings have +-- not been injected, so we have to add them manually here +-- We don't include the strange data-con *workers* because they are +-- implicit in the data type declaration itself +mkExternalCore (CgGuts {cg_module=this_mod, cg_tycons = tycons, cg_binds = binds}) + = C.Module mname tdefs (map make_vdef binds) + where + mname = make_mid this_mod + tdefs = foldr collect_tdefs [] tycons + +collect_tdefs :: TyCon -> [C.Tdef] -> [C.Tdef] +collect_tdefs tcon tdefs + | isAlgTyCon tcon = tdef: tdefs + where + tdef | isNewTyCon tcon = + C.Newtype (make_con_qid (tyConName tcon)) (map make_tbind tyvars) repclause + | null (tyConDataCons tcon) = error "MkExternalCore died: can't handle datatype declarations with no data constructors" + | otherwise = + C.Data (make_con_qid (tyConName tcon)) (map make_tbind tyvars) (map make_cdef (tyConDataCons tcon)) + where repclause | isRecursiveTyCon tcon = Nothing + | otherwise = Just (make_ty rep) + where (_, rep) = newTyConRep tcon + tyvars = tyConTyVars tcon + +collect_tdefs _ tdefs = tdefs + + +make_cdef :: DataCon -> C.Cdef +make_cdef dcon = C.Constr dcon_name existentials tys + where + dcon_name = make_var_id (dataConName dcon) + existentials = map make_tbind ex_tyvars + ex_tyvars = drop (tyConArity (dataConTyCon dcon)) (dataConTyVars dcon) + tys = map make_ty (dataConRepArgTys dcon) + +make_tbind :: TyVar -> C.Tbind +make_tbind tv = (make_var_id (tyVarName tv), make_kind (tyVarKind tv)) + +make_vbind :: Var -> C.Vbind +make_vbind v = (make_var_id (Var.varName v), make_ty (idType v)) + +make_vdef :: CoreBind -> C.Vdefg +make_vdef b = + case b of + NonRec v e -> C.Nonrec (f (v,e)) + Rec ves -> C.Rec (map f ves) + where f (v,e) = (make_var_id (Var.varName v), make_ty (idType v),make_exp e) + -- Top level bindings are unqualified now + +make_exp :: CoreExpr -> C.Exp +make_exp (Var v) = + case globalIdDetails v of + -- a DataConId represents the Id of a worker, which is a varName. 
-- sof 4/02 +-- DataConId _ -> C.Dcon (make_con_qid (Var.varName v)) + FCallId (CCall (CCallSpec (StaticTarget nm) _ _)) -> C.External (unpackFS nm) (make_ty (idType v)) + FCallId _ -> error "MkExternalCore died: can't handle non-static-C foreign call" + _ -> C.Var (make_var_qid (Var.varName v)) +make_exp (Lit (l@(MachLabel s _))) = error "MkExternalCore died: can't handle \"foreign label\" declarations" +make_exp (Lit l) = C.Lit (make_lit l) +make_exp (App e (Type t)) = C.Appt (make_exp e) (make_ty t) +make_exp (App e1 e2) = C.App (make_exp e1) (make_exp e2) +make_exp (Lam v e) | isTyVar v = C.Lam (C.Tb (make_tbind v)) (make_exp e) +make_exp (Lam v e) | otherwise = C.Lam (C.Vb (make_vbind v)) (make_exp e) +make_exp (Let b e) = C.Let (make_vdef b) (make_exp e) +-- gaw 2004 +make_exp (Case e v ty alts) = C.Case (make_exp e) (make_vbind v) (make_ty ty) (map make_alt alts) +make_exp (Note (SCC cc) e) = C.Note "SCC" (make_exp e) -- temporary +make_exp (Note (Coerce t_to t_from) e) = C.Coerce (make_ty t_to) (make_exp e) +make_exp (Note InlineCall e) = C.Note "InlineCall" (make_exp e) +make_exp (Note (CoreNote s) e) = C.Note s (make_exp e) -- hdaume: core annotations +make_exp (Note InlineMe e) = C.Note "InlineMe" (make_exp e) +make_exp _ = error "MkExternalCore died: make_exp" + +make_alt :: CoreAlt -> C.Alt +make_alt (DataAlt dcon, vs, e) = + C.Acon (make_con_qid (dataConName dcon)) + (map make_tbind tbs) + (map make_vbind vbs) + (make_exp e) + where (tbs,vbs) = span isTyVar vs +make_alt (LitAlt l,_,e) = C.Alit (make_lit l) (make_exp e) +make_alt (DEFAULT,[],e) = C.Adefault (make_exp e) + +make_lit :: Literal -> C.Lit +make_lit l = + case l of + MachChar i -> C.Lchar i t + MachStr s -> C.Lstring (unpackFS s) t + MachNullAddr -> C.Lint 0 t + MachInt i -> C.Lint i t + MachInt64 i -> C.Lint i t + MachWord i -> C.Lint i t + MachWord64 i -> C.Lint i t + MachFloat r -> C.Lrational r t + MachDouble r -> C.Lrational r t + _ -> error "MkExternalCore died: make_lit" + where + t = make_ty (literalType l) + +make_ty :: Type -> C.Ty +make_ty (TyVarTy tv) = C.Tvar (make_var_id (tyVarName tv)) +make_ty (AppTy t1 t2) = C.Tapp (make_ty t1) (make_ty t2) +make_ty (FunTy t1 t2) = make_ty (TyConApp funTyCon [t1,t2]) +make_ty (ForAllTy tv t) = C.Tforall (make_tbind tv) (make_ty t) +make_ty (TyConApp tc ts) = foldl C.Tapp (C.Tcon (make_con_qid (tyConName tc))) + (map make_ty ts) +-- Newtypes are treated just like any other type constructor; not expanded +-- Reason: predTypeRep does substitution and, while substitution deals +-- correctly with name capture, it's only correct if you see the uniques! +-- If you just see occurrence names, name capture may occur. +-- Example: newtype A a = A (forall b. b -> a) +-- test :: forall q b. q -> A b +-- test _ = undefined +-- Here the 'a' gets substituted by 'b', which is captured. +-- Another solution would be to expand newtypes before tidying; but that would +-- expose the representation in interface files, which definitely isn't right. +-- Maybe CoreTidy should know whether to expand newtypes or not? + +make_ty (PredTy p) = make_ty (predTypeRep p) +make_ty (NoteTy _ t) = make_ty t + + + +make_kind :: Kind -> C.Kind +make_kind (FunKind k1 k2) = C.Karrow (make_kind k1) (make_kind k2) +make_kind LiftedTypeKind = C.Klifted +make_kind UnliftedTypeKind = C.Kunlifted +make_kind OpenTypeKind = C.Kopen +make_kind _ = error "MkExternalCore died: make_kind" + +{- Id generation. -} + +{- Use encoded strings. + Also, adjust casing to work around some badly-chosen internal names. 
-} +make_id :: Bool -> Name -> C.Id +make_id is_var nm = (occNameString . nameOccName) nm + +{- SIMON thinks this stuff isn't necessary +make_id is_var nm = + case n of + 'Z':cs | is_var -> 'z':cs + 'z':cs | not is_var -> 'Z':cs + c:cs | isUpper c && is_var -> 'z':'d':n + c:cs | isLower c && (not is_var) -> 'Z':'d':n + _ -> n + where n = (occNameString . nameOccName) nm +-} + +make_var_id :: Name -> C.Id +make_var_id = make_id True + +make_mid :: Module -> C.Id +make_mid = moduleString + +make_qid :: Bool -> Name -> C.Qual C.Id +make_qid is_var n = (mname,make_id is_var n) + where mname = + case nameModule_maybe n of + Just m -> make_mid m + Nothing -> "" + +make_var_qid :: Name -> C.Qual C.Id +make_var_qid = make_qid True + +make_con_qid :: Name -> C.Qual C.Id +make_con_qid = make_qid False + +\end{code} + + + + diff --git a/compiler/coreSyn/PprCore.lhs b/compiler/coreSyn/PprCore.lhs new file mode 100644 index 0000000000..864f4bdcb0 --- /dev/null +++ b/compiler/coreSyn/PprCore.lhs @@ -0,0 +1,384 @@ +% +% (c) The AQUA Project, Glasgow University, 1996-1998 +% +%************************************************************************ +%* * +\section[PprCore]{Printing of Core syntax, including for interfaces} +%* * +%************************************************************************ + +\begin{code} +module PprCore ( + pprCoreExpr, pprParendExpr, + pprCoreBinding, pprCoreBindings, pprCoreAlt, + pprRules + ) where + +#include "HsVersions.h" + +import CoreSyn +import CostCentre ( pprCostCentreCore ) +import Var ( Var ) +import Id ( Id, idType, isDataConWorkId_maybe, idArity, + idInfo, globalIdDetails, isGlobalId, isExportedId + ) +import Var ( TyVar, isTyVar, tyVarKind ) +import IdInfo ( IdInfo, megaSeqIdInfo, + inlinePragInfo, occInfo, newDemandInfo, + lbvarInfo, hasNoLBVarInfo, + arityInfo, ppArityInfo, + specInfo, pprNewStrictness, + workerInfo, ppWorkerInfo, + newStrictnessInfo, cafInfo, ppCafInfo, specInfoRules + ) +import NewDemand ( isTop ) +#ifdef OLD_STRICTNESS +import Id ( idDemandInfo ) +import IdInfo ( cprInfo, ppCprInfo, strictnessInfo, ppStrictnessInfo ) +#endif + +import DataCon ( dataConTyCon ) +import TyCon ( tupleTyConBoxity, isTupleTyCon ) +import Type ( pprParendType, pprType, pprParendKind ) +import BasicTypes ( tupleParens, isNoOcc, isAlwaysActive ) +import Util ( lengthIs ) +import Outputable +import FastString ( mkFastString ) +\end{code} + +%************************************************************************ +%* * +\subsection{Public interfaces for Core printing (excluding instances)} +%* * +%************************************************************************ + +@pprParendCoreExpr@ puts parens around non-atomic Core expressions. 
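+
+(Purely as a sketch of how a client might use these exports --
+assuming showSDoc from Outputable -- a debugging dump could be
+produced with something like
+
+    putStrLn (showSDoc (pprCoreBindings binds))
+
+which prints a whole program of top-level bindings.)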
+ +\begin{code} +pprCoreBindings :: OutputableBndr b => [Bind b] -> SDoc +pprCoreBinding :: OutputableBndr b => Bind b -> SDoc +pprCoreExpr :: OutputableBndr b => Expr b -> SDoc +pprParendExpr :: OutputableBndr b => Expr b -> SDoc + +pprCoreBindings = pprTopBinds +pprCoreBinding = pprTopBind + +instance OutputableBndr b => Outputable (Bind b) where + ppr bind = ppr_bind bind + +instance OutputableBndr b => Outputable (Expr b) where + ppr expr = pprCoreExpr expr +\end{code} + + +%************************************************************************ +%* * +\subsection{The guts} +%* * +%************************************************************************ + +\begin{code} +pprTopBinds binds = vcat (map pprTopBind binds) + +pprTopBind (NonRec binder expr) + = ppr_binding (binder,expr) $$ text "" + +pprTopBind (Rec binds) + = vcat [ptext SLIT("Rec {"), + vcat (map ppr_binding binds), + ptext SLIT("end Rec }"), + text ""] +\end{code} + +\begin{code} +ppr_bind :: OutputableBndr b => Bind b -> SDoc + +ppr_bind (NonRec val_bdr expr) = ppr_binding (val_bdr, expr) +ppr_bind (Rec binds) = vcat (map pp binds) + where + pp bind = ppr_binding bind <> semi + +ppr_binding :: OutputableBndr b => (b, Expr b) -> SDoc +ppr_binding (val_bdr, expr) + = pprBndr LetBind val_bdr $$ + hang (ppr val_bdr <+> equals) 2 (pprCoreExpr expr) +\end{code} + +\begin{code} +pprParendExpr expr = ppr_expr parens expr +pprCoreExpr expr = ppr_expr noParens expr + +noParens :: SDoc -> SDoc +noParens pp = pp +\end{code} + +\begin{code} +ppr_expr :: OutputableBndr b => (SDoc -> SDoc) -> Expr b -> SDoc + -- The function adds parens in context that need + -- an atomic value (e.g. function args) + +ppr_expr add_par (Type ty) = add_par (ptext SLIT("TYPE") <+> ppr ty) -- Wierd + +ppr_expr add_par (Var name) = ppr name +ppr_expr add_par (Lit lit) = ppr lit + +ppr_expr add_par expr@(Lam _ _) + = let + (bndrs, body) = collectBinders expr + in + add_par $ + hang (ptext SLIT("\\") <+> sep (map (pprBndr LambdaBind) bndrs) <+> arrow) + 2 (pprCoreExpr body) + +ppr_expr add_par expr@(App fun arg) + = case collectArgs expr of { (fun, args) -> + let + pp_args = sep (map pprArg args) + val_args = dropWhile isTypeArg args -- Drop the type arguments for tuples + pp_tup_args = sep (punctuate comma (map pprArg val_args)) + in + case fun of + Var f -> case isDataConWorkId_maybe f of + -- Notice that we print the *worker* + -- for tuples in paren'd format. + Just dc | saturated && isTupleTyCon tc + -> tupleParens (tupleTyConBoxity tc) pp_tup_args + where + tc = dataConTyCon dc + saturated = val_args `lengthIs` idArity f + + other -> add_par (hang (ppr f) 2 pp_args) + + other -> add_par (hang (pprParendExpr fun) 2 pp_args) + } + +ppr_expr add_par (Case expr var ty [(con,args,rhs)]) + = add_par $ + sep [sep [ptext SLIT("case") <+> pprCoreExpr expr, + ifPprDebug (braces (ppr ty)), + hsep [ptext SLIT("of"), + ppr_bndr var, + char '{', + ppr_case_pat con args + ]], + pprCoreExpr rhs, + char '}' + ] + where + ppr_bndr = pprBndr CaseBind + +ppr_expr add_par (Case expr var ty alts) + = add_par $ + sep [sep [ptext SLIT("case") + <+> pprCoreExpr expr + <+> ifPprDebug (braces (ppr ty)), + ptext SLIT("of") <+> ppr_bndr var <+> char '{'], + nest 2 (sep (punctuate semi (map pprCoreAlt alts))), + char '}' + ] + where + ppr_bndr = pprBndr CaseBind + + +-- special cases: let ... in let ... 
+-- ("disgusting" SLPJ) + +{- +ppr_expr add_par (Let bind@(NonRec val_bdr rhs@(Let _ _)) body) + = add_par $ + vcat [ + hsep [ptext SLIT("let {"), (pprBndr LetBind val_bdr $$ ppr val_bndr), equals], + nest 2 (pprCoreExpr rhs), + ptext SLIT("} in"), + pprCoreExpr body ] +-} + +ppr_expr add_par (Let bind@(NonRec val_bdr rhs) expr@(Let _ _)) + = add_par + (hang (ptext SLIT("let {")) + 2 (hsep [ppr_binding (val_bdr,rhs), + ptext SLIT("} in")]) + $$ + pprCoreExpr expr) + +-- general case (recursive case, too) +ppr_expr add_par (Let bind expr) + = add_par $ + sep [hang (ptext keyword) 2 (ppr_bind bind), + hang (ptext SLIT("} in ")) 2 (pprCoreExpr expr)] + where + keyword = case bind of + Rec _ -> SLIT("__letrec {") + NonRec _ _ -> SLIT("let {") + +ppr_expr add_par (Note (SCC cc) expr) + = add_par (sep [pprCostCentreCore cc, pprCoreExpr expr]) + +#ifdef DEBUG +ppr_expr add_par (Note (Coerce to_ty from_ty) expr) + = add_par $ + getPprStyle $ \ sty -> + if debugStyle sty then + sep [ptext SLIT("__coerce") <+> + sep [pprParendType to_ty, pprParendType from_ty], + pprParendExpr expr] + else + sep [hsep [ptext SLIT("__coerce"), pprParendType to_ty], + pprParendExpr expr] +#else +ppr_expr add_par (Note (Coerce to_ty from_ty) expr) + = add_par $ + sep [sep [ptext SLIT("__coerce"), nest 2 (pprParendType to_ty)], + pprParendExpr expr] +#endif + +ppr_expr add_par (Note InlineCall expr) + = add_par (ptext SLIT("__inline_call") <+> pprParendExpr expr) + +ppr_expr add_par (Note InlineMe expr) + = add_par $ ptext SLIT("__inline_me") <+> pprParendExpr expr + +ppr_expr add_par (Note (CoreNote s) expr) + = add_par $ + sep [sep [ptext SLIT("__core_note"), pprHsString (mkFastString s)], + pprParendExpr expr] + +pprCoreAlt (con, args, rhs) + = hang (ppr_case_pat con args) 2 (pprCoreExpr rhs) + +ppr_case_pat con@(DataAlt dc) args + | isTupleTyCon tc + = tupleParens (tupleTyConBoxity tc) (hsep (punctuate comma (map ppr_bndr args))) <+> arrow + where + ppr_bndr = pprBndr CaseBind + tc = dataConTyCon dc + +ppr_case_pat con args + = ppr con <+> hsep (map ppr_bndr args) <+> arrow + where + ppr_bndr = pprBndr CaseBind + +pprArg (Type ty) = ptext SLIT("@") <+> pprParendType ty +pprArg expr = pprParendExpr expr +\end{code} + +Other printing bits-and-bobs used with the general @pprCoreBinding@ +and @pprCoreExpr@ functions. 
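+
+(For example, with the binder-printing code below, a lambda-bound
+type variable comes out as "(@ a)", and a let-bound Id gets its type
+signature and IdInfo printed on the lines above its definition; see
+pprCoreBinder.)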
+ +\begin{code} +instance OutputableBndr Var where + pprBndr = pprCoreBinder + +pprCoreBinder :: BindingSite -> Var -> SDoc +pprCoreBinder LetBind binder + = vcat [sig, pprIdDetails binder, pragmas] + where + sig = pprTypedBinder binder + pragmas = ppIdInfo binder (idInfo binder) + +-- Lambda bound type variables are preceded by "@" +pprCoreBinder LambdaBind bndr = parens (pprTypedBinder bndr) + +-- Case bound things don't get a signature or a herald +pprCoreBinder CaseBind bndr = pprUntypedBinder bndr + +pprUntypedBinder binder + | isTyVar binder = ptext SLIT("@") <+> ppr binder -- NB: don't print kind + | otherwise = pprIdBndr binder + +pprTypedBinder binder + | isTyVar binder = ptext SLIT("@") <+> pprTyVarBndr binder + | otherwise = pprIdBndr binder <+> dcolon <+> pprType (idType binder) + +pprTyVarBndr :: TyVar -> SDoc +pprTyVarBndr tyvar + = getPprStyle $ \ sty -> + if debugStyle sty then + hsep [ppr tyvar, dcolon, pprParendKind kind] + -- See comments with ppDcolon in PprCore.lhs + else + ppr tyvar + where + kind = tyVarKind tyvar + +-- pprIdBndr does *not* print the type +-- When printing any Id binder in debug mode, we print its inline pragma and one-shot-ness +pprIdBndr id = ppr id <+> pprIdBndrInfo (idInfo id) + +pprIdBndrInfo info + = megaSeqIdInfo `seq` doc -- The seq is useful for poking on black holes + where + prag_info = inlinePragInfo info + occ_info = occInfo info + dmd_info = newDemandInfo info + lbv_info = lbvarInfo info + + no_info = isAlwaysActive prag_info && isNoOcc occ_info && + (case dmd_info of { Nothing -> True; Just d -> isTop d }) && + hasNoLBVarInfo lbv_info + + doc | no_info = empty + | otherwise + = brackets $ hcat [ppr prag_info, ppr occ_info, + ppr dmd_info, ppr lbv_info +#ifdef OLD_STRICTNESS + , ppr (demandInfo id) +#endif + ] +\end{code} + + +\begin{code} +pprIdDetails :: Id -> SDoc +pprIdDetails id | isGlobalId id = ppr (globalIdDetails id) + | isExportedId id = ptext SLIT("[Exported]") + | otherwise = empty + +ppIdInfo :: Id -> IdInfo -> SDoc +ppIdInfo b info + = brackets $ + vcat [ ppArityInfo a, + ppWorkerInfo (workerInfo info), + ppCafInfo (cafInfo info), +#ifdef OLD_STRICTNESS + ppStrictnessInfo s, + ppCprInfo m, +#endif + pprNewStrictness (newStrictnessInfo info), + if null rules then empty + else ptext SLIT("RULES:") <+> vcat (map pprRule rules) + -- Inline pragma, occ, demand, lbvar info + -- printed out with all binders (when debug is on); + -- see PprCore.pprIdBndr + ] + where + a = arityInfo info +#ifdef OLD_STRICTNESS + s = strictnessInfo info + m = cprInfo info +#endif + rules = specInfoRules (specInfo info) +\end{code} + + +\begin{code} +instance Outputable CoreRule where + ppr = pprRule + +pprRules :: [CoreRule] -> SDoc +pprRules rules = vcat (map pprRule rules) + +pprRule :: CoreRule -> SDoc +pprRule (BuiltinRule { ru_fn = fn, ru_name = name}) + = ptext SLIT("Built in rule for") <+> ppr fn <> colon <+> doubleQuotes (ftext name) + +pprRule (Rule { ru_name = name, ru_act = act, ru_fn = fn, + ru_bndrs = tpl_vars, ru_args = tpl_args, + ru_rhs = rhs }) + = doubleQuotes (ftext name) <+> ppr act <+> + sep [ + ptext SLIT("__forall") <+> braces (sep (map pprTypedBinder tpl_vars)), + nest 2 (ppr fn <+> sep (map pprArg tpl_args)), + nest 2 (ptext SLIT("=") <+> pprCoreExpr rhs) + ] <+> semi +\end{code} diff --git a/compiler/coreSyn/PprExternalCore.lhs b/compiler/coreSyn/PprExternalCore.lhs new file mode 100644 index 0000000000..26c89cce48 --- /dev/null +++ b/compiler/coreSyn/PprExternalCore.lhs @@ -0,0 +1,177 @@ +% +% (c) The University of Glasgow 
2001 +% +\begin{code} + +module PprExternalCore () where + +import Pretty +import ExternalCore +import Char + +instance Show Module where + showsPrec d m = shows (pmodule m) + +instance Show Tdef where + showsPrec d t = shows (ptdef t) + +instance Show Cdef where + showsPrec d c = shows (pcdef c) + +instance Show Vdefg where + showsPrec d v = shows (pvdefg v) + +instance Show Exp where + showsPrec d e = shows (pexp e) + +instance Show Alt where + showsPrec d a = shows (palt a) + +instance Show Ty where + showsPrec d t = shows (pty t) + +instance Show Kind where + showsPrec d k = shows (pkind k) + +instance Show Lit where + showsPrec d l = shows (plit l) + + +indent = nest 2 + +pmodule (Module mname tdefs vdefgs) = + (text "%module" <+> text mname) + $$ indent ((vcat (map ((<> char ';') . ptdef) tdefs)) + $$ (vcat (map ((<> char ';') . pvdefg) vdefgs))) + +ptdef (Data tcon tbinds cdefs) = + (text "%data" <+> pqname tcon <+> (hsep (map ptbind tbinds)) <+> char '=') + $$ indent (braces ((vcat (punctuate (char ';') (map pcdef cdefs))))) + +ptdef (Newtype tcon tbinds rep ) = + text "%newtype" <+> pqname tcon <+> (hsep (map ptbind tbinds)) <+> repclause + where repclause = case rep of + Just ty -> char '=' <+> pty ty + Nothing -> empty + +pcdef (Constr dcon tbinds tys) = + (pname dcon) <+> (sep [hsep (map pattbind tbinds),sep (map paty tys)]) +pcdef (GadtConstr dcon ty) = + (pname dcon) <+> text "::" <+> pty ty + +pname id = text id + +pqname ("",id) = pname id +pqname (m,id) = pname m <> char '.' <> pname id + +ptbind (t,Klifted) = pname t +ptbind (t,k) = parens (pname t <> text "::" <> pkind k) + +pattbind (t,k) = char '@' <> ptbind (t,k) + +pakind (Klifted) = char '*' +pakind (Kunlifted) = char '#' +pakind (Kopen) = char '?' +pakind k = parens (pkind k) + +pkind (Karrow k1 k2) = parens (pakind k1 <> text "->" <> pkind k2) +pkind k = pakind k + +paty (Tvar n) = pname n +paty (Tcon c) = pqname c +paty t = parens (pty t) + +pbty (Tapp(Tapp(Tcon tc) t1) t2) | tc == tcArrow = parens(fsep [pbty t1, text "->",pty t2]) +pbty (Tapp t1 t2) = pappty t1 [t2] +pbty t = paty t + +pty (Tapp(Tapp(Tcon tc) t1) t2) | tc == tcArrow = fsep [pbty t1, text "->",pty t2] +pty (Tforall tb t) = text "%forall" <+> pforall [tb] t +pty t = pbty t + +pappty (Tapp t1 t2) ts = pappty t1 (t2:ts) +pappty t ts = sep (map paty (t:ts)) + +pforall tbs (Tforall tb t) = pforall (tbs ++ [tb]) t +pforall tbs t = hsep (map ptbind tbs) <+> char '.' 
<+> pty t + +pvdefg (Rec vtes) = text "%rec" $$ braces (indent (vcat (punctuate (char ';') (map pvte vtes)))) +pvdefg (Nonrec vte) = pvte vte + +pvte (v,t,e) = sep [pname v <+> text "::" <+> pty t <+> char '=', + indent (pexp e)] + +paexp (Var x) = pqname x +paexp (Dcon x) = pqname x +paexp (Lit l) = plit l +paexp e = parens(pexp e) + +plamexp bs (Lam b e) = plamexp (bs ++ [b]) e +plamexp bs e = sep [sep (map pbind bs) <+> text "->", + indent (pexp e)] + +pbind (Tb tb) = char '@' <+> ptbind tb +pbind (Vb vb) = pvbind vb + +pfexp (App e1 e2) = pappexp e1 [Left e2] +pfexp (Appt e t) = pappexp e [Right t] +pfexp e = paexp e + +pappexp (App e1 e2) as = pappexp e1 (Left e2:as) +pappexp (Appt e t) as = pappexp e (Right t:as) +pappexp e as = fsep (paexp e : map pa as) + where pa (Left e) = paexp e + pa (Right t) = char '@' <+> paty t + +pexp (Lam b e) = char '\\' <+> plamexp [b] e +pexp (Let vd e) = (text "%let" <+> pvdefg vd) $$ (text "%in" <+> pexp e) +pexp (Case e vb ty alts) = sep [text "%case" <+> parens (paty ty) <+> paexp e, + text "%of" <+> pvbind vb] + $$ (indent (braces (vcat (punctuate (char ';') (map palt alts))))) +pexp (Coerce t e) = (text "%coerce" <+> paty t) $$ pexp e +pexp (Note s e) = (text "%note" <+> pstring s) $$ pexp e +pexp (External n t) = (text "%external" <+> pstring n) $$ paty t +pexp e = pfexp e + + +pvbind (x,t) = parens(pname x <> text "::" <> pty t) + +palt (Acon c tbs vbs e) = + sep [pqname c, + sep (map pattbind tbs), + sep (map pvbind vbs) <+> text "->"] + $$ indent (pexp e) +palt (Alit l e) = + (plit l <+> text "->") + $$ indent (pexp e) +palt (Adefault e) = + (text "%_ ->") + $$ indent (pexp e) + +plit (Lint i t) = parens (integer i <> text "::" <> pty t) +plit (Lrational r t) = parens (rational r <> text "::" <> pty t) -- might be better to print as two integers +plit (Lchar c t) = parens (text ("\'" ++ escape [c] ++ "\'") <> text "::" <> pty t) +plit (Lstring s t) = parens (pstring s <> text "::" <> pty t) + +pstring s = doubleQuotes(text (escape s)) + +escape s = foldr f [] (map ord s) + where + f cv rest + | cv > 0xFF = '\\':'x':hs ++ rest + | (cv < 0x20 || cv > 0x7e || cv == 0x22 || cv == 0x27 || cv == 0x5c) = + '\\':'x':h1:h0:rest + where (q1,r1) = quotRem cv 16 + h1 = intToDigit q1 + h0 = intToDigit r1 + hs = dropWhile (=='0') $ reverse $ mkHex cv + mkHex 0 = "" + mkHex cv = intToDigit r : mkHex q + where (q,r) = quotRem cv 16 + f cv rest = (chr cv):rest + +\end{code} + + + + diff --git a/compiler/count_bytes b/compiler/count_bytes new file mode 100644 index 0000000000..bf6240228f --- /dev/null +++ b/compiler/count_bytes @@ -0,0 +1,43 @@ +#! /usr/local/bin/perl +# +%DirCount = (); +%ModCount = (); + +foreach $f ( @ARGV ) { + die "Not an .lhs file: $f\n" if $f !~ /\.lhs$/; + $f =~ s/\.lhs$/.o/; + + $f_size = `size $f`; + die "Size failed?\n" if $? 
!= 0; + + if ( $f_size =~ /(\S+)\s*(\S+)\s*(\S+)\s*(\d+)\s*(\S+)/ ) { + $totsz = $4; + + if ( $f =~ /(.*)\/(.*)/ ) { + local($dir) = $1; + local($mod) = $2; + $DirCount{$dir} += $totsz; + $ModCount{$mod} += $totsz; + } else { + print STDERR "not counted in a directory: $f\n"; + $ModCount{$f} += $totsz; + } + } else { + die "Can't figure out size: $f_size\n"; + } +} + +# print the info +$tot = 0; +foreach $d (sort (keys %DirCount)) { + printf "%-20s %6d\n", $d, $DirCount{$d}; + $tot += $DirCount{$d}; +} +printf "\n%-20s %6d\n\n\n", 'TOTAL:', $tot; + +$tot = 0; +foreach $m (sort (keys %ModCount)) { + printf "%-20s %6d\n", $m, $ModCount{$m}; + $tot += $ModCount{$m}; +} +printf "\n%-20s %6d\n", 'TOTAL:', $tot; diff --git a/compiler/count_lines b/compiler/count_lines new file mode 100644 index 0000000000..43ca79e68a --- /dev/null +++ b/compiler/count_lines @@ -0,0 +1,63 @@ +#! /usr/bin/perl +# +%DirCount = (); +%ModCount = (); +%DirComments = (); +%ModComments = (); + +foreach $f ( @ARGV ) { + + if ( $f =~ /\.lhs$/ ) { + open(INF, "../utils/unlit/unlit $f - |") || die "Couldn't unlit $f!\n"; + } else { + open(INF, "< $f") || die "Couldn't open $f!\n"; + } + $cnt = 0; + while (<INF>) { + s/--.*//; + s/{-.*-}//; + next if /^\s*$/; + $cnt++; + } + close(INF); + + $f_wc = `wc $f`; die "wc failed: $f\n" if $? != 0; + if ( $f_wc =~ /\s*(\d+)\s*(\d+)\s*(\d+)/ ) { + $comments = $1 - $cnt; + } else { + die "Can't grok wc format: $f_wc"; + } + + if ( $f =~ /(.*)\/(.*)/ ) { + local($dir) = $1; + local($mod) = $2; + $DirCount{$dir} += $cnt; + $ModCount{$mod} += $cnt; + $DirComments{$dir} += $comments; + $ModComments{$mod} += $comments; + } else { + print STDERR "not counted in a directory: $f\n"; + $ModCount{$f} += $cnt; + $ModComments{$f} += $comments; + } +} + +# print the info +$tot = 0; +$totcmts = 0; +foreach $d (sort (keys %DirCount)) { + printf "%-20s %6d %6d\n", $d, $DirCount{$d}, $DirComments{$d}; + $tot += $DirCount{$d}; + $totcmts += $DirComments{$d}; +} +printf "\n%-20s %6d %6d\n\n\n", 'TOTAL:', $tot, $totcmts; + +$tot = 0; +$totcmts = 0; +printf "\n Code Comments\n"; +foreach $m (sort (keys %ModCount)) { + printf "%-20s %6d %6d\n", $m, $ModCount{$m}, $ModComments{$m}; + $tot += $ModCount{$m}; + $totcmts += $ModComments{$m}; +} +printf "\n%-20s %6d %6d\n", 'TOTAL:', $tot, $totcmts; diff --git a/compiler/cprAnalysis/CprAnalyse.lhs b/compiler/cprAnalysis/CprAnalyse.lhs new file mode 100644 index 0000000000..dad6ccbaee --- /dev/null +++ b/compiler/cprAnalysis/CprAnalyse.lhs @@ -0,0 +1,315 @@ +\section[CprAnalyse]{Identify functions that always return a +constructed product result} + +\begin{code} +#ifndef OLD_STRICTNESS +module CprAnalyse ( ) where + +#else + +module CprAnalyse ( cprAnalyse ) where + +#include "HsVersions.h" + +import DynFlags ( DynFlags, DynFlag(..) ) +import CoreLint ( showPass, endPass ) +import CoreSyn +import CoreUtils ( exprIsHNF ) +import Id ( Id, setIdCprInfo, idCprInfo, idArity, + isBottomingId, idDemandInfo, isImplicitId ) +import IdInfo ( CprInfo(..) ) +import Demand ( isStrict ) +import VarEnv +import Util ( nTimes, mapAccumL ) +import Outputable + +import Maybe +\end{code} + +This module performs an analysis of a set of Core Bindings for the +Constructed Product Result (CPR) transformation. + +It detects functions that always explicitly (manifestly?) construct a +result value with a product type. A product type is a type which has +only one constructor. For example, tuples and boxed primitive values +have product type. 
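+
+A tiny, purely illustrative example of the kind of function the
+analysis is looking for:
+
+    f :: Int -> (Int, Bool)
+    f x = if x > 0 then (x, True) else (0, False)
+
+Both branches of f finish by explicitly building a pair, so f
+manifestly constructs its result; a function whose result pair is
+built inside some unknown callee gives the analysis nothing to go on.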
+ +We must also ensure that the function's body starts with sufficient +manifest lambdas otherwise loss of sharing can occur. See the comment +in @StrictAnal.lhs@. + +The transformation of bindings to worker/wrapper pairs is done by the +worker-wrapper pass. The worker-wrapper pass splits bindings on the +basis of both strictness and CPR info. If an id has both then it can +combine the transformations so that only one pair is produced. + +The analysis here detects nested CPR information. For example, if a +function returns a constructed pair, the first element of which is a +constructed int, then the analysis will detect nested CPR information +for the int as well. Unfortunately, the current transformations can't +take advantage of the nested CPR information. They have (broken now, +I think) code which will flatten out nested CPR components and rebuild +them in the wrapper, but enabling this would lose laziness. It is +possible to make use of the nested info: if we knew that a caller was +strict in that position then we could create a specialized version of +the function which flattened/reconstructed that position. + +It is not known whether this optimisation would be worthwhile. + +So we generate and carry round nested CPR information, but before +using this info to guide the creation of workers and wrappers we map +all components of a CPRInfo to NoCprInfo. + + +Data types +~~~~~~~~~~ + +Within this module Id's CPR information is represented by +``AbsVal''. When adding this information to the Id's pragma info field +we convert the ``Absval'' to a ``CprInfo'' value. + +Abstract domains consist of a `no information' value (Top), a function +value (Fun) which when applied to an argument returns a new AbsVal +(note the argument is not used in any way), , for product types, a +corresponding length tuple (Tuple) of abstract values. And finally, +Bot. Bot is not a proper abstract value but a generic bottom is +useful for calculating fixpoints and representing divergent +computations. Note that we equate Bot and Fun^n Bot (n > 0), and +likewise for Top. This saves a lot of delving in types to keep +everything exactly correct. + +Since functions abstract to constant functions we could just +represent them by the abstract value of their result. However, it +turns out (I know - I tried!) that this requires a lot of type +manipulation and the code is more straightforward if we represent +functions by an abstract constant function. + +\begin{code} +data AbsVal = Top -- Not a constructed product + + | Fun AbsVal -- A function that takes an argument + -- and gives AbsVal as result. + + | Tuple -- A constructed product of values + + | Bot -- Bot'tom included for convenience + -- we could use appropriate Tuple Vals + deriving (Eq,Show) + +-- For pretty debugging +instance Outputable AbsVal where + ppr Top = ptext SLIT("Top") + ppr (Fun r) = ptext SLIT("Fun->") <> (parens.ppr) r + ppr Tuple = ptext SLIT("Tuple ") + ppr Bot = ptext SLIT("Bot") + + +-- lub takes the lowest upper bound of two abstract values, standard. +lub :: AbsVal -> AbsVal -> AbsVal +lub Bot a = a +lub a Bot = a +lub Top a = Top +lub a Top = Top +lub Tuple Tuple = Tuple +lub (Fun l) (Fun r) = Fun (lub l r) +lub l r = panic "CPR Analysis tried to take the lub of a function and a tuple" + + +\end{code} + +The environment maps Ids to their abstract CPR value. 
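+(For instance, as a purely illustrative example: after analysing a
+binding like f x = (x, x+1) the environment would map f to Fun Tuple.)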
+ +\begin{code} + +type CPREnv = VarEnv AbsVal + +initCPREnv = emptyVarEnv + +\end{code} + +Programs +~~~~~~~~ + +Take a list of core bindings and return a new list with CPR function +ids decorated with their CprInfo pragmas. + +\begin{code} + +cprAnalyse :: DynFlags -> [CoreBind] -> IO [CoreBind] +cprAnalyse dflags binds + = do { + showPass dflags "Constructed Product analysis" ; + let { binds_plus_cpr = do_prog binds } ; + endPass dflags "Constructed Product analysis" + Opt_D_dump_cpranal binds_plus_cpr + } + where + do_prog :: [CoreBind] -> [CoreBind] + do_prog binds = snd $ mapAccumL cprAnalBind initCPREnv binds +\end{code} + +The cprAnal functions take binds/expressions and an environment which +gives CPR info for visible ids and returns a new bind/expression +with ids decorated with their CPR info. + +\begin{code} +-- Return environment extended with info from this binding +cprAnalBind :: CPREnv -> CoreBind -> (CPREnv, CoreBind) +cprAnalBind rho (NonRec b e) + | isImplicitId b -- Don't touch the CPR info on constructors, selectors etc + = (rho, NonRec b e) + | otherwise + = (extendVarEnv rho b absval, NonRec b' e') + where + (e', absval) = cprAnalExpr rho e + b' = addIdCprInfo b e' absval + +cprAnalBind rho (Rec prs) + = (final_rho, Rec (map do_pr prs)) + where + do_pr (b,e) = (b', e') + where + b' = addIdCprInfo b e' absval + (e', absval) = cprAnalExpr final_rho e + + -- When analyzing mutually recursive bindings the iterations to find + -- a fixpoint is bounded by the number of bindings in the group. + -- for simplicity we just iterate that number of times. + final_rho = nTimes (length prs) do_one_pass init_rho + init_rho = rho `extendVarEnvList` [(b,Bot) | (b,e) <- prs] + + do_one_pass :: CPREnv -> CPREnv + do_one_pass rho = foldl (\ rho (b,e) -> extendVarEnv rho b (snd (cprAnalExpr rho e))) + rho prs + + +cprAnalExpr :: CPREnv -> CoreExpr -> (CoreExpr, AbsVal) + +-- If Id will always diverge when given sufficient arguments then +-- we can just set its abs val to Bot. Any other CPR info +-- from other paths will then dominate, which is what we want. +-- Check in rho, if not there it must be imported, so check +-- the var's idinfo. +cprAnalExpr rho e@(Var v) + | isBottomingId v = (e, Bot) + | otherwise = (e, case lookupVarEnv rho v of + Just a_val -> a_val + Nothing -> getCprAbsVal v) + +-- Literals are unboxed +cprAnalExpr rho (Lit l) = (Lit l, Top) + +-- For apps we don't care about the argument's abs val. This +-- app will return a constructed product if the function does. We strip +-- a Fun from the functions abs val, unless the argument is a type argument +-- or it is already Top or Bot. +cprAnalExpr rho (App fun arg@(Type _)) + = (App fun_cpr arg, fun_res) + where + (fun_cpr, fun_res) = cprAnalExpr rho fun + +cprAnalExpr rho (App fun arg) + = (App fun_cpr arg_cpr, res_res) + where + (fun_cpr, fun_res) = cprAnalExpr rho fun + (arg_cpr, _) = cprAnalExpr rho arg + res_res = case fun_res of + Fun res_res -> res_res + Top -> Top + Bot -> Bot + Tuple -> WARN( True, ppr (App fun arg) ) Top + -- This really should not happen! + + +-- Map arguments to Top (we aren't constructing them) +-- Return the abstract value of the body, since functions +-- are represented by the CPR value of their result, and +-- add a Fun for this lambda.. 
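+-- For example (illustration only):
+--     \x -> (x, x+1)    abstracts to    Fun Tuple
+--     \x -> x           abstracts to    Fun Top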
+cprAnalExpr rho (Lam b body) | isTyVar b = (Lam b body_cpr, body_aval) + | otherwise = (Lam b body_cpr, Fun body_aval) + where + (body_cpr, body_aval) = cprAnalExpr (extendVarEnv rho b Top) body + +cprAnalExpr rho (Let bind body) + = (Let bind' body', body_aval) + where + (rho', bind') = cprAnalBind rho bind + (body', body_aval) = cprAnalExpr rho' body + +cprAnalExpr rho (Case scrut bndr alts) + = (Case scrut_cpr bndr alts_cpr, alts_aval) + where + (scrut_cpr, scrut_aval) = cprAnalExpr rho scrut + (alts_cpr, alts_aval) = cprAnalCaseAlts (extendVarEnv rho bndr scrut_aval) alts + +cprAnalExpr rho (Note n exp) + = (Note n exp_cpr, expr_aval) + where + (exp_cpr, expr_aval) = cprAnalExpr rho exp + +cprAnalExpr rho (Type t) + = (Type t, Top) + +cprAnalCaseAlts :: CPREnv -> [CoreAlt] -> ([CoreAlt], AbsVal) +cprAnalCaseAlts rho alts + = foldr anal_alt ([], Bot) alts + where + anal_alt :: CoreAlt -> ([CoreAlt], AbsVal) -> ([CoreAlt], AbsVal) + anal_alt (con, binds, exp) (done, aval) + = ((con,binds,exp_cpr) : done, exp_aval `lub` aval) + where (exp_cpr, exp_aval) = cprAnalExpr rho' exp + rho' = rho `extendVarEnvList` (zip binds (repeat Top)) + + +addIdCprInfo :: Id -> CoreExpr -> AbsVal -> Id +addIdCprInfo bndr rhs absval + | useful_info && ok_to_add = setIdCprInfo bndr cpr_info + | otherwise = bndr + where + cpr_info = absToCprInfo absval + useful_info = case cpr_info of { ReturnsCPR -> True; NoCPRInfo -> False } + + ok_to_add = case absval of + Fun _ -> idArity bndr >= n_fun_tys absval + -- Enough visible lambdas + + Tuple -> exprIsHNF rhs || isStrict (idDemandInfo bndr) + -- If the rhs is a value, and returns a constructed product, + -- it will be inlined at usage sites, so we give it a Tuple absval + -- If it isn't a value, we won't inline it (code/work dup worries), so + -- we discard its absval. + -- + -- Also, if the strictness analyser has figured out that it's strict, + -- the let-to-case transformation will happen, so again it's good. + -- (CPR analysis runs before the simplifier has had a chance to do + -- the let-to-case transform.) + -- This made a big difference to PrelBase.modInt, which had something like + -- modInt = \ x -> let r = ... -> I# v in + -- ...body strict in r... + -- r's RHS isn't a value yet; but modInt returns r in various branches, so + -- if r doesn't have the CPR property then neither does modInt + + _ -> False + + n_fun_tys :: AbsVal -> Int + n_fun_tys (Fun av) = 1 + n_fun_tys av + n_fun_tys other = 0 + + +absToCprInfo :: AbsVal -> CprInfo +absToCprInfo Tuple = ReturnsCPR +absToCprInfo (Fun r) = absToCprInfo r +absToCprInfo _ = NoCPRInfo + + +-- Cpr Info doesn't store the number of arguments a function has, so the caller +-- must take care to add the appropriate number of Funs. +getCprAbsVal v = case idCprInfo v of + NoCPRInfo -> Top + ReturnsCPR -> nTimes arity Fun Tuple + where + arity = idArity v + -- Imported (non-nullary) constructors will have the CPR property + -- in their IdInfo, so no need to look at their unfolding +#endif /* OLD_STRICTNESS */ +\end{code} diff --git a/compiler/deSugar/Check.lhs b/compiler/deSugar/Check.lhs new file mode 100644 index 0000000000..9aac5ce777 --- /dev/null +++ b/compiler/deSugar/Check.lhs @@ -0,0 +1,698 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1997-1998 +% +% Author: Juan J. 
Quintela <quintela@krilin.dc.fi.udc.es> +\section{Module @Check@ in @deSugar@} + +\begin{code} + + +module Check ( check , ExhaustivePat ) where + + +import HsSyn +import TcHsSyn ( hsPatType, mkVanillaTuplePat ) +import TcType ( tcTyConAppTyCon ) +import DsUtils ( EquationInfo(..), MatchResult(..), + CanItFail(..), firstPat ) +import MatchLit ( tidyLitPat, tidyNPat ) +import Id ( Id, idType ) +import DataCon ( DataCon, dataConTyCon, dataConOrigArgTys, dataConFieldLabels ) +import Name ( Name, mkInternalName, getOccName, isDataSymOcc, + getName, mkVarOccFS ) +import TysWiredIn +import PrelNames ( unboundKey ) +import TyCon ( tyConDataCons, tupleTyConBoxity, isTupleTyCon ) +import BasicTypes ( Boxity(..) ) +import SrcLoc ( noSrcLoc, Located(..), unLoc, noLoc ) +import UniqSet +import Util ( takeList, splitAtList, notNull ) +import Outputable +import FastString + +#include "HsVersions.h" +\end{code} + +This module performs checks about if one list of equations are: +\begin{itemize} +\item Overlapped +\item Non exhaustive +\end{itemize} +To discover that we go through the list of equations in a tree-like fashion. + +If you like theory, a similar algorithm is described in: +\begin{quotation} + {\em Two Techniques for Compiling Lazy Pattern Matching}, + Luc Maranguet, + INRIA Rocquencourt (RR-2385, 1994) +\end{quotation} +The algorithm is based on the first technique, but there are some differences: +\begin{itemize} +\item We don't generate code +\item We have constructors and literals (not only literals as in the + article) +\item We don't use directions, we must select the columns from + left-to-right +\end{itemize} +(By the way the second technique is really similar to the one used in + @Match.lhs@ to generate code) + +This function takes the equations of a pattern and returns: +\begin{itemize} +\item The patterns that are not recognized +\item The equations that are not overlapped +\end{itemize} +It simplify the patterns and then call @check'@ (the same semantics), and it +needs to reconstruct the patterns again .... + +The problem appear with things like: +\begin{verbatim} + f [x,y] = .... + f (x:xs) = ..... +\end{verbatim} +We want to put the two patterns with the same syntax, (prefix form) and +then all the constructors are equal: +\begin{verbatim} + f (: x (: y [])) = .... + f (: x xs) = ..... +\end{verbatim} +(more about that in @simplify_eqns@) + +We would prefer to have a @WarningPat@ of type @String@, but Strings and the +Pretty Printer are not friends. + +We use @InPat@ in @WarningPat@ instead of @OutPat@ +because we need to print the +warning messages in the same way they are introduced, i.e. if the user +wrote: +\begin{verbatim} + f [x,y] = .. +\end{verbatim} +He don't want a warning message written: +\begin{verbatim} + f (: x (: y [])) ........ +\end{verbatim} +Then we need to use InPats. +\begin{quotation} + Juan Quintela 5 JUL 1998\\ + User-friendliness and compiler writers are no friends. +\end{quotation} + +\begin{code} +type WarningPat = InPat Name +type ExhaustivePat = ([WarningPat], [(Name, [HsLit])]) +type EqnNo = Int +type EqnSet = UniqSet EqnNo + + +check :: [EquationInfo] -> ([ExhaustivePat], [EquationInfo]) + -- Second result is the shadowed equations +check qs = (untidy_warns, shadowed_eqns) + where + (warns, used_nos) = check' ([1..] 
`zip` map simplify_eqn qs)
+ untidy_warns = map untidy_exhaustive warns
+ shadowed_eqns = [eqn | (eqn,i) <- qs `zip` [1..],
+ not (i `elementOfUniqSet` used_nos)]
+
+untidy_exhaustive :: ExhaustivePat -> ExhaustivePat
+untidy_exhaustive ([pat], messages) =
+ ([untidy_no_pars pat], map untidy_message messages)
+untidy_exhaustive (pats, messages) =
+ (map untidy_pars pats, map untidy_message messages)
+
+untidy_message :: (Name, [HsLit]) -> (Name, [HsLit])
+untidy_message (string, lits) = (string, map untidy_lit lits)
+\end{code}
+
+The function @untidy@ does the reverse work of the @simplify_pat@ function.
+
+\begin{code}
+
+type NeedPars = Bool
+
+untidy_no_pars :: WarningPat -> WarningPat
+untidy_no_pars p = untidy False p
+
+untidy_pars :: WarningPat -> WarningPat
+untidy_pars p = untidy True p
+
+untidy :: NeedPars -> WarningPat -> WarningPat
+untidy b (L loc p) = L loc (untidy' b p)
+ where
+ untidy' _ p@(WildPat _) = p
+ untidy' _ p@(VarPat name) = p
+ untidy' _ (LitPat lit) = LitPat (untidy_lit lit)
+ untidy' _ p@(ConPatIn name (PrefixCon [])) = p
+ untidy' b (ConPatIn name ps) = pars b (L loc (ConPatIn name (untidy_con ps)))
+ untidy' _ (ListPat pats ty) = ListPat (map untidy_no_pars pats) ty
+ untidy' _ (TuplePat pats box ty) = TuplePat (map untidy_no_pars pats) box ty
+ untidy' _ (PArrPat _ _) = panic "Check.untidy: Shouldn't get a parallel array here!"
+ untidy' _ (SigPatIn _ _) = panic "Check.untidy: SigPat"
+
+untidy_con (PrefixCon pats) = PrefixCon (map untidy_pars pats)
+untidy_con (InfixCon p1 p2) = InfixCon (untidy_pars p1) (untidy_pars p2)
+untidy_con (RecCon bs) = RecCon [(f,untidy_pars p) | (f,p) <- bs]
+
+pars :: NeedPars -> WarningPat -> Pat Name
+pars True p = ParPat p
+pars _ p = unLoc p
+
+untidy_lit :: HsLit -> HsLit
+untidy_lit (HsCharPrim c) = HsChar c
+untidy_lit lit = lit
+\end{code}
+
+The function @check'@ below does the same as @check@; the only difference
+is that the boring work has already been done (it needs doing only once).
+That is the reason to have two functions: @check@ is the external
+interface, and @check'@ is called recursively.
+
+There are several cases:
+
+\begin{itemize}
+\item There are no equations: everything is OK.
+\item There is only one equation, it can fail, and all the patterns are
+ variables. Then that equation is used, and the match is also reported
+ as non-exhaustive.
+\item All the patterns are variables, the match can fail, and there are
+ more equations. Then the result is the result of the rest of the
+ equations, and this equation is used as well.
+
+\item The general case: if all the patterns are variables (here the match
+ can't fail), then the result is that this equation is used and it
+ doesn't generate any non-exhaustive cases.
+
+\item In the general case the first column can contain literals,
+ constructors, or only variables, and we act accordingly.
+ +\end{itemize} + + +\begin{code} + +check' :: [(EqnNo, EquationInfo)] + -> ([ExhaustivePat], -- Pattern scheme that might not be matched at all + EqnSet) -- Eqns that are used (others are overlapped) + +check' [] = ([([],[])],emptyUniqSet) + +check' ((n, EqnInfo { eqn_pats = ps, eqn_rhs = MatchResult can_fail _ }) : rs) + | first_eqn_all_vars && case can_fail of { CantFail -> True; CanFail -> False } + = ([], unitUniqSet n) -- One eqn, which can't fail + + | first_eqn_all_vars && null rs -- One eqn, but it can fail + = ([(takeList ps (repeat nlWildPat),[])], unitUniqSet n) + + | first_eqn_all_vars -- Several eqns, first can fail + = (pats, addOneToUniqSet indexs n) + where + first_eqn_all_vars = all_vars ps + (pats,indexs) = check' rs + +check' qs + | literals = split_by_literals qs + | constructors = split_by_constructor qs + | only_vars = first_column_only_vars qs + | otherwise = pprPanic "Check.check': Not implemented :-(" (ppr first_pats) + where + -- Note: RecPats will have been simplified to ConPats + -- at this stage. + first_pats = ASSERT2( okGroup qs, pprGroup qs ) map firstPatN qs + constructors = any is_con first_pats + literals = any is_lit first_pats + only_vars = all is_var first_pats +\end{code} + +Here begins the code to deal with literals, we need to split the matrix +in different matrix beginning by each literal and a last matrix with the +rest of values. + +\begin{code} +split_by_literals :: [(EqnNo, EquationInfo)] -> ([ExhaustivePat], EqnSet) +split_by_literals qs = process_literals used_lits qs + where + used_lits = get_used_lits qs +\end{code} + +@process_explicit_literals@ is a function that process each literal that appears +in the column of the matrix. + +\begin{code} +process_explicit_literals :: [HsLit] -> [(EqnNo, EquationInfo)] -> ([ExhaustivePat],EqnSet) +process_explicit_literals lits qs = (concat pats, unionManyUniqSets indexs) + where + pats_indexs = map (\x -> construct_literal_matrix x qs) lits + (pats,indexs) = unzip pats_indexs +\end{code} + + +@process_literals@ calls @process_explicit_literals@ to deal with the literals +that appears in the matrix and deal also with the rest of the cases. It +must be one Variable to be complete. + +\begin{code} + +process_literals :: [HsLit] -> [(EqnNo, EquationInfo)] -> ([ExhaustivePat],EqnSet) +process_literals used_lits qs + | null default_eqns = ([make_row_vars used_lits (head qs)] ++ pats,indexs) + | otherwise = (pats_default,indexs_default) + where + (pats,indexs) = process_explicit_literals used_lits qs + default_eqns = ASSERT2( okGroup qs, pprGroup qs ) + [remove_var q | q <- qs, is_var (firstPatN q)] + (pats',indexs') = check' default_eqns + pats_default = [(nlWildPat:ps,constraints) | (ps,constraints) <- (pats')] ++ pats + indexs_default = unionUniqSets indexs' indexs +\end{code} + +Here we have selected the literal and we will select all the equations that +begins for that literal and create a new matrix. 
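+For example (an editorial illustration added here, not part of the original
+notes; the function @f@ is made up), given
+\begin{verbatim}
+ f :: Int -> Bool
+ f 0 = True
+ f 1 = False
+ f 1 = True
+\end{verbatim}
+the used literals are 0 and 1. Constructing the matrix for the literal 1
+selects the second and third equations; checking that sub-matrix marks only
+the second one as used, so the third is reported as overlapped. Since there
+is no variable equation, the result also contains a row standing for any
+integer other than 0 and 1, which is reported as not matched.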
+ +\begin{code} +construct_literal_matrix :: HsLit -> [(EqnNo, EquationInfo)] -> ([ExhaustivePat],EqnSet) +construct_literal_matrix lit qs = + (map (\ (xs,ys) -> (new_lit:xs,ys)) pats,indexs) + where + (pats,indexs) = (check' (remove_first_column_lit lit qs)) + new_lit = nlLitPat lit + +remove_first_column_lit :: HsLit + -> [(EqnNo, EquationInfo)] + -> [(EqnNo, EquationInfo)] +remove_first_column_lit lit qs + = ASSERT2( okGroup qs, pprGroup qs ) + [(n, shift_pat eqn) | q@(n,eqn) <- qs, is_var_lit lit (firstPatN q)] + where + shift_pat eqn@(EqnInfo { eqn_pats = _:ps}) = eqn { eqn_pats = ps } + shift_pat eqn@(EqnInfo { eqn_pats = []}) = panic "Check.shift_var: no patterns" +\end{code} + +This function splits the equations @qs@ in groups that deal with the +same constructor. + +\begin{code} +split_by_constructor :: [(EqnNo, EquationInfo)] -> ([ExhaustivePat], EqnSet) +split_by_constructor qs + | notNull unused_cons = need_default_case used_cons unused_cons qs + | otherwise = no_need_default_case used_cons qs + where + used_cons = get_used_cons qs + unused_cons = get_unused_cons used_cons +\end{code} + +The first column of the patterns matrix only have vars, then there is +nothing to do. + +\begin{code} +first_column_only_vars :: [(EqnNo, EquationInfo)] -> ([ExhaustivePat],EqnSet) +first_column_only_vars qs = (map (\ (xs,ys) -> (nlWildPat:xs,ys)) pats,indexs) + where + (pats, indexs) = check' (map remove_var qs) +\end{code} + +This equation takes a matrix of patterns and split the equations by +constructor, using all the constructors that appears in the first column +of the pattern matching. + +We can need a default clause or not ...., it depends if we used all the +constructors or not explicitly. The reasoning is similar to @process_literals@, +the difference is that here the default case is not always needed. + +\begin{code} +no_need_default_case :: [Pat Id] -> [(EqnNo, EquationInfo)] -> ([ExhaustivePat],EqnSet) +no_need_default_case cons qs = (concat pats, unionManyUniqSets indexs) + where + pats_indexs = map (\x -> construct_matrix x qs) cons + (pats,indexs) = unzip pats_indexs + +need_default_case :: [Pat Id] -> [DataCon] -> [(EqnNo, EquationInfo)] -> ([ExhaustivePat],EqnSet) +need_default_case used_cons unused_cons qs + | null default_eqns = (pats_default_no_eqns,indexs) + | otherwise = (pats_default,indexs_default) + where + (pats,indexs) = no_need_default_case used_cons qs + default_eqns = ASSERT2( okGroup qs, pprGroup qs ) + [remove_var q | q <- qs, is_var (firstPatN q)] + (pats',indexs') = check' default_eqns + pats_default = [(make_whole_con c:ps,constraints) | + c <- unused_cons, (ps,constraints) <- pats'] ++ pats + new_wilds = make_row_vars_for_constructor (head qs) + pats_default_no_eqns = [(make_whole_con c:new_wilds,[]) | c <- unused_cons] ++ pats + indexs_default = unionUniqSets indexs' indexs + +construct_matrix :: Pat Id -> [(EqnNo, EquationInfo)] -> ([ExhaustivePat],EqnSet) +construct_matrix con qs = + (map (make_con con) pats,indexs) + where + (pats,indexs) = (check' (remove_first_column con qs)) +\end{code} + +Here remove first column is more difficult that with literals due to the fact +that constructors can have arguments. 
+ +For instance, the matrix +\begin{verbatim} + (: x xs) y + z y +\end{verbatim} +is transformed in: +\begin{verbatim} + x xs y + _ _ y +\end{verbatim} + +\begin{code} +remove_first_column :: Pat Id -- Constructor + -> [(EqnNo, EquationInfo)] + -> [(EqnNo, EquationInfo)] +remove_first_column (ConPatOut (L _ con) _ _ _ (PrefixCon con_pats) _) qs + = ASSERT2( okGroup qs, pprGroup qs ) + [(n, shift_var eqn) | q@(n, eqn) <- qs, is_var_con con (firstPatN q)] + where + new_wilds = [WildPat (hsPatType arg_pat) | arg_pat <- con_pats] + shift_var eqn@(EqnInfo { eqn_pats = ConPatOut _ _ _ _ (PrefixCon ps') _ : ps}) + = eqn { eqn_pats = map unLoc ps' ++ ps } + shift_var eqn@(EqnInfo { eqn_pats = WildPat _ : ps }) + = eqn { eqn_pats = new_wilds ++ ps } + shift_var _ = panic "Check.Shift_var:No done" + +make_row_vars :: [HsLit] -> (EqnNo, EquationInfo) -> ExhaustivePat +make_row_vars used_lits (_, EqnInfo { eqn_pats = pats}) + = (nlVarPat new_var:takeList (tail pats) (repeat nlWildPat),[(new_var,used_lits)]) + where + new_var = hash_x + +hash_x = mkInternalName unboundKey {- doesn't matter much -} + (mkVarOccFS FSLIT("#x")) + noSrcLoc + +make_row_vars_for_constructor :: (EqnNo, EquationInfo) -> [WarningPat] +make_row_vars_for_constructor (_, EqnInfo { eqn_pats = pats}) + = takeList (tail pats) (repeat nlWildPat) + +compare_cons :: Pat Id -> Pat Id -> Bool +compare_cons (ConPatOut (L _ id1) _ _ _ _ _) (ConPatOut (L _ id2) _ _ _ _ _) = id1 == id2 + +remove_dups :: [Pat Id] -> [Pat Id] +remove_dups [] = [] +remove_dups (x:xs) | or (map (\y -> compare_cons x y) xs) = remove_dups xs + | otherwise = x : remove_dups xs + +get_used_cons :: [(EqnNo, EquationInfo)] -> [Pat Id] +get_used_cons qs = remove_dups [pat | q <- qs, let pat = firstPatN q, + isConPatOut pat] + +isConPatOut (ConPatOut {}) = True +isConPatOut other = False + +remove_dups' :: [HsLit] -> [HsLit] +remove_dups' [] = [] +remove_dups' (x:xs) | x `elem` xs = remove_dups' xs + | otherwise = x : remove_dups' xs + + +get_used_lits :: [(EqnNo, EquationInfo)] -> [HsLit] +get_used_lits qs = remove_dups' all_literals + where + all_literals = get_used_lits' qs + +get_used_lits' :: [(EqnNo, EquationInfo)] -> [HsLit] +get_used_lits' [] = [] +get_used_lits' (q:qs) + | Just lit <- get_lit (firstPatN q) = lit : get_used_lits' qs + | otherwise = get_used_lits qs + +get_lit :: Pat id -> Maybe HsLit +-- Get a representative HsLit to stand for the OverLit +-- It doesn't matter which one, because they will only be compared +-- with other HsLits gotten in the same way +get_lit (LitPat lit) = Just lit +get_lit (NPat (HsIntegral i _) mb _ _) = Just (HsIntPrim (mb_neg mb i)) +get_lit (NPat (HsFractional f _) mb _ _) = Just (HsFloatPrim (mb_neg mb f)) +get_lit other_pat = Nothing + +mb_neg :: Num a => Maybe b -> a -> a +mb_neg Nothing v = v +mb_neg (Just _) v = -v + +get_unused_cons :: [Pat Id] -> [DataCon] +get_unused_cons used_cons = unused_cons + where + (ConPatOut _ _ _ _ _ ty) = head used_cons + ty_con = tcTyConAppTyCon ty -- Newtype observable + all_cons = tyConDataCons ty_con + used_cons_as_id = map (\ (ConPatOut (L _ d) _ _ _ _ _) -> d) used_cons + unused_cons = uniqSetToList + (mkUniqSet all_cons `minusUniqSet` mkUniqSet used_cons_as_id) + +all_vars :: [Pat Id] -> Bool +all_vars [] = True +all_vars (WildPat _:ps) = all_vars ps +all_vars _ = False + +remove_var :: (EqnNo, EquationInfo) -> (EqnNo, EquationInfo) +remove_var (n, eqn@(EqnInfo { eqn_pats = WildPat _ : ps})) = (n, eqn { eqn_pats = ps }) +remove_var _ = panic "Check.remove_var: equation does not begin 
with a variable"
+
+-----------------------
+eqnPats :: (EqnNo, EquationInfo) -> [Pat Id]
+eqnPats (_, eqn) = eqn_pats eqn
+
+okGroup :: [(EqnNo, EquationInfo)] -> Bool
+-- True if all equations have at least one pattern, and
+-- all have the same number of patterns
+okGroup [] = True
+okGroup (e:es) = n_pats > 0 && and [length (eqnPats e) == n_pats | e <- es]
+ where
+ n_pats = length (eqnPats e)
+
+-- Half-baked print
+pprGroup es = vcat (map pprEqnInfo es)
+pprEqnInfo e = ppr (eqnPats e)
+
+
+firstPatN :: (EqnNo, EquationInfo) -> Pat Id
+firstPatN (_, eqn) = firstPat eqn
+
+is_con :: Pat Id -> Bool
+is_con (ConPatOut _ _ _ _ _ _) = True
+is_con _ = False
+
+is_lit :: Pat Id -> Bool
+is_lit (LitPat _) = True
+is_lit (NPat _ _ _ _) = True
+is_lit _ = False
+
+is_var :: Pat Id -> Bool
+is_var (WildPat _) = True
+is_var _ = False
+
+is_var_con :: DataCon -> Pat Id -> Bool
+is_var_con con (WildPat _) = True
+is_var_con con (ConPatOut (L _ id) _ _ _ _ _) | id == con = True
+is_var_con con _ = False
+
+is_var_lit :: HsLit -> Pat Id -> Bool
+is_var_lit lit (WildPat _) = True
+is_var_lit lit pat
+ | Just lit' <- get_lit pat = lit == lit'
+ | otherwise = False
+\end{code}
+
+The difference between @make_con@ and @make_whole_con@ is that
+@make_whole_con@ creates a new constructor with all its arguments, while
+@make_con@ takes a list of arguments and creates the constructor, getting
+its arguments from that list. See where \fbox{\ ???\ } is used for details.
+
+We need to reconstruct the patterns (make the constructors infix and
+similar) at the same time as we create the constructors.
+
+You can tell tuple constructors using
+\begin{verbatim}
+ Id.isTupleCon
+\end{verbatim}
+You can see whether a constructor is infix with this clearer code :-))))))))))
+\begin{verbatim}
+ Lex.isLexConSym (Name.occNameString (Name.getOccName con))
+\end{verbatim}
+
+ Rather clumsy but it works. (Simon Peyton Jones)
+
+
+We don't mind the @nilDataCon@ because it doesn't change the way the
+message is printed; we are searching only for things like @[1,2,3]@,
+not @x:xs@ ....
+
+In @reconstruct_pat@ we want to ``undo'' the work
+that we have done in @simplify_pat@.
+In particular:
+\begin{tabular}{lll}
+ @((,) x y)@ & goes back to being & @(x, y)@
+\\ @((:) x xs)@ & goes back to being & @(x:xs)@
+\\ @(x:(...:[]))@ & goes back to being & @[x,...]@
+\end{tabular}
+%
+The difficult case is the third one, because we need to follow all the
+constructors until the @[]@ to know that we need to use the third case,
+not the second.
\fbox{\ ???\ } +% +\begin{code} +isInfixCon con = isDataSymOcc (getOccName con) + +is_nil (ConPatIn con (PrefixCon [])) = unLoc con == getName nilDataCon +is_nil _ = False + +is_list (ListPat _ _) = True +is_list _ = False + +return_list id q = id == consDataCon && (is_nil q || is_list q) + +make_list p q | is_nil q = ListPat [p] placeHolderType +make_list p (ListPat ps ty) = ListPat (p:ps) ty +make_list _ _ = panic "Check.make_list: Invalid argument" + +make_con :: Pat Id -> ExhaustivePat -> ExhaustivePat +make_con (ConPatOut (L _ id) _ _ _ _ _) (lp:lq:ps, constraints) + | return_list id q = (noLoc (make_list lp q) : ps, constraints) + | isInfixCon id = (nlInfixConPat (getName id) lp lq : ps, constraints) + where q = unLoc lq + +make_con (ConPatOut (L _ id) _ _ _ (PrefixCon pats) ty) (ps, constraints) + | isTupleTyCon tc = (noLoc (TuplePat pats_con (tupleTyConBoxity tc) ty) : rest_pats, constraints) + | isPArrFakeCon id = (noLoc (PArrPat pats_con placeHolderType) : rest_pats, constraints) + | otherwise = (nlConPat name pats_con : rest_pats, constraints) + where + name = getName id + (pats_con, rest_pats) = splitAtList pats ps + tc = dataConTyCon id + +-- reconstruct parallel array pattern +-- +-- * don't check for the type only; we need to make sure that we are really +-- dealing with one of the fake constructors and not with the real +-- representation + +make_whole_con :: DataCon -> WarningPat +make_whole_con con | isInfixCon con = nlInfixConPat name nlWildPat nlWildPat + | otherwise = nlConPat name pats + where + name = getName con + pats = [nlWildPat | t <- dataConOrigArgTys con] +\end{code} + +This equation makes the same thing as @tidy@ in @Match.lhs@, the +difference is that here we can do all the tidy in one place and in the +@Match@ tidy it must be done one column each time due to bookkeeping +constraints. + +\begin{code} + +simplify_eqn :: EquationInfo -> EquationInfo +simplify_eqn eqn = eqn { eqn_pats = map simplify_pat (eqn_pats eqn), + eqn_rhs = simplify_rhs (eqn_rhs eqn) } + where + -- Horrible hack. The simplify_pat stuff converts NPlusK pats to WildPats + -- which of course loses the info that they can fail to match. So we + -- stick in a CanFail as if it were a guard. + -- The Right Thing to do is for the whole system to treat NPlusK pats properly + simplify_rhs (MatchResult can_fail body) + | any has_nplusk_pat (eqn_pats eqn) = MatchResult CanFail body + | otherwise = MatchResult can_fail body + +has_nplusk_lpat :: LPat Id -> Bool +has_nplusk_lpat (L _ p) = has_nplusk_pat p + +has_nplusk_pat :: Pat Id -> Bool +has_nplusk_pat (NPlusKPat _ _ _ _) = True +has_nplusk_pat (ParPat p) = has_nplusk_lpat p +has_nplusk_pat (AsPat _ p) = has_nplusk_lpat p +has_nplusk_pat (SigPatOut p _ ) = has_nplusk_lpat p +has_nplusk_pat (ConPatOut _ _ _ _ ps ty) = any has_nplusk_lpat (hsConArgs ps) +has_nplusk_pat (ListPat ps _) = any has_nplusk_lpat ps +has_nplusk_pat (TuplePat ps _ _) = any has_nplusk_lpat ps +has_nplusk_pat (PArrPat ps _) = any has_nplusk_lpat ps +has_nplusk_pat (LazyPat p) = False -- Why? 
+has_nplusk_pat (BangPat p) = has_nplusk_lpat p -- I think +has_nplusk_pat p = False -- VarPat, VarPatOut, WildPat, LitPat, NPat, TypePat, DictPat + +simplify_lpat :: LPat Id -> LPat Id +simplify_lpat p = fmap simplify_pat p + +simplify_pat :: Pat Id -> Pat Id +simplify_pat pat@(WildPat gt) = pat +simplify_pat (VarPat id) = WildPat (idType id) +simplify_pat (VarPatOut id _) = WildPat (idType id) -- Ignore the bindings +simplify_pat (ParPat p) = unLoc (simplify_lpat p) +simplify_pat (LazyPat p) = unLoc (simplify_lpat p) +simplify_pat (BangPat p) = unLoc (simplify_lpat p) +simplify_pat (AsPat id p) = unLoc (simplify_lpat p) +simplify_pat (SigPatOut p _) = unLoc (simplify_lpat p) -- I'm not sure this is right + +simplify_pat (ConPatOut (L loc id) tvs dicts binds ps ty) + = ConPatOut (L loc id) tvs dicts binds (simplify_con id ps) ty + +simplify_pat (ListPat ps ty) = + unLoc $ foldr (\ x y -> mkPrefixConPat consDataCon [x,y] list_ty) + (mkNilPat list_ty) + (map simplify_lpat ps) + where list_ty = mkListTy ty + +-- introduce fake parallel array constructors to be able to handle parallel +-- arrays with the existing machinery for constructor pattern +-- +simplify_pat (PArrPat ps ty) + = mk_simple_con_pat (parrFakeCon (length ps)) + (PrefixCon (map simplify_lpat ps)) + (mkPArrTy ty) + +simplify_pat (TuplePat ps boxity ty) + = mk_simple_con_pat (tupleCon boxity arity) + (PrefixCon (map simplify_lpat ps)) + ty + where + arity = length ps + +-- unpack string patterns fully, so we can see when they overlap with +-- each other, or even explicit lists of Chars. +simplify_pat pat@(LitPat (HsString s)) = + foldr (\c pat -> mk_simple_con_pat consDataCon (PrefixCon [mk_char_lit c,noLoc pat]) stringTy) + (mk_simple_con_pat nilDataCon (PrefixCon []) stringTy) (unpackFS s) + where + mk_char_lit c = noLoc (mk_simple_con_pat charDataCon (PrefixCon [nlLitPat (HsCharPrim c)]) charTy) + +simplify_pat pat@(LitPat lit) = unLoc (tidyLitPat lit (noLoc pat)) + +simplify_pat pat@(NPat lit mb_neg _ lit_ty) = unLoc (tidyNPat lit mb_neg lit_ty (noLoc pat)) + +simplify_pat (NPlusKPat id hslit hsexpr1 hsexpr2) + = WildPat (idType (unLoc id)) + +simplify_pat (DictPat dicts methods) + = case num_of_d_and_ms of + 0 -> simplify_pat (TuplePat [] Boxed unitTy) + 1 -> simplify_pat (head dict_and_method_pats) + _ -> simplify_pat (mkVanillaTuplePat (map noLoc dict_and_method_pats) Boxed) + where + num_of_d_and_ms = length dicts + length methods + dict_and_method_pats = map VarPat (dicts ++ methods) + +mk_simple_con_pat con args ty = ConPatOut (noLoc con) [] [] emptyLHsBinds args ty + +----------------- +simplify_con con (PrefixCon ps) = PrefixCon (map simplify_lpat ps) +simplify_con con (InfixCon p1 p2) = PrefixCon [simplify_lpat p1, simplify_lpat p2] +simplify_con con (RecCon fs) + | null fs = PrefixCon [nlWildPat | t <- dataConOrigArgTys con] + -- Special case for null patterns; maybe not a record at all + | otherwise = PrefixCon (map (simplify_lpat.snd) all_pats) + where + -- pad out all the missing fields with WildPats. 
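+ -- Start from one wildcard per field label; insertNm (below) then
+ -- overwrites the fields that the record pattern actually names, so
+ -- all_pats covers every argument of the constructor.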
+ field_pats = map (\ f -> (f, nlWildPat)) (dataConFieldLabels con) + all_pats = foldr (\ (id,p) acc -> insertNm (getName (unLoc id)) p acc) + field_pats fs + + insertNm nm p [] = [(nm,p)] + insertNm nm p (x@(n,_):xs) + | nm == n = (nm,p):xs + | otherwise = x : insertNm nm p xs +\end{code} diff --git a/compiler/deSugar/Desugar.lhs b/compiler/deSugar/Desugar.lhs new file mode 100644 index 0000000000..45dc113cc1 --- /dev/null +++ b/compiler/deSugar/Desugar.lhs @@ -0,0 +1,298 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[Desugar]{@deSugar@: the main function} + +\begin{code} +module Desugar ( deSugar, deSugarExpr ) where + +#include "HsVersions.h" + +import DynFlags ( DynFlag(..), DynFlags(..), dopt, GhcMode(..) ) +import StaticFlags ( opt_SccProfilingOn ) +import DriverPhases ( isHsBoot ) +import HscTypes ( ModGuts(..), HscEnv(..), + Dependencies(..), ForeignStubs(..), TypeEnv, IsBootInterface ) +import HsSyn ( RuleDecl(..), RuleBndr(..), LHsExpr, LRuleDecl ) +import TcRnTypes ( TcGblEnv(..), ImportAvails(..) ) +import MkIface ( mkUsageInfo ) +import Id ( Id, setIdExported, idName ) +import Name ( Name, isExternalName, nameIsLocalOrFrom, nameOccName ) +import CoreSyn +import PprCore ( pprRules, pprCoreExpr ) +import DsMonad +import DsExpr ( dsLExpr ) +import DsBinds ( dsTopLHsBinds, decomposeRuleLhs, AutoScc(..) ) +import DsForeign ( dsForeigns ) +import DsExpr () -- Forces DsExpr to be compiled; DsBinds only + -- depends on DsExpr.hi-boot. +import Module ( Module, moduleEnvElts, delModuleEnv, moduleFS ) +import RdrName ( GlobalRdrEnv ) +import NameSet +import VarSet +import Bag ( Bag, isEmptyBag, emptyBag ) +import Rules ( roughTopNames ) +import CoreLint ( showPass, endPass ) +import CoreFVs ( ruleRhsFreeVars, exprsFreeNames ) +import Packages ( PackageState(thPackageId), PackageIdH(..) ) +import ErrUtils ( doIfSet, dumpIfSet_dyn, printBagOfWarnings, + errorsFound, WarnMsg ) +import ListSetOps ( insertList ) +import Outputable +import UniqSupply ( mkSplitUniqSupply ) +import SrcLoc ( Located(..) ) +import DATA_IOREF ( readIORef ) +import Maybes ( catMaybes ) +import FastString +import Util ( sortLe ) +\end{code} + +%************************************************************************ +%* * +%* The main function: deSugar +%* * +%************************************************************************ + +\begin{code} +deSugar :: HscEnv -> TcGblEnv -> IO (Bag WarnMsg, Maybe ModGuts) +-- Can modify PCS by faulting in more declarations + +deSugar hsc_env + tcg_env@(TcGblEnv { tcg_mod = mod, + tcg_src = hsc_src, + tcg_type_env = type_env, + tcg_imports = imports, + tcg_home_mods = home_mods, + tcg_exports = exports, + tcg_dus = dus, + tcg_inst_uses = dfun_uses_var, + tcg_th_used = th_var, + tcg_keep = keep_var, + tcg_rdr_env = rdr_env, + tcg_fix_env = fix_env, + tcg_deprecs = deprecs, + tcg_binds = binds, + tcg_fords = fords, + tcg_rules = rules, + tcg_insts = insts }) + = do { showPass dflags "Desugar" + + -- Desugar the program + ; ((all_prs, ds_rules, ds_fords), warns) + <- case ghcMode (hsc_dflags hsc_env) of + JustTypecheck -> return (([], [], NoStubs), emptyBag) + _ -> initDs hsc_env mod rdr_env type_env $ do + { core_prs <- dsTopLHsBinds auto_scc binds + ; (ds_fords, foreign_prs) <- dsForeigns fords + ; let all_prs = foreign_prs ++ core_prs + local_bndrs = mkVarSet (map fst all_prs) + ; ds_rules <- mappM (dsRule mod local_bndrs) rules + ; return (all_prs, catMaybes ds_rules, ds_fords) + } + + -- If warnings are considered errors, leave. 
+ ; if errorsFound dflags (warns, emptyBag) + then return (warns, Nothing) + else do + + { -- Add export flags to bindings + keep_alive <- readIORef keep_var + ; let final_prs = addExportFlags ghci_mode exports keep_alive + all_prs ds_rules + ds_binds = [Rec final_prs] + -- Notice that we put the whole lot in a big Rec, even the foreign binds + -- When compiling PrelFloat, which defines data Float = F# Float# + -- we want F# to be in scope in the foreign marshalling code! + -- You might think it doesn't matter, but the simplifier brings all top-level + -- things into the in-scope set before simplifying; so we get no unfolding for F#! + + -- Lint result if necessary + ; endPass dflags "Desugar" Opt_D_dump_ds ds_binds + + -- Dump output + ; doIfSet (dopt Opt_D_dump_ds dflags) + (printDump (ppr_ds_rules ds_rules)) + + ; dfun_uses <- readIORef dfun_uses_var -- What dfuns are used + ; th_used <- readIORef th_var -- Whether TH is used + ; let used_names = allUses dus `unionNameSets` dfun_uses + thPackage = thPackageId (pkgState dflags) + pkgs | ExtPackage th_id <- thPackage, th_used + = insertList th_id (imp_dep_pkgs imports) + | otherwise + = imp_dep_pkgs imports + + dep_mods = moduleEnvElts (delModuleEnv (imp_dep_mods imports) mod) + -- M.hi-boot can be in the imp_dep_mods, but we must remove + -- it before recording the modules on which this one depends! + -- (We want to retain M.hi-boot in imp_dep_mods so that + -- loadHiBootInterface can see if M's direct imports depend + -- on M.hi-boot, and hence that we should do the hi-boot consistency + -- check.) + + dir_imp_mods = imp_mods imports + + ; usages <- mkUsageInfo hsc_env home_mods dir_imp_mods dep_mods used_names + + ; let + -- Modules don't compare lexicographically usually, + -- but we want them to do so here. + le_mod :: Module -> Module -> Bool + le_mod m1 m2 = moduleFS m1 <= moduleFS m2 + le_dep_mod :: (Module, IsBootInterface) -> (Module, IsBootInterface) -> Bool + le_dep_mod (m1,_) (m2,_) = m1 `le_mod` m2 + + deps = Deps { dep_mods = sortLe le_dep_mod dep_mods, + dep_pkgs = sortLe (<=) pkgs, + dep_orphs = sortLe le_mod (imp_orphs imports) } + -- sort to get into canonical order + + mod_guts = ModGuts { + mg_module = mod, + mg_boot = isHsBoot hsc_src, + mg_exports = exports, + mg_deps = deps, + mg_home_mods = home_mods, + mg_usages = usages, + mg_dir_imps = [m | (m,_,_) <- moduleEnvElts dir_imp_mods], + mg_rdr_env = rdr_env, + mg_fix_env = fix_env, + mg_deprecs = deprecs, + mg_types = type_env, + mg_insts = insts, + mg_rules = ds_rules, + mg_binds = ds_binds, + mg_foreign = ds_fords } + + ; return (warns, Just mod_guts) + }} + + where + dflags = hsc_dflags hsc_env + ghci_mode = ghcMode (hsc_dflags hsc_env) + auto_scc | opt_SccProfilingOn = TopLevel + | otherwise = NoSccs + +deSugarExpr :: HscEnv + -> Module -> GlobalRdrEnv -> TypeEnv + -> LHsExpr Id + -> IO CoreExpr +deSugarExpr hsc_env this_mod rdr_env type_env tc_expr + = do { showPass dflags "Desugar" + ; us <- mkSplitUniqSupply 'd' + + -- Do desugaring + ; (core_expr, ds_warns) <- initDs hsc_env this_mod rdr_env type_env $ + dsLExpr tc_expr + + -- Display any warnings + -- Note: if -Werror is used, we don't signal an error here. 
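+ -- The warnings are printed below and a desugared expression is still
+ -- returned, unlike deSugar above, which returns Nothing when
+ -- errorsFound succeeds.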
+ ; doIfSet (not (isEmptyBag ds_warns)) + (printBagOfWarnings dflags ds_warns) + + -- Dump output + ; dumpIfSet_dyn dflags Opt_D_dump_ds "Desugared" (pprCoreExpr core_expr) + + ; return core_expr + } + where + dflags = hsc_dflags hsc_env + + +-- addExportFlags +-- Set the no-discard flag if either +-- a) the Id is exported +-- b) it's mentioned in the RHS of an orphan rule +-- c) it's in the keep-alive set +-- +-- It means that the binding won't be discarded EVEN if the binding +-- ends up being trivial (v = w) -- the simplifier would usually just +-- substitute w for v throughout, but we don't apply the substitution to +-- the rules (maybe we should?), so this substitution would make the rule +-- bogus. + +-- You might wonder why exported Ids aren't already marked as such; +-- it's just because the type checker is rather busy already and +-- I didn't want to pass in yet another mapping. + +addExportFlags ghci_mode exports keep_alive prs rules + = [(add_export bndr, rhs) | (bndr,rhs) <- prs] + where + add_export bndr + | dont_discard bndr = setIdExported bndr + | otherwise = bndr + + orph_rhs_fvs = unionVarSets [ ruleRhsFreeVars rule + | rule <- rules, + not (isLocalRule rule) ] + -- A non-local rule keeps alive the free vars of its right-hand side. + -- (A "non-local" is one whose head function is not locally defined.) + -- Local rules are (later, after gentle simplification) + -- attached to the Id, and that keeps the rhs free vars alive. + + dont_discard bndr = is_exported name + || name `elemNameSet` keep_alive + || bndr `elemVarSet` orph_rhs_fvs + where + name = idName bndr + + -- In interactive mode, we don't want to discard any top-level + -- entities at all (eg. do not inline them away during + -- simplification), and retain them all in the TypeEnv so they are + -- available from the command line. + -- + -- isExternalName separates the user-defined top-level names from those + -- introduced by the type checker. 
+ is_exported :: Name -> Bool + is_exported | ghci_mode == Interactive = isExternalName + | otherwise = (`elemNameSet` exports) + +ppr_ds_rules [] = empty +ppr_ds_rules rules + = text "" $$ text "-------------- DESUGARED RULES -----------------" $$ + pprRules rules +\end{code} + + + +%************************************************************************ +%* * +%* Desugaring transformation rules +%* * +%************************************************************************ + +\begin{code} +dsRule :: Module -> IdSet -> LRuleDecl Id -> DsM (Maybe CoreRule) +dsRule mod in_scope (L loc (HsRule name act vars lhs tv_lhs rhs fv_rhs)) + = putSrcSpanDs loc $ + do { let bndrs = [var | RuleBndr (L _ var) <- vars] + ; lhs' <- dsLExpr lhs + ; rhs' <- dsLExpr rhs + + ; case decomposeRuleLhs bndrs lhs' of { + Nothing -> do { dsWarn msg; return Nothing } ; + Just (bndrs', fn_id, args) -> do + + -- Substitute the dict bindings eagerly, + -- and take the body apart into a (f args) form + { let local_rule = nameIsLocalOrFrom mod fn_name + -- NB we can't use isLocalId in the orphan test, + -- because isLocalId isn't true of class methods + fn_name = idName fn_id + lhs_names = fn_name : nameSetToList (exprsFreeNames args) + -- No need to delete bndrs, because + -- exprsFreeNames finds only External names + orph = case filter (nameIsLocalOrFrom mod) lhs_names of + (n:ns) -> Just (nameOccName n) + [] -> Nothing + + rule = Rule { ru_name = name, ru_fn = fn_name, ru_act = act, + ru_bndrs = bndrs', ru_args = args, ru_rhs = rhs', + ru_rough = roughTopNames args, + ru_local = local_rule, ru_orph = orph } + ; return (Just rule) + } } } + where + msg = hang (ptext SLIT("RULE left-hand side too complicated to desugar; ignored")) + 2 (ppr lhs) +\end{code} diff --git a/compiler/deSugar/DsArrows.lhs b/compiler/deSugar/DsArrows.lhs new file mode 100644 index 0000000000..111e0bccd0 --- /dev/null +++ b/compiler/deSugar/DsArrows.lhs @@ -0,0 +1,1055 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[DsArrows]{Desugaring arrow commands} + +\begin{code} +module DsArrows ( dsProcExpr ) where + +#include "HsVersions.h" + +import Match ( matchSimply ) +import DsUtils ( mkErrorAppDs, + mkCoreTupTy, mkCoreTup, selectSimpleMatchVarL, + mkTupleCase, mkBigCoreTup, mkTupleType, + mkTupleExpr, mkTupleSelector, + dsSyntaxTable, lookupEvidence ) +import DsMonad + +import HsSyn +import TcHsSyn ( hsPatType ) + +-- NB: The desugarer, which straddles the source and Core worlds, sometimes +-- needs to see source types (newtypes etc), and sometimes not +-- So WATCH OUT; check each use of split*Ty functions. +-- Sigh. This is a pain. + +import {-# SOURCE #-} DsExpr ( dsExpr, dsLExpr, dsLocalBinds ) + +import TcType ( Type, tcSplitAppTy, mkFunTy ) +import Type ( mkTyConApp, funArgTy ) +import CoreSyn +import CoreFVs ( exprFreeVars ) +import CoreUtils ( mkIfThenElse, bindNonRec, exprType ) + +import Id ( Id, idType ) +import Name ( Name ) +import PrelInfo ( pAT_ERROR_ID ) +import DataCon ( dataConWrapId ) +import TysWiredIn ( tupleCon ) +import BasicTypes ( Boxity(..) 
) +import PrelNames ( eitherTyConName, leftDataConName, rightDataConName, + arrAName, composeAName, firstAName, + appAName, choiceAName, loopAName ) +import Util ( mapAccumL ) +import Outputable + +import HsUtils ( collectPatBinders, collectPatsBinders ) +import VarSet ( IdSet, mkVarSet, varSetElems, + intersectVarSet, minusVarSet, extendVarSetList, + unionVarSet, unionVarSets, elemVarSet ) +import SrcLoc ( Located(..), unLoc, noLoc ) +\end{code} + +\begin{code} +data DsCmdEnv = DsCmdEnv { + meth_binds :: [CoreBind], + arr_id, compose_id, first_id, app_id, choice_id, loop_id :: CoreExpr + } + +mkCmdEnv :: SyntaxTable Id -> DsM DsCmdEnv +mkCmdEnv ids + = dsSyntaxTable ids `thenDs` \ (meth_binds, ds_meths) -> + return $ DsCmdEnv { + meth_binds = meth_binds, + arr_id = Var (lookupEvidence ds_meths arrAName), + compose_id = Var (lookupEvidence ds_meths composeAName), + first_id = Var (lookupEvidence ds_meths firstAName), + app_id = Var (lookupEvidence ds_meths appAName), + choice_id = Var (lookupEvidence ds_meths choiceAName), + loop_id = Var (lookupEvidence ds_meths loopAName) + } + +bindCmdEnv :: DsCmdEnv -> CoreExpr -> CoreExpr +bindCmdEnv ids body = foldr Let body (meth_binds ids) + +-- arr :: forall b c. (b -> c) -> a b c +do_arr :: DsCmdEnv -> Type -> Type -> CoreExpr -> CoreExpr +do_arr ids b_ty c_ty f = mkApps (arr_id ids) [Type b_ty, Type c_ty, f] + +-- (>>>) :: forall b c d. a b c -> a c d -> a b d +do_compose :: DsCmdEnv -> Type -> Type -> Type -> + CoreExpr -> CoreExpr -> CoreExpr +do_compose ids b_ty c_ty d_ty f g + = mkApps (compose_id ids) [Type b_ty, Type c_ty, Type d_ty, f, g] + +-- first :: forall b c d. a b c -> a (b,d) (c,d) +do_first :: DsCmdEnv -> Type -> Type -> Type -> CoreExpr -> CoreExpr +do_first ids b_ty c_ty d_ty f + = mkApps (first_id ids) [Type b_ty, Type c_ty, Type d_ty, f] + +-- app :: forall b c. a (a b c, b) c +do_app :: DsCmdEnv -> Type -> Type -> CoreExpr +do_app ids b_ty c_ty = mkApps (app_id ids) [Type b_ty, Type c_ty] + +-- (|||) :: forall b d c. a b d -> a c d -> a (Either b c) d +-- note the swapping of d and c +do_choice :: DsCmdEnv -> Type -> Type -> Type -> + CoreExpr -> CoreExpr -> CoreExpr +do_choice ids b_ty c_ty d_ty f g + = mkApps (choice_id ids) [Type b_ty, Type d_ty, Type c_ty, f, g] + +-- loop :: forall b d c. a (b,d) (c,d) -> a b c +-- note the swapping of d and c +do_loop :: DsCmdEnv -> Type -> Type -> Type -> CoreExpr -> CoreExpr +do_loop ids b_ty c_ty d_ty f + = mkApps (loop_id ids) [Type b_ty, Type d_ty, Type c_ty, f] + +-- map_arrow (f :: b -> c) (g :: a c d) = arr f >>> g :: a b d +do_map_arrow :: DsCmdEnv -> Type -> Type -> Type -> + CoreExpr -> CoreExpr -> CoreExpr +do_map_arrow ids b_ty c_ty d_ty f c + = do_compose ids b_ty c_ty d_ty (do_arr ids b_ty c_ty f) c + +mkFailExpr :: HsMatchContext Id -> Type -> DsM CoreExpr +mkFailExpr ctxt ty + = mkErrorAppDs pAT_ERROR_ID ty (matchContextErrString ctxt) + +-- construct CoreExpr for \ (a :: a_ty, b :: b_ty) -> b +mkSndExpr :: Type -> Type -> DsM CoreExpr +mkSndExpr a_ty b_ty + = newSysLocalDs a_ty `thenDs` \ a_var -> + newSysLocalDs b_ty `thenDs` \ b_var -> + newSysLocalDs (mkCorePairTy a_ty b_ty) `thenDs` \ pair_var -> + returnDs (Lam pair_var + (coreCasePair pair_var a_var b_var (Var b_var))) +\end{code} + +Build case analysis of a tuple. This cannot be done in the DsM monad, +because the list of variables is typically not yet defined. + +\begin{code} +-- coreCaseTuple [u1..] 
v [x1..xn] body +-- = case v of v { (x1, .., xn) -> body } +-- But the matching may be nested if the tuple is very big + +coreCaseTuple :: UniqSupply -> Id -> [Id] -> CoreExpr -> CoreExpr +coreCaseTuple uniqs scrut_var vars body + = mkTupleCase uniqs vars body scrut_var (Var scrut_var) + +coreCasePair :: Id -> Id -> Id -> CoreExpr -> CoreExpr +coreCasePair scrut_var var1 var2 body + = Case (Var scrut_var) scrut_var (exprType body) + [(DataAlt (tupleCon Boxed 2), [var1, var2], body)] +\end{code} + +\begin{code} +mkCorePairTy :: Type -> Type -> Type +mkCorePairTy t1 t2 = mkCoreTupTy [t1, t2] + +mkCorePairExpr :: CoreExpr -> CoreExpr -> CoreExpr +mkCorePairExpr e1 e2 = mkCoreTup [e1, e2] +\end{code} + +The input is divided into a local environment, which is a flat tuple +(unless it's too big), and a stack, each element of which is paired +with the stack in turn. In general, the input has the form + + (...((x1,...,xn),s1),...sk) + +where xi are the environment values, and si the ones on the stack, +with s1 being the "top", the first one to be matched with a lambda. + +\begin{code} +envStackType :: [Id] -> [Type] -> Type +envStackType ids stack_tys = foldl mkCorePairTy (mkTupleType ids) stack_tys + +---------------------------------------------- +-- buildEnvStack +-- +-- (...((x1,...,xn),s1),...sk) + +buildEnvStack :: [Id] -> [Id] -> CoreExpr +buildEnvStack env_ids stack_ids + = foldl mkCorePairExpr (mkTupleExpr env_ids) (map Var stack_ids) + +---------------------------------------------- +-- matchEnvStack +-- +-- \ (...((x1,...,xn),s1),...sk) -> e +-- => +-- \ zk -> +-- case zk of (zk-1,sk) -> +-- ... +-- case z1 of (z0,s1) -> +-- case z0 of (x1,...,xn) -> +-- e + +matchEnvStack :: [Id] -- x1..xn + -> [Id] -- s1..sk + -> CoreExpr -- e + -> DsM CoreExpr +matchEnvStack env_ids stack_ids body + = newUniqueSupply `thenDs` \ uniqs -> + newSysLocalDs (mkTupleType env_ids) `thenDs` \ tup_var -> + matchVarStack tup_var stack_ids + (coreCaseTuple uniqs tup_var env_ids body) + + +---------------------------------------------- +-- matchVarStack +-- +-- \ (...(z0,s1),...sk) -> e +-- => +-- \ zk -> +-- case zk of (zk-1,sk) -> +-- ... 
+-- case z1 of (z0,s1) -> +-- e + +matchVarStack :: Id -- z0 + -> [Id] -- s1..sk + -> CoreExpr -- e + -> DsM CoreExpr +matchVarStack env_id [] body + = returnDs (Lam env_id body) +matchVarStack env_id (stack_id:stack_ids) body + = newSysLocalDs (mkCorePairTy (idType env_id) (idType stack_id)) + `thenDs` \ pair_id -> + matchVarStack pair_id stack_ids + (coreCasePair pair_id env_id stack_id body) +\end{code} + +\begin{code} +mkHsTupleExpr :: [HsExpr Id] -> HsExpr Id +mkHsTupleExpr [e] = e +mkHsTupleExpr es = ExplicitTuple (map noLoc es) Boxed + +mkHsPairExpr :: HsExpr Id -> HsExpr Id -> HsExpr Id +mkHsPairExpr e1 e2 = mkHsTupleExpr [e1, e2] + +mkHsEnvStackExpr :: [Id] -> [Id] -> HsExpr Id +mkHsEnvStackExpr env_ids stack_ids + = foldl mkHsPairExpr (mkHsTupleExpr (map HsVar env_ids)) (map HsVar stack_ids) +\end{code} + +Translation of arrow abstraction + +\begin{code} + +-- A | xs |- c :: [] t' ---> c' +-- -------------------------- +-- A |- proc p -> c :: a t t' ---> arr (\ p -> (xs)) >>> c' +-- +-- where (xs) is the tuple of variables bound by p + +dsProcExpr + :: LPat Id + -> LHsCmdTop Id + -> DsM CoreExpr +dsProcExpr pat (L _ (HsCmdTop cmd [] cmd_ty ids)) + = mkCmdEnv ids `thenDs` \ meth_ids -> + let + locals = mkVarSet (collectPatBinders pat) + in + dsfixCmd meth_ids locals [] cmd_ty cmd + `thenDs` \ (core_cmd, free_vars, env_ids) -> + let + env_ty = mkTupleType env_ids + in + mkFailExpr ProcExpr env_ty `thenDs` \ fail_expr -> + selectSimpleMatchVarL pat `thenDs` \ var -> + matchSimply (Var var) ProcExpr pat (mkTupleExpr env_ids) fail_expr + `thenDs` \ match_code -> + let + pat_ty = hsPatType pat + proc_code = do_map_arrow meth_ids pat_ty env_ty cmd_ty + (Lam var match_code) + core_cmd + in + returnDs (bindCmdEnv meth_ids proc_code) +\end{code} + +Translation of command judgements of the form + + A | xs |- c :: [ts] t + +\begin{code} +dsLCmd ids local_vars env_ids stack res_ty cmd + = dsCmd ids local_vars env_ids stack res_ty (unLoc cmd) + +dsCmd :: DsCmdEnv -- arrow combinators + -> IdSet -- set of local vars available to this command + -> [Id] -- list of vars in the input to this command + -- This is typically fed back, + -- so don't pull on it too early + -> [Type] -- type of the stack + -> Type -- return type of the command + -> HsCmd Id -- command to desugar + -> DsM (CoreExpr, -- desugared expression + IdSet) -- set of local vars that occur free + +-- A |- f :: a (t*ts) t' +-- A, xs |- arg :: t +-- ----------------------------- +-- A | xs |- f -< arg :: [ts] t' +-- +-- ---> arr (\ ((xs)*ts) -> (arg*ts)) >>> f + +dsCmd ids local_vars env_ids stack res_ty + (HsArrApp arrow arg arrow_ty HsFirstOrderApp _) + = let + (a_arg_ty, _res_ty') = tcSplitAppTy arrow_ty + (_a_ty, arg_ty) = tcSplitAppTy a_arg_ty + env_ty = mkTupleType env_ids + in + dsLExpr arrow `thenDs` \ core_arrow -> + dsLExpr arg `thenDs` \ core_arg -> + mappM newSysLocalDs stack `thenDs` \ stack_ids -> + matchEnvStack env_ids stack_ids + (foldl mkCorePairExpr core_arg (map Var stack_ids)) + `thenDs` \ core_make_arg -> + returnDs (do_map_arrow ids + (envStackType env_ids stack) + arg_ty + res_ty + core_make_arg + core_arrow, + exprFreeVars core_arg `intersectVarSet` local_vars) + +-- A, xs |- f :: a (t*ts) t' +-- A, xs |- arg :: t +-- ------------------------------ +-- A | xs |- f -<< arg :: [ts] t' +-- +-- ---> arr (\ ((xs)*ts) -> (f,(arg*ts))) >>> app + +dsCmd ids local_vars env_ids stack res_ty + (HsArrApp arrow arg arrow_ty HsHigherOrderApp _) + = let + (a_arg_ty, _res_ty') = tcSplitAppTy arrow_ty + (_a_ty, arg_ty) = 
tcSplitAppTy a_arg_ty + env_ty = mkTupleType env_ids + in + dsLExpr arrow `thenDs` \ core_arrow -> + dsLExpr arg `thenDs` \ core_arg -> + mappM newSysLocalDs stack `thenDs` \ stack_ids -> + matchEnvStack env_ids stack_ids + (mkCorePairExpr core_arrow + (foldl mkCorePairExpr core_arg (map Var stack_ids))) + `thenDs` \ core_make_pair -> + returnDs (do_map_arrow ids + (envStackType env_ids stack) + (mkCorePairTy arrow_ty arg_ty) + res_ty + core_make_pair + (do_app ids arg_ty res_ty), + (exprFreeVars core_arrow `unionVarSet` exprFreeVars core_arg) + `intersectVarSet` local_vars) + +-- A | ys |- c :: [t:ts] t' +-- A, xs |- e :: t +-- ------------------------ +-- A | xs |- c e :: [ts] t' +-- +-- ---> arr (\ ((xs)*ts) -> let z = e in (((ys),z)*ts)) >>> c + +dsCmd ids local_vars env_ids stack res_ty (HsApp cmd arg) + = dsLExpr arg `thenDs` \ core_arg -> + let + arg_ty = exprType core_arg + stack' = arg_ty:stack + in + dsfixCmd ids local_vars stack' res_ty cmd + `thenDs` \ (core_cmd, free_vars, env_ids') -> + mappM newSysLocalDs stack `thenDs` \ stack_ids -> + newSysLocalDs arg_ty `thenDs` \ arg_id -> + -- push the argument expression onto the stack + let + core_body = bindNonRec arg_id core_arg + (buildEnvStack env_ids' (arg_id:stack_ids)) + in + -- match the environment and stack against the input + matchEnvStack env_ids stack_ids core_body + `thenDs` \ core_map -> + returnDs (do_map_arrow ids + (envStackType env_ids stack) + (envStackType env_ids' stack') + res_ty + core_map + core_cmd, + (exprFreeVars core_arg `intersectVarSet` local_vars) + `unionVarSet` free_vars) + +-- A | ys |- c :: [ts] t' +-- ----------------------------------------------- +-- A | xs |- \ p1 ... pk -> c :: [t1:...:tk:ts] t' +-- +-- ---> arr (\ ((((xs), p1), ... pk)*ts) -> ((ys)*ts)) >>> c + +dsCmd ids local_vars env_ids stack res_ty + (HsLam (MatchGroup [L _ (Match pats _ (GRHSs [L _ (GRHS [] body)] _ ))] _)) + = let + pat_vars = mkVarSet (collectPatsBinders pats) + local_vars' = local_vars `unionVarSet` pat_vars + stack' = drop (length pats) stack + in + dsfixCmd ids local_vars' stack' res_ty body + `thenDs` \ (core_body, free_vars, env_ids') -> + mappM newSysLocalDs stack `thenDs` \ stack_ids -> + + -- the expression is built from the inside out, so the actions + -- are presented in reverse order + + let + (actual_ids, stack_ids') = splitAt (length pats) stack_ids + -- build a new environment, plus what's left of the stack + core_expr = buildEnvStack env_ids' stack_ids' + in_ty = envStackType env_ids stack + in_ty' = envStackType env_ids' stack' + in + mkFailExpr LambdaExpr in_ty' `thenDs` \ fail_expr -> + -- match the patterns against the top of the old stack + matchSimplys (map Var actual_ids) LambdaExpr pats core_expr fail_expr + `thenDs` \ match_code -> + -- match the old environment and stack against the input + matchEnvStack env_ids stack_ids match_code + `thenDs` \ select_code -> + returnDs (do_map_arrow ids in_ty in_ty' res_ty select_code core_body, + free_vars `minusVarSet` pat_vars) + +dsCmd ids local_vars env_ids stack res_ty (HsPar cmd) + = dsLCmd ids local_vars env_ids stack res_ty cmd + +-- A, xs |- e :: Bool +-- A | xs1 |- c1 :: [ts] t +-- A | xs2 |- c2 :: [ts] t +-- ---------------------------------------- +-- A | xs |- if e then c1 else c2 :: [ts] t +-- +-- ---> arr (\ ((xs)*ts) -> +-- if e then Left ((xs1)*ts) else Right ((xs2)*ts)) >>> +-- c1 ||| c2 + +dsCmd ids local_vars env_ids stack res_ty (HsIf cond then_cmd else_cmd) + = dsLExpr cond `thenDs` \ core_cond -> + dsfixCmd ids local_vars stack res_ty 
then_cmd + `thenDs` \ (core_then, fvs_then, then_ids) -> + dsfixCmd ids local_vars stack res_ty else_cmd + `thenDs` \ (core_else, fvs_else, else_ids) -> + mappM newSysLocalDs stack `thenDs` \ stack_ids -> + dsLookupTyCon eitherTyConName `thenDs` \ either_con -> + dsLookupDataCon leftDataConName `thenDs` \ left_con -> + dsLookupDataCon rightDataConName `thenDs` \ right_con -> + let + left_expr ty1 ty2 e = mkConApp left_con [Type ty1, Type ty2, e] + right_expr ty1 ty2 e = mkConApp right_con [Type ty1, Type ty2, e] + + in_ty = envStackType env_ids stack + then_ty = envStackType then_ids stack + else_ty = envStackType else_ids stack + sum_ty = mkTyConApp either_con [then_ty, else_ty] + fvs_cond = exprFreeVars core_cond `intersectVarSet` local_vars + in + matchEnvStack env_ids stack_ids + (mkIfThenElse core_cond + (left_expr then_ty else_ty (buildEnvStack then_ids stack_ids)) + (right_expr then_ty else_ty (buildEnvStack else_ids stack_ids))) + `thenDs` \ core_if -> + returnDs(do_map_arrow ids in_ty sum_ty res_ty + core_if + (do_choice ids then_ty else_ty res_ty core_then core_else), + fvs_cond `unionVarSet` fvs_then `unionVarSet` fvs_else) +\end{code} + +Case commands are treated in much the same way as if commands +(see above) except that there are more alternatives. For example + + case e of { p1 -> c1; p2 -> c2; p3 -> c3 } + +is translated to + + arr (\ ((xs)*ts) -> case e of + p1 -> (Left (Left (xs1)*ts)) + p2 -> Left ((Right (xs2)*ts)) + p3 -> Right ((xs3)*ts)) >>> + (c1 ||| c2) ||| c3 + +The idea is to extract the commands from the case, build a balanced tree +of choices, and replace the commands with expressions that build tagged +tuples, obtaining a case expression that can be desugared normally. +To build all this, we use quadruples decribing segments of the list of +case bodies, containing the following fields: +1. an IdSet containing the environment variables free in the case bodies +2. a list of expressions of the form (Left|Right)* ((xs)*ts), to be put + into the case replacing the commands +3. a sum type that is the common type of these expressions, and also the + input type of the arrow +4. a CoreExpr for an arrow built by combining the translated command + bodies with |||. + +\begin{code} +dsCmd ids local_vars env_ids stack res_ty (HsCase exp (MatchGroup matches match_ty)) + = dsLExpr exp `thenDs` \ core_exp -> + mappM newSysLocalDs stack `thenDs` \ stack_ids -> + + -- Extract and desugar the leaf commands in the case, building tuple + -- expressions that will (after tagging) replace these leaves + + let + leaves = concatMap leavesMatch matches + make_branch (leaf, bound_vars) + = dsfixCmd ids (local_vars `unionVarSet` bound_vars) stack res_ty leaf + `thenDs` \ (core_leaf, fvs, leaf_ids) -> + returnDs (fvs `minusVarSet` bound_vars, + [noLoc $ mkHsEnvStackExpr leaf_ids stack_ids], + envStackType leaf_ids stack, + core_leaf) + in + mappM make_branch leaves `thenDs` \ branches -> + dsLookupTyCon eitherTyConName `thenDs` \ either_con -> + dsLookupDataCon leftDataConName `thenDs` \ left_con -> + dsLookupDataCon rightDataConName `thenDs` \ right_con -> + let + left_id = nlHsVar (dataConWrapId left_con) + right_id = nlHsVar (dataConWrapId right_con) + left_expr ty1 ty2 e = noLoc $ HsApp (noLoc $ TyApp left_id [ty1, ty2]) e + right_expr ty1 ty2 e = noLoc $ HsApp (noLoc $ TyApp right_id [ty1, ty2]) e + + -- Prefix each tuple with a distinct series of Left's and Right's, + -- in a balanced way, keeping track of the types. 
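+ -- For example, with three alternatives the tags come out as
+ -- Left (Left _), Left (Right _) and Right _, giving the
+ -- (c1 ||| c2) ||| c3 shape described above.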
+ + merge_branches (fvs1, builds1, in_ty1, core_exp1) + (fvs2, builds2, in_ty2, core_exp2) + = (fvs1 `unionVarSet` fvs2, + map (left_expr in_ty1 in_ty2) builds1 ++ + map (right_expr in_ty1 in_ty2) builds2, + mkTyConApp either_con [in_ty1, in_ty2], + do_choice ids in_ty1 in_ty2 res_ty core_exp1 core_exp2) + (fvs_alts, leaves', sum_ty, core_choices) + = foldb merge_branches branches + + -- Replace the commands in the case with these tagged tuples, + -- yielding a HsExpr Id we can feed to dsExpr. + + (_, matches') = mapAccumL (replaceLeavesMatch res_ty) leaves' matches + in_ty = envStackType env_ids stack + fvs_exp = exprFreeVars core_exp `intersectVarSet` local_vars + + pat_ty = funArgTy match_ty + match_ty' = mkFunTy pat_ty sum_ty + -- Note that we replace the HsCase result type by sum_ty, + -- which is the type of matches' + in + dsExpr (HsCase exp (MatchGroup matches' match_ty')) `thenDs` \ core_body -> + matchEnvStack env_ids stack_ids core_body + `thenDs` \ core_matches -> + returnDs(do_map_arrow ids in_ty sum_ty res_ty core_matches core_choices, + fvs_exp `unionVarSet` fvs_alts) + +-- A | ys |- c :: [ts] t +-- ---------------------------------- +-- A | xs |- let binds in c :: [ts] t +-- +-- ---> arr (\ ((xs)*ts) -> let binds in ((ys)*ts)) >>> c + +dsCmd ids local_vars env_ids stack res_ty (HsLet binds body) + = let + defined_vars = mkVarSet (map unLoc (collectLocalBinders binds)) + local_vars' = local_vars `unionVarSet` defined_vars + in + dsfixCmd ids local_vars' stack res_ty body + `thenDs` \ (core_body, free_vars, env_ids') -> + mappM newSysLocalDs stack `thenDs` \ stack_ids -> + -- build a new environment, plus the stack, using the let bindings + dsLocalBinds binds (buildEnvStack env_ids' stack_ids) + `thenDs` \ core_binds -> + -- match the old environment and stack against the input + matchEnvStack env_ids stack_ids core_binds + `thenDs` \ core_map -> + returnDs (do_map_arrow ids + (envStackType env_ids stack) + (envStackType env_ids' stack) + res_ty + core_map + core_body, + exprFreeVars core_binds `intersectVarSet` local_vars) + +dsCmd ids local_vars env_ids [] res_ty (HsDo _ctxt stmts body _) + = dsCmdDo ids local_vars env_ids res_ty stmts body + +-- A |- e :: forall e. a1 (e*ts1) t1 -> ... an (e*tsn) tn -> a (e*ts) t +-- A | xs |- ci :: [tsi] ti +-- ----------------------------------- +-- A | xs |- (|e c1 ... cn|) :: [ts] t ---> e [t_xs] c1 ... 
cn + +dsCmd _ids local_vars env_ids _stack _res_ty (HsArrForm op _ args) + = let + env_ty = mkTupleType env_ids + in + dsLExpr op `thenDs` \ core_op -> + mapAndUnzipDs (dsTrimCmdArg local_vars env_ids) args + `thenDs` \ (core_args, fv_sets) -> + returnDs (mkApps (App core_op (Type env_ty)) core_args, + unionVarSets fv_sets) + +-- A | ys |- c :: [ts] t (ys <= xs) +-- --------------------- +-- A | xs |- c :: [ts] t ---> arr_ts (\ (xs) -> (ys)) >>> c + +dsTrimCmdArg + :: IdSet -- set of local vars available to this command + -> [Id] -- list of vars in the input to this command + -> LHsCmdTop Id -- command argument to desugar + -> DsM (CoreExpr, -- desugared expression + IdSet) -- set of local vars that occur free +dsTrimCmdArg local_vars env_ids (L _ (HsCmdTop cmd stack cmd_ty ids)) + = mkCmdEnv ids `thenDs` \ meth_ids -> + dsfixCmd meth_ids local_vars stack cmd_ty cmd + `thenDs` \ (core_cmd, free_vars, env_ids') -> + mappM newSysLocalDs stack `thenDs` \ stack_ids -> + matchEnvStack env_ids stack_ids (buildEnvStack env_ids' stack_ids) + `thenDs` \ trim_code -> + let + in_ty = envStackType env_ids stack + in_ty' = envStackType env_ids' stack + arg_code = if env_ids' == env_ids then core_cmd else + do_map_arrow meth_ids in_ty in_ty' cmd_ty trim_code core_cmd + in + returnDs (bindCmdEnv meth_ids arg_code, free_vars) + +-- Given A | xs |- c :: [ts] t, builds c with xs fed back. +-- Typically needs to be prefixed with arr (\p -> ((xs)*ts)) + +dsfixCmd + :: DsCmdEnv -- arrow combinators + -> IdSet -- set of local vars available to this command + -> [Type] -- type of the stack + -> Type -- return type of the command + -> LHsCmd Id -- command to desugar + -> DsM (CoreExpr, -- desugared expression + IdSet, -- set of local vars that occur free + [Id]) -- set as a list, fed back +dsfixCmd ids local_vars stack cmd_ty cmd + = fixDs (\ ~(_,_,env_ids') -> + dsLCmd ids local_vars env_ids' stack cmd_ty cmd + `thenDs` \ (core_cmd, free_vars) -> + returnDs (core_cmd, free_vars, varSetElems free_vars)) + +\end{code} + +Translation of command judgements of the form + + A | xs |- do { ss } :: [] t + +\begin{code} + +dsCmdDo :: DsCmdEnv -- arrow combinators + -> IdSet -- set of local vars available to this statement + -> [Id] -- list of vars in the input to this statement + -- This is typically fed back, + -- so don't pull on it too early + -> Type -- return type of the statement + -> [LStmt Id] -- statements to desugar + -> LHsExpr Id -- body + -> DsM (CoreExpr, -- desugared expression + IdSet) -- set of local vars that occur free + +-- A | xs |- c :: [] t +-- -------------------------- +-- A | xs |- do { c } :: [] t + +dsCmdDo ids local_vars env_ids res_ty [] body + = dsLCmd ids local_vars env_ids [] res_ty body + +dsCmdDo ids local_vars env_ids res_ty (stmt:stmts) body + = let + bound_vars = mkVarSet (map unLoc (collectLStmtBinders stmt)) + local_vars' = local_vars `unionVarSet` bound_vars + in + fixDs (\ ~(_,_,env_ids') -> + dsCmdDo ids local_vars' env_ids' res_ty stmts body + `thenDs` \ (core_stmts, fv_stmts) -> + returnDs (core_stmts, fv_stmts, varSetElems fv_stmts)) + `thenDs` \ (core_stmts, fv_stmts, env_ids') -> + dsCmdLStmt ids local_vars env_ids env_ids' stmt + `thenDs` \ (core_stmt, fv_stmt) -> + returnDs (do_compose ids + (mkTupleType env_ids) + (mkTupleType env_ids') + res_ty + core_stmt + core_stmts, + fv_stmt) + +\end{code} +A statement maps one local environment to another, and is represented +as an arrow from one tuple type to another. 
A statement sequence is +translated to a composition of such arrows. +\begin{code} +dsCmdLStmt ids local_vars env_ids out_ids cmd + = dsCmdStmt ids local_vars env_ids out_ids (unLoc cmd) + +dsCmdStmt + :: DsCmdEnv -- arrow combinators + -> IdSet -- set of local vars available to this statement + -> [Id] -- list of vars in the input to this statement + -- This is typically fed back, + -- so don't pull on it too early + -> [Id] -- list of vars in the output of this statement + -> Stmt Id -- statement to desugar + -> DsM (CoreExpr, -- desugared expression + IdSet) -- set of local vars that occur free + +-- A | xs1 |- c :: [] t +-- A | xs' |- do { ss } :: [] t' +-- ------------------------------ +-- A | xs |- do { c; ss } :: [] t' +-- +-- ---> arr (\ (xs) -> ((xs1),(xs'))) >>> first c >>> +-- arr snd >>> ss + +dsCmdStmt ids local_vars env_ids out_ids (ExprStmt cmd _ c_ty) + = dsfixCmd ids local_vars [] c_ty cmd + `thenDs` \ (core_cmd, fv_cmd, env_ids1) -> + matchEnvStack env_ids [] + (mkCorePairExpr (mkTupleExpr env_ids1) (mkTupleExpr out_ids)) + `thenDs` \ core_mux -> + let + in_ty = mkTupleType env_ids + in_ty1 = mkTupleType env_ids1 + out_ty = mkTupleType out_ids + before_c_ty = mkCorePairTy in_ty1 out_ty + after_c_ty = mkCorePairTy c_ty out_ty + in + mkSndExpr c_ty out_ty `thenDs` \ snd_fn -> + returnDs (do_map_arrow ids in_ty before_c_ty out_ty core_mux $ + do_compose ids before_c_ty after_c_ty out_ty + (do_first ids in_ty1 c_ty out_ty core_cmd) $ + do_arr ids after_c_ty out_ty snd_fn, + extendVarSetList fv_cmd out_ids) + where + +-- A | xs1 |- c :: [] t +-- A | xs' |- do { ss } :: [] t' xs2 = xs' - defs(p) +-- ----------------------------------- +-- A | xs |- do { p <- c; ss } :: [] t' +-- +-- ---> arr (\ (xs) -> ((xs1),(xs2))) >>> first c >>> +-- arr (\ (p, (xs2)) -> (xs')) >>> ss +-- +-- It would be simpler and more consistent to do this using second, +-- but that's likely to be defined in terms of first. 
+ +dsCmdStmt ids local_vars env_ids out_ids (BindStmt pat cmd _ _) + = dsfixCmd ids local_vars [] (hsPatType pat) cmd + `thenDs` \ (core_cmd, fv_cmd, env_ids1) -> + let + pat_ty = hsPatType pat + pat_vars = mkVarSet (collectPatBinders pat) + env_ids2 = varSetElems (mkVarSet out_ids `minusVarSet` pat_vars) + env_ty2 = mkTupleType env_ids2 + in + + -- multiplexing function + -- \ (xs) -> ((xs1),(xs2)) + + matchEnvStack env_ids [] + (mkCorePairExpr (mkTupleExpr env_ids1) (mkTupleExpr env_ids2)) + `thenDs` \ core_mux -> + + -- projection function + -- \ (p, (xs2)) -> (zs) + + newSysLocalDs env_ty2 `thenDs` \ env_id -> + newUniqueSupply `thenDs` \ uniqs -> + let + after_c_ty = mkCorePairTy pat_ty env_ty2 + out_ty = mkTupleType out_ids + body_expr = coreCaseTuple uniqs env_id env_ids2 (mkTupleExpr out_ids) + in + mkFailExpr (StmtCtxt DoExpr) out_ty `thenDs` \ fail_expr -> + selectSimpleMatchVarL pat `thenDs` \ pat_id -> + matchSimply (Var pat_id) (StmtCtxt DoExpr) pat body_expr fail_expr + `thenDs` \ match_code -> + newSysLocalDs after_c_ty `thenDs` \ pair_id -> + let + proj_expr = Lam pair_id (coreCasePair pair_id pat_id env_id match_code) + in + + -- put it all together + let + in_ty = mkTupleType env_ids + in_ty1 = mkTupleType env_ids1 + in_ty2 = mkTupleType env_ids2 + before_c_ty = mkCorePairTy in_ty1 in_ty2 + in + returnDs (do_map_arrow ids in_ty before_c_ty out_ty core_mux $ + do_compose ids before_c_ty after_c_ty out_ty + (do_first ids in_ty1 pat_ty in_ty2 core_cmd) $ + do_arr ids after_c_ty out_ty proj_expr, + fv_cmd `unionVarSet` (mkVarSet out_ids `minusVarSet` pat_vars)) + +-- A | xs' |- do { ss } :: [] t +-- -------------------------------------- +-- A | xs |- do { let binds; ss } :: [] t +-- +-- ---> arr (\ (xs) -> let binds in (xs')) >>> ss + +dsCmdStmt ids local_vars env_ids out_ids (LetStmt binds) + -- build a new environment using the let bindings + = dsLocalBinds binds (mkTupleExpr out_ids) `thenDs` \ core_binds -> + -- match the old environment against the input + matchEnvStack env_ids [] core_binds `thenDs` \ core_map -> + returnDs (do_arr ids + (mkTupleType env_ids) + (mkTupleType out_ids) + core_map, + exprFreeVars core_binds `intersectVarSet` local_vars) + +-- A | ys |- do { ss; returnA -< ((xs1), (ys2)) } :: [] ... +-- A | xs' |- do { ss' } :: [] t +-- ------------------------------------ +-- A | xs |- do { rec ss; ss' } :: [] t +-- +-- xs1 = xs' /\ defs(ss) +-- xs2 = xs' - defs(ss) +-- ys1 = ys - defs(ss) +-- ys2 = ys /\ defs(ss) +-- +-- ---> arr (\(xs) -> ((ys1),(xs2))) >>> +-- first (loop (arr (\((ys1),~(ys2)) -> (ys)) >>> ss)) >>> +-- arr (\((xs1),(xs2)) -> (xs')) >>> ss' + +dsCmdStmt ids local_vars env_ids out_ids (RecStmt stmts later_ids rec_ids rhss binds) + = let -- ToDo: ****** binds not desugared; ROSS PLEASE FIX ******** + env2_id_set = mkVarSet out_ids `minusVarSet` mkVarSet later_ids + env2_ids = varSetElems env2_id_set + env2_ty = mkTupleType env2_ids + in + + -- post_loop_fn = \((later_ids),(env2_ids)) -> (out_ids) + + newUniqueSupply `thenDs` \ uniqs -> + newSysLocalDs env2_ty `thenDs` \ env2_id -> + let + later_ty = mkTupleType later_ids + post_pair_ty = mkCorePairTy later_ty env2_ty + post_loop_body = coreCaseTuple uniqs env2_id env2_ids (mkTupleExpr out_ids) + in + matchEnvStack later_ids [env2_id] post_loop_body + `thenDs` \ post_loop_fn -> + + --- loop (...) 
+ + dsRecCmd ids local_vars stmts later_ids rec_ids rhss + `thenDs` \ (core_loop, env1_id_set, env1_ids) -> + + -- pre_loop_fn = \(env_ids) -> ((env1_ids),(env2_ids)) + + let + env1_ty = mkTupleType env1_ids + pre_pair_ty = mkCorePairTy env1_ty env2_ty + pre_loop_body = mkCorePairExpr (mkTupleExpr env1_ids) + (mkTupleExpr env2_ids) + + in + matchEnvStack env_ids [] pre_loop_body + `thenDs` \ pre_loop_fn -> + + -- arr pre_loop_fn >>> first (loop (...)) >>> arr post_loop_fn + + let + env_ty = mkTupleType env_ids + out_ty = mkTupleType out_ids + core_body = do_map_arrow ids env_ty pre_pair_ty out_ty + pre_loop_fn + (do_compose ids pre_pair_ty post_pair_ty out_ty + (do_first ids env1_ty later_ty env2_ty + core_loop) + (do_arr ids post_pair_ty out_ty + post_loop_fn)) + in + returnDs (core_body, env1_id_set `unionVarSet` env2_id_set) + +-- loop (arr (\ ((env1_ids), ~(rec_ids)) -> (env_ids)) >>> +-- ss >>> +-- arr (\ (out_ids) -> ((later_ids),(rhss))) >>> + +dsRecCmd ids local_vars stmts later_ids rec_ids rhss + = let + rec_id_set = mkVarSet rec_ids + out_ids = varSetElems (mkVarSet later_ids `unionVarSet` rec_id_set) + out_ty = mkTupleType out_ids + local_vars' = local_vars `unionVarSet` rec_id_set + in + + -- mk_pair_fn = \ (out_ids) -> ((later_ids),(rhss)) + + mappM dsExpr rhss `thenDs` \ core_rhss -> + let + later_tuple = mkTupleExpr later_ids + later_ty = mkTupleType later_ids + rec_tuple = mkBigCoreTup core_rhss + rec_ty = mkTupleType rec_ids + out_pair = mkCorePairExpr later_tuple rec_tuple + out_pair_ty = mkCorePairTy later_ty rec_ty + in + matchEnvStack out_ids [] out_pair + `thenDs` \ mk_pair_fn -> + + -- ss + + dsfixCmdStmts ids local_vars' out_ids stmts + `thenDs` \ (core_stmts, fv_stmts, env_ids) -> + + -- squash_pair_fn = \ ((env1_ids), ~(rec_ids)) -> (env_ids) + + newSysLocalDs rec_ty `thenDs` \ rec_id -> + let + env1_id_set = fv_stmts `minusVarSet` rec_id_set + env1_ids = varSetElems env1_id_set + env1_ty = mkTupleType env1_ids + in_pair_ty = mkCorePairTy env1_ty rec_ty + core_body = mkBigCoreTup (map selectVar env_ids) + where + selectVar v + | v `elemVarSet` rec_id_set + = mkTupleSelector rec_ids v rec_id (Var rec_id) + | otherwise = Var v + in + matchEnvStack env1_ids [rec_id] core_body + `thenDs` \ squash_pair_fn -> + + -- loop (arr squash_pair_fn >>> ss >>> arr mk_pair_fn) + + let + env_ty = mkTupleType env_ids + core_loop = do_loop ids env1_ty later_ty rec_ty + (do_map_arrow ids in_pair_ty env_ty out_pair_ty + squash_pair_fn + (do_compose ids env_ty out_ty out_pair_ty + core_stmts + (do_arr ids out_ty out_pair_ty mk_pair_fn))) + in + returnDs (core_loop, env1_id_set, env1_ids) + +\end{code} +A sequence of statements (as in a rec) is desugared to an arrow between +two environments +\begin{code} + +dsfixCmdStmts + :: DsCmdEnv -- arrow combinators + -> IdSet -- set of local vars available to this statement + -> [Id] -- output vars of these statements + -> [LStmt Id] -- statements to desugar + -> DsM (CoreExpr, -- desugared expression + IdSet, -- set of local vars that occur free + [Id]) -- input vars + +dsfixCmdStmts ids local_vars out_ids stmts + = fixDs (\ ~(_,_,env_ids) -> + dsCmdStmts ids local_vars env_ids out_ids stmts + `thenDs` \ (core_stmts, fv_stmts) -> + returnDs (core_stmts, fv_stmts, varSetElems fv_stmts)) + +dsCmdStmts + :: DsCmdEnv -- arrow combinators + -> IdSet -- set of local vars available to this statement + -> [Id] -- list of vars in the input to these statements + -> [Id] -- output vars of these statements + -> [LStmt Id] -- statements to desugar + -> DsM 
(CoreExpr, -- desugared expression + IdSet) -- set of local vars that occur free + +dsCmdStmts ids local_vars env_ids out_ids [stmt] + = dsCmdLStmt ids local_vars env_ids out_ids stmt + +dsCmdStmts ids local_vars env_ids out_ids (stmt:stmts) + = let + bound_vars = mkVarSet (map unLoc (collectLStmtBinders stmt)) + local_vars' = local_vars `unionVarSet` bound_vars + in + dsfixCmdStmts ids local_vars' out_ids stmts + `thenDs` \ (core_stmts, fv_stmts, env_ids') -> + dsCmdLStmt ids local_vars env_ids env_ids' stmt + `thenDs` \ (core_stmt, fv_stmt) -> + returnDs (do_compose ids + (mkTupleType env_ids) + (mkTupleType env_ids') + (mkTupleType out_ids) + core_stmt + core_stmts, + fv_stmt) + +\end{code} + +Match a list of expressions against a list of patterns, left-to-right. + +\begin{code} +matchSimplys :: [CoreExpr] -- Scrutinees + -> HsMatchContext Name -- Match kind + -> [LPat Id] -- Patterns they should match + -> CoreExpr -- Return this if they all match + -> CoreExpr -- Return this if they don't + -> DsM CoreExpr +matchSimplys [] _ctxt [] result_expr _fail_expr = returnDs result_expr +matchSimplys (exp:exps) ctxt (pat:pats) result_expr fail_expr + = matchSimplys exps ctxt pats result_expr fail_expr + `thenDs` \ match_code -> + matchSimply exp ctxt pat match_code fail_expr +\end{code} + +List of leaf expressions, with set of variables bound in each + +\begin{code} +leavesMatch :: LMatch Id -> [(LHsExpr Id, IdSet)] +leavesMatch (L _ (Match pats _ (GRHSs grhss binds))) + = let + defined_vars = mkVarSet (collectPatsBinders pats) + `unionVarSet` + mkVarSet (map unLoc (collectLocalBinders binds)) + in + [(expr, + mkVarSet (map unLoc (collectLStmtsBinders stmts)) + `unionVarSet` defined_vars) + | L _ (GRHS stmts expr) <- grhss] +\end{code} + +Replace the leaf commands in a match + +\begin{code} +replaceLeavesMatch + :: Type -- new result type + -> [LHsExpr Id] -- replacement leaf expressions of that type + -> LMatch Id -- the matches of a case command + -> ([LHsExpr Id],-- remaining leaf expressions + LMatch Id) -- updated match +replaceLeavesMatch res_ty leaves (L loc (Match pat mt (GRHSs grhss binds))) + = let + (leaves', grhss') = mapAccumL replaceLeavesGRHS leaves grhss + in + (leaves', L loc (Match pat mt (GRHSs grhss' binds))) + +replaceLeavesGRHS + :: [LHsExpr Id] -- replacement leaf expressions of that type + -> LGRHS Id -- rhss of a case command + -> ([LHsExpr Id],-- remaining leaf expressions + LGRHS Id) -- updated GRHS +replaceLeavesGRHS (leaf:leaves) (L loc (GRHS stmts rhs)) + = (leaves, L loc (GRHS stmts leaf)) +\end{code} + +Balanced fold of a non-empty list. + +\begin{code} +foldb :: (a -> a -> a) -> [a] -> a +foldb _ [] = error "foldb of empty list" +foldb _ [x] = x +foldb f xs = foldb f (fold_pairs xs) + where + fold_pairs [] = [] + fold_pairs [x] = [x] + fold_pairs (x1:x2:xs) = f x1 x2:fold_pairs xs +\end{code} diff --git a/compiler/deSugar/DsBinds.lhs b/compiler/deSugar/DsBinds.lhs new file mode 100644 index 0000000000..8f3006d0f3 --- /dev/null +++ b/compiler/deSugar/DsBinds.lhs @@ -0,0 +1,417 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[DsBinds]{Pattern-matching bindings (HsBinds and MonoBinds)} + +Handles @HsBinds@; those at the top level require different handling, +in that the @Rec@/@NonRec@/etc structure is thrown away (whereas at +lower levels it is preserved with @let@/@letrec@s). + +\begin{code} +module DsBinds ( dsTopLHsBinds, dsLHsBinds, decomposeRuleLhs, + dsCoercion, + AutoScc(..) 
+ ) where + +#include "HsVersions.h" + + +import {-# SOURCE #-} DsExpr( dsLExpr, dsExpr ) +import {-# SOURCE #-} Match( matchWrapper ) + +import DsMonad +import DsGRHSs ( dsGuarded ) +import DsUtils + +import HsSyn -- lots of things +import CoreSyn -- lots of things +import CoreUtils ( exprType, mkInlineMe, mkSCC ) + +import StaticFlags ( opt_AutoSccsOnAllToplevs, + opt_AutoSccsOnExportedToplevs ) +import OccurAnal ( occurAnalyseExpr ) +import CostCentre ( mkAutoCC, IsCafCC(..) ) +import Id ( Id, DictId, idType, idName, isExportedId, mkLocalId, setInlinePragma ) +import Rules ( addIdSpecialisations, mkLocalRule ) +import Var ( TyVar, Var, isGlobalId, setIdNotExported ) +import VarEnv +import Type ( mkTyVarTy, substTyWith ) +import TysWiredIn ( voidTy ) +import Outputable +import SrcLoc ( Located(..) ) +import Maybes ( isJust, catMaybes, orElse ) +import Bag ( bagToList ) +import BasicTypes ( Activation(..), InlineSpec(..), isAlwaysActive, defaultInlineSpec ) +import Monad ( foldM ) +import FastString ( mkFastString ) +import List ( (\\) ) +import Util ( mapSnd ) +\end{code} + +%************************************************************************ +%* * +\subsection[dsMonoBinds]{Desugaring a @MonoBinds@} +%* * +%************************************************************************ + +\begin{code} +dsTopLHsBinds :: AutoScc -> LHsBinds Id -> DsM [(Id,CoreExpr)] +dsTopLHsBinds auto_scc binds = ds_lhs_binds auto_scc binds + +dsLHsBinds :: LHsBinds Id -> DsM [(Id,CoreExpr)] +dsLHsBinds binds = ds_lhs_binds NoSccs binds + + +------------------------ +ds_lhs_binds :: AutoScc -> LHsBinds Id -> DsM [(Id,CoreExpr)] + -- scc annotation policy (see below) +ds_lhs_binds auto_scc binds = foldM (dsLHsBind auto_scc) [] (bagToList binds) + +dsLHsBind :: AutoScc + -> [(Id,CoreExpr)] -- Put this on the end (avoid quadratic append) + -> LHsBind Id + -> DsM [(Id,CoreExpr)] -- Result +dsLHsBind auto_scc rest (L loc bind) + = putSrcSpanDs loc $ dsHsBind auto_scc rest bind + +dsHsBind :: AutoScc + -> [(Id,CoreExpr)] -- Put this on the end (avoid quadratic append) + -> HsBind Id + -> DsM [(Id,CoreExpr)] -- Result + +dsHsBind auto_scc rest (VarBind var expr) + = dsLExpr expr `thenDs` \ core_expr -> + + -- Dictionary bindings are always VarMonoBinds, so + -- we only need do this here + addDictScc var core_expr `thenDs` \ core_expr' -> + returnDs ((var, core_expr') : rest) + +dsHsBind auto_scc rest (FunBind { fun_id = L _ fun, fun_matches = matches, fun_co_fn = co_fn }) + = matchWrapper (FunRhs (idName fun)) matches `thenDs` \ (args, body) -> + dsCoercion co_fn (return (mkLams args body)) `thenDs` \ rhs -> + addAutoScc auto_scc (fun, rhs) `thenDs` \ pair -> + returnDs (pair : rest) + +dsHsBind auto_scc rest (PatBind { pat_lhs = pat, pat_rhs = grhss, pat_rhs_ty = ty }) + = dsGuarded grhss ty `thenDs` \ body_expr -> + mkSelectorBinds pat body_expr `thenDs` \ sel_binds -> + mappM (addAutoScc auto_scc) sel_binds `thenDs` \ sel_binds -> + returnDs (sel_binds ++ rest) + + -- Common special case: no type or dictionary abstraction + -- For the (rare) case when there are some mixed-up + -- dictionary bindings (for which a Rec is convenient) + -- we reply on the enclosing dsBind to wrap a Rec around. 
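+  -- A rough example (purely illustrative): for a monomorphic,
+  -- non-overloaded binding  f = \x -> x+1  the typechecker produces
+  --    AbsBinds [] [] [([], f, f_mono, prags)] { f_mono = \x -> x+1 }
+  -- and the equation below yields the pair (f_mono, \x -> x+1)
+  -- together with the trivial binding f = f_mono.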
+dsHsBind auto_scc rest (AbsBinds [] [] exports binds) + = ds_lhs_binds (addSccs auto_scc exports) binds `thenDs` \ core_prs -> + let + core_prs' = addLocalInlines exports core_prs + exports' = [(global, Var local) | (_, global, local, _) <- exports] + in + returnDs (core_prs' ++ exports' ++ rest) + + -- Another common case: one exported variable + -- Non-recursive bindings come through this way +dsHsBind auto_scc rest + (AbsBinds all_tyvars dicts exports@[(tyvars, global, local, prags)] binds) + = ASSERT( all (`elem` tyvars) all_tyvars ) + ds_lhs_binds (addSccs auto_scc exports) binds `thenDs` \ core_prs -> + let + -- Always treat the binds as recursive, because the typechecker + -- makes rather mixed-up dictionary bindings + core_bind = Rec core_prs + in + mappM (dsSpec all_tyvars dicts tyvars global local core_bind) + prags `thenDs` \ mb_specs -> + let + (spec_binds, rules) = unzip (catMaybes mb_specs) + global' = addIdSpecialisations global rules + rhs' = mkLams tyvars $ mkLams dicts $ Let core_bind (Var local) + inl = case [inl | InlinePrag inl <- prags] of + [] -> defaultInlineSpec + (inl:_) -> inl + in + returnDs (addInlineInfo inl global' rhs' : spec_binds ++ rest) + +dsHsBind auto_scc rest (AbsBinds all_tyvars dicts exports binds) + = ds_lhs_binds (addSccs auto_scc exports) binds `thenDs` \ core_prs -> + let + -- Rec because of mixed-up dictionary bindings + core_bind = Rec (addLocalInlines exports core_prs) + + tup_expr = mkTupleExpr locals + tup_ty = exprType tup_expr + poly_tup_expr = mkLams all_tyvars $ mkLams dicts $ + Let core_bind tup_expr + locals = [local | (_, _, local, _) <- exports] + local_tys = map idType locals + in + newSysLocalDs (exprType poly_tup_expr) `thenDs` \ poly_tup_id -> + let + dict_args = map Var dicts + + mk_bind ((tyvars, global, local, prags), n) -- locals !! n == local + = -- Need to make fresh locals to bind in the selector, because + -- some of the tyvars will be bound to voidTy + newSysLocalsDs (map substitute local_tys) `thenDs` \ locals' -> + newSysLocalDs (substitute tup_ty) `thenDs` \ tup_id -> + mapM (dsSpec all_tyvars dicts tyvars global local core_bind) + prags `thenDs` \ mb_specs -> + let + (spec_binds, rules) = unzip (catMaybes mb_specs) + global' = addIdSpecialisations global rules + rhs = mkLams tyvars $ mkLams dicts $ + mkTupleSelector locals' (locals' !! n) tup_id $ + mkApps (mkTyApps (Var poly_tup_id) ty_args) dict_args + in + returnDs ((global', rhs) : spec_binds) + where + mk_ty_arg all_tyvar | all_tyvar `elem` tyvars = mkTyVarTy all_tyvar + | otherwise = voidTy + ty_args = map mk_ty_arg all_tyvars + substitute = substTyWith all_tyvars ty_args + in + mappM mk_bind (exports `zip` [0..]) `thenDs` \ export_binds_s -> + -- don't scc (auto-)annotate the tuple itself. + + returnDs ((poly_tup_id, poly_tup_expr) : (concat export_binds_s ++ rest)) + +dsSpec :: [TyVar] -> [DictId] -> [TyVar] + -> Id -> Id -- Global, local + -> CoreBind -> Prag + -> DsM (Maybe ((Id,CoreExpr), -- Binding for specialised Id + CoreRule)) -- Rule for the Global Id + +-- Example: +-- f :: (Eq a, Ix b) => a -> b -> b +-- {-# SPECIALISE f :: Ix b => Int -> b -> b #-} +-- +-- AbsBinds [ab] [d1,d2] [([ab], f, f_mono, prags)] binds +-- +-- SpecPrag (/\b.\(d:Ix b). f Int b dInt d) +-- (forall b. Ix b => Int -> b -> b) +-- +-- Rule: forall b,(d:Ix b). f Int b dInt d = f_spec b d +-- +-- Spec bind: f_spec = Let f = /\ab \(d1:Eq a)(d2:Ix b). let binds in f_mono +-- /\b.\(d:Ix b). 
in f Int b dInt d +-- The idea is that f occurs just once, so it'll be +-- inlined and specialised + +dsSpec all_tvs dicts tvs poly_id mono_id mono_bind (InlinePrag {}) + = return Nothing + +dsSpec all_tvs dicts tvs poly_id mono_id mono_bind + (SpecPrag spec_expr spec_ty const_dicts inl) + = do { let poly_name = idName poly_id + ; spec_name <- newLocalName poly_name + ; ds_spec_expr <- dsExpr spec_expr + ; let (bndrs, body) = collectBinders ds_spec_expr + mb_lhs = decomposeRuleLhs (bndrs ++ const_dicts) body + + ; case mb_lhs of + Nothing -> do { dsWarn msg; return Nothing } + + Just (bndrs', var, args) -> return (Just (addInlineInfo inl spec_id spec_rhs, rule)) + where + local_poly = setIdNotExported poly_id + -- Very important to make the 'f' non-exported, + -- else it won't be inlined! + spec_id = mkLocalId spec_name spec_ty + spec_rhs = Let (NonRec local_poly poly_f_body) ds_spec_expr + poly_f_body = mkLams (tvs ++ dicts) $ + fix_up (Let mono_bind (Var mono_id)) + + -- Quantify over constant dicts on the LHS, since + -- their value depends only on their type + -- The ones we are interested in may even be imported + -- e.g. GHC.Base.dEqInt + + rule = mkLocalRule (mkFastString ("SPEC " ++ showSDoc (ppr poly_name))) + AlwaysActive poly_name + bndrs' -- Includes constant dicts + args + (mkVarApps (Var spec_id) bndrs) + } + where + -- Bind to voidTy any of all_ptvs that aren't + -- relevant for this particular function + fix_up body | null void_tvs = body + | otherwise = mkTyApps (mkLams void_tvs body) + (map (const voidTy) void_tvs) + void_tvs = all_tvs \\ tvs + + msg = hang (ptext SLIT("Specialisation too complicated to desugar; ignored")) + 2 (ppr spec_expr) +\end{code} + + +%************************************************************************ +%* * +\subsection{Adding inline pragmas} +%* * +%************************************************************************ + +\begin{code} +decomposeRuleLhs :: [Var] -> CoreExpr -> Maybe ([Var], Id, [CoreExpr]) +-- Returns Nothing if the LHS isn't of the expected shape +-- The argument 'all_bndrs' includes the "constant dicts" of the LHS, +-- and they may be GlobalIds, which we can't forall-ify. +-- So we substitute them out instead +decomposeRuleLhs all_bndrs lhs + = go init_env (occurAnalyseExpr lhs) -- Occurrence analysis sorts out the dict + -- bindings so we know if they are recursive + where + + -- all_bndrs may include top-level imported dicts, + -- imported things with a for-all. 
+ -- So we localise them and subtitute them out + bndr_prs = [ (id, Var (localise id)) | id <- all_bndrs, isGlobalId id ] + localise d = mkLocalId (idName d) (idType d) + + init_env = mkVarEnv bndr_prs + all_bndrs' = map subst_bndr all_bndrs + subst_bndr bndr = case lookupVarEnv init_env bndr of + Just (Var bndr') -> bndr' + Just other -> panic "decomposeRuleLhs" + Nothing -> bndr + + -- Substitute dicts in the LHS args, so that there + -- aren't any lets getting in the way + -- Note that we substitute the function too; we might have this as + -- a LHS: let f71 = M.f Int in f71 + go env (Let (NonRec dict rhs) body) + = go (extendVarEnv env dict (simpleSubst env rhs)) body + go env body + = case collectArgs (simpleSubst env body) of + (Var fn, args) -> Just (all_bndrs', fn, args) + other -> Nothing + +simpleSubst :: IdEnv CoreExpr -> CoreExpr -> CoreExpr +-- Similar to CoreSubst.substExpr, except that +-- (a) takes no account of capture; dictionary bindings use new names +-- (b) can have a GlobalId (imported) in its domain +-- (c) Ids only; no types are substituted + +simpleSubst subst expr + = go expr + where + go (Var v) = lookupVarEnv subst v `orElse` Var v + go (Type ty) = Type ty + go (Lit lit) = Lit lit + go (App fun arg) = App (go fun) (go arg) + go (Note note e) = Note note (go e) + go (Lam bndr body) = Lam bndr (go body) + go (Let (NonRec bndr rhs) body) = Let (NonRec bndr (go rhs)) (go body) + go (Let (Rec pairs) body) = Let (Rec (mapSnd go pairs)) (go body) + go (Case scrut bndr ty alts) = Case (go scrut) bndr ty + [(c,bs,go r) | (c,bs,r) <- alts] + +addLocalInlines exports core_prs + = map add_inline core_prs + where + add_inline (bndr,rhs) | Just inl <- lookupVarEnv inline_env bndr + = addInlineInfo inl bndr rhs + | otherwise + = (bndr,rhs) + inline_env = mkVarEnv [(mono_id, prag) + | (_, _, mono_id, prags) <- exports, + InlinePrag prag <- prags] + +addInlineInfo :: InlineSpec -> Id -> CoreExpr -> (Id,CoreExpr) +addInlineInfo (Inline phase is_inline) bndr rhs + = (attach_phase bndr phase, wrap_inline is_inline rhs) + where + attach_phase bndr phase + | isAlwaysActive phase = bndr -- Default phase + | otherwise = bndr `setInlinePragma` phase + + wrap_inline True body = mkInlineMe body + wrap_inline False body = body +\end{code} + + +%************************************************************************ +%* * +\subsection[addAutoScc]{Adding automatic sccs} +%* * +%************************************************************************ + +\begin{code} +data AutoScc + = TopLevel + | TopLevelAddSccs (Id -> Maybe Id) + | NoSccs + +addSccs :: AutoScc -> [(a,Id,Id,[Prag])] -> AutoScc +addSccs auto_scc@(TopLevelAddSccs _) exports = auto_scc +addSccs NoSccs exports = NoSccs +addSccs TopLevel exports + = TopLevelAddSccs (\id -> case [ exp | (_,exp,loc,_) <- exports, loc == id ] of + (exp:_) | opt_AutoSccsOnAllToplevs || + (isExportedId exp && + opt_AutoSccsOnExportedToplevs) + -> Just exp + _ -> Nothing) + +addAutoScc :: AutoScc -- if needs be, decorate toplevs? 
+ -> (Id, CoreExpr) + -> DsM (Id, CoreExpr) + +addAutoScc (TopLevelAddSccs auto_scc_fn) pair@(bndr, core_expr) + | do_auto_scc + = getModuleDs `thenDs` \ mod -> + returnDs (bndr, mkSCC (mkAutoCC top_bndr mod NotCafCC) core_expr) + where do_auto_scc = isJust maybe_auto_scc + maybe_auto_scc = auto_scc_fn bndr + (Just top_bndr) = maybe_auto_scc + +addAutoScc _ pair + = returnDs pair +\end{code} + +If profiling and dealing with a dict binding, +wrap the dict in @_scc_ DICT <dict>@: + +\begin{code} +addDictScc var rhs = returnDs rhs + +{- DISABLED for now (need to somehow make up a name for the scc) -- SDM + | not ( opt_SccProfilingOn && opt_AutoSccsOnDicts) + || not (isDictId var) + = returnDs rhs -- That's easy: do nothing + + | otherwise + = getModuleAndGroupDs `thenDs` \ (mod, grp) -> + -- ToDo: do -dicts-all flag (mark dict things with individual CCs) + returnDs (Note (SCC (mkAllDictsCC mod grp False)) rhs) +-} +\end{code} + + +%************************************************************************ +%* * + Desugaring coercions +%* * +%************************************************************************ + + +\begin{code} +dsCoercion :: ExprCoFn -> DsM CoreExpr -> DsM CoreExpr +dsCoercion CoHole thing_inside = thing_inside +dsCoercion (CoCompose c1 c2) thing_inside = dsCoercion c1 (dsCoercion c2 thing_inside) +dsCoercion (CoLams ids c) thing_inside = do { expr <- dsCoercion c thing_inside + ; return (mkLams ids expr) } +dsCoercion (CoTyLams tvs c) thing_inside = do { expr <- dsCoercion c thing_inside + ; return (mkLams tvs expr) } +dsCoercion (CoApps c ids) thing_inside = do { expr <- dsCoercion c thing_inside + ; return (mkVarApps expr ids) } +dsCoercion (CoTyApps c tys) thing_inside = do { expr <- dsCoercion c thing_inside + ; return (mkTyApps expr tys) } +dsCoercion (CoLet bs c) thing_inside = do { prs <- dsLHsBinds bs + ; expr <- dsCoercion c thing_inside + ; return (Let (Rec prs) expr) } +\end{code} + + diff --git a/compiler/deSugar/DsCCall.lhs b/compiler/deSugar/DsCCall.lhs new file mode 100644 index 0000000000..3554197fb8 --- /dev/null +++ b/compiler/deSugar/DsCCall.lhs @@ -0,0 +1,456 @@ +% +% (c) The AQUA Project, Glasgow University, 1994-1998 +% +\section[DsCCall]{Desugaring C calls} + +\begin{code} +module DsCCall + ( dsCCall + , mkFCall + , unboxArg + , boxResult + , resultWrapper + ) where + +#include "HsVersions.h" + + +import CoreSyn + +import DsMonad + +import CoreUtils ( exprType, coreAltType, mkCoerce2 ) +import Id ( Id, mkWildId ) +import MkId ( mkFCallId, realWorldPrimId, mkPrimOpId ) +import Maybes ( maybeToBool ) +import ForeignCall ( ForeignCall(..), CCallSpec(..), CCallTarget(..), Safety, + CCallConv(..), CLabelString ) +import DataCon ( splitProductType_maybe, dataConSourceArity, dataConWrapId ) + +import TcType ( tcSplitTyConApp_maybe ) +import Type ( Type, isUnLiftedType, mkFunTys, mkFunTy, + tyVarsOfType, mkForAllTys, mkTyConApp, + isPrimitiveType, splitTyConApp_maybe, + splitRecNewType_maybe, splitForAllTy_maybe, + isUnboxedTupleType + ) + +import PrimOp ( PrimOp(..) ) +import TysPrim ( realWorldStatePrimTy, intPrimTy, + byteArrayPrimTyCon, mutableByteArrayPrimTyCon, + addrPrimTy + ) +import TyCon ( TyCon, tyConDataCons, tyConName ) +import TysWiredIn ( unitDataConId, + unboxedSingletonDataCon, unboxedPairDataCon, + unboxedSingletonTyCon, unboxedPairTyCon, + trueDataCon, falseDataCon, + trueDataConId, falseDataConId, + listTyCon, charTyCon, boolTy, + tupleTyCon, tupleCon + ) +import BasicTypes ( Boxity(..) 
) +import Literal ( mkMachInt ) +import PrelNames ( Unique, hasKey, ioTyConKey, boolTyConKey, unitTyConKey, + int8TyConKey, int16TyConKey, int32TyConKey, + word8TyConKey, word16TyConKey, word32TyConKey + -- dotnet interop + , marshalStringName, unmarshalStringName + , marshalObjectName, unmarshalObjectName + , objectTyConName + ) +import VarSet ( varSetElems ) +import Constants ( wORD_SIZE) +import Outputable + +#ifdef DEBUG +import TypeRep +#endif + +\end{code} + +Desugaring of @ccall@s consists of adding some state manipulation, +unboxing any boxed primitive arguments and boxing the result if +desired. + +The state stuff just consists of adding in +@PrimIO (\ s -> case s of { S# s# -> ... })@ in an appropriate place. + +The unboxing is straightforward, as all information needed to unbox is +available from the type. For each boxed-primitive argument, we +transform: +\begin{verbatim} + _ccall_ foo [ r, t1, ... tm ] e1 ... em + | + | + V + case e1 of { T1# x1# -> + ... + case em of { Tm# xm# -> xm# + ccall# foo [ r, t1#, ... tm# ] x1# ... xm# + } ... } +\end{verbatim} + +The reboxing of a @_ccall_@ result is a bit tricker: the types don't +contain information about the state-pairing functions so we have to +keep a list of \tr{(type, s-p-function)} pairs. We transform as +follows: +\begin{verbatim} + ccall# foo [ r, t1#, ... tm# ] e1# ... em# + | + | + V + \ s# -> case (ccall# foo [ r, t1#, ... tm# ] s# e1# ... em#) of + (StateAnd<r># result# state#) -> (R# result#, realWorld#) +\end{verbatim} + +\begin{code} +dsCCall :: CLabelString -- C routine to invoke + -> [CoreExpr] -- Arguments (desugared) + -> Safety -- Safety of the call + -> Type -- Type of the result: IO t + -> DsM CoreExpr + +dsCCall lbl args may_gc result_ty + = mapAndUnzipDs unboxArg args `thenDs` \ (unboxed_args, arg_wrappers) -> + boxResult id Nothing result_ty `thenDs` \ (ccall_result_ty, res_wrapper) -> + newUnique `thenDs` \ uniq -> + let + target = StaticTarget lbl + the_fcall = CCall (CCallSpec target CCallConv may_gc) + the_prim_app = mkFCall uniq the_fcall unboxed_args ccall_result_ty + in + returnDs (foldr ($) (res_wrapper the_prim_app) arg_wrappers) + +mkFCall :: Unique -> ForeignCall + -> [CoreExpr] -- Args + -> Type -- Result type + -> CoreExpr +-- Construct the ccall. The only tricky bit is that the ccall Id should have +-- no free vars, so if any of the arg tys do we must give it a polymorphic type. +-- [I forget *why* it should have no free vars!] +-- For example: +-- mkCCall ... [s::StablePtr (a->b), x::Addr, c::Char] +-- +-- Here we build a ccall thus +-- (ccallid::(forall a b. StablePtr (a -> b) -> Addr -> Char -> IO Addr)) +-- a b s x c +mkFCall uniq the_fcall val_args res_ty + = mkApps (mkVarApps (Var the_fcall_id) tyvars) val_args + where + arg_tys = map exprType val_args + body_ty = (mkFunTys arg_tys res_ty) + tyvars = varSetElems (tyVarsOfType body_ty) + ty = mkForAllTys tyvars body_ty + the_fcall_id = mkFCallId uniq the_fcall ty +\end{code} + +\begin{code} +unboxArg :: CoreExpr -- The supplied argument + -> DsM (CoreExpr, -- To pass as the actual argument + CoreExpr -> CoreExpr -- Wrapper to unbox the arg + ) +-- Example: if the arg is e::Int, unboxArg will return +-- (x#::Int#, \W. 
case x of I# x# -> W) +-- where W is a CoreExpr that probably mentions x# + +unboxArg arg + -- Primtive types: nothing to unbox + | isPrimitiveType arg_ty + = returnDs (arg, \body -> body) + + -- Recursive newtypes + | Just rep_ty <- splitRecNewType_maybe arg_ty + = unboxArg (mkCoerce2 rep_ty arg_ty arg) + + -- Booleans + | Just (tc,_) <- splitTyConApp_maybe arg_ty, + tc `hasKey` boolTyConKey + = newSysLocalDs intPrimTy `thenDs` \ prim_arg -> + returnDs (Var prim_arg, + \ body -> Case (Case arg (mkWildId arg_ty) intPrimTy + [(DataAlt falseDataCon,[],mkIntLit 0), + (DataAlt trueDataCon, [],mkIntLit 1)]) + -- In increasing tag order! + prim_arg + (exprType body) + [(DEFAULT,[],body)]) + + -- Data types with a single constructor, which has a single, primitive-typed arg + -- This deals with Int, Float etc; also Ptr, ForeignPtr + | is_product_type && data_con_arity == 1 + = ASSERT2(isUnLiftedType data_con_arg_ty1, pprType arg_ty) + -- Typechecker ensures this + newSysLocalDs arg_ty `thenDs` \ case_bndr -> + newSysLocalDs data_con_arg_ty1 `thenDs` \ prim_arg -> + returnDs (Var prim_arg, + \ body -> Case arg case_bndr (exprType body) [(DataAlt data_con,[prim_arg],body)] + ) + + -- Byte-arrays, both mutable and otherwise; hack warning + -- We're looking for values of type ByteArray, MutableByteArray + -- data ByteArray ix = ByteArray ix ix ByteArray# + -- data MutableByteArray s ix = MutableByteArray ix ix (MutableByteArray# s) + | is_product_type && + data_con_arity == 3 && + maybeToBool maybe_arg3_tycon && + (arg3_tycon == byteArrayPrimTyCon || + arg3_tycon == mutableByteArrayPrimTyCon) + = newSysLocalDs arg_ty `thenDs` \ case_bndr -> + newSysLocalsDs data_con_arg_tys `thenDs` \ vars@[l_var, r_var, arr_cts_var] -> + returnDs (Var arr_cts_var, + \ body -> Case arg case_bndr (exprType body) [(DataAlt data_con,vars,body)] + + ) + + | Just (tc, [arg_ty]) <- splitTyConApp_maybe arg_ty, + tc == listTyCon, + Just (cc,[]) <- splitTyConApp_maybe arg_ty, + cc == charTyCon + -- String; dotnet only + = dsLookupGlobalId marshalStringName `thenDs` \ unpack_id -> + newSysLocalDs addrPrimTy `thenDs` \ prim_string -> + returnDs (Var prim_string, + \ body -> + let + io_ty = exprType body + (Just (_,[io_arg])) = tcSplitTyConApp_maybe io_ty + in + mkApps (Var unpack_id) + [ Type io_arg + , arg + , Lam prim_string body + ]) + | Just (tc, [arg_ty]) <- splitTyConApp_maybe arg_ty, + tyConName tc == objectTyConName + -- Object; dotnet only + = dsLookupGlobalId marshalObjectName `thenDs` \ unpack_id -> + newSysLocalDs addrPrimTy `thenDs` \ prim_obj -> + returnDs (Var prim_obj, + \ body -> + let + io_ty = exprType body + (Just (_,[io_arg])) = tcSplitTyConApp_maybe io_ty + in + mkApps (Var unpack_id) + [ Type io_arg + , arg + , Lam prim_obj body + ]) + + | otherwise + = getSrcSpanDs `thenDs` \ l -> + pprPanic "unboxArg: " (ppr l <+> ppr arg_ty) + where + arg_ty = exprType arg + maybe_product_type = splitProductType_maybe arg_ty + is_product_type = maybeToBool maybe_product_type + Just (_, _, data_con, data_con_arg_tys) = maybe_product_type + data_con_arity = dataConSourceArity data_con + (data_con_arg_ty1 : _) = data_con_arg_tys + + (_ : _ : data_con_arg_ty3 : _) = data_con_arg_tys + maybe_arg3_tycon = splitTyConApp_maybe data_con_arg_ty3 + Just (arg3_tycon,_) = maybe_arg3_tycon +\end{code} + + +\begin{code} +boxResult :: ((Maybe Type, CoreExpr -> CoreExpr) -> (Maybe Type, CoreExpr -> CoreExpr)) + -> Maybe Id + -> Type + -> DsM (Type, CoreExpr -> CoreExpr) + +-- Takes the result of the user-level ccall: +-- either (IO t), 
+-- or maybe just t for an side-effect-free call +-- Returns a wrapper for the primitive ccall itself, along with the +-- type of the result of the primitive ccall. This result type +-- will be of the form +-- State# RealWorld -> (# State# RealWorld, t' #) +-- where t' is the unwrapped form of t. If t is simply (), then +-- the result type will be +-- State# RealWorld -> (# State# RealWorld #) + +boxResult augment mbTopCon result_ty + = case tcSplitTyConApp_maybe result_ty of + -- This split absolutely has to be a tcSplit, because we must + -- see the IO type; and it's a newtype which is transparent to splitTyConApp. + + -- The result is IO t, so wrap the result in an IO constructor + Just (io_tycon, [io_res_ty]) | io_tycon `hasKey` ioTyConKey + -> resultWrapper io_res_ty `thenDs` \ res -> + let aug_res = augment res + extra_result_tys = + case aug_res of + (Just ty,_) + | isUnboxedTupleType ty -> + let (Just (_, ls)) = splitTyConApp_maybe ty in tail ls + _ -> [] + in + mk_alt (return_result extra_result_tys) aug_res + `thenDs` \ (ccall_res_ty, the_alt) -> + newSysLocalDs realWorldStatePrimTy `thenDs` \ state_id -> + let + io_data_con = head (tyConDataCons io_tycon) + toIOCon = + case mbTopCon of + Nothing -> dataConWrapId io_data_con + Just x -> x + wrap = \ the_call -> + mkApps (Var toIOCon) + [ Type io_res_ty, + Lam state_id $ + Case (App the_call (Var state_id)) + (mkWildId ccall_res_ty) + (coreAltType the_alt) + [the_alt] + ] + in + returnDs (realWorldStatePrimTy `mkFunTy` ccall_res_ty, wrap) + where + return_result ts state anss + = mkConApp (tupleCon Unboxed (2 + length ts)) + (Type realWorldStatePrimTy : Type io_res_ty : map Type ts ++ + state : anss) + -- It isn't, so do unsafePerformIO + -- It's not conveniently available, so we inline it + other -> resultWrapper result_ty `thenDs` \ res -> + mk_alt return_result (augment res) `thenDs` \ (ccall_res_ty, the_alt) -> + let + wrap = \ the_call -> Case (App the_call (Var realWorldPrimId)) + (mkWildId ccall_res_ty) + (coreAltType the_alt) + [the_alt] + in + returnDs (realWorldStatePrimTy `mkFunTy` ccall_res_ty, wrap) + where + return_result state [ans] = ans + return_result _ _ = panic "return_result: expected single result" + where + mk_alt return_result (Nothing, wrap_result) + = -- The ccall returns () + newSysLocalDs realWorldStatePrimTy `thenDs` \ state_id -> + let + the_rhs = return_result (Var state_id) + [wrap_result (panic "boxResult")] + + ccall_res_ty = mkTyConApp unboxedSingletonTyCon [realWorldStatePrimTy] + the_alt = (DataAlt unboxedSingletonDataCon, [state_id], the_rhs) + in + returnDs (ccall_res_ty, the_alt) + + mk_alt return_result (Just prim_res_ty, wrap_result) + -- The ccall returns a non-() value + | isUnboxedTupleType prim_res_ty + = let + Just (_, ls) = splitTyConApp_maybe prim_res_ty + arity = 1 + length ls + in + mappM newSysLocalDs ls `thenDs` \ args_ids@(result_id:as) -> + newSysLocalDs realWorldStatePrimTy `thenDs` \ state_id -> + let + the_rhs = return_result (Var state_id) + (wrap_result (Var result_id) : map Var as) + ccall_res_ty = mkTyConApp (tupleTyCon Unboxed arity) + (realWorldStatePrimTy : ls) + the_alt = ( DataAlt (tupleCon Unboxed arity) + , (state_id : args_ids) + , the_rhs + ) + in + returnDs (ccall_res_ty, the_alt) + | otherwise + = newSysLocalDs prim_res_ty `thenDs` \ result_id -> + newSysLocalDs realWorldStatePrimTy `thenDs` \ state_id -> + let + the_rhs = return_result (Var state_id) + [wrap_result (Var result_id)] + + ccall_res_ty = mkTyConApp unboxedPairTyCon [realWorldStatePrimTy, 
prim_res_ty] + the_alt = (DataAlt unboxedPairDataCon, [state_id, result_id], the_rhs) + in + returnDs (ccall_res_ty, the_alt) + + +resultWrapper :: Type + -> DsM (Maybe Type, -- Type of the expected result, if any + CoreExpr -> CoreExpr) -- Wrapper for the result +resultWrapper result_ty + -- Base case 1: primitive types + | isPrimitiveType result_ty + = returnDs (Just result_ty, \e -> e) + + -- Base case 2: the unit type () + | Just (tc,_) <- maybe_tc_app, tc `hasKey` unitTyConKey + = returnDs (Nothing, \e -> Var unitDataConId) + + -- Base case 3: the boolean type + | Just (tc,_) <- maybe_tc_app, tc `hasKey` boolTyConKey + = returnDs + (Just intPrimTy, \e -> Case e (mkWildId intPrimTy) + boolTy + [(DEFAULT ,[],Var trueDataConId ), + (LitAlt (mkMachInt 0),[],Var falseDataConId)]) + + -- Recursive newtypes + | Just rep_ty <- splitRecNewType_maybe result_ty + = resultWrapper rep_ty `thenDs` \ (maybe_ty, wrapper) -> + returnDs (maybe_ty, \e -> mkCoerce2 result_ty rep_ty (wrapper e)) + + -- The type might contain foralls (eg. for dummy type arguments, + -- referring to 'Ptr a' is legal). + | Just (tyvar, rest) <- splitForAllTy_maybe result_ty + = resultWrapper rest `thenDs` \ (maybe_ty, wrapper) -> + returnDs (maybe_ty, \e -> Lam tyvar (wrapper e)) + + -- Data types with a single constructor, which has a single arg + -- This includes types like Ptr and ForeignPtr + | Just (tycon, tycon_arg_tys, data_con, data_con_arg_tys) <- splitProductType_maybe result_ty, + dataConSourceArity data_con == 1 + = let + (unwrapped_res_ty : _) = data_con_arg_tys + narrow_wrapper = maybeNarrow tycon + in + resultWrapper unwrapped_res_ty `thenDs` \ (maybe_ty, wrapper) -> + returnDs + (maybe_ty, \e -> mkApps (Var (dataConWrapId data_con)) + (map Type tycon_arg_tys ++ [wrapper (narrow_wrapper e)])) + + -- Strings; 'dotnet' only. + | Just (tc, [arg_ty]) <- maybe_tc_app, tc == listTyCon, + Just (cc,[]) <- splitTyConApp_maybe arg_ty, cc == charTyCon + = dsLookupGlobalId unmarshalStringName `thenDs` \ pack_id -> + returnDs (Just addrPrimTy, + \ e -> App (Var pack_id) e) + + -- Objects; 'dotnet' only. + | Just (tc, [arg_ty]) <- maybe_tc_app, + tyConName tc == objectTyConName + = dsLookupGlobalId unmarshalObjectName `thenDs` \ pack_id -> + returnDs (Just addrPrimTy, + \ e -> App (Var pack_id) e) + + | otherwise + = pprPanic "resultWrapper" (ppr result_ty) + where + maybe_tc_app = splitTyConApp_maybe result_ty + +-- When the result of a foreign call is smaller than the word size, we +-- need to sign- or zero-extend the result up to the word size. The C +-- standard appears to say that this is the responsibility of the +-- caller, not the callee. 
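+--
+-- For instance (an illustrative sketch): for a foreign call whose Haskell
+-- result type is Int8, the raw result r# is rewrapped as roughly
+--    I8# (narrow8Int# r#)
+-- so only the low 8 bits (sign-extended) survive, whatever the C callee
+-- left in the rest of the register.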
+ +maybeNarrow :: TyCon -> (CoreExpr -> CoreExpr) +maybeNarrow tycon + | tycon `hasKey` int8TyConKey = \e -> App (Var (mkPrimOpId Narrow8IntOp)) e + | tycon `hasKey` int16TyConKey = \e -> App (Var (mkPrimOpId Narrow16IntOp)) e + | tycon `hasKey` int32TyConKey + && wORD_SIZE > 4 = \e -> App (Var (mkPrimOpId Narrow32IntOp)) e + + | tycon `hasKey` word8TyConKey = \e -> App (Var (mkPrimOpId Narrow8WordOp)) e + | tycon `hasKey` word16TyConKey = \e -> App (Var (mkPrimOpId Narrow16WordOp)) e + | tycon `hasKey` word32TyConKey + && wORD_SIZE > 4 = \e -> App (Var (mkPrimOpId Narrow32WordOp)) e + | otherwise = id +\end{code} diff --git a/compiler/deSugar/DsExpr.hi-boot-5 b/compiler/deSugar/DsExpr.hi-boot-5 new file mode 100644 index 0000000000..7e5bbaab7f --- /dev/null +++ b/compiler/deSugar/DsExpr.hi-boot-5 @@ -0,0 +1,5 @@ +__interface DsExpr 1 0 where +__export DsExpr dsExpr dsLet; +1 dsExpr :: HsExpr.HsExpr Var.Id -> DsMonad.DsM CoreSyn.CoreExpr ; +1 dsLExpr :: HsExpr.HsLExpr Var.Id -> DsMonad.DsM CoreSyn.CoreExpr ; +1 dsLet :: [HsBinds.HsBindGroup Var.Id] -> CoreSyn.CoreExpr -> DsMonad.DsM CoreSyn.CoreExpr ; diff --git a/compiler/deSugar/DsExpr.hi-boot-6 b/compiler/deSugar/DsExpr.hi-boot-6 new file mode 100644 index 0000000000..c7ddb2ddfd --- /dev/null +++ b/compiler/deSugar/DsExpr.hi-boot-6 @@ -0,0 +1,6 @@ +module DsExpr where + +dsExpr :: HsExpr.HsExpr Var.Id -> DsMonad.DsM CoreSyn.CoreExpr +dsLExpr :: HsExpr.LHsExpr Var.Id -> DsMonad.DsM CoreSyn.CoreExpr +dsLocalBinds :: HsBinds.HsLocalBinds Var.Id -> CoreSyn.CoreExpr -> DsMonad.DsM CoreSyn.CoreExpr +dsValBinds :: HsBinds.HsValBinds Var.Id -> CoreSyn.CoreExpr -> DsMonad.DsM CoreSyn.CoreExpr diff --git a/compiler/deSugar/DsExpr.lhs b/compiler/deSugar/DsExpr.lhs new file mode 100644 index 0000000000..e8e9e7b370 --- /dev/null +++ b/compiler/deSugar/DsExpr.lhs @@ -0,0 +1,781 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[DsExpr]{Matching expressions (Exprs)} + +\begin{code} +module DsExpr ( dsExpr, dsLExpr, dsLocalBinds, dsValBinds, dsLit ) where + +#include "HsVersions.h" +#if defined(GHCI) && defined(BREAKPOINT) +import Foreign.StablePtr ( newStablePtr, castStablePtrToPtr ) +import GHC.Exts ( Ptr(..), Int(..), addr2Int# ) +import IOEnv ( ioToIOEnv ) +import PrelNames ( breakpointJumpName ) +import TysWiredIn ( unitTy ) +import TypeRep ( Type(..) ) +#endif + +import Match ( matchWrapper, matchSinglePat, matchEquations ) +import MatchLit ( dsLit, dsOverLit ) +import DsBinds ( dsLHsBinds, dsCoercion ) +import DsGRHSs ( dsGuarded ) +import DsListComp ( dsListComp, dsPArrComp ) +import DsUtils ( mkErrorAppDs, mkStringExpr, mkConsExpr, mkNilExpr, + extractMatchResult, cantFailMatchResult, matchCanFail, + mkCoreTupTy, selectSimpleMatchVarL, lookupEvidence, selectMatchVar ) +import DsArrows ( dsProcExpr ) +import DsMonad + +#ifdef GHCI + -- Template Haskell stuff iff bootstrapped +import DsMeta ( dsBracket ) +#endif + +import HsSyn +import TcHsSyn ( hsPatType, mkVanillaTuplePat ) + +-- NB: The desugarer, which straddles the source and Core worlds, sometimes +-- needs to see source types (newtypes etc), and sometimes not +-- So WATCH OUT; check each use of split*Ty functions. +-- Sigh. This is a pain. 
+ +import TcType ( tcSplitAppTy, tcSplitFunTys, tcTyConAppTyCon, + tcTyConAppArgs, isUnLiftedType, Type, mkAppTy ) +import Type ( funArgTy, splitFunTys, isUnboxedTupleType, mkFunTy ) +import CoreSyn +import CoreUtils ( exprType, mkIfThenElse, bindNonRec ) + +import CostCentre ( mkUserCC ) +import Id ( Id, idType, idName, idDataCon ) +import PrelInfo ( rEC_CON_ERROR_ID, iRREFUT_PAT_ERROR_ID ) +import DataCon ( DataCon, dataConWrapId, dataConFieldLabels, dataConInstOrigArgTys ) +import DataCon ( isVanillaDataCon ) +import TyCon ( FieldLabel, tyConDataCons ) +import TysWiredIn ( tupleCon ) +import BasicTypes ( RecFlag(..), Boxity(..), ipNameName ) +import PrelNames ( toPName, + returnMName, bindMName, thenMName, failMName, + mfixName ) +import SrcLoc ( Located(..), unLoc, getLoc, noLoc ) +import Util ( zipEqual, zipWithEqual ) +import Bag ( bagToList ) +import Outputable +import FastString +\end{code} + + +%************************************************************************ +%* * + dsLocalBinds, dsValBinds +%* * +%************************************************************************ + +\begin{code} +dsLocalBinds :: HsLocalBinds Id -> CoreExpr -> DsM CoreExpr +dsLocalBinds EmptyLocalBinds body = return body +dsLocalBinds (HsValBinds binds) body = dsValBinds binds body +dsLocalBinds (HsIPBinds binds) body = dsIPBinds binds body + +------------------------- +dsValBinds :: HsValBinds Id -> CoreExpr -> DsM CoreExpr +dsValBinds (ValBindsOut binds _) body = foldrDs ds_val_bind body binds + +------------------------- +dsIPBinds (IPBinds ip_binds dict_binds) body + = do { prs <- dsLHsBinds dict_binds + ; let inner = foldr (\(x,r) e -> Let (NonRec x r) e) body prs + ; foldrDs ds_ip_bind inner ip_binds } + where + ds_ip_bind (L _ (IPBind n e)) body + = dsLExpr e `thenDs` \ e' -> + returnDs (Let (NonRec (ipNameName n) e') body) + +------------------------- +ds_val_bind :: (RecFlag, LHsBinds Id) -> CoreExpr -> DsM CoreExpr +-- Special case for bindings which bind unlifted variables +-- We need to do a case right away, rather than building +-- a tuple and doing selections. +-- Silently ignore INLINE and SPECIALISE pragmas... +ds_val_bind (NonRecursive, hsbinds) body + | [L _ (AbsBinds [] [] exports binds)] <- bagToList hsbinds, + (L loc bind : null_binds) <- bagToList binds, + isBangHsBind bind + || isUnboxedTupleBind bind + || or [isUnLiftedType (idType g) | (_, g, _, _) <- exports] + = let + body_w_exports = foldr bind_export body exports + bind_export (tvs, g, l, _) body = ASSERT( null tvs ) + bindNonRec g (Var l) body + in + ASSERT (null null_binds) + -- Non-recursive, non-overloaded bindings only come in ones + -- ToDo: in some bizarre case it's conceivable that there + -- could be dict binds in the 'binds'. (See the notes + -- below. Then pattern-match would fail. Urk.) 
+ putSrcSpanDs loc $ + case bind of + FunBind { fun_id = L _ fun, fun_matches = matches, fun_co_fn = co_fn } + -> matchWrapper (FunRhs (idName fun)) matches `thenDs` \ (args, rhs) -> + ASSERT( null args ) -- Functions aren't lifted + ASSERT( isIdCoercion co_fn ) + returnDs (bindNonRec fun rhs body_w_exports) + + PatBind {pat_lhs = pat, pat_rhs = grhss, pat_rhs_ty = ty } + -> -- let C x# y# = rhs in body + -- ==> case rhs of C x# y# -> body + putSrcSpanDs loc $ + do { rhs <- dsGuarded grhss ty + ; let upat = unLoc pat + eqn = EqnInfo { eqn_wrap = idWrapper, eqn_pats = [upat], + eqn_rhs = cantFailMatchResult body_w_exports } + ; var <- selectMatchVar upat ty + ; result <- matchEquations PatBindRhs [var] [eqn] (exprType body) + ; return (scrungleMatch var rhs result) } + + other -> pprPanic "dsLet: unlifted" (pprLHsBinds hsbinds $$ ppr body) + + +-- Ordinary case for bindings; none should be unlifted +ds_val_bind (is_rec, binds) body + = do { prs <- dsLHsBinds binds + ; ASSERT( not (any (isUnLiftedType . idType . fst) prs) ) + case prs of + [] -> return body + other -> return (Let (Rec prs) body) } + -- Use a Rec regardless of is_rec. + -- Why? Because it allows the binds to be all + -- mixed up, which is what happens in one rare case + -- Namely, for an AbsBind with no tyvars and no dicts, + -- but which does have dictionary bindings. + -- See notes with TcSimplify.inferLoop [NO TYVARS] + -- It turned out that wrapping a Rec here was the easiest solution + -- + -- NB The previous case dealt with unlifted bindings, so we + -- only have to deal with lifted ones now; so Rec is ok + +isUnboxedTupleBind :: HsBind Id -> Bool +isUnboxedTupleBind (PatBind { pat_rhs_ty = ty }) = isUnboxedTupleType ty +isUnboxedTupleBind other = False + +scrungleMatch :: Id -> CoreExpr -> CoreExpr -> CoreExpr +-- Returns something like (let var = scrut in body) +-- but if var is an unboxed-tuple type, it inlines it in a fragile way +-- Special case to handle unboxed tuple patterns; they can't appear nested +-- The idea is that +-- case e of (# p1, p2 #) -> rhs +-- should desugar to +-- case e of (# x1, x2 #) -> ... match p1, p2 ... +-- NOT +-- let x = e in case x of .... +-- +-- But there may be a big +-- let fail = ... in case e of ... +-- wrapping the whole case, which complicates matters slightly +-- It all seems a bit fragile. Test is dsrun013. 
+ +scrungleMatch var scrut body + | isUnboxedTupleType (idType var) = scrungle body + | otherwise = bindNonRec var scrut body + where + scrungle (Case (Var x) bndr ty alts) + | x == var = Case scrut bndr ty alts + scrungle (Let binds body) = Let binds (scrungle body) + scrungle other = panic ("scrungleMatch: tuple pattern:\n" ++ showSDoc (ppr other)) +\end{code} + +%************************************************************************ +%* * +\subsection[DsExpr-vars-and-cons]{Variables, constructors, literals} +%* * +%************************************************************************ + +\begin{code} +dsLExpr :: LHsExpr Id -> DsM CoreExpr +dsLExpr (L loc e) = putSrcSpanDs loc $ dsExpr e + +dsExpr :: HsExpr Id -> DsM CoreExpr + +dsExpr (HsPar e) = dsLExpr e +dsExpr (ExprWithTySigOut e _) = dsLExpr e +dsExpr (HsVar var) = returnDs (Var var) +dsExpr (HsIPVar ip) = returnDs (Var (ipNameName ip)) +dsExpr (HsLit lit) = dsLit lit +dsExpr (HsOverLit lit) = dsOverLit lit + +dsExpr (NegApp expr neg_expr) + = do { core_expr <- dsLExpr expr + ; core_neg <- dsExpr neg_expr + ; return (core_neg `App` core_expr) } + +dsExpr expr@(HsLam a_Match) + = matchWrapper LambdaExpr a_Match `thenDs` \ (binders, matching_code) -> + returnDs (mkLams binders matching_code) + +#if defined(GHCI) && defined(BREAKPOINT) +dsExpr (HsApp (L _ (HsApp realFun@(L _ (HsCoerce _ fun)) (L loc arg))) _) + | HsVar funId <- fun + , idName funId == breakpointJumpName + , ids <- filter (not.hasTyVar.idType) (extractIds arg) + = do dsWarn (text "Extracted ids:" <+> ppr ids <+> ppr (map idType ids)) + stablePtr <- ioToIOEnv $ newStablePtr ids + -- Yes, I know... I'm gonna burn in hell. + let Ptr addr# = castStablePtrToPtr stablePtr + funCore <- dsLExpr realFun + argCore <- dsLExpr (L loc (HsLit (HsInt (fromIntegral (I# (addr2Int# addr#)))))) + hvalCore <- dsLExpr (L loc (extractHVals ids)) + return ((funCore `App` argCore) `App` hvalCore) + where extractIds :: HsExpr Id -> [Id] + extractIds (HsApp fn arg) + | HsVar argId <- unLoc arg + = argId:extractIds (unLoc fn) + | TyApp arg' ts <- unLoc arg + , HsVar argId <- unLoc arg' + = error (showSDoc (ppr ts)) -- argId:extractIds (unLoc fn) + extractIds x = [] + extractHVals ids = ExplicitList unitTy (map (L loc . HsVar) ids) + hasTyVar (TyVarTy _) = True + hasTyVar (FunTy a b) = hasTyVar a || hasTyVar b + hasTyVar (NoteTy _ t) = hasTyVar t + hasTyVar (AppTy a b) = hasTyVar a || hasTyVar b + hasTyVar (TyConApp _ ts) = any hasTyVar ts + hasTyVar _ = False +#endif + +dsExpr expr@(HsApp fun arg) + = dsLExpr fun `thenDs` \ core_fun -> + dsLExpr arg `thenDs` \ core_arg -> + returnDs (core_fun `App` core_arg) +\end{code} + +Operator sections. At first it looks as if we can convert +\begin{verbatim} + (expr op) +\end{verbatim} +to +\begin{verbatim} + \x -> op expr x +\end{verbatim} + +But no! expr might be a redex, and we can lose laziness badly this +way. Consider +\begin{verbatim} + map (expr op) xs +\end{verbatim} +for example. So we convert instead to +\begin{verbatim} + let y = expr in \x -> op y x +\end{verbatim} +If \tr{expr} is actually just a variable, say, then the simplifier +will sort it out. 
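+
+For instance (a rough sketch of the scheme above, not actual compiler output),
+\begin{verbatim}
+	map (expensive x +) xs
+\end{verbatim}
+is desugared along the lines of
+\begin{verbatim}
+	let y = expensive x in map (\ z -> y + z) xs
+\end{verbatim}
+so that \tr{expensive x} is evaluated at most once, rather than once per
+element of \tr{xs}.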
+ +\begin{code} +dsExpr (OpApp e1 op _ e2) + = dsLExpr op `thenDs` \ core_op -> + -- for the type of y, we need the type of op's 2nd argument + dsLExpr e1 `thenDs` \ x_core -> + dsLExpr e2 `thenDs` \ y_core -> + returnDs (mkApps core_op [x_core, y_core]) + +dsExpr (SectionL expr op) + = dsLExpr op `thenDs` \ core_op -> + -- for the type of y, we need the type of op's 2nd argument + let + (x_ty:y_ty:_, _) = splitFunTys (exprType core_op) + -- Must look through an implicit-parameter type; + -- newtype impossible; hence Type.splitFunTys + in + dsLExpr expr `thenDs` \ x_core -> + newSysLocalDs x_ty `thenDs` \ x_id -> + newSysLocalDs y_ty `thenDs` \ y_id -> + + returnDs (bindNonRec x_id x_core $ + Lam y_id (mkApps core_op [Var x_id, Var y_id])) + +-- dsLExpr (SectionR op expr) -- \ x -> op x expr +dsExpr (SectionR op expr) + = dsLExpr op `thenDs` \ core_op -> + -- for the type of x, we need the type of op's 2nd argument + let + (x_ty:y_ty:_, _) = splitFunTys (exprType core_op) + -- See comment with SectionL + in + dsLExpr expr `thenDs` \ y_core -> + newSysLocalDs x_ty `thenDs` \ x_id -> + newSysLocalDs y_ty `thenDs` \ y_id -> + + returnDs (bindNonRec y_id y_core $ + Lam x_id (mkApps core_op [Var x_id, Var y_id])) + +dsExpr (HsSCC cc expr) + = dsLExpr expr `thenDs` \ core_expr -> + getModuleDs `thenDs` \ mod_name -> + returnDs (Note (SCC (mkUserCC cc mod_name)) core_expr) + + +-- hdaume: core annotation + +dsExpr (HsCoreAnn fs expr) + = dsLExpr expr `thenDs` \ core_expr -> + returnDs (Note (CoreNote $ unpackFS fs) core_expr) + +dsExpr (HsCase discrim matches) + = dsLExpr discrim `thenDs` \ core_discrim -> + matchWrapper CaseAlt matches `thenDs` \ ([discrim_var], matching_code) -> + returnDs (scrungleMatch discrim_var core_discrim matching_code) + +dsExpr (HsLet binds body) + = dsLExpr body `thenDs` \ body' -> + dsLocalBinds binds body' + +-- We need the `ListComp' form to use `deListComp' (rather than the "do" form) +-- because the interpretation of `stmts' depends on what sort of thing it is. 
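+-- As an informal reminder (illustration only, not the code deListComp
+-- actually emits): a comprehension such as  [ e | x <- l, guard ]  means
+-- roughly  concatMap (\ x -> if guard then [e] else []) l,
+-- as in the Haskell report translation.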
+-- +dsExpr (HsDo ListComp stmts body result_ty) + = -- Special case for list comprehensions + dsListComp stmts body elt_ty + where + [elt_ty] = tcTyConAppArgs result_ty + +dsExpr (HsDo DoExpr stmts body result_ty) + = dsDo stmts body result_ty + +dsExpr (HsDo (MDoExpr tbl) stmts body result_ty) + = dsMDo tbl stmts body result_ty + +dsExpr (HsDo PArrComp stmts body result_ty) + = -- Special case for array comprehensions + dsPArrComp (map unLoc stmts) body elt_ty + where + [elt_ty] = tcTyConAppArgs result_ty + +dsExpr (HsIf guard_expr then_expr else_expr) + = dsLExpr guard_expr `thenDs` \ core_guard -> + dsLExpr then_expr `thenDs` \ core_then -> + dsLExpr else_expr `thenDs` \ core_else -> + returnDs (mkIfThenElse core_guard core_then core_else) +\end{code} + + +\noindent +\underline{\bf Type lambda and application} +% ~~~~~~~~~~~~~~~~~~~~~~~~~~~ +\begin{code} +dsExpr (TyLam tyvars expr) + = dsLExpr expr `thenDs` \ core_expr -> + returnDs (mkLams tyvars core_expr) + +dsExpr (TyApp expr tys) + = dsLExpr expr `thenDs` \ core_expr -> + returnDs (mkTyApps core_expr tys) +\end{code} + + +\noindent +\underline{\bf Various data construction things} +% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +\begin{code} +dsExpr (ExplicitList ty xs) + = go xs + where + go [] = returnDs (mkNilExpr ty) + go (x:xs) = dsLExpr x `thenDs` \ core_x -> + go xs `thenDs` \ core_xs -> + returnDs (mkConsExpr ty core_x core_xs) + +-- we create a list from the array elements and convert them into a list using +-- `PrelPArr.toP' +-- +-- * the main disadvantage to this scheme is that `toP' traverses the list +-- twice: once to determine the length and a second time to put to elements +-- into the array; this inefficiency could be avoided by exposing some of +-- the innards of `PrelPArr' to the compiler (ie, have a `PrelPArrBase') so +-- that we can exploit the fact that we already know the length of the array +-- here at compile time +-- +dsExpr (ExplicitPArr ty xs) + = dsLookupGlobalId toPName `thenDs` \toP -> + dsExpr (ExplicitList ty xs) `thenDs` \coreList -> + returnDs (mkApps (Var toP) [Type ty, coreList]) + +dsExpr (ExplicitTuple expr_list boxity) + = mappM dsLExpr expr_list `thenDs` \ core_exprs -> + returnDs (mkConApp (tupleCon boxity (length expr_list)) + (map (Type . 
exprType) core_exprs ++ core_exprs)) + +dsExpr (ArithSeq expr (From from)) + = dsExpr expr `thenDs` \ expr2 -> + dsLExpr from `thenDs` \ from2 -> + returnDs (App expr2 from2) + +dsExpr (ArithSeq expr (FromTo from two)) + = dsExpr expr `thenDs` \ expr2 -> + dsLExpr from `thenDs` \ from2 -> + dsLExpr two `thenDs` \ two2 -> + returnDs (mkApps expr2 [from2, two2]) + +dsExpr (ArithSeq expr (FromThen from thn)) + = dsExpr expr `thenDs` \ expr2 -> + dsLExpr from `thenDs` \ from2 -> + dsLExpr thn `thenDs` \ thn2 -> + returnDs (mkApps expr2 [from2, thn2]) + +dsExpr (ArithSeq expr (FromThenTo from thn two)) + = dsExpr expr `thenDs` \ expr2 -> + dsLExpr from `thenDs` \ from2 -> + dsLExpr thn `thenDs` \ thn2 -> + dsLExpr two `thenDs` \ two2 -> + returnDs (mkApps expr2 [from2, thn2, two2]) + +dsExpr (PArrSeq expr (FromTo from two)) + = dsExpr expr `thenDs` \ expr2 -> + dsLExpr from `thenDs` \ from2 -> + dsLExpr two `thenDs` \ two2 -> + returnDs (mkApps expr2 [from2, two2]) + +dsExpr (PArrSeq expr (FromThenTo from thn two)) + = dsExpr expr `thenDs` \ expr2 -> + dsLExpr from `thenDs` \ from2 -> + dsLExpr thn `thenDs` \ thn2 -> + dsLExpr two `thenDs` \ two2 -> + returnDs (mkApps expr2 [from2, thn2, two2]) + +dsExpr (PArrSeq expr _) + = panic "DsExpr.dsExpr: Infinite parallel array!" + -- the parser shouldn't have generated it and the renamer and typechecker + -- shouldn't have let it through +\end{code} + +\noindent +\underline{\bf Record construction and update} +% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +For record construction we do this (assuming T has three arguments) +\begin{verbatim} + T { op2 = e } +==> + let err = /\a -> recConErr a + T (recConErr t1 "M.lhs/230/op1") + e + (recConErr t1 "M.lhs/230/op3") +\end{verbatim} +@recConErr@ then converts its arugment string into a proper message +before printing it as +\begin{verbatim} + M.lhs, line 230: missing field op1 was evaluated +\end{verbatim} + +We also handle @C{}@ as valid construction syntax for an unlabelled +constructor @C@, setting all of @C@'s fields to bottom. + +\begin{code} +dsExpr (RecordCon (L _ data_con_id) con_expr rbinds) + = dsExpr con_expr `thenDs` \ con_expr' -> + let + (arg_tys, _) = tcSplitFunTys (exprType con_expr') + -- A newtype in the corner should be opaque; + -- hence TcType.tcSplitFunTys + + mk_arg (arg_ty, lbl) -- Selector id has the field label as its name + = case [rhs | (L _ sel_id, rhs) <- rbinds, lbl == idName sel_id] of + (rhs:rhss) -> ASSERT( null rhss ) + dsLExpr rhs + [] -> mkErrorAppDs rEC_CON_ERROR_ID arg_ty (showSDoc (ppr lbl)) + unlabelled_bottom arg_ty = mkErrorAppDs rEC_CON_ERROR_ID arg_ty "" + + labels = dataConFieldLabels (idDataCon data_con_id) + -- The data_con_id is guaranteed to be the wrapper id of the constructor + in + + (if null labels + then mappM unlabelled_bottom arg_tys + else mappM mk_arg (zipEqual "dsExpr:RecordCon" arg_tys labels)) + `thenDs` \ con_args -> + + returnDs (mkApps con_expr' con_args) +\end{code} + +Record update is a little harder. 
Suppose we have the decl: +\begin{verbatim} + data T = T1 {op1, op2, op3 :: Int} + | T2 {op4, op2 :: Int} + | T3 +\end{verbatim} +Then we translate as follows: +\begin{verbatim} + r { op2 = e } +===> + let op2 = e in + case r of + T1 op1 _ op3 -> T1 op1 op2 op3 + T2 op4 _ -> T2 op4 op2 + other -> recUpdError "M.lhs/230" +\end{verbatim} +It's important that we use the constructor Ids for @T1@, @T2@ etc on the +RHSs, and do not generate a Core constructor application directly, because the constructor +might do some argument-evaluation first; and may have to throw away some +dictionaries. + +\begin{code} +dsExpr (RecordUpd record_expr [] record_in_ty record_out_ty) + = dsLExpr record_expr + +dsExpr expr@(RecordUpd record_expr rbinds record_in_ty record_out_ty) + = dsLExpr record_expr `thenDs` \ record_expr' -> + + -- Desugar the rbinds, and generate let-bindings if + -- necessary so that we don't lose sharing + + let + in_inst_tys = tcTyConAppArgs record_in_ty -- Newtype opaque + out_inst_tys = tcTyConAppArgs record_out_ty -- Newtype opaque + in_out_ty = mkFunTy record_in_ty record_out_ty + + mk_val_arg field old_arg_id + = case [rhs | (L _ sel_id, rhs) <- rbinds, field == idName sel_id] of + (rhs:rest) -> ASSERT(null rest) rhs + [] -> nlHsVar old_arg_id + + mk_alt con + = newSysLocalsDs (dataConInstOrigArgTys con in_inst_tys) `thenDs` \ arg_ids -> + -- This call to dataConInstOrigArgTys won't work for existentials + -- but existentials don't have record types anyway + let + val_args = zipWithEqual "dsExpr:RecordUpd" mk_val_arg + (dataConFieldLabels con) arg_ids + rhs = foldl (\a b -> nlHsApp a b) + (noLoc $ TyApp (nlHsVar (dataConWrapId con)) + out_inst_tys) + val_args + in + returnDs (mkSimpleMatch [noLoc $ ConPatOut (noLoc con) [] [] emptyLHsBinds + (PrefixCon (map nlVarPat arg_ids)) record_in_ty] + rhs) + in + -- Record stuff doesn't work for existentials + -- The type checker checks for this, but we need + -- worry only about the constructors that are to be updated + ASSERT2( all isVanillaDataCon cons_to_upd, ppr expr ) + + -- It's important to generate the match with matchWrapper, + -- and the right hand sides with applications of the wrapper Id + -- so that everything works when we are doing fancy unboxing on the + -- constructor aguments. + mappM mk_alt cons_to_upd `thenDs` \ alts -> + matchWrapper RecUpd (MatchGroup alts in_out_ty) `thenDs` \ ([discrim_var], matching_code) -> + + returnDs (bindNonRec discrim_var record_expr' matching_code) + + where + updated_fields :: [FieldLabel] + updated_fields = [ idName sel_id | (L _ sel_id,_) <- rbinds] + + -- Get the type constructor from the record_in_ty + -- so that we are sure it'll have all its DataCons + -- (In GHCI, it's possible that some TyCons may not have all + -- their constructors, in a module-loop situation.) + tycon = tcTyConAppTyCon record_in_ty + data_cons = tyConDataCons tycon + cons_to_upd = filter has_all_fields data_cons + + has_all_fields :: DataCon -> Bool + has_all_fields con_id + = all (`elem` con_fields) updated_fields + where + con_fields = dataConFieldLabels con_id +\end{code} + + +\noindent +\underline{\bf Dictionary lambda and application} +% ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +@DictLam@ and @DictApp@ turn into the regular old things. +(OLD:) @DictFunApp@ also becomes a curried application, albeit slightly more +complicated; reminiscent of fully-applied constructors. 
+\begin{code} +dsExpr (DictLam dictvars expr) + = dsLExpr expr `thenDs` \ core_expr -> + returnDs (mkLams dictvars core_expr) + +------------------ + +dsExpr (DictApp expr dicts) -- becomes a curried application + = dsLExpr expr `thenDs` \ core_expr -> + returnDs (foldl (\f d -> f `App` (Var d)) core_expr dicts) + +dsExpr (HsCoerce co_fn e) = dsCoercion co_fn (dsExpr e) +\end{code} + +Here is where we desugar the Template Haskell brackets and escapes + +\begin{code} +-- Template Haskell stuff + +#ifdef GHCI /* Only if bootstrapping */ +dsExpr (HsBracketOut x ps) = dsBracket x ps +dsExpr (HsSpliceE s) = pprPanic "dsExpr:splice" (ppr s) +#endif + +-- Arrow notation extension +dsExpr (HsProc pat cmd) = dsProcExpr pat cmd +\end{code} + + +\begin{code} + +#ifdef DEBUG +-- HsSyn constructs that just shouldn't be here: +dsExpr (ExprWithTySig _ _) = panic "dsExpr:ExprWithTySig" +#endif + +\end{code} + +%-------------------------------------------------------------------- + +Desugar 'do' and 'mdo' expressions (NOT list comprehensions, they're +handled in DsListComp). Basically does the translation given in the +Haskell 98 report: + +\begin{code} +dsDo :: [LStmt Id] + -> LHsExpr Id + -> Type -- Type of the whole expression + -> DsM CoreExpr + +dsDo stmts body result_ty + = go (map unLoc stmts) + where + go [] = dsLExpr body + + go (ExprStmt rhs then_expr _ : stmts) + = do { rhs2 <- dsLExpr rhs + ; then_expr2 <- dsExpr then_expr + ; rest <- go stmts + ; returnDs (mkApps then_expr2 [rhs2, rest]) } + + go (LetStmt binds : stmts) + = do { rest <- go stmts + ; dsLocalBinds binds rest } + + go (BindStmt pat rhs bind_op fail_op : stmts) + = do { body <- go stmts + ; var <- selectSimpleMatchVarL pat + ; match <- matchSinglePat (Var var) (StmtCtxt DoExpr) pat + result_ty (cantFailMatchResult body) + ; match_code <- handle_failure pat match fail_op + ; rhs' <- dsLExpr rhs + ; bind_op' <- dsExpr bind_op + ; returnDs (mkApps bind_op' [rhs', Lam var match_code]) } + + -- In a do expression, pattern-match failure just calls + -- the monadic 'fail' rather than throwing an exception + handle_failure pat match fail_op + | matchCanFail match + = do { fail_op' <- dsExpr fail_op + ; fail_msg <- mkStringExpr (mk_fail_msg pat) + ; extractMatchResult match (App fail_op' fail_msg) } + | otherwise + = extractMatchResult match (error "It can't fail") + +mk_fail_msg pat = "Pattern match failure in do expression at " ++ + showSDoc (ppr (getLoc pat)) +\end{code} + +Translation for RecStmt's: +----------------------------- +We turn (RecStmt [v1,..vn] stmts) into: + + (v1,..,vn) <- mfix (\~(v1,..vn). 
do stmts + return (v1,..vn)) + +\begin{code} +dsMDo :: PostTcTable + -> [LStmt Id] + -> LHsExpr Id + -> Type -- Type of the whole expression + -> DsM CoreExpr + +dsMDo tbl stmts body result_ty + = go (map unLoc stmts) + where + (m_ty, b_ty) = tcSplitAppTy result_ty -- result_ty must be of the form (m b) + mfix_id = lookupEvidence tbl mfixName + return_id = lookupEvidence tbl returnMName + bind_id = lookupEvidence tbl bindMName + then_id = lookupEvidence tbl thenMName + fail_id = lookupEvidence tbl failMName + ctxt = MDoExpr tbl + + go [] = dsLExpr body + + go (LetStmt binds : stmts) + = do { rest <- go stmts + ; dsLocalBinds binds rest } + + go (ExprStmt rhs _ rhs_ty : stmts) + = do { rhs2 <- dsLExpr rhs + ; rest <- go stmts + ; returnDs (mkApps (Var then_id) [Type rhs_ty, Type b_ty, rhs2, rest]) } + + go (BindStmt pat rhs _ _ : stmts) + = do { body <- go stmts + ; var <- selectSimpleMatchVarL pat + ; match <- matchSinglePat (Var var) (StmtCtxt ctxt) pat + result_ty (cantFailMatchResult body) + ; fail_msg <- mkStringExpr (mk_fail_msg pat) + ; let fail_expr = mkApps (Var fail_id) [Type b_ty, fail_msg] + ; match_code <- extractMatchResult match fail_expr + + ; rhs' <- dsLExpr rhs + ; returnDs (mkApps (Var bind_id) [Type (hsPatType pat), Type b_ty, + rhs', Lam var match_code]) } + + go (RecStmt rec_stmts later_ids rec_ids rec_rets binds : stmts) + = ASSERT( length rec_ids > 0 ) + ASSERT( length rec_ids == length rec_rets ) + go (new_bind_stmt : let_stmt : stmts) + where + new_bind_stmt = mkBindStmt (mk_tup_pat later_pats) mfix_app + let_stmt = LetStmt (HsValBinds (ValBindsOut [(Recursive, binds)] [])) + + + -- Remove the later_ids that appear (without fancy coercions) + -- in rec_rets, because there's no need to knot-tie them separately + -- See Note [RecStmt] in HsExpr + later_ids' = filter (`notElem` mono_rec_ids) later_ids + mono_rec_ids = [ id | HsVar id <- rec_rets ] + + mfix_app = nlHsApp (noLoc $ TyApp (nlHsVar mfix_id) [tup_ty]) mfix_arg + mfix_arg = noLoc $ HsLam (MatchGroup [mkSimpleMatch [mfix_pat] body] + (mkFunTy tup_ty body_ty)) + + -- The rec_tup_pat must bind the rec_ids only; remember that the + -- trimmed_laters may share the same Names + -- Meanwhile, the later_pats must bind the later_vars + rec_tup_pats = map mk_wild_pat later_ids' ++ map nlVarPat rec_ids + later_pats = map nlVarPat later_ids' ++ map mk_later_pat rec_ids + rets = map nlHsVar later_ids' ++ map noLoc rec_rets + + mfix_pat = noLoc $ LazyPat $ mk_tup_pat rec_tup_pats + body = noLoc $ HsDo ctxt rec_stmts return_app body_ty + body_ty = mkAppTy m_ty tup_ty + tup_ty = mkCoreTupTy (map idType (later_ids' ++ rec_ids)) + -- mkCoreTupTy deals with singleton case + + return_app = nlHsApp (noLoc $ TyApp (nlHsVar return_id) [tup_ty]) + (mk_ret_tup rets) + + mk_wild_pat :: Id -> LPat Id + mk_wild_pat v = noLoc $ WildPat $ idType v + + mk_later_pat :: Id -> LPat Id + mk_later_pat v | v `elem` later_ids' = mk_wild_pat v + | otherwise = nlVarPat v + + mk_tup_pat :: [LPat Id] -> LPat Id + mk_tup_pat [p] = p + mk_tup_pat ps = noLoc $ mkVanillaTuplePat ps Boxed + + mk_ret_tup :: [LHsExpr Id] -> LHsExpr Id + mk_ret_tup [r] = r + mk_ret_tup rs = noLoc $ ExplicitTuple rs Boxed +\end{code} diff --git a/compiler/deSugar/DsExpr.lhs-boot b/compiler/deSugar/DsExpr.lhs-boot new file mode 100644 index 0000000000..c65e99d80d --- /dev/null +++ b/compiler/deSugar/DsExpr.lhs-boot @@ -0,0 +1,11 @@ +\begin{code} +module DsExpr where +import HsSyn ( HsExpr, LHsExpr, HsLocalBinds ) +import Var ( Id ) +import DsMonad ( DsM ) +import CoreSyn ( 
CoreExpr ) + +dsExpr :: HsExpr Id -> DsM CoreExpr +dsLExpr :: LHsExpr Id -> DsM CoreExpr +dsLocalBinds :: HsLocalBinds Id -> CoreExpr -> DsM CoreExpr +\end{code} diff --git a/compiler/deSugar/DsForeign.lhs b/compiler/deSugar/DsForeign.lhs new file mode 100644 index 0000000000..52956a09ff --- /dev/null +++ b/compiler/deSugar/DsForeign.lhs @@ -0,0 +1,646 @@ +% +% (c) The AQUA Project, Glasgow University, 1998 +% +\section[DsCCall]{Desugaring \tr{foreign} declarations} + +Expanding out @foreign import@ and @foreign export@ declarations. + +\begin{code} +module DsForeign ( dsForeigns ) where + +#include "HsVersions.h" +import TcRnMonad -- temp + +import CoreSyn + +import DsCCall ( dsCCall, mkFCall, boxResult, unboxArg, resultWrapper ) +import DsMonad + +import HsSyn ( ForeignDecl(..), ForeignExport(..), LForeignDecl, + ForeignImport(..), CImportSpec(..) ) +import DataCon ( splitProductType_maybe ) +#ifdef DEBUG +import DataCon ( dataConSourceArity ) +import Type ( isUnLiftedType ) +#endif +import MachOp ( machRepByteWidth, MachRep(..) ) +import SMRep ( argMachRep, typeCgRep ) +import CoreUtils ( exprType, mkInlineMe ) +import Id ( Id, idType, idName, mkSysLocal, setInlinePragma ) +import Literal ( Literal(..), mkStringLit ) +import Module ( moduleFS ) +import Name ( getOccString, NamedThing(..) ) +import Type ( repType, coreEqType ) +import TcType ( Type, mkFunTys, mkForAllTys, mkTyConApp, + mkFunTy, tcSplitTyConApp_maybe, + tcSplitForAllTys, tcSplitFunTys, tcTyConAppArgs, + ) + +import BasicTypes ( Boxity(..) ) +import HscTypes ( ForeignStubs(..) ) +import ForeignCall ( ForeignCall(..), CCallSpec(..), + Safety(..), playSafe, + CExportSpec(..), CLabelString, + CCallConv(..), ccallConvToInt, + ccallConvAttribute + ) +import TysWiredIn ( unitTy, tupleTyCon ) +import TysPrim ( addrPrimTy, mkStablePtrPrimTy, alphaTy ) +import PrelNames ( hasKey, ioTyConKey, stablePtrTyConName, newStablePtrName, bindIOName, + checkDotnetResName ) +import BasicTypes ( Activation( NeverActive ) ) +import SrcLoc ( Located(..), unLoc ) +import Outputable +import Maybe ( fromJust, isNothing ) +import FastString +\end{code} + +Desugaring of @foreign@ declarations is naturally split up into +parts, an @import@ and an @export@ part. A @foreign import@ +declaration +\begin{verbatim} + foreign import cc nm f :: prim_args -> IO prim_res +\end{verbatim} +is the same as +\begin{verbatim} + f :: prim_args -> IO prim_res + f a1 ... an = _ccall_ nm cc a1 ... an +\end{verbatim} +so we reuse the desugaring code in @DsCCall@ to deal with these. 
+ +\begin{code} +type Binding = (Id, CoreExpr) -- No rec/nonrec structure; + -- the occurrence analyser will sort it all out + +dsForeigns :: [LForeignDecl Id] + -> DsM (ForeignStubs, [Binding]) +dsForeigns [] + = returnDs (NoStubs, []) +dsForeigns fos + = foldlDs combine (ForeignStubs empty empty [] [], []) fos + where + combine stubs (L loc decl) = putSrcSpanDs loc (combine1 stubs decl) + + combine1 (ForeignStubs acc_h acc_c acc_hdrs acc_feb, acc_f) + (ForeignImport id _ spec depr) + = traceIf (text "fi start" <+> ppr id) `thenDs` \ _ -> + dsFImport (unLoc id) spec `thenDs` \ (bs, h, c, mbhd) -> + warnDepr depr `thenDs` \ _ -> + traceIf (text "fi end" <+> ppr id) `thenDs` \ _ -> + returnDs (ForeignStubs (h $$ acc_h) + (c $$ acc_c) + (addH mbhd acc_hdrs) + acc_feb, + bs ++ acc_f) + + combine1 (ForeignStubs acc_h acc_c acc_hdrs acc_feb, acc_f) + (ForeignExport (L _ id) _ (CExport (CExportStatic ext_nm cconv)) depr) + = dsFExport id (idType id) + ext_nm cconv False `thenDs` \(h, c, _, _) -> + warnDepr depr `thenDs` \_ -> + returnDs (ForeignStubs (h $$ acc_h) (c $$ acc_c) acc_hdrs (id:acc_feb), + acc_f) + + addH Nothing ls = ls + addH (Just e) ls + | e `elem` ls = ls + | otherwise = e:ls + + warnDepr False = returnDs () + warnDepr True = dsWarn msg + where + msg = ptext SLIT("foreign declaration uses deprecated non-standard syntax") +\end{code} + + +%************************************************************************ +%* * +\subsection{Foreign import} +%* * +%************************************************************************ + +Desugaring foreign imports is just the matter of creating a binding +that on its RHS unboxes its arguments, performs the external call +(using the @CCallOp@ primop), before boxing the result up and returning it. + +However, we create a worker/wrapper pair, thus: + + foreign import f :: Int -> IO Int +==> + f x = IO ( \s -> case x of { I# x# -> + case fw s x# of { (# s1, y# #) -> + (# s1, I# y# #)}}) + + fw s x# = ccall f s x# + +The strictness/CPR analyser won't do this automatically because it doesn't look +inside returned tuples; but inlining this wrapper is a Really Good Idea +because it exposes the boxing to the call site. 
+ +\begin{code} +dsFImport :: Id + -> ForeignImport + -> DsM ([Binding], SDoc, SDoc, Maybe FastString) +dsFImport id (CImport cconv safety header lib spec) + = dsCImport id spec cconv safety no_hdrs `thenDs` \(ids, h, c) -> + returnDs (ids, h, c, if no_hdrs then Nothing else Just header) + where + no_hdrs = nullFS header + + -- FIXME: the `lib' field is needed for .NET ILX generation when invoking + -- routines that are external to the .NET runtime, but GHC doesn't + -- support such calls yet; if `nullFastString lib', the value was not given +dsFImport id (DNImport spec) + = dsFCall id (DNCall spec) True {- No headers -} `thenDs` \(ids, h, c) -> + returnDs (ids, h, c, Nothing) + +dsCImport :: Id + -> CImportSpec + -> CCallConv + -> Safety + -> Bool -- True <=> no headers in the f.i decl + -> DsM ([Binding], SDoc, SDoc) +dsCImport id (CLabel cid) _ _ no_hdrs + = resultWrapper (idType id) `thenDs` \ (resTy, foRhs) -> + ASSERT(fromJust resTy `coreEqType` addrPrimTy) -- typechecker ensures this + let rhs = foRhs (mkLit (MachLabel cid Nothing)) in + returnDs ([(setImpInline no_hdrs id, rhs)], empty, empty) +dsCImport id (CFunction target) cconv safety no_hdrs + = dsFCall id (CCall (CCallSpec target cconv safety)) no_hdrs +dsCImport id CWrapper cconv _ _ + = dsFExportDynamic id cconv + +setImpInline :: Bool -- True <=> No #include headers + -- in the foreign import declaration + -> Id -> Id +-- If there is a #include header in the foreign import +-- we make the worker non-inlinable, because we currently +-- don't keep the #include stuff in the CCallId, and hence +-- it won't be visible in the importing module, which can be +-- fatal. +-- (The #include stuff is just collected from the foreign import +-- decls in a module.) +-- If you want to do cross-module inlining of the c-calls themselves, +-- put the #include stuff in the package spec, not the foreign +-- import decl. 
+setImpInline True id = id +setImpInline False id = id `setInlinePragma` NeverActive +\end{code} + + +%************************************************************************ +%* * +\subsection{Foreign calls} +%* * +%************************************************************************ + +\begin{code} +dsFCall fn_id fcall no_hdrs + = let + ty = idType fn_id + (tvs, fun_ty) = tcSplitForAllTys ty + (arg_tys, io_res_ty) = tcSplitFunTys fun_ty + -- Must use tcSplit* functions because we want to + -- see that (IO t) in the corner + in + newSysLocalsDs arg_tys `thenDs` \ args -> + mapAndUnzipDs unboxArg (map Var args) `thenDs` \ (val_args, arg_wrappers) -> + + let + work_arg_ids = [v | Var v <- val_args] -- All guaranteed to be vars + + forDotnet = + case fcall of + DNCall{} -> True + _ -> False + + topConDs + | forDotnet = + dsLookupGlobalId checkDotnetResName `thenDs` \ check_id -> + return (Just check_id) + | otherwise = return Nothing + + augmentResultDs + | forDotnet = + newSysLocalDs addrPrimTy `thenDs` \ err_res -> + returnDs (\ (mb_res_ty, resWrap) -> + case mb_res_ty of + Nothing -> (Just (mkTyConApp (tupleTyCon Unboxed 1) + [ addrPrimTy ]), + resWrap) + Just x -> (Just (mkTyConApp (tupleTyCon Unboxed 2) + [ x, addrPrimTy ]), + resWrap)) + | otherwise = returnDs id + in + augmentResultDs `thenDs` \ augment -> + topConDs `thenDs` \ topCon -> + boxResult augment topCon io_res_ty `thenDs` \ (ccall_result_ty, res_wrapper) -> + + newUnique `thenDs` \ ccall_uniq -> + newUnique `thenDs` \ work_uniq -> + let + -- Build the worker + worker_ty = mkForAllTys tvs (mkFunTys (map idType work_arg_ids) ccall_result_ty) + the_ccall_app = mkFCall ccall_uniq fcall val_args ccall_result_ty + work_rhs = mkLams tvs (mkLams work_arg_ids the_ccall_app) + work_id = setImpInline no_hdrs $ -- See comments with setImpInline + mkSysLocal FSLIT("$wccall") work_uniq worker_ty + + -- Build the wrapper + work_app = mkApps (mkVarApps (Var work_id) tvs) val_args + wrapper_body = foldr ($) (res_wrapper work_app) arg_wrappers + wrap_rhs = mkInlineMe (mkLams (tvs ++ args) wrapper_body) + in + returnDs ([(work_id, work_rhs), (fn_id, wrap_rhs)], empty, empty) + +unsafe_call (CCall (CCallSpec _ _ safety)) = playSafe safety +unsafe_call (DNCall _) = False +\end{code} + + +%************************************************************************ +%* * +\subsection{Foreign export} +%* * +%************************************************************************ + +The function that does most of the work for `@foreign export@' declarations. +(see below for the boilerplate code a `@foreign export@' declaration expands + into.) + +For each `@foreign export foo@' in a module M we generate: +\begin{itemize} +\item a C function `@foo@', which calls +\item a Haskell stub `@M.$ffoo@', which calls +\end{itemize} +the user-written Haskell function `@M.foo@'. 
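+
+As a rough orientation (the module @M@, the function @foo@ and its type below
+are purely illustrative; only the @Module_stub.h@/@Module_stub.c@ naming and
+the general shape are taken from the code that follows), a declaration like
+\begin{verbatim}
+  module M where
+  foreign export ccall "foo" foo :: Int -> IO Int
+\end{verbatim}
+gives rise to a C prototype in @M_stub.h@ and a C stub function in
+@M_stub.c@, roughly
+\begin{verbatim}
+  extern HsInt foo(HsInt a1);   /* M_stub.h */
+  HsInt foo(HsInt a1) {...}     /* M_stub.c; a fuller sketch is given
+                                   just before mkFExportCBits below  */
+\end{verbatim}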
+ +\begin{code} +dsFExport :: Id -- Either the exported Id, + -- or the foreign-export-dynamic constructor + -> Type -- The type of the thing callable from C + -> CLabelString -- The name to export to C land + -> CCallConv + -> Bool -- True => foreign export dynamic + -- so invoke IO action that's hanging off + -- the first argument's stable pointer + -> DsM ( SDoc -- contents of Module_stub.h + , SDoc -- contents of Module_stub.c + , [MachRep] -- primitive arguments expected by stub function + , Int -- size of args to stub function + ) + +dsFExport fn_id ty ext_name cconv isDyn + = + let + (_tvs,sans_foralls) = tcSplitForAllTys ty + (fe_arg_tys', orig_res_ty) = tcSplitFunTys sans_foralls + -- We must use tcSplits here, because we want to see + -- the (IO t) in the corner of the type! + fe_arg_tys | isDyn = tail fe_arg_tys' + | otherwise = fe_arg_tys' + in + -- Look at the result type of the exported function, orig_res_ty + -- If it's IO t, return (t, True) + -- If it's plain t, return (t, False) + (case tcSplitTyConApp_maybe orig_res_ty of + -- We must use tcSplit here so that we see the (IO t) in + -- the type. [IO t is transparent to plain splitTyConApp.] + + Just (ioTyCon, [res_ty]) + -> ASSERT( ioTyCon `hasKey` ioTyConKey ) + -- The function already returns IO t + returnDs (res_ty, True) + + other -> -- The function returns t + returnDs (orig_res_ty, False) + ) + `thenDs` \ (res_ty, -- t + is_IO_res_ty) -> -- Bool + returnDs $ + mkFExportCBits ext_name + (if isDyn then Nothing else Just fn_id) + fe_arg_tys res_ty is_IO_res_ty cconv +\end{code} + +@foreign export dynamic@ lets you dress up Haskell IO actions +of some fixed type behind an externally callable interface (i.e., +as a C function pointer). Useful for callbacks and stuff. + +\begin{verbatim} +foreign export dynamic f :: (Addr -> Int -> IO Int) -> IO Addr + +-- Haskell-visible constructor, which is generated from the above: +-- SUP: No check for NULL from createAdjustor anymore??? + +f :: (Addr -> Int -> IO Int) -> IO Addr +f cback = + bindIO (newStablePtr cback) + (\StablePtr sp# -> IO (\s1# -> + case _ccall_ createAdjustor cconv sp# ``f_helper'' s1# of + (# s2#, a# #) -> (# s2#, A# a# #))) + +foreign export "f_helper" f_helper :: StablePtr (Addr -> Int -> IO Int) -> Addr -> Int -> IO Int +-- `special' foreign export that invokes the closure pointed to by the +-- first argument. +\end{verbatim} + +\begin{code} +dsFExportDynamic :: Id + -> CCallConv + -> DsM ([Binding], SDoc, SDoc) +dsFExportDynamic id cconv + = newSysLocalDs ty `thenDs` \ fe_id -> + getModuleDs `thenDs` \ mod_name -> + let + -- hack: need to get at the name of the C stub we're about to generate. 
+ fe_nm = mkFastString (unpackFS (zEncodeFS (moduleFS mod_name)) ++ "_" ++ toCName fe_id) + in + newSysLocalDs arg_ty `thenDs` \ cback -> + dsLookupGlobalId newStablePtrName `thenDs` \ newStablePtrId -> + dsLookupTyCon stablePtrTyConName `thenDs` \ stable_ptr_tycon -> + let + mk_stbl_ptr_app = mkApps (Var newStablePtrId) [ Type arg_ty, Var cback ] + stable_ptr_ty = mkTyConApp stable_ptr_tycon [arg_ty] + export_ty = mkFunTy stable_ptr_ty arg_ty + in + dsLookupGlobalId bindIOName `thenDs` \ bindIOId -> + newSysLocalDs stable_ptr_ty `thenDs` \ stbl_value -> + dsFExport id export_ty fe_nm cconv True + `thenDs` \ (h_code, c_code, arg_reps, args_size) -> + let + stbl_app cont ret_ty = mkApps (Var bindIOId) + [ Type stable_ptr_ty + , Type ret_ty + , mk_stbl_ptr_app + , cont + ] + {- + The arguments to the external function which will + create a little bit of (template) code on the fly + for allowing the (stable pointed) Haskell closure + to be entered using an external calling convention + (stdcall, ccall). + -} + adj_args = [ mkIntLitInt (ccallConvToInt cconv) + , Var stbl_value + , mkLit (MachLabel fe_nm mb_sz_args) + , mkLit (mkStringLit arg_type_info) + ] + -- name of external entry point providing these services. + -- (probably in the RTS.) + adjustor = FSLIT("createAdjustor") + + arg_type_info = map repCharCode arg_reps + repCharCode F32 = 'f' + repCharCode F64 = 'd' + repCharCode I64 = 'l' + repCharCode _ = 'i' + + -- Determine the number of bytes of arguments to the stub function, + -- so that we can attach the '@N' suffix to its label if it is a + -- stdcall on Windows. + mb_sz_args = case cconv of + StdCallConv -> Just args_size + _ -> Nothing + + in + dsCCall adjustor adj_args PlayRisky io_res_ty `thenDs` \ ccall_adj -> + -- PlayRisky: the adjustor doesn't allocate in the Haskell heap or do a callback + let ccall_adj_ty = exprType ccall_adj + ccall_io_adj = mkLams [stbl_value] $ + Note (Coerce io_res_ty ccall_adj_ty) + ccall_adj + io_app = mkLams tvs $ + mkLams [cback] $ + stbl_app ccall_io_adj res_ty + fed = (id `setInlinePragma` NeverActive, io_app) + -- Never inline the f.e.d. function, because the litlit + -- might not be in scope in other modules. + in + returnDs ([fed], h_code, c_code) + + where + ty = idType id + (tvs,sans_foralls) = tcSplitForAllTys ty + ([arg_ty], io_res_ty) = tcSplitFunTys sans_foralls + [res_ty] = tcTyConAppArgs io_res_ty + -- Must use tcSplit* to see the (IO t), which is a newtype + +toCName :: Id -> String +toCName i = showSDoc (pprCode CStyle (ppr (idName i))) +\end{code} + +%* +% +\subsection{Generating @foreign export@ stubs} +% +%* + +For each @foreign export@ function, a C stub function is generated. +The C stub constructs the application of the exported Haskell function +using the hugs/ghc rts invocation API. 
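+
+To make that concrete, here is roughly what gets assembled for a static
+export @foo :: Int -> IO Int@ under the ccall convention (a sketch read off
+the code below; exact spacing, and the precise printed symbol used for
+@foo@'s closure, will differ):
+\begin{verbatim}
+  HsInt foo(HsInt a1)
+  {
+  Capability *cap;
+  HaskellObj ret;
+  HsInt cret;
+  cap = rts_lock();
+  cap=rts_evalIO(cap,
+        rts_apply(cap,(HaskellObj)runIO_closure,
+                  rts_apply(cap,&foo_closure,rts_mkInt(cap,a1))),&ret);
+  rts_checkSchedStatus("foo",cap);
+  cret=rts_getInt(ret);
+  rts_unlock(cap);
+  return cret;
+  }
+\end{verbatim}
+plus an @extern@ prototype for the header file and, at the end, a small
+@__attribute__((constructor))@ initialiser that registers a stable pointer
+to the closure (see @initialiser@ below).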
+ +\begin{code} +mkFExportCBits :: FastString + -> Maybe Id -- Just==static, Nothing==dynamic + -> [Type] + -> Type + -> Bool -- True <=> returns an IO type + -> CCallConv + -> (SDoc, + SDoc, + [MachRep], -- the argument reps + Int -- total size of arguments + ) +mkFExportCBits c_nm maybe_target arg_htys res_hty is_IO_res_ty cc + = (header_bits, c_bits, + [rep | (_,_,_,rep) <- arg_info], -- just the real args + sum [ machRepByteWidth rep | (_,_,_,rep) <- aug_arg_info] -- all the args + ) + where + -- list the arguments to the C function + arg_info :: [(SDoc, -- arg name + SDoc, -- C type + Type, -- Haskell type + MachRep)] -- the MachRep + arg_info = [ (text ('a':show n), showStgType ty, ty, + typeMachRep (getPrimTyOf ty)) + | (ty,n) <- zip arg_htys [1..] ] + + -- add some auxiliary args; the stable ptr in the wrapper case, and + -- a slot for the dummy return address in the wrapper + ccall case + aug_arg_info + | isNothing maybe_target = stable_ptr_arg : insertRetAddr cc arg_info + | otherwise = arg_info + + stable_ptr_arg = + (text "the_stableptr", text "StgStablePtr", undefined, + typeMachRep (mkStablePtrPrimTy alphaTy)) + + -- stuff to do with the return type of the C function + res_hty_is_unit = res_hty `coreEqType` unitTy -- Look through any newtypes + + cResType | res_hty_is_unit = text "void" + | otherwise = showStgType res_hty + + -- Now we can cook up the prototype for the exported function. + pprCconv = case cc of + CCallConv -> empty + StdCallConv -> text (ccallConvAttribute cc) + + header_bits = ptext SLIT("extern") <+> fun_proto <> semi + + fun_proto = cResType <+> pprCconv <+> ftext c_nm <> + parens (hsep (punctuate comma (map (\(nm,ty,_,_) -> ty <+> nm) + aug_arg_info))) + + -- the target which will form the root of what we ask rts_evalIO to run + the_cfun + = case maybe_target of + Nothing -> text "(StgClosure*)deRefStablePtr(the_stableptr)" + Just hs_fn -> char '&' <> ppr hs_fn <> text "_closure" + + cap = text "cap" <> comma + + -- the expression we give to rts_evalIO + expr_to_run + = foldl appArg the_cfun arg_info -- NOT aug_arg_info + where + appArg acc (arg_cname, _, arg_hty, _) + = text "rts_apply" + <> parens (cap <> acc <> comma <> mkHObj arg_hty <> parens (cap <> arg_cname)) + + -- various other bits for inside the fn + declareResult = text "HaskellObj ret;" + declareCResult | res_hty_is_unit = empty + | otherwise = cResType <+> text "cret;" + + assignCResult | res_hty_is_unit = empty + | otherwise = + text "cret=" <> unpackHObj res_hty <> parens (text "ret") <> semi + + -- an extern decl for the fn being called + extern_decl + = case maybe_target of + Nothing -> empty + Just hs_fn -> text "extern StgClosure " <> ppr hs_fn <> text "_closure" <> semi + + + -- Initialise foreign exports by registering a stable pointer from an + -- __attribute__((constructor)) function. + -- The alternative is to do this from stginit functions generated in + -- codeGen/CodeGen.lhs; however, stginit functions have a negative impact + -- on binary sizes and link times because the static linker will think that + -- all modules that are imported directly or indirectly are actually used by + -- the program. 
+ -- (this is bad for big umbrella modules like Graphics.Rendering.OpenGL) + + initialiser + = case maybe_target of + Nothing -> empty + Just hs_fn -> + vcat + [ text "static void stginit_export_" <> ppr hs_fn + <> text "() __attribute__((constructor));" + , text "static void stginit_export_" <> ppr hs_fn <> text "()" + , braces (text "getStablePtr" + <> parens (text "(StgPtr) &" <> ppr hs_fn <> text "_closure") + <> semi) + ] + + -- finally, the whole darn thing + c_bits = + space $$ + extern_decl $$ + fun_proto $$ + vcat + [ lbrace + , text "Capability *cap;" + , declareResult + , declareCResult + , text "cap = rts_lock();" + -- create the application + perform it. + , text "cap=rts_evalIO" <> parens ( + cap <> + text "rts_apply" <> parens ( + cap <> + text "(HaskellObj)" + <> text (if is_IO_res_ty + then "runIO_closure" + else "runNonIO_closure") + <> comma + <> expr_to_run + ) <+> comma + <> text "&ret" + ) <> semi + , text "rts_checkSchedStatus" <> parens (doubleQuotes (ftext c_nm) + <> comma <> text "cap") <> semi + , assignCResult + , text "rts_unlock(cap);" + , if res_hty_is_unit then empty + else text "return cret;" + , rbrace + ] $$ + initialiser + +-- NB. the calculation here isn't strictly speaking correct. +-- We have a primitive Haskell type (eg. Int#, Double#), and +-- we want to know the size, when passed on the C stack, of +-- the associated C type (eg. HsInt, HsDouble). We don't have +-- this information to hand, but we know what GHC's conventions +-- are for passing around the primitive Haskell types, so we +-- use that instead. I hope the two coincide --SDM +typeMachRep ty = argMachRep (typeCgRep ty) + +mkHObj :: Type -> SDoc +mkHObj t = text "rts_mk" <> text (showFFIType t) + +unpackHObj :: Type -> SDoc +unpackHObj t = text "rts_get" <> text (showFFIType t) + +showStgType :: Type -> SDoc +showStgType t = text "Hs" <> text (showFFIType t) + +showFFIType :: Type -> String +showFFIType t = getOccString (getName tc) + where + tc = case tcSplitTyConApp_maybe (repType t) of + Just (tc,_) -> tc + Nothing -> pprPanic "showFFIType" (ppr t) + +#if !defined(x86_64_TARGET_ARCH) +insertRetAddr CCallConv args = ret_addr_arg : args +insertRetAddr _ args = args +#else +-- On x86_64 we insert the return address after the 6th +-- integer argument, because this is the point at which we +-- need to flush a register argument to the stack (See rts/Adjustor.c for +-- details). +insertRetAddr CCallConv args = go 0 args + where go 6 args = ret_addr_arg : args + go n (arg@(_,_,_,rep):args) + | I64 <- rep = arg : go (n+1) args + | otherwise = arg : go n args + go n [] = [] +insertRetAddr _ args = args +#endif + +ret_addr_arg = (text "original_return_addr", text "void*", undefined, + typeMachRep addrPrimTy) + +-- This function returns the primitive type associated with the boxed +-- type argument to a foreign export (eg. Int ==> Int#). It assumes +-- that all the types we are interested in have a single constructor +-- with a single primitive-typed argument, which is true for all of the legal +-- foreign export argument types (see TcType.legalFEArgTyCon). 
+getPrimTyOf :: Type -> Type +getPrimTyOf ty = + case splitProductType_maybe (repType ty) of + Just (_, _, data_con, [prim_ty]) -> + ASSERT(dataConSourceArity data_con == 1) + ASSERT2(isUnLiftedType prim_ty, ppr prim_ty) + prim_ty + _other -> pprPanic "DsForeign.getPrimTyOf" (ppr ty) +\end{code} diff --git a/compiler/deSugar/DsGRHSs.lhs b/compiler/deSugar/DsGRHSs.lhs new file mode 100644 index 0000000000..eea61bafb2 --- /dev/null +++ b/compiler/deSugar/DsGRHSs.lhs @@ -0,0 +1,128 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[DsGRHSs]{Matching guarded right-hand-sides (GRHSs)} + +\begin{code} +module DsGRHSs ( dsGuarded, dsGRHSs ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} DsExpr ( dsLExpr, dsLocalBinds ) +import {-# SOURCE #-} Match ( matchSinglePat ) + +import HsSyn ( Stmt(..), HsExpr(..), GRHSs(..), GRHS(..), + LHsExpr, HsMatchContext(..), Pat(..) ) +import CoreSyn ( CoreExpr ) +import Var ( Id ) +import Type ( Type ) + +import DsMonad +import DsUtils +import Unique ( Uniquable(..) ) +import PrelInfo ( nON_EXHAUSTIVE_GUARDS_ERROR_ID ) +import TysWiredIn ( trueDataConId ) +import PrelNames ( otherwiseIdKey, hasKey ) +import Name ( Name ) +import SrcLoc ( unLoc, Located(..) ) +\end{code} + +@dsGuarded@ is used for both @case@ expressions and pattern bindings. +It desugars: +\begin{verbatim} + | g1 -> e1 + ... + | gn -> en + where binds +\end{verbatim} +producing an expression with a runtime error in the corner if +necessary. The type argument gives the type of the @ei@. + +\begin{code} +dsGuarded :: GRHSs Id -> Type -> DsM CoreExpr + +dsGuarded grhss rhs_ty + = dsGRHSs PatBindRhs [] grhss rhs_ty `thenDs` \ match_result -> + mkErrorAppDs nON_EXHAUSTIVE_GUARDS_ERROR_ID rhs_ty "" `thenDs` \ error_expr -> + extractMatchResult match_result error_expr +\end{code} + +In contrast, @dsGRHSs@ produces a @MatchResult@. + +\begin{code} +dsGRHSs :: HsMatchContext Name -> [Pat Id] -- These are to build a MatchContext from + -> GRHSs Id -- Guarded RHSs + -> Type -- Type of RHS + -> DsM MatchResult + +dsGRHSs hs_ctx pats (GRHSs grhss binds) rhs_ty + = mappM (dsGRHS hs_ctx pats rhs_ty) grhss `thenDs` \ match_results -> + let + match_result1 = foldr1 combineMatchResults match_results + match_result2 = adjustMatchResultDs (dsLocalBinds binds) match_result1 + -- NB: nested dsLet inside matchResult + in + returnDs match_result2 + +dsGRHS hs_ctx pats rhs_ty (L loc (GRHS guards rhs)) + = matchGuards (map unLoc guards) hs_ctx rhs rhs_ty +\end{code} + + +%************************************************************************ +%* * +%* matchGuard : make a MatchResult from a guarded RHS * +%* * +%************************************************************************ + +\begin{code} +matchGuards :: [Stmt Id] -- Guard + -> HsMatchContext Name -- Context + -> LHsExpr Id -- RHS + -> Type -- Type of RHS of guard + -> DsM MatchResult + +-- See comments with HsExpr.Stmt re what an ExprStmt means +-- Here we must be in a guard context (not do-expression, nor list-comp) + +matchGuards [] ctx rhs rhs_ty + = do { core_rhs <- dsLExpr rhs + ; return (cantFailMatchResult core_rhs) } + + -- ExprStmts must be guards + -- Turn an "otherwise" guard is a no-op. This ensures that + -- you don't get a "non-exhaustive eqns" message when the guards + -- finish in "otherwise". 
+ -- NB: The success of this clause depends on the typechecker not + -- wrapping the 'otherwise' in empty HsTyApp or HsCoerce constructors + -- If it does, you'll get bogus overlap warnings +matchGuards (ExprStmt (L _ (HsVar v)) _ _ : stmts) ctx rhs rhs_ty + | v `hasKey` otherwiseIdKey + || v `hasKey` getUnique trueDataConId + -- trueDataConId doesn't have the same unique as trueDataCon + = matchGuards stmts ctx rhs rhs_ty + +matchGuards (ExprStmt expr _ _ : stmts) ctx rhs rhs_ty + = matchGuards stmts ctx rhs rhs_ty `thenDs` \ match_result -> + dsLExpr expr `thenDs` \ pred_expr -> + returnDs (mkGuardedMatchResult pred_expr match_result) + +matchGuards (LetStmt binds : stmts) ctx rhs rhs_ty + = matchGuards stmts ctx rhs rhs_ty `thenDs` \ match_result -> + returnDs (adjustMatchResultDs (dsLocalBinds binds) match_result) + -- NB the dsLet occurs inside the match_result + -- Reason: dsLet takes the body expression as its argument + -- so we can't desugar the bindings without the + -- body expression in hand + +matchGuards (BindStmt pat bind_rhs _ _ : stmts) ctx rhs rhs_ty + = matchGuards stmts ctx rhs rhs_ty `thenDs` \ match_result -> + dsLExpr bind_rhs `thenDs` \ core_rhs -> + matchSinglePat core_rhs ctx pat rhs_ty match_result +\end{code} + +Should {\em fail} if @e@ returns @D@ +\begin{verbatim} +f x | p <- e', let C y# = e, f y# = r1 + | otherwise = r2 +\end{verbatim} diff --git a/compiler/deSugar/DsListComp.lhs b/compiler/deSugar/DsListComp.lhs new file mode 100644 index 0000000000..6bb41a92e4 --- /dev/null +++ b/compiler/deSugar/DsListComp.lhs @@ -0,0 +1,516 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[DsListComp]{Desugaring list comprehensions and array comprehensions} + +\begin{code} +module DsListComp ( dsListComp, dsPArrComp ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} DsExpr ( dsLExpr, dsLocalBinds ) + +import BasicTypes ( Boxity(..) ) +import HsSyn +import TcHsSyn ( hsPatType, mkVanillaTuplePat ) +import CoreSyn + +import DsMonad -- the monadery used in the desugarer +import DsUtils + +import DynFlags ( DynFlag(..), dopt ) +import StaticFlags ( opt_RulesOff ) +import CoreUtils ( exprType, mkIfThenElse ) +import Id ( idType ) +import Var ( Id ) +import Type ( mkTyVarTy, mkFunTys, mkFunTy, Type, + splitTyConApp_maybe ) +import TysPrim ( alphaTyVar ) +import TysWiredIn ( nilDataCon, consDataCon, trueDataConId, falseDataConId, + unitDataConId, unitTy, mkListTy, parrTyCon ) +import Match ( matchSimply ) +import PrelNames ( foldrName, buildName, replicatePName, mapPName, + filterPName, zipPName, crossPName ) +import PrelInfo ( pAT_ERROR_ID ) +import SrcLoc ( noLoc, unLoc ) +import Panic ( panic ) +\end{code} + +List comprehensions may be desugared in one of two ways: ``ordinary'' +(as you would expect if you read SLPJ's book) and ``with foldr/build +turned on'' (if you read Gill {\em et al.}'s paper on the subject). + +There will be at least one ``qualifier'' in the input. 
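+
+As a small example of the shapes involved (the comprehension itself is made
+up; the translation merely instantiates the TQ and TE schemes given below),
+with foldr/build enabled
+\begin{verbatim}
+  [ x+1 | x <- xs, x > 3 ]
+\end{verbatim}
+comes out as an application of @build@,
+\begin{verbatim}
+  build (\ c n -> foldr (\ x b -> if x > 3 then c (x+1) b else b) n xs)
+\end{verbatim}
+whereas the ordinary (Wadler-style) route produces the explicitly recursive
+code of rule A' below.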
+ +\begin{code} +dsListComp :: [LStmt Id] + -> LHsExpr Id + -> Type -- Type of list elements + -> DsM CoreExpr +dsListComp lquals body elt_ty + = getDOptsDs `thenDs` \dflags -> + let + quals = map unLoc lquals + in + if opt_RulesOff || dopt Opt_IgnoreInterfacePragmas dflags + -- Either rules are switched off, or we are ignoring what there are; + -- Either way foldr/build won't happen, so use the more efficient + -- Wadler-style desugaring + || isParallelComp quals + -- Foldr-style desugaring can't handle + -- parallel list comprehensions + then deListComp quals body (mkNilExpr elt_ty) + + else -- Foldr/build should be enabled, so desugar + -- into foldrs and builds + newTyVarsDs [alphaTyVar] `thenDs` \ [n_tyvar] -> + let + n_ty = mkTyVarTy n_tyvar + c_ty = mkFunTys [elt_ty, n_ty] n_ty + in + newSysLocalsDs [c_ty,n_ty] `thenDs` \ [c, n] -> + dfListComp c n quals body `thenDs` \ result -> + dsLookupGlobalId buildName `thenDs` \ build_id -> + returnDs (Var build_id `App` Type elt_ty + `App` mkLams [n_tyvar, c, n] result) + + where isParallelComp (ParStmt bndrstmtss : _) = True + isParallelComp _ = False +\end{code} + +%************************************************************************ +%* * +\subsection[DsListComp-ordinary]{Ordinary desugaring of list comprehensions} +%* * +%************************************************************************ + +Just as in Phil's chapter~7 in SLPJ, using the rules for +optimally-compiled list comprehensions. This is what Kevin followed +as well, and I quite happily do the same. The TQ translation scheme +transforms a list of qualifiers (either boolean expressions or +generators) into a single expression which implements the list +comprehension. Because we are generating 2nd-order polymorphic +lambda-calculus, calls to NIL and CONS must be applied to a type +argument, as well as their usual value arguments. +\begin{verbatim} +TE << [ e | qs ] >> = TQ << [ e | qs ] ++ Nil (typeOf e) >> + +(Rule C) +TQ << [ e | ] ++ L >> = Cons (typeOf e) TE <<e>> TE <<L>> + +(Rule B) +TQ << [ e | b , qs ] ++ L >> = + if TE << b >> then TQ << [ e | qs ] ++ L >> else TE << L >> + +(Rule A') +TQ << [ e | p <- L1, qs ] ++ L2 >> = + letrec + h = \ u1 -> + case u1 of + [] -> TE << L2 >> + (u2 : u3) -> + (( \ TE << p >> -> ( TQ << [e | qs] ++ (h u3) >> )) u2) + [] (h u3) + in + h ( TE << L1 >> ) + +"h", "u1", "u2", and "u3" are new variables. +\end{verbatim} + +@deListComp@ is the TQ translation scheme. Roughly speaking, @dsExpr@ +is the TE translation scheme. Note that we carry around the @L@ list +already desugared. @dsListComp@ does the top TE rule mentioned above. + +To the above, we add an additional rule to deal with parallel list +comprehensions. The translation goes roughly as follows: + [ e | p1 <- e11, let v1 = e12, p2 <- e13 + | q1 <- e21, let v2 = e22, q2 <- e23] + => + [ e | ((x1, .., xn), (y1, ..., ym)) <- + zip [(x1,..,xn) | p1 <- e11, let v1 = e12, p2 <- e13] + [(y1,..,ym) | q1 <- e21, let v2 = e22, q2 <- e23]] +where (x1, .., xn) are the variables bound in p1, v1, p2 + (y1, .., ym) are the variables bound in q1, v2, q2 + +In the translation below, the ParStmt branch translates each parallel branch +into a sub-comprehension, and desugars each independently. The resulting lists +are fed to a zip function, we create a binding for all the variables bound in all +the comprehensions, and then we hand things off the the desugarer for bindings. 
+The zip function is generated here a) because it's small, and b) because then we +don't have to deal with arbitrary limits on the number of zip functions in the +prelude, nor which library the zip function came from. +The introduced tuples are Boxed, but only because I couldn't get it to work +with the Unboxed variety. + +\begin{code} +deListComp :: [Stmt Id] -> LHsExpr Id -> CoreExpr -> DsM CoreExpr + +deListComp (ParStmt stmtss_w_bndrs : quals) body list + = mappM do_list_comp stmtss_w_bndrs `thenDs` \ exps -> + mkZipBind qual_tys `thenDs` \ (zip_fn, zip_rhs) -> + + -- Deal with [e | pat <- zip l1 .. ln] in example above + deBindComp pat (Let (Rec [(zip_fn, zip_rhs)]) (mkApps (Var zip_fn) exps)) + quals body list + + where + bndrs_s = map snd stmtss_w_bndrs + + -- pat is the pattern ((x1,..,xn), (y1,..,ym)) in the example above + pat = mkTuplePat pats + pats = map mk_hs_tuple_pat bndrs_s + + -- Types of (x1,..,xn), (y1,..,yn) etc + qual_tys = map mk_bndrs_tys bndrs_s + + do_list_comp (stmts, bndrs) + = dsListComp stmts (mk_hs_tuple_expr bndrs) + (mk_bndrs_tys bndrs) + + mk_bndrs_tys bndrs = mkCoreTupTy (map idType bndrs) + + -- Last: the one to return +deListComp [] body list -- Figure 7.4, SLPJ, p 135, rule C above + = dsLExpr body `thenDs` \ core_body -> + returnDs (mkConsExpr (exprType core_body) core_body list) + + -- Non-last: must be a guard +deListComp (ExprStmt guard _ _ : quals) body list -- rule B above + = dsLExpr guard `thenDs` \ core_guard -> + deListComp quals body list `thenDs` \ core_rest -> + returnDs (mkIfThenElse core_guard core_rest list) + +-- [e | let B, qs] = let B in [e | qs] +deListComp (LetStmt binds : quals) body list + = deListComp quals body list `thenDs` \ core_rest -> + dsLocalBinds binds core_rest + +deListComp (BindStmt pat list1 _ _ : quals) body core_list2 -- rule A' above + = dsLExpr list1 `thenDs` \ core_list1 -> + deBindComp pat core_list1 quals body core_list2 +\end{code} + + +\begin{code} +deBindComp pat core_list1 quals body core_list2 + = let + u3_ty@u1_ty = exprType core_list1 -- two names, same thing + + -- u1_ty is a [alpha] type, and u2_ty = alpha + u2_ty = hsPatType pat + + res_ty = exprType core_list2 + h_ty = u1_ty `mkFunTy` res_ty + in + newSysLocalsDs [h_ty, u1_ty, u2_ty, u3_ty] `thenDs` \ [h, u1, u2, u3] -> + + -- the "fail" value ... 
+ let + core_fail = App (Var h) (Var u3) + letrec_body = App (Var h) core_list1 + in + deListComp quals body core_fail `thenDs` \ rest_expr -> + matchSimply (Var u2) (StmtCtxt ListComp) pat + rest_expr core_fail `thenDs` \ core_match -> + let + rhs = Lam u1 $ + Case (Var u1) u1 res_ty + [(DataAlt nilDataCon, [], core_list2), + (DataAlt consDataCon, [u2, u3], core_match)] + -- Increasing order of tag + in + returnDs (Let (Rec [(h, rhs)]) letrec_body) +\end{code} + + +\begin{code} +mkZipBind :: [Type] -> DsM (Id, CoreExpr) +-- mkZipBind [t1, t2] +-- = (zip, \as1:[t1] as2:[t2] +-- -> case as1 of +-- [] -> [] +-- (a1:as'1) -> case as2 of +-- [] -> [] +-- (a2:as'2) -> (a2,a2) : zip as'1 as'2)] + +mkZipBind elt_tys + = mappM newSysLocalDs list_tys `thenDs` \ ass -> + mappM newSysLocalDs elt_tys `thenDs` \ as' -> + mappM newSysLocalDs list_tys `thenDs` \ as's -> + newSysLocalDs zip_fn_ty `thenDs` \ zip_fn -> + let + inner_rhs = mkConsExpr ret_elt_ty + (mkCoreTup (map Var as')) + (mkVarApps (Var zip_fn) as's) + zip_body = foldr mk_case inner_rhs (zip3 ass as' as's) + in + returnDs (zip_fn, mkLams ass zip_body) + where + list_tys = map mkListTy elt_tys + ret_elt_ty = mkCoreTupTy elt_tys + list_ret_ty = mkListTy ret_elt_ty + zip_fn_ty = mkFunTys list_tys list_ret_ty + + mk_case (as, a', as') rest + = Case (Var as) as list_ret_ty + [(DataAlt nilDataCon, [], mkNilExpr ret_elt_ty), + (DataAlt consDataCon, [a', as'], rest)] + -- Increasing order of tag +-- Helper functions that makes an HsTuple only for non-1-sized tuples +mk_hs_tuple_expr :: [Id] -> LHsExpr Id +mk_hs_tuple_expr [] = nlHsVar unitDataConId +mk_hs_tuple_expr [id] = nlHsVar id +mk_hs_tuple_expr ids = noLoc $ ExplicitTuple [ nlHsVar i | i <- ids ] Boxed + +mk_hs_tuple_pat :: [Id] -> LPat Id +mk_hs_tuple_pat bs = mkTuplePat (map nlVarPat bs) +\end{code} + + +%************************************************************************ +%* * +\subsection[DsListComp-foldr-build]{Foldr/Build desugaring of list comprehensions} +%* * +%************************************************************************ + +@dfListComp@ are the rules used with foldr/build turned on: + +\begin{verbatim} +TE[ e | ] c n = c e n +TE[ e | b , q ] c n = if b then TE[ e | q ] c n else n +TE[ e | p <- l , q ] c n = let + f = \ x b -> case x of + p -> TE[ e | q ] c b + _ -> b + in + foldr f n l +\end{verbatim} + +\begin{code} +dfListComp :: Id -> Id -- 'c' and 'n' + -> [Stmt Id] -- the rest of the qual's + -> LHsExpr Id + -> DsM CoreExpr + + -- Last: the one to return +dfListComp c_id n_id [] body + = dsLExpr body `thenDs` \ core_body -> + returnDs (mkApps (Var c_id) [core_body, Var n_id]) + + -- Non-last: must be a guard +dfListComp c_id n_id (ExprStmt guard _ _ : quals) body + = dsLExpr guard `thenDs` \ core_guard -> + dfListComp c_id n_id quals body `thenDs` \ core_rest -> + returnDs (mkIfThenElse core_guard core_rest (Var n_id)) + +dfListComp c_id n_id (LetStmt binds : quals) body + -- new in 1.3, local bindings + = dfListComp c_id n_id quals body `thenDs` \ core_rest -> + dsLocalBinds binds core_rest + +dfListComp c_id n_id (BindStmt pat list1 _ _ : quals) body + -- evaluate the two lists + = dsLExpr list1 `thenDs` \ core_list1 -> + + -- find the required type + let x_ty = hsPatType pat + b_ty = idType n_id + in + + -- create some new local id's + newSysLocalsDs [b_ty,x_ty] `thenDs` \ [b,x] -> + + -- build rest of the comprehesion + dfListComp c_id b quals body `thenDs` \ core_rest -> + + -- build the pattern match + matchSimply (Var x) (StmtCtxt ListComp) + pat 
core_rest (Var b) `thenDs` \ core_expr -> + + -- now build the outermost foldr, and return + dsLookupGlobalId foldrName `thenDs` \ foldr_id -> + returnDs ( + Var foldr_id `App` Type x_ty + `App` Type b_ty + `App` mkLams [x, b] core_expr + `App` Var n_id + `App` core_list1 + ) +\end{code} + +%************************************************************************ +%* * +\subsection[DsPArrComp]{Desugaring of array comprehensions} +%* * +%************************************************************************ + +\begin{code} + +-- entry point for desugaring a parallel array comprehension +-- +-- [:e | qss:] = <<[:e | qss:]>> () [:():] +-- +dsPArrComp :: [Stmt Id] + -> LHsExpr Id + -> Type -- Don't use; called with `undefined' below + -> DsM CoreExpr +dsPArrComp qs body _ = + dsLookupGlobalId replicatePName `thenDs` \repP -> + let unitArray = mkApps (Var repP) [Type unitTy, + mkIntExpr 1, + mkCoreTup []] + in + dePArrComp qs body (mkTuplePat []) unitArray + +-- the work horse +-- +dePArrComp :: [Stmt Id] + -> LHsExpr Id + -> LPat Id -- the current generator pattern + -> CoreExpr -- the current generator expression + -> DsM CoreExpr +-- +-- <<[:e' | :]>> pa ea = mapP (\pa -> e') ea +-- +dePArrComp [] e' pa cea = + dsLookupGlobalId mapPName `thenDs` \mapP -> + let ty = parrElemType cea + in + deLambda ty pa e' `thenDs` \(clam, + ty'e') -> + returnDs $ mkApps (Var mapP) [Type ty, Type ty'e', clam, cea] +-- +-- <<[:e' | b, qs:]>> pa ea = <<[:e' | qs:]>> pa (filterP (\pa -> b) ea) +-- +dePArrComp (ExprStmt b _ _ : qs) body pa cea = + dsLookupGlobalId filterPName `thenDs` \filterP -> + let ty = parrElemType cea + in + deLambda ty pa b `thenDs` \(clam,_) -> + dePArrComp qs body pa (mkApps (Var filterP) [Type ty, clam, cea]) +-- +-- <<[:e' | p <- e, qs:]>> pa ea = +-- let ef = filterP (\x -> case x of {p -> True; _ -> False}) e +-- in +-- <<[:e' | qs:]>> (pa, p) (crossP ea ef) +-- +dePArrComp (BindStmt p e _ _ : qs) body pa cea = + dsLookupGlobalId filterPName `thenDs` \filterP -> + dsLookupGlobalId crossPName `thenDs` \crossP -> + dsLExpr e `thenDs` \ce -> + let ty'cea = parrElemType cea + ty'ce = parrElemType ce + false = Var falseDataConId + true = Var trueDataConId + in + newSysLocalDs ty'ce `thenDs` \v -> + matchSimply (Var v) (StmtCtxt PArrComp) p true false `thenDs` \pred -> + let cef = mkApps (Var filterP) [Type ty'ce, mkLams [v] pred, ce] + ty'cef = ty'ce -- filterP preserves the type + pa' = mkTuplePat [pa, p] + in + dePArrComp qs body pa' (mkApps (Var crossP) [Type ty'cea, Type ty'cef, cea, cef]) +-- +-- <<[:e' | let ds, qs:]>> pa ea = +-- <<[:e' | qs:]>> (pa, (x_1, ..., x_n)) +-- (mapP (\v@pa -> (v, let ds in (x_1, ..., x_n))) ea) +-- where +-- {x_1, ..., x_n} = DV (ds) -- Defined Variables +-- +dePArrComp (LetStmt ds : qs) body pa cea = + dsLookupGlobalId mapPName `thenDs` \mapP -> + let xs = map unLoc (collectLocalBinders ds) + ty'cea = parrElemType cea + in + newSysLocalDs ty'cea `thenDs` \v -> + dsLocalBinds ds (mkCoreTup (map Var xs)) `thenDs` \clet -> + newSysLocalDs (exprType clet) `thenDs` \let'v -> + let projBody = mkDsLet (NonRec let'v clet) $ + mkCoreTup [Var v, Var let'v] + errTy = exprType projBody + errMsg = "DsListComp.dePArrComp: internal error!" 
+ in + mkErrorAppDs pAT_ERROR_ID errTy errMsg `thenDs` \cerr -> + matchSimply (Var v) (StmtCtxt PArrComp) pa projBody cerr`thenDs` \ccase -> + let pa' = mkTuplePat [pa, mkTuplePat (map nlVarPat xs)] + proj = mkLams [v] ccase + in + dePArrComp qs body pa' (mkApps (Var mapP) [Type ty'cea, proj, cea]) +-- +-- <<[:e' | qs | qss:]>> pa ea = +-- <<[:e' | qss:]>> (pa, (x_1, ..., x_n)) +-- (zipP ea <<[:(x_1, ..., x_n) | qs:]>>) +-- where +-- {x_1, ..., x_n} = DV (qs) +-- +dePArrComp (ParStmt qss : qs) body pa cea = + dsLookupGlobalId crossPName `thenDs` \crossP -> + deParStmt qss `thenDs` \(pQss, + ceQss) -> + let ty'cea = parrElemType cea + ty'ceQss = parrElemType ceQss + pa' = mkTuplePat [pa, pQss] + in + dePArrComp qs body pa' (mkApps (Var crossP) [Type ty'cea, Type ty'ceQss, + cea, ceQss]) + where + deParStmt [] = + -- empty parallel statement lists have not source representation + panic "DsListComp.dePArrComp: Empty parallel list comprehension" + deParStmt ((qs, xs):qss) = -- first statement + let res_expr = mkExplicitTuple (map nlHsVar xs) + in + dsPArrComp (map unLoc qs) res_expr undefined `thenDs` \cqs -> + parStmts qss (mkTuplePat (map nlVarPat xs)) cqs + --- + parStmts [] pa cea = return (pa, cea) + parStmts ((qs, xs):qss) pa cea = -- subsequent statements (zip'ed) + dsLookupGlobalId zipPName `thenDs` \zipP -> + let pa' = mkTuplePat [pa, mkTuplePat (map nlVarPat xs)] + ty'cea = parrElemType cea + res_expr = mkExplicitTuple (map nlHsVar xs) + in + dsPArrComp (map unLoc qs) res_expr undefined `thenDs` \cqs -> + let ty'cqs = parrElemType cqs + cea' = mkApps (Var zipP) [Type ty'cea, Type ty'cqs, cea, cqs] + in + parStmts qss pa' cea' + +-- generate Core corresponding to `\p -> e' +-- +deLambda :: Type -- type of the argument + -> LPat Id -- argument pattern + -> LHsExpr Id -- body + -> DsM (CoreExpr, Type) +deLambda ty p e = + newSysLocalDs ty `thenDs` \v -> + dsLExpr e `thenDs` \ce -> + let errTy = exprType ce + errMsg = "DsListComp.deLambda: internal error!" + in + mkErrorAppDs pAT_ERROR_ID errTy errMsg `thenDs` \cerr -> + matchSimply (Var v) (StmtCtxt PArrComp) p ce cerr `thenDs` \res -> + returnDs (mkLams [v] res, errTy) + +-- obtain the element type of the parallel array produced by the given Core +-- expression +-- +parrElemType :: CoreExpr -> Type +parrElemType e = + case splitTyConApp_maybe (exprType e) of + Just (tycon, [ty]) | tycon == parrTyCon -> ty + _ -> panic + "DsListComp.parrElemType: not a parallel array type" + +-- Smart constructor for source tuple patterns +-- +mkTuplePat :: [LPat Id] -> LPat Id +mkTuplePat [lpat] = lpat +mkTuplePat lpats = noLoc $ mkVanillaTuplePat lpats Boxed + +-- Smart constructor for source tuple expressions +-- +mkExplicitTuple :: [LHsExpr id] -> LHsExpr id +mkExplicitTuple [lexp] = lexp +mkExplicitTuple lexps = noLoc $ ExplicitTuple lexps Boxed +\end{code} diff --git a/compiler/deSugar/DsMeta.hs b/compiler/deSugar/DsMeta.hs new file mode 100644 index 0000000000..88b0ba9c8e --- /dev/null +++ b/compiler/deSugar/DsMeta.hs @@ -0,0 +1,1732 @@ +----------------------------------------------------------------------------- +-- The purpose of this module is to transform an HsExpr into a CoreExpr which +-- when evaluated, returns a (Meta.Q Meta.Exp) computation analogous to the +-- input HsExpr. We do this in the DsM monad, which supplies access to +-- CoreExpr's of the "smart constructors" of the Meta.Exp datatype. +-- +-- It also defines a bunch of knownKeyNames, in the same way as is done +-- in prelude/PrelNames. 
It's much more convenient to do it here, becuase +-- otherwise we have to recompile PrelNames whenever we add a Name, which is +-- a Royal Pain (triggers other recompilation). +----------------------------------------------------------------------------- + + +module DsMeta( dsBracket, + templateHaskellNames, qTyConName, nameTyConName, + liftName, expQTyConName, decQTyConName, typeQTyConName, + decTyConName, typeTyConName, mkNameG_dName, mkNameG_vName, mkNameG_tcName + ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} DsExpr ( dsExpr ) + +import MatchLit ( dsLit ) +import DsUtils ( mkListExpr, mkStringExpr, mkCoreTup, mkIntExpr ) +import DsMonad + +import qualified Language.Haskell.TH as TH + +import HsSyn +import Class (FunDep) +import PrelNames ( rationalTyConName, integerTyConName, negateName ) +import OccName ( isDataOcc, isTvOcc, occNameString ) +-- To avoid clashes with DsMeta.varName we must make a local alias for OccName.varName +-- we do this by removing varName from the import of OccName above, making +-- a qualified instance of OccName and using OccNameAlias.varName where varName +-- ws previously used in this file. +import qualified OccName + +import Module ( Module, mkModule, moduleString ) +import Id ( Id, mkLocalId ) +import OccName ( mkOccNameFS ) +import Name ( Name, mkExternalName, localiseName, nameOccName, nameModule, + isExternalName, getSrcLoc ) +import NameEnv +import Type ( Type, mkTyConApp ) +import TcType ( tcTyConAppArgs ) +import TyCon ( tyConName ) +import TysWiredIn ( parrTyCon ) +import CoreSyn +import CoreUtils ( exprType ) +import SrcLoc ( noSrcLoc, unLoc, Located(..), SrcSpan, srcLocSpan ) +import Maybe ( catMaybes ) +import Unique ( mkPreludeTyConUnique, mkPreludeMiscIdUnique, getKey, Uniquable(..) ) +import BasicTypes ( isBoxed ) +import Outputable +import Bag ( bagToList, unionManyBags ) +import FastString ( unpackFS ) +import ForeignCall ( Safety(..), CCallConv(..), CCallTarget(..) ) + +import Monad ( zipWithM ) +import List ( sortBy ) + +----------------------------------------------------------------------------- +dsBracket :: HsBracket Name -> [PendingSplice] -> DsM CoreExpr +-- Returns a CoreExpr of type TH.ExpQ +-- The quoted thing is parameterised over Name, even though it has +-- been type checked. We don't want all those type decorations! + +dsBracket brack splices + = dsExtendMetaEnv new_bit (do_brack brack) + where + new_bit = mkNameEnv [(n, Splice (unLoc e)) | (n,e) <- splices] + + do_brack (VarBr n) = do { MkC e1 <- lookupOcc n ; return e1 } + do_brack (ExpBr e) = do { MkC e1 <- repLE e ; return e1 } + do_brack (PatBr p) = do { MkC p1 <- repLP p ; return p1 } + do_brack (TypBr t) = do { MkC t1 <- repLTy t ; return t1 } + do_brack (DecBr ds) = do { MkC ds1 <- repTopDs ds ; return ds1 } + +{- -------------- Examples -------------------- + + [| \x -> x |] +====> + gensym (unpackString "x"#) `bindQ` \ x1::String -> + lam (pvar x1) (var x1) + + + [| \x -> $(f [| x |]) |] +====> + gensym (unpackString "x"#) `bindQ` \ x1::String -> + lam (pvar x1) (f (var x1)) +-} + + +------------------------------------------------------- +-- Declarations +------------------------------------------------------- + +repTopDs :: HsGroup Name -> DsM (Core (TH.Q [TH.Dec])) +repTopDs group + = do { let { bndrs = map unLoc (groupBinders group) } ; + ss <- mkGenSyms bndrs ; + + -- Bind all the names mainly to avoid repeated use of explicit strings. + -- Thus we get + -- do { t :: String <- genSym "T" ; + -- return (Data t [] ...more t's... 
} + -- The other important reason is that the output must mention + -- only "T", not "Foo:T" where Foo is the current module + + + decls <- addBinds ss (do { + val_ds <- rep_val_binds (hs_valds group) ; + tycl_ds <- mapM repTyClD (hs_tyclds group) ; + inst_ds <- mapM repInstD' (hs_instds group) ; + for_ds <- mapM repForD (hs_fords group) ; + -- more needed + return (de_loc $ sort_by_loc $ val_ds ++ catMaybes tycl_ds ++ inst_ds ++ for_ds) }) ; + + decl_ty <- lookupType decQTyConName ; + let { core_list = coreList' decl_ty decls } ; + + dec_ty <- lookupType decTyConName ; + q_decs <- repSequenceQ dec_ty core_list ; + + wrapNongenSyms ss q_decs + -- Do *not* gensym top-level binders + } + +groupBinders (HsGroup { hs_valds = val_decls, hs_tyclds = tycl_decls, + hs_fords = foreign_decls }) +-- Collect the binders of a Group + = collectHsValBinders val_decls ++ + [n | d <- tycl_decls, n <- tyClDeclNames (unLoc d)] ++ + [n | L _ (ForeignImport n _ _ _) <- foreign_decls] + + +{- Note [Binders and occurrences] + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +When we desugar [d| data T = MkT |] +we want to get + Data "T" [] [Con "MkT" []] [] +and *not* + Data "Foo:T" [] [Con "Foo:MkT" []] [] +That is, the new data decl should fit into whatever new module it is +asked to fit in. We do *not* clone, though; no need for this: + Data "T79" .... + +But if we see this: + data T = MkT + foo = reifyDecl T + +then we must desugar to + foo = Data "Foo:T" [] [Con "Foo:MkT" []] [] + +So in repTopDs we bring the binders into scope with mkGenSyms and addBinds. +And we use lookupOcc, rather than lookupBinder +in repTyClD and repC. + +-} + +repTyClD :: LTyClDecl Name -> DsM (Maybe (SrcSpan, Core TH.DecQ)) + +repTyClD (L loc (TyData { tcdND = DataType, tcdCtxt = cxt, + tcdLName = tc, tcdTyVars = tvs, + tcdCons = cons, tcdDerivs = mb_derivs })) + = do { tc1 <- lookupLOcc tc ; -- See note [Binders and occurrences] + dec <- addTyVarBinds tvs $ \bndrs -> do { + cxt1 <- repLContext cxt ; + cons1 <- mapM repC cons ; + cons2 <- coreList conQTyConName cons1 ; + derivs1 <- repDerivs mb_derivs ; + bndrs1 <- coreList nameTyConName bndrs ; + repData cxt1 tc1 bndrs1 cons2 derivs1 } ; + return $ Just (loc, dec) } + +repTyClD (L loc (TyData { tcdND = NewType, tcdCtxt = cxt, + tcdLName = tc, tcdTyVars = tvs, + tcdCons = [con], tcdDerivs = mb_derivs })) + = do { tc1 <- lookupLOcc tc ; -- See note [Binders and occurrences] + dec <- addTyVarBinds tvs $ \bndrs -> do { + cxt1 <- repLContext cxt ; + con1 <- repC con ; + derivs1 <- repDerivs mb_derivs ; + bndrs1 <- coreList nameTyConName bndrs ; + repNewtype cxt1 tc1 bndrs1 con1 derivs1 } ; + return $ Just (loc, dec) } + +repTyClD (L loc (TySynonym { tcdLName = tc, tcdTyVars = tvs, tcdSynRhs = ty })) + = do { tc1 <- lookupLOcc tc ; -- See note [Binders and occurrences] + dec <- addTyVarBinds tvs $ \bndrs -> do { + ty1 <- repLTy ty ; + bndrs1 <- coreList nameTyConName bndrs ; + repTySyn tc1 bndrs1 ty1 } ; + return (Just (loc, dec)) } + +repTyClD (L loc (ClassDecl { tcdCtxt = cxt, tcdLName = cls, + tcdTyVars = tvs, + tcdFDs = fds, + tcdSigs = sigs, tcdMeths = meth_binds })) + = do { cls1 <- lookupLOcc cls ; -- See note [Binders and occurrences] + dec <- addTyVarBinds tvs $ \bndrs -> do { + cxt1 <- repLContext cxt ; + sigs1 <- rep_sigs sigs ; + binds1 <- rep_binds meth_binds ; + fds1 <- repLFunDeps fds; + decls1 <- coreList decQTyConName (sigs1 ++ binds1) ; + bndrs1 <- coreList nameTyConName bndrs ; + repClass cxt1 cls1 bndrs1 fds1 decls1 } ; + return $ Just (loc, dec) } + +-- Un-handled cases +repTyClD (L 
loc d) = putSrcSpanDs loc $ + do { dsWarn (hang ds_msg 4 (ppr d)) + ; return Nothing } + +-- represent fundeps +-- +repLFunDeps :: [Located (FunDep Name)] -> DsM (Core [TH.FunDep]) +repLFunDeps fds = do fds' <- mapM repLFunDep fds + fdList <- coreList funDepTyConName fds' + return fdList + +repLFunDep :: Located (FunDep Name) -> DsM (Core TH.FunDep) +repLFunDep (L _ (xs, ys)) = do xs' <- mapM lookupBinder xs + ys' <- mapM lookupBinder ys + xs_list <- coreList nameTyConName xs' + ys_list <- coreList nameTyConName ys' + repFunDep xs_list ys_list + +repInstD' (L loc (InstDecl ty binds _)) -- Ignore user pragmas for now + = do { i <- addTyVarBinds tvs $ \tv_bndrs -> + -- We must bring the type variables into scope, so their occurrences + -- don't fail, even though the binders don't appear in the resulting + -- data structure + do { cxt1 <- repContext cxt + ; inst_ty1 <- repPred (HsClassP cls tys) + ; ss <- mkGenSyms (collectHsBindBinders binds) + ; binds1 <- addBinds ss (rep_binds binds) + ; decls1 <- coreList decQTyConName binds1 + ; decls2 <- wrapNongenSyms ss decls1 + -- wrapNonGenSyms: do not clone the class op names! + -- They must be called 'op' etc, not 'op34' + ; repInst cxt1 inst_ty1 decls2 } + + ; return (loc, i)} + where + (tvs, cxt, cls, tys) = splitHsInstDeclTy (unLoc ty) + +repForD :: Located (ForeignDecl Name) -> DsM (SrcSpan, Core TH.DecQ) +repForD (L loc (ForeignImport name typ (CImport cc s ch cn cis) _)) + = do MkC name' <- lookupLOcc name + MkC typ' <- repLTy typ + MkC cc' <- repCCallConv cc + MkC s' <- repSafety s + MkC str <- coreStringLit $ static + ++ unpackFS ch ++ " " + ++ unpackFS cn ++ " " + ++ conv_cimportspec cis + dec <- rep2 forImpDName [cc', s', str, name', typ'] + return (loc, dec) + where + conv_cimportspec (CLabel cls) = panic "repForD': CLabel Not handled" + conv_cimportspec (CFunction DynamicTarget) = "dynamic" + conv_cimportspec (CFunction (StaticTarget fs)) = unpackFS fs + conv_cimportspec CWrapper = "wrapper" + static = case cis of + CFunction (StaticTarget _) -> "static " + _ -> "" + +repCCallConv :: CCallConv -> DsM (Core TH.Callconv) +repCCallConv CCallConv = rep2 cCallName [] +repCCallConv StdCallConv = rep2 stdCallName [] + +repSafety :: Safety -> DsM (Core TH.Safety) +repSafety PlayRisky = rep2 unsafeName [] +repSafety (PlaySafe False) = rep2 safeName [] +repSafety (PlaySafe True) = rep2 threadsafeName [] + +ds_msg = ptext SLIT("Cannot desugar this Template Haskell declaration:") + +------------------------------------------------------- +-- Constructors +------------------------------------------------------- + +repC :: LConDecl Name -> DsM (Core TH.ConQ) +repC (L loc (ConDecl con expl [] (L _ []) details ResTyH98)) + = do { con1 <- lookupLOcc con ; -- See note [Binders and occurrences] + repConstr con1 details } +repC (L loc (ConDecl con expl tvs (L cloc ctxt) details ResTyH98)) + = do { addTyVarBinds tvs $ \bndrs -> do { + c' <- repC (L loc (ConDecl con expl [] (L cloc []) details ResTyH98)); + ctxt' <- repContext ctxt; + bndrs' <- coreList nameTyConName bndrs; + rep2 forallCName [unC bndrs', unC ctxt', unC c'] + } + } +repC (L loc con_decl) -- GADTs + = putSrcSpanDs loc $ + do { dsWarn (hang ds_msg 4 (ppr con_decl)) + ; return (panic "DsMeta:repC") } + +repBangTy :: LBangType Name -> DsM (Core (TH.StrictTypeQ)) +repBangTy ty= do + MkC s <- rep2 str [] + MkC t <- repLTy ty' + rep2 strictTypeName [s, t] + where + (str, ty') = case ty of + L _ (HsBangTy _ ty) -> (isStrictName, ty) + other -> (notStrictName, ty) + 
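(Editorial aside, not part of this patch: a small, standalone sketch of the Template Haskell values that repC/repBangTy above are arranging to build. It uses the TH.Lib combinators referred to later in this file (dataD, normalC, strictType, isStrict, notStrict, cxt, conT, mkName), with the signatures they had in this era; later template-haskell releases changed several of them.)

    import Language.Haskell.TH

    -- The bracket [d| data T = MkT !Int Char |] desugars to code that,
    -- when run, builds roughly this DecQ: each constructor argument
    -- becomes a (strictness, type) pair, and names stay unqualified,
    -- as per Note [Binders and occurrences].
    tDecl :: DecQ
    tDecl = dataD (cxt []) (mkName "T") []            -- no context, no tyvars
              [ normalC (mkName "MkT")
                  [ strictType isStrict  (conT (mkName "Int"))
                  , strictType notStrict (conT (mkName "Char")) ] ]
              []                                      -- no deriving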
+------------------------------------------------------- +-- Deriving clause +------------------------------------------------------- + +repDerivs :: Maybe [LHsType Name] -> DsM (Core [TH.Name]) +repDerivs Nothing = coreList nameTyConName [] +repDerivs (Just ctxt) + = do { strs <- mapM rep_deriv ctxt ; + coreList nameTyConName strs } + where + rep_deriv :: LHsType Name -> DsM (Core TH.Name) + -- Deriving clauses must have the simple H98 form + rep_deriv (L _ (HsPredTy (HsClassP cls []))) = lookupOcc cls + rep_deriv other = panic "rep_deriv" + + +------------------------------------------------------- +-- Signatures in a class decl, or a group of bindings +------------------------------------------------------- + +rep_sigs :: [LSig Name] -> DsM [Core TH.DecQ] +rep_sigs sigs = do locs_cores <- rep_sigs' sigs + return $ de_loc $ sort_by_loc locs_cores + +rep_sigs' :: [LSig Name] -> DsM [(SrcSpan, Core TH.DecQ)] + -- We silently ignore ones we don't recognise +rep_sigs' sigs = do { sigs1 <- mapM rep_sig sigs ; + return (concat sigs1) } + +rep_sig :: LSig Name -> DsM [(SrcSpan, Core TH.DecQ)] + -- Singleton => Ok + -- Empty => Too hard, signature ignored +rep_sig (L loc (TypeSig nm ty)) = rep_proto nm ty loc +rep_sig other = return [] + +rep_proto :: Located Name -> LHsType Name -> SrcSpan -> DsM [(SrcSpan, Core TH.DecQ)] +rep_proto nm ty loc = do { nm1 <- lookupLOcc nm ; + ty1 <- repLTy ty ; + sig <- repProto nm1 ty1 ; + return [(loc, sig)] } + + +------------------------------------------------------- +-- Types +------------------------------------------------------- + +-- gensym a list of type variables and enter them into the meta environment; +-- the computations passed as the second argument is executed in that extended +-- meta environment and gets the *new* names on Core-level as an argument +-- +addTyVarBinds :: [LHsTyVarBndr Name] -- the binders to be added + -> ([Core TH.Name] -> DsM (Core (TH.Q a))) -- action in the ext env + -> DsM (Core (TH.Q a)) +addTyVarBinds tvs m = + do + let names = map (hsTyVarName.unLoc) tvs + freshNames <- mkGenSyms names + term <- addBinds freshNames $ do + bndrs <- mapM lookupBinder names + m bndrs + wrapGenSyns freshNames term + +-- represent a type context +-- +repLContext :: LHsContext Name -> DsM (Core TH.CxtQ) +repLContext (L _ ctxt) = repContext ctxt + +repContext :: HsContext Name -> DsM (Core TH.CxtQ) +repContext ctxt = do + preds <- mapM repLPred ctxt + predList <- coreList typeQTyConName preds + repCtxt predList + +-- represent a type predicate +-- +repLPred :: LHsPred Name -> DsM (Core TH.TypeQ) +repLPred (L _ p) = repPred p + +repPred :: HsPred Name -> DsM (Core TH.TypeQ) +repPred (HsClassP cls tys) = do + tcon <- repTy (HsTyVar cls) + tys1 <- repLTys tys + repTapps tcon tys1 +repPred (HsIParam _ _) = + panic "DsMeta.repTy: Can't represent predicates with implicit parameters" + +-- yield the representation of a list of types +-- +repLTys :: [LHsType Name] -> DsM [Core TH.TypeQ] +repLTys tys = mapM repLTy tys + +-- represent a type +-- +repLTy :: LHsType Name -> DsM (Core TH.TypeQ) +repLTy (L _ ty) = repTy ty + +repTy :: HsType Name -> DsM (Core TH.TypeQ) +repTy (HsForAllTy _ tvs ctxt ty) = + addTyVarBinds tvs $ \bndrs -> do + ctxt1 <- repLContext ctxt + ty1 <- repLTy ty + bndrs1 <- coreList nameTyConName bndrs + repTForall bndrs1 ctxt1 ty1 + +repTy (HsTyVar n) + | isTvOcc (nameOccName n) = do + tv1 <- lookupBinder n + repTvar tv1 + | otherwise = do + tc1 <- lookupOcc n + repNamedTyCon tc1 +repTy (HsAppTy f a) = do + f1 <- repLTy f + a1 <- 
repLTy a + repTapp f1 a1 +repTy (HsFunTy f a) = do + f1 <- repLTy f + a1 <- repLTy a + tcon <- repArrowTyCon + repTapps tcon [f1, a1] +repTy (HsListTy t) = do + t1 <- repLTy t + tcon <- repListTyCon + repTapp tcon t1 +repTy (HsPArrTy t) = do + t1 <- repLTy t + tcon <- repTy (HsTyVar (tyConName parrTyCon)) + repTapp tcon t1 +repTy (HsTupleTy tc tys) = do + tys1 <- repLTys tys + tcon <- repTupleTyCon (length tys) + repTapps tcon tys1 +repTy (HsOpTy ty1 n ty2) = repLTy ((nlHsTyVar (unLoc n) `nlHsAppTy` ty1) + `nlHsAppTy` ty2) +repTy (HsParTy t) = repLTy t +repTy (HsNumTy i) = + panic "DsMeta.repTy: Can't represent number types (for generics)" +repTy (HsPredTy pred) = repPred pred +repTy (HsKindSig ty kind) = + panic "DsMeta.repTy: Can't represent explicit kind signatures yet" + + +----------------------------------------------------------------------------- +-- Expressions +----------------------------------------------------------------------------- + +repLEs :: [LHsExpr Name] -> DsM (Core [TH.ExpQ]) +repLEs es = do { es' <- mapM repLE es ; + coreList expQTyConName es' } + +-- FIXME: some of these panics should be converted into proper error messages +-- unless we can make sure that constructs, which are plainly not +-- supported in TH already lead to error messages at an earlier stage +repLE :: LHsExpr Name -> DsM (Core TH.ExpQ) +repLE (L _ e) = repE e + +repE :: HsExpr Name -> DsM (Core TH.ExpQ) +repE (HsVar x) = + do { mb_val <- dsLookupMetaEnv x + ; case mb_val of + Nothing -> do { str <- globalVar x + ; repVarOrCon x str } + Just (Bound y) -> repVarOrCon x (coreVar y) + Just (Splice e) -> do { e' <- dsExpr e + ; return (MkC e') } } +repE (HsIPVar x) = panic "DsMeta.repE: Can't represent implicit parameters" + + -- Remember, we're desugaring renamer output here, so + -- HsOverlit can definitely occur +repE (HsOverLit l) = do { a <- repOverloadedLiteral l; repLit a } +repE (HsLit l) = do { a <- repLiteral l; repLit a } +repE (HsLam (MatchGroup [m] _)) = repLambda m +repE (HsApp x y) = do {a <- repLE x; b <- repLE y; repApp a b} + +repE (OpApp e1 op fix e2) = + do { arg1 <- repLE e1; + arg2 <- repLE e2; + the_op <- repLE op ; + repInfixApp arg1 the_op arg2 } +repE (NegApp x nm) = do + a <- repLE x + negateVar <- lookupOcc negateName >>= repVar + negateVar `repApp` a +repE (HsPar x) = repLE x +repE (SectionL x y) = do { a <- repLE x; b <- repLE y; repSectionL a b } +repE (SectionR x y) = do { a <- repLE x; b <- repLE y; repSectionR a b } +repE (HsCase e (MatchGroup ms _)) = do { arg <- repLE e + ; ms2 <- mapM repMatchTup ms + ; repCaseE arg (nonEmptyCoreList ms2) } +repE (HsIf x y z) = do + a <- repLE x + b <- repLE y + c <- repLE z + repCond a b c +repE (HsLet bs e) = do { (ss,ds) <- repBinds bs + ; e2 <- addBinds ss (repLE e) + ; z <- repLetE ds e2 + ; wrapGenSyns ss z } +-- FIXME: I haven't got the types here right yet +repE (HsDo DoExpr sts body ty) + = do { (ss,zs) <- repLSts sts; + body' <- addBinds ss $ repLE body; + ret <- repNoBindSt body'; + e <- repDoE (nonEmptyCoreList (zs ++ [ret])); + wrapGenSyns ss e } +repE (HsDo ListComp sts body ty) + = do { (ss,zs) <- repLSts sts; + body' <- addBinds ss $ repLE body; + ret <- repNoBindSt body'; + e <- repComp (nonEmptyCoreList (zs ++ [ret])); + wrapGenSyns ss e } +repE (HsDo _ _ _ _) = panic "DsMeta.repE: Can't represent mdo and [: :] yet" +repE (ExplicitList ty es) = do { xs <- repLEs es; repListExp xs } +repE (ExplicitPArr ty es) = + panic "DsMeta.repE: No explicit parallel arrays yet" +repE (ExplicitTuple es boxed) + | isBoxed boxed = 
do { xs <- repLEs es; repTup xs } + | otherwise = panic "DsMeta.repE: Can't represent unboxed tuples" +repE (RecordCon c _ flds) + = do { x <- lookupLOcc c; + fs <- repFields flds; + repRecCon x fs } +repE (RecordUpd e flds _ _) + = do { x <- repLE e; + fs <- repFields flds; + repRecUpd x fs } + +repE (ExprWithTySig e ty) = do { e1 <- repLE e; t1 <- repLTy ty; repSigExp e1 t1 } +repE (ArithSeq _ aseq) = + case aseq of + From e -> do { ds1 <- repLE e; repFrom ds1 } + FromThen e1 e2 -> do + ds1 <- repLE e1 + ds2 <- repLE e2 + repFromThen ds1 ds2 + FromTo e1 e2 -> do + ds1 <- repLE e1 + ds2 <- repLE e2 + repFromTo ds1 ds2 + FromThenTo e1 e2 e3 -> do + ds1 <- repLE e1 + ds2 <- repLE e2 + ds3 <- repLE e3 + repFromThenTo ds1 ds2 ds3 +repE (PArrSeq _ aseq) = panic "DsMeta.repE: parallel array seq.s missing" +repE (HsCoreAnn _ _) = panic "DsMeta.repE: Can't represent CoreAnn" -- hdaume: core annotations +repE (HsSCC _ _) = panic "DsMeta.repE: Can't represent SCC" +repE (HsBracketOut _ _) = panic "DsMeta.repE: Can't represent Oxford brackets" +repE (HsSpliceE (HsSplice n _)) + = do { mb_val <- dsLookupMetaEnv n + ; case mb_val of + Just (Splice e) -> do { e' <- dsExpr e + ; return (MkC e') } + other -> pprPanic "HsSplice" (ppr n) } + +repE e = pprPanic "DsMeta.repE: Illegal expression form" (ppr e) + +----------------------------------------------------------------------------- +-- Building representations of auxillary structures like Match, Clause, Stmt, + +repMatchTup :: LMatch Name -> DsM (Core TH.MatchQ) +repMatchTup (L _ (Match [p] ty (GRHSs guards wheres))) = + do { ss1 <- mkGenSyms (collectPatBinders p) + ; addBinds ss1 $ do { + ; p1 <- repLP p + ; (ss2,ds) <- repBinds wheres + ; addBinds ss2 $ do { + ; gs <- repGuards guards + ; match <- repMatch p1 gs ds + ; wrapGenSyns (ss1++ss2) match }}} + +repClauseTup :: LMatch Name -> DsM (Core TH.ClauseQ) +repClauseTup (L _ (Match ps ty (GRHSs guards wheres))) = + do { ss1 <- mkGenSyms (collectPatsBinders ps) + ; addBinds ss1 $ do { + ps1 <- repLPs ps + ; (ss2,ds) <- repBinds wheres + ; addBinds ss2 $ do { + gs <- repGuards guards + ; clause <- repClause ps1 gs ds + ; wrapGenSyns (ss1++ss2) clause }}} + +repGuards :: [LGRHS Name] -> DsM (Core TH.BodyQ) +repGuards [L _ (GRHS [] e)] + = do {a <- repLE e; repNormal a } +repGuards other + = do { zs <- mapM process other; + let {(xs, ys) = unzip zs}; + gd <- repGuarded (nonEmptyCoreList ys); + wrapGenSyns (concat xs) gd } + where + process :: LGRHS Name -> DsM ([GenSymBind], (Core (TH.Q (TH.Guard, TH.Exp)))) + process (L _ (GRHS [L _ (ExprStmt e1 _ _)] e2)) + = do { x <- repLNormalGE e1 e2; + return ([], x) } + process (L _ (GRHS ss rhs)) + = do (gs, ss') <- repLSts ss + rhs' <- addBinds gs $ repLE rhs + g <- repPatGE (nonEmptyCoreList ss') rhs' + return (gs, g) + +repFields :: [(Located Name, LHsExpr Name)] -> DsM (Core [TH.Q TH.FieldExp]) +repFields flds = do + fnames <- mapM lookupLOcc (map fst flds) + es <- mapM repLE (map snd flds) + fs <- zipWithM repFieldExp fnames es + coreList fieldExpQTyConName fs + + +----------------------------------------------------------------------------- +-- Representing Stmt's is tricky, especially if bound variables +-- shadow each other. Consider: [| do { x <- f 1; x <- f x; g x } |] +-- First gensym new names for every variable in any of the patterns. +-- both static (x'1 and x'2), and dynamic ((gensym "x") and (gensym "y")) +-- if variables didn't shaddow, the static gensym wouldn't be necessary +-- and we could reuse the original names (x and x). 
+-- +-- do { x'1 <- gensym "x" +-- ; x'2 <- gensym "x" +-- ; doE [ BindSt (pvar x'1) [| f 1 |] +-- , BindSt (pvar x'2) [| f x |] +-- , NoBindSt [| g x |] +-- ] +-- } + +-- The strategy is to translate a whole list of do-bindings by building a +-- bigger environment, and a bigger set of meta bindings +-- (like: x'1 <- gensym "x" ) and then combining these with the translations +-- of the expressions within the Do + +----------------------------------------------------------------------------- +-- The helper function repSts computes the translation of each sub expression +-- and a bunch of prefix bindings denoting the dynamic renaming. + +repLSts :: [LStmt Name] -> DsM ([GenSymBind], [Core TH.StmtQ]) +repLSts stmts = repSts (map unLoc stmts) + +repSts :: [Stmt Name] -> DsM ([GenSymBind], [Core TH.StmtQ]) +repSts (BindStmt p e _ _ : ss) = + do { e2 <- repLE e + ; ss1 <- mkGenSyms (collectPatBinders p) + ; addBinds ss1 $ do { + ; p1 <- repLP p; + ; (ss2,zs) <- repSts ss + ; z <- repBindSt p1 e2 + ; return (ss1++ss2, z : zs) }} +repSts (LetStmt bs : ss) = + do { (ss1,ds) <- repBinds bs + ; z <- repLetSt ds + ; (ss2,zs) <- addBinds ss1 (repSts ss) + ; return (ss1++ss2, z : zs) } +repSts (ExprStmt e _ _ : ss) = + do { e2 <- repLE e + ; z <- repNoBindSt e2 + ; (ss2,zs) <- repSts ss + ; return (ss2, z : zs) } +repSts [] = return ([],[]) +repSts other = panic "Exotic Stmt in meta brackets" + + +----------------------------------------------------------- +-- Bindings +----------------------------------------------------------- + +repBinds :: HsLocalBinds Name -> DsM ([GenSymBind], Core [TH.DecQ]) +repBinds EmptyLocalBinds + = do { core_list <- coreList decQTyConName [] + ; return ([], core_list) } + +repBinds (HsIPBinds _) + = panic "DsMeta:repBinds: can't do implicit parameters" + +repBinds (HsValBinds decs) + = do { let { bndrs = map unLoc (collectHsValBinders decs) } + -- No need to worrry about detailed scopes within + -- the binding group, because we are talking Names + -- here, so we can safely treat it as a mutually + -- recursive group + ; ss <- mkGenSyms bndrs + ; prs <- addBinds ss (rep_val_binds decs) + ; core_list <- coreList decQTyConName + (de_loc (sort_by_loc prs)) + ; return (ss, core_list) } + +rep_val_binds :: HsValBinds Name -> DsM [(SrcSpan, Core TH.DecQ)] +-- Assumes: all the binders of the binding are alrady in the meta-env +rep_val_binds (ValBindsOut binds sigs) + = do { core1 <- rep_binds' (unionManyBags (map snd binds)) + ; core2 <- rep_sigs' sigs + ; return (core1 ++ core2) } + +rep_binds :: LHsBinds Name -> DsM [Core TH.DecQ] +rep_binds binds = do { binds_w_locs <- rep_binds' binds + ; return (de_loc (sort_by_loc binds_w_locs)) } + +rep_binds' :: LHsBinds Name -> DsM [(SrcSpan, Core TH.DecQ)] +rep_binds' binds = mapM rep_bind (bagToList binds) + +rep_bind :: LHsBind Name -> DsM (SrcSpan, Core TH.DecQ) +-- Assumes: all the binders of the binding are alrady in the meta-env + +-- Note GHC treats declarations of a variable (not a pattern) +-- e.g. x = g 5 as a Fun MonoBinds. 
This is indicated by a single match +-- with an empty list of patterns +rep_bind (L loc (FunBind { fun_id = fn, + fun_matches = MatchGroup [L _ (Match [] ty (GRHSs guards wheres))] _ })) + = do { (ss,wherecore) <- repBinds wheres + ; guardcore <- addBinds ss (repGuards guards) + ; fn' <- lookupLBinder fn + ; p <- repPvar fn' + ; ans <- repVal p guardcore wherecore + ; ans' <- wrapGenSyns ss ans + ; return (loc, ans') } + +rep_bind (L loc (FunBind { fun_id = fn, fun_matches = MatchGroup ms _ })) + = do { ms1 <- mapM repClauseTup ms + ; fn' <- lookupLBinder fn + ; ans <- repFun fn' (nonEmptyCoreList ms1) + ; return (loc, ans) } + +rep_bind (L loc (PatBind { pat_lhs = pat, pat_rhs = GRHSs guards wheres })) + = do { patcore <- repLP pat + ; (ss,wherecore) <- repBinds wheres + ; guardcore <- addBinds ss (repGuards guards) + ; ans <- repVal patcore guardcore wherecore + ; ans' <- wrapGenSyns ss ans + ; return (loc, ans') } + +rep_bind (L loc (VarBind { var_id = v, var_rhs = e})) + = do { v' <- lookupBinder v + ; e2 <- repLE e + ; x <- repNormal e2 + ; patcore <- repPvar v' + ; empty_decls <- coreList decQTyConName [] + ; ans <- repVal patcore x empty_decls + ; return (srcLocSpan (getSrcLoc v), ans) } + +----------------------------------------------------------------------------- +-- Since everything in a Bind is mutually recursive we need rename all +-- all the variables simultaneously. For example: +-- [| AndMonoBinds (f x = x + g 2) (g x = f 1 + 2) |] would translate to +-- do { f'1 <- gensym "f" +-- ; g'2 <- gensym "g" +-- ; [ do { x'3 <- gensym "x"; fun f'1 [pvar x'3] [| x + g2 |]}, +-- do { x'4 <- gensym "x"; fun g'2 [pvar x'4] [| f 1 + 2 |]} +-- ]} +-- This requires collecting the bindings (f'1 <- gensym "f"), and the +-- environment ( f |-> f'1 ) from each binding, and then unioning them +-- together. As we do this we collect GenSymBinds's which represent the renamed +-- variables bound by the Bindings. In order not to lose track of these +-- representations we build a shadow datatype MB with the same structure as +-- MonoBinds, but which has slots for the representations + + +----------------------------------------------------------------------------- +-- GHC allows a more general form of lambda abstraction than specified +-- by Haskell 98. In particular it allows guarded lambda's like : +-- (\ x | even x -> 0 | odd x -> 1) at the moment we can't represent this in +-- Haskell Template's Meta.Exp type so we punt if it isn't a simple thing like +-- (\ p1 .. pn -> exp) by causing an error. + +repLambda :: LMatch Name -> DsM (Core TH.ExpQ) +repLambda (L _ (Match ps _ (GRHSs [L _ (GRHS [] e)] EmptyLocalBinds))) + = do { let bndrs = collectPatsBinders ps ; + ; ss <- mkGenSyms bndrs + ; lam <- addBinds ss ( + do { xs <- repLPs ps; body <- repLE e; repLam xs body }) + ; wrapGenSyns ss lam } + +repLambda z = panic "Can't represent a guarded lambda in Template Haskell" + + +----------------------------------------------------------------------------- +-- Patterns +-- repP deals with patterns. It assumes that we have already +-- walked over the pattern(s) once to collect the binders, and +-- have extended the environment. So every pattern-bound +-- variable should already appear in the environment. 
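(Editorial aside, not part of this patch: a concrete example of the PatQ values this section constructs, using the TH.Lib pattern combinators named later in the file (conP, tupP, varP, wildP, mkName); era-appropriate signatures are assumed.)

    import Language.Haskell.TH

    -- Inside a bracket, the pattern  Just (x, _)  is rebuilt as the PatQ
    -- below.  In the real desugaring the binder x is a gensym'd name found
    -- via lookupBinder, while the constructor Just goes through lookupLOcc
    -- and keeps its original name; mkName stands in for both here.
    examplePat :: PatQ
    examplePat = conP (mkName "Just") [tupP [varP (mkName "x"), wildP]]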
+ +-- Process a list of patterns +repLPs :: [LPat Name] -> DsM (Core [TH.PatQ]) +repLPs ps = do { ps' <- mapM repLP ps ; + coreList patQTyConName ps' } + +repLP :: LPat Name -> DsM (Core TH.PatQ) +repLP (L _ p) = repP p + +repP :: Pat Name -> DsM (Core TH.PatQ) +repP (WildPat _) = repPwild +repP (LitPat l) = do { l2 <- repLiteral l; repPlit l2 } +repP (VarPat x) = do { x' <- lookupBinder x; repPvar x' } +repP (LazyPat p) = do { p1 <- repLP p; repPtilde p1 } +repP (AsPat x p) = do { x' <- lookupLBinder x; p1 <- repLP p; repPaspat x' p1 } +repP (ParPat p) = repLP p +repP (ListPat ps _) = do { qs <- repLPs ps; repPlist qs } +repP (TuplePat ps _ _) = do { qs <- repLPs ps; repPtup qs } +repP (ConPatIn dc details) + = do { con_str <- lookupLOcc dc + ; case details of + PrefixCon ps -> do { qs <- repLPs ps; repPcon con_str qs } + RecCon pairs -> do { vs <- sequence $ map lookupLOcc (map fst pairs) + ; ps <- sequence $ map repLP (map snd pairs) + ; fps <- zipWithM (\x y -> rep2 fieldPatName [unC x,unC y]) vs ps + ; fps' <- coreList fieldPatQTyConName fps + ; repPrec con_str fps' } + InfixCon p1 p2 -> do { p1' <- repLP p1; + p2' <- repLP p2; + repPinfix p1' con_str p2' } + } +repP (NPat l (Just _) _ _) = panic "Can't cope with negative overloaded patterns yet (repP (NPat _ (Just _)))" +repP (NPat l Nothing _ _) = do { a <- repOverloadedLiteral l; repPlit a } +repP (SigPatIn p t) = do { p' <- repLP p; t' <- repLTy t; repPsig p' t' } +repP other = panic "Exotic pattern inside meta brackets" + +---------------------------------------------------------- +-- Declaration ordering helpers + +sort_by_loc :: [(SrcSpan, a)] -> [(SrcSpan, a)] +sort_by_loc xs = sortBy comp xs + where comp x y = compare (fst x) (fst y) + +de_loc :: [(a, b)] -> [b] +de_loc = map snd + +---------------------------------------------------------- +-- The meta-environment + +-- A name/identifier association for fresh names of locally bound entities +type GenSymBind = (Name, Id) -- Gensym the string and bind it to the Id + -- I.e. (x, x_id) means + -- let x_id = gensym "x" in ... + +-- Generate a fresh name for a locally bound entity + +mkGenSyms :: [Name] -> DsM [GenSymBind] +-- We can use the existing name. For example: +-- [| \x_77 -> x_77 + x_77 |] +-- desugars to +-- do { x_77 <- genSym "x"; .... 
} +-- We use the same x_77 in the desugared program, but with the type Bndr +-- instead of Int +-- +-- We do make it an Internal name, though (hence localiseName) +-- +-- Nevertheless, it's monadic because we have to generate nameTy +mkGenSyms ns = do { var_ty <- lookupType nameTyConName + ; return [(nm, mkLocalId (localiseName nm) var_ty) | nm <- ns] } + + +addBinds :: [GenSymBind] -> DsM a -> DsM a +-- Add a list of fresh names for locally bound entities to the +-- meta environment (which is part of the state carried around +-- by the desugarer monad) +addBinds bs m = dsExtendMetaEnv (mkNameEnv [(n,Bound id) | (n,id) <- bs]) m + +-- Look up a locally bound name +-- +lookupLBinder :: Located Name -> DsM (Core TH.Name) +lookupLBinder (L _ n) = lookupBinder n + +lookupBinder :: Name -> DsM (Core TH.Name) +lookupBinder n + = do { mb_val <- dsLookupMetaEnv n; + case mb_val of + Just (Bound x) -> return (coreVar x) + other -> pprPanic "DsMeta: failed binder lookup when desugaring a TH bracket:" (ppr n) } + +-- Look up a name that is either locally bound or a global name +-- +-- * If it is a global name, generate the "original name" representation (ie, +-- the <module>:<name> form) for the associated entity +-- +lookupLOcc :: Located Name -> DsM (Core TH.Name) +-- Lookup an occurrence; it can't be a splice. +-- Use the in-scope bindings if they exist +lookupLOcc (L _ n) = lookupOcc n + +lookupOcc :: Name -> DsM (Core TH.Name) +lookupOcc n + = do { mb_val <- dsLookupMetaEnv n ; + case mb_val of + Nothing -> globalVar n + Just (Bound x) -> return (coreVar x) + Just (Splice _) -> pprPanic "repE:lookupOcc" (ppr n) + } + +globalVar :: Name -> DsM (Core TH.Name) +-- Not bound by the meta-env +-- Could be top-level; or could be local +-- f x = $(g [| x |]) +-- Here the x will be local +globalVar name + | isExternalName name + = do { MkC mod <- coreStringLit name_mod + ; MkC occ <- occNameLit name + ; rep2 mk_varg [mod,occ] } + | otherwise + = do { MkC occ <- occNameLit name + ; MkC uni <- coreIntLit (getKey (getUnique name)) + ; rep2 mkNameLName [occ,uni] } + where + name_mod = moduleString (nameModule name) + name_occ = nameOccName name + mk_varg | OccName.isDataOcc name_occ = mkNameG_dName + | OccName.isVarOcc name_occ = mkNameG_vName + | OccName.isTcOcc name_occ = mkNameG_tcName + | otherwise = pprPanic "DsMeta.globalVar" (ppr name) + +lookupType :: Name -- Name of type constructor (e.g. TH.ExpQ) + -> DsM Type -- The type +lookupType tc_name = do { tc <- dsLookupTyCon tc_name ; + return (mkTyConApp tc []) } + +wrapGenSyns :: [GenSymBind] + -> Core (TH.Q a) -> DsM (Core (TH.Q a)) +-- wrapGenSyns [(nm1,id1), (nm2,id2)] y +-- --> bindQ (gensym nm1) (\ id1 -> +-- bindQ (gensym nm2 (\ id2 -> +-- y)) + +wrapGenSyns binds body@(MkC b) + = do { var_ty <- lookupType nameTyConName + ; go var_ty binds } + where + [elt_ty] = tcTyConAppArgs (exprType b) + -- b :: Q a, so we can get the type 'a' by looking at the + -- argument type. NB: this relies on Q being a data/newtype, + -- not a type synonym + + go var_ty [] = return body + go var_ty ((name,id) : binds) + = do { MkC body' <- go var_ty binds + ; lit_str <- occNameLit name + ; gensym_app <- repGensym lit_str + ; repBindQ var_ty elt_ty + gensym_app (MkC (Lam id body')) } + +-- Just like wrapGenSym, but don't actually do the gensym +-- Instead use the existing name: +-- let x = "x" in ... 
+-- Only used for [Decl], and for the class ops in class +-- and instance decls +wrapNongenSyms :: [GenSymBind] -> Core a -> DsM (Core a) +wrapNongenSyms binds (MkC body) + = do { binds' <- mapM do_one binds ; + return (MkC (mkLets binds' body)) } + where + do_one (name,id) + = do { MkC lit_str <- occNameLit name + ; MkC var <- rep2 mkNameName [lit_str] + ; return (NonRec id var) } + +occNameLit :: Name -> DsM (Core String) +occNameLit n = coreStringLit (occNameString (nameOccName n)) + + +-- %********************************************************************* +-- %* * +-- Constructing code +-- %* * +-- %********************************************************************* + +----------------------------------------------------------------------------- +-- PHANTOM TYPES for consistency. In order to make sure we do this correct +-- we invent a new datatype which uses phantom types. + +newtype Core a = MkC CoreExpr +unC (MkC x) = x + +rep2 :: Name -> [ CoreExpr ] -> DsM (Core a) +rep2 n xs = do { id <- dsLookupGlobalId n + ; return (MkC (foldl App (Var id) xs)) } + +-- Then we make "repConstructors" which use the phantom types for each of the +-- smart constructors of the Meta.Meta datatypes. + + +-- %********************************************************************* +-- %* * +-- The 'smart constructors' +-- %* * +-- %********************************************************************* + +--------------- Patterns ----------------- +repPlit :: Core TH.Lit -> DsM (Core TH.PatQ) +repPlit (MkC l) = rep2 litPName [l] + +repPvar :: Core TH.Name -> DsM (Core TH.PatQ) +repPvar (MkC s) = rep2 varPName [s] + +repPtup :: Core [TH.PatQ] -> DsM (Core TH.PatQ) +repPtup (MkC ps) = rep2 tupPName [ps] + +repPcon :: Core TH.Name -> Core [TH.PatQ] -> DsM (Core TH.PatQ) +repPcon (MkC s) (MkC ps) = rep2 conPName [s, ps] + +repPrec :: Core TH.Name -> Core [(TH.Name,TH.PatQ)] -> DsM (Core TH.PatQ) +repPrec (MkC c) (MkC rps) = rep2 recPName [c,rps] + +repPinfix :: Core TH.PatQ -> Core TH.Name -> Core TH.PatQ -> DsM (Core TH.PatQ) +repPinfix (MkC p1) (MkC n) (MkC p2) = rep2 infixPName [p1, n, p2] + +repPtilde :: Core TH.PatQ -> DsM (Core TH.PatQ) +repPtilde (MkC p) = rep2 tildePName [p] + +repPaspat :: Core TH.Name -> Core TH.PatQ -> DsM (Core TH.PatQ) +repPaspat (MkC s) (MkC p) = rep2 asPName [s, p] + +repPwild :: DsM (Core TH.PatQ) +repPwild = rep2 wildPName [] + +repPlist :: Core [TH.PatQ] -> DsM (Core TH.PatQ) +repPlist (MkC ps) = rep2 listPName [ps] + +repPsig :: Core TH.PatQ -> Core TH.TypeQ -> DsM (Core TH.PatQ) +repPsig (MkC p) (MkC t) = rep2 sigPName [p, t] + +--------------- Expressions ----------------- +repVarOrCon :: Name -> Core TH.Name -> DsM (Core TH.ExpQ) +repVarOrCon vc str | isDataOcc (nameOccName vc) = repCon str + | otherwise = repVar str + +repVar :: Core TH.Name -> DsM (Core TH.ExpQ) +repVar (MkC s) = rep2 varEName [s] + +repCon :: Core TH.Name -> DsM (Core TH.ExpQ) +repCon (MkC s) = rep2 conEName [s] + +repLit :: Core TH.Lit -> DsM (Core TH.ExpQ) +repLit (MkC c) = rep2 litEName [c] + +repApp :: Core TH.ExpQ -> Core TH.ExpQ -> DsM (Core TH.ExpQ) +repApp (MkC x) (MkC y) = rep2 appEName [x,y] + +repLam :: Core [TH.PatQ] -> Core TH.ExpQ -> DsM (Core TH.ExpQ) +repLam (MkC ps) (MkC e) = rep2 lamEName [ps, e] + +repTup :: Core [TH.ExpQ] -> DsM (Core TH.ExpQ) +repTup (MkC es) = rep2 tupEName [es] + +repCond :: Core TH.ExpQ -> Core TH.ExpQ -> Core TH.ExpQ -> DsM (Core TH.ExpQ) +repCond (MkC x) (MkC y) (MkC z) = rep2 condEName [x,y,z] + +repLetE :: Core [TH.DecQ] -> Core TH.ExpQ -> DsM (Core 
TH.ExpQ) +repLetE (MkC ds) (MkC e) = rep2 letEName [ds, e] + +repCaseE :: Core TH.ExpQ -> Core [TH.MatchQ] -> DsM( Core TH.ExpQ) +repCaseE (MkC e) (MkC ms) = rep2 caseEName [e, ms] + +repDoE :: Core [TH.StmtQ] -> DsM (Core TH.ExpQ) +repDoE (MkC ss) = rep2 doEName [ss] + +repComp :: Core [TH.StmtQ] -> DsM (Core TH.ExpQ) +repComp (MkC ss) = rep2 compEName [ss] + +repListExp :: Core [TH.ExpQ] -> DsM (Core TH.ExpQ) +repListExp (MkC es) = rep2 listEName [es] + +repSigExp :: Core TH.ExpQ -> Core TH.TypeQ -> DsM (Core TH.ExpQ) +repSigExp (MkC e) (MkC t) = rep2 sigEName [e,t] + +repRecCon :: Core TH.Name -> Core [TH.Q TH.FieldExp]-> DsM (Core TH.ExpQ) +repRecCon (MkC c) (MkC fs) = rep2 recConEName [c,fs] + +repRecUpd :: Core TH.ExpQ -> Core [TH.Q TH.FieldExp] -> DsM (Core TH.ExpQ) +repRecUpd (MkC e) (MkC fs) = rep2 recUpdEName [e,fs] + +repFieldExp :: Core TH.Name -> Core TH.ExpQ -> DsM (Core (TH.Q TH.FieldExp)) +repFieldExp (MkC n) (MkC x) = rep2 fieldExpName [n,x] + +repInfixApp :: Core TH.ExpQ -> Core TH.ExpQ -> Core TH.ExpQ -> DsM (Core TH.ExpQ) +repInfixApp (MkC x) (MkC y) (MkC z) = rep2 infixAppName [x,y,z] + +repSectionL :: Core TH.ExpQ -> Core TH.ExpQ -> DsM (Core TH.ExpQ) +repSectionL (MkC x) (MkC y) = rep2 sectionLName [x,y] + +repSectionR :: Core TH.ExpQ -> Core TH.ExpQ -> DsM (Core TH.ExpQ) +repSectionR (MkC x) (MkC y) = rep2 sectionRName [x,y] + +------------ Right hand sides (guarded expressions) ---- +repGuarded :: Core [TH.Q (TH.Guard, TH.Exp)] -> DsM (Core TH.BodyQ) +repGuarded (MkC pairs) = rep2 guardedBName [pairs] + +repNormal :: Core TH.ExpQ -> DsM (Core TH.BodyQ) +repNormal (MkC e) = rep2 normalBName [e] + +------------ Guards ---- +repLNormalGE :: LHsExpr Name -> LHsExpr Name -> DsM (Core (TH.Q (TH.Guard, TH.Exp))) +repLNormalGE g e = do g' <- repLE g + e' <- repLE e + repNormalGE g' e' + +repNormalGE :: Core TH.ExpQ -> Core TH.ExpQ -> DsM (Core (TH.Q (TH.Guard, TH.Exp))) +repNormalGE (MkC g) (MkC e) = rep2 normalGEName [g, e] + +repPatGE :: Core [TH.StmtQ] -> Core TH.ExpQ -> DsM (Core (TH.Q (TH.Guard, TH.Exp))) +repPatGE (MkC ss) (MkC e) = rep2 patGEName [ss, e] + +------------- Stmts ------------------- +repBindSt :: Core TH.PatQ -> Core TH.ExpQ -> DsM (Core TH.StmtQ) +repBindSt (MkC p) (MkC e) = rep2 bindSName [p,e] + +repLetSt :: Core [TH.DecQ] -> DsM (Core TH.StmtQ) +repLetSt (MkC ds) = rep2 letSName [ds] + +repNoBindSt :: Core TH.ExpQ -> DsM (Core TH.StmtQ) +repNoBindSt (MkC e) = rep2 noBindSName [e] + +-------------- Range (Arithmetic sequences) ----------- +repFrom :: Core TH.ExpQ -> DsM (Core TH.ExpQ) +repFrom (MkC x) = rep2 fromEName [x] + +repFromThen :: Core TH.ExpQ -> Core TH.ExpQ -> DsM (Core TH.ExpQ) +repFromThen (MkC x) (MkC y) = rep2 fromThenEName [x,y] + +repFromTo :: Core TH.ExpQ -> Core TH.ExpQ -> DsM (Core TH.ExpQ) +repFromTo (MkC x) (MkC y) = rep2 fromToEName [x,y] + +repFromThenTo :: Core TH.ExpQ -> Core TH.ExpQ -> Core TH.ExpQ -> DsM (Core TH.ExpQ) +repFromThenTo (MkC x) (MkC y) (MkC z) = rep2 fromThenToEName [x,y,z] + +------------ Match and Clause Tuples ----------- +repMatch :: Core TH.PatQ -> Core TH.BodyQ -> Core [TH.DecQ] -> DsM (Core TH.MatchQ) +repMatch (MkC p) (MkC bod) (MkC ds) = rep2 matchName [p, bod, ds] + +repClause :: Core [TH.PatQ] -> Core TH.BodyQ -> Core [TH.DecQ] -> DsM (Core TH.ClauseQ) +repClause (MkC ps) (MkC bod) (MkC ds) = rep2 clauseName [ps, bod, ds] + +-------------- Dec ----------------------------- +repVal :: Core TH.PatQ -> Core TH.BodyQ -> Core [TH.DecQ] -> DsM (Core TH.DecQ) +repVal (MkC p) (MkC b) (MkC ds) = rep2 
valDName [p, b, ds] + +repFun :: Core TH.Name -> Core [TH.ClauseQ] -> DsM (Core TH.DecQ) +repFun (MkC nm) (MkC b) = rep2 funDName [nm, b] + +repData :: Core TH.CxtQ -> Core TH.Name -> Core [TH.Name] -> Core [TH.ConQ] -> Core [TH.Name] -> DsM (Core TH.DecQ) +repData (MkC cxt) (MkC nm) (MkC tvs) (MkC cons) (MkC derivs) + = rep2 dataDName [cxt, nm, tvs, cons, derivs] + +repNewtype :: Core TH.CxtQ -> Core TH.Name -> Core [TH.Name] -> Core TH.ConQ -> Core [TH.Name] -> DsM (Core TH.DecQ) +repNewtype (MkC cxt) (MkC nm) (MkC tvs) (MkC con) (MkC derivs) + = rep2 newtypeDName [cxt, nm, tvs, con, derivs] + +repTySyn :: Core TH.Name -> Core [TH.Name] -> Core TH.TypeQ -> DsM (Core TH.DecQ) +repTySyn (MkC nm) (MkC tvs) (MkC rhs) = rep2 tySynDName [nm, tvs, rhs] + +repInst :: Core TH.CxtQ -> Core TH.TypeQ -> Core [TH.DecQ] -> DsM (Core TH.DecQ) +repInst (MkC cxt) (MkC ty) (MkC ds) = rep2 instanceDName [cxt, ty, ds] + +repClass :: Core TH.CxtQ -> Core TH.Name -> Core [TH.Name] -> Core [TH.FunDep] -> Core [TH.DecQ] -> DsM (Core TH.DecQ) +repClass (MkC cxt) (MkC cls) (MkC tvs) (MkC fds) (MkC ds) = rep2 classDName [cxt, cls, tvs, fds, ds] + +repFunDep :: Core [TH.Name] -> Core [TH.Name] -> DsM (Core TH.FunDep) +repFunDep (MkC xs) (MkC ys) = rep2 funDepName [xs, ys] + +repProto :: Core TH.Name -> Core TH.TypeQ -> DsM (Core TH.DecQ) +repProto (MkC s) (MkC ty) = rep2 sigDName [s, ty] + +repCtxt :: Core [TH.TypeQ] -> DsM (Core TH.CxtQ) +repCtxt (MkC tys) = rep2 cxtName [tys] + +repConstr :: Core TH.Name -> HsConDetails Name (LBangType Name) + -> DsM (Core TH.ConQ) +repConstr con (PrefixCon ps) + = do arg_tys <- mapM repBangTy ps + arg_tys1 <- coreList strictTypeQTyConName arg_tys + rep2 normalCName [unC con, unC arg_tys1] +repConstr con (RecCon ips) + = do arg_vs <- mapM lookupLOcc (map fst ips) + arg_tys <- mapM repBangTy (map snd ips) + arg_vtys <- zipWithM (\x y -> rep2 varStrictTypeName [unC x, unC y]) + arg_vs arg_tys + arg_vtys' <- coreList varStrictTypeQTyConName arg_vtys + rep2 recCName [unC con, unC arg_vtys'] +repConstr con (InfixCon st1 st2) + = do arg1 <- repBangTy st1 + arg2 <- repBangTy st2 + rep2 infixCName [unC arg1, unC con, unC arg2] + +------------ Types ------------------- + +repTForall :: Core [TH.Name] -> Core TH.CxtQ -> Core TH.TypeQ -> DsM (Core TH.TypeQ) +repTForall (MkC tvars) (MkC ctxt) (MkC ty) + = rep2 forallTName [tvars, ctxt, ty] + +repTvar :: Core TH.Name -> DsM (Core TH.TypeQ) +repTvar (MkC s) = rep2 varTName [s] + +repTapp :: Core TH.TypeQ -> Core TH.TypeQ -> DsM (Core TH.TypeQ) +repTapp (MkC t1) (MkC t2) = rep2 appTName [t1,t2] + +repTapps :: Core TH.TypeQ -> [Core TH.TypeQ] -> DsM (Core TH.TypeQ) +repTapps f [] = return f +repTapps f (t:ts) = do { f1 <- repTapp f t; repTapps f1 ts } + +--------- Type constructors -------------- + +repNamedTyCon :: Core TH.Name -> DsM (Core TH.TypeQ) +repNamedTyCon (MkC s) = rep2 conTName [s] + +repTupleTyCon :: Int -> DsM (Core TH.TypeQ) +-- Note: not Core Int; it's easier to be direct here +repTupleTyCon i = rep2 tupleTName [mkIntExpr (fromIntegral i)] + +repArrowTyCon :: DsM (Core TH.TypeQ) +repArrowTyCon = rep2 arrowTName [] + +repListTyCon :: DsM (Core TH.TypeQ) +repListTyCon = rep2 listTName [] + + +---------------------------------------------------------- +-- Literals + +repLiteral :: HsLit -> DsM (Core TH.Lit) +repLiteral lit + = do lit' <- case lit of + HsIntPrim i -> mk_integer i + HsInt i -> mk_integer i + HsFloatPrim r -> mk_rational r + HsDoublePrim r -> mk_rational r + _ -> return lit + lit_expr <- dsLit lit' + rep2 lit_name 
[lit_expr] + where + lit_name = case lit of + HsInteger _ _ -> integerLName + HsInt _ -> integerLName + HsIntPrim _ -> intPrimLName + HsFloatPrim _ -> floatPrimLName + HsDoublePrim _ -> doublePrimLName + HsChar _ -> charLName + HsString _ -> stringLName + HsRat _ _ -> rationalLName + other -> uh_oh + uh_oh = pprPanic "DsMeta.repLiteral: trying to represent exotic literal" + (ppr lit) + +mk_integer i = do integer_ty <- lookupType integerTyConName + return $ HsInteger i integer_ty +mk_rational r = do rat_ty <- lookupType rationalTyConName + return $ HsRat r rat_ty + +repOverloadedLiteral :: HsOverLit Name -> DsM (Core TH.Lit) +repOverloadedLiteral (HsIntegral i _) = do { lit <- mk_integer i; repLiteral lit } +repOverloadedLiteral (HsFractional f _) = do { lit <- mk_rational f; repLiteral lit } + -- The type Rational will be in the environment, becuase + -- the smart constructor 'TH.Syntax.rationalL' uses it in its type, + -- and rationalL is sucked in when any TH stuff is used + +--------------- Miscellaneous ------------------- + +repGensym :: Core String -> DsM (Core (TH.Q TH.Name)) +repGensym (MkC lit_str) = rep2 newNameName [lit_str] + +repBindQ :: Type -> Type -- a and b + -> Core (TH.Q a) -> Core (a -> TH.Q b) -> DsM (Core (TH.Q b)) +repBindQ ty_a ty_b (MkC x) (MkC y) + = rep2 bindQName [Type ty_a, Type ty_b, x, y] + +repSequenceQ :: Type -> Core [TH.Q a] -> DsM (Core (TH.Q [a])) +repSequenceQ ty_a (MkC list) + = rep2 sequenceQName [Type ty_a, list] + +------------ Lists and Tuples ------------------- +-- turn a list of patterns into a single pattern matching a list + +coreList :: Name -- Of the TyCon of the element type + -> [Core a] -> DsM (Core [a]) +coreList tc_name es + = do { elt_ty <- lookupType tc_name; return (coreList' elt_ty es) } + +coreList' :: Type -- The element type + -> [Core a] -> Core [a] +coreList' elt_ty es = MkC (mkListExpr elt_ty (map unC es )) + +nonEmptyCoreList :: [Core a] -> Core [a] + -- The list must be non-empty so we can get the element type + -- Otherwise use coreList +nonEmptyCoreList [] = panic "coreList: empty argument" +nonEmptyCoreList xs@(MkC x:_) = MkC (mkListExpr (exprType x) (map unC xs)) + +corePair :: (Core a, Core b) -> Core (a,b) +corePair (MkC x, MkC y) = MkC (mkCoreTup [x,y]) + +coreStringLit :: String -> DsM (Core String) +coreStringLit s = do { z <- mkStringExpr s; return(MkC z) } + +coreIntLit :: Int -> DsM (Core Int) +coreIntLit i = return (MkC (mkIntExpr (fromIntegral i))) + +coreVar :: Id -> Core TH.Name -- The Id has type Name +coreVar id = MkC (Var id) + + + +-- %************************************************************************ +-- %* * +-- The known-key names for Template Haskell +-- %* * +-- %************************************************************************ + +-- To add a name, do three things +-- +-- 1) Allocate a key +-- 2) Make a "Name" +-- 3) Add the name to knownKeyNames + +templateHaskellNames :: [Name] +-- The names that are implicitly mentioned by ``bracket'' +-- Should stay in sync with the import list of DsMeta + +templateHaskellNames = [ + returnQName, bindQName, sequenceQName, newNameName, liftName, + mkNameName, mkNameG_vName, mkNameG_dName, mkNameG_tcName, mkNameLName, + + -- Lit + charLName, stringLName, integerLName, intPrimLName, + floatPrimLName, doublePrimLName, rationalLName, + -- Pat + litPName, varPName, tupPName, conPName, tildePName, infixPName, + asPName, wildPName, recPName, listPName, sigPName, + -- FieldPat + fieldPatName, + -- Match + matchName, + -- Clause + clauseName, + -- Exp + 
varEName, conEName, litEName, appEName, infixEName, + infixAppName, sectionLName, sectionRName, lamEName, tupEName, + condEName, letEName, caseEName, doEName, compEName, + fromEName, fromThenEName, fromToEName, fromThenToEName, + listEName, sigEName, recConEName, recUpdEName, + -- FieldExp + fieldExpName, + -- Body + guardedBName, normalBName, + -- Guard + normalGEName, patGEName, + -- Stmt + bindSName, letSName, noBindSName, parSName, + -- Dec + funDName, valDName, dataDName, newtypeDName, tySynDName, + classDName, instanceDName, sigDName, forImpDName, + -- Cxt + cxtName, + -- Strict + isStrictName, notStrictName, + -- Con + normalCName, recCName, infixCName, forallCName, + -- StrictType + strictTypeName, + -- VarStrictType + varStrictTypeName, + -- Type + forallTName, varTName, conTName, appTName, + tupleTName, arrowTName, listTName, + -- Callconv + cCallName, stdCallName, + -- Safety + unsafeName, + safeName, + threadsafeName, + -- FunDep + funDepName, + + -- And the tycons + qTyConName, nameTyConName, patTyConName, fieldPatTyConName, matchQTyConName, + clauseQTyConName, expQTyConName, fieldExpTyConName, stmtQTyConName, + decQTyConName, conQTyConName, strictTypeQTyConName, + varStrictTypeQTyConName, typeQTyConName, expTyConName, decTyConName, + typeTyConName, matchTyConName, clauseTyConName, patQTyConName, + fieldPatQTyConName, fieldExpQTyConName, funDepTyConName] + +thSyn :: Module +thSyn = mkModule "Language.Haskell.TH.Syntax" +thLib = mkModule "Language.Haskell.TH.Lib" + +mk_known_key_name mod space str uniq + = mkExternalName uniq mod (mkOccNameFS space str) + Nothing noSrcLoc + +libFun = mk_known_key_name thLib OccName.varName +libTc = mk_known_key_name thLib OccName.tcName +thFun = mk_known_key_name thSyn OccName.varName +thTc = mk_known_key_name thSyn OccName.tcName + +-------------------- TH.Syntax ----------------------- +qTyConName = thTc FSLIT("Q") qTyConKey +nameTyConName = thTc FSLIT("Name") nameTyConKey +fieldExpTyConName = thTc FSLIT("FieldExp") fieldExpTyConKey +patTyConName = thTc FSLIT("Pat") patTyConKey +fieldPatTyConName = thTc FSLIT("FieldPat") fieldPatTyConKey +expTyConName = thTc FSLIT("Exp") expTyConKey +decTyConName = thTc FSLIT("Dec") decTyConKey +typeTyConName = thTc FSLIT("Type") typeTyConKey +matchTyConName = thTc FSLIT("Match") matchTyConKey +clauseTyConName = thTc FSLIT("Clause") clauseTyConKey +funDepTyConName = thTc FSLIT("FunDep") funDepTyConKey + +returnQName = thFun FSLIT("returnQ") returnQIdKey +bindQName = thFun FSLIT("bindQ") bindQIdKey +sequenceQName = thFun FSLIT("sequenceQ") sequenceQIdKey +newNameName = thFun FSLIT("newName") newNameIdKey +liftName = thFun FSLIT("lift") liftIdKey +mkNameName = thFun FSLIT("mkName") mkNameIdKey +mkNameG_vName = thFun FSLIT("mkNameG_v") mkNameG_vIdKey +mkNameG_dName = thFun FSLIT("mkNameG_d") mkNameG_dIdKey +mkNameG_tcName = thFun FSLIT("mkNameG_tc") mkNameG_tcIdKey +mkNameLName = thFun FSLIT("mkNameL") mkNameLIdKey + + +-------------------- TH.Lib ----------------------- +-- data Lit = ... +charLName = libFun FSLIT("charL") charLIdKey +stringLName = libFun FSLIT("stringL") stringLIdKey +integerLName = libFun FSLIT("integerL") integerLIdKey +intPrimLName = libFun FSLIT("intPrimL") intPrimLIdKey +floatPrimLName = libFun FSLIT("floatPrimL") floatPrimLIdKey +doublePrimLName = libFun FSLIT("doublePrimL") doublePrimLIdKey +rationalLName = libFun FSLIT("rationalL") rationalLIdKey + +-- data Pat = ... 
+litPName = libFun FSLIT("litP") litPIdKey +varPName = libFun FSLIT("varP") varPIdKey +tupPName = libFun FSLIT("tupP") tupPIdKey +conPName = libFun FSLIT("conP") conPIdKey +infixPName = libFun FSLIT("infixP") infixPIdKey +tildePName = libFun FSLIT("tildeP") tildePIdKey +asPName = libFun FSLIT("asP") asPIdKey +wildPName = libFun FSLIT("wildP") wildPIdKey +recPName = libFun FSLIT("recP") recPIdKey +listPName = libFun FSLIT("listP") listPIdKey +sigPName = libFun FSLIT("sigP") sigPIdKey + +-- type FieldPat = ... +fieldPatName = libFun FSLIT("fieldPat") fieldPatIdKey + +-- data Match = ... +matchName = libFun FSLIT("match") matchIdKey + +-- data Clause = ... +clauseName = libFun FSLIT("clause") clauseIdKey + +-- data Exp = ... +varEName = libFun FSLIT("varE") varEIdKey +conEName = libFun FSLIT("conE") conEIdKey +litEName = libFun FSLIT("litE") litEIdKey +appEName = libFun FSLIT("appE") appEIdKey +infixEName = libFun FSLIT("infixE") infixEIdKey +infixAppName = libFun FSLIT("infixApp") infixAppIdKey +sectionLName = libFun FSLIT("sectionL") sectionLIdKey +sectionRName = libFun FSLIT("sectionR") sectionRIdKey +lamEName = libFun FSLIT("lamE") lamEIdKey +tupEName = libFun FSLIT("tupE") tupEIdKey +condEName = libFun FSLIT("condE") condEIdKey +letEName = libFun FSLIT("letE") letEIdKey +caseEName = libFun FSLIT("caseE") caseEIdKey +doEName = libFun FSLIT("doE") doEIdKey +compEName = libFun FSLIT("compE") compEIdKey +-- ArithSeq skips a level +fromEName = libFun FSLIT("fromE") fromEIdKey +fromThenEName = libFun FSLIT("fromThenE") fromThenEIdKey +fromToEName = libFun FSLIT("fromToE") fromToEIdKey +fromThenToEName = libFun FSLIT("fromThenToE") fromThenToEIdKey +-- end ArithSeq +listEName = libFun FSLIT("listE") listEIdKey +sigEName = libFun FSLIT("sigE") sigEIdKey +recConEName = libFun FSLIT("recConE") recConEIdKey +recUpdEName = libFun FSLIT("recUpdE") recUpdEIdKey + +-- type FieldExp = ... +fieldExpName = libFun FSLIT("fieldExp") fieldExpIdKey + +-- data Body = ... +guardedBName = libFun FSLIT("guardedB") guardedBIdKey +normalBName = libFun FSLIT("normalB") normalBIdKey + +-- data Guard = ... +normalGEName = libFun FSLIT("normalGE") normalGEIdKey +patGEName = libFun FSLIT("patGE") patGEIdKey + +-- data Stmt = ... +bindSName = libFun FSLIT("bindS") bindSIdKey +letSName = libFun FSLIT("letS") letSIdKey +noBindSName = libFun FSLIT("noBindS") noBindSIdKey +parSName = libFun FSLIT("parS") parSIdKey + +-- data Dec = ... +funDName = libFun FSLIT("funD") funDIdKey +valDName = libFun FSLIT("valD") valDIdKey +dataDName = libFun FSLIT("dataD") dataDIdKey +newtypeDName = libFun FSLIT("newtypeD") newtypeDIdKey +tySynDName = libFun FSLIT("tySynD") tySynDIdKey +classDName = libFun FSLIT("classD") classDIdKey +instanceDName = libFun FSLIT("instanceD") instanceDIdKey +sigDName = libFun FSLIT("sigD") sigDIdKey +forImpDName = libFun FSLIT("forImpD") forImpDIdKey + +-- type Ctxt = ... +cxtName = libFun FSLIT("cxt") cxtIdKey + +-- data Strict = ... +isStrictName = libFun FSLIT("isStrict") isStrictKey +notStrictName = libFun FSLIT("notStrict") notStrictKey + +-- data Con = ... +normalCName = libFun FSLIT("normalC") normalCIdKey +recCName = libFun FSLIT("recC") recCIdKey +infixCName = libFun FSLIT("infixC") infixCIdKey +forallCName = libFun FSLIT("forallC") forallCIdKey + +-- type StrictType = ... +strictTypeName = libFun FSLIT("strictType") strictTKey + +-- type VarStrictType = ... +varStrictTypeName = libFun FSLIT("varStrictType") varStrictTKey + +-- data Type = ... 
+forallTName = libFun FSLIT("forallT") forallTIdKey +varTName = libFun FSLIT("varT") varTIdKey +conTName = libFun FSLIT("conT") conTIdKey +tupleTName = libFun FSLIT("tupleT") tupleTIdKey +arrowTName = libFun FSLIT("arrowT") arrowTIdKey +listTName = libFun FSLIT("listT") listTIdKey +appTName = libFun FSLIT("appT") appTIdKey + +-- data Callconv = ... +cCallName = libFun FSLIT("cCall") cCallIdKey +stdCallName = libFun FSLIT("stdCall") stdCallIdKey + +-- data Safety = ... +unsafeName = libFun FSLIT("unsafe") unsafeIdKey +safeName = libFun FSLIT("safe") safeIdKey +threadsafeName = libFun FSLIT("threadsafe") threadsafeIdKey + +-- data FunDep = ... +funDepName = libFun FSLIT("funDep") funDepIdKey + +matchQTyConName = libTc FSLIT("MatchQ") matchQTyConKey +clauseQTyConName = libTc FSLIT("ClauseQ") clauseQTyConKey +expQTyConName = libTc FSLIT("ExpQ") expQTyConKey +stmtQTyConName = libTc FSLIT("StmtQ") stmtQTyConKey +decQTyConName = libTc FSLIT("DecQ") decQTyConKey +conQTyConName = libTc FSLIT("ConQ") conQTyConKey +strictTypeQTyConName = libTc FSLIT("StrictTypeQ") strictTypeQTyConKey +varStrictTypeQTyConName = libTc FSLIT("VarStrictTypeQ") varStrictTypeQTyConKey +typeQTyConName = libTc FSLIT("TypeQ") typeQTyConKey +fieldExpQTyConName = libTc FSLIT("FieldExpQ") fieldExpQTyConKey +patQTyConName = libTc FSLIT("PatQ") patQTyConKey +fieldPatQTyConName = libTc FSLIT("FieldPatQ") fieldPatQTyConKey + +-- TyConUniques available: 100-129 +-- Check in PrelNames if you want to change this + +expTyConKey = mkPreludeTyConUnique 100 +matchTyConKey = mkPreludeTyConUnique 101 +clauseTyConKey = mkPreludeTyConUnique 102 +qTyConKey = mkPreludeTyConUnique 103 +expQTyConKey = mkPreludeTyConUnique 104 +decQTyConKey = mkPreludeTyConUnique 105 +patTyConKey = mkPreludeTyConUnique 106 +matchQTyConKey = mkPreludeTyConUnique 107 +clauseQTyConKey = mkPreludeTyConUnique 108 +stmtQTyConKey = mkPreludeTyConUnique 109 +conQTyConKey = mkPreludeTyConUnique 110 +typeQTyConKey = mkPreludeTyConUnique 111 +typeTyConKey = mkPreludeTyConUnique 112 +decTyConKey = mkPreludeTyConUnique 113 +varStrictTypeQTyConKey = mkPreludeTyConUnique 114 +strictTypeQTyConKey = mkPreludeTyConUnique 115 +fieldExpTyConKey = mkPreludeTyConUnique 116 +fieldPatTyConKey = mkPreludeTyConUnique 117 +nameTyConKey = mkPreludeTyConUnique 118 +patQTyConKey = mkPreludeTyConUnique 119 +fieldPatQTyConKey = mkPreludeTyConUnique 120 +fieldExpQTyConKey = mkPreludeTyConUnique 121 +funDepTyConKey = mkPreludeTyConUnique 122 + +-- IdUniques available: 200-399 +-- If you want to change this, make sure you check in PrelNames + +returnQIdKey = mkPreludeMiscIdUnique 200 +bindQIdKey = mkPreludeMiscIdUnique 201 +sequenceQIdKey = mkPreludeMiscIdUnique 202 +liftIdKey = mkPreludeMiscIdUnique 203 +newNameIdKey = mkPreludeMiscIdUnique 204 +mkNameIdKey = mkPreludeMiscIdUnique 205 +mkNameG_vIdKey = mkPreludeMiscIdUnique 206 +mkNameG_dIdKey = mkPreludeMiscIdUnique 207 +mkNameG_tcIdKey = mkPreludeMiscIdUnique 208 +mkNameLIdKey = mkPreludeMiscIdUnique 209 + + +-- data Lit = ... +charLIdKey = mkPreludeMiscIdUnique 210 +stringLIdKey = mkPreludeMiscIdUnique 211 +integerLIdKey = mkPreludeMiscIdUnique 212 +intPrimLIdKey = mkPreludeMiscIdUnique 213 +floatPrimLIdKey = mkPreludeMiscIdUnique 214 +doublePrimLIdKey = mkPreludeMiscIdUnique 215 +rationalLIdKey = mkPreludeMiscIdUnique 216 + +-- data Pat = ... 
+litPIdKey = mkPreludeMiscIdUnique 220 +varPIdKey = mkPreludeMiscIdUnique 221 +tupPIdKey = mkPreludeMiscIdUnique 222 +conPIdKey = mkPreludeMiscIdUnique 223 +infixPIdKey = mkPreludeMiscIdUnique 312 +tildePIdKey = mkPreludeMiscIdUnique 224 +asPIdKey = mkPreludeMiscIdUnique 225 +wildPIdKey = mkPreludeMiscIdUnique 226 +recPIdKey = mkPreludeMiscIdUnique 227 +listPIdKey = mkPreludeMiscIdUnique 228 +sigPIdKey = mkPreludeMiscIdUnique 229 + +-- type FieldPat = ... +fieldPatIdKey = mkPreludeMiscIdUnique 230 + +-- data Match = ... +matchIdKey = mkPreludeMiscIdUnique 231 + +-- data Clause = ... +clauseIdKey = mkPreludeMiscIdUnique 232 + +-- data Exp = ... +varEIdKey = mkPreludeMiscIdUnique 240 +conEIdKey = mkPreludeMiscIdUnique 241 +litEIdKey = mkPreludeMiscIdUnique 242 +appEIdKey = mkPreludeMiscIdUnique 243 +infixEIdKey = mkPreludeMiscIdUnique 244 +infixAppIdKey = mkPreludeMiscIdUnique 245 +sectionLIdKey = mkPreludeMiscIdUnique 246 +sectionRIdKey = mkPreludeMiscIdUnique 247 +lamEIdKey = mkPreludeMiscIdUnique 248 +tupEIdKey = mkPreludeMiscIdUnique 249 +condEIdKey = mkPreludeMiscIdUnique 250 +letEIdKey = mkPreludeMiscIdUnique 251 +caseEIdKey = mkPreludeMiscIdUnique 252 +doEIdKey = mkPreludeMiscIdUnique 253 +compEIdKey = mkPreludeMiscIdUnique 254 +fromEIdKey = mkPreludeMiscIdUnique 255 +fromThenEIdKey = mkPreludeMiscIdUnique 256 +fromToEIdKey = mkPreludeMiscIdUnique 257 +fromThenToEIdKey = mkPreludeMiscIdUnique 258 +listEIdKey = mkPreludeMiscIdUnique 259 +sigEIdKey = mkPreludeMiscIdUnique 260 +recConEIdKey = mkPreludeMiscIdUnique 261 +recUpdEIdKey = mkPreludeMiscIdUnique 262 + +-- type FieldExp = ... +fieldExpIdKey = mkPreludeMiscIdUnique 265 + +-- data Body = ... +guardedBIdKey = mkPreludeMiscIdUnique 266 +normalBIdKey = mkPreludeMiscIdUnique 267 + +-- data Guard = ... +normalGEIdKey = mkPreludeMiscIdUnique 310 +patGEIdKey = mkPreludeMiscIdUnique 311 + +-- data Stmt = ... +bindSIdKey = mkPreludeMiscIdUnique 268 +letSIdKey = mkPreludeMiscIdUnique 269 +noBindSIdKey = mkPreludeMiscIdUnique 270 +parSIdKey = mkPreludeMiscIdUnique 271 + +-- data Dec = ... +funDIdKey = mkPreludeMiscIdUnique 272 +valDIdKey = mkPreludeMiscIdUnique 273 +dataDIdKey = mkPreludeMiscIdUnique 274 +newtypeDIdKey = mkPreludeMiscIdUnique 275 +tySynDIdKey = mkPreludeMiscIdUnique 276 +classDIdKey = mkPreludeMiscIdUnique 277 +instanceDIdKey = mkPreludeMiscIdUnique 278 +sigDIdKey = mkPreludeMiscIdUnique 279 +forImpDIdKey = mkPreludeMiscIdUnique 297 + +-- type Cxt = ... +cxtIdKey = mkPreludeMiscIdUnique 280 + +-- data Strict = ... +isStrictKey = mkPreludeMiscIdUnique 281 +notStrictKey = mkPreludeMiscIdUnique 282 + +-- data Con = ... +normalCIdKey = mkPreludeMiscIdUnique 283 +recCIdKey = mkPreludeMiscIdUnique 284 +infixCIdKey = mkPreludeMiscIdUnique 285 +forallCIdKey = mkPreludeMiscIdUnique 288 + +-- type StrictType = ... +strictTKey = mkPreludeMiscIdUnique 286 + +-- type VarStrictType = ... +varStrictTKey = mkPreludeMiscIdUnique 287 + +-- data Type = ... +forallTIdKey = mkPreludeMiscIdUnique 290 +varTIdKey = mkPreludeMiscIdUnique 291 +conTIdKey = mkPreludeMiscIdUnique 292 +tupleTIdKey = mkPreludeMiscIdUnique 294 +arrowTIdKey = mkPreludeMiscIdUnique 295 +listTIdKey = mkPreludeMiscIdUnique 296 +appTIdKey = mkPreludeMiscIdUnique 293 + +-- data Callconv = ... +cCallIdKey = mkPreludeMiscIdUnique 300 +stdCallIdKey = mkPreludeMiscIdUnique 301 + +-- data Safety = ... +unsafeIdKey = mkPreludeMiscIdUnique 305 +safeIdKey = mkPreludeMiscIdUnique 306 +threadsafeIdKey = mkPreludeMiscIdUnique 307 + +-- data FunDep = ... 
+funDepIdKey = mkPreludeMiscIdUnique 320 + diff --git a/compiler/deSugar/DsMonad.lhs b/compiler/deSugar/DsMonad.lhs new file mode 100644 index 0000000000..f24dee4905 --- /dev/null +++ b/compiler/deSugar/DsMonad.lhs @@ -0,0 +1,285 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[DsMonad]{@DsMonad@: monadery used in desugaring} + +\begin{code} +module DsMonad ( + DsM, mappM, mapAndUnzipM, + initDs, returnDs, thenDs, listDs, fixDs, mapAndUnzipDs, + foldlDs, foldrDs, + + newTyVarsDs, newLocalName, + duplicateLocalDs, newSysLocalDs, newSysLocalsDs, newUniqueId, + newFailLocalDs, + getSrcSpanDs, putSrcSpanDs, + getModuleDs, + newUnique, + UniqSupply, newUniqueSupply, + getDOptsDs, + dsLookupGlobal, dsLookupGlobalId, dsLookupTyCon, dsLookupDataCon, + + DsMetaEnv, DsMetaVal(..), dsLookupMetaEnv, dsExtendMetaEnv, + + -- Warnings + DsWarning, dsWarn, + + -- Data types + DsMatchContext(..), + EquationInfo(..), MatchResult(..), DsWrapper, idWrapper, + CanItFail(..), orFail + ) where + +#include "HsVersions.h" + +import TcRnMonad +import CoreSyn ( CoreExpr ) +import HsSyn ( HsExpr, HsMatchContext, Pat ) +import TcIface ( tcIfaceGlobal ) +import RdrName ( GlobalRdrEnv ) +import HscTypes ( TyThing(..), TypeEnv, HscEnv, + tyThingId, tyThingTyCon, tyThingDataCon, unQualInScope ) +import Bag ( emptyBag, snocBag, Bag ) +import DataCon ( DataCon ) +import TyCon ( TyCon ) +import Id ( mkSysLocal, setIdUnique, Id ) +import Module ( Module ) +import Var ( TyVar, setTyVarUnique ) +import Outputable +import SrcLoc ( noSrcSpan, SrcSpan ) +import Type ( Type ) +import UniqSupply ( UniqSupply, uniqsFromSupply ) +import Name ( Name, nameOccName ) +import NameEnv +import OccName ( occNameFS ) +import DynFlags ( DynFlags ) +import ErrUtils ( WarnMsg, mkWarnMsg ) +import Bag ( mapBag ) + +import DATA_IOREF ( newIORef, readIORef ) + +infixr 9 `thenDs` +\end{code} + +%************************************************************************ +%* * + Data types for the desugarer +%* * +%************************************************************************ + +\begin{code} +data DsMatchContext + = DsMatchContext (HsMatchContext Name) SrcSpan + | NoMatchContext + deriving () + +data EquationInfo + = EqnInfo { eqn_wrap :: DsWrapper, -- Bindings + eqn_pats :: [Pat Id], -- The patterns for an eqn + eqn_rhs :: MatchResult } -- What to do after match + +type DsWrapper = CoreExpr -> CoreExpr +idWrapper e = e + +-- The semantics of (match vs (EqnInfo wrap pats rhs)) is the MatchResult +-- \fail. wrap (case vs of { pats -> rhs fail }) +-- where vs are not bound by wrap + + +-- A MatchResult is an expression with a hole in it +data MatchResult + = MatchResult + CanItFail -- Tells whether the failure expression is used + (CoreExpr -> DsM CoreExpr) + -- Takes a expression to plug in at the + -- failure point(s). The expression should + -- be duplicatable! 
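-- (Editorial sketch, not part of this patch.)  A MatchResult is consumed
-- by applying its fill-in function to a suitable failure expression; a
-- minimal, hypothetical driver would look like this:
runMatchResultSketch :: MatchResult -> CoreExpr -> DsM CoreExpr
runMatchResultSketch (MatchResult _ body_fn) fail_expr = body_fn fail_expr
-- In practice callers consult the CanItFail flag first, so that no failure
-- expression need be constructed when the match cannot fail.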
+ +data CanItFail = CanFail | CantFail + +orFail CantFail CantFail = CantFail +orFail _ _ = CanFail +\end{code} + + +%************************************************************************ +%* * + Monad stuff +%* * +%************************************************************************ + +Now the mondo monad magic (yes, @DsM@ is a silly name)---carry around +a @UniqueSupply@ and some annotations, which +presumably include source-file location information: +\begin{code} +type DsM result = TcRnIf DsGblEnv DsLclEnv result + +-- Compatibility functions +fixDs = fixM +thenDs = thenM +returnDs = returnM +listDs = sequenceM +foldlDs = foldlM +foldrDs = foldrM +mapAndUnzipDs = mapAndUnzipM + + +type DsWarning = (SrcSpan, SDoc) + -- Not quite the same as a WarnMsg, we have an SDoc here + -- and we'll do the print_unqual stuff later on to turn it + -- into a Doc. + +data DsGblEnv = DsGblEnv { + ds_mod :: Module, -- For SCC profiling + ds_warns :: IORef (Bag DsWarning), -- Warning messages + ds_if_env :: (IfGblEnv, IfLclEnv) -- Used for looking up global, + -- possibly-imported things + } + +data DsLclEnv = DsLclEnv { + ds_meta :: DsMetaEnv, -- Template Haskell bindings + ds_loc :: SrcSpan -- to put in pattern-matching error msgs + } + +-- Inside [| |] brackets, the desugarer looks +-- up variables in the DsMetaEnv +type DsMetaEnv = NameEnv DsMetaVal + +data DsMetaVal + = Bound Id -- Bound by a pattern inside the [| |]. + -- Will be dynamically alpha renamed. + -- The Id has type THSyntax.Var + + | Splice (HsExpr Id) -- These bindings are introduced by + -- the PendingSplices on a HsBracketOut + +-- initDs returns the UniqSupply out the end (not just the result) + +initDs :: HscEnv + -> Module -> GlobalRdrEnv -> TypeEnv + -> DsM a + -> IO (a, Bag WarnMsg) + +initDs hsc_env mod rdr_env type_env thing_inside + = do { warn_var <- newIORef emptyBag + ; let { if_genv = IfGblEnv { if_rec_types = Just (mod, return type_env) } + ; if_lenv = mkIfLclEnv mod (ptext SLIT("GHC error in desugarer lookup in") <+> ppr mod) + ; gbl_env = DsGblEnv { ds_mod = mod, + ds_if_env = (if_genv, if_lenv), + ds_warns = warn_var } + ; lcl_env = DsLclEnv { ds_meta = emptyNameEnv, + ds_loc = noSrcSpan } } + + ; res <- initTcRnIf 'd' hsc_env gbl_env lcl_env thing_inside + + ; warns <- readIORef warn_var + ; return (res, mapBag mk_warn warns) + } + where + print_unqual = unQualInScope rdr_env + + mk_warn :: (SrcSpan,SDoc) -> WarnMsg + mk_warn (loc,sdoc) = mkWarnMsg loc print_unqual sdoc +\end{code} + +%************************************************************************ +%* * + Operations in the monad +%* * +%************************************************************************ + +And all this mysterious stuff is so we can occasionally reach out and +grab one or more names. @newLocalDs@ isn't exported---exported +functions are defined with it. The difference in name-strings makes +it easier to read debugging output. 
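As a rough usage sketch (an editor's example, not part of this module): a
hypothetical helper that shares an arbitrary expression through a fresh
``ds'' local, written with the combinators defined below.  It assumes
@exprType@ from @CoreUtils@ and @mkDsLet@ from @DsUtils@, neither of which
this module actually imports; the same shape occurs for real in
@dsSyntaxTable@ in @DsUtils@ later in this patch.

\begin{code}
-- Hypothetical sketch only: bind rhs to a fresh system local so the caller
-- can mention it several times without duplicating the work it does.
shareExpr :: CoreExpr -> DsM (Id, CoreExpr -> CoreExpr)
shareExpr rhs
  = newSysLocalDs (exprType rhs)  `thenDs` \ v ->
    returnDs (v, \ body -> mkDsLet (NonRec v rhs) body)
\end{code}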
+ +\begin{code} +-- Make a new Id with the same print name, but different type, and new unique +newUniqueId :: Name -> Type -> DsM Id +newUniqueId id ty + = newUnique `thenDs` \ uniq -> + returnDs (mkSysLocal (occNameFS (nameOccName id)) uniq ty) + +duplicateLocalDs :: Id -> DsM Id +duplicateLocalDs old_local + = newUnique `thenDs` \ uniq -> + returnDs (setIdUnique old_local uniq) + +newSysLocalDs, newFailLocalDs :: Type -> DsM Id +newSysLocalDs ty + = newUnique `thenDs` \ uniq -> + returnDs (mkSysLocal FSLIT("ds") uniq ty) + +newSysLocalsDs tys = mappM newSysLocalDs tys + +newFailLocalDs ty + = newUnique `thenDs` \ uniq -> + returnDs (mkSysLocal FSLIT("fail") uniq ty) + -- The UserLocal bit just helps make the code a little clearer +\end{code} + +\begin{code} +newTyVarsDs :: [TyVar] -> DsM [TyVar] +newTyVarsDs tyvar_tmpls + = newUniqueSupply `thenDs` \ uniqs -> + returnDs (zipWith setTyVarUnique tyvar_tmpls (uniqsFromSupply uniqs)) +\end{code} + +We can also reach out and either set/grab location information from +the @SrcSpan@ being carried around. + +\begin{code} +getDOptsDs :: DsM DynFlags +getDOptsDs = getDOpts + +getModuleDs :: DsM Module +getModuleDs = do { env <- getGblEnv; return (ds_mod env) } + +getSrcSpanDs :: DsM SrcSpan +getSrcSpanDs = do { env <- getLclEnv; return (ds_loc env) } + +putSrcSpanDs :: SrcSpan -> DsM a -> DsM a +putSrcSpanDs new_loc thing_inside = updLclEnv (\ env -> env {ds_loc = new_loc}) thing_inside + +dsWarn :: SDoc -> DsM () +dsWarn warn = do { env <- getGblEnv + ; loc <- getSrcSpanDs + ; updMutVar (ds_warns env) (`snocBag` (loc,msg)) } + where + msg = ptext SLIT("Warning:") <+> warn +\end{code} + +\begin{code} +dsLookupGlobal :: Name -> DsM TyThing +-- Very like TcEnv.tcLookupGlobal +dsLookupGlobal name + = do { env <- getGblEnv + ; setEnvs (ds_if_env env) + (tcIfaceGlobal name) } + +dsLookupGlobalId :: Name -> DsM Id +dsLookupGlobalId name + = dsLookupGlobal name `thenDs` \ thing -> + returnDs (tyThingId thing) + +dsLookupTyCon :: Name -> DsM TyCon +dsLookupTyCon name + = dsLookupGlobal name `thenDs` \ thing -> + returnDs (tyThingTyCon thing) + +dsLookupDataCon :: Name -> DsM DataCon +dsLookupDataCon name + = dsLookupGlobal name `thenDs` \ thing -> + returnDs (tyThingDataCon thing) +\end{code} + +\begin{code} +dsLookupMetaEnv :: Name -> DsM (Maybe DsMetaVal) +dsLookupMetaEnv name = do { env <- getLclEnv; return (lookupNameEnv (ds_meta env) name) } + +dsExtendMetaEnv :: DsMetaEnv -> DsM a -> DsM a +dsExtendMetaEnv menv thing_inside + = updLclEnv (\env -> env { ds_meta = ds_meta env `plusNameEnv` menv }) thing_inside +\end{code} + + diff --git a/compiler/deSugar/DsUtils.lhs b/compiler/deSugar/DsUtils.lhs new file mode 100644 index 0000000000..29e7773bb8 --- /dev/null +++ b/compiler/deSugar/DsUtils.lhs @@ -0,0 +1,884 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[DsUtils]{Utilities for desugaring} + +This module exports some utility functions of no great interest. 
+ +\begin{code} +module DsUtils ( + EquationInfo(..), + firstPat, shiftEqns, + + mkDsLet, mkDsLets, + + MatchResult(..), CanItFail(..), + cantFailMatchResult, alwaysFailMatchResult, + extractMatchResult, combineMatchResults, + adjustMatchResult, adjustMatchResultDs, + mkCoLetMatchResult, mkGuardedMatchResult, + matchCanFail, + mkCoPrimCaseMatchResult, mkCoAlgCaseMatchResult, + wrapBind, wrapBinds, + + mkErrorAppDs, mkNilExpr, mkConsExpr, mkListExpr, + mkIntExpr, mkCharExpr, + mkStringExpr, mkStringExprFS, mkIntegerExpr, + + mkSelectorBinds, mkTupleExpr, mkTupleSelector, + mkTupleType, mkTupleCase, mkBigCoreTup, + mkCoreTup, mkCoreTupTy, seqVar, + + dsSyntaxTable, lookupEvidence, + + selectSimpleMatchVarL, selectMatchVars, selectMatchVar + ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} Match ( matchSimply ) +import {-# SOURCE #-} DsExpr( dsExpr ) + +import HsSyn +import TcHsSyn ( hsPatType ) +import CoreSyn +import Constants ( mAX_TUPLE_SIZE ) +import DsMonad + +import CoreUtils ( exprType, mkIfThenElse, mkCoerce, bindNonRec ) +import MkId ( iRREFUT_PAT_ERROR_ID, mkReboxingAlt, mkNewTypeBody ) +import Id ( idType, Id, mkWildId, mkTemplateLocals, mkSysLocal ) +import Var ( Var ) +import Name ( Name ) +import Literal ( Literal(..), mkStringLit, inIntRange, tARGET_MAX_INT ) +import TyCon ( isNewTyCon, tyConDataCons ) +import DataCon ( DataCon, dataConSourceArity, dataConTyCon, dataConTag ) +import Type ( mkFunTy, isUnLiftedType, Type, splitTyConApp, mkTyVarTy ) +import TcType ( tcEqType ) +import TysPrim ( intPrimTy ) +import TysWiredIn ( nilDataCon, consDataCon, + tupleCon, mkTupleTy, + unitDataConId, unitTy, + charTy, charDataCon, + intTy, intDataCon, + isPArrFakeCon ) +import BasicTypes ( Boxity(..) ) +import UniqSet ( mkUniqSet, minusUniqSet, isEmptyUniqSet ) +import UniqSupply ( splitUniqSupply, uniqFromSupply, uniqsFromSupply ) +import PrelNames ( unpackCStringName, unpackCStringUtf8Name, + plusIntegerName, timesIntegerName, smallIntegerDataConName, + lengthPName, indexPName ) +import Outputable +import SrcLoc ( Located(..), unLoc ) +import Util ( isSingleton, zipEqual, sortWith ) +import ListSetOps ( assocDefault ) +import FastString +import Data.Char ( ord ) + +#ifdef DEBUG +import Util ( notNull ) -- Used in an assertion +#endif +\end{code} + + + +%************************************************************************ +%* * + Rebindable syntax +%* * +%************************************************************************ + +\begin{code} +dsSyntaxTable :: SyntaxTable Id + -> DsM ([CoreBind], -- Auxiliary bindings + [(Name,Id)]) -- Maps the standard name to its value + +dsSyntaxTable rebound_ids + = mapAndUnzipDs mk_bind rebound_ids `thenDs` \ (binds_s, prs) -> + return (concat binds_s, prs) + where + -- The cheapo special case can happen when we + -- make an intermediate HsDo when desugaring a RecStmt + mk_bind (std_name, HsVar id) = return ([], (std_name, id)) + mk_bind (std_name, expr) + = dsExpr expr `thenDs` \ rhs -> + newSysLocalDs (exprType rhs) `thenDs` \ id -> + return ([NonRec id rhs], (std_name, id)) + +lookupEvidence :: [(Name, Id)] -> Name -> Id +lookupEvidence prs std_name + = assocDefault (mk_panic std_name) prs std_name + where + mk_panic std_name = pprPanic "dsSyntaxTable" (ptext SLIT("Not found:") <+> ppr std_name) +\end{code} + + +%************************************************************************ +%* * +\subsection{Building lets} +%* * +%************************************************************************ + +Use case, not let for unlifted 
types. The simplifier will turn some +back again. + +\begin{code} +mkDsLet :: CoreBind -> CoreExpr -> CoreExpr +mkDsLet (NonRec bndr rhs) body + | isUnLiftedType (idType bndr) + = Case rhs bndr (exprType body) [(DEFAULT,[],body)] +mkDsLet bind body + = Let bind body + +mkDsLets :: [CoreBind] -> CoreExpr -> CoreExpr +mkDsLets binds body = foldr mkDsLet body binds +\end{code} + + +%************************************************************************ +%* * +\subsection{ Selecting match variables} +%* * +%************************************************************************ + +We're about to match against some patterns. We want to make some +@Ids@ to use as match variables. If a pattern has an @Id@ readily at +hand, which should indeed be bound to the pattern as a whole, then use it; +otherwise, make one up. + +\begin{code} +selectSimpleMatchVarL :: LPat Id -> DsM Id +selectSimpleMatchVarL pat = selectMatchVar (unLoc pat) (hsPatType pat) + +-- (selectMatchVars ps tys) chooses variables of type tys +-- to use for matching ps against. If the pattern is a variable, +-- we try to use that, to save inventing lots of fresh variables. +-- But even if it is a variable, its type might not match. Consider +-- data T a where +-- T1 :: Int -> T Int +-- T2 :: a -> T a +-- +-- f :: T a -> a -> Int +-- f (T1 i) (x::Int) = x +-- f (T2 i) (y::a) = 0 +-- Then we must not choose (x::Int) as the matching variable! + +selectMatchVars :: [Pat Id] -> [Type] -> DsM [Id] +selectMatchVars [] [] = return [] +selectMatchVars (p:ps) (ty:tys) = do { v <- selectMatchVar p ty + ; vs <- selectMatchVars ps tys + ; return (v:vs) } + +selectMatchVar (BangPat pat) pat_ty = selectMatchVar (unLoc pat) pat_ty +selectMatchVar (LazyPat pat) pat_ty = selectMatchVar (unLoc pat) pat_ty +selectMatchVar (VarPat var) pat_ty = try_for var pat_ty +selectMatchVar (AsPat var pat) pat_ty = try_for (unLoc var) pat_ty +selectMatchVar other_pat pat_ty = newSysLocalDs pat_ty -- OK, better make up one... + +try_for var pat_ty + | idType var `tcEqType` pat_ty = returnDs var + | otherwise = newSysLocalDs pat_ty +\end{code} + + +%************************************************************************ +%* * +%* type synonym EquationInfo and access functions for its pieces * +%* * +%************************************************************************ +\subsection[EquationInfo-synonym]{@EquationInfo@: a useful synonym} + +The ``equation info'' used by @match@ is relatively complicated and +worthy of a type synonym and a few handy functions. 
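For orientation, here is a deliberately simplified, self-contained model of
what @match@ sees (an editor's sketch in plain Haskell, not the real types):
a matrix of patterns with one row per equation, consumed a column at a time,
which is exactly the shape that @firstPat@ and @shiftEqns@ below manipulate.

\begin{code}
-- Editor's sketch: a toy pattern matrix, one row per equation.
data ToyPat = ToyVar String | ToyCon String [ToyPat]
data ToyEqn = ToyEqn { toyPats :: [ToyPat], toyRhs :: String }

-- Peel off the first column, as match does with firstPat/shiftEqns.
toyShift :: [ToyEqn] -> ([ToyPat], [ToyEqn])
toyShift eqns = ( map (head . toyPats) eqns
                , [ e { toyPats = tail (toyPats e) } | e <- eqns ] )
\end{code}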
+ +\begin{code} +firstPat :: EquationInfo -> Pat Id +firstPat eqn = head (eqn_pats eqn) + +shiftEqns :: [EquationInfo] -> [EquationInfo] +-- Drop the first pattern in each equation +shiftEqns eqns = [ eqn { eqn_pats = tail (eqn_pats eqn) } | eqn <- eqns ] +\end{code} + +Functions on MatchResults + +\begin{code} +matchCanFail :: MatchResult -> Bool +matchCanFail (MatchResult CanFail _) = True +matchCanFail (MatchResult CantFail _) = False + +alwaysFailMatchResult :: MatchResult +alwaysFailMatchResult = MatchResult CanFail (\fail -> returnDs fail) + +cantFailMatchResult :: CoreExpr -> MatchResult +cantFailMatchResult expr = MatchResult CantFail (\ ignore -> returnDs expr) + +extractMatchResult :: MatchResult -> CoreExpr -> DsM CoreExpr +extractMatchResult (MatchResult CantFail match_fn) fail_expr + = match_fn (error "It can't fail!") + +extractMatchResult (MatchResult CanFail match_fn) fail_expr + = mkFailurePair fail_expr `thenDs` \ (fail_bind, if_it_fails) -> + match_fn if_it_fails `thenDs` \ body -> + returnDs (mkDsLet fail_bind body) + + +combineMatchResults :: MatchResult -> MatchResult -> MatchResult +combineMatchResults (MatchResult CanFail body_fn1) + (MatchResult can_it_fail2 body_fn2) + = MatchResult can_it_fail2 body_fn + where + body_fn fail = body_fn2 fail `thenDs` \ body2 -> + mkFailurePair body2 `thenDs` \ (fail_bind, duplicatable_expr) -> + body_fn1 duplicatable_expr `thenDs` \ body1 -> + returnDs (Let fail_bind body1) + +combineMatchResults match_result1@(MatchResult CantFail body_fn1) match_result2 + = match_result1 + +adjustMatchResult :: (CoreExpr -> CoreExpr) -> MatchResult -> MatchResult +adjustMatchResult encl_fn (MatchResult can_it_fail body_fn) + = MatchResult can_it_fail (\fail -> body_fn fail `thenDs` \ body -> + returnDs (encl_fn body)) + +adjustMatchResultDs :: (CoreExpr -> DsM CoreExpr) -> MatchResult -> MatchResult +adjustMatchResultDs encl_fn (MatchResult can_it_fail body_fn) + = MatchResult can_it_fail (\fail -> body_fn fail `thenDs` \ body -> + encl_fn body) + +wrapBinds :: [(Var,Var)] -> CoreExpr -> CoreExpr +wrapBinds [] e = e +wrapBinds ((new,old):prs) e = wrapBind new old (wrapBinds prs e) + +wrapBind :: Var -> Var -> CoreExpr -> CoreExpr +wrapBind new old body + | new==old = body + | isTyVar new = App (Lam new body) (Type (mkTyVarTy old)) + | otherwise = Let (NonRec new (Var old)) body + +seqVar :: Var -> CoreExpr -> CoreExpr +seqVar var body = Case (Var var) var (exprType body) + [(DEFAULT, [], body)] + +mkCoLetMatchResult :: CoreBind -> MatchResult -> MatchResult +mkCoLetMatchResult bind match_result + = adjustMatchResult (mkDsLet bind) match_result + +mkGuardedMatchResult :: CoreExpr -> MatchResult -> MatchResult +mkGuardedMatchResult pred_expr (MatchResult can_it_fail body_fn) + = MatchResult CanFail (\fail -> body_fn fail `thenDs` \ body -> + returnDs (mkIfThenElse pred_expr body fail)) + +mkCoPrimCaseMatchResult :: Id -- Scrutinee + -> Type -- Type of the case + -> [(Literal, MatchResult)] -- Alternatives + -> MatchResult +mkCoPrimCaseMatchResult var ty match_alts + = MatchResult CanFail mk_case + where + mk_case fail + = mappM (mk_alt fail) sorted_alts `thenDs` \ alts -> + returnDs (Case (Var var) var ty ((DEFAULT, [], fail) : alts)) + + sorted_alts = sortWith fst match_alts -- Right order for a Case + mk_alt fail (lit, MatchResult _ body_fn) = body_fn fail `thenDs` \ body -> + returnDs (LitAlt lit, [], body) + + +mkCoAlgCaseMatchResult :: Id -- Scrutinee + -> Type -- Type of exp + -> [(DataCon, [CoreBndr], MatchResult)] -- Alternatives + -> 
MatchResult +mkCoAlgCaseMatchResult var ty match_alts + | isNewTyCon tycon -- Newtype case; use a let + = ASSERT( null (tail match_alts) && null (tail arg_ids1) ) + mkCoLetMatchResult (NonRec arg_id1 newtype_rhs) match_result1 + + | isPArrFakeAlts match_alts -- Sugared parallel array; use a literal case + = MatchResult CanFail mk_parrCase + + | otherwise -- Datatype case; use a case + = MatchResult fail_flag mk_case + where + tycon = dataConTyCon con1 + -- [Interesting: becuase of GADTs, we can't rely on the type of + -- the scrutinised Id to be sufficiently refined to have a TyCon in it] + + -- Stuff for newtype + (con1, arg_ids1, match_result1) = head match_alts + arg_id1 = head arg_ids1 + newtype_rhs = mkNewTypeBody tycon (idType arg_id1) (Var var) + + -- Stuff for data types + data_cons = tyConDataCons tycon + match_results = [match_result | (_,_,match_result) <- match_alts] + + fail_flag | exhaustive_case + = foldr1 orFail [can_it_fail | MatchResult can_it_fail _ <- match_results] + | otherwise + = CanFail + + wild_var = mkWildId (idType var) + sorted_alts = sortWith get_tag match_alts + get_tag (con, _, _) = dataConTag con + mk_case fail = mappM (mk_alt fail) sorted_alts `thenDs` \ alts -> + returnDs (Case (Var var) wild_var ty (mk_default fail ++ alts)) + + mk_alt fail (con, args, MatchResult _ body_fn) + = body_fn fail `thenDs` \ body -> + newUniqueSupply `thenDs` \ us -> + returnDs (mkReboxingAlt (uniqsFromSupply us) con args body) + + mk_default fail | exhaustive_case = [] + | otherwise = [(DEFAULT, [], fail)] + + un_mentioned_constructors + = mkUniqSet data_cons `minusUniqSet` mkUniqSet [ con | (con, _, _) <- match_alts] + exhaustive_case = isEmptyUniqSet un_mentioned_constructors + + -- Stuff for parallel arrays + -- + -- * the following is to desugar cases over fake constructors for + -- parallel arrays, which are introduced by `tidy1' in the `PArrPat' + -- case + -- + -- Concerning `isPArrFakeAlts': + -- + -- * it is *not* sufficient to just check the type of the type + -- constructor, as we have to be careful not to confuse the real + -- representation of parallel arrays with the fake constructors; + -- moreover, a list of alternatives must not mix fake and real + -- constructors (this is checked earlier on) + -- + -- FIXME: We actually go through the whole list and make sure that + -- either all or none of the constructors are fake parallel + -- array constructors. This is to spot equations that mix fake + -- constructors with the real representation defined in + -- `PrelPArr'. It would be nicer to spot this situation + -- earlier and raise a proper error message, but it can really + -- only happen in `PrelPArr' anyway. + -- + isPArrFakeAlts [(dcon, _, _)] = isPArrFakeCon dcon + isPArrFakeAlts ((dcon, _, _):alts) = + case (isPArrFakeCon dcon, isPArrFakeAlts alts) of + (True , True ) -> True + (False, False) -> False + _ -> + panic "DsUtils: You may not mix `[:...:]' with `PArr' patterns" + -- + mk_parrCase fail = + dsLookupGlobalId lengthPName `thenDs` \lengthP -> + unboxAlt `thenDs` \alt -> + returnDs (Case (len lengthP) (mkWildId intTy) ty [alt]) + where + elemTy = case splitTyConApp (idType var) of + (_, [elemTy]) -> elemTy + _ -> panic panicMsg + panicMsg = "DsUtils.mkCoAlgCaseMatchResult: not a parallel array?" 
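        -- Editor's note (illustration, not original source): for a pattern
        -- such as [:x, y:] the code below builds, roughly,
        --     case lengthP arr of
        --       I# l -> case l of
        --                 2# -> let x = indexP arr 1; y = indexP arr 2 in rhs
        --                 _  -> fail
        -- i.e. the "constructor" is really the array length, and the pattern
        -- variables are recovered with indexP, starting at index 1.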
+ len lengthP = mkApps (Var lengthP) [Type elemTy, Var var] + -- + unboxAlt = + newSysLocalDs intPrimTy `thenDs` \l -> + dsLookupGlobalId indexPName `thenDs` \indexP -> + mappM (mkAlt indexP) sorted_alts `thenDs` \alts -> + returnDs (DataAlt intDataCon, [l], (Case (Var l) wild ty (dft : alts))) + where + wild = mkWildId intPrimTy + dft = (DEFAULT, [], fail) + -- + -- each alternative matches one array length (corresponding to one + -- fake array constructor), so the match is on a literal; each + -- alternative's body is extended by a local binding for each + -- constructor argument, which are bound to array elements starting + -- with the first + -- + mkAlt indexP (con, args, MatchResult _ bodyFun) = + bodyFun fail `thenDs` \body -> + returnDs (LitAlt lit, [], mkDsLets binds body) + where + lit = MachInt $ toInteger (dataConSourceArity con) + binds = [NonRec arg (indexExpr i) | (i, arg) <- zip [1..] args] + -- + indexExpr i = mkApps (Var indexP) [Type elemTy, Var var, mkIntExpr i] +\end{code} + + +%************************************************************************ +%* * +\subsection{Desugarer's versions of some Core functions} +%* * +%************************************************************************ + +\begin{code} +mkErrorAppDs :: Id -- The error function + -> Type -- Type to which it should be applied + -> String -- The error message string to pass + -> DsM CoreExpr + +mkErrorAppDs err_id ty msg + = getSrcSpanDs `thenDs` \ src_loc -> + let + full_msg = showSDoc (hcat [ppr src_loc, text "|", text msg]) + core_msg = Lit (mkStringLit full_msg) + -- mkStringLit returns a result of type String# + in + returnDs (mkApps (Var err_id) [Type ty, core_msg]) +\end{code} + + +************************************************************* +%* * +\subsection{Making literals} +%* * +%************************************************************************ + +\begin{code} +mkCharExpr :: Char -> CoreExpr -- Returns C# c :: Int +mkIntExpr :: Integer -> CoreExpr -- Returns I# i :: Int +mkIntegerExpr :: Integer -> DsM CoreExpr -- Result :: Integer +mkStringExpr :: String -> DsM CoreExpr -- Result :: String +mkStringExprFS :: FastString -> DsM CoreExpr -- Result :: String + +mkIntExpr i = mkConApp intDataCon [mkIntLit i] +mkCharExpr c = mkConApp charDataCon [mkLit (MachChar c)] + +mkIntegerExpr i + | inIntRange i -- Small enough, so start from an Int + = dsLookupDataCon smallIntegerDataConName `thenDs` \ integer_dc -> + returnDs (mkSmallIntegerLit integer_dc i) + +-- Special case for integral literals with a large magnitude: +-- They are transformed into an expression involving only smaller +-- integral literals. This improves constant folding. + + | otherwise -- Big, so start from a string + = dsLookupGlobalId plusIntegerName `thenDs` \ plus_id -> + dsLookupGlobalId timesIntegerName `thenDs` \ times_id -> + dsLookupDataCon smallIntegerDataConName `thenDs` \ integer_dc -> + let + lit i = mkSmallIntegerLit integer_dc i + plus a b = Var plus_id `App` a `App` b + times a b = Var times_id `App` a `App` b + + -- Transform i into (x1 + (x2 + (x3 + (...) 
* b) * b) * b) with abs xi <= b + horner :: Integer -> Integer -> CoreExpr + horner b i | abs q <= 1 = if r == 0 || r == i + then lit i + else lit r `plus` lit (i-r) + | r == 0 = horner b q `times` lit b + | otherwise = lit r `plus` (horner b q `times` lit b) + where + (q,r) = i `quotRem` b + + in + returnDs (horner tARGET_MAX_INT i) + +mkSmallIntegerLit small_integer_data_con i = mkConApp small_integer_data_con [mkIntLit i] + +mkStringExpr str = mkStringExprFS (mkFastString str) + +mkStringExprFS str + | nullFS str + = returnDs (mkNilExpr charTy) + + | lengthFS str == 1 + = let + the_char = mkCharExpr (headFS str) + in + returnDs (mkConsExpr charTy the_char (mkNilExpr charTy)) + + | all safeChar chars + = dsLookupGlobalId unpackCStringName `thenDs` \ unpack_id -> + returnDs (App (Var unpack_id) (Lit (MachStr str))) + + | otherwise + = dsLookupGlobalId unpackCStringUtf8Name `thenDs` \ unpack_id -> + returnDs (App (Var unpack_id) (Lit (MachStr str))) + + where + chars = unpackFS str + safeChar c = ord c >= 1 && ord c <= 0x7F +\end{code} + + +%************************************************************************ +%* * +\subsection[mkSelectorBind]{Make a selector bind} +%* * +%************************************************************************ + +This is used in various places to do with lazy patterns. +For each binder $b$ in the pattern, we create a binding: +\begin{verbatim} + b = case v of pat' -> b' +\end{verbatim} +where @pat'@ is @pat@ with each binder @b@ cloned into @b'@. + +ToDo: making these bindings should really depend on whether there's +much work to be done per binding. If the pattern is complex, it +should be de-mangled once, into a tuple (and then selected from). +Otherwise the demangling can be in-line in the bindings (as here). + +Boring! Boring! One error message per binder. The above ToDo is +even more helpful. Something very similar happens for pattern-bound +expressions. + +\begin{code} +mkSelectorBinds :: LPat Id -- The pattern + -> CoreExpr -- Expression to which the pattern is bound + -> DsM [(Id,CoreExpr)] + +mkSelectorBinds (L _ (VarPat v)) val_expr + = returnDs [(v, val_expr)] + +mkSelectorBinds pat val_expr + | isSingleton binders || is_simple_lpat pat + = -- Given p = e, where p binds x,y + -- we are going to make + -- v = p (where v is fresh) + -- x = case v of p -> x + -- y = case v of p -> x + + -- Make up 'v' + -- NB: give it the type of *pattern* p, not the type of the *rhs* e. + -- This does not matter after desugaring, but there's a subtle + -- issue with implicit parameters. Consider + -- (x,y) = ?i + -- Then, ?i is given type {?i :: Int}, a PredType, which is opaque + -- to the desugarer. (Why opaque? Because newtypes have to be. Why + -- does it get that type? So that when we abstract over it we get the + -- right top-level type (?i::Int) => ...) + -- + -- So to get the type of 'v', use the pattern not the rhs. Often more + -- efficient too. + newSysLocalDs (hsPatType pat) `thenDs` \ val_var -> + + -- For the error message we make one error-app, to avoid duplication. + -- But we need it at different types... 
so we use coerce for that + mkErrorAppDs iRREFUT_PAT_ERROR_ID + unitTy (showSDoc (ppr pat)) `thenDs` \ err_expr -> + newSysLocalDs unitTy `thenDs` \ err_var -> + mappM (mk_bind val_var err_var) binders `thenDs` \ binds -> + returnDs ( (val_var, val_expr) : + (err_var, err_expr) : + binds ) + + + | otherwise + = mkErrorAppDs iRREFUT_PAT_ERROR_ID + tuple_ty (showSDoc (ppr pat)) `thenDs` \ error_expr -> + matchSimply val_expr PatBindRhs pat local_tuple error_expr `thenDs` \ tuple_expr -> + newSysLocalDs tuple_ty `thenDs` \ tuple_var -> + let + mk_tup_bind binder + = (binder, mkTupleSelector binders binder tuple_var (Var tuple_var)) + in + returnDs ( (tuple_var, tuple_expr) : map mk_tup_bind binders ) + where + binders = collectPatBinders pat + local_tuple = mkTupleExpr binders + tuple_ty = exprType local_tuple + + mk_bind scrut_var err_var bndr_var + -- (mk_bind sv err_var) generates + -- bv = case sv of { pat -> bv; other -> coerce (type-of-bv) err_var } + -- Remember, pat binds bv + = matchSimply (Var scrut_var) PatBindRhs pat + (Var bndr_var) error_expr `thenDs` \ rhs_expr -> + returnDs (bndr_var, rhs_expr) + where + error_expr = mkCoerce (idType bndr_var) (Var err_var) + + is_simple_lpat p = is_simple_pat (unLoc p) + + is_simple_pat (TuplePat ps Boxed _) = all is_triv_lpat ps + is_simple_pat (ConPatOut _ _ _ _ ps _) = all is_triv_lpat (hsConArgs ps) + is_simple_pat (VarPat _) = True + is_simple_pat (ParPat p) = is_simple_lpat p + is_simple_pat other = False + + is_triv_lpat p = is_triv_pat (unLoc p) + + is_triv_pat (VarPat v) = True + is_triv_pat (WildPat _) = True + is_triv_pat (ParPat p) = is_triv_lpat p + is_triv_pat other = False +\end{code} + + +%************************************************************************ +%* * + Tuples +%* * +%************************************************************************ + +@mkTupleExpr@ builds a tuple; the inverse to @mkTupleSelector@. + +* If it has only one element, it is the identity function. + +* If there are more elements than a big tuple can have, it nests + the tuples. + +Nesting policy. Better a 2-tuple of 10-tuples (3 objects) than +a 10-tuple of 2-tuples (11 objects). So we want the leaves to be big. + +\begin{code} +mkTupleExpr :: [Id] -> CoreExpr +mkTupleExpr ids = mkBigCoreTup (map Var ids) + +-- corresponding type +mkTupleType :: [Id] -> Type +mkTupleType ids = mkBigTuple mkCoreTupTy (map idType ids) + +mkBigCoreTup :: [CoreExpr] -> CoreExpr +mkBigCoreTup = mkBigTuple mkCoreTup + +mkBigTuple :: ([a] -> a) -> [a] -> a +mkBigTuple small_tuple as = mk_big_tuple (chunkify as) + where + -- Each sub-list is short enough to fit in a tuple + mk_big_tuple [as] = small_tuple as + mk_big_tuple as_s = mk_big_tuple (chunkify (map small_tuple as_s)) + +chunkify :: [a] -> [[a]] +-- The sub-lists of the result all have length <= mAX_TUPLE_SIZE +-- But there may be more than mAX_TUPLE_SIZE sub-lists +chunkify xs + | n_xs <= mAX_TUPLE_SIZE = {- pprTrace "Small" (ppr n_xs) -} [xs] + | otherwise = {- pprTrace "Big" (ppr n_xs) -} (split xs) + where + n_xs = length xs + split [] = [] + split xs = take mAX_TUPLE_SIZE xs : split (drop mAX_TUPLE_SIZE xs) +\end{code} + + +@mkTupleSelector@ builds a selector which scrutises the given +expression and extracts the one name from the list given. +If you want the no-shadowing rule to apply, the caller +is responsible for making sure that none of these names +are in scope. + +If there is just one id in the ``tuple'', then the selector is +just the identity. 
+ +If it's big, it does nesting + mkTupleSelector [a,b,c,d] b v e + = case e of v { + (p,q) -> case p of p { + (a,b) -> b }} +We use 'tpl' vars for the p,q, since shadowing does not matter. + +In fact, it's more convenient to generate it innermost first, getting + + case (case e of v + (p,q) -> p) of p + (a,b) -> b + +\begin{code} +mkTupleSelector :: [Id] -- The tuple args + -> Id -- The selected one + -> Id -- A variable of the same type as the scrutinee + -> CoreExpr -- Scrutinee + -> CoreExpr + +mkTupleSelector vars the_var scrut_var scrut + = mk_tup_sel (chunkify vars) the_var + where + mk_tup_sel [vars] the_var = mkCoreSel vars the_var scrut_var scrut + mk_tup_sel vars_s the_var = mkCoreSel group the_var tpl_v $ + mk_tup_sel (chunkify tpl_vs) tpl_v + where + tpl_tys = [mkCoreTupTy (map idType gp) | gp <- vars_s] + tpl_vs = mkTemplateLocals tpl_tys + [(tpl_v, group)] = [(tpl,gp) | (tpl,gp) <- zipEqual "mkTupleSelector" tpl_vs vars_s, + the_var `elem` gp ] +\end{code} + +A generalization of @mkTupleSelector@, allowing the body +of the case to be an arbitrary expression. + +If the tuple is big, it is nested: + + mkTupleCase uniqs [a,b,c,d] body v e + = case e of v { (p,q) -> + case p of p { (a,b) -> + case q of q { (c,d) -> + body }}} + +To avoid shadowing, we use uniqs to invent new variables p,q. + +ToDo: eliminate cases where none of the variables are needed. + +\begin{code} +mkTupleCase + :: UniqSupply -- for inventing names of intermediate variables + -> [Id] -- the tuple args + -> CoreExpr -- body of the case + -> Id -- a variable of the same type as the scrutinee + -> CoreExpr -- scrutinee + -> CoreExpr + +mkTupleCase uniqs vars body scrut_var scrut + = mk_tuple_case uniqs (chunkify vars) body + where + mk_tuple_case us [vars] body + = mkSmallTupleCase vars body scrut_var scrut + mk_tuple_case us vars_s body + = let + (us', vars', body') = foldr one_tuple_case (us, [], body) vars_s + in + mk_tuple_case us' (chunkify vars') body' + one_tuple_case chunk_vars (us, vs, body) + = let + (us1, us2) = splitUniqSupply us + scrut_var = mkSysLocal FSLIT("ds") (uniqFromSupply us1) + (mkCoreTupTy (map idType chunk_vars)) + body' = mkSmallTupleCase chunk_vars body scrut_var (Var scrut_var) + in (us2, scrut_var:vs, body') +\end{code} + +The same, but with a tuple small enough not to need nesting. + +\begin{code} +mkSmallTupleCase + :: [Id] -- the tuple args + -> CoreExpr -- body of the case + -> Id -- a variable of the same type as the scrutinee + -> CoreExpr -- scrutinee + -> CoreExpr + +mkSmallTupleCase [var] body _scrut_var scrut + = bindNonRec var scrut body +mkSmallTupleCase vars body scrut_var scrut +-- One branch no refinement? + = Case scrut scrut_var (exprType body) [(DataAlt (tupleCon Boxed (length vars)), vars, body)] +\end{code} + +%************************************************************************ +%* * +\subsection[mkFailurePair]{Code for pattern-matching and other failures} +%* * +%************************************************************************ + +Call the constructor Ids when building explicit lists, so that they +interact well with rules. 
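As a small, editor-supplied illustration of that point, using the helpers
defined just below: an explicit two-element list at element type @ty@ is
built directly from the real @:@ and @[]@ constructor Ids, which is what
lets the usual list rules see and rewrite desugared code.

\begin{code}
-- Editor's sketch: what mkListExpr ty [Var x, Var y] (defined below) unfolds
-- to, written out by hand with mkConsExpr and mkNilExpr.
exampleList :: Type -> Id -> Id -> CoreExpr
exampleList ty x y
  = mkConsExpr ty (Var x) (mkConsExpr ty (Var y) (mkNilExpr ty))
\end{code}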
+ +\begin{code} +mkNilExpr :: Type -> CoreExpr +mkNilExpr ty = mkConApp nilDataCon [Type ty] + +mkConsExpr :: Type -> CoreExpr -> CoreExpr -> CoreExpr +mkConsExpr ty hd tl = mkConApp consDataCon [Type ty, hd, tl] + +mkListExpr :: Type -> [CoreExpr] -> CoreExpr +mkListExpr ty xs = foldr (mkConsExpr ty) (mkNilExpr ty) xs + + +-- The next three functions make tuple types, constructors and selectors, +-- with the rule that a 1-tuple is represented by the thing itselg +mkCoreTupTy :: [Type] -> Type +mkCoreTupTy [ty] = ty +mkCoreTupTy tys = mkTupleTy Boxed (length tys) tys + +mkCoreTup :: [CoreExpr] -> CoreExpr +-- Builds exactly the specified tuple. +-- No fancy business for big tuples +mkCoreTup [] = Var unitDataConId +mkCoreTup [c] = c +mkCoreTup cs = mkConApp (tupleCon Boxed (length cs)) + (map (Type . exprType) cs ++ cs) + +mkCoreSel :: [Id] -- The tuple args + -> Id -- The selected one + -> Id -- A variable of the same type as the scrutinee + -> CoreExpr -- Scrutinee + -> CoreExpr +-- mkCoreSel [x,y,z] x v e +-- ===> case e of v { (x,y,z) -> x +mkCoreSel [var] should_be_the_same_var scrut_var scrut + = ASSERT(var == should_be_the_same_var) + scrut + +mkCoreSel vars the_var scrut_var scrut + = ASSERT( notNull vars ) + Case scrut scrut_var (idType the_var) + [(DataAlt (tupleCon Boxed (length vars)), vars, Var the_var)] +\end{code} + + +%************************************************************************ +%* * +\subsection[mkFailurePair]{Code for pattern-matching and other failures} +%* * +%************************************************************************ + +Generally, we handle pattern matching failure like this: let-bind a +fail-variable, and use that variable if the thing fails: +\begin{verbatim} + let fail.33 = error "Help" + in + case x of + p1 -> ... + p2 -> fail.33 + p3 -> fail.33 + p4 -> ... +\end{verbatim} +Then +\begin{itemize} +\item +If the case can't fail, then there'll be no mention of @fail.33@, and the +simplifier will later discard it. + +\item +If it can fail in only one way, then the simplifier will inline it. + +\item +Only if it is used more than once will the let-binding remain. +\end{itemize} + +There's a problem when the result of the case expression is of +unboxed type. Then the type of @fail.33@ is unboxed too, and +there is every chance that someone will change the let into a case: +\begin{verbatim} + case error "Help" of + fail.33 -> case .... +\end{verbatim} + +which is of course utterly wrong. Rather than drop the condition that +only boxed types can be let-bound, we just turn the fail into a function +for the primitive case: +\begin{verbatim} + let fail.33 :: Void -> Int# + fail.33 = \_ -> error "Help" + in + case x of + p1 -> ... + p2 -> fail.33 void + p3 -> fail.33 void + p4 -> ... +\end{verbatim} + +Now @fail.33@ is a function, so it can be let-bound. 
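To tie this back to @MatchResult@: callers normally obtain the overall
failure expression with @mkErrorAppDs@ and then let @extractMatchResult@
(defined earlier in this module) introduce the shared fail binding via
@mkFailurePair@ below.  A hedged sketch of that calling pattern, assuming
@pAT_ERROR_ID@ from @PrelInfo@, which this module does not import; compare
@matchEquations@ in @Match@ later in this patch.

\begin{code}
-- Editor's sketch of the usual calling convention, not a definition from
-- the original source.
dischargeMatchResult :: Type -> String -> MatchResult -> DsM CoreExpr
dischargeMatchResult rhs_ty err_string match_result
  = mkErrorAppDs pAT_ERROR_ID rhs_ty err_string  `thenDs` \ fail_expr ->
    extractMatchResult match_result fail_expr
\end{code}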
+ +\begin{code} +mkFailurePair :: CoreExpr -- Result type of the whole case expression + -> DsM (CoreBind, -- Binds the newly-created fail variable + -- to either the expression or \ _ -> expression + CoreExpr) -- Either the fail variable, or fail variable + -- applied to unit tuple +mkFailurePair expr + | isUnLiftedType ty + = newFailLocalDs (unitTy `mkFunTy` ty) `thenDs` \ fail_fun_var -> + newSysLocalDs unitTy `thenDs` \ fail_fun_arg -> + returnDs (NonRec fail_fun_var (Lam fail_fun_arg expr), + App (Var fail_fun_var) (Var unitDataConId)) + + | otherwise + = newFailLocalDs ty `thenDs` \ fail_var -> + returnDs (NonRec fail_var expr, Var fail_var) + where + ty = exprType expr +\end{code} + + diff --git a/compiler/deSugar/Match.hi-boot-5 b/compiler/deSugar/Match.hi-boot-5 new file mode 100644 index 0000000000..42c200fbff --- /dev/null +++ b/compiler/deSugar/Match.hi-boot-5 @@ -0,0 +1,6 @@ +__interface Match 1 0 where +__export Match match matchExport matchSimply matchSinglePat; +1 match :: [Var.Id] -> [DsUtils.EquationInfo] -> DsMonad.DsM DsUtils.MatchResult ; +1 matchExport :: [Var.Id] -> [DsUtils.EquationInfo] -> DsMonad.DsM DsUtils.MatchResult ; +1 matchSimply :: CoreSyn.CoreExpr -> HsExpr.HsMatchContext Name.Name -> HsPat.LPat Var.Id -> CoreSyn.CoreExpr -> CoreSyn.CoreExpr -> DsMonad.DsM CoreSyn.CoreExpr ; +1 matchSinglePat :: CoreSyn.CoreExpr -> DsMonad.DsMatchContext -> HsPat.LPat Var.Id -> DsUtils.MatchResult -> DsMonad.DsM DsUtils.MatchResult ; diff --git a/compiler/deSugar/Match.hi-boot-6 b/compiler/deSugar/Match.hi-boot-6 new file mode 100644 index 0000000000..df806ec644 --- /dev/null +++ b/compiler/deSugar/Match.hi-boot-6 @@ -0,0 +1,27 @@ +module Match where + +match :: [Var.Id] + -> TcType.TcType + -> [DsMonad.EquationInfo] + -> DsMonad.DsM DsMonad.MatchResult + +matchWrapper + :: HsExpr.HsMatchContext Name.Name + -> HsExpr.MatchGroup Var.Id + -> DsMonad.DsM ([Var.Id], CoreSyn.CoreExpr) + +matchSimply + :: CoreSyn.CoreExpr + -> HsExpr.HsMatchContext Name.Name + -> HsPat.LPat Var.Id + -> CoreSyn.CoreExpr + -> CoreSyn.CoreExpr + -> DsMonad.DsM CoreSyn.CoreExpr + +matchSinglePat + :: CoreSyn.CoreExpr + -> HsExpr.HsMatchContext Name.Name + -> HsPat.LPat Var.Id + -> TcType.TcType + -> DsMonad.MatchResult + -> DsMonad.DsM DsMonad.MatchResult diff --git a/compiler/deSugar/Match.lhs b/compiler/deSugar/Match.lhs new file mode 100644 index 0000000000..d72d6adf17 --- /dev/null +++ b/compiler/deSugar/Match.lhs @@ -0,0 +1,740 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[Main_match]{The @match@ function} + +\begin{code} +module Match ( match, matchEquations, matchWrapper, matchSimply, matchSinglePat ) where + +#include "HsVersions.h" + +import DynFlags ( DynFlag(..), dopt ) +import HsSyn +import TcHsSyn ( mkVanillaTuplePat ) +import Check ( check, ExhaustivePat ) +import CoreSyn +import CoreUtils ( bindNonRec, exprType ) +import DsMonad +import DsBinds ( dsLHsBinds ) +import DsGRHSs ( dsGRHSs ) +import DsUtils +import Id ( idName, idType, Id ) +import DataCon ( dataConFieldLabels, dataConInstOrigArgTys, isVanillaDataCon ) +import MatchCon ( matchConFamily ) +import MatchLit ( matchLiterals, matchNPlusKPats, matchNPats, tidyLitPat, tidyNPat ) +import PrelInfo ( pAT_ERROR_ID ) +import TcType ( Type, tcTyConAppArgs ) +import Type ( splitFunTysN, mkTyVarTys ) +import TysWiredIn ( consDataCon, mkListTy, unitTy, + tupleCon, parrFakeCon, mkPArrTy ) +import BasicTypes ( Boxity(..) ) +import ListSetOps ( runs ) +import SrcLoc ( noLoc, unLoc, Located(..) 
) +import Util ( lengthExceeds, notNull ) +import Name ( Name ) +import Outputable +\end{code} + +This function is a wrapper of @match@, it must be called from all the parts where +it was called match, but only substitutes the firs call, .... +if the associated flags are declared, warnings will be issued. +It can not be called matchWrapper because this name already exists :-( + +JJCQ 30-Nov-1997 + +\begin{code} +matchCheck :: DsMatchContext + -> [Id] -- Vars rep'ing the exprs we're matching with + -> Type -- Type of the case expression + -> [EquationInfo] -- Info about patterns, etc. (type synonym below) + -> DsM MatchResult -- Desugared result! + +matchCheck ctx vars ty qs + = getDOptsDs `thenDs` \ dflags -> + matchCheck_really dflags ctx vars ty qs + +matchCheck_really dflags ctx vars ty qs + | incomplete && shadow = + dsShadowWarn ctx eqns_shadow `thenDs` \ () -> + dsIncompleteWarn ctx pats `thenDs` \ () -> + match vars ty qs + | incomplete = + dsIncompleteWarn ctx pats `thenDs` \ () -> + match vars ty qs + | shadow = + dsShadowWarn ctx eqns_shadow `thenDs` \ () -> + match vars ty qs + | otherwise = + match vars ty qs + where (pats, eqns_shadow) = check qs + incomplete = want_incomplete && (notNull pats) + want_incomplete = case ctx of + DsMatchContext RecUpd _ -> + dopt Opt_WarnIncompletePatternsRecUpd dflags + _ -> + dopt Opt_WarnIncompletePatterns dflags + shadow = dopt Opt_WarnOverlappingPatterns dflags + && not (null eqns_shadow) +\end{code} + +This variable shows the maximum number of lines of output generated for warnings. +It will limit the number of patterns/equations displayed to@ maximum_output@. + +(ToDo: add command-line option?) + +\begin{code} +maximum_output = 4 +\end{code} + +The next two functions create the warning message. + +\begin{code} +dsShadowWarn :: DsMatchContext -> [EquationInfo] -> DsM () +dsShadowWarn ctx@(DsMatchContext kind loc) qs + = putSrcSpanDs loc (dsWarn warn) + where + warn | qs `lengthExceeds` maximum_output + = pp_context ctx (ptext SLIT("are overlapped")) + (\ f -> vcat (map (ppr_eqn f kind) (take maximum_output qs)) $$ + ptext SLIT("...")) + | otherwise + = pp_context ctx (ptext SLIT("are overlapped")) + (\ f -> vcat $ map (ppr_eqn f kind) qs) + + +dsIncompleteWarn :: DsMatchContext -> [ExhaustivePat] -> DsM () +dsIncompleteWarn ctx@(DsMatchContext kind loc) pats + = putSrcSpanDs loc (dsWarn warn) + where + warn = pp_context ctx (ptext SLIT("are non-exhaustive")) + (\f -> hang (ptext SLIT("Patterns not matched:")) + 4 ((vcat $ map (ppr_incomplete_pats kind) + (take maximum_output pats)) + $$ dots)) + + dots | pats `lengthExceeds` maximum_output = ptext SLIT("...") + | otherwise = empty + +pp_context (DsMatchContext kind _loc) msg rest_of_msg_fun + = vcat [ptext SLIT("Pattern match(es)") <+> msg, + sep [ptext SLIT("In") <+> ppr_match <> char ':', nest 4 (rest_of_msg_fun pref)]] + where + (ppr_match, pref) + = case kind of + FunRhs fun -> (pprMatchContext kind, \ pp -> ppr fun <+> pp) + other -> (pprMatchContext kind, \ pp -> pp) + +ppr_pats pats = sep (map ppr pats) + +ppr_shadow_pats kind pats + = sep [ppr_pats pats, matchSeparator kind, ptext SLIT("...")] + +ppr_incomplete_pats kind (pats,[]) = ppr_pats pats +ppr_incomplete_pats kind (pats,constraints) = + sep [ppr_pats pats, ptext SLIT("with"), + sep (map ppr_constraint constraints)] + + +ppr_constraint (var,pats) = sep [ppr var, ptext SLIT("`notElem`"), ppr pats] + +ppr_eqn prefixF kind eqn = prefixF (ppr_shadow_pats kind (eqn_pats eqn)) +\end{code} + + +The function @match@ is basically the 
same as in the Wadler chapter, +except it is monadised, to carry around the name supply, info about +annotations, etc. + +Notes on @match@'s arguments, assuming $m$ equations and $n$ patterns: +\begin{enumerate} +\item +A list of $n$ variable names, those variables presumably bound to the +$n$ expressions being matched against the $n$ patterns. Using the +list of $n$ expressions as the first argument showed no benefit and +some inelegance. + +\item +The second argument, a list giving the ``equation info'' for each of +the $m$ equations: +\begin{itemize} +\item +the $n$ patterns for that equation, and +\item +a list of Core bindings [@(Id, CoreExpr)@ pairs] to be ``stuck on +the front'' of the matching code, as in: +\begin{verbatim} +let <binds> +in <matching-code> +\end{verbatim} +\item +and finally: (ToDo: fill in) + +The right way to think about the ``after-match function'' is that it +is an embryonic @CoreExpr@ with a ``hole'' at the end for the +final ``else expression''. +\end{itemize} + +There is a type synonym, @EquationInfo@, defined in module @DsUtils@. + +An experiment with re-ordering this information about equations (in +particular, having the patterns available in column-major order) +showed no benefit. + +\item +A default expression---what to evaluate if the overall pattern-match +fails. This expression will (almost?) always be +a measly expression @Var@, unless we know it will only be used once +(as we do in @glue_success_exprs@). + +Leaving out this third argument to @match@ (and slamming in lots of +@Var "fail"@s) is a positively {\em bad} idea, because it makes it +impossible to share the default expressions. (Also, it stands no +chance of working in our post-upheaval world of @Locals@.) +\end{enumerate} +So, the full type signature: +\begin{code} +match :: [Id] -- Variables rep'ing the exprs we're matching with + -> Type -- Type of the case expression + -> [EquationInfo] -- Info about patterns, etc. (type synonym below) + -> DsM MatchResult -- Desugared result! +\end{code} + +Note: @match@ is often called via @matchWrapper@ (end of this module), +a function that does much of the house-keeping that goes with a call +to @match@. + +It is also worth mentioning the {\em typical} way a block of equations +is desugared with @match@. At each stage, it is the first column of +patterns that is examined. The steps carried out are roughly: +\begin{enumerate} +\item +Tidy the patterns in column~1 with @tidyEqnInfo@ (this may add +bindings to the second component of the equation-info): +\begin{itemize} +\item +Remove the `as' patterns from column~1. +\item +Make all constructor patterns in column~1 into @ConPats@, notably +@ListPats@ and @TuplePats@. +\item +Handle any irrefutable (or ``twiddle'') @LazyPats@. +\end{itemize} +\item +Now {\em unmix} the equations into {\em blocks} [w/ local function +@unmix_eqns@], in which the equations in a block all have variable +patterns in column~1, or they all have constructor patterns in ... +(see ``the mixture rule'' in SLPJ). +\item +Call @matchEqnBlock@ on each block of equations; it will do the +appropriate thing for each kind of column-1 pattern, usually ending up +in a recursive call to @match@. 
+\end{enumerate} + +%************************************************************************ +%* * +%* match: empty rule * +%* * +%************************************************************************ +\subsection[Match-empty-rule]{The ``empty rule''} + +We are a little more paranoid about the ``empty rule'' (SLPJ, p.~87) +than the Wadler-chapter code for @match@ (p.~93, first @match@ clause). +And gluing the ``success expressions'' together isn't quite so pretty. + +\begin{code} +match [] ty eqns_info + = ASSERT( not (null eqns_info) ) + returnDs (foldr1 combineMatchResults match_results) + where + match_results = [ ASSERT( null (eqn_pats eqn) ) + adjustMatchResult (eqn_wrap eqn) (eqn_rhs eqn) + | eqn <- eqns_info ] +\end{code} + + +%************************************************************************ +%* * +%* match: non-empty rule * +%* * +%************************************************************************ +\subsection[Match-nonempty]{@match@ when non-empty: unmixing} + +This (more interesting) clause of @match@ uses @tidy_and_unmix_eqns@ +(a)~to get `as'- and `twiddle'-patterns out of the way (tidying), and +(b)~to do ``the mixture rule'' (SLPJ, p.~88) [which really {\em +un}mixes the equations], producing a list of equation-info +blocks, each block having as its first column of patterns either all +constructors, or all variables (or similar beasts), etc. + +@match_unmixed_eqn_blks@ simply takes the place of the @foldr@ in the +Wadler-chapter @match@ (p.~93, last clause), and @match_unmixed_blk@ +corresponds roughly to @matchVarCon@. + +\begin{code} +match vars@(v:_) ty eqns_info + = do { tidy_eqns <- mappM (tidyEqnInfo v) eqns_info + ; let eqns_blks = runs same_family tidy_eqns + ; match_results <- mappM match_block eqns_blks + ; ASSERT( not (null match_results) ) + return (foldr1 combineMatchResults match_results) } + where + same_family eqn1 eqn2 + = samePatFamily (firstPat eqn1) (firstPat eqn2) + + match_block eqns + = case firstPat (head eqns) of + WildPat {} -> matchVariables vars ty eqns + ConPatOut {} -> matchConFamily vars ty eqns + NPlusKPat {} -> matchNPlusKPats vars ty eqns + NPat {} -> matchNPats vars ty eqns + LitPat {} -> matchLiterals vars ty eqns + +-- After tidying, there are only five kinds of patterns +samePatFamily (WildPat {}) (WildPat {}) = True +samePatFamily (ConPatOut {}) (ConPatOut {}) = True +samePatFamily (NPlusKPat {}) (NPlusKPat {}) = True +samePatFamily (NPat {}) (NPat {}) = True +samePatFamily (LitPat {}) (LitPat {}) = True +samePatFamily _ _ = False + +matchVariables :: [Id] -> Type -> [EquationInfo] -> DsM MatchResult +-- Real true variables, just like in matchVar, SLPJ p 94 +-- No binding to do: they'll all be wildcards by now (done in tidy) +matchVariables (var:vars) ty eqns = match vars ty (shiftEqns eqns) +\end{code} + + +\end{code} + +Tidy up the leftmost pattern in an @EquationInfo@, given the variable @v@ +which will be scrutinised. This means: +\begin{itemize} +\item +Replace variable patterns @x@ (@x /= v@) with the pattern @_@, +together with the binding @x = v@. +\item +Replace the `as' pattern @x@@p@ with the pattern p and a binding @x = v@. +\item +Removing lazy (irrefutable) patterns (you don't want to know...). +\item +Converting explicit tuple-, list-, and parallel-array-pats into ordinary +@ConPats@. +\item +Convert the literal pat "" to []. 
+\end{itemize} + +The result of this tidying is that the column of patterns will include +{\em only}: +\begin{description} +\item[@WildPats@:] +The @VarPat@ information isn't needed any more after this. + +\item[@ConPats@:] +@ListPats@, @TuplePats@, etc., are all converted into @ConPats@. + +\item[@LitPats@ and @NPats@:] +@LitPats@/@NPats@ of ``known friendly types'' (Int, Char, +Float, Double, at least) are converted to unboxed form; e.g., +\tr{(NPat (HsInt i) _ _)} is converted to: +\begin{verbatim} +(ConPat I# _ _ [LitPat (HsIntPrim i)]) +\end{verbatim} +\end{description} + +\begin{code} +tidyEqnInfo :: Id -> EquationInfo -> DsM EquationInfo + -- DsM'd because of internal call to dsLHsBinds + -- and mkSelectorBinds. + -- "tidy1" does the interesting stuff, looking at + -- one pattern and fiddling the list of bindings. + -- + -- POST CONDITION: head pattern in the EqnInfo is + -- WildPat + -- ConPat + -- NPat + -- LitPat + -- NPlusKPat + -- but no other + +tidyEqnInfo v eqn@(EqnInfo { eqn_wrap = wrap, eqn_pats = pat : pats }) + = tidy1 v wrap pat `thenDs` \ (wrap', pat') -> + returnDs (eqn { eqn_wrap = wrap', eqn_pats = pat' : pats }) + +tidy1 :: Id -- The Id being scrutinised + -> DsWrapper -- Previous wrapping bindings + -> Pat Id -- The pattern against which it is to be matched + -> DsM (DsWrapper, -- Extra bindings around what to do afterwards + Pat Id) -- Equivalent pattern + +-- The extra bindings etc are all wrapped around the RHS of the match +-- so they are only available when matching is complete. But that's ok +-- becuase, for example, in the pattern x@(...), the x can only be +-- used in the RHS, not in the nested pattern, nor subsquent patterns +-- +-- However this does have an awkward consequence. The bindings in +-- a VarPatOut get wrapped around the result in right to left order, +-- rather than left to right. This only matters if one set of +-- bindings can mention things used in another, and that can happen +-- if we allow equality dictionary bindings of form d1=d2. +-- bindIInstsOfLocalFuns is now careful not to do this, but it's a wart. +-- (Without this care in bindInstsOfLocalFuns, compiling +-- Data.Generics.Schemes.hs fails in function everywhereBut.) + +------------------------------------------------------- +-- (pat', mr') = tidy1 v pat mr +-- tidies the *outer level only* of pat, giving pat' +-- It eliminates many pattern forms (as-patterns, variable patterns, +-- list patterns, etc) yielding one of: +-- WildPat +-- ConPatOut +-- LitPat +-- NPat +-- NPlusKPat + +tidy1 v wrap (ParPat pat) = tidy1 v wrap (unLoc pat) +tidy1 v wrap (SigPatOut pat _) = tidy1 v wrap (unLoc pat) +tidy1 v wrap (WildPat ty) = returnDs (wrap, WildPat ty) + + -- case v of { x -> mr[] } + -- = case v of { _ -> let x=v in mr[] } +tidy1 v wrap (VarPat var) + = returnDs (wrap . wrapBind var v, WildPat (idType var)) + +tidy1 v wrap (VarPatOut var binds) + = do { prs <- dsLHsBinds binds + ; return (wrap . wrapBind var v . mkDsLet (Rec prs), + WildPat (idType var)) } + + -- case v of { x@p -> mr[] } + -- = case v of { p -> let x=v in mr[] } +tidy1 v wrap (AsPat (L _ var) pat) + = tidy1 v (wrap . wrapBind var v) (unLoc pat) + +tidy1 v wrap (BangPat pat) + = tidy1 v (wrap . seqVar v) (unLoc pat) + +{- now, here we handle lazy patterns: + tidy1 v ~p bs = (v, v1 = case v of p -> v1 : + v2 = case v of p -> v2 : ... : bs ) + + where the v_i's are the binders in the pattern. + + ToDo: in "v_i = ... -> v_i", are the v_i's really the same thing? 
+ + The case expr for v_i is just: match [v] [(p, [], \ x -> Var v_i)] any_expr +-} + +tidy1 v wrap (LazyPat pat) + = do { v' <- newSysLocalDs (idType v) + ; sel_prs <- mkSelectorBinds pat (Var v) + ; let sel_binds = [NonRec b rhs | (b,rhs) <- sel_prs] + ; returnDs (wrap . wrapBind v' v . mkDsLets sel_binds, + WildPat (idType v)) } + +-- re-express <con-something> as (ConPat ...) [directly] + +tidy1 v wrap (ConPatOut (L loc con) ex_tvs dicts binds ps pat_ty) + = returnDs (wrap, ConPatOut (L loc con) ex_tvs dicts binds tidy_ps pat_ty) + where + tidy_ps = PrefixCon (tidy_con con ex_tvs pat_ty ps) + +tidy1 v wrap (ListPat pats ty) + = returnDs (wrap, unLoc list_ConPat) + where + list_ty = mkListTy ty + list_ConPat = foldr (\ x y -> mkPrefixConPat consDataCon [x, y] list_ty) + (mkNilPat list_ty) + pats + +-- Introduce fake parallel array constructors to be able to handle parallel +-- arrays with the existing machinery for constructor pattern +tidy1 v wrap (PArrPat pats ty) + = returnDs (wrap, unLoc parrConPat) + where + arity = length pats + parrConPat = mkPrefixConPat (parrFakeCon arity) pats (mkPArrTy ty) + +tidy1 v wrap (TuplePat pats boxity ty) + = returnDs (wrap, unLoc tuple_ConPat) + where + arity = length pats + tuple_ConPat = mkPrefixConPat (tupleCon boxity arity) pats ty + +tidy1 v wrap (DictPat dicts methods) + = case num_of_d_and_ms of + 0 -> tidy1 v wrap (TuplePat [] Boxed unitTy) + 1 -> tidy1 v wrap (unLoc (head dict_and_method_pats)) + _ -> tidy1 v wrap (mkVanillaTuplePat dict_and_method_pats Boxed) + where + num_of_d_and_ms = length dicts + length methods + dict_and_method_pats = map nlVarPat (dicts ++ methods) + +-- LitPats: we *might* be able to replace these w/ a simpler form +tidy1 v wrap pat@(LitPat lit) + = returnDs (wrap, unLoc (tidyLitPat lit (noLoc pat))) + +-- NPats: we *might* be able to replace these w/ a simpler form +tidy1 v wrap pat@(NPat lit mb_neg _ lit_ty) + = returnDs (wrap, unLoc (tidyNPat lit mb_neg lit_ty (noLoc pat))) + +-- and everything else goes through unchanged... + +tidy1 v wrap non_interesting_pat + = returnDs (wrap, non_interesting_pat) + + +tidy_con data_con ex_tvs pat_ty (PrefixCon ps) = ps +tidy_con data_con ex_tvs pat_ty (InfixCon p1 p2) = [p1,p2] +tidy_con data_con ex_tvs pat_ty (RecCon rpats) + | null rpats + = -- Special case for C {}, which can be used for + -- a constructor that isn't declared to have + -- fields at all + map (noLoc . WildPat) con_arg_tys' + + | otherwise + = map mk_pat tagged_arg_tys + where + -- Boring stuff to find the arg-tys of the constructor + + inst_tys | isVanillaDataCon data_con = tcTyConAppArgs pat_ty -- Newtypes must be opaque + | otherwise = mkTyVarTys ex_tvs + + con_arg_tys' = dataConInstOrigArgTys data_con inst_tys + tagged_arg_tys = con_arg_tys' `zip` dataConFieldLabels data_con + + -- mk_pat picks a WildPat of the appropriate type for absent fields, + -- and the specified pattern for present fields + mk_pat (arg_ty, lbl) = + case [ pat | (sel_id,pat) <- rpats, idName (unLoc sel_id) == lbl] of + (pat:pats) -> ASSERT( null pats ) pat + [] -> noLoc (WildPat arg_ty) +\end{code} + +\noindent +{\bf Previous @matchTwiddled@ stuff:} + +Now we get to the only interesting part; note: there are choices for +translation [from Simon's notes]; translation~1: +\begin{verbatim} +deTwiddle [s,t] e +\end{verbatim} +returns +\begin{verbatim} +[ w = e, + s = case w of [s,t] -> s + t = case w of [s,t] -> t +] +\end{verbatim} + +Here \tr{w} is a fresh variable, and the \tr{w}-binding prevents multiple +evaluation of \tr{e}. 
An alternative translation (No.~2): +\begin{verbatim} +[ w = case e of [s,t] -> (s,t) + s = case w of (s,t) -> s + t = case w of (s,t) -> t +] +\end{verbatim} + +%************************************************************************ +%* * +\subsubsection[improved-unmixing]{UNIMPLEMENTED idea for improved unmixing} +%* * +%************************************************************************ + +We might be able to optimise unmixing when confronted by +only-one-constructor-possible, of which tuples are the most notable +examples. Consider: +\begin{verbatim} +f (a,b,c) ... = ... +f d ... (e:f) = ... +f (g,h,i) ... = ... +f j ... = ... +\end{verbatim} +This definition would normally be unmixed into four equation blocks, +one per equation. But it could be unmixed into just one equation +block, because if the one equation matches (on the first column), +the others certainly will. + +You have to be careful, though; the example +\begin{verbatim} +f j ... = ... +------------------- +f (a,b,c) ... = ... +f d ... (e:f) = ... +f (g,h,i) ... = ... +\end{verbatim} +{\em must} be broken into two blocks at the line shown; otherwise, you +are forcing unnecessary evaluation. In any case, the top-left pattern +always gives the cue. You could then unmix blocks into groups of... +\begin{description} +\item[all variables:] +As it is now. +\item[constructors or variables (mixed):] +Need to make sure the right names get bound for the variable patterns. +\item[literals or variables (mixed):] +Presumably just a variant on the constructor case (as it is now). +\end{description} + +%************************************************************************ +%* * +%* matchWrapper: a convenient way to call @match@ * +%* * +%************************************************************************ +\subsection[matchWrapper]{@matchWrapper@: a convenient interface to @match@} + +Calls to @match@ often involve similar (non-trivial) work; that work +is collected here, in @matchWrapper@. This function takes as +arguments: +\begin{itemize} +\item +Typchecked @Matches@ (of a function definition, or a case or lambda +expression)---the main input; +\item +An error message to be inserted into any (runtime) pattern-matching +failure messages. +\end{itemize} + +As results, @matchWrapper@ produces: +\begin{itemize} +\item +A list of variables (@Locals@) that the caller must ``promise'' to +bind to appropriate values; and +\item +a @CoreExpr@, the desugared output (main result). +\end{itemize} + +The main actions of @matchWrapper@ include: +\begin{enumerate} +\item +Flatten the @[TypecheckedMatch]@ into a suitable list of +@EquationInfo@s. +\item +Create as many new variables as there are patterns in a pattern-list +(in any one of the @EquationInfo@s). +\item +Create a suitable ``if it fails'' expression---a call to @error@ using +the error-string input; the {\em type} of this fail value can be found +by examining one of the RHS expressions in one of the @EquationInfo@s. +\item +Call @match@ with all of this information! +\end{enumerate} + +\begin{code} +matchWrapper :: HsMatchContext Name -- For shadowing warning messages + -> MatchGroup Id -- Matches being desugared + -> DsM ([Id], CoreExpr) -- Results +\end{code} + + There is one small problem with the Lambda Patterns, when somebody + writes something similar to: +\begin{verbatim} + (\ (x:xs) -> ...) +\end{verbatim} + he/she don't want a warning about incomplete patterns, that is done with + the flag @opt_WarnSimplePatterns@. 
+ This problem also appears in the: +\begin{itemize} +\item @do@ patterns, but if the @do@ can fail + it creates another equation if the match can fail + (see @DsExpr.doDo@ function) +\item @let@ patterns, are treated by @matchSimply@ + List Comprension Patterns, are treated by @matchSimply@ also +\end{itemize} + +We can't call @matchSimply@ with Lambda patterns, +due to the fact that lambda patterns can have more than +one pattern, and match simply only accepts one pattern. + +JJQC 30-Nov-1997 + +\begin{code} +matchWrapper ctxt (MatchGroup matches match_ty) + = do { eqns_info <- mapM mk_eqn_info matches + ; new_vars <- selectMatchVars arg_pats pat_tys + ; result_expr <- matchEquations ctxt new_vars eqns_info rhs_ty + ; return (new_vars, result_expr) } + where + arg_pats = map unLoc (hsLMatchPats (head matches)) + n_pats = length arg_pats + (pat_tys, rhs_ty) = splitFunTysN n_pats match_ty + + mk_eqn_info (L _ (Match pats _ grhss)) + = do { let upats = map unLoc pats + ; match_result <- dsGRHSs ctxt upats grhss rhs_ty + ; return (EqnInfo { eqn_wrap = idWrapper, + eqn_pats = upats, + eqn_rhs = match_result}) } + + +matchEquations :: HsMatchContext Name + -> [Id] -> [EquationInfo] -> Type + -> DsM CoreExpr +matchEquations ctxt vars eqns_info rhs_ty + = do { dflags <- getDOptsDs + ; locn <- getSrcSpanDs + ; let ds_ctxt = DsMatchContext ctxt locn + error_string = matchContextErrString ctxt + + ; match_result <- match_fun dflags ds_ctxt vars rhs_ty eqns_info + + ; fail_expr <- mkErrorAppDs pAT_ERROR_ID rhs_ty error_string + ; extractMatchResult match_result fail_expr } + where + match_fun dflags ds_ctxt + = case ctxt of + LambdaExpr | dopt Opt_WarnSimplePatterns dflags -> matchCheck ds_ctxt + | otherwise -> match + _ -> matchCheck ds_ctxt +\end{code} + +%************************************************************************ +%* * +\subsection[matchSimply]{@matchSimply@: match a single expression against a single pattern} +%* * +%************************************************************************ + +@mkSimpleMatch@ is a wrapper for @match@ which deals with the +situation where we want to match a single expression against a single +pattern. It returns an expression. + +\begin{code} +matchSimply :: CoreExpr -- Scrutinee + -> HsMatchContext Name -- Match kind + -> LPat Id -- Pattern it should match + -> CoreExpr -- Return this if it matches + -> CoreExpr -- Return this if it doesn't + -> DsM CoreExpr + +matchSimply scrut hs_ctx pat result_expr fail_expr + = let + match_result = cantFailMatchResult result_expr + rhs_ty = exprType fail_expr + -- Use exprType of fail_expr, because won't refine in the case of failure! 
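+	-- (The pattern match may refine the type of result_expr in the success
+	-- branch, so its type need not be the type of the whole expression;
+	-- fail_expr sits outside the match, so its type is the one we want here.)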
+ in + matchSinglePat scrut hs_ctx pat rhs_ty match_result `thenDs` \ match_result' -> + extractMatchResult match_result' fail_expr + + +matchSinglePat :: CoreExpr -> HsMatchContext Name -> LPat Id + -> Type -> MatchResult -> DsM MatchResult +matchSinglePat (Var var) hs_ctx (L _ pat) ty match_result + = getDOptsDs `thenDs` \ dflags -> + getSrcSpanDs `thenDs` \ locn -> + let + match_fn dflags + | dopt Opt_WarnSimplePatterns dflags = matchCheck ds_ctx + | otherwise = match + where + ds_ctx = DsMatchContext hs_ctx locn + in + match_fn dflags [var] ty [EqnInfo { eqn_wrap = idWrapper, + eqn_pats = [pat], + eqn_rhs = match_result }] + +matchSinglePat scrut hs_ctx pat ty match_result + = selectSimpleMatchVarL pat `thenDs` \ var -> + matchSinglePat (Var var) hs_ctx pat ty match_result `thenDs` \ match_result' -> + returnDs (adjustMatchResult (bindNonRec var scrut) match_result') +\end{code} + diff --git a/compiler/deSugar/Match.lhs-boot b/compiler/deSugar/Match.lhs-boot new file mode 100644 index 0000000000..5f99f5cc1a --- /dev/null +++ b/compiler/deSugar/Match.lhs-boot @@ -0,0 +1,35 @@ +\begin{code} +module Match where +import Var ( Id ) +import TcType ( TcType ) +import DsMonad ( DsM, EquationInfo, MatchResult ) +import CoreSyn ( CoreExpr ) +import HsSyn ( LPat, HsMatchContext, MatchGroup ) +import Name ( Name ) + +match :: [Id] + -> TcType + -> [EquationInfo] + -> DsM MatchResult + +matchWrapper + :: HsMatchContext Name + -> MatchGroup Id + -> DsM ([Id], CoreExpr) + +matchSimply + :: CoreExpr + -> HsMatchContext Name + -> LPat Id + -> CoreExpr + -> CoreExpr + -> DsM CoreExpr + +matchSinglePat + :: CoreExpr + -> HsMatchContext Name + -> LPat Id + -> TcType + -> MatchResult + -> DsM MatchResult +\end{code} diff --git a/compiler/deSugar/MatchCon.lhs b/compiler/deSugar/MatchCon.lhs new file mode 100644 index 0000000000..6ff502a8ae --- /dev/null +++ b/compiler/deSugar/MatchCon.lhs @@ -0,0 +1,174 @@ + +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[MatchCon]{Pattern-matching constructors} + +\begin{code} +module MatchCon ( matchConFamily ) where + +#include "HsVersions.h" + +import Id( idType ) + +import {-# SOURCE #-} Match ( match ) + +import HsSyn ( Pat(..), HsConDetails(..) ) +import DsBinds ( dsLHsBinds ) +import DataCon ( isVanillaDataCon, dataConInstOrigArgTys ) +import TcType ( tcTyConAppArgs ) +import Type ( mkTyVarTys ) +import CoreSyn +import DsMonad +import DsUtils + +import Id ( Id ) +import Type ( Type ) +import ListSetOps ( equivClassesByUniq ) +import SrcLoc ( unLoc, Located(..) ) +import Unique ( Uniquable(..) ) +import Outputable +\end{code} + +We are confronted with the first column of patterns in a set of +equations, all beginning with constructors from one ``family'' (e.g., +@[]@ and @:@ make up the @List@ ``family''). We want to generate the +alternatives for a @Case@ expression. There are several choices: +\begin{enumerate} +\item +Generate an alternative for every constructor in the family, whether +they are used in this set of equations or not; this is what the Wadler +chapter does. +\begin{description} +\item[Advantages:] +(a)~Simple. (b)~It may also be that large sparsely-used constructor +families are mainly handled by the code for literals. +\item[Disadvantages:] +(a)~Not practical for large sparsely-used constructor families, e.g., +the ASCII character set. (b)~Have to look up a list of what +constructors make up the whole family. 
+\end{description} + +\item +Generate an alternative for each constructor used, then add a default +alternative in case some constructors in the family weren't used. +\begin{description} +\item[Advantages:] +(a)~Alternatives aren't generated for unused constructors. (b)~The +STG is quite happy with defaults. (c)~No lookup in an environment needed. +\item[Disadvantages:] +(a)~A spurious default alternative may be generated. +\end{description} + +\item +``Do it right:'' generate an alternative for each constructor used, +and add a default alternative if all constructors in the family +weren't used. +\begin{description} +\item[Advantages:] +(a)~You will get cases with only one alternative (and no default), +which should be amenable to optimisation. Tuples are a common example. +\item[Disadvantages:] +(b)~Have to look up constructor families in TDE (as above). +\end{description} +\end{enumerate} + +We are implementing the ``do-it-right'' option for now. The arguments +to @matchConFamily@ are the same as to @match@; the extra @Int@ +returned is the number of constructors in the family. + +The function @matchConFamily@ is concerned with this +have-we-used-all-the-constructors? question; the local function +@match_cons_used@ does all the real work. +\begin{code} +matchConFamily :: [Id] + -> Type + -> [EquationInfo] + -> DsM MatchResult +matchConFamily (var:vars) ty eqns_info + = let + -- Sort into equivalence classes by the unique on the constructor + -- All the EqnInfos should start with a ConPat + groups = equivClassesByUniq get_uniq eqns_info + get_uniq (EqnInfo { eqn_pats = ConPatOut (L _ data_con) _ _ _ _ _ : _}) = getUnique data_con + + -- Get the wrapper from the head of each group. We're going to + -- use it as the pattern in this case expression, so we need to + -- ensure that any type variables it mentions in the pattern are + -- in scope. So we put its wrappers outside the case, and + -- zap the wrapper for it. + wraps :: [CoreExpr -> CoreExpr] + wraps = map (eqn_wrap . head) groups + + groups' = [ eqn { eqn_wrap = idWrapper } : eqns | eqn:eqns <- groups ] + in + -- Now make a case alternative out of each group + mappM (match_con vars ty) groups' `thenDs` \ alts -> + returnDs (adjustMatchResult (foldr (.) idWrapper wraps) $ + mkCoAlgCaseMatchResult var ty alts) +\end{code} + +And here is the local function that does all the work. It is +more-or-less the @matchCon@/@matchClause@ functions on page~94 in +Wadler's chapter in SLPJ. The function @shift_con_pats@ does what the +list comprehension in @matchClause@ (SLPJ, p.~94) does, except things +are trickier in real life. Works for @ConPats@, and we want it to +fail catastrophically for anything else (which a list comprehension +wouldn't). Cf.~@shift_lit_pats@ in @MatchLits@. + +\begin{code} +match_con vars ty eqns + = do { -- Make new vars for the con arguments; avoid new locals where possible + arg_vars <- selectMatchVars (map unLoc arg_pats1) arg_tys + ; eqns' <- mapM shift eqns + ; match_result <- match (arg_vars ++ vars) ty eqns' + ; return (con, tvs1 ++ dicts1 ++ arg_vars, match_result) } + where + ConPatOut (L _ con) tvs1 dicts1 _ (PrefixCon arg_pats1) pat_ty = firstPat (head eqns) + + shift eqn@(EqnInfo { eqn_wrap = wrap, + eqn_pats = ConPatOut _ tvs ds bind (PrefixCon arg_pats) _ : pats }) + = do { prs <- dsLHsBinds bind + ; return (eqn { eqn_wrap = wrap . wrapBinds (tvs `zip` tvs1) + . wrapBinds (ds `zip` dicts1) + . 
mkDsLet (Rec prs), + eqn_pats = map unLoc arg_pats ++ pats }) } + + -- Get the arg types, which we use to type the new vars + -- to match on, from the "outside"; the types of pats1 may + -- be more refined, and hence won't do + arg_tys = dataConInstOrigArgTys con inst_tys + inst_tys | isVanillaDataCon con = tcTyConAppArgs pat_ty -- Newtypes opaque! + | otherwise = mkTyVarTys tvs1 +\end{code} + +Note [Existentials in shift_con_pat] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Consider + data T = forall a. Ord a => T a (a->Int) + + f (T x f) True = ...expr1... + f (T y g) False = ...expr2.. + +When we put in the tyvars etc we get + + f (T a (d::Ord a) (x::a) (f::a->Int)) True = ...expr1... + f (T b (e::Ord b) (y::a) (g::a->Int)) True = ...expr2... + +After desugaring etc we'll get a single case: + + f = \t::T b::Bool -> + case t of + T a (d::Ord a) (x::a) (f::a->Int)) -> + case b of + True -> ...expr1... + False -> ...expr2... + +*** We have to substitute [a/b, d/e] in expr2! ** +Hence + False -> ....((/\b\(e:Ord b).expr2) a d).... + +Originally I tried to use + (\b -> let e = d in expr2) a +to do this substitution. While this is "correct" in a way, it fails +Lint, because e::Ord b but d::Ord a. + diff --git a/compiler/deSugar/MatchLit.lhs b/compiler/deSugar/MatchLit.lhs new file mode 100644 index 0000000000..0b7907b22e --- /dev/null +++ b/compiler/deSugar/MatchLit.lhs @@ -0,0 +1,329 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[MatchLit]{Pattern-matching literal patterns} + +\begin{code} +module MatchLit ( dsLit, dsOverLit, + tidyLitPat, tidyNPat, + matchLiterals, matchNPlusKPats, matchNPats ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} Match ( match ) +import {-# SOURCE #-} DsExpr ( dsExpr ) + +import DsMonad +import DsUtils + +import HsSyn +import Id ( Id, idType ) +import CoreSyn +import TyCon ( tyConDataCons ) +import TcType ( tcSplitTyConApp, isIntegerTy, isIntTy, + isFloatTy, isDoubleTy, isStringTy ) +import Type ( Type ) +import PrelNames ( ratioTyConKey ) +import TysWiredIn ( stringTy, consDataCon, intDataCon, floatDataCon, doubleDataCon ) +import PrelNames ( eqStringName ) +import Unique ( hasKey ) +import Literal ( mkMachInt, Literal(..) ) +import SrcLoc ( noLoc ) +import ListSetOps ( equivClasses, runs ) +import Ratio ( numerator, denominator ) +import SrcLoc ( Located(..) ) +import Outputable +import FastString ( lengthFS, unpackFS ) +\end{code} + +%************************************************************************ +%* * + Desugaring literals + [used to be in DsExpr, but DsMeta needs it, + and it's nice to avoid a loop] +%* * +%************************************************************************ + +We give int/float literals type @Integer@ and @Rational@, respectively. +The typechecker will (presumably) have put \tr{from{Integer,Rational}s} +around them. + +ToDo: put in range checks for when converting ``@i@'' +(or should that be in the typechecker?) + +For numeric literals, we try to detect there use at a standard type +(@Int@, @Float@, etc.) are directly put in the right constructor. +[NB: down with the @App@ conversion.] + +See also below where we look for @DictApps@ for \tr{plusInt}, etc. 
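+As a concrete (if informal) example of what reaches this code: in
+\begin{verbatim}
+	f :: Int -> Int
+	f x = x + 3
+\end{verbatim}
+the literal @3@ arrives as an @HsOverLit@ whose @SyntaxExpr@ field already
+contains (an expression for) the literal at its final type---roughly
+@fromInteger 3@ instantiated at @Int@---so @dsOverLit@ below just desugars
+that stored expression; a primitive literal such as @3#@ arrives as an
+@HsIntPrim@ and @dsLit@ turns it straight into a machine integer literal.
+(This is a sketch of the common case, not a specification.)
+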
+ +\begin{code} +dsLit :: HsLit -> DsM CoreExpr +dsLit (HsChar c) = returnDs (mkCharExpr c) +dsLit (HsCharPrim c) = returnDs (mkLit (MachChar c)) +dsLit (HsString str) = mkStringExprFS str +dsLit (HsStringPrim s) = returnDs (mkLit (MachStr s)) +dsLit (HsInteger i _) = mkIntegerExpr i +dsLit (HsInt i) = returnDs (mkIntExpr i) +dsLit (HsIntPrim i) = returnDs (mkIntLit i) +dsLit (HsFloatPrim f) = returnDs (mkLit (MachFloat f)) +dsLit (HsDoublePrim d) = returnDs (mkLit (MachDouble d)) + +dsLit (HsRat r ty) + = mkIntegerExpr (numerator r) `thenDs` \ num -> + mkIntegerExpr (denominator r) `thenDs` \ denom -> + returnDs (mkConApp ratio_data_con [Type integer_ty, num, denom]) + where + (ratio_data_con, integer_ty) + = case tcSplitTyConApp ty of + (tycon, [i_ty]) -> ASSERT(isIntegerTy i_ty && tycon `hasKey` ratioTyConKey) + (head (tyConDataCons tycon), i_ty) + +dsOverLit :: HsOverLit Id -> DsM CoreExpr +-- Post-typechecker, the SyntaxExpr field of an OverLit contains +-- (an expression for) the literal value itself +dsOverLit (HsIntegral _ lit) = dsExpr lit +dsOverLit (HsFractional _ lit) = dsExpr lit +\end{code} + +%************************************************************************ +%* * + Tidying lit pats +%* * +%************************************************************************ + +\begin{code} +tidyLitPat :: HsLit -> LPat Id -> LPat Id +-- Result has only the following HsLits: +-- HsIntPrim, HsCharPrim, HsFloatPrim +-- HsDoublePrim, HsStringPrim, HsString +-- * HsInteger, HsRat, HsInt can't show up in LitPats +-- * We get rid of HsChar right here +tidyLitPat (HsChar c) pat = mkCharLitPat c +tidyLitPat (HsString s) pat + | lengthFS s <= 1 -- Short string literals only + = foldr (\c pat -> mkPrefixConPat consDataCon [mkCharLitPat c,pat] stringTy) + (mkNilPat stringTy) (unpackFS s) + -- The stringTy is the type of the whole pattern, not + -- the type to instantiate (:) or [] with! +tidyLitPat lit pat = pat + +---------------- +tidyNPat :: HsOverLit Id -> Maybe (SyntaxExpr Id) -> Type -> LPat Id -> LPat Id +tidyNPat over_lit mb_neg lit_ty default_pat + | isIntTy lit_ty = mk_con_pat intDataCon (HsIntPrim int_val) + | isFloatTy lit_ty = mk_con_pat floatDataCon (HsFloatPrim rat_val) + | isDoubleTy lit_ty = mk_con_pat doubleDataCon (HsDoublePrim rat_val) + | otherwise = default_pat + where + mk_con_pat con lit = mkPrefixConPat con [noLoc $ LitPat lit] lit_ty + neg_lit = case (mb_neg, over_lit) of + (Nothing, _) -> over_lit + (Just _, HsIntegral i s) -> HsIntegral (-i) s + (Just _, HsFractional f s) -> HsFractional (-f) s + + int_val :: Integer + int_val = case neg_lit of + HsIntegral i _ -> i + HsFractional f _ -> panic "tidyNPat" + + rat_val :: Rational + rat_val = case neg_lit of + HsIntegral i _ -> fromInteger i + HsFractional f _ -> f +\end{code} + + +%************************************************************************ +%* * + Pattern matching on LitPat +%* * +%************************************************************************ + +\begin{code} +matchLiterals :: [Id] + -> Type -- Type of the whole case expression + -> [EquationInfo] + -> DsM MatchResult +-- All the EquationInfos have LitPats at the front + +matchLiterals (var:vars) ty eqns + = do { -- Group by literal + let groups :: [[(Literal, EquationInfo)]] + groups = equivClasses cmpTaggedEqn (tagLitEqns eqns) + + -- Deal with each group + ; alts <- mapM match_group groups + + -- Combine results. 
For everything except String + -- we can use a case expression; for String we need + -- a chain of if-then-else + ; if isStringTy (idType var) then + do { mrs <- mapM wrap_str_guard alts + ; return (foldr1 combineMatchResults mrs) } + else + return (mkCoPrimCaseMatchResult var ty alts) + } + where + match_group :: [(Literal, EquationInfo)] -> DsM (Literal, MatchResult) + match_group group + = do { let (lits, eqns) = unzip group + ; match_result <- match vars ty (shiftEqns eqns) + ; return (head lits, match_result) } + + wrap_str_guard :: (Literal,MatchResult) -> DsM MatchResult + -- Equality check for string literals + wrap_str_guard (MachStr s, mr) + = do { eq_str <- dsLookupGlobalId eqStringName + ; lit <- mkStringExprFS s + ; let pred = mkApps (Var eq_str) [Var var, lit] + ; return (mkGuardedMatchResult pred mr) } +\end{code} + +%************************************************************************ +%* * + Pattern matching on NPat +%* * +%************************************************************************ + +\begin{code} +matchNPats :: [Id] -> Type -> [EquationInfo] -> DsM MatchResult +-- All the EquationInfos have NPat at the front + +matchNPats (var:vars) ty eqns + = do { let groups :: [[(Literal, EquationInfo)]] + groups = equivClasses cmpTaggedEqn (tagLitEqns eqns) + + ; match_results <- mapM (match_group . map snd) groups + + ; ASSERT( not (null match_results) ) + return (foldr1 combineMatchResults match_results) } + where + match_group :: [EquationInfo] -> DsM MatchResult + match_group (eqn1:eqns) + = do { lit_expr <- dsOverLit lit + ; neg_lit <- case mb_neg of + Nothing -> return lit_expr + Just neg -> do { neg_expr <- dsExpr neg + ; return (App neg_expr lit_expr) } + ; eq_expr <- dsExpr eq_chk + ; let pred_expr = mkApps eq_expr [Var var, neg_lit] + ; match_result <- match vars ty (eqn1' : shiftEqns eqns) + ; return (adjustMatchResult (eqn_wrap eqn1) $ + -- Bring the eqn1 wrapper stuff into scope because + -- it may be used in pred_expr + mkGuardedMatchResult pred_expr match_result) } + where + NPat lit mb_neg eq_chk _ : pats1 = eqn_pats eqn1 + eqn1' = eqn1 { eqn_wrap = idWrapper, eqn_pats = pats1 } +\end{code} + + +%************************************************************************ +%* * + Pattern matching on n+k patterns +%* * +%************************************************************************ + +For an n+k pattern, we use the various magic expressions we've been given. +We generate: +\begin{verbatim} + if ge var lit then + let n = sub var lit + in <expr-for-a-successful-match> + else + <try-next-pattern-or-whatever> +\end{verbatim} + +WATCH OUT! Consider + + f (n+1) = ... + f (n+2) = ... + f (n+1) = ... + +We can't group the first and third together, because the second may match +the same thing as the first. Contrast + f 1 = ... + f 2 = ... + f 1 = ... +where we can group the first and third. Hence 'runs' rather than 'equivClasses' + +\begin{code} +matchNPlusKPats all_vars@(var:vars) ty eqns + = do { let groups :: [[(Literal, EquationInfo)]] + groups = runs eqTaggedEqn (tagLitEqns eqns) + + ; match_results <- mapM (match_group . 
map snd) groups + + ; ASSERT( not (null match_results) ) + return (foldr1 combineMatchResults match_results) } + where + match_group :: [EquationInfo] -> DsM MatchResult + match_group (eqn1:eqns) + = do { ge_expr <- dsExpr ge + ; minus_expr <- dsExpr minus + ; lit_expr <- dsOverLit lit + ; let pred_expr = mkApps ge_expr [Var var, lit_expr] + minusk_expr = mkApps minus_expr [Var var, lit_expr] + ; match_result <- match vars ty (eqn1' : map shift eqns) + ; return (adjustMatchResult (eqn_wrap eqn1) $ + -- Bring the eqn1 wrapper stuff into scope because + -- it may be used in ge_expr, minusk_expr + mkGuardedMatchResult pred_expr $ + mkCoLetMatchResult (NonRec n1 minusk_expr) $ + match_result) } + where + NPlusKPat (L _ n1) lit ge minus : pats1 = eqn_pats eqn1 + eqn1' = eqn1 { eqn_wrap = idWrapper, eqn_pats = pats1 } + + shift eqn@(EqnInfo { eqn_wrap = wrap, + eqn_pats = NPlusKPat (L _ n) _ _ _ : pats }) + = eqn { eqn_wrap = wrap . wrapBind n n1, eqn_pats = pats } +\end{code} + + +%************************************************************************ +%* * + Grouping functions +%* * +%************************************************************************ + +Given a blob of @LitPat@s/@NPat@s, we want to split them into those +that are ``same''/different as one we are looking at. We need to know +whether we're looking at a @LitPat@/@NPat@, and what literal we're after. + +\begin{code} +-- Tag equations by the leading literal +-- NB: we have ordering on Core Literals, but not on HsLits +cmpTaggedEqn :: (Literal,EquationInfo) -> (Literal,EquationInfo) -> Ordering +cmpTaggedEqn (lit1,_) (lit2,_) = lit1 `compare` lit2 + +eqTaggedEqn :: (Literal,EquationInfo) -> (Literal,EquationInfo) -> Bool +eqTaggedEqn (lit1,_) (lit2,_) = lit1 == lit2 + +tagLitEqns :: [EquationInfo] -> [(Literal, EquationInfo)] +tagLitEqns eqns = [(get_lit (firstPat eqn), eqn) | eqn <- eqns] + +get_lit :: Pat Id -> Literal +-- Get a Core literal to use (only) a grouping key +-- Hence its type doesn't need to match the type of the original literal +get_lit (LitPat (HsIntPrim i)) = mkMachInt i +get_lit (LitPat (HsCharPrim c)) = MachChar c +get_lit (LitPat (HsStringPrim s)) = MachStr s +get_lit (LitPat (HsFloatPrim f)) = MachFloat f +get_lit (LitPat (HsDoublePrim d)) = MachDouble d +get_lit (LitPat (HsString s)) = MachStr s + +get_lit (NPat (HsIntegral i _) Nothing _ _) = MachInt i +get_lit (NPat (HsIntegral i _) (Just _) _ _) = MachInt (-i) +get_lit (NPat (HsFractional r _) Nothing _ _) = MachFloat r +get_lit (NPat (HsFractional r _) (Just _) _ _) = MachFloat (-r) + +get_lit (NPlusKPat _ (HsIntegral i _) _ _) = MachInt i + +-- These ones can't happen +-- get_lit (LitPat (HsChar c)) +-- get_lit (LitPat (HsInt i)) +get_lit other = pprPanic "get_lit:bad pattern" (ppr other) +\end{code} + diff --git a/compiler/deSugar/deSugar.tex b/compiler/deSugar/deSugar.tex new file mode 100644 index 0000000000..02cb285742 --- /dev/null +++ b/compiler/deSugar/deSugar.tex @@ -0,0 +1,23 @@ +\documentstyle{report} +\input{lit-style} + +\begin{document} +\centerline{{\Large{deSugar}}} +\tableofcontents + +\input{Desugar} % {@deSugar@: the main function} +\input{DsBinds} % {Pattern-matching bindings (HsBinds and MonoBinds)} +\input{DsGRHSs} % {Matching guarded right-hand-sides (GRHSs)} +\input{DsExpr} % {Matching expressions (Exprs)} +\input{DsHsSyn} % {Haskell abstract syntax---added things for desugarer} +\input{DsListComp} % {Desugaring list comprehensions} +\input{DsMonad} % {@DsMonad@: monadery used in desugaring} +\input{DsUtils} % {Utilities 
for desugaring} +\input{Check} % {Module @Check@ in @deSugar@} +\input{Match} % {The @match@ function} +\input{MatchCon} % {Pattern-matching constructors} +\input{MatchLit} % {Pattern-matching literal patterns} +\input{DsForeign} % {Desugaring \tr{foreign} declarations} +\input{DsCCall} % {Desugaring \tr{_ccall_}s and \tr{_casm_}s} + +\end{document} diff --git a/compiler/ghci/ByteCodeAsm.lhs b/compiler/ghci/ByteCodeAsm.lhs new file mode 100644 index 0000000000..e332413dae --- /dev/null +++ b/compiler/ghci/ByteCodeAsm.lhs @@ -0,0 +1,497 @@ +% +% (c) The University of Glasgow 2002 +% +\section[ByteCodeLink]{Bytecode assembler and linker} + +\begin{code} +{-# OPTIONS -optc-DNON_POSIX_SOURCE #-} + +module ByteCodeAsm ( + assembleBCOs, assembleBCO, + + CompiledByteCode(..), + UnlinkedBCO(..), BCOPtr(..), bcoFreeNames, + SizedSeq, sizeSS, ssElts, + iNTERP_STACK_CHECK_THRESH + ) where + +#include "HsVersions.h" + +import ByteCodeInstr +import ByteCodeItbls ( ItblEnv, mkITbls ) + +import Name ( Name, getName ) +import NameSet +import FiniteMap ( addToFM, lookupFM, emptyFM ) +import Literal ( Literal(..) ) +import TyCon ( TyCon ) +import PrimOp ( PrimOp ) +import Constants ( wORD_SIZE ) +import FastString ( FastString(..) ) +import SMRep ( CgRep(..), StgWord ) +import FiniteMap +import Outputable + +import Control.Monad ( foldM ) +import Control.Monad.ST ( runST ) + +import GHC.Word ( Word(..) ) +import Data.Array.MArray +import Data.Array.Unboxed ( listArray ) +import Data.Array.Base ( UArray(..) ) +import Data.Array.ST ( castSTUArray ) +import Foreign ( Word16, free ) +import Data.Int ( Int64 ) +import Data.Char ( ord ) + +import GHC.Base ( ByteArray# ) +import GHC.IOBase ( IO(..) ) +import GHC.Ptr ( Ptr(..) ) + +-- ----------------------------------------------------------------------------- +-- Unlinked BCOs + +-- CompiledByteCode represents the result of byte-code +-- compiling a bunch of functions and data types + +data CompiledByteCode + = ByteCode [UnlinkedBCO] -- Bunch of interpretable bindings + ItblEnv -- A mapping from DataCons to their itbls + +instance Outputable CompiledByteCode where + ppr (ByteCode bcos _) = ppr bcos + + +data UnlinkedBCO + = UnlinkedBCO { + unlinkedBCOName :: Name, + unlinkedBCOArity :: Int, + unlinkedBCOInstrs :: ByteArray#, -- insns + unlinkedBCOBitmap :: ByteArray#, -- bitmap + unlinkedBCOLits :: (SizedSeq (Either Word FastString)), -- literals + -- Either literal words or a pointer to a asciiz + -- string, denoting a label whose *address* should + -- be determined at link time + unlinkedBCOPtrs :: (SizedSeq BCOPtr), -- ptrs + unlinkedBCOItbls :: (SizedSeq Name) -- itbl refs + } + +data BCOPtr + = BCOPtrName Name + | BCOPtrPrimOp PrimOp + | BCOPtrBCO UnlinkedBCO + +-- | Finds external references. 
Remember to remove the names +-- defined by this group of BCOs themselves +bcoFreeNames :: UnlinkedBCO -> NameSet +bcoFreeNames bco + = bco_refs bco `minusNameSet` mkNameSet [unlinkedBCOName bco] + where + bco_refs (UnlinkedBCO _ _ _ _ _ ptrs itbls) + = unionManyNameSets ( + mkNameSet [ n | BCOPtrName n <- ssElts ptrs ] : + mkNameSet (ssElts itbls) : + map bco_refs [ bco | BCOPtrBCO bco <- ssElts ptrs ] + ) + +instance Outputable UnlinkedBCO where + ppr (UnlinkedBCO nm arity insns bitmap lits ptrs itbls) + = sep [text "BCO", ppr nm, text "with", + int (sizeSS lits), text "lits", + int (sizeSS ptrs), text "ptrs", + int (sizeSS itbls), text "itbls"] + +-- ----------------------------------------------------------------------------- +-- The bytecode assembler + +-- The object format for bytecodes is: 16 bits for the opcode, and 16 +-- for each field -- so the code can be considered a sequence of +-- 16-bit ints. Each field denotes either a stack offset or number of +-- items on the stack (eg SLIDE), and index into the pointer table (eg +-- PUSH_G), an index into the literal table (eg PUSH_I/D/L), or a +-- bytecode address in this BCO. + +-- Top level assembler fn. +assembleBCOs :: [ProtoBCO Name] -> [TyCon] -> IO CompiledByteCode +assembleBCOs proto_bcos tycons + = do itblenv <- mkITbls tycons + bcos <- mapM assembleBCO proto_bcos + return (ByteCode bcos itblenv) + +assembleBCO :: ProtoBCO Name -> IO UnlinkedBCO +assembleBCO (ProtoBCO nm instrs bitmap bsize arity origin malloced) + = let + -- pass 1: collect up the offsets of the local labels. + -- Remember that the first insn starts at offset 1 since offset 0 + -- (eventually) will hold the total # of insns. + label_env = mkLabelEnv emptyFM 1 instrs + + mkLabelEnv env i_offset [] = env + mkLabelEnv env i_offset (i:is) + = let new_env + = case i of LABEL n -> addToFM env n i_offset ; _ -> env + in mkLabelEnv new_env (i_offset + instrSize16s i) is + + findLabel lab + = case lookupFM label_env lab of + Just bco_offset -> bco_offset + Nothing -> pprPanic "assembleBCO.findLabel" (int lab) + in + do -- pass 2: generate the instruction, ptr and nonptr bits + insns <- return emptySS :: IO (SizedSeq Word16) + lits <- return emptySS :: IO (SizedSeq (Either Word FastString)) + ptrs <- return emptySS :: IO (SizedSeq BCOPtr) + itbls <- return emptySS :: IO (SizedSeq Name) + let init_asm_state = (insns,lits,ptrs,itbls) + (final_insns, final_lits, final_ptrs, final_itbls) + <- mkBits findLabel init_asm_state instrs + + let asm_insns = ssElts final_insns + n_insns = sizeSS final_insns + + insns_arr + | n_insns > 65535 = panic "linkBCO: >= 64k insns in BCO" + | otherwise = mkInstrArray n_insns asm_insns + insns_barr = case insns_arr of UArray _lo _hi barr -> barr + + bitmap_arr = mkBitmapArray bsize bitmap + bitmap_barr = case bitmap_arr of UArray _lo _hi barr -> barr + + let ul_bco = UnlinkedBCO nm arity insns_barr bitmap_barr final_lits + final_ptrs final_itbls + + -- 8 Aug 01: Finalisers aren't safe when attached to non-primitive + -- objects, since they might get run too early. Disable this until + -- we figure out what to do. 
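+          -- (The intention was that the malloc'd blocks recorded in 'malloced'
+          -- would be freed, via 'zonk' below, once the BCO is garbage
+          -- collected; with the finaliser disabled those blocks are simply
+          -- retained.)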
+ -- when (notNull malloced) (addFinalizer ul_bco (mapM_ zonk malloced)) + + return ul_bco + where + zonk ptr = do -- putStrLn ("freeing malloc'd block at " ++ show (A# a#)) + free ptr + +mkBitmapArray :: Int -> [StgWord] -> UArray Int StgWord +mkBitmapArray bsize bitmap + = listArray (0, length bitmap) (fromIntegral bsize : bitmap) + +mkInstrArray :: Int -> [Word16] -> UArray Int Word16 +mkInstrArray n_insns asm_insns + = listArray (0, n_insns) (fromIntegral n_insns : asm_insns) + +-- instrs nonptrs ptrs itbls +type AsmState = (SizedSeq Word16, + SizedSeq (Either Word FastString), + SizedSeq BCOPtr, + SizedSeq Name) + +data SizedSeq a = SizedSeq !Int [a] +emptySS = SizedSeq 0 [] + +-- Why are these two monadic??? +addToSS (SizedSeq n r_xs) x = return (SizedSeq (n+1) (x:r_xs)) +addListToSS (SizedSeq n r_xs) xs + = return (SizedSeq (n + length xs) (reverse xs ++ r_xs)) + +ssElts :: SizedSeq a -> [a] +ssElts (SizedSeq n r_xs) = reverse r_xs + +sizeSS :: SizedSeq a -> Int +sizeSS (SizedSeq n r_xs) = n + +-- Bring in all the bci_ bytecode constants. +#include "Bytecodes.h" + +-- This is where all the action is (pass 2 of the assembler) +mkBits :: (Int -> Int) -- label finder + -> AsmState + -> [BCInstr] -- instructions (in) + -> IO AsmState + +mkBits findLabel st proto_insns + = foldM doInstr st proto_insns + where + doInstr :: AsmState -> BCInstr -> IO AsmState + doInstr st i + = case i of + STKCHECK n -> instr2 st bci_STKCHECK n + PUSH_L o1 -> instr2 st bci_PUSH_L o1 + PUSH_LL o1 o2 -> instr3 st bci_PUSH_LL o1 o2 + PUSH_LLL o1 o2 o3 -> instr4 st bci_PUSH_LLL o1 o2 o3 + PUSH_G nm -> do (p, st2) <- ptr st (BCOPtrName nm) + instr2 st2 bci_PUSH_G p + PUSH_PRIMOP op -> do (p, st2) <- ptr st (BCOPtrPrimOp op) + instr2 st2 bci_PUSH_G p + PUSH_BCO proto -> do ul_bco <- assembleBCO proto + (p, st2) <- ptr st (BCOPtrBCO ul_bco) + instr2 st2 bci_PUSH_G p + PUSH_ALTS proto -> do ul_bco <- assembleBCO proto + (p, st2) <- ptr st (BCOPtrBCO ul_bco) + instr2 st2 bci_PUSH_ALTS p + PUSH_ALTS_UNLIFTED proto pk -> do + ul_bco <- assembleBCO proto + (p, st2) <- ptr st (BCOPtrBCO ul_bco) + instr2 st2 (push_alts pk) p + PUSH_UBX (Left lit) nws + -> do (np, st2) <- literal st lit + instr3 st2 bci_PUSH_UBX np nws + PUSH_UBX (Right aa) nws + -> do (np, st2) <- addr st aa + instr3 st2 bci_PUSH_UBX np nws + + PUSH_APPLY_N -> do instr1 st bci_PUSH_APPLY_N + PUSH_APPLY_V -> do instr1 st bci_PUSH_APPLY_V + PUSH_APPLY_F -> do instr1 st bci_PUSH_APPLY_F + PUSH_APPLY_D -> do instr1 st bci_PUSH_APPLY_D + PUSH_APPLY_L -> do instr1 st bci_PUSH_APPLY_L + PUSH_APPLY_P -> do instr1 st bci_PUSH_APPLY_P + PUSH_APPLY_PP -> do instr1 st bci_PUSH_APPLY_PP + PUSH_APPLY_PPP -> do instr1 st bci_PUSH_APPLY_PPP + PUSH_APPLY_PPPP -> do instr1 st bci_PUSH_APPLY_PPPP + PUSH_APPLY_PPPPP -> do instr1 st bci_PUSH_APPLY_PPPPP + PUSH_APPLY_PPPPPP -> do instr1 st bci_PUSH_APPLY_PPPPPP + + SLIDE n by -> instr3 st bci_SLIDE n by + ALLOC_AP n -> instr2 st bci_ALLOC_AP n + ALLOC_PAP arity n -> instr3 st bci_ALLOC_PAP arity n + MKAP off sz -> instr3 st bci_MKAP off sz + MKPAP off sz -> instr3 st bci_MKPAP off sz + UNPACK n -> instr2 st bci_UNPACK n + PACK dcon sz -> do (itbl_no,st2) <- itbl st dcon + instr3 st2 bci_PACK itbl_no sz + LABEL lab -> return st + TESTLT_I i l -> do (np, st2) <- int st i + instr3 st2 bci_TESTLT_I np (findLabel l) + TESTEQ_I i l -> do (np, st2) <- int st i + instr3 st2 bci_TESTEQ_I np (findLabel l) + TESTLT_F f l -> do (np, st2) <- float st f + instr3 st2 bci_TESTLT_F np (findLabel l) + TESTEQ_F f l -> do (np, st2) <- float st f 
+ instr3 st2 bci_TESTEQ_F np (findLabel l) + TESTLT_D d l -> do (np, st2) <- double st d + instr3 st2 bci_TESTLT_D np (findLabel l) + TESTEQ_D d l -> do (np, st2) <- double st d + instr3 st2 bci_TESTEQ_D np (findLabel l) + TESTLT_P i l -> instr3 st bci_TESTLT_P i (findLabel l) + TESTEQ_P i l -> instr3 st bci_TESTEQ_P i (findLabel l) + CASEFAIL -> instr1 st bci_CASEFAIL + SWIZZLE stkoff n -> instr3 st bci_SWIZZLE stkoff n + JMP l -> instr2 st bci_JMP (findLabel l) + ENTER -> instr1 st bci_ENTER + RETURN -> instr1 st bci_RETURN + RETURN_UBX rep -> instr1 st (return_ubx rep) + CCALL off m_addr -> do (np, st2) <- addr st m_addr + instr3 st2 bci_CCALL off np + + i2s :: Int -> Word16 + i2s = fromIntegral + + instr1 (st_i0,st_l0,st_p0,st_I0) i1 + = do st_i1 <- addToSS st_i0 i1 + return (st_i1,st_l0,st_p0,st_I0) + + instr2 (st_i0,st_l0,st_p0,st_I0) i1 i2 + = do st_i1 <- addToSS st_i0 (i2s i1) + st_i2 <- addToSS st_i1 (i2s i2) + return (st_i2,st_l0,st_p0,st_I0) + + instr3 (st_i0,st_l0,st_p0,st_I0) i1 i2 i3 + = do st_i1 <- addToSS st_i0 (i2s i1) + st_i2 <- addToSS st_i1 (i2s i2) + st_i3 <- addToSS st_i2 (i2s i3) + return (st_i3,st_l0,st_p0,st_I0) + + instr4 (st_i0,st_l0,st_p0,st_I0) i1 i2 i3 i4 + = do st_i1 <- addToSS st_i0 (i2s i1) + st_i2 <- addToSS st_i1 (i2s i2) + st_i3 <- addToSS st_i2 (i2s i3) + st_i4 <- addToSS st_i3 (i2s i4) + return (st_i4,st_l0,st_p0,st_I0) + + float (st_i0,st_l0,st_p0,st_I0) f + = do let ws = mkLitF f + st_l1 <- addListToSS st_l0 (map Left ws) + return (sizeSS st_l0, (st_i0,st_l1,st_p0,st_I0)) + + double (st_i0,st_l0,st_p0,st_I0) d + = do let ws = mkLitD d + st_l1 <- addListToSS st_l0 (map Left ws) + return (sizeSS st_l0, (st_i0,st_l1,st_p0,st_I0)) + + int (st_i0,st_l0,st_p0,st_I0) i + = do let ws = mkLitI i + st_l1 <- addListToSS st_l0 (map Left ws) + return (sizeSS st_l0, (st_i0,st_l1,st_p0,st_I0)) + + int64 (st_i0,st_l0,st_p0,st_I0) i + = do let ws = mkLitI64 i + st_l1 <- addListToSS st_l0 (map Left ws) + return (sizeSS st_l0, (st_i0,st_l1,st_p0,st_I0)) + + addr (st_i0,st_l0,st_p0,st_I0) a + = do let ws = mkLitPtr a + st_l1 <- addListToSS st_l0 (map Left ws) + return (sizeSS st_l0, (st_i0,st_l1,st_p0,st_I0)) + + litlabel (st_i0,st_l0,st_p0,st_I0) fs + = do st_l1 <- addListToSS st_l0 [Right fs] + return (sizeSS st_l0, (st_i0,st_l1,st_p0,st_I0)) + + ptr (st_i0,st_l0,st_p0,st_I0) p + = do st_p1 <- addToSS st_p0 p + return (sizeSS st_p0, (st_i0,st_l0,st_p1,st_I0)) + + itbl (st_i0,st_l0,st_p0,st_I0) dcon + = do st_I1 <- addToSS st_I0 (getName dcon) + return (sizeSS st_I0, (st_i0,st_l0,st_p0,st_I1)) + + literal st (MachLabel fs _) = litlabel st fs + literal st (MachWord w) = int st (fromIntegral w) + literal st (MachInt j) = int st (fromIntegral j) + literal st (MachFloat r) = float st (fromRational r) + literal st (MachDouble r) = double st (fromRational r) + literal st (MachChar c) = int st (ord c) + literal st (MachInt64 ii) = int64 st (fromIntegral ii) + literal st (MachWord64 ii) = int64 st (fromIntegral ii) + literal st other = pprPanic "ByteCodeLink.literal" (ppr other) + + +push_alts NonPtrArg = bci_PUSH_ALTS_N +push_alts FloatArg = bci_PUSH_ALTS_F +push_alts DoubleArg = bci_PUSH_ALTS_D +push_alts VoidArg = bci_PUSH_ALTS_V +push_alts LongArg = bci_PUSH_ALTS_L +push_alts PtrArg = bci_PUSH_ALTS_P + +return_ubx NonPtrArg = bci_RETURN_N +return_ubx FloatArg = bci_RETURN_F +return_ubx DoubleArg = bci_RETURN_D +return_ubx VoidArg = bci_RETURN_V +return_ubx LongArg = bci_RETURN_L +return_ubx PtrArg = bci_RETURN_P + + +-- The size in 16-bit entities of an instruction. 
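+-- These sizes must agree with the number of 16-bit words that mkBits emits
+-- for each instruction (via instr1..instr4 above), because pass 1 uses them,
+-- through mkLabelEnv, to compute the bytecode offsets of LABELs.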
+instrSize16s :: BCInstr -> Int +instrSize16s instr + = case instr of + STKCHECK{} -> 2 + PUSH_L{} -> 2 + PUSH_LL{} -> 3 + PUSH_LLL{} -> 4 + PUSH_G{} -> 2 + PUSH_PRIMOP{} -> 2 + PUSH_BCO{} -> 2 + PUSH_ALTS{} -> 2 + PUSH_ALTS_UNLIFTED{} -> 2 + PUSH_UBX{} -> 3 + PUSH_APPLY_N{} -> 1 + PUSH_APPLY_V{} -> 1 + PUSH_APPLY_F{} -> 1 + PUSH_APPLY_D{} -> 1 + PUSH_APPLY_L{} -> 1 + PUSH_APPLY_P{} -> 1 + PUSH_APPLY_PP{} -> 1 + PUSH_APPLY_PPP{} -> 1 + PUSH_APPLY_PPPP{} -> 1 + PUSH_APPLY_PPPPP{} -> 1 + PUSH_APPLY_PPPPPP{} -> 1 + SLIDE{} -> 3 + ALLOC_AP{} -> 2 + ALLOC_PAP{} -> 3 + MKAP{} -> 3 + MKPAP{} -> 3 + UNPACK{} -> 2 + PACK{} -> 3 + LABEL{} -> 0 -- !! + TESTLT_I{} -> 3 + TESTEQ_I{} -> 3 + TESTLT_F{} -> 3 + TESTEQ_F{} -> 3 + TESTLT_D{} -> 3 + TESTEQ_D{} -> 3 + TESTLT_P{} -> 3 + TESTEQ_P{} -> 3 + JMP{} -> 2 + CASEFAIL{} -> 1 + ENTER{} -> 1 + RETURN{} -> 1 + RETURN_UBX{} -> 1 + CCALL{} -> 3 + SWIZZLE{} -> 3 + +-- Make lists of host-sized words for literals, so that when the +-- words are placed in memory at increasing addresses, the +-- bit pattern is correct for the host's word size and endianness. +mkLitI :: Int -> [Word] +mkLitF :: Float -> [Word] +mkLitD :: Double -> [Word] +mkLitPtr :: Ptr () -> [Word] +mkLitI64 :: Int64 -> [Word] + +mkLitF f + = runST (do + arr <- newArray_ ((0::Int),0) + writeArray arr 0 f + f_arr <- castSTUArray arr + w0 <- readArray f_arr 0 + return [w0 :: Word] + ) + +mkLitD d + | wORD_SIZE == 4 + = runST (do + arr <- newArray_ ((0::Int),1) + writeArray arr 0 d + d_arr <- castSTUArray arr + w0 <- readArray d_arr 0 + w1 <- readArray d_arr 1 + return [w0 :: Word, w1] + ) + | wORD_SIZE == 8 + = runST (do + arr <- newArray_ ((0::Int),0) + writeArray arr 0 d + d_arr <- castSTUArray arr + w0 <- readArray d_arr 0 + return [w0 :: Word] + ) + +mkLitI64 ii + | wORD_SIZE == 4 + = runST (do + arr <- newArray_ ((0::Int),1) + writeArray arr 0 ii + d_arr <- castSTUArray arr + w0 <- readArray d_arr 0 + w1 <- readArray d_arr 1 + return [w0 :: Word,w1] + ) + | wORD_SIZE == 8 + = runST (do + arr <- newArray_ ((0::Int),0) + writeArray arr 0 ii + d_arr <- castSTUArray arr + w0 <- readArray d_arr 0 + return [w0 :: Word] + ) + +mkLitI i + = runST (do + arr <- newArray_ ((0::Int),0) + writeArray arr 0 i + i_arr <- castSTUArray arr + w0 <- readArray i_arr 0 + return [w0 :: Word] + ) + +mkLitPtr a + = runST (do + arr <- newArray_ ((0::Int),0) + writeArray arr 0 a + a_arr <- castSTUArray arr + w0 <- readArray a_arr 0 + return [w0 :: Word] + ) + +iNTERP_STACK_CHECK_THRESH = (INTERP_STACK_CHECK_THRESH :: Int) +\end{code} diff --git a/compiler/ghci/ByteCodeFFI.lhs b/compiler/ghci/ByteCodeFFI.lhs new file mode 100644 index 0000000000..61e70d64e4 --- /dev/null +++ b/compiler/ghci/ByteCodeFFI.lhs @@ -0,0 +1,832 @@ +% +% (c) The University of Glasgow 2001 +% +\section[ByteCodeGen]{Generate machine-code sequences for foreign import} + +\begin{code} +module ByteCodeFFI ( mkMarshalCode, moan64 ) where + +#include "HsVersions.h" + +import Outputable +import SMRep ( CgRep(..), cgRepSizeW ) +import ForeignCall ( CCallConv(..) ) +import Panic + +-- DON'T remove apparently unused imports here .. 
+-- there is ifdeffery below +import Control.Exception ( throwDyn ) +import DATA_BITS ( Bits(..), shiftR, shiftL ) +import Foreign ( newArray ) +import Data.List ( mapAccumL ) + +import DATA_WORD ( Word8, Word32 ) +import Foreign ( Ptr ) +import System.IO.Unsafe ( unsafePerformIO ) +import IO ( hPutStrLn, stderr ) +import Debug.Trace ( trace ) +\end{code} + +%************************************************************************ +%* * +\subsection{The platform-dependent marshall-code-generator.} +%* * +%************************************************************************ + +\begin{code} + +moan64 :: String -> SDoc -> a +moan64 msg pp_rep + = unsafePerformIO ( + hPutStrLn stderr ( + "\nGHCi's bytecode generation machinery can't handle 64-bit\n" ++ + "code properly yet. You can work around this for the time being\n" ++ + "by compiling this module and all those it imports to object code,\n" ++ + "and re-starting your GHCi session. The panic below contains information,\n" ++ + "intended for the GHC implementors, about the exact place where GHC gave up.\n" + ) + ) + `seq` + pprPanic msg pp_rep + + +-- For sparc_TARGET_ARCH, i386_TARGET_ARCH, etc. +#include "nativeGen/NCG.h" + +{- +Make a piece of code which expects to see the Haskell stack +looking like this. It is given a pointer to the lowest word in +the stack -- presumably the tag of the placeholder. + + <arg_n> + ... + <arg_1> + Addr# address_of_C_fn + <placeholder-for-result#> (must be an unboxed type) + +We cope with both ccall and stdcall for the C fn. However, this code +itself expects only to be called using the ccall convention -- that is, +we don't clear our own (single) arg off the C stack. +-} +mkMarshalCode :: CCallConv + -> (Int, CgRep) -> Int -> [(Int, CgRep)] + -> IO (Ptr Word8) +mkMarshalCode cconv (r_offW, r_rep) addr_offW arg_offs_n_reps + = let bytes = mkMarshalCode_wrk cconv (r_offW, r_rep) + addr_offW arg_offs_n_reps + in trace (show bytes) $ Foreign.newArray bytes + + + + +mkMarshalCode_wrk :: CCallConv + -> (Int, CgRep) -> Int -> [(Int, CgRep)] + -> [Word8] + +mkMarshalCode_wrk cconv (r_offW, r_rep) addr_offW arg_offs_n_reps + +#if i386_TARGET_ARCH + + = let -- Don't change this without first consulting Intel Corp :-) + bytes_per_word = 4 + + offsets_to_pushW + = concat + [ -- reversed because x86 is little-endian + reverse [a_offW .. a_offW + cgRepSizeW a_rep - 1] + + -- reversed because args are pushed L -> R onto C stack + | (a_offW, a_rep) <- reverse arg_offs_n_reps + ] + + arguments_size = bytes_per_word * length offsets_to_pushW +#if darwin_TARGET_OS + -- Darwin: align stack frame size to a multiple of 16 bytes + stack_frame_size = (arguments_size + 15) .&. complement 15 + stack_frame_pad = stack_frame_size - arguments_size +#else + stack_frame_size = arguments_size +#endif + + -- some helpers to assemble x86 insns. 
+ movl_offespmem_esi offB -- movl offB(%esp), %esi + = [0x8B, 0xB4, 0x24] ++ lit32 offB + movl_offesimem_ecx offB -- movl offB(%esi), %ecx + = [0x8B, 0x8E] ++ lit32 offB + save_regs -- pushl all intregs except %esp + = [0x50, 0x53, 0x51, 0x52, 0x56, 0x57, 0x55] + restore_regs -- popl ditto + = [0x5D, 0x5F, 0x5E, 0x5A, 0x59, 0x5B, 0x58] + pushl_ecx -- pushl %ecx + = [0x51] + call_star_ecx -- call * %ecx + = [0xFF, 0xD1] + add_lit_esp lit -- addl $lit, %esp + = [0x81, 0xC4] ++ lit32 lit + movl_eax_offesimem offB -- movl %eax, offB(%esi) + = [0x89, 0x86] ++ lit32 offB + movl_edx_offesimem offB -- movl %edx, offB(%esi) + = [0x89, 0x96] ++ lit32 offB + ret -- ret + = [0xC3] + fstpl_offesimem offB -- fstpl offB(%esi) + = [0xDD, 0x9E] ++ lit32 offB + fstps_offesimem offB -- fstps offB(%esi) + = [0xD9, 0x9E] ++ lit32 offB + {- + 2 0000 8BB42478 movl 0x12345678(%esp), %esi + 2 563412 + 3 0007 8B8E7856 movl 0x12345678(%esi), %ecx + 3 3412 + 4 + 5 000d 50535152 pushl %eax ; pushl %ebx ; pushl %ecx ; pushl %edx + 6 0011 565755 pushl %esi ; pushl %edi ; pushl %ebp + 7 + 8 0014 5D5F5E popl %ebp ; popl %edi ; popl %esi + 9 0017 5A595B58 popl %edx ; popl %ecx ; popl %ebx ; popl %eax + 10 + 11 001b 51 pushl %ecx + 12 001c FFD1 call * %ecx + 13 + 14 001e 81C47856 addl $0x12345678, %esp + 14 3412 + 15 0024 89867856 movl %eax, 0x12345678(%esi) + 15 3412 + 16 002a 89967856 movl %edx, 0x12345678(%esi) + 16 3412 + 17 + 18 0030 DD967856 fstl 0x12345678(%esi) + 18 3412 + 19 0036 DD9E7856 fstpl 0x12345678(%esi) + 19 3412 + 20 003c D9967856 fsts 0x12345678(%esi) + 20 3412 + 21 0042 D99E7856 fstps 0x12345678(%esi) + 18 + 19 0030 C3 ret + 20 + + -} + + in + --trace (show (map fst arg_offs_n_reps)) + ( + {- On entry, top of C stack 0(%esp) is the RA and 4(%esp) is + arg passed from the interpreter. + + Push all callee saved regs. Push all of them anyway ... + pushl %eax + pushl %ebx + pushl %ecx + pushl %edx + pushl %esi + pushl %edi + pushl %ebp + -} + save_regs + + {- Now 28+0(%esp) is RA and 28+4(%esp) is the arg (the H stack ptr). + We'll use %esi as a temporary to point at the H stack, and + %ecx as a temporary to copy via. + + movl 28+4(%esp), %esi + -} + ++ movl_offespmem_esi 32 + +#if darwin_TARGET_OS + {- On Darwin, add some padding so that the stack stays aligned. -} + ++ (if stack_frame_pad /= 0 + then add_lit_esp (-stack_frame_pad) + else []) +#endif + + {- For each arg in args_offs_n_reps, examine the associated + CgRep to determine how many words there are. This gives a + bunch of offsets on the H stack to copy to the C stack: + + movl off1(%esi), %ecx + pushl %ecx + -} + ++ concatMap (\offW -> movl_offesimem_ecx (bytes_per_word * offW) + ++ pushl_ecx) + offsets_to_pushW + + {- Get the addr to call into %ecx, bearing in mind that there's + an Addr# tag at the indicated location, and do the call: + + movl 4*(1 /*tag*/ +addr_offW)(%esi), %ecx + call * %ecx + -} + ++ movl_offesimem_ecx (bytes_per_word * addr_offW) + ++ call_star_ecx + + {- Nuke the args just pushed and re-establish %esi at the + H-stack ptr: + + addl $4*number_of_args_pushed, %esp (ccall only) + movl 28+4(%esp), %esi + -} + ++ (if cconv /= StdCallConv + then add_lit_esp stack_frame_size + else []) + ++ movl_offespmem_esi 32 + + {- Depending on what the return type is, get the result + from %eax or %edx:%eax or %st(0). 
+ + movl %eax, 4(%esi) -- assuming tagged result + or + movl %edx, 4(%esi) + movl %eax, 8(%esi) + or + fstpl 4(%esi) + or + fstps 4(%esi) + -} + ++ let i32 = movl_eax_offesimem 0 + i64 = movl_eax_offesimem 0 ++ movl_edx_offesimem 4 + f32 = fstps_offesimem 0 + f64 = fstpl_offesimem 0 + in + case r_rep of + NonPtrArg -> i32 + DoubleArg -> f64 + FloatArg -> f32 + -- LongArg -> i64 + VoidArg -> [] + other -> moan64 "ByteCodeFFI.mkMarshalCode_wrk(x86)" + (ppr r_rep) + + {- Restore all the pushed regs and go home. + + pushl %ebp + pushl %edi + pushl %esi + pushl %edx + pushl %ecx + pushl %ebx + pushl %eax + + ret + -} + ++ restore_regs + ++ ret + ) + +#elif x86_64_TARGET_ARCH + + = + -- the address of the H stack is in %rdi. We need to move it out, so + -- we can use %rdi as an arg reg for the following call: + pushq_rbp ++ + movq_rdi_rbp ++ + + -- ####### load / push the args + + let + (stack_args, fregs_unused, reg_loads) = + load_arg_regs arg_offs_n_reps int_loads float_loads [] + + tot_arg_size = bytes_per_word * length stack_args + + -- On entry to the called function, %rsp should be aligned + -- on a 16-byte boundary +8 (i.e. the first stack arg after + -- the return address is 16-byte aligned). In STG land + -- %rsp is kept 16-byte aligned (see StgCRun.c), so we just + -- need to make sure we push a multiple of 16-bytes of args, + -- plus the return address, to get the correct alignment. + (real_size, adjust_rsp) + | tot_arg_size `rem` 16 == 0 = (tot_arg_size, []) + | otherwise = (tot_arg_size + 8, subq_lit_rsp 8) + + (stack_pushes, stack_words) = + push_args stack_args [] 0 + + -- we need to know the number of SSE regs used in the call, see later + n_sse_regs_used = length float_loads - length fregs_unused + in + concat reg_loads + ++ adjust_rsp + ++ concat stack_pushes -- push in reverse order + + -- ####### make the call + + -- use %r10 to make the call, because we don't have to save it. + -- movq 8*addr_offW(%rbp), %r10 + ++ movq_rbpoff_r10 (bytes_per_word * addr_offW) + + -- The x86_64 ABI requires us to set %al to the number of SSE + -- registers that contain arguments, if the called routine + -- is a varargs function. We don't know whether it's a + -- varargs function or not, so we have to assume it is. + -- + -- It's not safe to omit this assignment, even if the number + -- of SSE regs in use is zero. If %al is larger than 8 + -- on entry to a varargs function, seg faults ensue. + ++ movq_lit_rax n_sse_regs_used + ++ call_star_r10 + + -- pop the args from the stack, only in ccall mode + -- (in stdcall the callee does it). 
+ ++ (if cconv /= StdCallConv + then addq_lit_rsp real_size + else []) + + -- ####### place the result in the right place and return + + ++ assign_result + ++ popq_rbp + ++ ret + + where + bytes_per_word = 8 + + -- int arg regs: rdi,rsi,rdx,rcx,r8,r9 + -- flt arg regs: xmm0..xmm7 + int_loads = [ movq_rbpoff_rdi, movq_rbpoff_rsi, movq_rbpoff_rdx, + movq_rbpoff_rcx, movq_rbpoff_r8, movq_rbpoff_r9 ] + float_loads = [ (mov_f32_rbpoff_xmm n, mov_f64_rbpoff_xmm n) | n <- [0..7] ] + + load_arg_regs args [] [] code = (args, [], code) + load_arg_regs [] iregs fregs code = ([], fregs, code) + load_arg_regs ((off,rep):args) iregs fregs code + | FloatArg <- rep, ((mov_f32,_):frest) <- fregs = + load_arg_regs args iregs frest (mov_f32 (bytes_per_word * off) : code) + | DoubleArg <- rep, ((_,mov_f64):frest) <- fregs = + load_arg_regs args iregs frest (mov_f64 (bytes_per_word * off) : code) + | (mov_reg:irest) <- iregs = + load_arg_regs args irest fregs (mov_reg (bytes_per_word * off) : code) + | otherwise = + push_this_arg + where + push_this_arg = ((off,rep):args',fregs', code') + where (args',fregs',code') = load_arg_regs args iregs fregs code + + push_args [] code pushed_words = (code, pushed_words) + push_args ((off,rep):args) code pushed_words + | FloatArg <- rep = + push_args args (push_f32_rbpoff (bytes_per_word * off) : code) + (pushed_words+1) + | DoubleArg <- rep = + push_args args (push_f64_rbpoff (bytes_per_word * off) : code) + (pushed_words+1) + | otherwise = + push_args args (pushq_rbpoff (bytes_per_word * off) : code) + (pushed_words+1) + + + assign_result = + case r_rep of + DoubleArg -> f64 + FloatArg -> f32 + VoidArg -> [] + _other -> i64 + where + i64 = movq_rax_rbpoff 0 + f32 = mov_f32_xmm0_rbpoff 0 + f64 = mov_f64_xmm0_rbpoff 0 + +-- ######### x86_64 machine code: + +-- 0: 48 89 fd mov %rdi,%rbp +-- 3: 48 8b bd 78 56 34 12 mov 0x12345678(%rbp),%rdi +-- a: 48 8b b5 78 56 34 12 mov 0x12345678(%rbp),%rsi +-- 11: 48 8b 95 78 56 34 12 mov 0x12345678(%rbp),%rdx +-- 18: 48 8b 8d 78 56 34 12 mov 0x12345678(%rbp),%rcx +-- 1f: 4c 8b 85 78 56 34 12 mov 0x12345678(%rbp),%r8 +-- 26: 4c 8b 8d 78 56 34 12 mov 0x12345678(%rbp),%r9 +-- 2d: 4c 8b 95 78 56 34 12 mov 0x12345678(%rbp),%r10 +-- 34: 48 c7 c0 78 56 34 12 mov $0x12345678,%rax +-- 3b: 48 89 85 78 56 34 12 mov %rax,0x12345678(%rbp) +-- 42: f3 0f 10 85 78 56 34 12 movss 0x12345678(%rbp),%xmm0 +-- 4a: f2 0f 10 85 78 56 34 12 movsd 0x12345678(%rbp),%xmm0 +-- 52: f3 0f 11 85 78 56 34 12 movss %xmm0,0x12345678(%rbp) +-- 5a: f2 0f 11 85 78 56 34 12 movsd %xmm0,0x12345678(%rbp) +-- 62: ff b5 78 56 34 12 pushq 0x12345678(%rbp) +-- 68: f3 44 0f 11 04 24 movss %xmm8,(%rsp) +-- 6e: f2 44 0f 11 04 24 movsd %xmm8,(%rsp) +-- 74: 48 81 ec 78 56 34 12 sub $0x12345678,%rsp +-- 7b: 48 81 c4 78 56 34 12 add $0x12345678,%rsp +-- 82: 41 ff d2 callq *%r10 +-- 85: c3 retq + + movq_rdi_rbp = [0x48,0x89,0xfd] + movq_rbpoff_rdi off = [0x48, 0x8b, 0xbd] ++ lit32 off + movq_rbpoff_rsi off = [0x48, 0x8b, 0xb5] ++ lit32 off + movq_rbpoff_rdx off = [0x48, 0x8b, 0x95] ++ lit32 off + movq_rbpoff_rcx off = [0x48, 0x8b, 0x8d] ++ lit32 off + movq_rbpoff_r8 off = [0x4c, 0x8b, 0x85] ++ lit32 off + movq_rbpoff_r9 off = [0x4c, 0x8b, 0x8d] ++ lit32 off + movq_rbpoff_r10 off = [0x4c, 0x8b, 0x95] ++ lit32 off + movq_lit_rax lit = [0x48, 0xc7, 0xc0] ++ lit32 lit + movq_rax_rbpoff off = [0x48, 0x89, 0x85] ++ lit32 off + mov_f32_rbpoff_xmm n off = [0xf3, 0x0f, 0x10, 0x85 + n`shiftL`3] ++ lit32 off + mov_f64_rbpoff_xmm n off = [0xf2, 0x0f, 0x10, 0x85 + n`shiftL`3] ++ lit32 off + 
mov_f32_xmm0_rbpoff off = [0xf3, 0x0f, 0x11, 0x85] ++ lit32 off + mov_f64_xmm0_rbpoff off = [0xf2, 0x0f, 0x11, 0x85] ++ lit32 off + pushq_rbpoff off = [0xff, 0xb5] ++ lit32 off + push_f32_rbpoff off = + mov_f32_rbpoff_xmm 8 off ++ -- movss off(%rbp), %xmm8 + [0xf3, 0x44, 0x0f, 0x11, 0x04, 0x24] ++ -- movss %xmm8, (%rsp) + subq_lit_rsp 8 -- subq $8, %rsp + push_f64_rbpoff off = + mov_f64_rbpoff_xmm 8 off ++ -- movsd off(%rbp), %xmm8 + [0xf2, 0x44, 0x0f, 0x11, 0x04, 0x24] ++ -- movsd %xmm8, (%rsp) + subq_lit_rsp 8 -- subq $8, %rsp + subq_lit_rsp lit = [0x48, 0x81, 0xec] ++ lit32 lit + addq_lit_rsp lit = [0x48, 0x81, 0xc4] ++ lit32 lit + call_star_r10 = [0x41,0xff,0xd2] + ret = [0xc3] + pushq_rbp = [0x55] + popq_rbp = [0x5d] + +#elif sparc_TARGET_ARCH + + = let -- At least for sparc V8 + bytes_per_word = 4 + + -- speaks for itself + w32_to_w8s_bigEndian :: Word32 -> [Word8] + w32_to_w8s_bigEndian w + = [fromIntegral (0xFF .&. (w `shiftR` 24)), + fromIntegral (0xFF .&. (w `shiftR` 16)), + fromIntegral (0xFF .&. (w `shiftR` 8)), + fromIntegral (0xFF .&. w)] + + offsets_to_pushW + = concat + [ [a_offW .. a_offW + cgRepSizeW a_rep - 1] + + | (a_offW, a_rep) <- arg_offs_n_reps + ] + + total_argWs = length offsets_to_pushW + argWs_on_stack = if total_argWs > 6 then total_argWs - 6 + else 0 + + -- The stack pointer must be kept 8-byte aligned, which means + -- we need to calculate this quantity too + argWs_on_stack_ROUNDED_UP + | odd argWs_on_stack = 1 + argWs_on_stack + | otherwise = argWs_on_stack + + -- some helpers to assemble sparc insns. + -- REGS + iReg, oReg, gReg, fReg :: Int -> Word32 + iReg = fromIntegral . (+ 24) + oReg = fromIntegral . (+ 8) + gReg = fromIntegral . (+ 0) + fReg = fromIntegral + + sp = oReg 6 + i0 = iReg 0 + i7 = iReg 7 + o0 = oReg 0 + o1 = oReg 1 + o7 = oReg 7 + g0 = gReg 0 + g1 = gReg 1 + f0 = fReg 0 + f1 = fReg 1 + + -- INSN templates + insn_r_r_i :: Word32 -> Word32 -> Word32 -> Int -> Word32 + insn_r_r_i op3 rs1 rd imm13 + = (3 `shiftL` 30) + .|. (rs1 `shiftL` 25) + .|. (op3 `shiftL` 19) + .|. (rd `shiftL` 14) + .|. (1 `shiftL` 13) + .|. mkSimm13 imm13 + + insn_r_i_r :: Word32 -> Word32 -> Int -> Word32 -> Word32 + insn_r_i_r op3 rs1 imm13 rd + = (2 `shiftL` 30) + .|. (rd `shiftL` 25) + .|. (op3 `shiftL` 19) + .|. (rs1 `shiftL` 14) + .|. (1 `shiftL` 13) + .|. mkSimm13 imm13 + + mkSimm13 :: Int -> Word32 + mkSimm13 imm13 + = let imm13w = (fromIntegral imm13) :: Word32 + in imm13w .&. 0x1FFF + + -- REAL (non-synthetic) insns + -- or %rs1, %rs2, %rd + mkOR :: Word32 -> Word32 -> Word32 -> Word32 + mkOR rs1 rs2 rd + = (2 `shiftL` 30) + .|. (rd `shiftL` 25) + .|. (op3_OR `shiftL` 19) + .|. (rs1 `shiftL` 14) + .|. (0 `shiftL` 13) + .|. rs2 + where op3_OR = 2 :: Word32 + + -- ld(int) [%rs + imm13], %rd + mkLD rs1 imm13 rd = insn_r_r_i 0x00{-op3_LD-} rd rs1 imm13 + + -- st(int) %rs, [%rd + imm13] + mkST = insn_r_r_i 0x04 -- op3_ST + + -- st(float) %rs, [%rd + imm13] + mkSTF = insn_r_r_i 0x24 -- op3_STF + + -- jmpl %rs + imm13, %rd + mkJMPL = insn_r_i_r 0x38 -- op3_JMPL + + -- save %rs + imm13, %rd + mkSAVE = insn_r_i_r 0x3C -- op3_SAVE + + -- restore %rs + imm13, %rd + mkRESTORE = insn_r_i_r 0x3D -- op3_RESTORE + + -- SYNTHETIC insns + mkNOP = mkOR g0 g0 g0 + mkCALL reg = mkJMPL reg 0 o7 + mkRET = mkJMPL i7 8 g0 + mkRESTORE_TRIVIAL = mkRESTORE g0 0 g0 + in + --trace (show (map fst arg_offs_n_reps)) + concatMap w32_to_w8s_bigEndian ( + + {- On entry, %o0 is the arg passed from the interpreter. After + the initial save insn, it will be in %i0. 
Studying the sparc + docs one would have thought that the minimum frame size is 92 + bytes, but gcc always uses at least 112, and indeed there are + segfaults a-plenty with 92. So I use 112 here as well. I + don't understand why, tho. + -} + [mkSAVE sp (- ({-92-}112 + 4*argWs_on_stack_ROUNDED_UP)) sp] + + {- For each arg in args_offs_n_reps, examine the associated + CgRep to determine how many words there are. This gives a + bunch of offsets on the H stack. Move the first 6 words into + %o0 .. %o5 and the rest on the stack, starting at [%sp+92]. + Use %g1 as a temp. + -} + ++ let doArgW (offW, wordNo) + | wordNo < 6 + = [mkLD i0 (bytes_per_word * offW) (oReg wordNo)] + | otherwise + = [mkLD i0 (bytes_per_word * offW) g1, + mkST g1 sp (92 + bytes_per_word * (wordNo - 6))] + in + concatMap doArgW (zip offsets_to_pushW [0 ..]) + + {- Get the addr to call into %g1, bearing in mind that there's + an Addr# tag at the indicated location, and do the call: + + ld [4*(1 /*tag*/ +addr_offW) + %i0], %g1 + call %g1 + -} + ++ [mkLD i0 (bytes_per_word * addr_offW) g1, + mkCALL g1, + mkNOP] + + {- Depending on what the return type is, get the result + from %o0 or %o1:%o0 or %f0 or %f1:%f0. + + st %o0, [%i0 + 4] -- 32 bit int + or + st %o0, [%i0 + 4] -- 64 bit int + st %o1, [%i0 + 8] -- or the other way round? + or + st %f0, [%i0 + 4] -- 32 bit float + or + st %f0, [%i0 + 4] -- 64 bit float + st %f1, [%i0 + 8] -- or the other way round? + + -} + ++ let i32 = [mkST o0 i0 0] + i64 = [mkST o0 i0 0, mkST o1 i0 4] + f32 = [mkSTF f0 i0 0] + f64 = [mkSTF f0 i0 0, mkSTF f1 i0 4] + in + case r_rep of + NonPtrArg -> i32 + DoubleArg -> f64 + FloatArg -> f32 + VoidArg -> [] + other -> moan64 "ByteCodeFFI.mkMarshalCode_wrk(sparc)" + (ppr r_rep) + + ++ [mkRET, + mkRESTORE_TRIVIAL] -- this is in the delay slot of the RET + ) +#elif powerpc_TARGET_ARCH && darwin_TARGET_OS + + = let + bytes_per_word = 4 + + -- speaks for itself + w32_to_w8s_bigEndian :: Word32 -> [Word8] + w32_to_w8s_bigEndian w + = [fromIntegral (0xFF .&. (w `shiftR` 24)), + fromIntegral (0xFF .&. (w `shiftR` 16)), + fromIntegral (0xFF .&. (w `shiftR` 8)), + fromIntegral (0xFF .&. w)] + + -- addr and result bits offsetsW + a_off = addr_offW * bytes_per_word + result_off = r_offW * bytes_per_word + + linkageArea = 24 + parameterArea = sum [ cgRepSizeW a_rep * bytes_per_word + | (_, a_rep) <- arg_offs_n_reps ] + savedRegisterArea = 4 + frameSize = padTo16 (linkageArea + max parameterArea 32 + savedRegisterArea) + padTo16 x = case x `mod` 16 of + 0 -> x + y -> x - y + 16 + + pass_parameters [] _ _ = [] + pass_parameters ((a_offW, a_rep):args) nextFPR offsetW = + let + haskellArgOffset = a_offW * bytes_per_word + offsetW' = offsetW + cgRepSizeW a_rep + + pass_word w + | offsetW + w < 8 = + [0x801f0000 -- lwz rX, src(r31) + .|. (fromIntegral src .&. 0xFFFF) + .|. (fromIntegral (offsetW+w+3) `shiftL` 21)] + | otherwise = + [0x801f0000 -- lwz r0, src(r31) + .|. (fromIntegral src .&. 0xFFFF), + 0x90010000 -- stw r0, dst(r1) + .|. (fromIntegral dst .&. 0xFFFF)] + where + src = haskellArgOffset + w*bytes_per_word + dst = linkageArea + (offsetW+w) * bytes_per_word + in + case a_rep of + FloatArg | nextFPR < 14 -> + (0xc01f0000 -- lfs fX, haskellArgOffset(r31) + .|. (fromIntegral haskellArgOffset .&. 0xFFFF) + .|. (fromIntegral nextFPR `shiftL` 21)) + : pass_parameters args (nextFPR+1) offsetW' + DoubleArg | nextFPR < 14 -> + (0xc81f0000 -- lfd fX, haskellArgOffset(r31) + .|. (fromIntegral haskellArgOffset .&. 0xFFFF) + .|. 
(fromIntegral nextFPR `shiftL` 21)) + : pass_parameters args (nextFPR+1) offsetW' + _ -> + concatMap pass_word [0 .. cgRepSizeW a_rep - 1] + ++ pass_parameters args nextFPR offsetW' + + gather_result = case r_rep of + VoidArg -> [] + FloatArg -> + [0xd03f0000 .|. (fromIntegral result_off .&. 0xFFFF)] + -- stfs f1, result_off(r31) + DoubleArg -> + [0xd83f0000 .|. (fromIntegral result_off .&. 0xFFFF)] + -- stfd f1, result_off(r31) + _ | cgRepSizeW r_rep == 2 -> + [0x907f0000 .|. (fromIntegral result_off .&. 0xFFFF), + 0x909f0000 .|. (fromIntegral (result_off+4) .&. 0xFFFF)] + -- stw r3, result_off(r31) + -- stw r4, result_off+4(r31) + _ | cgRepSizeW r_rep == 1 -> + [0x907f0000 .|. (fromIntegral result_off .&. 0xFFFF)] + -- stw r3, result_off(r31) + in + concatMap w32_to_w8s_bigEndian $ [ + 0x7c0802a6, -- mflr r0 + 0x93e1fffc, -- stw r31,-4(r1) + 0x90010008, -- stw r0,8(r1) + 0x94210000 .|. (fromIntegral (-frameSize) .&. 0xFFFF), + -- stwu r1, -frameSize(r1) + 0x7c7f1b78 -- mr r31, r3 + ] ++ pass_parameters arg_offs_n_reps 1 0 ++ [ + 0x819f0000 .|. (fromIntegral a_off .&. 0xFFFF), + -- lwz r12, a_off(r31) + 0x7d8903a6, -- mtctr r12 + 0x4e800421 -- bctrl + ] ++ gather_result ++ [ + 0x80210000, -- lwz r1, 0(r1) + 0x83e1fffc, -- lwz r31, -4(r1) + 0x80010008, -- lwz r0, 8(r1) + 0x7c0803a6, -- mtlr r0 + 0x4e800020 -- blr + ] + +#elif powerpc_TARGET_ARCH && linux_TARGET_OS + + -- All offsets here are measured in Words (not bytes). This includes + -- arguments to the load/store machine code generators, alignment numbers + -- and the final 'framesize' among others. + + = concatMap w32_to_w8s_bigEndian $ [ + 0x7c0802a6, -- mflr r0 + 0x93e1fffc, -- stw r31,-4(r1) + 0x90010008, -- stw r0,8(r1) + 0x94210000 .|. offset (-framesize), -- stwu r1, -frameSize(r1) + 0x7c7f1b78 -- mr r31, r3 + ] ++ pass_parameters ++ -- pass the parameters + loadWord 12 addr_offW ++ [ -- lwz r12, a_off(r31) + 0x7d8903a6, -- mtctr r12 + 0x4e800421 -- bctrl + ] ++ gather_result ++ [ -- save the return value + 0x80210000, -- lwz r1, 0(r1) + 0x83e1fffc, -- lwz r31, -4(r1) + 0x80010008, -- lwz r0, 8(r1) + 0x7c0803a6, -- mtlr r0 + 0x4e800020 -- blr + ] + + where + gather_result :: [Word32] + gather_result = case r_rep of + VoidArg -> [] + FloatArg -> storeFloat 1 r_offW + DoubleArg -> storeDouble 1 r_offW + LongArg -> storeLong 3 r_offW + _ -> storeWord 3 r_offW + + pass_parameters :: [Word32] + pass_parameters = concat params + + -- vector aligned (4 word = 16 bytes) with 8 extra words of buffer space + framesize = alignedTo 4 (argsize + 8) + + ((_,_,argsize), params) = mapAccumL loadparam (3,1,2) arg_offs_n_reps + + -- handle one argument, returning machine code and the updated state + loadparam :: (Int, Int, Int) -> (Int, CgRep) -> + ((Int, Int, Int), [Word32]) + + loadparam (gpr, fpr, stack) (ofs, rep) = case rep of + FloatArg | fpr <= 8 -> ( (gpr, fpr + 1, stack), loadFloat fpr ofs ) + FloatArg -> ( (gpr, fpr, stack + 1), stackWord stack ofs ) + + DoubleArg | fpr <= 8 -> ( (gpr, fpr + 1, stack), loadDouble fpr ofs ) + DoubleArg -> ( (gpr, fpr, astack + 2), stackLong astack ofs ) + + LongArg | even gpr -> loadparam (gpr + 1, fpr, stack) (ofs, rep) + LongArg | gpr <= 9 -> ( (gpr + 2, fpr, stack), loadLong gpr ofs ) + LongArg -> ( (gpr, fpr, astack + 2), stackLong astack ofs ) + + _ | gpr <= 10 -> ( (gpr + 1, fpr, stack), loadWord gpr ofs ) + _ -> ( (gpr, fpr, stack + 1), stackWord stack ofs ) + where astack = alignedTo 2 stack + + alignedTo :: Int -> Int -> Int + alignedTo alignment x = case x `mod` alignment of + 0 -> x + y -> x 
- y + alignment + + -- convenience macros to do multiple-instruction data moves + stackWord dst src = loadWord 0 src ++ storeWordC 0 dst + stackLong dst src = stackWord dst src ++ stackWord (dst + 1) (src + 1) + loadLong dst src = loadWord dst src ++ loadWord (dst + 1) (src + 1) + storeLong dst src = storeWord dst src ++ storeWord (dst + 1) (src + 1) + + -- load data from the Haskell stack (relative to r31) + loadFloat = loadstoreInstr 0xc01f0000 -- lfs fpr, ofs(r31) + loadDouble = loadstoreInstr 0xc81f0000 -- lfd fpr, ofs(r31) + loadWord = loadstoreInstr 0x801f0000 -- lwz gpr, ofs(r31) + + -- store data to the Haskell stack (relative to r31) + storeFloat = loadstoreInstr 0xd01f0000 -- stfs fpr, ofs(r31) + storeDouble = loadstoreInstr 0xd81f0000 -- stfd fpr, ofs(r31) + storeWord = loadstoreInstr 0x901f0000 -- stw gpr, ofs(r31) + + -- store data to the C stack (relative to r1) + storeWordC = loadstoreInstr 0x90010000 -- stw gpr, ofs(r1) + + -- machine code building blocks + loadstoreInstr :: Word32 -> Int -> Int -> [Word32] + loadstoreInstr code reg ofs = [ code .|. register reg .|. offset ofs ] + + register :: Int -> Word32 + register reg = fromIntegral reg `shiftL` 21 + + offset :: Int -> Word32 + offset ofs = fromIntegral (ofs * 4) .&. 0xFFFF + + -- speaks for itself + w32_to_w8s_bigEndian :: Word32 -> [Word8] + w32_to_w8s_bigEndian w = [fromIntegral (0xFF .&. (w `shiftR` 24)), + fromIntegral (0xFF .&. (w `shiftR` 16)), + fromIntegral (0xFF .&. (w `shiftR` 8)), + fromIntegral (0xFF .&. w)] + +#else + + = throwDyn (InstallationError "foreign import is not implemented for GHCi on this platform.") + +#endif + +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH +lit32 :: Int -> [Word8] +lit32 i = let w32 = (fromIntegral i) :: Word32 + in map (fromIntegral . ( .&. 0xFF)) + [w32, w32 `shiftR` 8, + w32 `shiftR` 16, w32 `shiftR` 24] +#endif +\end{code} + diff --git a/compiler/ghci/ByteCodeGen.lhs b/compiler/ghci/ByteCodeGen.lhs new file mode 100644 index 0000000000..19db7af16b --- /dev/null +++ b/compiler/ghci/ByteCodeGen.lhs @@ -0,0 +1,1358 @@ +% +% (c) The University of Glasgow 2002 +% +\section[ByteCodeGen]{Generate bytecode from Core} + +\begin{code} +module ByteCodeGen ( UnlinkedBCO, byteCodeGen, coreExprToBCOs ) where + +#include "HsVersions.h" + +import ByteCodeInstr +import ByteCodeFFI ( mkMarshalCode, moan64 ) +import ByteCodeAsm ( CompiledByteCode(..), UnlinkedBCO, + assembleBCO, assembleBCOs, iNTERP_STACK_CHECK_THRESH ) +import ByteCodeLink ( lookupStaticPtr ) + +import Outputable +import Name ( Name, getName, mkSystemVarName ) +import Id +import FiniteMap +import ForeignCall ( ForeignCall(..), CCallTarget(..), CCallSpec(..) ) +import HscTypes ( TypeEnv, typeEnvTyCons, typeEnvClasses ) +import CoreUtils ( exprType ) +import CoreSyn +import PprCore ( pprCoreExpr ) +import Literal ( Literal(..), literalType ) +import PrimOp ( PrimOp(..) 
) +import CoreFVs ( freeVars ) +import Type ( isUnLiftedType, splitTyConApp_maybe ) +import DataCon ( DataCon, dataConTag, fIRST_TAG, dataConTyCon, + isUnboxedTupleCon, isNullaryRepDataCon, dataConWorkId, + dataConRepArity ) +import TyCon ( TyCon, tyConFamilySize, isDataTyCon, + tyConDataCons, isUnboxedTupleTyCon ) +import Class ( Class, classTyCon ) +import Type ( Type, repType, splitFunTys, dropForAlls, pprType ) +import Util +import DataCon ( dataConRepArity ) +import Var ( isTyVar ) +import VarSet ( VarSet, varSetElems ) +import TysPrim ( arrayPrimTyCon, mutableArrayPrimTyCon, + byteArrayPrimTyCon, mutableByteArrayPrimTyCon + ) +import DynFlags ( DynFlags, DynFlag(..) ) +import ErrUtils ( showPass, dumpIfSet_dyn ) +import Unique ( mkPseudoUniqueE ) +import FastString ( FastString(..), unpackFS ) +import Panic ( GhcException(..) ) +import SMRep ( typeCgRep, arrWordsHdrSize, arrPtrsHdrSize, StgWord, + CgRep(..), cgRepSizeW, isFollowableArg, idCgRep ) +import Bitmap ( intsToReverseBitmap, mkBitmap ) +import OrdList +import Constants ( wORD_SIZE ) + +import Data.List ( intersperse, sortBy, zip4, zip6, partition ) +import Foreign ( Ptr, castPtr, mallocBytes, pokeByteOff, Word8, + withForeignPtr ) +import Foreign.C ( CInt ) +import Control.Exception ( throwDyn ) + +import GHC.Exts ( Int(..), ByteArray# ) + +import Control.Monad ( when ) +import Data.Char ( ord, chr ) + +-- ----------------------------------------------------------------------------- +-- Generating byte code for a complete module + +byteCodeGen :: DynFlags + -> [CoreBind] + -> [TyCon] + -> IO CompiledByteCode +byteCodeGen dflags binds tycs + = do showPass dflags "ByteCodeGen" + + let flatBinds = [ (bndr, freeVars rhs) + | (bndr, rhs) <- flattenBinds binds] + + (BcM_State final_ctr mallocd, proto_bcos) + <- runBc (mapM schemeTopBind flatBinds) + + when (notNull mallocd) + (panic "ByteCodeGen.byteCodeGen: missing final emitBc?") + + dumpIfSet_dyn dflags Opt_D_dump_BCOs + "Proto-BCOs" (vcat (intersperse (char ' ') (map ppr proto_bcos))) + + assembleBCOs proto_bcos tycs + +-- ----------------------------------------------------------------------------- +-- Generating byte code for an expression + +-- Returns: (the root BCO for this expression, +-- a list of auxilary BCOs resulting from compiling closures) +coreExprToBCOs :: DynFlags + -> CoreExpr + -> IO UnlinkedBCO +coreExprToBCOs dflags expr + = do showPass dflags "ByteCodeGen" + + -- create a totally bogus name for the top-level BCO; this + -- should be harmless, since it's never used for anything + let invented_name = mkSystemVarName (mkPseudoUniqueE 0) FSLIT("ExprTopLevel") + invented_id = mkLocalId invented_name (panic "invented_id's type") + + (BcM_State final_ctr mallocd, proto_bco) + <- runBc (schemeTopBind (invented_id, freeVars expr)) + + when (notNull mallocd) + (panic "ByteCodeGen.coreExprToBCOs: missing final emitBc?") + + dumpIfSet_dyn dflags Opt_D_dump_BCOs "Proto-BCOs" (ppr proto_bco) + + assembleBCO proto_bco + + +-- ----------------------------------------------------------------------------- +-- Compilation schema for the bytecode generator + +type BCInstrList = OrdList BCInstr + +type Sequel = Int -- back off to this depth before ENTER + +-- Maps Ids to the offset from the stack _base_ so we don't have +-- to mess with it after each push/pop. 
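+-- Illustrative example of the convention (a sketch, not in the original
+-- source): with current depth d = 4, a one-word local v mapped to 3 sits
+-- on top of the stack, and pushAtom below emits PUSH_L (4-3+1-2), i.e.
+-- PUSH_L 0, to copy it; v's entry never needs adjusting as the stack
+-- grows and shrinks around it.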
+type BCEnv = FiniteMap Id Int -- To find vars on the stack + +ppBCEnv :: BCEnv -> SDoc +ppBCEnv p + = text "begin-env" + $$ nest 4 (vcat (map pp_one (sortBy cmp_snd (fmToList p)))) + $$ text "end-env" + where + pp_one (var, offset) = int offset <> colon <+> ppr var <+> ppr (idCgRep var) + cmp_snd x y = compare (snd x) (snd y) + +-- Create a BCO and do a spot of peephole optimisation on the insns +-- at the same time. +mkProtoBCO + :: name + -> BCInstrList + -> Either [AnnAlt Id VarSet] (AnnExpr Id VarSet) + -> Int + -> Int + -> [StgWord] + -> Bool -- True <=> is a return point, rather than a function + -> [Ptr ()] + -> ProtoBCO name +mkProtoBCO nm instrs_ordlist origin arity bitmap_size bitmap + is_ret mallocd_blocks + = ProtoBCO { + protoBCOName = nm, + protoBCOInstrs = maybe_with_stack_check, + protoBCOBitmap = bitmap, + protoBCOBitmapSize = bitmap_size, + protoBCOArity = arity, + protoBCOExpr = origin, + protoBCOPtrs = mallocd_blocks + } + where + -- Overestimate the stack usage (in words) of this BCO, + -- and if >= iNTERP_STACK_CHECK_THRESH, add an explicit + -- stack check. (The interpreter always does a stack check + -- for iNTERP_STACK_CHECK_THRESH words at the start of each + -- BCO anyway, so we only need to add an explicit on in the + -- (hopefully rare) cases when the (overestimated) stack use + -- exceeds iNTERP_STACK_CHECK_THRESH. + maybe_with_stack_check + | is_ret = peep_d + -- don't do stack checks at return points; + -- everything is aggregated up to the top BCO + -- (which must be a function) + | stack_overest >= 65535 + = pprPanic "mkProtoBCO: stack use won't fit in 16 bits" + (int stack_overest) + | stack_overest >= iNTERP_STACK_CHECK_THRESH + = STKCHECK stack_overest : peep_d + | otherwise + = peep_d -- the supposedly common case + + stack_overest = sum (map bciStackUse peep_d) + + -- Merge local pushes + peep_d = peep (fromOL instrs_ordlist) + + peep (PUSH_L off1 : PUSH_L off2 : PUSH_L off3 : rest) + = PUSH_LLL off1 (off2-1) (off3-2) : peep rest + peep (PUSH_L off1 : PUSH_L off2 : rest) + = PUSH_LL off1 (off2-1) : peep rest + peep (i:rest) + = i : peep rest + peep [] + = [] + +argBits :: [CgRep] -> [Bool] +argBits [] = [] +argBits (rep : args) + | isFollowableArg rep = False : argBits args + | otherwise = take (cgRepSizeW rep) (repeat True) ++ argBits args + +-- ----------------------------------------------------------------------------- +-- schemeTopBind + +-- Compile code for the right-hand side of a top-level binding + +schemeTopBind :: (Id, AnnExpr Id VarSet) -> BcM (ProtoBCO Name) + + +schemeTopBind (id, rhs) + | Just data_con <- isDataConWorkId_maybe id, + isNullaryRepDataCon data_con + = -- Special case for the worker of a nullary data con. + -- It'll look like this: Nil = /\a -> Nil a + -- If we feed it into schemeR, we'll get + -- Nil = Nil + -- because mkConAppCode treats nullary constructor applications + -- by just re-using the single top-level definition. So + -- for the worker itself, we must allocate it directly. + emitBc (mkProtoBCO (getName id) (toOL [PACK data_con 0, ENTER]) + (Right rhs) 0 0 [{-no bitmap-}] False{-not alts-}) + + | otherwise + = schemeR [{- No free variables -}] (id, rhs) + +-- ----------------------------------------------------------------------------- +-- schemeR + +-- Compile code for a right-hand side, to give a BCO that, +-- when executed with the free variables and arguments on top of the stack, +-- will return with a pointer to the result on top of the stack, after +-- removing the free variables and arguments. 
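+-- A concrete (illustrative) reading of that contract: for a thunk whose
+-- RHS mentions a single free variable fv, the BCO built here is entered
+-- with fv on top of the stack; when it finishes, fv's slot has been
+-- squeezed out and a pointer to the value of the RHS sits on top instead.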
+-- +-- Park the resulting BCO in the monad. Also requires the +-- variable to which this value was bound, so as to give the +-- resulting BCO a name. + +schemeR :: [Id] -- Free vars of the RHS, ordered as they + -- will appear in the thunk. Empty for + -- top-level things, which have no free vars. + -> (Id, AnnExpr Id VarSet) + -> BcM (ProtoBCO Name) +schemeR fvs (nm, rhs) +{- + | trace (showSDoc ( + (char ' ' + $$ (ppr.filter (not.isTyVar).varSetElems.fst) rhs + $$ pprCoreExpr (deAnnotate rhs) + $$ char ' ' + ))) False + = undefined + | otherwise +-} + = schemeR_wrk fvs nm rhs (collect [] rhs) + +collect xs (_, AnnNote note e) = collect xs e +collect xs (_, AnnLam x e) = collect (if isTyVar x then xs else (x:xs)) e +collect xs (_, not_lambda) = (reverse xs, not_lambda) + +schemeR_wrk fvs nm original_body (args, body) + = let + all_args = reverse args ++ fvs + arity = length all_args + -- all_args are the args in reverse order. We're compiling a function + -- \fv1..fvn x1..xn -> e + -- i.e. the fvs come first + + szsw_args = map idSizeW all_args + szw_args = sum szsw_args + p_init = listToFM (zip all_args (mkStackOffsets 0 szsw_args)) + + -- make the arg bitmap + bits = argBits (reverse (map idCgRep all_args)) + bitmap_size = length bits + bitmap = mkBitmap bits + in + schemeE szw_args 0 p_init body `thenBc` \ body_code -> + emitBc (mkProtoBCO (getName nm) body_code (Right original_body) + arity bitmap_size bitmap False{-not alts-}) + + +fvsToEnv :: BCEnv -> VarSet -> [Id] +-- Takes the free variables of a right-hand side, and +-- delivers an ordered list of the local variables that will +-- be captured in the thunk for the RHS +-- The BCEnv argument tells which variables are in the local +-- environment: these are the ones that should be captured +-- +-- The code that constructs the thunk, and the code that executes +-- it, have to agree about this layout +fvsToEnv p fvs = [v | v <- varSetElems fvs, + isId v, -- Could be a type variable + v `elemFM` p] + +-- ----------------------------------------------------------------------------- +-- schemeE + +-- Compile code to apply the given expression to the remaining args +-- on the stack, returning a HNF. +schemeE :: Int -> Sequel -> BCEnv -> AnnExpr' Id VarSet -> BcM BCInstrList + +-- Delegate tail-calls to schemeT. +schemeE d s p e@(AnnApp f a) + = schemeT d s p e + +schemeE d s p e@(AnnVar v) + | not (isUnLiftedType v_type) + = -- Lifted-type thing; push it in the normal way + schemeT d s p e + + | otherwise + = -- Returning an unlifted value. + -- Heave it on the stack, SLIDE, and RETURN. + pushAtom d p (AnnVar v) `thenBc` \ (push, szw) -> + returnBc (push -- value onto stack + `appOL` mkSLIDE szw (d-s) -- clear to sequel + `snocOL` RETURN_UBX v_rep) -- go + where + v_type = idType v + v_rep = typeCgRep v_type + +schemeE d s p (AnnLit literal) + = pushAtom d p (AnnLit literal) `thenBc` \ (push, szw) -> + let l_rep = typeCgRep (literalType literal) + in returnBc (push -- value onto stack + `appOL` mkSLIDE szw (d-s) -- clear to sequel + `snocOL` RETURN_UBX l_rep) -- go + + +schemeE d s p (AnnLet (AnnNonRec x (_,rhs)) (_,body)) + | (AnnVar v, args_r_to_l) <- splitApp rhs, + Just data_con <- isDataConWorkId_maybe v, + dataConRepArity data_con == length args_r_to_l + = -- Special case for a non-recursive let whose RHS is a + -- saturated constructor application.
+ -- Just allocate the constructor and carry on + mkConAppCode d s p data_con args_r_to_l `thenBc` \ alloc_code -> + schemeE (d+1) s (addToFM p x d) body `thenBc` \ body_code -> + returnBc (alloc_code `appOL` body_code) + +-- General case for let. Generates correct, if inefficient, code in +-- all situations. +schemeE d s p (AnnLet binds (_,body)) + = let (xs,rhss) = case binds of AnnNonRec x rhs -> ([x],[rhs]) + AnnRec xs_n_rhss -> unzip xs_n_rhss + n_binds = length xs + + fvss = map (fvsToEnv p' . fst) rhss + + -- Sizes of free vars + sizes = map (\rhs_fvs -> sum (map idSizeW rhs_fvs)) fvss + + -- the arity of each rhs + arities = map (length . fst . collect []) rhss + + -- This p', d' defn is safe because all the items being pushed + -- are ptrs, so all have size 1. d' and p' reflect the stack + -- after the closures have been allocated in the heap (but not + -- filled in), and pointers to them parked on the stack. + p' = addListToFM p (zipE xs (mkStackOffsets d (nOfThem n_binds 1))) + d' = d + n_binds + zipE = zipEqual "schemeE" + + -- ToDo: don't build thunks for things with no free variables + build_thunk dd [] size bco off arity + = returnBc (PUSH_BCO bco `consOL` unitOL (mkap (off+size) size)) + where + mkap | arity == 0 = MKAP + | otherwise = MKPAP + build_thunk dd (fv:fvs) size bco off arity = do + (push_code, pushed_szw) <- pushAtom dd p' (AnnVar fv) + more_push_code <- build_thunk (dd+pushed_szw) fvs size bco off arity + returnBc (push_code `appOL` more_push_code) + + alloc_code = toOL (zipWith mkAlloc sizes arities) + where mkAlloc sz 0 = ALLOC_AP sz + mkAlloc sz arity = ALLOC_PAP arity sz + + compile_bind d' fvs x rhs size arity off = do + bco <- schemeR fvs (x,rhs) + build_thunk d' fvs size bco off arity + + compile_binds = + [ compile_bind d' fvs x rhs size arity n + | (fvs, x, rhs, size, arity, n) <- + zip6 fvss xs rhss sizes arities [n_binds, n_binds-1 .. 1] + ] + in do + body_code <- schemeE d' s p' body + thunk_codes <- sequence compile_binds + returnBc (alloc_code `appOL` concatOL thunk_codes `appOL` body_code) + + + +schemeE d s p (AnnCase scrut bndr _ [(DataAlt dc, [bind1, bind2], rhs)]) + | isUnboxedTupleCon dc, VoidArg <- typeCgRep (idType bind1) + -- Convert + -- case .... of x { (# VoidArg'd-thing, a #) -> ... } + -- to + -- case .... of a { DEFAULT -> ... } + -- because the return conventions for both are identical. + -- + -- Note that it does not matter losing the void-rep thing from the + -- envt (it won't be bound now) because we never look such things up. + + = --trace "automagic mashing of case alts (# VoidArg, a #)" $ + doCase d s p scrut bind2 [(DEFAULT, [], rhs)] True{-unboxed tuple-} + + | isUnboxedTupleCon dc, VoidArg <- typeCgRep (idType bind2) + = --trace "automagic mashing of case alts (# a, VoidArg #)" $ + doCase d s p scrut bind1 [(DEFAULT, [], rhs)] True{-unboxed tuple-} + +schemeE d s p (AnnCase scrut bndr _ [(DataAlt dc, [bind1], rhs)]) + | isUnboxedTupleCon dc + -- Similarly, convert + -- case .... of x { (# a #) -> ... } + -- to + -- case .... of a { DEFAULT -> ... } + = --trace "automagic mashing of case alts (# a #)" $ + doCase d s p scrut bind1 [(DEFAULT, [], rhs)] True{-unboxed tuple-} + +schemeE d s p (AnnCase scrut bndr _ alts) + = doCase d s p scrut bndr alts False{-not an unboxed tuple-} + +schemeE d s p (AnnNote note (_, body)) + = schemeE d s p body + +schemeE d s p other + = pprPanic "ByteCodeGen.schemeE: unhandled case" + (pprCoreExpr (deAnnotate' other)) + + +-- Compile code to do a tail call. 
Specifically, push the fn, +-- slide the on-stack app back down to the sequel depth, +-- and enter. Four cases: +-- +-- 0. (Nasty hack). +-- An application "GHC.Prim.tagToEnum# <type> unboxed-int". +-- The int will be on the stack. Generate a code sequence +-- to convert it to the relevant constructor, SLIDE and ENTER. +-- +-- 1. The fn denotes a ccall. Defer to generateCCall. +-- +-- 2. (Another nasty hack). Spot (# a::VoidArg, b #) and treat +-- it simply as b -- since the representations are identical +-- (the VoidArg takes up zero stack space). Also, spot +-- (# b #) and treat it as b. +-- +-- 3. Application of a constructor, by defn saturated. +-- Split the args into ptrs and non-ptrs, and push the nonptrs, +-- then the ptrs, and then do PACK and RETURN. +-- +-- 4. Otherwise, it must be a function call. Push the args +-- right to left, SLIDE and ENTER. + +schemeT :: Int -- Stack depth + -> Sequel -- Sequel depth + -> BCEnv -- stack env + -> AnnExpr' Id VarSet + -> BcM BCInstrList + +schemeT d s p app + +-- | trace ("schemeT: env in = \n" ++ showSDocDebug (ppBCEnv p)) False +-- = panic "schemeT ?!?!" + +-- | trace ("\nschemeT\n" ++ showSDoc (pprCoreExpr (deAnnotate' app)) ++ "\n") False +-- = error "?!?!" + + -- Case 0 + | Just (arg, constr_names) <- maybe_is_tagToEnum_call + = pushAtom d p arg `thenBc` \ (push, arg_words) -> + implement_tagToId constr_names `thenBc` \ tagToId_sequence -> + returnBc (push `appOL` tagToId_sequence + `appOL` mkSLIDE 1 (d+arg_words-s) + `snocOL` ENTER) + + -- Case 1 + | Just (CCall ccall_spec) <- isFCallId_maybe fn + = generateCCall d s p ccall_spec fn args_r_to_l + + -- Case 2: Constructor application + | Just con <- maybe_saturated_dcon, + isUnboxedTupleCon con + = case args_r_to_l of + [arg1,arg2] | isVoidArgAtom arg1 -> + unboxedTupleReturn d s p arg2 + [arg1,arg2] | isVoidArgAtom arg2 -> + unboxedTupleReturn d s p arg1 + _other -> unboxedTupleException + + -- Case 3: Ordinary data constructor + | Just con <- maybe_saturated_dcon + = mkConAppCode d s p con args_r_to_l `thenBc` \ alloc_con -> + returnBc (alloc_con `appOL` + mkSLIDE 1 (d - s) `snocOL` + ENTER) + + -- Case 4: Tail call of function + | otherwise + = doTailCall d s p fn args_r_to_l + + where + -- Detect and extract relevant info for the tagToEnum kludge. + maybe_is_tagToEnum_call + = let extract_constr_Names ty + | Just (tyc, []) <- splitTyConApp_maybe (repType ty), + isDataTyCon tyc + = map (getName . dataConWorkId) (tyConDataCons tyc) + -- NOTE: use the worker name, not the source name of + -- the DataCon. See DataCon.lhs for details. + | otherwise + = panic "maybe_is_tagToEnum_call.extract_constr_Ids" + in + case app of + (AnnApp (_, AnnApp (_, AnnVar v) (_, AnnType t)) arg) + -> case isPrimOpId_maybe v of + Just TagToEnumOp -> Just (snd arg, extract_constr_Names t) + other -> Nothing + other -> Nothing + + -- Extract the args (R->L) and fn + -- The function will necessarily be a variable, + -- because we are compiling a tail call + (AnnVar fn, args_r_to_l) = splitApp app + + -- Only consider this to be a constructor application iff it is + -- saturated. Otherwise, we'll call the constructor wrapper. 
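+ -- Illustrative sketch: if fn is the worker of a constructor C with
+ -- dataConRepArity 2, then an application of C to two arguments has
+ -- n_args == 2, maybe_saturated_dcon is Just C, and Case 3 builds the
+ -- constructor directly; applied to only one argument it is
+ -- unsaturated, maybe_saturated_dcon is Nothing, and Case 4 compiles
+ -- an ordinary tail call.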
+ n_args = length args_r_to_l + maybe_saturated_dcon + = case isDataConWorkId_maybe fn of + Just con | dataConRepArity con == n_args -> Just con + _ -> Nothing + +-- ----------------------------------------------------------------------------- +-- Generate code to build a constructor application, +-- leaving it on top of the stack + +mkConAppCode :: Int -> Sequel -> BCEnv + -> DataCon -- The data constructor + -> [AnnExpr' Id VarSet] -- Args, in *reverse* order + -> BcM BCInstrList + +mkConAppCode orig_d s p con [] -- Nullary constructor + = ASSERT( isNullaryRepDataCon con ) + returnBc (unitOL (PUSH_G (getName (dataConWorkId con)))) + -- Instead of doing a PACK, which would allocate a fresh + -- copy of this constructor, use the single shared version. + +mkConAppCode orig_d s p con args_r_to_l + = ASSERT( dataConRepArity con == length args_r_to_l ) + do_pushery orig_d (non_ptr_args ++ ptr_args) + where + -- The args are already in reverse order, which is the way PACK + -- expects them to be. We must push the non-ptrs after the ptrs. + (ptr_args, non_ptr_args) = partition isPtrAtom args_r_to_l + + do_pushery d (arg:args) + = pushAtom d p arg `thenBc` \ (push, arg_words) -> + do_pushery (d+arg_words) args `thenBc` \ more_push_code -> + returnBc (push `appOL` more_push_code) + do_pushery d [] + = returnBc (unitOL (PACK con n_arg_words)) + where + n_arg_words = d - orig_d + + +-- ----------------------------------------------------------------------------- +-- Returning an unboxed tuple with one non-void component (the only +-- case we can handle). +-- +-- Remember, we don't want to *evaluate* the component that is being +-- returned, even if it is a pointed type. We always just return. + +unboxedTupleReturn + :: Int -> Sequel -> BCEnv + -> AnnExpr' Id VarSet -> BcM BCInstrList +unboxedTupleReturn d s p arg = do + (push, sz) <- pushAtom d p arg + returnBc (push `appOL` + mkSLIDE sz (d-s) `snocOL` + RETURN_UBX (atomRep arg)) + +-- ----------------------------------------------------------------------------- +-- Generate code for a tail-call + +doTailCall + :: Int -> Sequel -> BCEnv + -> Id -> [AnnExpr' Id VarSet] + -> BcM BCInstrList +doTailCall init_d s p fn args + = do_pushes init_d args (map atomRep args) + where + do_pushes d [] reps = do + ASSERT( null reps ) return () + (push_fn, sz) <- pushAtom d p (AnnVar fn) + ASSERT( sz == 1 ) return () + returnBc (push_fn `appOL` ( + mkSLIDE ((d-init_d) + 1) (init_d - s) `appOL` + unitOL ENTER)) + do_pushes d args reps = do + let (push_apply, n, rest_of_reps) = findPushSeq reps + (these_args, rest_of_args) = splitAt n args + (next_d, push_code) <- push_seq d these_args + instrs <- do_pushes (next_d + 1) rest_of_args rest_of_reps + -- ^^^ for the PUSH_APPLY_ instruction + returnBc (push_code `appOL` (push_apply `consOL` instrs)) + + push_seq d [] = return (d, nilOL) + push_seq d (arg:args) = do + (push_code, sz) <- pushAtom d p arg + (final_d, more_push_code) <- push_seq (d+sz) args + return (final_d, push_code `appOL` more_push_code) + +-- v. 
similar to CgStackery.findMatch, ToDo: merge +findPushSeq (PtrArg: PtrArg: PtrArg: PtrArg: PtrArg: PtrArg: rest) + = (PUSH_APPLY_PPPPPP, 6, rest) +findPushSeq (PtrArg: PtrArg: PtrArg: PtrArg: PtrArg: rest) + = (PUSH_APPLY_PPPPP, 5, rest) +findPushSeq (PtrArg: PtrArg: PtrArg: PtrArg: rest) + = (PUSH_APPLY_PPPP, 4, rest) +findPushSeq (PtrArg: PtrArg: PtrArg: rest) + = (PUSH_APPLY_PPP, 3, rest) +findPushSeq (PtrArg: PtrArg: rest) + = (PUSH_APPLY_PP, 2, rest) +findPushSeq (PtrArg: rest) + = (PUSH_APPLY_P, 1, rest) +findPushSeq (VoidArg: rest) + = (PUSH_APPLY_V, 1, rest) +findPushSeq (NonPtrArg: rest) + = (PUSH_APPLY_N, 1, rest) +findPushSeq (FloatArg: rest) + = (PUSH_APPLY_F, 1, rest) +findPushSeq (DoubleArg: rest) + = (PUSH_APPLY_D, 1, rest) +findPushSeq (LongArg: rest) + = (PUSH_APPLY_L, 1, rest) +findPushSeq _ + = panic "ByteCodeGen.findPushSeq" + +-- ----------------------------------------------------------------------------- +-- Case expressions + +doCase :: Int -> Sequel -> BCEnv + -> AnnExpr Id VarSet -> Id -> [AnnAlt Id VarSet] + -> Bool -- True <=> is an unboxed tuple case, don't enter the result + -> BcM BCInstrList +doCase d s p (_,scrut) + bndr alts is_unboxed_tuple + = let + -- Top of stack is the return itbl, as usual. + -- underneath it is the pointer to the alt_code BCO. + -- When an alt is entered, it assumes the returned value is + -- on top of the itbl. + ret_frame_sizeW = 2 + + -- An unlifted value gets an extra info table pushed on top + -- when it is returned. + unlifted_itbl_sizeW | isAlgCase = 0 + | otherwise = 1 + + -- depth of stack after the return value has been pushed + d_bndr = d + ret_frame_sizeW + idSizeW bndr + + -- depth of stack after the extra info table for an unboxed return + -- has been pushed, if any. This is the stack depth at the + -- continuation. + d_alts = d_bndr + unlifted_itbl_sizeW + + -- Env in which to compile the alts, not including + -- any vars bound by the alts themselves + p_alts = addToFM p bndr (d_bndr - 1) + + bndr_ty = idType bndr + isAlgCase = not (isUnLiftedType bndr_ty) && not is_unboxed_tuple + + -- given an alt, return a discr and code for it. + codeALt alt@(DEFAULT, _, (_,rhs)) + = schemeE d_alts s p_alts rhs `thenBc` \ rhs_code -> + returnBc (NoDiscr, rhs_code) + codeAlt alt@(discr, bndrs, (_,rhs)) + -- primitive or nullary constructor alt: no need to UNPACK + | null real_bndrs = do + rhs_code <- schemeE d_alts s p_alts rhs + returnBc (my_discr alt, rhs_code) + -- algebraic alt with some binders + | ASSERT(isAlgCase) otherwise = + let + (ptrs,nptrs) = partition (isFollowableArg.idCgRep) real_bndrs + ptr_sizes = map idSizeW ptrs + nptrs_sizes = map idSizeW nptrs + bind_sizes = ptr_sizes ++ nptrs_sizes + size = sum ptr_sizes + sum nptrs_sizes + -- the UNPACK instruction unpacks in reverse order... 
+ p' = addListToFM p_alts + (zip (reverse (ptrs ++ nptrs)) + (mkStackOffsets d_alts (reverse bind_sizes))) + in do + rhs_code <- schemeE (d_alts+size) s p' rhs + return (my_discr alt, unitOL (UNPACK size) `appOL` rhs_code) + where + real_bndrs = filter (not.isTyVar) bndrs + + + my_discr (DEFAULT, binds, rhs) = NoDiscr {-shouldn't really happen-} + my_discr (DataAlt dc, binds, rhs) + | isUnboxedTupleCon dc + = unboxedTupleException + | otherwise + = DiscrP (dataConTag dc - fIRST_TAG) + my_discr (LitAlt l, binds, rhs) + = case l of MachInt i -> DiscrI (fromInteger i) + MachFloat r -> DiscrF (fromRational r) + MachDouble r -> DiscrD (fromRational r) + MachChar i -> DiscrI (ord i) + _ -> pprPanic "schemeE(AnnCase).my_discr" (ppr l) + + maybe_ncons + | not isAlgCase = Nothing + | otherwise + = case [dc | (DataAlt dc, _, _) <- alts] of + [] -> Nothing + (dc:_) -> Just (tyConFamilySize (dataConTyCon dc)) + + -- the bitmap is relative to stack depth d, i.e. before the + -- BCO, info table and return value are pushed on. + -- This bit of code is v. similar to buildLivenessMask in CgBindery, + -- except that here we build the bitmap from the known bindings of + -- things that are pointers, whereas in CgBindery the code builds the + -- bitmap from the free slots and unboxed bindings. + -- (ToDo: merge?) + bitmap = intsToReverseBitmap d{-size-} (sortLe (<=) rel_slots) + where + binds = fmToList p + rel_slots = concat (map spread binds) + spread (id, offset) + | isFollowableArg (idCgRep id) = [ rel_offset ] + | otherwise = [] + where rel_offset = d - offset - 1 + + in do + alt_stuff <- mapM codeAlt alts + alt_final <- mkMultiBranch maybe_ncons alt_stuff + let + alt_bco_name = getName bndr + alt_bco = mkProtoBCO alt_bco_name alt_final (Left alts) + 0{-no arity-} d{-bitmap size-} bitmap True{-is alts-} + -- in +-- trace ("case: bndr = " ++ showSDocDebug (ppr bndr) ++ "\ndepth = " ++ show d ++ "\nenv = \n" ++ showSDocDebug (ppBCEnv p) ++ +-- "\n bitmap = " ++ show bitmap) $ do + scrut_code <- schemeE (d + ret_frame_sizeW) (d + ret_frame_sizeW) p scrut + alt_bco' <- emitBc alt_bco + let push_alts + | isAlgCase = PUSH_ALTS alt_bco' + | otherwise = PUSH_ALTS_UNLIFTED alt_bco' (typeCgRep bndr_ty) + returnBc (push_alts `consOL` scrut_code) + + +-- ----------------------------------------------------------------------------- +-- Deal with a CCall. + +-- Taggedly push the args onto the stack R->L, +-- deferencing ForeignObj#s and adjusting addrs to point to +-- payloads in Ptr/Byte arrays. Then, generate the marshalling +-- (machine) code for the ccall, and create bytecodes to call that and +-- then return in the right way. + +generateCCall :: Int -> Sequel -- stack and sequel depths + -> BCEnv + -> CCallSpec -- where to call + -> Id -- of target, for type info + -> [AnnExpr' Id VarSet] -- args (atoms) + -> BcM BCInstrList + +generateCCall d0 s p ccall_spec@(CCallSpec target cconv safety) fn args_r_to_l + = let + -- useful constants + addr_sizeW = cgRepSizeW NonPtrArg + + -- Get the args on the stack, with tags and suitably + -- dereferenced for the CCall. For each arg, return the + -- depth to the first word of the bits for that arg, and the + -- CgRep of what was actually pushed. + + pargs d [] = returnBc [] + pargs d (a:az) + = let arg_ty = repType (exprType (deAnnotate' a)) + + in case splitTyConApp_maybe arg_ty of + -- Don't push the FO; instead push the Addr# it + -- contains. 
+ Just (t, _) + | t == arrayPrimTyCon || t == mutableArrayPrimTyCon + -> pargs (d + addr_sizeW) az `thenBc` \ rest -> + parg_ArrayishRep arrPtrsHdrSize d p a + `thenBc` \ code -> + returnBc ((code,NonPtrArg):rest) + + | t == byteArrayPrimTyCon || t == mutableByteArrayPrimTyCon + -> pargs (d + addr_sizeW) az `thenBc` \ rest -> + parg_ArrayishRep arrWordsHdrSize d p a + `thenBc` \ code -> + returnBc ((code,NonPtrArg):rest) + + -- Default case: push taggedly, but otherwise intact. + other + -> pushAtom d p a `thenBc` \ (code_a, sz_a) -> + pargs (d+sz_a) az `thenBc` \ rest -> + returnBc ((code_a, atomRep a) : rest) + + -- Do magic for Ptr/Byte arrays. Push a ptr to the array on + -- the stack but then advance it over the headers, so as to + -- point to the payload. + parg_ArrayishRep hdrSize d p a + = pushAtom d p a `thenBc` \ (push_fo, _) -> + -- The ptr points at the header. Advance it over the + -- header and then pretend this is an Addr#. + returnBc (push_fo `snocOL` SWIZZLE 0 hdrSize) + + in + pargs d0 args_r_to_l `thenBc` \ code_n_reps -> + let + (pushs_arg, a_reps_pushed_r_to_l) = unzip code_n_reps + + push_args = concatOL pushs_arg + d_after_args = d0 + sum (map cgRepSizeW a_reps_pushed_r_to_l) + a_reps_pushed_RAW + | null a_reps_pushed_r_to_l || head a_reps_pushed_r_to_l /= VoidArg + = panic "ByteCodeGen.generateCCall: missing or invalid World token?" + | otherwise + = reverse (tail a_reps_pushed_r_to_l) + + -- Now: a_reps_pushed_RAW are the reps which are actually on the stack. + -- push_args is the code to do that. + -- d_after_args is the stack depth once the args are on. + + -- Get the result rep. + (returns_void, r_rep) + = case maybe_getCCallReturnRep (idType fn) of + Nothing -> (True, VoidArg) + Just rr -> (False, rr) + {- + Because the Haskell stack grows down, the a_reps refer to + lowest to highest addresses in that order. The args for the call + are on the stack. Now push an unboxed Addr# indicating + the C function to call. Then push a dummy placeholder for the + result. Finally, emit a CCALL insn with an offset pointing to the + Addr# just pushed, and a literal field holding the mallocville + address of the piece of marshalling code we generate. + So, just prior to the CCALL insn, the stack looks like this + (growing down, as usual): + + <arg_n> + ... + <arg_1> + Addr# address_of_C_fn + <placeholder-for-result#> (must be an unboxed type) + + The interpreter then calls the marshall code mentioned + in the CCALL insn, passing it (& <placeholder-for-result#>), + that is, the addr of the topmost word in the stack. + When this returns, the placeholder will have been + filled in. The placeholder is slid down to the sequel + depth, and we RETURN. + + This arrangement makes it simple to do f-i-dynamic since the Addr# + value is the first arg anyway. + + The marshalling code is generated specifically for this + call site, and so knows exactly the (Haskell) stack + offsets of the args, fn address and placeholder. It + copies the args to the C stack, calls the stacked addr, + and parks the result back in the placeholder. 
The interpreter + calls it as a normal C call, assuming it has a signature + void marshall_code ( StgWord* ptr_to_top_of_stack ) + -} + -- resolve static address + get_target_info + = case target of + DynamicTarget + -> returnBc (False, panic "ByteCodeGen.generateCCall(dyn)") + StaticTarget target + -> ioToBc (lookupStaticPtr target) `thenBc` \res -> + returnBc (True, res) + in + get_target_info `thenBc` \ (is_static, static_target_addr) -> + let + + -- Get the arg reps, zapping the leading Addr# in the dynamic case + a_reps -- | trace (showSDoc (ppr a_reps_pushed_RAW)) False = error "???" + | is_static = a_reps_pushed_RAW + | otherwise = if null a_reps_pushed_RAW + then panic "ByteCodeGen.generateCCall: dyn with no args" + else tail a_reps_pushed_RAW + + -- push the Addr# + (push_Addr, d_after_Addr) + | is_static + = (toOL [PUSH_UBX (Right static_target_addr) addr_sizeW], + d_after_args + addr_sizeW) + | otherwise -- is already on the stack + = (nilOL, d_after_args) + + -- Push the return placeholder. For a call returning nothing, + -- this is a VoidArg (tag). + r_sizeW = cgRepSizeW r_rep + d_after_r = d_after_Addr + r_sizeW + r_lit = mkDummyLiteral r_rep + push_r = (if returns_void + then nilOL + else unitOL (PUSH_UBX (Left r_lit) r_sizeW)) + + -- generate the marshalling code we're going to call + r_offW = 0 + addr_offW = r_sizeW + arg1_offW = r_sizeW + addr_sizeW + args_offW = map (arg1_offW +) + (init (scanl (+) 0 (map cgRepSizeW a_reps))) + in + ioToBc (mkMarshalCode cconv + (r_offW, r_rep) addr_offW + (zip args_offW a_reps)) `thenBc` \ addr_of_marshaller -> + recordMallocBc addr_of_marshaller `thenBc_` + let + -- Offset of the next stack frame down the stack. The CCALL + -- instruction needs to describe the chunk of stack containing + -- the ccall args to the GC, so it needs to know how large it + -- is. See comment in Interpreter.c with the CCALL instruction. + stk_offset = d_after_r - s + + -- do the call + do_call = unitOL (CCALL stk_offset (castPtr addr_of_marshaller)) + -- slide and return + wrapup = mkSLIDE r_sizeW (d_after_r - r_sizeW - s) + `snocOL` RETURN_UBX r_rep + in + --trace (show (arg1_offW, args_offW , (map cgRepSizeW a_reps) )) $ + returnBc ( + push_args `appOL` + push_Addr `appOL` push_r `appOL` do_call `appOL` wrapup + ) + + +-- Make a dummy literal, to be used as a placeholder for FFI return +-- values on the stack. +mkDummyLiteral :: CgRep -> Literal +mkDummyLiteral pr + = case pr of + NonPtrArg -> MachWord 0 + DoubleArg -> MachDouble 0 + FloatArg -> MachFloat 0 + _ -> moan64 "mkDummyLiteral" (ppr pr) + + +-- Convert (eg) +-- GHC.Prim.Char# -> GHC.Prim.State# GHC.Prim.RealWorld +-- -> (# GHC.Prim.State# GHC.Prim.RealWorld, GHC.Prim.Int# #) +-- +-- to Just IntRep +-- and check that an unboxed pair is returned wherein the first arg is VoidArg'd. +-- +-- Alternatively, for call-targets returning nothing, convert +-- +-- GHC.Prim.Char# -> GHC.Prim.State# GHC.Prim.RealWorld +-- -> (# GHC.Prim.State# GHC.Prim.RealWorld #) +-- +-- to Nothing + +maybe_getCCallReturnRep :: Type -> Maybe CgRep +maybe_getCCallReturnRep fn_ty + = let (a_tys, r_ty) = splitFunTys (dropForAlls fn_ty) + maybe_r_rep_to_go + = if isSingleton r_reps then Nothing else Just (r_reps !! 
1) + (r_tycon, r_reps) + = case splitTyConApp_maybe (repType r_ty) of + (Just (tyc, tys)) -> (tyc, map typeCgRep tys) + Nothing -> blargh + ok = ( ( r_reps `lengthIs` 2 && VoidArg == head r_reps) + || r_reps == [VoidArg] ) + && isUnboxedTupleTyCon r_tycon + && case maybe_r_rep_to_go of + Nothing -> True + Just r_rep -> r_rep /= PtrArg + -- if it was, it would be impossible + -- to create a valid return value + -- placeholder on the stack + blargh = pprPanic "maybe_getCCallReturn: can't handle:" + (pprType fn_ty) + in + --trace (showSDoc (ppr (a_reps, r_reps))) $ + if ok then maybe_r_rep_to_go else blargh + +-- Compile code which expects an unboxed Int on the top of stack, +-- (call it i), and pushes the i'th closure in the supplied list +-- as a consequence. +implement_tagToId :: [Name] -> BcM BCInstrList +implement_tagToId names + = ASSERT( notNull names ) + getLabelsBc (length names) `thenBc` \ labels -> + getLabelBc `thenBc` \ label_fail -> + getLabelBc `thenBc` \ label_exit -> + zip4 labels (tail labels ++ [label_fail]) + [0 ..] names `bind` \ infos -> + map (mkStep label_exit) infos `bind` \ steps -> + returnBc (concatOL steps + `appOL` + toOL [LABEL label_fail, CASEFAIL, LABEL label_exit]) + where + mkStep l_exit (my_label, next_label, n, name_for_n) + = toOL [LABEL my_label, + TESTEQ_I n next_label, + PUSH_G name_for_n, + JMP l_exit] + + +-- ----------------------------------------------------------------------------- +-- pushAtom + +-- Push an atom onto the stack, returning suitable code & number of +-- stack words used. +-- +-- The env p must map each variable to the highest- numbered stack +-- slot for it. For example, if the stack has depth 4 and we +-- tagged-ly push (v :: Int#) on it, the value will be in stack[4], +-- the tag in stack[5], the stack will have depth 6, and p must map v +-- to 5 and not to 4. Stack locations are numbered from zero, so a +-- depth 6 stack has valid words 0 .. 5. + +pushAtom :: Int -> BCEnv -> AnnExpr' Id VarSet -> BcM (BCInstrList, Int) + +pushAtom d p (AnnApp f (_, AnnType _)) + = pushAtom d p (snd f) + +pushAtom d p (AnnNote note e) + = pushAtom d p (snd e) + +pushAtom d p (AnnLam x e) + | isTyVar x + = pushAtom d p (snd e) + +pushAtom d p (AnnVar v) + + | idCgRep v == VoidArg + = returnBc (nilOL, 0) + + | isFCallId v + = pprPanic "pushAtom: shouldn't get an FCallId here" (ppr v) + + | Just primop <- isPrimOpId_maybe v + = returnBc (unitOL (PUSH_PRIMOP primop), 1) + + | Just d_v <- lookupBCEnv_maybe p v -- v is a local variable + = returnBc (toOL (nOfThem sz (PUSH_L (d-d_v+sz-2))), sz) + -- d - d_v the number of words between the TOS + -- and the 1st slot of the object + -- + -- d - d_v - 1 the offset from the TOS of the 1st slot + -- + -- d - d_v - 1 + sz - 1 the offset from the TOS of the last slot + -- of the object. + -- + -- Having found the last slot, we proceed to copy the right number of + -- slots on to the top of the stack. 
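+ -- Illustrative example (a sketch): with d = 6 and a two-word local
+ -- whose entry in p is d_v = 3, the formula gives d-d_v+sz-2 = 3, so
+ -- we emit two copies of PUSH_L 3; each push deepens the stack by one
+ -- word, so the same literal offset picks up successive slots of the
+ -- object.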
+ + | otherwise -- v must be a global variable + = ASSERT(sz == 1) + returnBc (unitOL (PUSH_G (getName v)), sz) + + where + sz = idSizeW v + + +pushAtom d p (AnnLit lit) + = case lit of + MachLabel fs _ -> code NonPtrArg + MachWord w -> code NonPtrArg + MachInt i -> code PtrArg + MachFloat r -> code FloatArg + MachDouble r -> code DoubleArg + MachChar c -> code NonPtrArg + MachStr s -> pushStr s + where + code rep + = let size_host_words = cgRepSizeW rep + in returnBc (unitOL (PUSH_UBX (Left lit) size_host_words), + size_host_words) + + pushStr s + = let getMallocvilleAddr + = case s of + FastString _ n _ fp _ -> + -- we could grab the Ptr from the ForeignPtr, + -- but then we have no way to control its lifetime. + -- In reality it'll probably stay alive long enoungh + -- by virtue of the global FastString table, but + -- to be on the safe side we copy the string into + -- a malloc'd area of memory. + ioToBc (mallocBytes (n+1)) `thenBc` \ ptr -> + recordMallocBc ptr `thenBc_` + ioToBc ( + withForeignPtr fp $ \p -> do + memcpy ptr p (fromIntegral n) + pokeByteOff ptr n (fromIntegral (ord '\0') :: Word8) + return ptr + ) + other -> panic "ByteCodeGen.pushAtom.pushStr" + in + getMallocvilleAddr `thenBc` \ addr -> + -- Get the addr on the stack, untaggedly + returnBc (unitOL (PUSH_UBX (Right addr) 1), 1) + +pushAtom d p other + = pprPanic "ByteCodeGen.pushAtom" + (pprCoreExpr (deAnnotate (undefined, other))) + +foreign import ccall unsafe "memcpy" + memcpy :: Ptr a -> Ptr b -> CInt -> IO () + + +-- ----------------------------------------------------------------------------- +-- Given a bunch of alts code and their discrs, do the donkey work +-- of making a multiway branch using a switch tree. +-- What a load of hassle! + +mkMultiBranch :: Maybe Int -- # datacons in tycon, if alg alt + -- a hint; generates better code + -- Nothing is always safe + -> [(Discr, BCInstrList)] + -> BcM BCInstrList +mkMultiBranch maybe_ncons raw_ways + = let d_way = filter (isNoDiscr.fst) raw_ways + notd_ways = sortLe + (\w1 w2 -> leAlt (fst w1) (fst w2)) + (filter (not.isNoDiscr.fst) raw_ways) + + mkTree :: [(Discr, BCInstrList)] -> Discr -> Discr -> BcM BCInstrList + mkTree [] range_lo range_hi = returnBc the_default + + mkTree [val] range_lo range_hi + | range_lo `eqAlt` range_hi + = returnBc (snd val) + | otherwise + = getLabelBc `thenBc` \ label_neq -> + returnBc (mkTestEQ (fst val) label_neq + `consOL` (snd val + `appOL` unitOL (LABEL label_neq) + `appOL` the_default)) + + mkTree vals range_lo range_hi + = let n = length vals `div` 2 + vals_lo = take n vals + vals_hi = drop n vals + v_mid = fst (head vals_hi) + in + getLabelBc `thenBc` \ label_geq -> + mkTree vals_lo range_lo (dec v_mid) `thenBc` \ code_lo -> + mkTree vals_hi v_mid range_hi `thenBc` \ code_hi -> + returnBc (mkTestLT v_mid label_geq + `consOL` (code_lo + `appOL` unitOL (LABEL label_geq) + `appOL` code_hi)) + + the_default + = case d_way of [] -> unitOL CASEFAIL + [(_, def)] -> def + + -- None of these will be needed if there are no non-default alts + (mkTestLT, mkTestEQ, init_lo, init_hi) + | null notd_ways + = panic "mkMultiBranch: awesome foursome" + | otherwise + = case fst (head notd_ways) of { + DiscrI _ -> ( \(DiscrI i) fail_label -> TESTLT_I i fail_label, + \(DiscrI i) fail_label -> TESTEQ_I i fail_label, + DiscrI minBound, + DiscrI maxBound ); + DiscrF _ -> ( \(DiscrF f) fail_label -> TESTLT_F f fail_label, + \(DiscrF f) fail_label -> TESTEQ_F f fail_label, + DiscrF minF, + DiscrF maxF ); + DiscrD _ -> ( \(DiscrD d) fail_label -> 
TESTLT_D d fail_label, + \(DiscrD d) fail_label -> TESTEQ_D d fail_label, + DiscrD minD, + DiscrD maxD ); + DiscrP _ -> ( \(DiscrP i) fail_label -> TESTLT_P i fail_label, + \(DiscrP i) fail_label -> TESTEQ_P i fail_label, + DiscrP algMinBound, + DiscrP algMaxBound ) + } + + (algMinBound, algMaxBound) + = case maybe_ncons of + Just n -> (0, n - 1) + Nothing -> (minBound, maxBound) + + (DiscrI i1) `eqAlt` (DiscrI i2) = i1 == i2 + (DiscrF f1) `eqAlt` (DiscrF f2) = f1 == f2 + (DiscrD d1) `eqAlt` (DiscrD d2) = d1 == d2 + (DiscrP i1) `eqAlt` (DiscrP i2) = i1 == i2 + NoDiscr `eqAlt` NoDiscr = True + _ `eqAlt` _ = False + + (DiscrI i1) `leAlt` (DiscrI i2) = i1 <= i2 + (DiscrF f1) `leAlt` (DiscrF f2) = f1 <= f2 + (DiscrD d1) `leAlt` (DiscrD d2) = d1 <= d2 + (DiscrP i1) `leAlt` (DiscrP i2) = i1 <= i2 + NoDiscr `leAlt` NoDiscr = True + _ `leAlt` _ = False + + isNoDiscr NoDiscr = True + isNoDiscr _ = False + + dec (DiscrI i) = DiscrI (i-1) + dec (DiscrP i) = DiscrP (i-1) + dec other = other -- not really right, but if you + -- do cases on floating values, you'll get what you deserve + + -- same snotty comment applies to the following + minF, maxF :: Float + minD, maxD :: Double + minF = -1.0e37 + maxF = 1.0e37 + minD = -1.0e308 + maxD = 1.0e308 + in + mkTree notd_ways init_lo init_hi + + +-- ----------------------------------------------------------------------------- +-- Supporting junk for the compilation schemes + +-- Describes case alts +data Discr + = DiscrI Int + | DiscrF Float + | DiscrD Double + | DiscrP Int + | NoDiscr + +instance Outputable Discr where + ppr (DiscrI i) = int i + ppr (DiscrF f) = text (show f) + ppr (DiscrD d) = text (show d) + ppr (DiscrP i) = int i + ppr NoDiscr = text "DEF" + + +lookupBCEnv_maybe :: BCEnv -> Id -> Maybe Int +lookupBCEnv_maybe = lookupFM + +idSizeW :: Id -> Int +idSizeW id = cgRepSizeW (typeCgRep (idType id)) + +unboxedTupleException :: a +unboxedTupleException + = throwDyn + (Panic + ("Bytecode generator can't handle unboxed tuples. Possibly due\n" ++ + "\tto foreign import/export decls in source. Workaround:\n" ++ + "\tcompile this module to a .o file, then restart session.")) + + +mkSLIDE n d = if d == 0 then nilOL else unitOL (SLIDE n d) +bind x f = f x + +splitApp :: AnnExpr' id ann -> (AnnExpr' id ann, [AnnExpr' id ann]) + -- The arguments are returned in *right-to-left* order +splitApp (AnnApp (_,f) (_,a)) + | isTypeAtom a = splitApp f + | otherwise = case splitApp f of + (f', as) -> (f', a:as) +splitApp (AnnNote n (_,e)) = splitApp e +splitApp e = (e, []) + + +isTypeAtom :: AnnExpr' id ann -> Bool +isTypeAtom (AnnType _) = True +isTypeAtom _ = False + +isVoidArgAtom :: AnnExpr' id ann -> Bool +isVoidArgAtom (AnnVar v) = typeCgRep (idType v) == VoidArg +isVoidArgAtom (AnnNote n (_,e)) = isVoidArgAtom e +isVoidArgAtom _ = False + +atomRep :: AnnExpr' Id ann -> CgRep +atomRep (AnnVar v) = typeCgRep (idType v) +atomRep (AnnLit l) = typeCgRep (literalType l) +atomRep (AnnNote n b) = atomRep (snd b) +atomRep (AnnApp f (_, AnnType _)) = atomRep (snd f) +atomRep (AnnLam x e) | isTyVar x = atomRep (snd e) +atomRep other = pprPanic "atomRep" (ppr (deAnnotate (undefined,other))) + +isPtrAtom :: AnnExpr' Id ann -> Bool +isPtrAtom e = atomRep e == PtrArg + +-- Let szsw be the sizes in words of some items pushed onto the stack, +-- which has initial depth d'. Return the values which the stack environment +-- should map these items to. 
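+-- A worked example: mkStackOffsets 0 [1,2,1] == [0,2,3]; each item is
+-- mapped to its highest-numbered slot, which is the convention pushAtom
+-- expects (see its comment above).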
+mkStackOffsets :: Int -> [Int] -> [Int] +mkStackOffsets original_depth szsw + = map (subtract 1) (tail (scanl (+) original_depth szsw)) + +-- ----------------------------------------------------------------------------- +-- The bytecode generator's monad + +data BcM_State + = BcM_State { + nextlabel :: Int, -- for generating local labels + malloced :: [Ptr ()] } -- ptrs malloced for current BCO + -- Should be free()d when it is GCd + +newtype BcM r = BcM (BcM_State -> IO (BcM_State, r)) + +ioToBc :: IO a -> BcM a +ioToBc io = BcM $ \st -> do + x <- io + return (st, x) + +runBc :: BcM r -> IO (BcM_State, r) +runBc (BcM m) = m (BcM_State 0 []) + +thenBc :: BcM a -> (a -> BcM b) -> BcM b +thenBc (BcM expr) cont = BcM $ \st0 -> do + (st1, q) <- expr st0 + let BcM k = cont q + (st2, r) <- k st1 + return (st2, r) + +thenBc_ :: BcM a -> BcM b -> BcM b +thenBc_ (BcM expr) (BcM cont) = BcM $ \st0 -> do + (st1, q) <- expr st0 + (st2, r) <- cont st1 + return (st2, r) + +returnBc :: a -> BcM a +returnBc result = BcM $ \st -> (return (st, result)) + +instance Monad BcM where + (>>=) = thenBc + (>>) = thenBc_ + return = returnBc + +emitBc :: ([Ptr ()] -> ProtoBCO Name) -> BcM (ProtoBCO Name) +emitBc bco + = BcM $ \st -> return (st{malloced=[]}, bco (malloced st)) + +recordMallocBc :: Ptr a -> BcM () +recordMallocBc a + = BcM $ \st -> return (st{malloced = castPtr a : malloced st}, ()) + +getLabelBc :: BcM Int +getLabelBc + = BcM $ \st -> return (st{nextlabel = 1 + nextlabel st}, nextlabel st) + +getLabelsBc :: Int -> BcM [Int] +getLabelsBc n + = BcM $ \st -> let ctr = nextlabel st + in return (st{nextlabel = ctr+n}, [ctr .. ctr+n-1]) +\end{code} diff --git a/compiler/ghci/ByteCodeInstr.lhs b/compiler/ghci/ByteCodeInstr.lhs new file mode 100644 index 0000000000..7bd4408fff --- /dev/null +++ b/compiler/ghci/ByteCodeInstr.lhs @@ -0,0 +1,256 @@ +% +% (c) The University of Glasgow 2000 +% +\section[ByteCodeInstrs]{Bytecode instruction definitions} + +\begin{code} +module ByteCodeInstr ( + BCInstr(..), ProtoBCO(..), bciStackUse + ) where + +#include "HsVersions.h" +#include "../includes/MachDeps.h" + +import Outputable +import Name ( Name ) +import Id ( Id ) +import CoreSyn +import PprCore ( pprCoreExpr, pprCoreAlt ) +import Literal ( Literal ) +import DataCon ( DataCon ) +import VarSet ( VarSet ) +import PrimOp ( PrimOp ) +import SMRep ( StgWord, CgRep ) +import GHC.Ptr + +-- ---------------------------------------------------------------------------- +-- Bytecode instructions + +data ProtoBCO a + = ProtoBCO { + protoBCOName :: a, -- name, in some sense + protoBCOInstrs :: [BCInstr], -- instrs + -- arity and GC info + protoBCOBitmap :: [StgWord], + protoBCOBitmapSize :: Int, + protoBCOArity :: Int, + -- what the BCO came from + protoBCOExpr :: Either [AnnAlt Id VarSet] (AnnExpr Id VarSet), + -- malloc'd pointers + protoBCOPtrs :: [Ptr ()] + } + +type LocalLabel = Int + +data BCInstr + -- Messing with the stack + = STKCHECK Int + + -- Push locals (existing bits of the stack) + | PUSH_L Int{-offset-} + | PUSH_LL Int Int{-2 offsets-} + | PUSH_LLL Int Int Int{-3 offsets-} + + -- Push a ptr (these all map to PUSH_G really) + | PUSH_G Name + | PUSH_PRIMOP PrimOp + | PUSH_BCO (ProtoBCO Name) + + -- Push an alt continuation + | PUSH_ALTS (ProtoBCO Name) + | PUSH_ALTS_UNLIFTED (ProtoBCO Name) CgRep + + -- Pushing literals + | PUSH_UBX (Either Literal (Ptr ())) Int + -- push this int/float/double/addr, on the stack. Int + -- is # of words to copy from literal pool. 
Eitherness reflects + -- the difficulty of dealing with MachAddr here, mostly due to + -- the excessive (and unnecessary) restrictions imposed by the + -- designers of the new Foreign library. In particular it is + -- quite impossible to convert an Addr to any other integral + -- type, and it appears impossible to get hold of the bits of + -- an addr, even though we need to to assemble BCOs. + + -- various kinds of application + | PUSH_APPLY_N + | PUSH_APPLY_V + | PUSH_APPLY_F + | PUSH_APPLY_D + | PUSH_APPLY_L + | PUSH_APPLY_P + | PUSH_APPLY_PP + | PUSH_APPLY_PPP + | PUSH_APPLY_PPPP + | PUSH_APPLY_PPPPP + | PUSH_APPLY_PPPPPP + + | SLIDE Int{-this many-} Int{-down by this much-} + + -- To do with the heap + | ALLOC_AP Int -- make an AP with this many payload words + | ALLOC_PAP Int Int -- make a PAP with this arity / payload words + | MKAP Int{-ptr to AP is this far down stack-} Int{-# words-} + | MKPAP Int{-ptr to PAP is this far down stack-} Int{-# words-} + | UNPACK Int -- unpack N words from t.o.s Constr + | PACK DataCon Int + -- after assembly, the DataCon is an index into the + -- itbl array + -- For doing case trees + | LABEL LocalLabel + | TESTLT_I Int LocalLabel + | TESTEQ_I Int LocalLabel + | TESTLT_F Float LocalLabel + | TESTEQ_F Float LocalLabel + | TESTLT_D Double LocalLabel + | TESTEQ_D Double LocalLabel + + -- The Int value is a constructor number and therefore + -- stored in the insn stream rather than as an offset into + -- the literal pool. + | TESTLT_P Int LocalLabel + | TESTEQ_P Int LocalLabel + + | CASEFAIL + | JMP LocalLabel + + -- For doing calls to C (via glue code generated by ByteCodeFFI) + | CCALL Int -- stack frame size + (Ptr ()) -- addr of the glue code + + -- For doing magic ByteArray passing to foreign calls + | SWIZZLE Int -- to the ptr N words down the stack, + Int -- add M (interpreted as a signed 16-bit entity) + + -- To Infinity And Beyond + | ENTER + | RETURN -- return a lifted value + | RETURN_UBX CgRep -- return an unlifted value, here's its rep + +-- ----------------------------------------------------------------------------- +-- Printing bytecode instructions + +instance Outputable a => Outputable (ProtoBCO a) where + ppr (ProtoBCO name instrs bitmap bsize arity origin malloced) + = (text "ProtoBCO" <+> ppr name <> char '#' <> int arity + <+> text (show malloced) <> colon) + $$ nest 6 (text "bitmap: " <+> text (show bsize) <+> text (show bitmap)) + $$ nest 6 (vcat (map ppr instrs)) + $$ case origin of + Left alts -> vcat (map (pprCoreAlt.deAnnAlt) alts) + Right rhs -> pprCoreExpr (deAnnotate rhs) + +instance Outputable BCInstr where + ppr (STKCHECK n) = text "STKCHECK" <+> int n + ppr (PUSH_L offset) = text "PUSH_L " <+> int offset + ppr (PUSH_LL o1 o2) = text "PUSH_LL " <+> int o1 <+> int o2 + ppr (PUSH_LLL o1 o2 o3) = text "PUSH_LLL" <+> int o1 <+> int o2 <+> int o3 + ppr (PUSH_G nm) = text "PUSH_G " <+> ppr nm + ppr (PUSH_PRIMOP op) = text "PUSH_G " <+> text "GHC.PrimopWrappers." 
+ <> ppr op + ppr (PUSH_BCO bco) = text "PUSH_BCO" <+> nest 3 (ppr bco) + ppr (PUSH_ALTS bco) = text "PUSH_ALTS " <+> ppr bco + ppr (PUSH_ALTS_UNLIFTED bco pk) = text "PUSH_ALTS_UNLIFTED " <+> ppr pk <+> ppr bco + + ppr (PUSH_UBX (Left lit) nw) = text "PUSH_UBX" <+> parens (int nw) <+> ppr lit + ppr (PUSH_UBX (Right aa) nw) = text "PUSH_UBX" <+> parens (int nw) <+> text (show aa) + ppr PUSH_APPLY_N = text "PUSH_APPLY_N" + ppr PUSH_APPLY_V = text "PUSH_APPLY_V" + ppr PUSH_APPLY_F = text "PUSH_APPLY_F" + ppr PUSH_APPLY_D = text "PUSH_APPLY_D" + ppr PUSH_APPLY_L = text "PUSH_APPLY_L" + ppr PUSH_APPLY_P = text "PUSH_APPLY_P" + ppr PUSH_APPLY_PP = text "PUSH_APPLY_PP" + ppr PUSH_APPLY_PPP = text "PUSH_APPLY_PPP" + ppr PUSH_APPLY_PPPP = text "PUSH_APPLY_PPPP" + ppr PUSH_APPLY_PPPPP = text "PUSH_APPLY_PPPPP" + ppr PUSH_APPLY_PPPPPP = text "PUSH_APPLY_PPPPPP" + + ppr (SLIDE n d) = text "SLIDE " <+> int n <+> int d + ppr (ALLOC_AP sz) = text "ALLOC_AP " <+> int sz + ppr (ALLOC_PAP arity sz) = text "ALLOC_PAP " <+> int arity <+> int sz + ppr (MKAP offset sz) = text "MKAP " <+> int sz <+> text "words," + <+> int offset <+> text "stkoff" + ppr (UNPACK sz) = text "UNPACK " <+> int sz + ppr (PACK dcon sz) = text "PACK " <+> ppr dcon <+> ppr sz + ppr (LABEL lab) = text "__" <> int lab <> colon + ppr (TESTLT_I i lab) = text "TESTLT_I" <+> int i <+> text "__" <> int lab + ppr (TESTEQ_I i lab) = text "TESTEQ_I" <+> int i <+> text "__" <> int lab + ppr (TESTLT_F f lab) = text "TESTLT_F" <+> float f <+> text "__" <> int lab + ppr (TESTEQ_F f lab) = text "TESTEQ_F" <+> float f <+> text "__" <> int lab + ppr (TESTLT_D d lab) = text "TESTLT_D" <+> double d <+> text "__" <> int lab + ppr (TESTEQ_D d lab) = text "TESTEQ_D" <+> double d <+> text "__" <> int lab + ppr (TESTLT_P i lab) = text "TESTLT_P" <+> int i <+> text "__" <> int lab + ppr (TESTEQ_P i lab) = text "TESTEQ_P" <+> int i <+> text "__" <> int lab + ppr (JMP lab) = text "JMP" <+> int lab + ppr CASEFAIL = text "CASEFAIL" + ppr ENTER = text "ENTER" + ppr RETURN = text "RETURN" + ppr (RETURN_UBX pk) = text "RETURN_UBX " <+> ppr pk + ppr (CCALL off marshall_addr) = text "CCALL " <+> int off + <+> text "marshall code at" + <+> text (show marshall_addr) + ppr (SWIZZLE stkoff n) = text "SWIZZLE " <+> text "stkoff" <+> int stkoff + <+> text "by" <+> int n + +-- ----------------------------------------------------------------------------- +-- The stack use, in words, of each bytecode insn. These _must_ be +-- correct, or overestimates of reality, to be safe. + +-- NOTE: we aggregate the stack use from case alternatives too, so that +-- we can do a single stack check at the beginning of a function only. + +-- This could all be made more accurate by keeping track of a proper +-- stack high water mark, but it doesn't seem worth the hassle. 
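+-- Illustrative arithmetic (a sketch): a BCO whose instructions were
+-- [PUSH_LL 3 5, PUSH_G f, PUSH_APPLY_PP, ENTER] would get the
+-- overestimate 2 + 1 + 1 + 0 = 4 words from protoBCOStackUse below;
+-- ByteCodeGen prepends a STKCHECK only when such a sum reaches
+-- iNTERP_STACK_CHECK_THRESH.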
+ +protoBCOStackUse :: ProtoBCO a -> Int +protoBCOStackUse bco = sum (map bciStackUse (protoBCOInstrs bco)) + +bciStackUse :: BCInstr -> Int +bciStackUse STKCHECK{} = 0 +bciStackUse PUSH_L{} = 1 +bciStackUse PUSH_LL{} = 2 +bciStackUse PUSH_LLL{} = 3 +bciStackUse PUSH_G{} = 1 +bciStackUse PUSH_PRIMOP{} = 1 +bciStackUse PUSH_BCO{} = 1 +bciStackUse (PUSH_ALTS bco) = 2 + protoBCOStackUse bco +bciStackUse (PUSH_ALTS_UNLIFTED bco _) = 2 + protoBCOStackUse bco +bciStackUse (PUSH_UBX _ nw) = nw +bciStackUse PUSH_APPLY_N{} = 1 +bciStackUse PUSH_APPLY_V{} = 1 +bciStackUse PUSH_APPLY_F{} = 1 +bciStackUse PUSH_APPLY_D{} = 1 +bciStackUse PUSH_APPLY_L{} = 1 +bciStackUse PUSH_APPLY_P{} = 1 +bciStackUse PUSH_APPLY_PP{} = 1 +bciStackUse PUSH_APPLY_PPP{} = 1 +bciStackUse PUSH_APPLY_PPPP{} = 1 +bciStackUse PUSH_APPLY_PPPPP{} = 1 +bciStackUse PUSH_APPLY_PPPPPP{} = 1 +bciStackUse ALLOC_AP{} = 1 +bciStackUse ALLOC_PAP{} = 1 +bciStackUse (UNPACK sz) = sz +bciStackUse LABEL{} = 0 +bciStackUse TESTLT_I{} = 0 +bciStackUse TESTEQ_I{} = 0 +bciStackUse TESTLT_F{} = 0 +bciStackUse TESTEQ_F{} = 0 +bciStackUse TESTLT_D{} = 0 +bciStackUse TESTEQ_D{} = 0 +bciStackUse TESTLT_P{} = 0 +bciStackUse TESTEQ_P{} = 0 +bciStackUse CASEFAIL{} = 0 +bciStackUse JMP{} = 0 +bciStackUse ENTER{} = 0 +bciStackUse RETURN{} = 0 +bciStackUse RETURN_UBX{} = 1 +bciStackUse CCALL{} = 0 +bciStackUse SWIZZLE{} = 0 + +-- These insns actually reduce stack use, but we need the high-tide level, +-- so can't use this info. Not that it matters much. +bciStackUse SLIDE{} = 0 +bciStackUse MKAP{} = 0 +bciStackUse MKPAP{} = 0 +bciStackUse PACK{} = 1 -- worst case is PACK 0 words +\end{code} diff --git a/compiler/ghci/ByteCodeItbls.lhs b/compiler/ghci/ByteCodeItbls.lhs new file mode 100644 index 0000000000..74346c6218 --- /dev/null +++ b/compiler/ghci/ByteCodeItbls.lhs @@ -0,0 +1,366 @@ +% +% (c) The University of Glasgow 2000 +% +\section[ByteCodeItbls]{Generate infotables for interpreter-made bytecodes} + +\begin{code} + +{-# OPTIONS -optc-DNON_POSIX_SOURCE #-} + +module ByteCodeItbls ( ItblEnv, ItblPtr, mkITbls ) where + +#include "HsVersions.h" + +import Name ( Name, getName ) +import NameEnv +import SMRep ( typeCgRep ) +import DataCon ( DataCon, dataConRepArgTys ) +import TyCon ( TyCon, tyConFamilySize, isDataTyCon, tyConDataCons ) +import Constants ( mIN_PAYLOAD_SIZE, wORD_SIZE ) +import CgHeapery ( mkVirtHeapOffsets ) +import FastString ( FastString(..) ) +import Util ( lengthIs, listLengthCmp ) + +import Foreign +import Foreign.C +import DATA_BITS ( Bits(..), shiftR ) + +import GHC.Exts ( Int(I#), addr2Int# ) +#if __GLASGOW_HASKELL__ < 503 +import Ptr ( Ptr(..) ) +#else +import GHC.Ptr ( Ptr(..) 
) +#endif +\end{code} + +%************************************************************************ +%* * +\subsection{Manufacturing of info tables for DataCons} +%* * +%************************************************************************ + +\begin{code} +type ItblPtr = Ptr StgInfoTable +type ItblEnv = NameEnv (Name, ItblPtr) + -- We need the Name in the range so we know which + -- elements to filter out when unloading a module + +mkItblEnv :: [(Name,ItblPtr)] -> ItblEnv +mkItblEnv pairs = mkNameEnv [(n, (n,p)) | (n,p) <- pairs] + + +-- Make info tables for the data decls in this module +mkITbls :: [TyCon] -> IO ItblEnv +mkITbls [] = return emptyNameEnv +mkITbls (tc:tcs) = do itbls <- mkITbl tc + itbls2 <- mkITbls tcs + return (itbls `plusNameEnv` itbls2) + +mkITbl :: TyCon -> IO ItblEnv +mkITbl tc + | not (isDataTyCon tc) + = return emptyNameEnv + | dcs `lengthIs` n -- paranoia; this is an assertion. + = make_constr_itbls dcs + where + dcs = tyConDataCons tc + n = tyConFamilySize tc + +#include "../includes/ClosureTypes.h" +cONSTR :: Int -- Defined in ClosureTypes.h +cONSTR = CONSTR + +-- Assumes constructors are numbered from zero, not one +make_constr_itbls :: [DataCon] -> IO ItblEnv +make_constr_itbls cons + | listLengthCmp cons 8 /= GT -- <= 8 elements in the list + = do is <- mapM mk_vecret_itbl (zip cons [0..]) + return (mkItblEnv is) + | otherwise + = do is <- mapM mk_dirret_itbl (zip cons [0..]) + return (mkItblEnv is) + where + mk_vecret_itbl (dcon, conNo) + = mk_itbl dcon conNo (vecret_entry conNo) + mk_dirret_itbl (dcon, conNo) + = mk_itbl dcon conNo stg_interp_constr_entry + + mk_itbl :: DataCon -> Int -> Ptr () -> IO (Name,ItblPtr) + mk_itbl dcon conNo entry_addr + = let rep_args = [ (typeCgRep arg,arg) + | arg <- dataConRepArgTys dcon ] + (tot_wds, ptr_wds, _) = mkVirtHeapOffsets False{-not a THUNK-} rep_args + + ptrs = ptr_wds + nptrs = tot_wds - ptr_wds + nptrs_really + | ptrs + nptrs >= mIN_PAYLOAD_SIZE = nptrs + | otherwise = mIN_PAYLOAD_SIZE - ptrs + itbl = StgInfoTable { + ptrs = fromIntegral ptrs, + nptrs = fromIntegral nptrs_really, + tipe = fromIntegral cONSTR, + srtlen = fromIntegral conNo, + code = code + } + -- Make a piece of code to jump to "entry_label". + -- This is the only arch-dependent bit. + code = mkJumpToAddr entry_addr + in + do addr <- malloc_exec (sizeOf itbl) + --putStrLn ("SIZE of itbl is " ++ show (sizeOf itbl)) + --putStrLn ("# ptrs of itbl is " ++ show ptrs) + --putStrLn ("# nptrs of itbl is " ++ show nptrs_really) + poke addr itbl + return (getName dcon, addr `plusPtr` (2 * wORD_SIZE)) + + +-- Make code which causes a jump to the given address. This is the +-- only arch-dependent bit of the itbl story. The returned list is +-- itblCodeLength elements (bytes) long. + +-- For sparc_TARGET_ARCH, i386_TARGET_ARCH, etc. +#include "nativeGen/NCG.h" + +itblCodeLength :: Int +itblCodeLength = length (mkJumpToAddr undefined) + +mkJumpToAddr :: Ptr () -> [ItblCode] + +ptrToInt (Ptr a#) = I# (addr2Int# a#) + +#if sparc_TARGET_ARCH +-- After some consideration, we'll try this, where +-- 0x55555555 stands in for the address to jump to. +-- According to ghc/includes/MachRegs.h, %g3 is very +-- likely indeed to be baggable. +-- +-- 0000 07155555 sethi %hi(0x55555555), %g3 +-- 0004 8610E155 or %g3, %lo(0x55555555), %g3 +-- 0008 81C0C000 jmp %g3 +-- 000c 01000000 nop + +type ItblCode = Word32 +mkJumpToAddr a + = let w32 = fromIntegral (ptrToInt a) + + hi22, lo10 :: Word32 -> Word32 + lo10 x = x .&. 0x3FF + hi22 x = (x `shiftR` 10) .&. 
0x3FFFF
+
+ in [ 0x07000000 .|. (hi22 w32),
+ 0x8610E000 .|. (lo10 w32),
+ 0x81C0C000,
+ 0x01000000 ]
+
+#elif powerpc_TARGET_ARCH
+-- We'll use r12, for no particular reason.
+-- 0xDEADBEEF stands for the address:
+-- 3D80DEAD lis r12,0xDEAD
+-- 618CBEEF ori r12,r12,0xBEEF
+-- 7D8903A6 mtctr r12
+-- 4E800420 bctr
+
+type ItblCode = Word32
+mkJumpToAddr a =
+ let w32 = fromIntegral (ptrToInt a)
+ hi16 x = (x `shiftR` 16) .&. 0xFFFF
+ lo16 x = x .&. 0xFFFF
+ in [
+ 0x3D800000 .|. hi16 w32,
+ 0x618C0000 .|. lo16 w32,
+ 0x7D8903A6, 0x4E800420
+ ]
+
+#elif i386_TARGET_ARCH
+-- Let the address to jump to be 0xWWXXYYZZ.
+-- Generate movl $0xWWXXYYZZ,%eax ; jmp *%eax
+-- which is
+-- B8 ZZ YY XX WW FF E0
+
+type ItblCode = Word8
+mkJumpToAddr a
+ = let w32 = fromIntegral (ptrToInt a) :: Word32
+ insnBytes :: [Word8]
+ insnBytes
+ = [0xB8, byte0 w32, byte1 w32,
+ byte2 w32, byte3 w32,
+ 0xFF, 0xE0]
+ in
+ insnBytes
+
+#elif x86_64_TARGET_ARCH
+-- Generates:
+-- jmpq *.L1(%rip)
+-- .align 8
+-- .L1:
+-- .quad <addr>
+--
+-- We need a full 64-bit pointer (we can't assume the info table is
+-- allocated in low memory). Assuming the info pointer is aligned to
+-- an 8-byte boundary, the addr will also be aligned.
+
+type ItblCode = Word8
+mkJumpToAddr a
+ = let w64 = fromIntegral (ptrToInt a) :: Word64
+ insnBytes :: [Word8]
+ insnBytes
+ = [0xff, 0x25, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00,
+ byte0 w64, byte1 w64, byte2 w64, byte3 w64,
+ byte4 w64, byte5 w64, byte6 w64, byte7 w64]
+ in
+ insnBytes
+
+#elif alpha_TARGET_ARCH
+type ItblCode = Word32
+mkJumpToAddr a
+ = [ 0xc3800000 -- br at, .+4
+ , 0xa79c000c -- ldq at, 12(at)
+ , 0x6bfc0000 -- jmp (at) # with zero hint -- oh well
+ , 0x47ff041f -- nop
+ , fromIntegral (w64 .&. 0x0000FFFF)
+ , fromIntegral ((w64 `shiftR` 32) .&.
0x0000FFFF) ] + where w64 = fromIntegral (ptrToInt a) :: Word64 + +#else +type ItblCode = Word32 +mkJumpToAddr a + = undefined +#endif + + +byte0, byte1, byte2, byte3, byte4, byte5, byte6, byte7 + :: (Integral w, Bits w) => w -> Word8 +byte0 w = fromIntegral w +byte1 w = fromIntegral (w `shiftR` 8) +byte2 w = fromIntegral (w `shiftR` 16) +byte3 w = fromIntegral (w `shiftR` 24) +byte4 w = fromIntegral (w `shiftR` 32) +byte5 w = fromIntegral (w `shiftR` 40) +byte6 w = fromIntegral (w `shiftR` 48) +byte7 w = fromIntegral (w `shiftR` 56) + + +vecret_entry 0 = stg_interp_constr1_entry +vecret_entry 1 = stg_interp_constr2_entry +vecret_entry 2 = stg_interp_constr3_entry +vecret_entry 3 = stg_interp_constr4_entry +vecret_entry 4 = stg_interp_constr5_entry +vecret_entry 5 = stg_interp_constr6_entry +vecret_entry 6 = stg_interp_constr7_entry +vecret_entry 7 = stg_interp_constr8_entry + +#ifndef __HADDOCK__ +-- entry point for direct returns for created constr itbls +foreign import ccall "&stg_interp_constr_entry" stg_interp_constr_entry :: Ptr () +-- and the 8 vectored ones +foreign import ccall "&stg_interp_constr1_entry" stg_interp_constr1_entry :: Ptr () +foreign import ccall "&stg_interp_constr2_entry" stg_interp_constr2_entry :: Ptr () +foreign import ccall "&stg_interp_constr3_entry" stg_interp_constr3_entry :: Ptr () +foreign import ccall "&stg_interp_constr4_entry" stg_interp_constr4_entry :: Ptr () +foreign import ccall "&stg_interp_constr5_entry" stg_interp_constr5_entry :: Ptr () +foreign import ccall "&stg_interp_constr6_entry" stg_interp_constr6_entry :: Ptr () +foreign import ccall "&stg_interp_constr7_entry" stg_interp_constr7_entry :: Ptr () +foreign import ccall "&stg_interp_constr8_entry" stg_interp_constr8_entry :: Ptr () +#endif + + + + +-- Ultra-minimalist version specially for constructors +#if SIZEOF_VOID_P == 8 +type HalfWord = Word32 +#else +type HalfWord = Word16 +#endif + +data StgInfoTable = StgInfoTable { + ptrs :: HalfWord, + nptrs :: HalfWord, + tipe :: HalfWord, + srtlen :: HalfWord, + code :: [ItblCode] +} + +instance Storable StgInfoTable where + + sizeOf itbl + = sum + [fieldSz ptrs itbl, + fieldSz nptrs itbl, + fieldSz tipe itbl, + fieldSz srtlen itbl, + fieldSz (head.code) itbl * itblCodeLength] + + alignment itbl + = SIZEOF_VOID_P + + poke a0 itbl + = runState (castPtr a0) + $ do store (ptrs itbl) + store (nptrs itbl) + store (tipe itbl) + store (srtlen itbl) + sequence_ (map store (code itbl)) + + peek a0 + = runState (castPtr a0) + $ do ptrs <- load + nptrs <- load + tipe <- load + srtlen <- load + code <- sequence (replicate itblCodeLength load) + return + StgInfoTable { + ptrs = ptrs, + nptrs = nptrs, + tipe = tipe, + srtlen = srtlen, + code = code + } + +fieldSz :: (Storable a, Storable b) => (a -> b) -> a -> Int +fieldSz sel x = sizeOf (sel x) + +newtype State s m a = State (s -> m (s, a)) + +instance Monad m => Monad (State s m) where + return a = State (\s -> return (s, a)) + State m >>= k = State (\s -> m s >>= \(s', a) -> case k a of State n -> n s') + fail str = State (\s -> fail str) + +class (Monad m, Monad (t m)) => MonadT t m where + lift :: m a -> t m a + +instance Monad m => MonadT (State s) m where + lift m = State (\s -> m >>= \a -> return (s, a)) + +runState :: (Monad m) => s -> State s m a -> m a +runState s (State m) = m s >>= return . 
snd + +type PtrIO = State (Ptr Word8) IO + +advance :: Storable a => PtrIO (Ptr a) +advance = State adv where + adv addr = case castPtr addr of { addrCast -> return + (addr `plusPtr` sizeOfPointee addrCast, addrCast) } + +sizeOfPointee :: (Storable a) => Ptr a -> Int +sizeOfPointee addr = sizeOf (typeHack addr) + where typeHack = undefined :: Ptr a -> a + +store :: Storable a => a -> PtrIO () +store x = do addr <- advance + lift (poke addr x) + +load :: Storable a => PtrIO a +load = do addr <- advance + lift (peek addr) + +foreign import ccall unsafe "stgMallocBytesRWX" + _stgMallocBytesRWX :: CInt -> IO (Ptr a) + +malloc_exec :: Int -> IO (Ptr a) +malloc_exec bytes = _stgMallocBytesRWX (fromIntegral bytes) + +\end{code} diff --git a/compiler/ghci/ByteCodeLink.lhs b/compiler/ghci/ByteCodeLink.lhs new file mode 100644 index 0000000000..875f1d6331 --- /dev/null +++ b/compiler/ghci/ByteCodeLink.lhs @@ -0,0 +1,268 @@ +% +% (c) The University of Glasgow 2000 +% +\section[ByteCodeLink]{Bytecode assembler and linker} + +\begin{code} + +{-# OPTIONS -optc-DNON_POSIX_SOURCE #-} + +module ByteCodeLink ( + HValue, + ClosureEnv, emptyClosureEnv, extendClosureEnv, + linkBCO, lookupStaticPtr + ) where + +#include "HsVersions.h" + +import ByteCodeItbls ( ItblEnv, ItblPtr ) +import ByteCodeAsm ( UnlinkedBCO(..), BCOPtr(..), sizeSS, ssElts ) +import ObjLink ( lookupSymbol ) + +import Name ( Name, nameModule, nameOccName, isExternalName ) +import NameEnv +import OccName ( occNameFS ) +import PrimOp ( PrimOp, primOpOcc ) +import Module ( moduleFS ) +import FastString ( FastString(..), unpackFS, zEncodeFS ) +import Outputable +import Panic ( GhcException(..) ) + +-- Standard libraries +import GHC.Word ( Word(..) ) + +import Data.Array.IArray ( listArray ) +import Data.Array.Base +import GHC.Arr ( STArray(..) ) + +import Control.Exception ( throwDyn ) +import Control.Monad ( zipWithM ) +import Control.Monad.ST ( stToIO ) + +import GHC.Exts ( BCO#, newBCO#, unsafeCoerce#, Int#, + ByteArray#, Array#, addrToHValue#, mkApUpd0# ) + +import GHC.Arr ( Array(..) ) +import GHC.IOBase ( IO(..) ) +import GHC.Ptr ( Ptr(..) ) +import GHC.Base ( writeArray#, RealWorld, Int(..) ) +\end{code} + + +%************************************************************************ +%* * +\subsection{Linking interpretables into something we can run} +%* * +%************************************************************************ + +\begin{code} +type ClosureEnv = NameEnv (Name, HValue) +newtype HValue = HValue (forall a . a) + +emptyClosureEnv = emptyNameEnv + +extendClosureEnv :: ClosureEnv -> [(Name,HValue)] -> ClosureEnv +extendClosureEnv cl_env pairs + = extendNameEnvList cl_env [ (n, (n,v)) | (n,v) <- pairs] +\end{code} + + +%************************************************************************ +%* * +\subsection{Linking interpretables into something we can run} +%* * +%************************************************************************ + +\begin{code} +{- +data BCO# = BCO# ByteArray# -- instrs :: Array Word16# + ByteArray# -- literals :: Array Word32# + PtrArray# -- ptrs :: Array HValue + ByteArray# -- itbls :: Array Addr# +-} + +linkBCO :: ItblEnv -> ClosureEnv -> UnlinkedBCO -> IO HValue +linkBCO ie ce ul_bco + = do BCO bco# <- linkBCO' ie ce ul_bco + -- SDM: Why do we need mkApUpd0 here? I *think* it's because + -- otherwise top-level interpreted CAFs don't get updated + -- after evaluation. A top-level BCO will evaluate itself and + -- return its value when entered, but it won't update itself. 
+ -- Wrapping the BCO in an AP_UPD thunk will take care of the + -- update for us. + -- + -- Update: the above is true, but now we also have extra invariants: + -- (a) An AP thunk *must* point directly to a BCO + -- (b) A zero-arity BCO *must* be wrapped in an AP thunk + -- (c) An AP is always fully saturated, so we *can't* wrap + -- non-zero arity BCOs in an AP thunk. + -- + if (unlinkedBCOArity ul_bco > 0) + then return (unsafeCoerce# bco#) + else case mkApUpd0# bco# of { (# final_bco #) -> return final_bco } + + +linkBCO' :: ItblEnv -> ClosureEnv -> UnlinkedBCO -> IO BCO +linkBCO' ie ce (UnlinkedBCO nm arity insns_barr bitmap literalsSS ptrsSS itblsSS) + -- Raises an IO exception on failure + = do let literals = ssElts literalsSS + ptrs = ssElts ptrsSS + itbls = ssElts itblsSS + + linked_itbls <- mapM (lookupIE ie) itbls + linked_literals <- mapM lookupLiteral literals + + let n_literals = sizeSS literalsSS + n_ptrs = sizeSS ptrsSS + n_itbls = sizeSS itblsSS + + ptrs_arr <- mkPtrsArray ie ce n_ptrs ptrs + + let + ptrs_parr = case ptrs_arr of Array lo hi parr -> parr + + itbls_arr = listArray (0, n_itbls-1) linked_itbls + :: UArray Int ItblPtr + itbls_barr = case itbls_arr of UArray lo hi barr -> barr + + literals_arr = listArray (0, n_literals-1) linked_literals + :: UArray Int Word + literals_barr = case literals_arr of UArray lo hi barr -> barr + + (I# arity#) = arity + + newBCO insns_barr literals_barr ptrs_parr itbls_barr arity# bitmap + + +-- we recursively link any sub-BCOs while making the ptrs array +mkPtrsArray :: ItblEnv -> ClosureEnv -> Int -> [BCOPtr] -> IO (Array Int HValue) +mkPtrsArray ie ce n_ptrs ptrs = do + marr <- newArray_ (0, n_ptrs-1) + let + fill (BCOPtrName n) i = do + ptr <- lookupName ce n + unsafeWrite marr i ptr + fill (BCOPtrPrimOp op) i = do + ptr <- lookupPrimOp op + unsafeWrite marr i ptr + fill (BCOPtrBCO ul_bco) i = do + BCO bco# <- linkBCO' ie ce ul_bco + writeArrayBCO marr i bco# + zipWithM fill ptrs [0..] + unsafeFreeze marr + +newtype IOArray i e = IOArray (STArray RealWorld i e) + +instance HasBounds IOArray where + bounds (IOArray marr) = bounds marr + +instance MArray IOArray e IO where + newArray lu init = stToIO $ do + marr <- newArray lu init; return (IOArray marr) + newArray_ lu = stToIO $ do + marr <- newArray_ lu; return (IOArray marr) + unsafeRead (IOArray marr) i = stToIO (unsafeRead marr i) + unsafeWrite (IOArray marr) i e = stToIO (unsafeWrite marr i e) + +-- XXX HACK: we should really have a new writeArray# primop that takes a BCO#. 
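+-- (The coercion below works because writeArray#'s element argument is an
+-- ordinary lifted type variable, which the primitive BCO# type cannot
+-- instantiate; so the primop itself is unsafeCoerce#d to a version that
+-- accepts a BCO#, hence the wish above for a dedicated primop.)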
+writeArrayBCO :: IOArray Int a -> Int -> BCO# -> IO () +writeArrayBCO (IOArray (STArray _ _ marr#)) (I# i#) bco# = IO $ \s# -> + case (unsafeCoerce# writeArray#) marr# i# bco# s# of { s# -> + (# s#, () #) } + +data BCO = BCO BCO# + +newBCO :: ByteArray# -> ByteArray# -> Array# a + -> ByteArray# -> Int# -> ByteArray# -> IO BCO +newBCO instrs lits ptrs itbls arity bitmap + = IO $ \s -> case newBCO# instrs lits ptrs itbls arity bitmap s of + (# s1, bco #) -> (# s1, BCO bco #) + + +lookupLiteral :: Either Word FastString -> IO Word +lookupLiteral (Left lit) = return lit +lookupLiteral (Right sym) = do Ptr addr <- lookupStaticPtr sym + return (W# (unsafeCoerce# addr)) + -- Can't be bothered to find the official way to convert Addr# to Word#; + -- the FFI/Foreign designers make it too damn difficult + -- Hence we apply the Blunt Instrument, which works correctly + -- on all reasonable architectures anyway + +lookupStaticPtr :: FastString -> IO (Ptr ()) +lookupStaticPtr addr_of_label_string + = do let label_to_find = unpackFS addr_of_label_string + m <- lookupSymbol label_to_find + case m of + Just ptr -> return ptr + Nothing -> linkFail "ByteCodeLink: can't find label" + label_to_find + +lookupPrimOp :: PrimOp -> IO HValue +lookupPrimOp primop + = do let sym_to_find = primopToCLabel primop "closure" + m <- lookupSymbol sym_to_find + case m of + Just (Ptr addr) -> case addrToHValue# addr of + (# hval #) -> return hval + Nothing -> linkFail "ByteCodeLink.lookupCE(primop)" sym_to_find + +lookupName :: ClosureEnv -> Name -> IO HValue +lookupName ce nm + = case lookupNameEnv ce nm of + Just (_,aa) -> return aa + Nothing + -> ASSERT2(isExternalName nm, ppr nm) + do let sym_to_find = nameToCLabel nm "closure" + m <- lookupSymbol sym_to_find + case m of + Just (Ptr addr) -> case addrToHValue# addr of + (# hval #) -> return hval + Nothing -> linkFail "ByteCodeLink.lookupCE" sym_to_find + +lookupIE :: ItblEnv -> Name -> IO (Ptr a) +lookupIE ie con_nm + = case lookupNameEnv ie con_nm of + Just (_, Ptr a) -> return (Ptr a) + Nothing + -> do -- try looking up in the object files. + let sym_to_find1 = nameToCLabel con_nm "con_info" + m <- lookupSymbol sym_to_find1 + case m of + Just addr -> return addr + Nothing + -> do -- perhaps a nullary constructor? + let sym_to_find2 = nameToCLabel con_nm "static_info" + n <- lookupSymbol sym_to_find2 + case n of + Just addr -> return addr + Nothing -> linkFail "ByteCodeLink.lookupIE" + (sym_to_find1 ++ " or " ++ sym_to_find2) + +linkFail :: String -> String -> IO a +linkFail who what + = throwDyn (ProgramError $ + unlines [ "" + , "During interactive linking, GHCi couldn't find the following symbol:" + , ' ' : ' ' : what + , "This may be due to you not asking GHCi to load extra object files," + , "archives or DLLs needed by your current session. Restart GHCi, specifying" + , "the missing library using the -L/path/to/object/dir and -lmissinglibname" + , "flags, or simply by naming the relevant files on the GHCi command line." + , "Alternatively, this link failure might indicate a bug in GHCi." + , "If you suspect the latter, please send a bug report to:" + , " glasgow-haskell-bugs@haskell.org" + ]) + +-- HACKS!!! 
ToDo: cleaner +nameToCLabel :: Name -> String{-suffix-} -> String +nameToCLabel n suffix + = unpackFS (zEncodeFS (moduleFS (nameModule n))) + ++ '_': unpackFS (zEncodeFS (occNameFS (nameOccName n))) ++ '_':suffix + +primopToCLabel :: PrimOp -> String{-suffix-} -> String +primopToCLabel primop suffix + = let str = "GHCziPrimopWrappers_" ++ unpackFS (zEncodeFS (occNameFS (primOpOcc primop))) ++ '_':suffix + in --trace ("primopToCLabel: " ++ str) + str +\end{code} + diff --git a/compiler/ghci/InteractiveUI.hs b/compiler/ghci/InteractiveUI.hs new file mode 100644 index 0000000000..9e9c262052 --- /dev/null +++ b/compiler/ghci/InteractiveUI.hs @@ -0,0 +1,1534 @@ +{-# OPTIONS -#include "Linker.h" #-} +----------------------------------------------------------------------------- +-- +-- GHC Interactive User Interface +-- +-- (c) The GHC Team 2005 +-- +----------------------------------------------------------------------------- +module InteractiveUI ( + interactiveUI, + ghciWelcomeMsg + ) where + +#include "HsVersions.h" + +#if defined(GHCI) && defined(BREAKPOINT) +import GHC.Exts ( Int(..), Ptr(..), int2Addr# ) +import Foreign.StablePtr ( deRefStablePtr, castPtrToStablePtr ) +import System.IO.Unsafe ( unsafePerformIO ) +import Var ( Id, globaliseId, idName, idType ) +import HscTypes ( Session(..), InteractiveContext(..), HscEnv(..) + , extendTypeEnvWithIds ) +import RdrName ( extendLocalRdrEnv, mkRdrUnqual, lookupLocalRdrEnv ) +import NameEnv ( delListFromNameEnv ) +import TcType ( tidyTopType ) +import qualified Id ( setIdType ) +import IdInfo ( GlobalIdDetails(..) ) +import Linker ( HValue, extendLinkEnv, withExtendedLinkEnv,initDynLinker ) +import PrelNames ( breakpointJumpName ) +#endif + +-- The GHC interface +import qualified GHC +import GHC ( Session, verbosity, dopt, DynFlag(..), Target(..), + TargetId(..), DynFlags(..), + pprModule, Type, Module, SuccessFlag(..), + TyThing(..), Name, LoadHowMuch(..), Phase, + GhcException(..), showGhcException, + CheckedModule(..), SrcLoc ) +import DynFlags ( allFlags ) +import Packages ( PackageState(..) ) +import PackageConfig ( InstalledPackageInfo(..) ) +import UniqFM ( eltsUFM ) +import PprTyThing +import Outputable + +-- for createtags (should these come via GHC?) +import Module ( moduleString ) +import Name ( nameSrcLoc, nameModule, nameOccName ) +import OccName ( pprOccName ) +import SrcLoc ( isGoodSrcLoc, srcLocFile, srcLocLine, srcLocCol ) + +-- Other random utilities +import Digraph ( flattenSCCs ) +import BasicTypes ( failed, successIf ) +import Panic ( panic, installSignalHandlers ) +import Config +import StaticFlags ( opt_IgnoreDotGhci ) +import Linker ( showLinkerState ) +import Util ( removeSpaces, handle, global, toArgs, + looksLikeModuleName, prefixMatch, sortLe ) + +#ifndef mingw32_HOST_OS +import System.Posix +#if __GLASGOW_HASKELL__ > 504 + hiding (getEnv) +#endif +#else +import GHC.ConsoleHandler ( flushConsole ) +#endif + +#ifdef USE_READLINE +import Control.Concurrent ( yield ) -- Used in readline loop +import System.Console.Readline as Readline +#endif + +--import SystemExts + +import Control.Exception as Exception +import Data.Dynamic +-- import Control.Concurrent + +import Numeric +import Data.List +import Data.Int ( Int64 ) +import Data.Maybe ( isJust, fromMaybe, catMaybes ) +import System.Cmd +import System.CPUTime +import System.Environment +import System.Exit ( exitWith, ExitCode(..) 
) +import System.Directory +import System.IO +import System.IO.Error as IO +import Data.Char +import Control.Monad as Monad +import Foreign.StablePtr ( newStablePtr ) +import Text.Printf + +import GHC.Exts ( unsafeCoerce# ) +import GHC.IOBase ( IOErrorType(InvalidArgument) ) + +import Data.IORef ( IORef, newIORef, readIORef, writeIORef ) + +import System.Posix.Internals ( setNonBlockingFD ) + +----------------------------------------------------------------------------- + +ghciWelcomeMsg = + " ___ ___ _\n"++ + " / _ \\ /\\ /\\/ __(_)\n"++ + " / /_\\// /_/ / / | | GHC Interactive, version " ++ cProjectVersion ++ ", for Haskell 98.\n"++ + "/ /_\\\\/ __ / /___| | http://www.haskell.org/ghc/\n"++ + "\\____/\\/ /_/\\____/|_| Type :? for help.\n" + +type Command = (String, String -> GHCi Bool, Bool, String -> IO [String]) +cmdName (n,_,_,_) = n + +GLOBAL_VAR(commands, builtin_commands, [Command]) + +builtin_commands :: [Command] +builtin_commands = [ + ("add", keepGoingPaths addModule, False, completeFilename), + ("browse", keepGoing browseCmd, False, completeModule), + ("cd", keepGoing changeDirectory, False, completeFilename), + ("def", keepGoing defineMacro, False, completeIdentifier), + ("help", keepGoing help, False, completeNone), + ("?", keepGoing help, False, completeNone), + ("info", keepGoing info, False, completeIdentifier), + ("load", keepGoingPaths loadModule_, False, completeHomeModuleOrFile), + ("module", keepGoing setContext, False, completeModule), + ("main", keepGoing runMain, False, completeIdentifier), + ("reload", keepGoing reloadModule, False, completeNone), + ("check", keepGoing checkModule, False, completeHomeModule), + ("set", keepGoing setCmd, True, completeSetOptions), + ("show", keepGoing showCmd, False, completeNone), + ("etags", keepGoing createETagsFileCmd, False, completeFilename), + ("ctags", keepGoing createCTagsFileCmd, False, completeFilename), + ("type", keepGoing typeOfExpr, False, completeIdentifier), + ("kind", keepGoing kindOfType, False, completeIdentifier), + ("unset", keepGoing unsetOptions, True, completeSetOptions), + ("undef", keepGoing undefineMacro, False, completeMacro), + ("quit", quit, False, completeNone) + ] + +keepGoing :: (String -> GHCi ()) -> (String -> GHCi Bool) +keepGoing a str = a str >> return False + +keepGoingPaths :: ([FilePath] -> GHCi ()) -> (String -> GHCi Bool) +keepGoingPaths a str = a (toArgs str) >> return False + +shortHelpText = "use :? for help.\n" + +-- NOTE: spaces at the end of each line to workaround CPP/string gap bug. +helpText = + " Commands available from the prompt:\n" ++ + "\n" ++ + " <stmt> evaluate/run <stmt>\n" ++ + " :add <filename> ... add module(s) to the current target set\n" ++ + " :browse [*]<module> display the names defined by <module>\n" ++ + " :cd <dir> change directory to <dir>\n" ++ + " :def <cmd> <expr> define a command :<cmd>\n" ++ + " :help, :? display this list of commands\n" ++ + " :info [<name> ...] display information about the given names\n" ++ + " :load <filename> ... load module(s) and their dependents\n" ++ + " :module [+/-] [*]<mod> ... set the context for expression evaluation\n" ++ + " :main [<arguments> ...] run the main function with the given arguments\n" ++ + " :reload reload the current module set\n" ++ + "\n" ++ + " :set <option> ... set options\n" ++ + " :set args <arg> ... 
set the arguments returned by System.getArgs\n" ++
+ " :set prog <progname> set the value returned by System.getProgName\n" ++
+ " :set prompt <prompt> set the prompt used in GHCi\n" ++
+ "\n" ++
+ " :show modules show the currently loaded modules\n" ++
+ " :show bindings show the current bindings made at the prompt\n" ++
+ "\n" ++
+ " :ctags [<file>] create tags file for Vi (default: \"tags\")\n" ++
+ " :etags [<file>] create tags file for Emacs (default: \"TAGS\")\n" ++
+ " :type <expr> show the type of <expr>\n" ++
+ " :kind <type> show the kind of <type>\n" ++
+ " :undef <cmd> undefine user-defined command :<cmd>\n" ++
+ " :unset <option> ... unset options\n" ++
+ " :quit exit GHCi\n" ++
+ " :!<command> run the shell command <command>\n" ++
+ "\n" ++
+ " Options for ':set' and ':unset':\n" ++
+ "\n" ++
+ " +r revert top-level expressions after each evaluation\n" ++
+ " +s print timing/memory stats after each evaluation\n" ++
+ " +t print type after evaluation\n" ++
+ " -<flags> most GHC command line flags can also be set here\n" ++
+ " (eg. -v2, -fglasgow-exts, etc.)\n"
+
+
+#if defined(GHCI) && defined(BREAKPOINT)
+globaliseAndTidy :: Id -> Id
+globaliseAndTidy id
+-- Give the Id a Global Name, and tidy its type
+ = Id.setIdType (globaliseId VanillaGlobal id) tidy_type
+ where
+ tidy_type = tidyTopType (idType id)
+
+
+printScopeMsg :: Session -> String -> [Id] -> IO ()
+printScopeMsg session location ids
+ = GHC.getPrintUnqual session >>= \unqual ->
+ printForUser stdout unqual $
+ text "Local bindings in scope:" $$
+ nest 2 (pprWithCommas showId ids)
+ where showId id = ppr (idName id) <+> dcolon <+> ppr (idType id)
+
+jumpFunction :: Session -> Int -> [HValue] -> String -> b -> b
+jumpFunction session@(Session ref) (I# idsPtr) hValues location b
+ = unsafePerformIO $
+ do ids <- deRefStablePtr (castPtrToStablePtr (Ptr (int2Addr# idsPtr)))
+ let names = map idName ids
+ ASSERT (length names == length hValues) return ()
+ printScopeMsg session location ids
+ hsc_env <- readIORef ref
+
+ let ictxt = hsc_IC hsc_env
+ global_ids = map globaliseAndTidy ids
+ rn_env = ic_rn_local_env ictxt
+ type_env = ic_type_env ictxt
+ bound_names = map idName global_ids
+ new_rn_env = extendLocalRdrEnv rn_env bound_names
+ -- Remove any shadowed bindings from the type_env;
+ -- they are inaccessible but might, I suppose, cause
+ -- a space leak if we leave them there
+ shadowed = [ n | name <- bound_names,
+ let rdr_name = mkRdrUnqual (nameOccName name),
+ Just n <- [lookupLocalRdrEnv rn_env rdr_name] ]
+ filtered_type_env = delListFromNameEnv type_env shadowed
+ new_type_env = extendTypeEnvWithIds filtered_type_env global_ids
+ new_ic = ictxt { ic_rn_local_env = new_rn_env,
+ ic_type_env = new_type_env }
+ writeIORef ref (hsc_env { hsc_IC = new_ic })
+ withExtendedLinkEnv (zip names hValues) $
+ startGHCi (runGHCi [] Nothing)
+ GHCiState{ progname = "<interactive>",
+ args = [],
+ prompt = location++"> ",
+ session = session,
+ options = [] }
+ writeIORef ref hsc_env
+ putStrLn $ "Returning to normal execution..."
+ return b
+#endif
+
+interactiveUI :: Session -> [(FilePath, Maybe Phase)] -> Maybe String -> IO ()
+interactiveUI session srcs maybe_expr = do
+#if defined(GHCI) && defined(BREAKPOINT)
+ initDynLinker =<< GHC.getSessionDynFlags session
+ extendLinkEnv [(breakpointJumpName,unsafeCoerce# (jumpFunction session))]
+#endif
+ -- HACK!
If we happen to get into an infinite loop (eg the user + -- types 'let x=x in x' at the prompt), then the thread will block + -- on a blackhole, and become unreachable during GC. The GC will + -- detect that it is unreachable and send it the NonTermination + -- exception. However, since the thread is unreachable, everything + -- it refers to might be finalized, including the standard Handles. + -- This sounds like a bug, but we don't have a good solution right + -- now. + newStablePtr stdin + newStablePtr stdout + newStablePtr stderr + + hFlush stdout + hSetBuffering stdout NoBuffering + + -- Initialise buffering for the *interpreted* I/O system + initInterpBuffering session + + -- We don't want the cmd line to buffer any input that might be + -- intended for the program, so unbuffer stdin. + hSetBuffering stdin NoBuffering + + -- initial context is just the Prelude + GHC.setContext session [] [prelude_mod] + +#ifdef USE_READLINE + Readline.initialize + Readline.setAttemptedCompletionFunction (Just completeWord) + --Readline.parseAndBind "set show-all-if-ambiguous 1" + + let symbols = "!#$%&*+/<=>?@\\^|-~" + specials = "(),;[]`{}" + spaces = " \t\n" + word_break_chars = spaces ++ specials ++ symbols + + Readline.setBasicWordBreakCharacters word_break_chars + Readline.setCompleterWordBreakCharacters word_break_chars +#endif + + startGHCi (runGHCi srcs maybe_expr) + GHCiState{ progname = "<interactive>", + args = [], + prompt = "%s> ", + session = session, + options = [] } + +#ifdef USE_READLINE + Readline.resetTerminal Nothing +#endif + + return () + +runGHCi :: [(FilePath, Maybe Phase)] -> Maybe String -> GHCi () +runGHCi paths maybe_expr = do + let read_dot_files = not opt_IgnoreDotGhci + + when (read_dot_files) $ do + -- Read in ./.ghci. + let file = "./.ghci" + exists <- io (doesFileExist file) + when exists $ do + dir_ok <- io (checkPerms ".") + file_ok <- io (checkPerms file) + when (dir_ok && file_ok) $ do + either_hdl <- io (IO.try (openFile "./.ghci" ReadMode)) + case either_hdl of + Left e -> return () + Right hdl -> fileLoop hdl False + + when (read_dot_files) $ do + -- Read in $HOME/.ghci + either_dir <- io (IO.try (getEnv "HOME")) + case either_dir of + Left e -> return () + Right dir -> do + cwd <- io (getCurrentDirectory) + when (dir /= cwd) $ do + let file = dir ++ "/.ghci" + ok <- io (checkPerms file) + when ok $ do + either_hdl <- io (IO.try (openFile file ReadMode)) + case either_hdl of + Left e -> return () + Right hdl -> fileLoop hdl False + + -- Perform a :load for files given on the GHCi command line + -- When in -e mode, if the load fails then we want to stop + -- immediately rather than going on to evaluate the expression. + when (not (null paths)) $ do + ok <- ghciHandle (\e -> do showException e; return Failed) $ + loadModule paths + when (isJust maybe_expr && failed ok) $ + io (exitWith (ExitFailure 1)) + + -- if verbosity is greater than 0, or we are connected to a + -- terminal, display the prompt in the interactive loop. + is_tty <- io (hIsTerminalDevice stdin) + dflags <- getDynFlags + let show_prompt = verbosity dflags > 0 || is_tty + + case maybe_expr of + Nothing -> +#if defined(mingw32_HOST_OS) + do + -- The win32 Console API mutates the first character of + -- type-ahead when reading from it in a non-buffered manner. Work + -- around this by flushing the input buffer of type-ahead characters, + -- but only if stdin is available. 
+ flushed <- io (IO.try (GHC.ConsoleHandler.flushConsole stdin)) + case flushed of + Left err | isDoesNotExistError err -> return () + | otherwise -> io (ioError err) + Right () -> return () +#endif + -- enter the interactive loop + interactiveLoop is_tty show_prompt + Just expr -> do + -- just evaluate the expression we were given + runCommandEval expr + return () + + -- and finally, exit + io $ do when (verbosity dflags > 0) $ putStrLn "Leaving GHCi." + + +interactiveLoop is_tty show_prompt = + -- Ignore ^C exceptions caught here + ghciHandleDyn (\e -> case e of + Interrupted -> do +#if defined(mingw32_HOST_OS) + io (putStrLn "") +#endif + interactiveLoop is_tty show_prompt + _other -> return ()) $ + + ghciUnblock $ do -- unblock necessary if we recursed from the + -- exception handler above. + + -- read commands from stdin +#ifdef USE_READLINE + if (is_tty) + then readlineLoop + else fileLoop stdin show_prompt +#else + fileLoop stdin show_prompt +#endif + + +-- NOTE: We only read .ghci files if they are owned by the current user, +-- and aren't world writable. Otherwise, we could be accidentally +-- running code planted by a malicious third party. + +-- Furthermore, We only read ./.ghci if . is owned by the current user +-- and isn't writable by anyone else. I think this is sufficient: we +-- don't need to check .. and ../.. etc. because "." always refers to +-- the same directory while a process is running. + +checkPerms :: String -> IO Bool +checkPerms name = +#ifdef mingw32_HOST_OS + return True +#else + Util.handle (\_ -> return False) $ do + st <- getFileStatus name + me <- getRealUserID + if fileOwner st /= me then do + putStrLn $ "WARNING: " ++ name ++ " is owned by someone else, IGNORING!" + return False + else do + let mode = fileMode st + if (groupWriteMode == (mode `intersectFileModes` groupWriteMode)) + || (otherWriteMode == (mode `intersectFileModes` otherWriteMode)) + then do + putStrLn $ "*** WARNING: " ++ name ++ + " is writable by someone else, IGNORING!" + return False + else return True +#endif + +fileLoop :: Handle -> Bool -> GHCi () +fileLoop hdl show_prompt = do + session <- getSession + (mod,imports) <- io (GHC.getContext session) + st <- getGHCiState + when show_prompt (io (putStr (mkPrompt mod imports (prompt st)))) + l <- io (IO.try (hGetLine hdl)) + case l of + Left e | isEOFError e -> return () + | InvalidArgument <- etype -> return () + | otherwise -> io (ioError e) + where etype = ioeGetErrorType e + -- treat InvalidArgument in the same way as EOF: + -- this can happen if the user closed stdin, or + -- perhaps did getContents which closes stdin at + -- EOF. 
+ Right l -> + case removeSpaces l of + "" -> fileLoop hdl show_prompt + l -> do quit <- runCommand l + if quit then return () else fileLoop hdl show_prompt + +stringLoop :: [String] -> GHCi () +stringLoop [] = return () +stringLoop (s:ss) = do + case removeSpaces s of + "" -> stringLoop ss + l -> do quit <- runCommand l + if quit then return () else stringLoop ss + +mkPrompt toplevs exports prompt + = showSDoc $ f prompt + where + f ('%':'s':xs) = perc_s <> f xs + f ('%':'%':xs) = char '%' <> f xs + f (x:xs) = char x <> f xs + f [] = empty + + perc_s = hsep (map (\m -> char '*' <> pprModule m) toplevs) <+> + hsep (map pprModule exports) + + +#ifdef USE_READLINE +readlineLoop :: GHCi () +readlineLoop = do + session <- getSession + (mod,imports) <- io (GHC.getContext session) + io yield + saveSession -- for use by completion + st <- getGHCiState + l <- io (readline (mkPrompt mod imports (prompt st)) + `finally` setNonBlockingFD 0) + -- readline sometimes puts stdin into blocking mode, + -- so we need to put it back for the IO library + splatSavedSession + case l of + Nothing -> return () + Just l -> + case removeSpaces l of + "" -> readlineLoop + l -> do + io (addHistory l) + quit <- runCommand l + if quit then return () else readlineLoop +#endif + +runCommand :: String -> GHCi Bool +runCommand c = ghciHandle handler (doCommand c) + where + doCommand (':' : command) = specialCommand command + doCommand stmt + = do timeIt (do nms <- runStmt stmt; finishEvalExpr nms) + return False + +-- This version is for the GHC command-line option -e. The only difference +-- from runCommand is that it catches the ExitException exception and +-- exits, rather than printing out the exception. +runCommandEval c = ghciHandle handleEval (doCommand c) + where + handleEval (ExitException code) = io (exitWith code) + handleEval e = do showException e + io (exitWith (ExitFailure 1)) + + doCommand (':' : command) = specialCommand command + doCommand stmt + = do nms <- runStmt stmt + case nms of + Nothing -> io (exitWith (ExitFailure 1)) + -- failure to run the command causes exit(1) for ghc -e. + _ -> finishEvalExpr nms + +-- This is the exception handler for exceptions generated by the +-- user's code; it normally just prints out the exception. The +-- handler must be recursive, in case showing the exception causes +-- more exceptions to be raised. +-- +-- Bugfix: if the user closed stdout or stderr, the flushing will fail, +-- raising another exception. We therefore don't put the recursive +-- handler arond the flushing operation, so if stderr is closed +-- GHCi will just die gracefully rather than going into an infinite loop. 
+handler :: Exception -> GHCi Bool +handler exception = do + flushInterpBuffers + io installSignalHandlers + ghciHandle handler (showException exception >> return False) + +showException (DynException dyn) = + case fromDynamic dyn of + Nothing -> io (putStrLn ("*** Exception: (unknown)")) + Just Interrupted -> io (putStrLn "Interrupted.") + Just (CmdLineError s) -> io (putStrLn s) -- omit the location for CmdLineError + Just ph@PhaseFailed{} -> io (putStrLn (showGhcException ph "")) -- ditto + Just other_ghc_ex -> io (print other_ghc_ex) + +showException other_exception + = io (putStrLn ("*** Exception: " ++ show other_exception)) + +runStmt :: String -> GHCi (Maybe [Name]) +runStmt stmt + | null (filter (not.isSpace) stmt) = return (Just []) + | otherwise + = do st <- getGHCiState + session <- getSession + result <- io $ withProgName (progname st) $ withArgs (args st) $ + GHC.runStmt session stmt + case result of + GHC.RunFailed -> return Nothing + GHC.RunException e -> throw e -- this is caught by runCommand(Eval) + GHC.RunOk names -> return (Just names) + +-- possibly print the type and revert CAFs after evaluating an expression +finishEvalExpr mb_names + = do b <- isOptionSet ShowType + session <- getSession + case mb_names of + Nothing -> return () + Just names -> when b (mapM_ (showTypeOfName session) names) + + flushInterpBuffers + io installSignalHandlers + b <- isOptionSet RevertCAFs + io (when b revertCAFs) + return True + +showTypeOfName :: Session -> Name -> GHCi () +showTypeOfName session n + = do maybe_tything <- io (GHC.lookupName session n) + case maybe_tything of + Nothing -> return () + Just thing -> showTyThing thing + +showForUser :: SDoc -> GHCi String +showForUser doc = do + session <- getSession + unqual <- io (GHC.getPrintUnqual session) + return $! 
showSDocForUser unqual doc + +specialCommand :: String -> GHCi Bool +specialCommand ('!':str) = shellEscape (dropWhile isSpace str) +specialCommand str = do + let (cmd,rest) = break isSpace str + maybe_cmd <- io (lookupCommand cmd) + case maybe_cmd of + Nothing -> io (hPutStr stdout ("unknown command ':" ++ cmd ++ "'\n" + ++ shortHelpText) >> return False) + Just (_,f,_,_) -> f (dropWhile isSpace rest) + +lookupCommand :: String -> IO (Maybe Command) +lookupCommand str = do + cmds <- readIORef commands + -- look for exact match first, then the first prefix match + case [ c | c <- cmds, str == cmdName c ] of + c:_ -> return (Just c) + [] -> case [ c | c@(s,_,_,_) <- cmds, prefixMatch str s ] of + [] -> return Nothing + c:_ -> return (Just c) + +----------------------------------------------------------------------------- +-- To flush buffers for the *interpreted* computation we need +-- to refer to *its* stdout/stderr handles + +GLOBAL_VAR(flush_interp, error "no flush_interp", IO ()) +GLOBAL_VAR(turn_off_buffering, error "no flush_stdout", IO ()) + +no_buf_cmd = "System.IO.hSetBuffering System.IO.stdout System.IO.NoBuffering" ++ + " Prelude.>> System.IO.hSetBuffering System.IO.stderr System.IO.NoBuffering" +flush_cmd = "System.IO.hFlush System.IO.stdout Prelude.>> System.IO.hFlush IO.stderr" + +initInterpBuffering :: Session -> IO () +initInterpBuffering session + = do maybe_hval <- GHC.compileExpr session no_buf_cmd + + case maybe_hval of + Just hval -> writeIORef turn_off_buffering (unsafeCoerce# hval :: IO ()) + other -> panic "interactiveUI:setBuffering" + + maybe_hval <- GHC.compileExpr session flush_cmd + case maybe_hval of + Just hval -> writeIORef flush_interp (unsafeCoerce# hval :: IO ()) + _ -> panic "interactiveUI:flush" + + turnOffBuffering -- Turn it off right now + + return () + + +flushInterpBuffers :: GHCi () +flushInterpBuffers + = io $ do Monad.join (readIORef flush_interp) + return () + +turnOffBuffering :: IO () +turnOffBuffering + = do Monad.join (readIORef turn_off_buffering) + return () + +----------------------------------------------------------------------------- +-- Commands + +help :: String -> GHCi () +help _ = io (putStr helpText) + +info :: String -> GHCi () +info "" = throwDyn (CmdLineError "syntax: ':i <thing-you-want-info-about>'") +info s = do { let names = words s + ; session <- getSession + ; dflags <- getDynFlags + ; let exts = dopt Opt_GlasgowExts dflags + ; mapM_ (infoThing exts session) names } + where + infoThing exts session str = io $ do + names <- GHC.parseName session str + let filtered = filterOutChildren names + mb_stuffs <- mapM (GHC.getInfo session) filtered + unqual <- GHC.getPrintUnqual session + putStrLn (showSDocForUser unqual $ + vcat (intersperse (text "") $ + [ pprInfo exts stuff | Just stuff <- mb_stuffs ])) + + -- Filter out names whose parent is also there Good + -- example is '[]', which is both a type and data + -- constructor in the same type +filterOutChildren :: [Name] -> [Name] +filterOutChildren names = filter (not . 
parent_is_there) names + where parent_is_there n + | Just p <- GHC.nameParent_maybe n = p `elem` names + | otherwise = False + +pprInfo exts (thing, fixity, insts) + = pprTyThingInContextLoc exts thing + $$ show_fixity fixity + $$ vcat (map GHC.pprInstance insts) + where + show_fixity fix + | fix == GHC.defaultFixity = empty + | otherwise = ppr fix <+> ppr (GHC.getName thing) + +----------------------------------------------------------------------------- +-- Commands + +runMain :: String -> GHCi () +runMain args = do + let ss = concat $ intersperse "," (map (\ s -> ('"':s)++"\"") (toArgs args)) + runCommand $ '[': ss ++ "] `System.Environment.withArgs` main" + return () + +addModule :: [FilePath] -> GHCi () +addModule files = do + io (revertCAFs) -- always revert CAFs on load/add. + files <- mapM expandPath files + targets <- mapM (\m -> io (GHC.guessTarget m Nothing)) files + session <- getSession + io (mapM_ (GHC.addTarget session) targets) + ok <- io (GHC.load session LoadAllTargets) + afterLoad ok session + +changeDirectory :: String -> GHCi () +changeDirectory dir = do + session <- getSession + graph <- io (GHC.getModuleGraph session) + when (not (null graph)) $ + io $ putStr "Warning: changing directory causes all loaded modules to be unloaded,\nbecause the search path has changed.\n" + io (GHC.setTargets session []) + io (GHC.load session LoadAllTargets) + setContextAfterLoad session [] + io (GHC.workingDirectoryChanged session) + dir <- expandPath dir + io (setCurrentDirectory dir) + +defineMacro :: String -> GHCi () +defineMacro s = do + let (macro_name, definition) = break isSpace s + cmds <- io (readIORef commands) + if (null macro_name) + then throwDyn (CmdLineError "invalid macro name") + else do + if (macro_name `elem` map cmdName cmds) + then throwDyn (CmdLineError + ("command '" ++ macro_name ++ "' is already defined")) + else do + + -- give the expression a type signature, so we can be sure we're getting + -- something of the right type. + let new_expr = '(' : definition ++ ") :: String -> IO String" + + -- compile the expression + cms <- getSession + maybe_hv <- io (GHC.compileExpr cms new_expr) + case maybe_hv of + Nothing -> return () + Just hv -> io (writeIORef commands -- + (cmds ++ [(macro_name, keepGoing (runMacro hv), False, completeNone)])) + +runMacro :: GHC.HValue{-String -> IO String-} -> String -> GHCi () +runMacro fun s = do + str <- io ((unsafeCoerce# fun :: String -> IO String) s) + stringLoop (lines str) + +undefineMacro :: String -> GHCi () +undefineMacro macro_name = do + cmds <- io (readIORef commands) + if (macro_name `elem` map cmdName builtin_commands) + then throwDyn (CmdLineError + ("command '" ++ macro_name ++ "' cannot be undefined")) + else do + if (macro_name `notElem` map cmdName cmds) + then throwDyn (CmdLineError + ("command '" ++ macro_name ++ "' not defined")) + else do + io (writeIORef commands (filter ((/= macro_name) . 
cmdName) cmds)) + + +loadModule :: [(FilePath, Maybe Phase)] -> GHCi SuccessFlag +loadModule fs = timeIt (loadModule' fs) + +loadModule_ :: [FilePath] -> GHCi () +loadModule_ fs = do loadModule (zip fs (repeat Nothing)); return () + +loadModule' :: [(FilePath, Maybe Phase)] -> GHCi SuccessFlag +loadModule' files = do + session <- getSession + + -- unload first + io (GHC.setTargets session []) + io (GHC.load session LoadAllTargets) + + -- expand tildes + let (filenames, phases) = unzip files + exp_filenames <- mapM expandPath filenames + let files' = zip exp_filenames phases + targets <- io (mapM (uncurry GHC.guessTarget) files') + + -- NOTE: we used to do the dependency anal first, so that if it + -- fails we didn't throw away the current set of modules. This would + -- require some re-working of the GHC interface, so we'll leave it + -- as a ToDo for now. + + io (GHC.setTargets session targets) + ok <- io (GHC.load session LoadAllTargets) + afterLoad ok session + return ok + +checkModule :: String -> GHCi () +checkModule m = do + let modl = GHC.mkModule m + session <- getSession + result <- io (GHC.checkModule session modl) + case result of + Nothing -> io $ putStrLn "Nothing" + Just r -> io $ putStrLn (showSDoc ( + case checkedModuleInfo r of + Just cm | Just scope <- GHC.modInfoTopLevelScope cm -> + let + (local,global) = partition ((== modl) . GHC.nameModule) scope + in + (text "global names: " <+> ppr global) $$ + (text "local names: " <+> ppr local) + _ -> empty)) + afterLoad (successIf (isJust result)) session + +reloadModule :: String -> GHCi () +reloadModule "" = do + io (revertCAFs) -- always revert CAFs on reload. + session <- getSession + ok <- io (GHC.load session LoadAllTargets) + afterLoad ok session +reloadModule m = do + io (revertCAFs) -- always revert CAFs on reload. + session <- getSession + ok <- io (GHC.load session (LoadUpTo (GHC.mkModule m))) + afterLoad ok session + +afterLoad ok session = do + io (revertCAFs) -- always revert CAFs on load. + graph <- io (GHC.getModuleGraph session) + graph' <- filterM (io . GHC.isLoaded session . GHC.ms_mod) graph + setContextAfterLoad session graph' + modulesLoadedMsg ok (map GHC.ms_mod graph') +#if defined(GHCI) && defined(BREAKPOINT) + io (extendLinkEnv [(breakpointJumpName,unsafeCoerce# (jumpFunction session))]) +#endif + +setContextAfterLoad session [] = do + io (GHC.setContext session [] [prelude_mod]) +setContextAfterLoad session ms = do + -- load a target if one is available, otherwise load the topmost module. + targets <- io (GHC.getTargets session) + case [ m | Just m <- map (findTarget ms) targets ] of + [] -> + let graph' = flattenSCCs (GHC.topSortModuleGraph True ms Nothing) in + load_this (last graph') + (m:_) -> + load_this m + where + findTarget ms t + = case filter (`matches` t) ms of + [] -> Nothing + (m:_) -> Just m + + summary `matches` Target (TargetModule m) _ + = GHC.ms_mod summary == m + summary `matches` Target (TargetFile f _) _ + | Just f' <- GHC.ml_hs_file (GHC.ms_location summary) = f == f' + summary `matches` target + = False + + load_this summary | m <- GHC.ms_mod summary = do + b <- io (GHC.moduleIsInterpreted session m) + if b then io (GHC.setContext session [m] []) + else io (GHC.setContext session [] [prelude_mod,m]) + + +modulesLoadedMsg :: SuccessFlag -> [Module] -> GHCi () +modulesLoadedMsg ok mods = do + dflags <- getDynFlags + when (verbosity dflags > 0) $ do + let mod_commas + | null mods = text "none." + | otherwise = hsep ( + punctuate comma (map pprModule mods)) <> text "." 
+ case ok of + Failed -> + io (putStrLn (showSDoc (text "Failed, modules loaded: " <> mod_commas))) + Succeeded -> + io (putStrLn (showSDoc (text "Ok, modules loaded: " <> mod_commas))) + + +typeOfExpr :: String -> GHCi () +typeOfExpr str + = do cms <- getSession + maybe_ty <- io (GHC.exprType cms str) + case maybe_ty of + Nothing -> return () + Just ty -> do ty' <- cleanType ty + tystr <- showForUser (ppr ty') + io (putStrLn (str ++ " :: " ++ tystr)) + +kindOfType :: String -> GHCi () +kindOfType str + = do cms <- getSession + maybe_ty <- io (GHC.typeKind cms str) + case maybe_ty of + Nothing -> return () + Just ty -> do tystr <- showForUser (ppr ty) + io (putStrLn (str ++ " :: " ++ tystr)) + +quit :: String -> GHCi Bool +quit _ = return True + +shellEscape :: String -> GHCi Bool +shellEscape str = io (system str >> return False) + +----------------------------------------------------------------------------- +-- create tags file for currently loaded modules. + +createETagsFileCmd, createCTagsFileCmd :: String -> GHCi () + +createCTagsFileCmd "" = ghciCreateTagsFile CTags "tags" +createCTagsFileCmd file = ghciCreateTagsFile CTags file + +createETagsFileCmd "" = ghciCreateTagsFile ETags "TAGS" +createETagsFileCmd file = ghciCreateTagsFile ETags file + +data TagsKind = ETags | CTags + +ghciCreateTagsFile :: TagsKind -> FilePath -> GHCi () +ghciCreateTagsFile kind file = do + session <- getSession + io $ createTagsFile session kind file + +-- ToDo: +-- - remove restriction that all modules must be interpreted +-- (problem: we don't know source locations for entities unless +-- we compiled the module. +-- +-- - extract createTagsFile so it can be used from the command-line +-- (probably need to fix first problem before this is useful). +-- +createTagsFile :: Session -> TagsKind -> FilePath -> IO () +createTagsFile session tagskind tagFile = do + graph <- GHC.getModuleGraph session + let ms = map GHC.ms_mod graph + tagModule m = do + is_interpreted <- GHC.moduleIsInterpreted session m + -- should we just skip these? + when (not is_interpreted) $ + throwDyn (CmdLineError ("module '" ++ moduleString m ++ "' is not interpreted")) + + mbModInfo <- GHC.getModuleInfo session m + let unqual + | Just modinfo <- mbModInfo, + Just unqual <- GHC.modInfoPrintUnqualified modinfo = unqual + | otherwise = GHC.alwaysQualify + + case mbModInfo of + Just modInfo -> return $! 
listTags unqual modInfo + _ -> return [] + + mtags <- mapM tagModule ms + either_res <- collateAndWriteTags tagskind tagFile $ concat mtags + case either_res of + Left e -> hPutStrLn stderr $ ioeGetErrorString e + Right _ -> return () + +listTags :: PrintUnqualified -> GHC.ModuleInfo -> [TagInfo] +listTags unqual modInfo = + [ tagInfo unqual name loc + | name <- GHC.modInfoExports modInfo + , let loc = nameSrcLoc name + , isGoodSrcLoc loc + ] + +type TagInfo = (String -- tag name + ,String -- file name + ,Int -- line number + ,Int -- column number + ) + +-- get tag info, for later translation into Vim or Emacs style +tagInfo :: PrintUnqualified -> Name -> SrcLoc -> TagInfo +tagInfo unqual name loc + = ( showSDocForUser unqual $ pprOccName (nameOccName name) + , showSDocForUser unqual $ ftext (srcLocFile loc) + , srcLocLine loc + , srcLocCol loc + ) + +collateAndWriteTags :: TagsKind -> FilePath -> [TagInfo] -> IO (Either IOError ()) +collateAndWriteTags CTags file tagInfos = do -- ctags style, Vim et al + let tags = unlines $ sortLe (<=) $ nub $ map showTag tagInfos + IO.try (writeFile file tags) +collateAndWriteTags ETags file tagInfos = do -- etags style, Emacs/XEmacs + let byFile op (_,f1,_,_) (_,f2,_,_) = f1 `op` f2 + groups = groupBy (byFile (==)) $ sortLe (byFile (<=)) tagInfos + tagGroups <- mapM tagFileGroup groups + IO.try (writeFile file $ concat tagGroups) + where + tagFileGroup group@[] = throwDyn (CmdLineError "empty tag file group??") + tagFileGroup group@((_,fileName,_,_):_) = do + file <- readFile fileName -- need to get additional info from sources.. + let byLine (_,_,l1,_) (_,_,l2,_) = l1 <= l2 + sortedGroup = sortLe byLine group + tags = unlines $ perFile sortedGroup 1 0 $ lines file + return $ "\x0c\n" ++ fileName ++ "," ++ show (length tags) ++ "\n" ++ tags + perFile (tagInfo@(tag,file,lNo,colNo):tags) count pos (line:lines) | lNo>count = + perFile (tagInfo:tags) (count+1) (pos+length line) lines + perFile (tagInfo@(tag,file,lNo,colNo):tags) count pos lines@(line:_) | lNo==count = + showETag tagInfo line pos : perFile tags count pos lines + perFile tags count pos lines = [] + +-- simple ctags format, for Vim et al +showTag :: TagInfo -> String +showTag (tag,file,lineNo,colNo) + = tag ++ "\t" ++ file ++ "\t" ++ show lineNo + +-- etags format, for Emacs/XEmacs +showETag :: TagInfo -> String -> Int -> String +showETag (tag,file,lineNo,colNo) line charPos + = take colNo line ++ tag + ++ "\x7f" ++ tag + ++ "\x01" ++ show lineNo + ++ "," ++ show charPos + +----------------------------------------------------------------------------- +-- Browsing a module's contents + +browseCmd :: String -> GHCi () +browseCmd m = + case words m of + ['*':m] | looksLikeModuleName m -> browseModule m False + [m] | looksLikeModuleName m -> browseModule m True + _ -> throwDyn (CmdLineError "syntax: :browse <module>") + +browseModule m exports_only = do + s <- getSession + + let modl = GHC.mkModule m + is_interpreted <- io (GHC.moduleIsInterpreted s modl) + when (not is_interpreted && not exports_only) $ + throwDyn (CmdLineError ("module '" ++ m ++ "' is not interpreted")) + + -- Temporarily set the context to the module we're interested in, + -- just so we can get an appropriate PrintUnqualified + (as,bs) <- io (GHC.getContext s) + io (if exports_only then GHC.setContext s [] [prelude_mod,modl] + else GHC.setContext s [modl] []) + unqual <- io (GHC.getPrintUnqual s) + io (GHC.setContext s as bs) + + mb_mod_info <- io $ GHC.getModuleInfo s modl + case mb_mod_info of + Nothing -> throwDyn 
(CmdLineError ("unknown module: " ++ m)) + Just mod_info -> do + let names + | exports_only = GHC.modInfoExports mod_info + | otherwise = fromMaybe [] (GHC.modInfoTopLevelScope mod_info) + + filtered = filterOutChildren names + + things <- io $ mapM (GHC.lookupName s) filtered + + dflags <- getDynFlags + let exts = dopt Opt_GlasgowExts dflags + io (putStrLn (showSDocForUser unqual ( + vcat (map (pprTyThingInContext exts) (catMaybes things)) + ))) + -- ToDo: modInfoInstances currently throws an exception for + -- package modules. When it works, we can do this: + -- $$ vcat (map GHC.pprInstance (GHC.modInfoInstances mod_info)) + +----------------------------------------------------------------------------- +-- Setting the module context + +setContext str + | all sensible mods = fn mods + | otherwise = throwDyn (CmdLineError "syntax: :module [+/-] [*]M1 ... [*]Mn") + where + (fn, mods) = case str of + '+':stuff -> (addToContext, words stuff) + '-':stuff -> (removeFromContext, words stuff) + stuff -> (newContext, words stuff) + + sensible ('*':m) = looksLikeModuleName m + sensible m = looksLikeModuleName m + +newContext mods = do + session <- getSession + (as,bs) <- separate session mods [] [] + let bs' = if null as && prelude_mod `notElem` bs then prelude_mod:bs else bs + io (GHC.setContext session as bs') + +separate :: Session -> [String] -> [Module] -> [Module] + -> GHCi ([Module],[Module]) +separate session [] as bs = return (as,bs) +separate session (('*':m):ms) as bs = do + let modl = GHC.mkModule m + b <- io (GHC.moduleIsInterpreted session modl) + if b then separate session ms (modl:as) bs + else throwDyn (CmdLineError ("module '" ++ m ++ "' is not interpreted")) +separate session (m:ms) as bs = separate session ms as (GHC.mkModule m:bs) + +prelude_mod = GHC.mkModule "Prelude" + + +addToContext mods = do + cms <- getSession + (as,bs) <- io (GHC.getContext cms) + + (as',bs') <- separate cms mods [] [] + + let as_to_add = as' \\ (as ++ bs) + bs_to_add = bs' \\ (as ++ bs) + + io (GHC.setContext cms (as ++ as_to_add) (bs ++ bs_to_add)) + + +removeFromContext mods = do + cms <- getSession + (as,bs) <- io (GHC.getContext cms) + + (as_to_remove,bs_to_remove) <- separate cms mods [] [] + + let as' = as \\ (as_to_remove ++ bs_to_remove) + bs' = bs \\ (as_to_remove ++ bs_to_remove) + + io (GHC.setContext cms as' bs') + +---------------------------------------------------------------------------- +-- Code for `:set' + +-- set options in the interpreter. Syntax is exactly the same as the +-- ghc command line, except that certain options aren't available (-C, +-- -E etc.) +-- +-- This is pretty fragile: most options won't work as expected. ToDo: +-- figure out which ones & disallow them. + +setCmd :: String -> GHCi () +setCmd "" + = do st <- getGHCiState + let opts = options st + io $ putStrLn (showSDoc ( + text "options currently set: " <> + if null opts + then text "none." 
+ else hsep (map (\o -> char '+' <> text (optToStr o)) opts) + )) +setCmd str + = case words str of + ("args":args) -> setArgs args + ("prog":prog) -> setProg prog + ("prompt":prompt) -> setPrompt (dropWhile isSpace $ drop 6 $ dropWhile isSpace str) + wds -> setOptions wds + +setArgs args = do + st <- getGHCiState + setGHCiState st{ args = args } + +setProg [prog] = do + st <- getGHCiState + setGHCiState st{ progname = prog } +setProg _ = do + io (hPutStrLn stderr "syntax: :set prog <progname>") + +setPrompt value = do + st <- getGHCiState + if null value + then io $ hPutStrLn stderr $ "syntax: :set prompt <prompt>, currently \"" ++ prompt st ++ "\"" + else setGHCiState st{ prompt = remQuotes value } + where + remQuotes ('\"':xs) | not (null xs) && last xs == '\"' = init xs + remQuotes x = x + +setOptions wds = + do -- first, deal with the GHCi opts (+s, +t, etc.) + let (plus_opts, minus_opts) = partition isPlus wds + mapM_ setOpt plus_opts + + -- then, dynamic flags + dflags <- getDynFlags + (dflags',leftovers) <- io $ GHC.parseDynamicFlags dflags minus_opts + setDynFlags dflags' + + -- update things if the users wants more packages +{- TODO: + let new_packages = pkgs_after \\ pkgs_before + when (not (null new_packages)) $ + newPackages new_packages +-} + + if (not (null leftovers)) + then throwDyn (CmdLineError ("unrecognised flags: " ++ + unwords leftovers)) + else return () + + +unsetOptions :: String -> GHCi () +unsetOptions str + = do -- first, deal with the GHCi opts (+s, +t, etc.) + let opts = words str + (minus_opts, rest1) = partition isMinus opts + (plus_opts, rest2) = partition isPlus rest1 + + if (not (null rest2)) + then io (putStrLn ("unknown option: '" ++ head rest2 ++ "'")) + else do + + mapM_ unsetOpt plus_opts + + -- can't do GHC flags for now + if (not (null minus_opts)) + then throwDyn (CmdLineError "can't unset GHC command-line flags") + else return () + +isMinus ('-':s) = True +isMinus _ = False + +isPlus ('+':s) = True +isPlus _ = False + +setOpt ('+':str) + = case strToGHCiOpt str of + Nothing -> io (putStrLn ("unknown option: '" ++ str ++ "'")) + Just o -> setOption o + +unsetOpt ('+':str) + = case strToGHCiOpt str of + Nothing -> io (putStrLn ("unknown option: '" ++ str ++ "'")) + Just o -> unsetOption o + +strToGHCiOpt :: String -> (Maybe GHCiOption) +strToGHCiOpt "s" = Just ShowTiming +strToGHCiOpt "t" = Just ShowType +strToGHCiOpt "r" = Just RevertCAFs +strToGHCiOpt _ = Nothing + +optToStr :: GHCiOption -> String +optToStr ShowTiming = "s" +optToStr ShowType = "t" +optToStr RevertCAFs = "r" + +{- ToDo +newPackages new_pkgs = do -- The new packages are already in v_Packages + session <- getSession + io (GHC.setTargets session []) + io (GHC.load session Nothing) + dflags <- getDynFlags + io (linkPackages dflags new_pkgs) + setContextAfterLoad [] +-} + +-- --------------------------------------------------------------------------- +-- code for `:show' + +showCmd str = + case words str of + ["modules" ] -> showModules + ["bindings"] -> showBindings + ["linker"] -> io showLinkerState + _ -> throwDyn (CmdLineError "syntax: :show [modules|bindings]") + +showModules = do + session <- getSession + let show_one ms = do m <- io (GHC.showModule session ms) + io (putStrLn m) + graph <- io (GHC.getModuleGraph session) + mapM_ show_one graph + +showBindings = do + s <- getSession + unqual <- io (GHC.getPrintUnqual s) + bindings <- io (GHC.getBindings s) + mapM_ showTyThing bindings + return () + +showTyThing (AnId id) = do + ty' <- cleanType (GHC.idType id) + str <- 
showForUser (ppr id <> text " :: " <> ppr ty') + io (putStrLn str) +showTyThing _ = return () + +-- if -fglasgow-exts is on we show the foralls, otherwise we don't. +cleanType :: Type -> GHCi Type +cleanType ty = do + dflags <- getDynFlags + if dopt Opt_GlasgowExts dflags + then return ty + else return $! GHC.dropForAlls ty + +-- ----------------------------------------------------------------------------- +-- Completion + +completeNone :: String -> IO [String] +completeNone w = return [] + +#ifdef USE_READLINE +completeWord :: String -> Int -> Int -> IO (Maybe (String, [String])) +completeWord w start end = do + line <- Readline.getLineBuffer + case w of + ':':_ | all isSpace (take (start-1) line) -> wrapCompleter completeCmd w + _other + | Just c <- is_cmd line -> do + maybe_cmd <- lookupCommand c + let (n,w') = selectWord (words' 0 line) + case maybe_cmd of + Nothing -> return Nothing + Just (_,_,False,complete) -> wrapCompleter complete w + Just (_,_,True,complete) -> let complete' w = do rets <- complete w + return (map (drop n) rets) + in wrapCompleter complete' w' + | otherwise -> do + --printf "complete %s, start = %d, end = %d\n" w start end + wrapCompleter completeIdentifier w + where words' _ [] = [] + words' n str = let (w,r) = break isSpace str + (s,r') = span isSpace r + in (n,w):words' (n+length w+length s) r' + -- In a Haskell expression we want to parse 'a-b' as three words + -- where a compiler flag (ie. -fno-monomorphism-restriction) should + -- only be a single word. + selectWord [] = (0,w) + selectWord ((offset,x):xs) + | offset+length x >= start = (start-offset,take (end-offset) x) + | otherwise = selectWord xs + +is_cmd line + | ((':':w) : _) <- words (dropWhile isSpace line) = Just w + | otherwise = Nothing + +completeCmd w = do + cmds <- readIORef commands + return (filter (w `isPrefixOf`) (map (':':) (map cmdName cmds))) + +completeMacro w = do + cmds <- readIORef commands + let cmds' = [ cmd | cmd <- map cmdName cmds, cmd `elem` map cmdName builtin_commands ] + return (filter (w `isPrefixOf`) cmds') + +completeIdentifier w = do + s <- restoreSession + rdrs <- GHC.getRdrNamesInScope s + return (filter (w `isPrefixOf`) (map (showSDoc.ppr) rdrs)) + +completeModule w = do + s <- restoreSession + dflags <- GHC.getSessionDynFlags s + let pkg_mods = allExposedModules dflags + return (filter (w `isPrefixOf`) (map (showSDoc.ppr) pkg_mods)) + +completeHomeModule w = do + s <- restoreSession + g <- GHC.getModuleGraph s + let home_mods = map GHC.ms_mod g + return (filter (w `isPrefixOf`) (map (showSDoc.ppr) home_mods)) + +completeSetOptions w = do + return (filter (w `isPrefixOf`) options) + where options = "args":"prog":allFlags + +completeFilename = Readline.filenameCompletionFunction + +completeHomeModuleOrFile = unionComplete completeHomeModule completeFilename + +unionComplete :: (String -> IO [String]) -> (String -> IO [String]) -> String -> IO [String] +unionComplete f1 f2 w = do + s1 <- f1 w + s2 <- f2 w + return (s1 ++ s2) + +wrapCompleter :: (String -> IO [String]) -> String -> IO (Maybe (String,[String])) +wrapCompleter fun w = do + strs <- fun w + case strs of + [] -> return Nothing + [x] -> return (Just (x,[])) + xs -> case getCommonPrefix xs of + "" -> return (Just ("",xs)) + pref -> return (Just (pref,xs)) + +getCommonPrefix :: [String] -> String +getCommonPrefix [] = "" +getCommonPrefix (s:ss) = foldl common s ss + where common s "" = s + common "" s = "" + common (c:cs) (d:ds) + | c == d = c : common cs ds + | otherwise = "" + +allExposedModules :: DynFlags 
-> [Module] +allExposedModules dflags + = map GHC.mkModule (concat (map exposedModules (filter exposed (eltsUFM pkg_db)))) + where + pkg_db = pkgIdMap (pkgState dflags) +#else +completeCmd = completeNone +completeMacro = completeNone +completeIdentifier = completeNone +completeModule = completeNone +completeHomeModule = completeNone +completeSetOptions = completeNone +completeFilename = completeNone +completeHomeModuleOrFile=completeNone +#endif + +----------------------------------------------------------------------------- +-- GHCi monad + +data GHCiState = GHCiState + { + progname :: String, + args :: [String], + prompt :: String, + session :: GHC.Session, + options :: [GHCiOption] + } + +data GHCiOption + = ShowTiming -- show time/allocs after evaluation + | ShowType -- show the type of expressions + | RevertCAFs -- revert CAFs after every evaluation + deriving Eq + +newtype GHCi a = GHCi { unGHCi :: IORef GHCiState -> IO a } + +startGHCi :: GHCi a -> GHCiState -> IO a +startGHCi g state = do ref <- newIORef state; unGHCi g ref + +instance Monad GHCi where + (GHCi m) >>= k = GHCi $ \s -> m s >>= \a -> unGHCi (k a) s + return a = GHCi $ \s -> return a + +ghciHandleDyn :: Typeable t => (t -> GHCi a) -> GHCi a -> GHCi a +ghciHandleDyn h (GHCi m) = GHCi $ \s -> + Exception.catchDyn (m s) (\e -> unGHCi (h e) s) + +getGHCiState = GHCi $ \r -> readIORef r +setGHCiState s = GHCi $ \r -> writeIORef r s + +-- for convenience... +getSession = getGHCiState >>= return . session + +GLOBAL_VAR(saved_sess, no_saved_sess, Session) +no_saved_sess = error "no saved_ses" +saveSession = getSession >>= io . writeIORef saved_sess +splatSavedSession = io (writeIORef saved_sess no_saved_sess) +restoreSession = readIORef saved_sess + +getDynFlags = do + s <- getSession + io (GHC.getSessionDynFlags s) +setDynFlags dflags = do + s <- getSession + io (GHC.setSessionDynFlags s dflags) + +isOptionSet :: GHCiOption -> GHCi Bool +isOptionSet opt + = do st <- getGHCiState + return (opt `elem` options st) + +setOption :: GHCiOption -> GHCi () +setOption opt + = do st <- getGHCiState + setGHCiState (st{ options = opt : filter (/= opt) (options st) }) + +unsetOption :: GHCiOption -> GHCi () +unsetOption opt + = do st <- getGHCiState + setGHCiState (st{ options = filter (/= opt) (options st) }) + +io :: IO a -> GHCi a +io m = GHCi { unGHCi = \s -> m >>= return } + +----------------------------------------------------------------------------- +-- recursive exception handlers + +-- Don't forget to unblock async exceptions in the handler, or if we're +-- in an exception loop (eg. let a = error a in a) the ^C exception +-- may never be delivered. Thanks to Marcin for pointing out the bug. 
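The same guard can be written in plain IO. The sketch below is purely illustrative (the helper name is made up; it assumes the old Control.Exception API imported qualified as Exception, as this file already does) and shows the shape that the GHCi-level handlers which follow give to every handler: catch, then unblock before running the handler body.

    -- Hypothetical helper: run an action, and if it throws, run the handler
    -- with asynchronous exceptions unblocked so a ^C can still interrupt it.
    handleUnblocked :: (Exception.Exception -> IO a) -> IO a -> IO a
    handleUnblocked handler act =
      act `Exception.catch` \e -> Exception.unblock (handler e)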
+ +ghciHandle :: (Exception -> GHCi a) -> GHCi a -> GHCi a +ghciHandle h (GHCi m) = GHCi $ \s -> + Exception.catch (m s) + (\e -> unGHCi (ghciUnblock (h e)) s) + +ghciUnblock :: GHCi a -> GHCi a +ghciUnblock (GHCi a) = GHCi $ \s -> Exception.unblock (a s) + +----------------------------------------------------------------------------- +-- timing & statistics + +timeIt :: GHCi a -> GHCi a +timeIt action + = do b <- isOptionSet ShowTiming + if not b + then action + else do allocs1 <- io $ getAllocations + time1 <- io $ getCPUTime + a <- action + allocs2 <- io $ getAllocations + time2 <- io $ getCPUTime + io $ printTimes (fromIntegral (allocs2 - allocs1)) + (time2 - time1) + return a + +foreign import ccall unsafe "getAllocations" getAllocations :: IO Int64 + -- defined in ghc/rts/Stats.c + +printTimes :: Integer -> Integer -> IO () +printTimes allocs psecs + = do let secs = (fromIntegral psecs / (10^12)) :: Float + secs_str = showFFloat (Just 2) secs + putStrLn (showSDoc ( + parens (text (secs_str "") <+> text "secs" <> comma <+> + text (show allocs) <+> text "bytes"))) + +----------------------------------------------------------------------------- +-- reverting CAFs + +revertCAFs :: IO () +revertCAFs = do + rts_revertCAFs + turnOffBuffering + -- Have to turn off buffering again, because we just + -- reverted stdout, stderr & stdin to their defaults. + +foreign import ccall "revertCAFs" rts_revertCAFs :: IO () + -- Make it "safe", just in case + +-- ----------------------------------------------------------------------------- +-- Utils + +expandPath :: String -> GHCi String +expandPath path = + case dropWhile isSpace path of + ('~':d) -> do + tilde <- io (getEnv "HOME") -- will fail if HOME not defined + return (tilde ++ '/':d) + other -> + return other diff --git a/compiler/ghci/Linker.lhs b/compiler/ghci/Linker.lhs new file mode 100644 index 0000000000..3a5ecf8a6d --- /dev/null +++ b/compiler/ghci/Linker.lhs @@ -0,0 +1,927 @@ +% +% (c) The University of Glasgow 2005 +% + +-- -------------------------------------- +-- The dynamic linker for GHCi +-- -------------------------------------- + +This module deals with the top-level issues of dynamic linking, +calling the object-code linker and the byte-code linker where +necessary. + + +\begin{code} + +{-# OPTIONS -optc-DNON_POSIX_SOURCE -#include "Linker.h" #-} + +module Linker ( HValue, showLinkerState, + linkExpr, unload, extendLinkEnv, withExtendedLinkEnv, + extendLoadedPkgs, + linkPackages,initDynLinker + ) where + +#include "HsVersions.h" + +import ObjLink ( loadDLL, loadObj, unloadObj, resolveObjs, initObjLinker ) +import ByteCodeLink ( HValue, ClosureEnv, extendClosureEnv, linkBCO ) +import ByteCodeItbls ( ItblEnv ) +import ByteCodeAsm ( CompiledByteCode(..), bcoFreeNames, UnlinkedBCO(..)) + +import Packages +import DriverPhases ( isObjectFilename, isDynLibFilename ) +import Finder ( findModule, findObjectLinkableMaybe, FindResult(..) ) +import HscTypes +import Name ( Name, nameModule, isExternalName, isWiredInName ) +import NameEnv +import NameSet ( nameSetToList ) +import Module +import ListSetOps ( minusList ) +import DynFlags ( DynFlags(..), getOpts ) +import BasicTypes ( SuccessFlag(..), succeeded, failed ) +import Outputable +import Panic ( GhcException(..) 
) +import Util ( zipLazy, global, joinFileExt, joinFileName, suffixOf ) +import StaticFlags ( v_Ld_inputs ) +import ErrUtils ( debugTraceMsg ) + +-- Standard libraries +import Control.Monad ( when, filterM, foldM ) + +import Data.IORef ( IORef, readIORef, writeIORef, modifyIORef ) +import Data.List ( partition, nub ) + +import System.IO ( putStr, putStrLn, hPutStrLn, stderr, fixIO ) +import System.Directory ( doesFileExist ) + +import Control.Exception ( block, throwDyn, bracket ) +import Maybe ( isJust, fromJust ) + +#if __GLASGOW_HASKELL__ >= 503 +import GHC.IOBase ( IO(..) ) +#else +import PrelIOBase ( IO(..) ) +#endif +\end{code} + + +%************************************************************************ +%* * + The Linker's state +%* * +%************************************************************************ + +The persistent linker state *must* match the actual state of the +C dynamic linker at all times, so we keep it in a private global variable. + + +The PersistentLinkerState maps Names to actual closures (for +interpreted code only), for use during linking. + +\begin{code} +GLOBAL_VAR(v_PersistentLinkerState, panic "Dynamic linker not initialised", PersistentLinkerState) +GLOBAL_VAR(v_InitLinkerDone, False, Bool) -- Set True when dynamic linker is initialised + +data PersistentLinkerState + = PersistentLinkerState { + + -- Current global mapping from Names to their true values + closure_env :: ClosureEnv, + + -- The current global mapping from RdrNames of DataCons to + -- info table addresses. + -- When a new Unlinked is linked into the running image, or an existing + -- module in the image is replaced, the itbl_env must be updated + -- appropriately. + itbl_env :: ItblEnv, + + -- The currently loaded interpreted modules (home package) + bcos_loaded :: [Linkable], + + -- And the currently-loaded compiled modules (home package) + objs_loaded :: [Linkable], + + -- The currently-loaded packages; always object code + -- Held, as usual, in dependency order; though I am not sure if + -- that is really important + pkgs_loaded :: [PackageId] + } + +emptyPLS :: DynFlags -> PersistentLinkerState +emptyPLS dflags = PersistentLinkerState { + closure_env = emptyNameEnv, + itbl_env = emptyNameEnv, + pkgs_loaded = init_pkgs, + bcos_loaded = [], + objs_loaded = [] } + -- Packages that don't need loading, because the compiler + -- shares them with the interpreted program. + -- + -- The linker's symbol table is populated with RTS symbols using an + -- explicit list. See rts/Linker.c for details. 
+ where init_pkgs + | ExtPackage rts_id <- rtsPackageId (pkgState dflags) = [rts_id] + | otherwise = [] +\end{code} + +\begin{code} +extendLoadedPkgs :: [PackageId] -> IO () +extendLoadedPkgs pkgs + = modifyIORef v_PersistentLinkerState (\s -> s{pkgs_loaded = pkgs ++ pkgs_loaded s}) + +extendLinkEnv :: [(Name,HValue)] -> IO () +-- Automatically discards shadowed bindings +extendLinkEnv new_bindings + = do pls <- readIORef v_PersistentLinkerState + let new_closure_env = extendClosureEnv (closure_env pls) new_bindings + new_pls = pls { closure_env = new_closure_env } + writeIORef v_PersistentLinkerState new_pls + +withExtendedLinkEnv :: [(Name,HValue)] -> IO a -> IO a +withExtendedLinkEnv new_env action + = bracket set_new_env + reset_old_env + (const action) + where set_new_env = do pls <- readIORef v_PersistentLinkerState + let new_closure_env = extendClosureEnv (closure_env pls) new_env + new_pls = pls { closure_env = new_closure_env } + writeIORef v_PersistentLinkerState new_pls + return pls + reset_old_env pls = writeIORef v_PersistentLinkerState pls + +-- filterNameMap removes from the environment all entries except +-- those for a given set of modules; +-- Note that this removes all *local* (i.e. non-isExternal) names too +-- (these are the temporary bindings from the command line). +-- Used to filter both the ClosureEnv and ItblEnv + +filterNameMap :: [Module] -> NameEnv (Name, a) -> NameEnv (Name, a) +filterNameMap mods env + = filterNameEnv keep_elt env + where + keep_elt (n,_) = isExternalName n + && (nameModule n `elem` mods) +\end{code} + + +\begin{code} +showLinkerState :: IO () +-- Display the persistent linker state +showLinkerState + = do pls <- readIORef v_PersistentLinkerState + printDump (vcat [text "----- Linker state -----", + text "Pkgs:" <+> ppr (pkgs_loaded pls), + text "Objs:" <+> ppr (objs_loaded pls), + text "BCOs:" <+> ppr (bcos_loaded pls)]) +\end{code} + + + + +%************************************************************************ +%* * +\subsection{Initialisation} +%* * +%************************************************************************ + +We initialise the dynamic linker by + +a) calling the C initialisation procedure + +b) Loading any packages specified on the command line, + now held in v_ExplicitPackages + +c) Loading any packages specified on the command line, + now held in the -l options in v_Opt_l + +d) Loading any .o/.dll files specified on the command line, + now held in v_Ld_inputs + +e) Loading any MacOS frameworks + +\begin{code} +initDynLinker :: DynFlags -> IO () +-- This function is idempotent; if called more than once, it does nothing +-- This is useful in Template Haskell, where we call it before trying to link +initDynLinker dflags + = do { done <- readIORef v_InitLinkerDone + ; if done then return () + else do { writeIORef v_InitLinkerDone True + ; reallyInitDynLinker dflags } + } + +reallyInitDynLinker dflags + = do { -- Initialise the linker state + ; writeIORef v_PersistentLinkerState (emptyPLS dflags) + + -- (a) initialise the C dynamic linker + ; initObjLinker + + -- (b) Load packages from the command-line + ; linkPackages dflags (explicitPackages (pkgState dflags)) + + -- (c) Link libraries from the command-line + ; let optl = getOpts dflags opt_l + ; let minus_ls = [ lib | '-':'l':lib <- optl ] + + -- (d) Link .o files from the command-line + ; let lib_paths = libraryPaths dflags + ; cmdline_ld_inputs <- readIORef v_Ld_inputs + + ; classified_ld_inputs <- mapM classifyLdInput cmdline_ld_inputs + + -- (e) Link any MacOS 
frameworks +#ifdef darwin_TARGET_OS + ; let framework_paths = frameworkPaths dflags + ; let frameworks = cmdlineFrameworks dflags +#else + ; let frameworks = [] + ; let framework_paths = [] +#endif + -- Finally do (c),(d),(e) + ; let cmdline_lib_specs = [ l | Just l <- classified_ld_inputs ] + ++ map DLL minus_ls + ++ map Framework frameworks + ; if null cmdline_lib_specs then return () + else do + + { mapM_ (preloadLib dflags lib_paths framework_paths) cmdline_lib_specs + ; maybePutStr dflags "final link ... " + ; ok <- resolveObjs + + ; if succeeded ok then maybePutStrLn dflags "done" + else throwDyn (InstallationError "linking extra libraries/objects failed") + }} + +classifyLdInput :: FilePath -> IO (Maybe LibrarySpec) +classifyLdInput f + | isObjectFilename f = return (Just (Object f)) + | isDynLibFilename f = return (Just (DLLPath f)) + | otherwise = do + hPutStrLn stderr ("Warning: ignoring unrecognised input `" ++ f ++ "'") + return Nothing + +preloadLib :: DynFlags -> [String] -> [String] -> LibrarySpec -> IO () +preloadLib dflags lib_paths framework_paths lib_spec + = do maybePutStr dflags ("Loading object " ++ showLS lib_spec ++ " ... ") + case lib_spec of + Object static_ish + -> do b <- preload_static lib_paths static_ish + maybePutStrLn dflags (if b then "done" + else "not found") + + DLL dll_unadorned + -> do maybe_errstr <- loadDynamic lib_paths dll_unadorned + case maybe_errstr of + Nothing -> maybePutStrLn dflags "done" + Just mm -> preloadFailed mm lib_paths lib_spec + + DLLPath dll_path + -> do maybe_errstr <- loadDLL dll_path + case maybe_errstr of + Nothing -> maybePutStrLn dflags "done" + Just mm -> preloadFailed mm lib_paths lib_spec + +#ifdef darwin_TARGET_OS + Framework framework + -> do maybe_errstr <- loadFramework framework_paths framework + case maybe_errstr of + Nothing -> maybePutStrLn dflags "done" + Just mm -> preloadFailed mm framework_paths lib_spec +#endif + where + preloadFailed :: String -> [String] -> LibrarySpec -> IO () + preloadFailed sys_errmsg paths spec + = do maybePutStr dflags + ("failed.\nDynamic linker error message was:\n " + ++ sys_errmsg ++ "\nWhilst trying to load: " + ++ showLS spec ++ "\nDirectories to search are:\n" + ++ unlines (map (" "++) paths) ) + give_up + + -- Not interested in the paths in the static case. + preload_static paths name + = do b <- doesFileExist name + if not b then return False + else loadObj name >> return True + + give_up = throwDyn $ + CmdLineError "user specified .o/.so/.DLL could not be loaded." +\end{code} + + +%************************************************************************ +%* * + Link a byte-code expression +%* * +%************************************************************************ + +\begin{code} +linkExpr :: HscEnv -> UnlinkedBCO -> IO HValue + +-- Link a single expression, *including* first linking packages and +-- modules that this expression depends on. +-- +-- Raises an IO exception if it can't find a compiled version of the +-- dependents to link. 
+ +linkExpr hsc_env root_ul_bco + = do { + -- Initialise the linker (if it's not been done already) + let dflags = hsc_dflags hsc_env + ; initDynLinker dflags + + -- Find what packages and linkables are required + ; eps <- readIORef (hsc_EPS hsc_env) + ; (lnks, pkgs) <- getLinkDeps hsc_env hpt (eps_PIT eps) needed_mods + + -- Link the packages and modules required + ; linkPackages dflags pkgs + ; ok <- linkModules dflags lnks + ; if failed ok then + dieWith empty + else do { + + -- Link the expression itself + pls <- readIORef v_PersistentLinkerState + ; let ie = itbl_env pls + ce = closure_env pls + + -- Link the necessary packages and linkables + ; (_, (root_hval:_)) <- linkSomeBCOs False ie ce [root_ul_bco] + ; return root_hval + }} + where + hpt = hsc_HPT hsc_env + dflags = hsc_dflags hsc_env + free_names = nameSetToList (bcoFreeNames root_ul_bco) + + needed_mods :: [Module] + needed_mods = [ nameModule n | n <- free_names, + isExternalName n, -- Names from other modules + not (isWiredInName n) -- Exclude wired-in names + ] -- (see note below) + -- Exclude wired-in names because we may not have read + -- their interface files, so getLinkDeps will fail + -- All wired-in names are in the base package, which we link + -- by default, so we can safely ignore them here. + +dieWith msg = throwDyn (ProgramError (showSDoc msg)) + +getLinkDeps :: HscEnv -> HomePackageTable -> PackageIfaceTable + -> [Module] -- If you need these + -> IO ([Linkable], [PackageId]) -- ... then link these first +-- Fails with an IO exception if it can't find enough files + +getLinkDeps hsc_env hpt pit mods +-- Find all the packages and linkables that a set of modules depends on + = do { pls <- readIORef v_PersistentLinkerState ; + let { + -- 1. Find the dependent home-pkg-modules/packages from each iface + (mods_s, pkgs_s) = unzip (map get_deps mods) ; + + -- 2. Exclude ones already linked + -- Main reason: avoid findModule calls in get_linkable + mods_needed = nub (concat mods_s) `minusList` linked_mods ; + pkgs_needed = nub (concat pkgs_s) `minusList` pkgs_loaded pls ; + + linked_mods = map linkableModule (objs_loaded pls ++ bcos_loaded pls) + } ; + + -- 3. For each dependent module, find its linkable + -- This will either be in the HPT or (in the case of one-shot + -- compilation) we may need to use maybe_getFileLinkable + lnks_needed <- mapM get_linkable mods_needed ; + + return (lnks_needed, pkgs_needed) } + where + get_deps :: Module -> ([Module],[PackageId]) + -- Get the things needed for the specified module + -- This is rather similar to the code in RnNames.importsFromImportDecl + get_deps mod + | ExtPackage p <- mi_package iface + = ([], p : dep_pkgs deps) + | otherwise + = (mod : [m | (m,_) <- dep_mods deps], dep_pkgs deps) + where + iface = get_iface mod + deps = mi_deps iface + + get_iface mod = case lookupIface hpt pit mod of + Just iface -> iface + Nothing -> pprPanic "getLinkDeps" (no_iface mod) + no_iface mod = ptext SLIT("No iface for") <+> ppr mod + -- This one is a GHC bug + + no_obj mod = dieWith (ptext SLIT("No compiled code for") <+> ppr mod) + -- This one is a build-system bug + + get_linkable mod_name -- A home-package module + | Just mod_info <- lookupModuleEnv hpt mod_name + = ASSERT(isJust (hm_linkable mod_info)) + return (fromJust (hm_linkable mod_info)) + | otherwise + = -- It's not in the HPT because we are in one shot mode, + -- so use the Finder to get a ModLocation... 
+ do { mb_stuff <- findModule hsc_env mod_name False ; + case mb_stuff of { + Found loc _ -> found loc mod_name ; + _ -> no_obj mod_name + }} + + found loc mod_name = do { + -- ...and then find the linkable for it + mb_lnk <- findObjectLinkableMaybe mod_name loc ; + case mb_lnk of { + Nothing -> no_obj mod_name ; + Just lnk -> return lnk + }} +\end{code} + + +%************************************************************************ +%* * + Link some linkables + The linkables may consist of a mixture of + byte-code modules and object modules +%* * +%************************************************************************ + +\begin{code} +linkModules :: DynFlags -> [Linkable] -> IO SuccessFlag +linkModules dflags linkables + = block $ do -- don't want to be interrupted by ^C in here + + let (objs, bcos) = partition isObjectLinkable + (concatMap partitionLinkable linkables) + + -- Load objects first; they can't depend on BCOs + ok_flag <- dynLinkObjs dflags objs + + if failed ok_flag then + return Failed + else do + dynLinkBCOs bcos + return Succeeded + + +-- HACK to support f-x-dynamic in the interpreter; no other purpose +partitionLinkable :: Linkable -> [Linkable] +partitionLinkable li + = let li_uls = linkableUnlinked li + li_uls_obj = filter isObject li_uls + li_uls_bco = filter isInterpretable li_uls + in + case (li_uls_obj, li_uls_bco) of + (objs@(_:_), bcos@(_:_)) + -> [li{linkableUnlinked=li_uls_obj}, li{linkableUnlinked=li_uls_bco}] + other + -> [li] + +findModuleLinkable_maybe :: [Linkable] -> Module -> Maybe Linkable +findModuleLinkable_maybe lis mod + = case [LM time nm us | LM time nm us <- lis, nm == mod] of + [] -> Nothing + [li] -> Just li + many -> pprPanic "findModuleLinkable" (ppr mod) + +linkableInSet :: Linkable -> [Linkable] -> Bool +linkableInSet l objs_loaded = + case findModuleLinkable_maybe objs_loaded (linkableModule l) of + Nothing -> False + Just m -> linkableTime l == linkableTime m +\end{code} + + +%************************************************************************ +%* * +\subsection{The object-code linker} +%* * +%************************************************************************ + +\begin{code} +dynLinkObjs :: DynFlags -> [Linkable] -> IO SuccessFlag + -- Side-effects the PersistentLinkerState + +dynLinkObjs dflags objs + = do pls <- readIORef v_PersistentLinkerState + + -- Load the object files and link them + let (objs_loaded', new_objs) = rmDupLinkables (objs_loaded pls) objs + pls1 = pls { objs_loaded = objs_loaded' } + unlinkeds = concatMap linkableUnlinked new_objs + + mapM loadObj (map nameOfObject unlinkeds) + + -- Link the all together + ok <- resolveObjs + + -- If resolving failed, unload all our + -- object modules and carry on + if succeeded ok then do + writeIORef v_PersistentLinkerState pls1 + return Succeeded + else do + pls2 <- unload_wkr dflags [] pls1 + writeIORef v_PersistentLinkerState pls2 + return Failed + + +rmDupLinkables :: [Linkable] -- Already loaded + -> [Linkable] -- New linkables + -> ([Linkable], -- New loaded set (including new ones) + [Linkable]) -- New linkables (excluding dups) +rmDupLinkables already ls + = go already [] ls + where + go already extras [] = (already, extras) + go already extras (l:ls) + | linkableInSet l already = go already extras ls + | otherwise = go (l:already) (l:extras) ls +\end{code} + +%************************************************************************ +%* * +\subsection{The byte-code linker} +%* * +%************************************************************************ + 
+\begin{code} +dynLinkBCOs :: [Linkable] -> IO () + -- Side-effects the persistent linker state +dynLinkBCOs bcos + = do pls <- readIORef v_PersistentLinkerState + + let (bcos_loaded', new_bcos) = rmDupLinkables (bcos_loaded pls) bcos + pls1 = pls { bcos_loaded = bcos_loaded' } + unlinkeds :: [Unlinked] + unlinkeds = concatMap linkableUnlinked new_bcos + + cbcs :: [CompiledByteCode] + cbcs = map byteCodeOfObject unlinkeds + + + ul_bcos = [b | ByteCode bs _ <- cbcs, b <- bs] + ies = [ie | ByteCode _ ie <- cbcs] + gce = closure_env pls + final_ie = foldr plusNameEnv (itbl_env pls) ies + + (final_gce, linked_bcos) <- linkSomeBCOs True final_ie gce ul_bcos + -- What happens to these linked_bcos? + + let pls2 = pls1 { closure_env = final_gce, + itbl_env = final_ie } + + writeIORef v_PersistentLinkerState pls2 + return () + +-- Link a bunch of BCOs and return them + updated closure env. +linkSomeBCOs :: Bool -- False <=> add _all_ BCOs to returned closure env + -- True <=> add only toplevel BCOs to closure env + -> ItblEnv + -> ClosureEnv + -> [UnlinkedBCO] + -> IO (ClosureEnv, [HValue]) + -- The returned HValues are associated 1-1 with + -- the incoming unlinked BCOs. Each gives the + -- value of the corresponding unlinked BCO + + +linkSomeBCOs toplevs_only ie ce_in ul_bcos + = do let nms = map unlinkedBCOName ul_bcos + hvals <- fixIO + ( \ hvs -> let ce_out = extendClosureEnv ce_in (zipLazy nms hvs) + in mapM (linkBCO ie ce_out) ul_bcos ) + + let ce_all_additions = zip nms hvals + ce_top_additions = filter (isExternalName.fst) ce_all_additions + ce_additions = if toplevs_only then ce_top_additions + else ce_all_additions + ce_out = -- make sure we're not inserting duplicate names into the + -- closure environment, which leads to trouble. + ASSERT (all (not . (`elemNameEnv` ce_in)) (map fst ce_additions)) + extendClosureEnv ce_in ce_additions + return (ce_out, hvals) + +\end{code} + + +%************************************************************************ +%* * + Unload some object modules +%* * +%************************************************************************ + +\begin{code} +-- --------------------------------------------------------------------------- +-- Unloading old objects ready for a new compilation sweep. +-- +-- The compilation manager provides us with a list of linkables that it +-- considers "stable", i.e. won't be recompiled this time around. For +-- each of the modules current linked in memory, +-- +-- * if the linkable is stable (and it's the same one - the +-- user may have recompiled the module on the side), we keep it, +-- +-- * otherwise, we unload it. +-- +-- * we also implicitly unload all temporary bindings at this point. 
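The policy just described can be sketched in isolation. The function name below is made up; linkableInSet, linkableUnlinked and unloadObj are the ones defined or imported in this module, and the real work is done by unload_wkr which follows.

    -- Keep a loaded linkable only if the stable set has one for the same
    -- module with the same timestamp; otherwise unload its object files.
    sweepLoaded :: [Linkable] -> [Linkable] -> IO [Linkable]
    sweepLoaded stable loaded = filterM keep loaded
      where
        keep l
          | l `linkableInSet` stable = return True
          | otherwise = do mapM_ unloadObj [ f | DotO f <- linkableUnlinked l ]
                           return False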
+ +unload :: DynFlags -> [Linkable] -> IO () +-- The 'linkables' are the ones to *keep* + +unload dflags linkables + = block $ do -- block, so we're safe from Ctrl-C in here + + -- Initialise the linker (if it's not been done already) + initDynLinker dflags + + pls <- readIORef v_PersistentLinkerState + new_pls <- unload_wkr dflags linkables pls + writeIORef v_PersistentLinkerState new_pls + + debugTraceMsg dflags 3 (text "unload: retaining objs" <+> ppr (objs_loaded new_pls)) + debugTraceMsg dflags 3 (text "unload: retaining bcos" <+> ppr (bcos_loaded new_pls)) + return () + +unload_wkr :: DynFlags + -> [Linkable] -- stable linkables + -> PersistentLinkerState + -> IO PersistentLinkerState +-- Does the core unload business +-- (the wrapper blocks exceptions and deals with the PLS get and put) + +unload_wkr dflags linkables pls + = do let (objs_to_keep, bcos_to_keep) = partition isObjectLinkable linkables + + objs_loaded' <- filterM (maybeUnload objs_to_keep) (objs_loaded pls) + bcos_loaded' <- filterM (maybeUnload bcos_to_keep) (bcos_loaded pls) + + let bcos_retained = map linkableModule bcos_loaded' + itbl_env' = filterNameMap bcos_retained (itbl_env pls) + closure_env' = filterNameMap bcos_retained (closure_env pls) + new_pls = pls { itbl_env = itbl_env', + closure_env = closure_env', + bcos_loaded = bcos_loaded', + objs_loaded = objs_loaded' } + + return new_pls + where + maybeUnload :: [Linkable] -> Linkable -> IO Bool + maybeUnload keep_linkables lnk + | linkableInSet lnk linkables = return True + | otherwise + = do mapM_ unloadObj [f | DotO f <- linkableUnlinked lnk] + -- The components of a BCO linkable may contain + -- dot-o files. Which is very confusing. + -- + -- But the BCO parts can be unlinked just by + -- letting go of them (plus of course depopulating + -- the symbol table which is done in the main body) + return False +\end{code} + + +%************************************************************************ +%* * + Loading packages +%* * +%************************************************************************ + + +\begin{code} +data LibrarySpec + = Object FilePath -- Full path name of a .o file, including trailing .o + -- For dynamic objects only, try to find the object + -- file in all the directories specified in + -- v_Library_paths before giving up. + + | DLL String -- "Unadorned" name of a .DLL/.so + -- e.g. On unix "qt" denotes "libqt.so" + -- On WinDoze "burble" denotes "burble.DLL" + -- loadDLL is platform-specific and adds the lib/.so/.DLL + -- suffixes platform-dependently + + | DLLPath FilePath -- Absolute or relative pathname to a dynamic library + -- (ends with .dll or .so). + + | Framework String -- Only used for darwin, but does no harm + +-- If this package is already part of the GHCi binary, we'll already +-- have the right DLLs for this package loaded, so don't try to +-- load them again. +-- +-- But on Win32 we must load them 'again'; doing so is a harmless no-op +-- as far as the loader is concerned, but it does initialise the list +-- of DLL handles that rts/Linker.c maintains, and that in turn is +-- used by lookupSymbol. So we must call addDLL for each library +-- just to get the DLL handle into the list. 
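Before the package-loading code below, here are purely illustrative values for the LibrarySpec type defined above; every path and name is made up, but each value follows the convention documented on the corresponding constructor.

    exampleSpecs :: [LibrarySpec]
    exampleSpecs =
      [ Object    "dist/build/Foo.o"   -- full path to an object file
      , DLL       "qt"                 -- unadorned: libqt.so on Unix, qt.DLL on Windows
      , DLLPath   "./libbar.so"        -- explicit path to a dynamic library
      , Framework "OpenGL"             -- Darwin frameworks only
      ]

The partOfGHCi list motivated by the preceding note is defined next; linkPackage consults it before loading anything for a package.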
+partOfGHCi +# if defined(mingw32_TARGET_OS) || defined(darwin_TARGET_OS) + = [ ] +# else + = [ "base", "haskell98", "template-haskell", "readline" ] +# endif + +showLS (Object nm) = "(static) " ++ nm +showLS (DLL nm) = "(dynamic) " ++ nm +showLS (DLLPath nm) = "(dynamic) " ++ nm +showLS (Framework nm) = "(framework) " ++ nm + +linkPackages :: DynFlags -> [PackageId] -> IO () +-- Link exactly the specified packages, and their dependents +-- (unless of course they are already linked) +-- The dependents are linked automatically, and it doesn't matter +-- what order you specify the input packages. +-- +-- NOTE: in fact, since each module tracks all the packages it depends on, +-- we don't really need to use the package-config dependencies. +-- However we do need the package-config stuff (to find aux libs etc), +-- and following them lets us load libraries in the right order, which +-- perhaps makes the error message a bit more localised if we get a link +-- failure. So the dependency walking code is still here. + +linkPackages dflags new_pkgs + = do { pls <- readIORef v_PersistentLinkerState + ; let pkg_map = pkgIdMap (pkgState dflags) + + ; pkgs' <- link pkg_map (pkgs_loaded pls) new_pkgs + + ; writeIORef v_PersistentLinkerState (pls { pkgs_loaded = pkgs' }) + } + where + link :: PackageConfigMap -> [PackageId] -> [PackageId] -> IO [PackageId] + link pkg_map pkgs new_pkgs + = foldM (link_one pkg_map) pkgs new_pkgs + + link_one pkg_map pkgs new_pkg + | new_pkg `elem` pkgs -- Already linked + = return pkgs + + | Just pkg_cfg <- lookupPackage pkg_map new_pkg + = do { -- Link dependents first + pkgs' <- link pkg_map pkgs (map mkPackageId (depends pkg_cfg)) + -- Now link the package itself + ; linkPackage dflags pkg_cfg + ; return (new_pkg : pkgs') } + + | otherwise + = throwDyn (CmdLineError ("unknown package: " ++ packageIdString new_pkg)) + + +linkPackage :: DynFlags -> PackageConfig -> IO () +linkPackage dflags pkg + = do + let dirs = Packages.libraryDirs pkg + + let libs = Packages.hsLibraries pkg + -- Because of slight differences between the GHC dynamic linker and + -- the native system linker some packages have to link with a + -- different list of libraries when using GHCi. Examples include: libs + -- that are actually gnu ld scripts, and the possability that the .a + -- libs do not exactly match the .so/.dll equivalents. So if the + -- package file provides an "extra-ghci-libraries" field then we use + -- that instead of the "extra-libraries" field. + ++ (if null (Packages.extraGHCiLibraries pkg) + then Packages.extraLibraries pkg + else Packages.extraGHCiLibraries pkg) + ++ [ lib | '-':'l':lib <- Packages.ldOptions pkg ] + classifieds <- mapM (locateOneObj dirs) libs + + -- Complication: all the .so's must be loaded before any of the .o's. + let dlls = [ dll | DLL dll <- classifieds ] + objs = [ obj | Object obj <- classifieds ] + + maybePutStr dflags ("Loading package " ++ showPackageId (package pkg) ++ " ... ") + + -- See comments with partOfGHCi + when (pkgName (package pkg) `notElem` partOfGHCi) $ do + loadFrameworks pkg + -- When a library A needs symbols from a library B, the order in + -- extra_libraries/extra_ld_opts is "-lA -lB", because that's the + -- way ld expects it for static linking. Dynamic linking is a + -- different story: When A has no dependency information for B, + -- dlopen-ing A with RTLD_NOW (see addDLL in Linker.c) will fail + -- when B has not been loaded before. In a nutshell: Reverse the + -- order of DLLs for dynamic linking. 
+ -- This fixes a problem with the HOpenGL package (see "Compiling + -- HOpenGL under recent versions of GHC" on the HOpenGL list). + mapM_ (load_dyn dirs) (reverse dlls) + + -- After loading all the DLLs, we can load the static objects. + -- Ordering isn't important here, because we do one final link + -- step to resolve everything. + mapM_ loadObj objs + + maybePutStr dflags "linking ... " + ok <- resolveObjs + if succeeded ok then maybePutStrLn dflags "done." + else throwDyn (InstallationError ("unable to load package `" ++ showPackageId (package pkg) ++ "'")) + +load_dyn dirs dll = do r <- loadDynamic dirs dll + case r of + Nothing -> return () + Just err -> throwDyn (CmdLineError ("can't load .so/.DLL for: " + ++ dll ++ " (" ++ err ++ ")" )) +#ifndef darwin_TARGET_OS +loadFrameworks pkg = return () +#else +loadFrameworks pkg = mapM_ load frameworks + where + fw_dirs = Packages.frameworkDirs pkg + frameworks = Packages.frameworks pkg + + load fw = do r <- loadFramework fw_dirs fw + case r of + Nothing -> return () + Just err -> throwDyn (CmdLineError ("can't load framework: " + ++ fw ++ " (" ++ err ++ ")" )) +#endif + +-- Try to find an object file for a given library in the given paths. +-- If it isn't present, we assume it's a dynamic library. +locateOneObj :: [FilePath] -> String -> IO LibrarySpec +locateOneObj dirs lib + = do { mb_obj_path <- findFile mk_obj_path dirs + ; case mb_obj_path of + Just obj_path -> return (Object obj_path) + Nothing -> + do { mb_lib_path <- findFile mk_dyn_lib_path dirs + ; case mb_lib_path of + Just lib_path -> return (DLL (lib ++ "_dyn")) + Nothing -> return (DLL lib) }} -- We assume + where + mk_obj_path dir = dir `joinFileName` (lib `joinFileExt` "o") + mk_dyn_lib_path dir = dir `joinFileName` mkSOName (lib ++ "_dyn") + + +-- ---------------------------------------------------------------------------- +-- Loading a dyanmic library (dlopen()-ish on Unix, LoadLibrary-ish on Win32) + +-- return Nothing == success, else Just error message from dlopen +loadDynamic paths rootname + = do { mb_dll <- findFile mk_dll_path paths + ; case mb_dll of + Just dll -> loadDLL dll + Nothing -> loadDLL (mkSOName rootname) } + -- Tried all our known library paths, so let + -- dlopen() search its own builtin paths now. + where + mk_dll_path dir = dir `joinFileName` mkSOName rootname + +#if defined(darwin_TARGET_OS) +mkSOName root = ("lib" ++ root) `joinFileExt` "dylib" +#elif defined(mingw32_TARGET_OS) +-- Win32 DLLs have no .dll extension here, because addDLL tries +-- both foo.dll and foo.drv +mkSOName root = root +#else +mkSOName root = ("lib" ++ root) `joinFileExt` "so" +#endif + +-- Darwin / MacOS X only: load a framework +-- a framework is a dynamic library packaged inside a directory of the same +-- name. They are searched for in different paths than normal libraries. 
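For a framework called, say, "GLUT", the convention just described means the loadFramework code below tries candidate paths like the ones this sketch produces (the helper is hypothetical and uses plain string concatenation where the real code uses joinFileName), handing each to loadDLL in turn.

    frameworkCandidates :: [FilePath] -> String -> [FilePath]
    frameworkCandidates extraPaths root =
      [ dir ++ "/" ++ root ++ ".framework/" ++ root
      | dir <- extraPaths ++ ["/Library/Frameworks", "/System/Library/Frameworks"] ]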
+#ifdef darwin_TARGET_OS +loadFramework extraPaths rootname + = do { mb_fwk <- findFile mk_fwk (extraPaths ++ defaultFrameworkPaths) + ; case mb_fwk of + Just fwk_path -> loadDLL fwk_path + Nothing -> return (Just "not found") } + -- Tried all our known library paths, but dlopen() + -- has no built-in paths for frameworks: give up + where + mk_fwk dir = dir `joinFileName` (rootname ++ ".framework/" ++ rootname) + -- sorry for the hardcoded paths, I hope they won't change anytime soon: + defaultFrameworkPaths = ["/Library/Frameworks", "/System/Library/Frameworks"] +#endif +\end{code} + +%************************************************************************ +%* * + Helper functions +%* * +%************************************************************************ + +\begin{code} +findFile :: (FilePath -> FilePath) -- Maps a directory path to a file path + -> [FilePath] -- Directories to look in + -> IO (Maybe FilePath) -- The first file path to match +findFile mk_file_path [] + = return Nothing +findFile mk_file_path (dir:dirs) + = do { let file_path = mk_file_path dir + ; b <- doesFileExist file_path + ; if b then + return (Just file_path) + else + findFile mk_file_path dirs } +\end{code} + +\begin{code} +maybePutStr dflags s | verbosity dflags > 0 = putStr s + | otherwise = return () + +maybePutStrLn dflags s | verbosity dflags > 0 = putStrLn s + | otherwise = return () +\end{code} diff --git a/compiler/ghci/ObjLink.lhs b/compiler/ghci/ObjLink.lhs new file mode 100644 index 0000000000..057938a45e --- /dev/null +++ b/compiler/ghci/ObjLink.lhs @@ -0,0 +1,97 @@ +% +% (c) The University of Glasgow, 2000 +% + +-- --------------------------------------------------------------------------- +-- The dynamic linker for object code (.o .so .dll files) +-- --------------------------------------------------------------------------- + +Primarily, this module consists of an interface to the C-land dynamic linker. 
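A minimal usage sketch of that interface: the object file and symbol name below are invented, succeeded is the BasicTypes predicate used elsewhere in the compiler, and the remaining functions are the ones exported by the module that follows.

    demoLink :: IO ()
    demoLink = do
      initObjLinker
      loadObj "Foo.o"
      ok <- resolveObjs
      if succeeded ok
         then do mb <- lookupSymbol "Foo_bar_closure"
                 putStrLn (maybe "symbol not found" (const "symbol found") mb)
         else putStrLn "resolving failed"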
+ +\begin{code} +{-# OPTIONS -#include "Linker.h" #-} + +module ObjLink ( + initObjLinker, -- :: IO () + loadDLL, -- :: String -> IO (Maybe String) + loadObj, -- :: String -> IO () + unloadObj, -- :: String -> IO () + lookupSymbol, -- :: String -> IO (Maybe (Ptr a)) + resolveObjs -- :: IO SuccessFlag + ) where + +import Monad ( when ) + +import Foreign.C +import Foreign ( Ptr, nullPtr ) +import Panic ( panic ) +import BasicTypes ( SuccessFlag, successIf ) +import Config ( cLeadingUnderscore ) +import Outputable + +-- --------------------------------------------------------------------------- +-- RTS Linker Interface +-- --------------------------------------------------------------------------- + +lookupSymbol :: String -> IO (Maybe (Ptr a)) +lookupSymbol str_in = do + let str = prefixUnderscore str_in + withCString str $ \c_str -> do + addr <- c_lookupSymbol c_str + if addr == nullPtr + then return Nothing + else return (Just addr) + +prefixUnderscore :: String -> String +prefixUnderscore + | cLeadingUnderscore == "YES" = ('_':) + | otherwise = id + +loadDLL :: String -> IO (Maybe String) +-- Nothing => success +-- Just err_msg => failure +loadDLL str = do + maybe_errmsg <- withCString str $ \dll -> c_addDLL dll + if maybe_errmsg == nullPtr + then return Nothing + else do str <- peekCString maybe_errmsg + return (Just str) + +loadObj :: String -> IO () +loadObj str = do + withCString str $ \c_str -> do + r <- c_loadObj c_str + when (r == 0) (panic "loadObj: failed") + +unloadObj :: String -> IO () +unloadObj str = + withCString str $ \c_str -> do + r <- c_unloadObj c_str + when (r == 0) (panic "unloadObj: failed") + +resolveObjs :: IO SuccessFlag +resolveObjs = do + r <- c_resolveObjs + return (successIf (r /= 0)) + +-- --------------------------------------------------------------------------- +-- Foreign declaractions to RTS entry points which does the real work; +-- --------------------------------------------------------------------------- + +#if __GLASGOW_HASKELL__ >= 504 +foreign import ccall unsafe "addDLL" c_addDLL :: CString -> IO CString +foreign import ccall unsafe "initLinker" initObjLinker :: IO () +foreign import ccall unsafe "lookupSymbol" c_lookupSymbol :: CString -> IO (Ptr a) +foreign import ccall unsafe "loadObj" c_loadObj :: CString -> IO Int +foreign import ccall unsafe "unloadObj" c_unloadObj :: CString -> IO Int +foreign import ccall unsafe "resolveObjs" c_resolveObjs :: IO Int +#else +foreign import "addDLL" unsafe c_addDLL :: CString -> IO CString +foreign import "initLinker" unsafe initLinker :: IO () +foreign import "lookupSymbol" unsafe c_lookupSymbol :: CString -> IO (Ptr a) +foreign import "loadObj" unsafe c_loadObj :: CString -> IO Int +foreign import "unloadObj" unsafe c_unloadObj :: CString -> IO Int +foreign import "resolveObjs" unsafe c_resolveObjs :: IO Int +#endif + +\end{code} diff --git a/compiler/ghci/keepCAFsForGHCi.c b/compiler/ghci/keepCAFsForGHCi.c new file mode 100644 index 0000000000..0aabbedea0 --- /dev/null +++ b/compiler/ghci/keepCAFsForGHCi.c @@ -0,0 +1,15 @@ +#include "Rts.h" +#include "Storage.h" + +// This file is only included when GhcBuildDylibs is set in mk/build.mk. +// It contains an __attribute__((constructor)) function (run prior to main()) +// which sets the keepCAFs flag in the RTS, before any Haskell code is run. +// This is required so that GHCi can use dynamic libraries instead of HSxyz.o +// files. 
+ +static void keepCAFsForGHCi() __attribute__((constructor)); + +static void keepCAFsForGHCi() +{ + keepCAFs = 1; +} diff --git a/compiler/hsSyn/Convert.lhs b/compiler/hsSyn/Convert.lhs new file mode 100644 index 0000000000..6c14c11893 --- /dev/null +++ b/compiler/hsSyn/Convert.lhs @@ -0,0 +1,622 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% + +This module converts Template Haskell syntax into HsSyn + + +\begin{code} +module Convert( convertToHsExpr, convertToHsDecls, convertToHsType, thRdrName ) where + +#include "HsVersions.h" + +import Language.Haskell.TH as TH hiding (sigP) +import Language.Haskell.TH.Syntax as TH + +import HsSyn as Hs +import qualified Class (FunDep) +import RdrName ( RdrName, mkRdrUnqual, mkRdrQual, mkOrig, getRdrName, nameRdrName ) +import qualified Name ( Name, mkInternalName, getName ) +import Module ( Module, mkModule ) +import RdrHsSyn ( mkClassDecl, mkTyData ) +import qualified OccName +import OccName ( startsVarId, startsVarSym, startsConId, startsConSym, + pprNameSpace ) +import SrcLoc ( Located(..), SrcSpan ) +import Type ( Type ) +import TysWiredIn ( unitTyCon, tupleTyCon, tupleCon, trueDataCon, nilDataCon, consDataCon ) +import BasicTypes( Boxity(..) ) +import ForeignCall ( Safety(..), CCallConv(..), CCallTarget(..), + CExportSpec(..)) +import Char ( isAscii, isAlphaNum, isAlpha ) +import List ( partition ) +import Unique ( Unique, mkUniqueGrimily ) +import ErrUtils ( Message ) +import GLAEXTS ( Int(..), Int# ) +import SrcLoc ( noSrcLoc ) +import Bag ( listToBag ) +import FastString +import Outputable + + + +------------------------------------------------------------------- +-- The external interface + +convertToHsDecls :: SrcSpan -> [TH.Dec] -> Either Message [LHsDecl RdrName] +convertToHsDecls loc ds = initCvt loc (mapM cvtTop ds) + +convertToHsExpr :: SrcSpan -> TH.Exp -> Either Message (LHsExpr RdrName) +convertToHsExpr loc e = initCvt loc (cvtl e) + +convertToHsType :: SrcSpan -> TH.Type -> Either Message (LHsType RdrName) +convertToHsType loc t = initCvt loc (cvtType t) + + +------------------------------------------------------------------- +newtype CvtM a = CvtM { unCvtM :: SrcSpan -> Either Message a } + -- Push down the source location; + -- Can fail, with a single error message + +-- NB: If the conversion succeeds with (Right x), there should +-- be no exception values hiding in x +-- Reason: so a (head []) in TH code doesn't subsequently +-- make GHC crash when it tries to walk the generated tree + +-- Use the loc everywhere, for lack of anything better +-- In particular, we want it on binding locations, so that variables bound in +-- the spliced-in declarations get a location that at least relates to the splice point + +instance Monad CvtM where + return x = CvtM $ \loc -> Right x + (CvtM m) >>= k = CvtM $ \loc -> case m loc of + Left err -> Left err + Right v -> unCvtM (k v) loc + +initCvt :: SrcSpan -> CvtM a -> Either Message a +initCvt loc (CvtM m) = m loc + +force :: a -> CvtM a +force a = a `seq` return a + +failWith :: Message -> CvtM a +failWith m = CvtM (\loc -> Left full_msg) + where + full_msg = m $$ ptext SLIT("When splicing generated code into the program") + +returnL :: a -> CvtM (Located a) +returnL x = CvtM (\loc -> Right (L loc x)) + +wrapL :: CvtM a -> CvtM (Located a) +wrapL (CvtM m) = CvtM (\loc -> case m loc of + Left err -> Left err + Right v -> Right (L loc v)) + +------------------------------------------------------------------- +cvtTop :: TH.Dec -> CvtM (LHsDecl RdrName) +cvtTop d@(TH.ValD _ 
_ _) = do { L loc d' <- cvtBind d; return (L loc $ Hs.ValD d') } +cvtTop d@(TH.FunD _ _) = do { L loc d' <- cvtBind d; return (L loc $ Hs.ValD d') } +cvtTop (TH.SigD nm typ) = do { nm' <- vNameL nm + ; ty' <- cvtType typ + ; returnL $ Hs.SigD (TypeSig nm' ty') } + +cvtTop (TySynD tc tvs rhs) + = do { tc' <- tconNameL tc + ; tvs' <- cvtTvs tvs + ; rhs' <- cvtType rhs + ; returnL $ TyClD (TySynonym tc' tvs' rhs') } + +cvtTop (DataD ctxt tc tvs constrs derivs) + = do { stuff <- cvt_tycl_hdr ctxt tc tvs + ; cons' <- mapM cvtConstr constrs + ; derivs' <- cvtDerivs derivs + ; returnL $ TyClD (mkTyData DataType stuff Nothing cons' derivs') } + + +cvtTop (NewtypeD ctxt tc tvs constr derivs) + = do { stuff <- cvt_tycl_hdr ctxt tc tvs + ; con' <- cvtConstr constr + ; derivs' <- cvtDerivs derivs + ; returnL $ TyClD (mkTyData NewType stuff Nothing [con'] derivs') } + +cvtTop (ClassD ctxt cl tvs fds decs) + = do { stuff <- cvt_tycl_hdr ctxt cl tvs + ; fds' <- mapM cvt_fundep fds + ; (binds', sigs') <- cvtBindsAndSigs decs + ; returnL $ TyClD $ mkClassDecl stuff fds' sigs' binds' } + +cvtTop (InstanceD tys ty decs) + = do { (binds', sigs') <- cvtBindsAndSigs decs + ; ctxt' <- cvtContext tys + ; L loc pred' <- cvtPred ty + ; inst_ty' <- returnL $ mkImplicitHsForAllTy ctxt' (L loc (HsPredTy pred')) + ; returnL $ InstD (InstDecl inst_ty' binds' sigs') } + +cvtTop (ForeignD ford) = do { ford' <- cvtForD ford; returnL $ ForD ford' } + +cvt_tycl_hdr cxt tc tvs + = do { cxt' <- cvtContext cxt + ; tc' <- tconNameL tc + ; tvs' <- cvtTvs tvs + ; return (cxt', tc', tvs') } + +--------------------------------------------------- +-- Data types +-- Can't handle GADTs yet +--------------------------------------------------- + +cvtConstr (NormalC c strtys) + = do { c' <- cNameL c + ; cxt' <- returnL [] + ; tys' <- mapM cvt_arg strtys + ; returnL $ ConDecl c' Explicit noExistentials cxt' (PrefixCon tys') ResTyH98 } + +cvtConstr (RecC c varstrtys) + = do { c' <- cNameL c + ; cxt' <- returnL [] + ; args' <- mapM cvt_id_arg varstrtys + ; returnL $ ConDecl c' Explicit noExistentials cxt' (RecCon args') ResTyH98 } + +cvtConstr (InfixC st1 c st2) + = do { c' <- cNameL c + ; cxt' <- returnL [] + ; st1' <- cvt_arg st1 + ; st2' <- cvt_arg st2 + ; returnL $ ConDecl c' Explicit noExistentials cxt' (InfixCon st1' st2') ResTyH98 } + +cvtConstr (ForallC tvs ctxt (ForallC tvs' ctxt' con')) + = cvtConstr (ForallC (tvs ++ tvs') (ctxt ++ ctxt') con') + +cvtConstr (ForallC tvs ctxt con) + = do { L _ con' <- cvtConstr con + ; tvs' <- cvtTvs tvs + ; ctxt' <- cvtContext ctxt + ; case con' of + ConDecl l _ [] (L _ []) x ResTyH98 + -> returnL $ ConDecl l Explicit tvs' ctxt' x ResTyH98 + c -> panic "ForallC: Can't happen" } + +cvt_arg (IsStrict, ty) = do { ty' <- cvtType ty; returnL $ HsBangTy HsStrict ty' } +cvt_arg (NotStrict, ty) = cvtType ty + +cvt_id_arg (i, str, ty) = do { i' <- vNameL i + ; ty' <- cvt_arg (str,ty) + ; return (i', ty') } + +cvtDerivs [] = return Nothing +cvtDerivs cs = do { cs' <- mapM cvt_one cs + ; return (Just cs') } + where + cvt_one c = do { c' <- tconName c + ; returnL $ HsPredTy $ HsClassP c' [] } + +cvt_fundep :: FunDep -> CvtM (Located (Class.FunDep RdrName)) +cvt_fundep (FunDep xs ys) = do { xs' <- mapM tName xs; ys' <- mapM tName ys; returnL (xs', ys') } + +noExistentials = [] + +------------------------------------------ +-- Foreign declarations +------------------------------------------ + +cvtForD :: Foreign -> CvtM (ForeignDecl RdrName) +cvtForD (ImportF callconv safety from nm ty) + | Just (c_header, cis) 
<- parse_ccall_impent (TH.nameBase nm) from + = do { nm' <- vNameL nm + ; ty' <- cvtType ty + ; let i = CImport (cvt_conv callconv) safety' c_header nilFS cis + ; return $ ForeignImport nm' ty' i False } + + | otherwise + = failWith $ text (show from)<+> ptext SLIT("is not a valid ccall impent") + where + safety' = case safety of + Unsafe -> PlayRisky + Safe -> PlaySafe False + Threadsafe -> PlaySafe True + +cvtForD (ExportF callconv as nm ty) + = do { nm' <- vNameL nm + ; ty' <- cvtType ty + ; let e = CExport (CExportStatic (mkFastString as) (cvt_conv callconv)) + ; return $ ForeignExport nm' ty' e False } + +cvt_conv CCall = CCallConv +cvt_conv StdCall = StdCallConv + +parse_ccall_impent :: String -> String -> Maybe (FastString, CImportSpec) +parse_ccall_impent nm s + = case lex_ccall_impent s of + Just ["dynamic"] -> Just (nilFS, CFunction DynamicTarget) + Just ["wrapper"] -> Just (nilFS, CWrapper) + Just ("static":ts) -> parse_ccall_impent_static nm ts + Just ts -> parse_ccall_impent_static nm ts + Nothing -> Nothing + +parse_ccall_impent_static :: String + -> [String] + -> Maybe (FastString, CImportSpec) +parse_ccall_impent_static nm ts + = let ts' = case ts of + [ "&", cid] -> [ cid] + [fname, "&" ] -> [fname ] + [fname, "&", cid] -> [fname, cid] + _ -> ts + in case ts' of + [ cid] | is_cid cid -> Just (nilFS, mk_cid cid) + [fname, cid] | is_cid cid -> Just (mkFastString fname, mk_cid cid) + [ ] -> Just (nilFS, mk_cid nm) + [fname ] -> Just (mkFastString fname, mk_cid nm) + _ -> Nothing + where is_cid :: String -> Bool + is_cid x = all (/= '.') x && (isAlpha (head x) || head x == '_') + mk_cid :: String -> CImportSpec + mk_cid = CFunction . StaticTarget . mkFastString + +lex_ccall_impent :: String -> Maybe [String] +lex_ccall_impent "" = Just [] +lex_ccall_impent ('&':xs) = fmap ("&":) $ lex_ccall_impent xs +lex_ccall_impent (' ':xs) = lex_ccall_impent xs +lex_ccall_impent ('\t':xs) = lex_ccall_impent xs +lex_ccall_impent xs = case span is_valid xs of + ("", _) -> Nothing + (t, xs') -> fmap (t:) $ lex_ccall_impent xs' + where is_valid :: Char -> Bool + is_valid c = isAscii c && (isAlphaNum c || c `elem` "._") + + +--------------------------------------------------- +-- Declarations +--------------------------------------------------- + +cvtDecs :: [TH.Dec] -> CvtM (HsLocalBinds RdrName) +cvtDecs [] = return EmptyLocalBinds +cvtDecs ds = do { (binds,sigs) <- cvtBindsAndSigs ds + ; return (HsValBinds (ValBindsIn binds sigs)) } + +cvtBindsAndSigs ds + = do { binds' <- mapM cvtBind binds; sigs' <- mapM cvtSig sigs + ; return (listToBag binds', sigs') } + where + (sigs, binds) = partition is_sig ds + + is_sig (TH.SigD _ _) = True + is_sig other = False + +cvtSig (TH.SigD nm ty) + = do { nm' <- vNameL nm; ty' <- cvtType ty; returnL (Hs.TypeSig nm' ty') } + +cvtBind :: TH.Dec -> CvtM (LHsBind RdrName) +-- Used only for declarations in a 'let/where' clause, +-- not for top level decls +cvtBind (TH.ValD (TH.VarP s) body ds) + = do { s' <- vNameL s + ; cl' <- cvtClause (Clause [] body ds) + ; returnL $ mkFunBind s' [cl'] } + +cvtBind (TH.FunD nm cls) + = do { nm' <- vNameL nm + ; cls' <- mapM cvtClause cls + ; returnL $ mkFunBind nm' cls' } + +cvtBind (TH.ValD p body ds) + = do { p' <- cvtPat p + ; g' <- cvtGuard body + ; ds' <- cvtDecs ds + ; returnL $ PatBind { pat_lhs = p', pat_rhs = GRHSs g' ds', + pat_rhs_ty = void, bind_fvs = placeHolderNames } } + +cvtBind d + = failWith (sep [ptext SLIT("Illegal kind of declaration in where clause"), + nest 2 (text (TH.pprint d))]) + +cvtClause :: 
TH.Clause -> CvtM (Hs.LMatch RdrName) +cvtClause (Clause ps body wheres) + = do { ps' <- cvtPats ps + ; g' <- cvtGuard body + ; ds' <- cvtDecs wheres + ; returnL $ Hs.Match ps' Nothing (GRHSs g' ds') } + + +------------------------------------------------------------------- +-- Expressions +------------------------------------------------------------------- + +cvtl :: TH.Exp -> CvtM (LHsExpr RdrName) +cvtl e = wrapL (cvt e) + where + cvt (VarE s) = do { s' <- vName s; return $ HsVar s' } + cvt (ConE s) = do { s' <- cName s; return $ HsVar s' } + cvt (LitE l) + | overloadedLit l = do { l' <- cvtOverLit l; return $ HsOverLit l' } + | otherwise = do { l' <- cvtLit l; return $ HsLit l' } + + cvt (AppE x y) = do { x' <- cvtl x; y' <- cvtl y; return $ HsApp x' y' } + cvt (LamE ps e) = do { ps' <- cvtPats ps; e' <- cvtl e + ; return $ HsLam (mkMatchGroup [mkSimpleMatch ps' e']) } + cvt (TupE [e]) = cvt e + cvt (TupE es) = do { es' <- mapM cvtl es; return $ ExplicitTuple es' Boxed } + cvt (CondE x y z) = do { x' <- cvtl x; y' <- cvtl y; z' <- cvtl z + ; return $ HsIf x' y' z' } + cvt (LetE ds e) = do { ds' <- cvtDecs ds; e' <- cvtl e; return $ HsLet ds' e' } + cvt (CaseE e ms) = do { e' <- cvtl e; ms' <- mapM cvtMatch ms + ; return $ HsCase e' (mkMatchGroup ms') } + cvt (DoE ss) = cvtHsDo DoExpr ss + cvt (CompE ss) = cvtHsDo ListComp ss + cvt (ArithSeqE dd) = do { dd' <- cvtDD dd; return $ ArithSeq noPostTcExpr dd' } + cvt (ListE xs) = do { xs' <- mapM cvtl xs; return $ ExplicitList void xs' } + cvt (InfixE (Just x) s (Just y)) = do { x' <- cvtl x; s' <- cvtl s; y' <- cvtl y + ; e' <- returnL $ OpApp x' s' undefined y' + ; return $ HsPar e' } + cvt (InfixE Nothing s (Just y)) = do { s' <- cvtl s; y' <- cvtl y + ; return $ SectionR s' y' } + cvt (InfixE (Just x) s Nothing ) = do { x' <- cvtl x; s' <- cvtl s + ; return $ SectionL x' s' } + cvt (InfixE Nothing s Nothing ) = cvt s -- Can I indicate this is an infix thing? 
+ + cvt (SigE e t) = do { e' <- cvtl e; t' <- cvtType t + ; return $ ExprWithTySig e' t' } + cvt (RecConE c flds) = do { c' <- cNameL c + ; flds' <- mapM cvtFld flds + ; return $ RecordCon c' noPostTcExpr flds' } + cvt (RecUpdE e flds) = do { e' <- cvtl e + ; flds' <- mapM cvtFld flds + ; return $ RecordUpd e' flds' placeHolderType placeHolderType } + +cvtFld (v,e) = do { v' <- vNameL v; e' <- cvtl e; return (v',e') } + +cvtDD :: Range -> CvtM (ArithSeqInfo RdrName) +cvtDD (FromR x) = do { x' <- cvtl x; return $ From x' } +cvtDD (FromThenR x y) = do { x' <- cvtl x; y' <- cvtl y; return $ FromThen x' y' } +cvtDD (FromToR x y) = do { x' <- cvtl x; y' <- cvtl y; return $ FromTo x' y' } +cvtDD (FromThenToR x y z) = do { x' <- cvtl x; y' <- cvtl y; z' <- cvtl z; return $ FromThenTo x' y' z' } + +------------------------------------- +-- Do notation and statements +------------------------------------- + +cvtHsDo do_or_lc stmts + = do { stmts' <- cvtStmts stmts + ; let body = case last stmts' of + L _ (ExprStmt body _ _) -> body + ; return $ HsDo do_or_lc (init stmts') body void } + +cvtStmts = mapM cvtStmt + +cvtStmt :: TH.Stmt -> CvtM (Hs.LStmt RdrName) +cvtStmt (NoBindS e) = do { e' <- cvtl e; returnL $ mkExprStmt e' } +cvtStmt (TH.BindS p e) = do { p' <- cvtPat p; e' <- cvtl e; returnL $ mkBindStmt p' e' } +cvtStmt (TH.LetS ds) = do { ds' <- cvtDecs ds; returnL $ LetStmt ds' } +cvtStmt (TH.ParS dss) = do { dss' <- mapM cvt_one dss; returnL $ ParStmt dss' } + where + cvt_one ds = do { ds' <- cvtStmts ds; return (ds', undefined) } + +cvtMatch :: TH.Match -> CvtM (Hs.LMatch RdrName) +cvtMatch (TH.Match p body decs) + = do { p' <- cvtPat p + ; g' <- cvtGuard body + ; decs' <- cvtDecs decs + ; returnL $ Hs.Match [p'] Nothing (GRHSs g' decs') } + +cvtGuard :: TH.Body -> CvtM [LGRHS RdrName] +cvtGuard (GuardedB pairs) = mapM cvtpair pairs +cvtGuard (NormalB e) = do { e' <- cvtl e; g' <- returnL $ GRHS [] e'; return [g'] } + +cvtpair :: (TH.Guard, TH.Exp) -> CvtM (LGRHS RdrName) +cvtpair (NormalG ge,rhs) = do { ge' <- cvtl ge; rhs' <- cvtl rhs + ; g' <- returnL $ mkBindStmt truePat ge' + ; returnL $ GRHS [g'] rhs' } +cvtpair (PatG gs,rhs) = do { gs' <- cvtStmts gs; rhs' <- cvtl rhs + ; returnL $ GRHS gs' rhs' } + +cvtOverLit :: Lit -> CvtM (HsOverLit RdrName) +cvtOverLit (IntegerL i) = do { force i; return $ mkHsIntegral i } +cvtOverLit (RationalL r) = do { force r; return $ mkHsFractional r } +-- An Integer is like an an (overloaded) '3' in a Haskell source program +-- Similarly 3.5 for fractionals + +cvtLit :: Lit -> CvtM HsLit +cvtLit (IntPrimL i) = do { force i; return $ HsIntPrim i } +cvtLit (FloatPrimL f) = do { force f; return $ HsFloatPrim f } +cvtLit (DoublePrimL f) = do { force f; return $ HsDoublePrim f } +cvtLit (CharL c) = do { force c; return $ HsChar c } +cvtLit (StringL s) = do { let { s' = mkFastString s }; force s'; return $ HsString s' } + +cvtPats :: [TH.Pat] -> CvtM [Hs.LPat RdrName] +cvtPats pats = mapM cvtPat pats + +cvtPat :: TH.Pat -> CvtM (Hs.LPat RdrName) +cvtPat pat = wrapL (cvtp pat) + +cvtp :: TH.Pat -> CvtM (Hs.Pat RdrName) +cvtp (TH.LitP l) + | overloadedLit l = do { l' <- cvtOverLit l + ; return (mkNPat l' Nothing) } + -- Not right for negative patterns; + -- need to think about that! 
+ | otherwise = do { l' <- cvtLit l; return $ Hs.LitPat l' } +cvtp (TH.VarP s) = do { s' <- vName s; return $ Hs.VarPat s' } +cvtp (TupP [p]) = cvtp p +cvtp (TupP ps) = do { ps' <- cvtPats ps; return $ TuplePat ps' Boxed void } +cvtp (ConP s ps) = do { s' <- cNameL s; ps' <- cvtPats ps; return $ ConPatIn s' (PrefixCon ps') } +cvtp (InfixP p1 s p2) = do { s' <- cNameL s; p1' <- cvtPat p1; p2' <- cvtPat p2 + ; return $ ConPatIn s' (InfixCon p1' p2') } +cvtp (TildeP p) = do { p' <- cvtPat p; return $ LazyPat p' } +cvtp (TH.AsP s p) = do { s' <- vNameL s; p' <- cvtPat p; return $ AsPat s' p' } +cvtp TH.WildP = return $ WildPat void +cvtp (RecP c fs) = do { c' <- cNameL c; fs' <- mapM cvtPatFld fs + ; return $ ConPatIn c' $ Hs.RecCon fs' } +cvtp (ListP ps) = do { ps' <- cvtPats ps; return $ ListPat ps' void } +cvtp (SigP p t) = do { p' <- cvtPat p; t' <- cvtType t; return $ SigPatIn p' t' } + +cvtPatFld (s,p) = do { s' <- vNameL s; p' <- cvtPat p; return (s',p') } + +----------------------------------------------------------- +-- Types and type variables + +cvtTvs :: [TH.Name] -> CvtM [LHsTyVarBndr RdrName] +cvtTvs tvs = mapM cvt_tv tvs + +cvt_tv tv = do { tv' <- tName tv; returnL $ UserTyVar tv' } + +cvtContext :: Cxt -> CvtM (LHsContext RdrName) +cvtContext tys = do { preds' <- mapM cvtPred tys; returnL preds' } + +cvtPred :: TH.Type -> CvtM (LHsPred RdrName) +cvtPred ty + = do { (head, tys') <- split_ty_app ty + ; case head of + ConT tc -> do { tc' <- tconName tc; returnL $ HsClassP tc' tys' } + VarT tv -> do { tv' <- tName tv; returnL $ HsClassP tv' tys' } + other -> failWith (ptext SLIT("Malformed predicate") <+> text (TH.pprint ty)) } + +cvtType :: TH.Type -> CvtM (LHsType RdrName) +cvtType ty = do { (head, tys') <- split_ty_app ty + ; case head of + TupleT n | length tys' == n -> returnL (HsTupleTy Boxed tys') + | n == 0 -> mk_apps (HsTyVar (getRdrName unitTyCon)) tys' + | otherwise -> mk_apps (HsTyVar (getRdrName (tupleTyCon Boxed n))) tys' + ArrowT | [x',y'] <- tys' -> returnL (HsFunTy x' y') + ListT | [x'] <- tys' -> returnL (HsListTy x') + VarT nm -> do { nm' <- tName nm; mk_apps (HsTyVar nm') tys' } + ConT nm -> do { nm' <- tconName nm; mk_apps (HsTyVar nm') tys' } + + ForallT tvs cxt ty | null tys' -> do { tvs' <- cvtTvs tvs + ; cxt' <- cvtContext cxt + ; ty' <- cvtType ty + ; returnL $ mkExplicitHsForAllTy tvs' cxt' ty' } + otherwise -> failWith (ptext SLIT("Malformed type") <+> text (show ty)) + } + where + mk_apps head [] = returnL head + mk_apps head (ty:tys) = do { head' <- returnL head; mk_apps (HsAppTy head' ty) tys } + +split_ty_app :: TH.Type -> CvtM (TH.Type, [LHsType RdrName]) +split_ty_app ty = go ty [] + where + go (AppT f a) as' = do { a' <- cvtType a; go f (a':as') } + go f as = return (f,as) + +----------------------------------------------------------- + + +----------------------------------------------------------- +-- some useful things + +truePat = nlConPat (getRdrName trueDataCon) [] + +overloadedLit :: Lit -> Bool +-- True for literals that Haskell treats as overloaded +overloadedLit (IntegerL l) = True +overloadedLit (RationalL l) = True +overloadedLit l = False + +void :: Type.Type +void = placeHolderType + +-------------------------------------------------------------------- +-- Turning Name back into RdrName +-------------------------------------------------------------------- + +-- variable names +vNameL, cNameL, tconNameL :: TH.Name -> CvtM (Located RdrName) +vName, cName, tName, tconName :: TH.Name -> CvtM RdrName + +vNameL n = wrapL (vName n) +vName n 
= cvtName OccName.varName n + +-- Constructor function names; this is Haskell source, hence srcDataName +cNameL n = wrapL (cName n) +cName n = cvtName OccName.dataName n + +-- Type variable names +tName n = cvtName OccName.tvName n + +-- Type Constructor names +tconNameL n = wrapL (tconName n) +tconName n = cvtName OccName.tcClsName n + +cvtName :: OccName.NameSpace -> TH.Name -> CvtM RdrName +cvtName ctxt_ns (TH.Name occ flavour) + | not (okOcc ctxt_ns occ_str) = failWith (badOcc ctxt_ns occ_str) + | otherwise = force (thRdrName ctxt_ns occ_str flavour) + where + occ_str = TH.occString occ + +okOcc :: OccName.NameSpace -> String -> Bool +okOcc _ [] = False +okOcc ns str@(c:_) + | OccName.isVarName ns = startsVarId c || startsVarSym c + | otherwise = startsConId c || startsConSym c || str == "[]" + +badOcc :: OccName.NameSpace -> String -> SDoc +badOcc ctxt_ns occ + = ptext SLIT("Illegal") <+> pprNameSpace ctxt_ns + <+> ptext SLIT("name:") <+> quotes (text occ) + +thRdrName :: OccName.NameSpace -> String -> TH.NameFlavour -> RdrName +-- This turns a Name into a RdrName +-- The passed-in name space tells what the context is expecting; +-- use it unless the TH name knows what name-space it comes +-- from, in which case use the latter +-- +-- ToDo: we may generate silly RdrNames, by passing a name space +-- that doesn't match the string, like VarName ":+", +-- which will give confusing error messages later +-- +-- The strict applications ensure that any buried exceptions get forced +thRdrName ctxt_ns occ (TH.NameG th_ns mod) = (mkOrig $! (mk_mod mod)) $! (mk_occ (mk_ghc_ns th_ns) occ) +thRdrName ctxt_ns occ (TH.NameL uniq) = nameRdrName $! (((Name.mkInternalName $! (mk_uniq uniq)) $! (mk_occ ctxt_ns occ)) noSrcLoc) +thRdrName ctxt_ns occ (TH.NameQ mod) = (mkRdrQual $! (mk_mod mod)) $! (mk_occ ctxt_ns occ) +thRdrName ctxt_ns occ (TH.NameU uniq) = mkRdrUnqual $! (mk_uniq_occ ctxt_ns occ uniq) +thRdrName ctxt_ns occ TH.NameS + | Just name <- isBuiltInOcc ctxt_ns occ = nameRdrName $! name + | otherwise = mkRdrUnqual $! (mk_occ ctxt_ns occ) + +isBuiltInOcc :: OccName.NameSpace -> String -> Maybe Name.Name +-- Built in syntax isn't "in scope" so an Unqual RdrName won't do +-- We must generate an Exact name, just as the parser does +isBuiltInOcc ctxt_ns occ + = case occ of + ":" -> Just (Name.getName consDataCon) + "[]" -> Just (Name.getName nilDataCon) + "()" -> Just (tup_name 0) + '(' : ',' : rest -> go_tuple 2 rest + other -> Nothing + where + go_tuple n ")" = Just (tup_name n) + go_tuple n (',' : rest) = go_tuple (n+1) rest + go_tuple n other = Nothing + + tup_name n + | OccName.isTcClsName ctxt_ns = Name.getName (tupleTyCon Boxed n) + | otherwise = Name.getName (tupleCon Boxed n) + +mk_uniq_occ :: OccName.NameSpace -> String -> Int# -> OccName.OccName +mk_uniq_occ ns occ uniq + = OccName.mkOccName ns (occ ++ '[' : shows (mk_uniq uniq) "]") + -- The idea here is to make a name that + -- a) the user could not possibly write, and + -- b) cannot clash with another NameU + -- Previously I generated an Exact RdrName with mkInternalName. + -- This works fine for local binders, but does not work at all for + -- top-level binders, which must have External Names, since they are + -- rapidly baked into data constructors and the like. 
Baling out + -- and generating an unqualified RdrName here is the simple solution + +-- The packing and unpacking is rather turgid :-( +mk_occ :: OccName.NameSpace -> String -> OccName.OccName +mk_occ ns occ = OccName.mkOccNameFS ns (mkFastString occ) + +mk_ghc_ns :: TH.NameSpace -> OccName.NameSpace +mk_ghc_ns TH.DataName = OccName.dataName +mk_ghc_ns TH.TcClsName = OccName.tcClsName +mk_ghc_ns TH.VarName = OccName.varName + +mk_mod :: TH.ModName -> Module +mk_mod mod = mkModule (TH.modString mod) + +mk_uniq :: Int# -> Unique +mk_uniq u = mkUniqueGrimily (I# u) +\end{code} + diff --git a/compiler/hsSyn/HsBinds.lhs b/compiler/hsSyn/HsBinds.lhs new file mode 100644 index 0000000000..b5c21792af --- /dev/null +++ b/compiler/hsSyn/HsBinds.lhs @@ -0,0 +1,479 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[HsBinds]{Abstract syntax: top-level bindings and signatures} + +Datatype for: @BindGroup@, @Bind@, @Sig@, @Bind@. + +\begin{code} +module HsBinds where + +#include "HsVersions.h" + +import {-# SOURCE #-} HsExpr ( HsExpr, pprExpr, LHsExpr, + MatchGroup, pprFunBind, + GRHSs, pprPatBind ) +import {-# SOURCE #-} HsPat ( LPat ) + +import HsTypes ( LHsType, PostTcType ) +import Type ( Type ) +import Name ( Name ) +import NameSet ( NameSet, elemNameSet ) +import BasicTypes ( IPName, RecFlag(..), InlineSpec(..), Fixity ) +import Outputable +import SrcLoc ( Located(..), SrcSpan, unLoc ) +import Util ( sortLe ) +import Var ( TyVar, DictId, Id ) +import Bag ( Bag, emptyBag, isEmptyBag, bagToList, unionBags, unionManyBags ) +\end{code} + +%************************************************************************ +%* * +\subsection{Bindings: @BindGroup@} +%* * +%************************************************************************ + +Global bindings (where clauses) + +\begin{code} +data HsLocalBinds id -- Bindings in a 'let' expression + -- or a 'where' clause + = HsValBinds (HsValBinds id) + | HsIPBinds (HsIPBinds id) + + | EmptyLocalBinds + +data HsValBinds id -- Value bindings (not implicit parameters) + = ValBindsIn -- Before typechecking + (LHsBinds id) [LSig id] -- Not dependency analysed + -- Recursive by default + + | ValBindsOut -- After renaming + [(RecFlag, LHsBinds id)] -- Dependency analysed + [LSig Name] + +type LHsBinds id = Bag (LHsBind id) +type DictBinds id = LHsBinds id -- Used for dictionary or method bindings +type LHsBind id = Located (HsBind id) + +data HsBind id + = FunBind { -- FunBind is used for both functions f x = e + -- and variables f = \x -> e +-- Reason 1: the Match stuff lets us have an optional +-- result type sig f :: a->a = ...mentions a... +-- +-- Reason 2: Special case for type inference: see TcBinds.tcMonoBinds +-- +-- Reason 3: instance decls can only have FunBinds, which is convenient +-- If you change this, you'll need tochange e.g. rnMethodBinds + + fun_id :: Located id, + + fun_infix :: Bool, -- True => infix declaration + + fun_matches :: MatchGroup id, -- The payload + + fun_co_fn :: ExprCoFn, -- Coercion from the type of the MatchGroup to the type of + -- the Id. Example: + -- f :: Int -> forall a. a -> a + -- f x y = y + -- Then the MatchGroup will have type (Int -> a' -> a') + -- (with a free type variable a'). The coercion will take + -- a CoreExpr of this type and convert it to a CoreExpr of + -- type Int -> forall a'. a' -> a' + -- Notice that the coercion captures the free a'. 
That's + -- why coercions are (CoreExpr -> CoreExpr), rather than + -- just CoreExpr (with a functional type) + + bind_fvs :: NameSet -- After the renamer, this contains a superset of the + -- Names of the other binders in this binding group that + -- are free in the RHS of the defn + -- Before renaming, and after typechecking, + -- the field is unused; it's just an error thunk + } + + | PatBind { -- The pattern is never a simple variable; + -- That case is done by FunBind + pat_lhs :: LPat id, + pat_rhs :: GRHSs id, + pat_rhs_ty :: PostTcType, -- Type of the GRHSs + bind_fvs :: NameSet -- Same as for FunBind + } + + | VarBind { -- Dictionary binding and suchlike + var_id :: id, -- All VarBinds are introduced by the type checker + var_rhs :: LHsExpr id -- Located only for consistency + } + + | AbsBinds { -- Binds abstraction; TRANSLATION + abs_tvs :: [TyVar], + abs_dicts :: [DictId], + abs_exports :: [([TyVar], id, id, [Prag])], -- (tvs, poly_id, mono_id, prags) + abs_binds :: LHsBinds id -- The dictionary bindings and typechecked user bindings + -- mixed up together; you can tell the dict bindings because + -- they are all VarBinds + } + -- Consider (AbsBinds tvs ds [(ftvs, poly_f, mono_f) binds] + -- + -- Creates bindings for (polymorphic, overloaded) poly_f + -- in terms of monomorphic, non-overloaded mono_f + -- + -- Invariants: + -- 1. 'binds' binds mono_f + -- 2. ftvs is a subset of tvs + -- 3. ftvs includes all tyvars free in ds + -- + -- See section 9 of static semantics paper for more details. + -- (You can get a PhD for explaining the True Meaning + -- of this last construct.) + +placeHolderNames :: NameSet +-- Used for the NameSet in FunBind and PatBind prior to the renamer +placeHolderNames = panic "placeHolderNames" + +------------ +instance OutputableBndr id => Outputable (HsLocalBinds id) where + ppr (HsValBinds bs) = ppr bs + ppr (HsIPBinds bs) = ppr bs + ppr EmptyLocalBinds = empty + +instance OutputableBndr id => Outputable (HsValBinds id) where + ppr (ValBindsIn binds sigs) + = pprValBindsForUser binds sigs + + ppr (ValBindsOut sccs sigs) + = getPprStyle $ \ sty -> + if debugStyle sty then -- Print with sccs showing + vcat (map ppr sigs) $$ vcat (map ppr_scc sccs) + else + pprValBindsForUser (unionManyBags (map snd sccs)) sigs + where + ppr_scc (rec_flag, binds) = pp_rec rec_flag <+> pprLHsBinds binds + pp_rec Recursive = ptext SLIT("rec") + pp_rec NonRecursive = ptext SLIT("nonrec") + +-- *not* pprLHsBinds because we don't want braces; 'let' and +-- 'where' include a list of HsBindGroups and we don't want +-- several groups of bindings each with braces around. 
+-- Sort by location before printing +pprValBindsForUser binds sigs + = vcat (map snd (sort_by_loc decls)) + where + + decls :: [(SrcSpan, SDoc)] + decls = [(loc, ppr sig) | L loc sig <- sigs] ++ + [(loc, ppr bind) | L loc bind <- bagToList binds] + + sort_by_loc decls = sortLe (\(l1,_) (l2,_) -> l1 <= l2) decls + +pprLHsBinds :: OutputableBndr id => LHsBinds id -> SDoc +pprLHsBinds binds + | isEmptyLHsBinds binds = empty + | otherwise = lbrace <+> vcat (map ppr (bagToList binds)) <+> rbrace + +------------ +emptyLocalBinds :: HsLocalBinds a +emptyLocalBinds = EmptyLocalBinds + +isEmptyLocalBinds :: HsLocalBinds a -> Bool +isEmptyLocalBinds (HsValBinds ds) = isEmptyValBinds ds +isEmptyLocalBinds (HsIPBinds ds) = isEmptyIPBinds ds +isEmptyLocalBinds EmptyLocalBinds = True + +isEmptyValBinds :: HsValBinds a -> Bool +isEmptyValBinds (ValBindsIn ds sigs) = isEmptyLHsBinds ds && null sigs +isEmptyValBinds (ValBindsOut ds sigs) = null ds && null sigs + +emptyValBindsIn, emptyValBindsOut :: HsValBinds a +emptyValBindsIn = ValBindsIn emptyBag [] +emptyValBindsOut = ValBindsOut [] [] + +emptyLHsBinds :: LHsBinds id +emptyLHsBinds = emptyBag + +isEmptyLHsBinds :: LHsBinds id -> Bool +isEmptyLHsBinds = isEmptyBag + +------------ +plusHsValBinds :: HsValBinds a -> HsValBinds a -> HsValBinds a +plusHsValBinds (ValBindsIn ds1 sigs1) (ValBindsIn ds2 sigs2) + = ValBindsIn (ds1 `unionBags` ds2) (sigs1 ++ sigs2) +plusHsValBinds (ValBindsOut ds1 sigs1) (ValBindsOut ds2 sigs2) + = ValBindsOut (ds1 ++ ds2) (sigs1 ++ sigs2) +\end{code} + +What AbsBinds means +~~~~~~~~~~~~~~~~~~~ + AbsBinds tvs + [d1,d2] + [(tvs1, f1p, f1m), + (tvs2, f2p, f2m)] + BIND +means + + f1p = /\ tvs -> \ [d1,d2] -> letrec DBINDS and BIND + in fm + + gp = ...same again, with gm instead of fm + +This is a pretty bad translation, because it duplicates all the bindings. +So the desugarer tries to do a better job: + + fp = /\ [a,b] -> \ [d1,d2] -> case tp [a,b] [d1,d2] of + (fm,gm) -> fm + ..ditto for gp.. 
+ + tp = /\ [a,b] -> \ [d1,d2] -> letrec DBINDS and BIND + in (fm,gm) + +\begin{code} +instance OutputableBndr id => Outputable (HsBind id) where + ppr mbind = ppr_monobind mbind + +ppr_monobind :: OutputableBndr id => HsBind id -> SDoc + +ppr_monobind (PatBind { pat_lhs = pat, pat_rhs = grhss }) = pprPatBind pat grhss +ppr_monobind (VarBind { var_id = var, var_rhs = rhs }) = ppr var <+> equals <+> pprExpr (unLoc rhs) +ppr_monobind (FunBind { fun_id = fun, fun_matches = matches }) = pprFunBind (unLoc fun) matches + -- ToDo: print infix if appropriate + +ppr_monobind (AbsBinds { abs_tvs = tyvars, abs_dicts = dictvars, + abs_exports = exports, abs_binds = val_binds }) + = sep [ptext SLIT("AbsBinds"), + brackets (interpp'SP tyvars), + brackets (interpp'SP dictvars), + brackets (sep (punctuate comma (map ppr_exp exports)))] + $$ + nest 2 ( vcat [pprBndr LetBind x | (_,x,_,_) <- exports] + -- Print type signatures + $$ pprLHsBinds val_binds ) + where + ppr_exp (tvs, gbl, lcl, prags) + = vcat [ppr gbl <+> ptext SLIT("<=") <+> ppr tvs <+> ppr lcl, + nest 2 (vcat (map (pprPrag gbl) prags))] +\end{code} + +%************************************************************************ +%* * + Implicit parameter bindings +%* * +%************************************************************************ + +\begin{code} +data HsIPBinds id + = IPBinds + [LIPBind id] + (DictBinds id) -- Only in typechecker output; binds + -- uses of the implicit parameters + +isEmptyIPBinds :: HsIPBinds id -> Bool +isEmptyIPBinds (IPBinds is ds) = null is && isEmptyBag ds + +type LIPBind id = Located (IPBind id) + +-- | Implicit parameter bindings. +data IPBind id + = IPBind + (IPName id) + (LHsExpr id) + +instance (OutputableBndr id) => Outputable (HsIPBinds id) where + ppr (IPBinds bs ds) = vcat (map ppr bs) + $$ pprLHsBinds ds + +instance (OutputableBndr id) => Outputable (IPBind id) where + ppr (IPBind id rhs) = pprBndr LetBind id <+> equals <+> pprExpr (unLoc rhs) +\end{code} + + +%************************************************************************ +%* * +\subsection{Coercion functions} +%* * +%************************************************************************ + +\begin{code} +-- A Coercion is an expression with a hole in it +-- We need coercions to have concrete form so that we can zonk them + +data ExprCoFn + = CoHole -- The identity coercion + | CoCompose ExprCoFn ExprCoFn + | CoApps ExprCoFn [Id] -- Non-empty list + | CoTyApps ExprCoFn [Type] -- in all of these + | CoLams [Id] ExprCoFn -- so that the identity coercion + | CoTyLams [TyVar] ExprCoFn -- is just Hole + | CoLet (LHsBinds Id) ExprCoFn -- Would be nicer to be core bindings + +(<.>) :: ExprCoFn -> ExprCoFn -> ExprCoFn +(<.>) = CoCompose + +idCoercion :: ExprCoFn +idCoercion = CoHole + +isIdCoercion :: ExprCoFn -> Bool +isIdCoercion CoHole = True +isIdCoercion other = False +\end{code} + + +%************************************************************************ +%* * +\subsection{@Sig@: type signatures and value-modifying user pragmas} +%* * +%************************************************************************ + +It is convenient to lump ``value-modifying'' user-pragmas (e.g., +``specialise this function to these four types...'') in with type +signatures. Then all the machinery to move them into place, etc., +serves for both. + +\begin{code} +type LSig name = Located (Sig name) + +data Sig name + = TypeSig (Located name) -- A bog-std type signature + (LHsType name) + + | SpecSig (Located name) -- Specialise a function or datatype ... 
+ (LHsType name) -- ... to these types + InlineSpec + + | InlineSig (Located name) -- Function name + InlineSpec + + | SpecInstSig (LHsType name) -- (Class tys); should be a specialisation of the + -- current instance decl + + | FixSig (FixitySig name) -- Fixity declaration + +type LFixitySig name = Located (FixitySig name) +data FixitySig name = FixitySig (Located name) Fixity + +-- A Prag conveys pragmas from the type checker to the desugarer +data Prag + = InlinePrag + InlineSpec + + | SpecPrag + (HsExpr Id) -- An expression, of the given specialised type, which + PostTcType -- specialises the polymorphic function + [Id] -- Dicts mentioned free in the expression + InlineSpec -- Inlining spec for the specialised function + +isInlinePrag (InlinePrag _) = True +isInlinePrag prag = False + +isSpecPrag (SpecPrag _ _ _ _) = True +isSpecPrag prag = False +\end{code} + +\begin{code} +okBindSig :: NameSet -> LSig Name -> Bool +okBindSig ns sig = sigForThisGroup ns sig + +okHsBootSig :: LSig Name -> Bool +okHsBootSig (L _ (TypeSig _ _)) = True +okHsBootSig (L _ (FixSig _)) = True +okHsBootSig sig = False + +okClsDclSig :: LSig Name -> Bool +okClsDclSig (L _ (SpecInstSig _)) = False +okClsDclSig sig = True -- All others OK + +okInstDclSig :: NameSet -> LSig Name -> Bool +okInstDclSig ns lsig@(L _ sig) = ok ns sig + where + ok ns (TypeSig _ _) = False + ok ns (FixSig _) = False + ok ns (SpecInstSig _) = True + ok ns sig = sigForThisGroup ns lsig + +sigForThisGroup :: NameSet -> LSig Name -> Bool +sigForThisGroup ns sig + = case sigName sig of + Nothing -> False + Just n -> n `elemNameSet` ns + +sigName :: LSig name -> Maybe name +sigName (L _ sig) = f sig + where + f (TypeSig n _) = Just (unLoc n) + f (SpecSig n _ _) = Just (unLoc n) + f (InlineSig n _) = Just (unLoc n) + f (FixSig (FixitySig n _)) = Just (unLoc n) + f other = Nothing + +isFixityLSig :: LSig name -> Bool +isFixityLSig (L _ (FixSig {})) = True +isFixityLSig _ = False + +isVanillaLSig :: LSig name -> Bool +isVanillaLSig (L _(TypeSig {})) = True +isVanillaLSig sig = False + +isSpecLSig :: LSig name -> Bool +isSpecLSig (L _(SpecSig {})) = True +isSpecLSig sig = False + +isSpecInstLSig (L _ (SpecInstSig {})) = True +isSpecInstLSig sig = False + +isPragLSig :: LSig name -> Bool + -- Identifies pragmas +isPragLSig (L _ (SpecSig {})) = True +isPragLSig (L _ (InlineSig {})) = True +isPragLSig other = False + +isInlineLSig :: LSig name -> Bool + -- Identifies inline pragmas +isInlineLSig (L _ (InlineSig {})) = True +isInlineLSig other = False + +hsSigDoc (TypeSig {}) = ptext SLIT("type signature") +hsSigDoc (SpecSig {}) = ptext SLIT("SPECIALISE pragma") +hsSigDoc (InlineSig _ spec) = ppr spec <+> ptext SLIT("pragma") +hsSigDoc (SpecInstSig {}) = ptext SLIT("SPECIALISE instance pragma") +hsSigDoc (FixSig {}) = ptext SLIT("fixity declaration") +\end{code} + +Signature equality is used when checking for duplicate signatures + +\begin{code} +eqHsSig :: LSig Name -> LSig Name -> Bool +eqHsSig (L _ (FixSig (FixitySig n1 _))) (L _ (FixSig (FixitySig n2 _))) = unLoc n1 == unLoc n2 +eqHsSig (L _ (TypeSig n1 _)) (L _ (TypeSig n2 _)) = unLoc n1 == unLoc n2 +eqHsSig (L _ (InlineSig n1 s1)) (L _ (InlineSig n2 s2)) = s1 == s2 && unLoc n1 == unLoc n2 + -- For specialisations, we don't have equality over + -- HsType, so it's not convenient to spot duplicate + -- specialisations here. 
Check for this later, when we're in Type land +eqHsSig _other1 _other2 = False +\end{code} + +\begin{code} +instance (OutputableBndr name) => Outputable (Sig name) where + ppr sig = ppr_sig sig + +ppr_sig :: OutputableBndr name => Sig name -> SDoc +ppr_sig (TypeSig var ty) = pprVarSig (unLoc var) ty +ppr_sig (FixSig fix_sig) = ppr fix_sig +ppr_sig (SpecSig var ty inl) = pragBrackets (pprSpec var ty inl) +ppr_sig (InlineSig var inl) = pragBrackets (ppr inl <+> ppr var) +ppr_sig (SpecInstSig ty) = pragBrackets (ptext SLIT("SPECIALIZE instance") <+> ppr ty) + +instance Outputable name => Outputable (FixitySig name) where + ppr (FixitySig name fixity) = sep [ppr fixity, ppr name] + +pragBrackets :: SDoc -> SDoc +pragBrackets doc = ptext SLIT("{-#") <+> doc <+> ptext SLIT("#-}") + +pprVarSig :: (Outputable id, Outputable ty) => id -> ty -> SDoc +pprVarSig var ty = sep [ppr var <+> dcolon, nest 2 (ppr ty)] + +pprSpec :: (Outputable id, Outputable ty) => id -> ty -> InlineSpec -> SDoc +pprSpec var ty inl = sep [ptext SLIT("SPECIALIZE") <+> ppr inl <+> pprVarSig var ty] + +pprPrag :: Outputable id => id -> Prag -> SDoc +pprPrag var (InlinePrag inl) = ppr inl <+> ppr var +pprPrag var (SpecPrag expr ty _ inl) = pprSpec var ty inl +\end{code} diff --git a/compiler/hsSyn/HsDecls.lhs b/compiler/hsSyn/HsDecls.lhs new file mode 100644 index 0000000000..8ff39857c6 --- /dev/null +++ b/compiler/hsSyn/HsDecls.lhs @@ -0,0 +1,796 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[HsDecls]{Abstract syntax: global declarations} + +Definitions for: @TyDecl@ and @oCnDecl@, @ClassDecl@, +@InstDecl@, @DefaultDecl@ and @ForeignDecl@. + +\begin{code} +module HsDecls ( + HsDecl(..), LHsDecl, TyClDecl(..), LTyClDecl, + InstDecl(..), LInstDecl, NewOrData(..), + RuleDecl(..), LRuleDecl, RuleBndr(..), + DefaultDecl(..), LDefaultDecl, SpliceDecl(..), + ForeignDecl(..), LForeignDecl, ForeignImport(..), ForeignExport(..), + CImportSpec(..), FoType(..), + ConDecl(..), ResType(..), LConDecl, + DeprecDecl(..), LDeprecDecl, + HsGroup(..), emptyRdrGroup, emptyRnGroup, appendGroups, + tcdName, tyClDeclNames, tyClDeclTyVars, + isClassDecl, isSynDecl, isDataDecl, + countTyClDecls, + conDetailsTys, + collectRuleBndrSigTys, + ) where + +#include "HsVersions.h" + +-- friends: +import {-# SOURCE #-} HsExpr( HsExpr, pprExpr ) + -- Because Expr imports Decls via HsBracket + +import HsBinds ( HsValBinds(..), HsBind, LHsBinds, plusHsValBinds, + Sig(..), LSig, LFixitySig, pprLHsBinds, + emptyValBindsIn, emptyValBindsOut ) +import HsPat ( HsConDetails(..), hsConArgs ) +import HsImpExp ( pprHsVar ) +import HsTypes +import NameSet ( NameSet ) +import HscTypes ( DeprecTxt ) +import CoreSyn ( RuleName ) +import Kind ( Kind, pprKind ) +import BasicTypes ( Activation(..) 
) +import ForeignCall ( CCallTarget(..), DNCallSpec, CCallConv, Safety, + CExportSpec(..), CLabelString ) + +-- others: +import FunDeps ( pprFundeps ) +import Class ( FunDep ) +import Outputable +import Util ( count ) +import SrcLoc ( Located(..), unLoc, noLoc ) +import FastString +\end{code} + + +%************************************************************************ +%* * +\subsection[HsDecl]{Declarations} +%* * +%************************************************************************ + +\begin{code} +type LHsDecl id = Located (HsDecl id) + +data HsDecl id + = TyClD (TyClDecl id) + | InstD (InstDecl id) + | ValD (HsBind id) + | SigD (Sig id) + | DefD (DefaultDecl id) + | ForD (ForeignDecl id) + | DeprecD (DeprecDecl id) + | RuleD (RuleDecl id) + | SpliceD (SpliceDecl id) + +-- NB: all top-level fixity decls are contained EITHER +-- EITHER SigDs +-- OR in the ClassDecls in TyClDs +-- +-- The former covers +-- a) data constructors +-- b) class methods (but they can be also done in the +-- signatures of class decls) +-- c) imported functions (that have an IfacSig) +-- d) top level decls +-- +-- The latter is for class methods only + +-- A [HsDecl] is categorised into a HsGroup before being +-- fed to the renamer. +data HsGroup id + = HsGroup { + hs_valds :: HsValBinds id, + hs_tyclds :: [LTyClDecl id], + hs_instds :: [LInstDecl id], + + hs_fixds :: [LFixitySig id], + -- Snaffled out of both top-level fixity signatures, + -- and those in class declarations + + hs_defds :: [LDefaultDecl id], + hs_fords :: [LForeignDecl id], + hs_depds :: [LDeprecDecl id], + hs_ruleds :: [LRuleDecl id] + } + +emptyGroup, emptyRdrGroup, emptyRnGroup :: HsGroup a +emptyRdrGroup = emptyGroup { hs_valds = emptyValBindsIn } +emptyRnGroup = emptyGroup { hs_valds = emptyValBindsOut } + +emptyGroup = HsGroup { hs_tyclds = [], hs_instds = [], + hs_fixds = [], hs_defds = [], hs_fords = [], + hs_depds = [], hs_ruleds = [], + hs_valds = error "emptyGroup hs_valds: Can't happen" } + +appendGroups :: HsGroup a -> HsGroup a -> HsGroup a +appendGroups + HsGroup { + hs_valds = val_groups1, + hs_tyclds = tyclds1, + hs_instds = instds1, + hs_fixds = fixds1, + hs_defds = defds1, + hs_fords = fords1, + hs_depds = depds1, + hs_ruleds = rulds1 } + HsGroup { + hs_valds = val_groups2, + hs_tyclds = tyclds2, + hs_instds = instds2, + hs_fixds = fixds2, + hs_defds = defds2, + hs_fords = fords2, + hs_depds = depds2, + hs_ruleds = rulds2 } + = + HsGroup { + hs_valds = val_groups1 `plusHsValBinds` val_groups2, + hs_tyclds = tyclds1 ++ tyclds2, + hs_instds = instds1 ++ instds2, + hs_fixds = fixds1 ++ fixds2, + hs_defds = defds1 ++ defds2, + hs_fords = fords1 ++ fords2, + hs_depds = depds1 ++ depds2, + hs_ruleds = rulds1 ++ rulds2 } +\end{code} + +\begin{code} +instance OutputableBndr name => Outputable (HsDecl name) where + ppr (TyClD dcl) = ppr dcl + ppr (ValD binds) = ppr binds + ppr (DefD def) = ppr def + ppr (InstD inst) = ppr inst + ppr (ForD fd) = ppr fd + ppr (SigD sd) = ppr sd + ppr (RuleD rd) = ppr rd + ppr (DeprecD dd) = ppr dd + ppr (SpliceD dd) = ppr dd + +instance OutputableBndr name => Outputable (HsGroup name) where + ppr (HsGroup { hs_valds = val_decls, + hs_tyclds = tycl_decls, + hs_instds = inst_decls, + hs_fixds = fix_decls, + hs_depds = deprec_decls, + hs_fords = foreign_decls, + hs_defds = default_decls, + hs_ruleds = rule_decls }) + = vcat [ppr_ds fix_decls, ppr_ds default_decls, + ppr_ds deprec_decls, ppr_ds rule_decls, + ppr val_decls, + ppr_ds tycl_decls, ppr_ds inst_decls, + ppr_ds foreign_decls] + where + ppr_ds 
[] = empty + ppr_ds ds = text "" $$ vcat (map ppr ds) + +data SpliceDecl id = SpliceDecl (Located (HsExpr id)) -- Top level splice + +instance OutputableBndr name => Outputable (SpliceDecl name) where + ppr (SpliceDecl e) = ptext SLIT("$") <> parens (pprExpr (unLoc e)) +\end{code} + + +%************************************************************************ +%* * +\subsection[TyDecl]{@data@, @newtype@ or @type@ (synonym) type declaration} +%* * +%************************************************************************ + + -------------------------------- + THE NAMING STORY + -------------------------------- + +Here is the story about the implicit names that go with type, class, +and instance decls. It's a bit tricky, so pay attention! + +"Implicit" (or "system") binders +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Each data type decl defines + a worker name for each constructor + to-T and from-T convertors + Each class decl defines + a tycon for the class + a data constructor for that tycon + the worker for that constructor + a selector for each superclass + +All have occurrence names that are derived uniquely from their parent +declaration. + +None of these get separate definitions in an interface file; they are +fully defined by the data or class decl. But they may *occur* in +interface files, of course. Any such occurrence must haul in the +relevant type or class decl. + +Plan of attack: + - Ensure they "point to" the parent data/class decl + when loading that decl from an interface file + (See RnHiFiles.getSysBinders) + + - When typechecking the decl, we build the implicit TyCons and Ids. + When doing so we look them up in the name cache (RnEnv.lookupSysName), + to ensure correct module and provenance is set + +These are the two places that we have to conjure up the magic derived +names. (The actual magic is in OccName.mkWorkerOcc, etc.) + +Default methods +~~~~~~~~~~~~~~~ + - Occurrence name is derived uniquely from the method name + E.g. $dmmax + + - If there is a default method name at all, it's recorded in + the ClassOpSig (in HsBinds), in the DefMeth field. + (DefMeth is defined in Class.lhs) + +Source-code class decls and interface-code class decls are treated subtly +differently, which has given me a great deal of confusion over the years. +Here's the deal. (We distinguish the two cases because source-code decls +have (Just binds) in the tcdMeths field, whereas interface decls have Nothing. + +In *source-code* class declarations: + + - When parsing, every ClassOpSig gets a DefMeth with a suitable RdrName + This is done by RdrHsSyn.mkClassOpSigDM + + - The renamer renames it to a Name + + - During typechecking, we generate a binding for each $dm for + which there's a programmer-supplied default method: + class Foo a where + op1 :: <type> + op2 :: <type> + op1 = ... + We generate a binding for $dmop1 but not for $dmop2. + The Class for Foo has a NoDefMeth for op2 and a DefMeth for op1. + The Name for $dmop2 is simply discarded. + +In *interface-file* class declarations: + - When parsing, we see if there's an explicit programmer-supplied default method + because there's an '=' sign to indicate it: + class Foo a where + op1 = :: <type> -- NB the '=' + op2 :: <type> + We use this info to generate a DefMeth with a suitable RdrName for op1, + and a NoDefMeth for op2 + - The interface file has a separate definition for $dmop1, with unfolding etc. + - The renamer renames it to a Name. 
+ - The renamer treats $dmop1 as a free variable of the declaration, so that + the binding for $dmop1 will be sucked in. (See RnHsSyn.tyClDeclFVs) + This doesn't happen for source code class decls, because they *bind* the default method. + +Dictionary functions +~~~~~~~~~~~~~~~~~~~~ +Each instance declaration gives rise to one dictionary function binding. + +The type checker makes up new source-code instance declarations +(e.g. from 'deriving' or generic default methods --- see +TcInstDcls.tcInstDecls1). So we can't generate the names for +dictionary functions in advance (we don't know how many we need). + +On the other hand for interface-file instance declarations, the decl +specifies the name of the dictionary function, and it has a binding elsewhere +in the interface file: + instance {Eq Int} = dEqInt + dEqInt :: {Eq Int} <pragma info> + +So again we treat source code and interface file code slightly differently. + +Source code: + - Source code instance decls have a Nothing in the (Maybe name) field + (see data InstDecl below) + + - The typechecker makes up a Local name for the dict fun for any source-code + instance decl, whether it comes from a source-code instance decl, or whether + the instance decl is derived from some other construct (e.g. 'deriving'). + + - The occurrence name it chooses is derived from the instance decl (just for + documentation really) --- e.g. dNumInt. Two dict funs may share a common + occurrence name, but will have different uniques. E.g. + instance Foo [Int] where ... + instance Foo [Bool] where ... + These might both be dFooList + + - The CoreTidy phase externalises the name, and ensures the occurrence name is + unique (this isn't special to dict funs). So we'd get dFooList and dFooList1. + + - We can take this relaxed approach (changing the occurrence name later) + because dict fun Ids are not captured in a TyCon or Class (unlike default + methods, say). Instead, they are kept separately in the InstEnv. This + makes it easy to adjust them after compiling a module. (Once we've finished + compiling that module, they don't change any more.) + + +Interface file code: + - The instance decl gives the dict fun name, so the InstDecl has a (Just name) + in the (Maybe name) field. + + - RnHsSyn.instDeclFVs treats the dict fun name as free in the decl, so that we + suck in the dfun binding + + +\begin{code} +-- TyClDecls are precisely the kind of declarations that can +-- appear in interface files; or (internally) in GHC's interface +-- for a module. That's why (despite the misnomer) IfaceSig and ForeignType +-- are both in TyClDecl + +type LTyClDecl name = Located (TyClDecl name) + +data TyClDecl name + = ForeignType { + tcdLName :: Located name, + tcdExtName :: Maybe FastString, + tcdFoType :: FoType + } + + | TyData { tcdND :: NewOrData, + tcdCtxt :: LHsContext name, -- Context + tcdLName :: Located name, -- Type constructor + tcdTyVars :: [LHsTyVarBndr name], -- Type variables + tcdKindSig :: Maybe Kind, -- Optional kind sig; + -- (only for the 'where' form) + + tcdCons :: [LConDecl name], -- Data constructors + -- For data T a = T1 | T2 a the LConDecls all have ResTyH98 + -- For data T a where { T1 :: T a } the LConDecls all have ResTyGADT + + tcdDerivs :: Maybe [LHsType name] + -- Derivings; Nothing => not specified + -- Just [] => derive exactly what is asked + -- These "types" must be of form + -- forall ab. 
C ty1 ty2 + -- Typically the foralls and ty args are empty, but they + -- are non-empty for the newtype-deriving case + } + + | TySynonym { tcdLName :: Located name, -- type constructor + tcdTyVars :: [LHsTyVarBndr name], -- type variables + tcdSynRhs :: LHsType name -- synonym expansion + } + + | ClassDecl { tcdCtxt :: LHsContext name, -- Context... + tcdLName :: Located name, -- Name of the class + tcdTyVars :: [LHsTyVarBndr name], -- Class type variables + tcdFDs :: [Located (FunDep name)], -- Functional deps + tcdSigs :: [LSig name], -- Methods' signatures + tcdMeths :: LHsBinds name -- Default methods + } + +data NewOrData + = NewType -- "newtype Blah ..." + | DataType -- "data Blah ..." + deriving( Eq ) -- Needed because Demand derives Eq +\end{code} + +Simple classifiers + +\begin{code} +isDataDecl, isSynDecl, isClassDecl :: TyClDecl name -> Bool + +isSynDecl (TySynonym {}) = True +isSynDecl other = False + +isDataDecl (TyData {}) = True +isDataDecl other = False + +isClassDecl (ClassDecl {}) = True +isClassDecl other = False +\end{code} + +Dealing with names + +\begin{code} +tcdName :: TyClDecl name -> name +tcdName decl = unLoc (tcdLName decl) + +tyClDeclNames :: Eq name => TyClDecl name -> [Located name] +-- Returns all the *binding* names of the decl, along with their SrcLocs +-- The first one is guaranteed to be the name of the decl +-- For record fields, the first one counts as the SrcLoc +-- We use the equality to filter out duplicate field names + +tyClDeclNames (TySynonym {tcdLName = name}) = [name] +tyClDeclNames (ForeignType {tcdLName = name}) = [name] + +tyClDeclNames (ClassDecl {tcdLName = cls_name, tcdSigs = sigs}) + = cls_name : [n | L _ (TypeSig n _) <- sigs] + +tyClDeclNames (TyData {tcdLName = tc_name, tcdCons = cons}) + = tc_name : conDeclsNames (map unLoc cons) + +tyClDeclTyVars (TySynonym {tcdTyVars = tvs}) = tvs +tyClDeclTyVars (TyData {tcdTyVars = tvs}) = tvs +tyClDeclTyVars (ClassDecl {tcdTyVars = tvs}) = tvs +tyClDeclTyVars (ForeignType {}) = [] +\end{code} + +\begin{code} +countTyClDecls :: [TyClDecl name] -> (Int, Int, Int, Int) + -- class, data, newtype, synonym decls +countTyClDecls decls + = (count isClassDecl decls, + count isSynDecl decls, + count isDataTy decls, + count isNewTy decls) + where + isDataTy TyData{tcdND=DataType} = True + isDataTy _ = False + + isNewTy TyData{tcdND=NewType} = True + isNewTy _ = False +\end{code} + +\begin{code} +instance OutputableBndr name + => Outputable (TyClDecl name) where + + ppr (ForeignType {tcdLName = ltycon}) + = hsep [ptext SLIT("foreign import type dotnet"), ppr ltycon] + + ppr (TySynonym {tcdLName = ltycon, tcdTyVars = tyvars, tcdSynRhs = mono_ty}) + = hang (ptext SLIT("type") <+> pp_decl_head [] ltycon tyvars <+> equals) + 4 (ppr mono_ty) + + ppr (TyData {tcdND = new_or_data, tcdCtxt = context, tcdLName = ltycon, + tcdTyVars = tyvars, tcdKindSig = mb_sig, tcdCons = condecls, + tcdDerivs = derivings}) + = pp_tydecl (ppr new_or_data <+> pp_decl_head (unLoc context) ltycon tyvars <+> ppr_sig mb_sig) + (pp_condecls condecls) + derivings + where + ppr_sig Nothing = empty + ppr_sig (Just kind) = dcolon <+> pprKind kind + + ppr (ClassDecl {tcdCtxt = context, tcdLName = lclas, tcdTyVars = tyvars, tcdFDs = fds, + tcdSigs = sigs, tcdMeths = methods}) + | null sigs -- No "where" part + = top_matter + + | otherwise -- Laid out + = sep [hsep [top_matter, ptext SLIT("where {")], + nest 4 (sep [sep (map ppr_sig sigs), ppr methods, char '}'])] + where + top_matter = ptext SLIT("class") <+> pp_decl_head (unLoc context) 
lclas tyvars <+> pprFundeps (map unLoc fds) + ppr_sig sig = ppr sig <> semi + +pp_decl_head :: OutputableBndr name + => HsContext name + -> Located name + -> [LHsTyVarBndr name] + -> SDoc +pp_decl_head context thing tyvars + = hsep [pprHsContext context, ppr thing, interppSP tyvars] +pp_condecls cs@(L _ ConDecl{ con_res = ResTyGADT _ } : _) -- In GADT syntax + = hang (ptext SLIT("where")) 2 (vcat (map ppr cs)) +pp_condecls cs -- In H98 syntax + = equals <+> sep (punctuate (ptext SLIT(" |")) (map ppr cs)) + +pp_tydecl pp_head pp_decl_rhs derivings + = hang pp_head 4 (sep [ + pp_decl_rhs, + case derivings of + Nothing -> empty + Just ds -> hsep [ptext SLIT("deriving"), parens (interpp'SP ds)] + ]) + +instance Outputable NewOrData where + ppr NewType = ptext SLIT("newtype") + ppr DataType = ptext SLIT("data") +\end{code} + + +%************************************************************************ +%* * +\subsection[ConDecl]{A data-constructor declaration} +%* * +%************************************************************************ + +\begin{code} +type LConDecl name = Located (ConDecl name) + +-- data T b = forall a. Eq a => MkT a b +-- MkT :: forall b a. Eq a => MkT a b + +-- data T b where +-- MkT1 :: Int -> T Int + +-- data T = Int `MkT` Int +-- | MkT2 + +-- data T a where +-- Int `MkT` Int :: T Int + +data ConDecl name + = ConDecl + { con_name :: Located name -- Constructor name; this is used for the + -- DataCon itself, and for the user-callable wrapper Id + + , con_explicit :: HsExplicitForAll -- Is there an user-written forall? (cf. HStypes.HsForAllTy) + + , con_qvars :: [LHsTyVarBndr name] -- ResTyH98: the constructor's existential type variables + -- ResTyGADT: all the constructor's quantified type variables + + , con_cxt :: LHsContext name -- The context. 
This *does not* include the + -- "stupid theta" which lives only in the TyData decl + + , con_details :: HsConDetails name (LBangType name) -- The main payload + + , con_res :: ResType name -- Result type of the constructor + } + +data ResType name + = ResTyH98 -- Constructor was declared using Haskell 98 syntax + | ResTyGADT (LHsType name) -- Constructor was declared using GADT-style syntax, + -- and here is its result type +\end{code} + +\begin{code} +conDeclsNames :: Eq name => [ConDecl name] -> [Located name] + -- See tyClDeclNames for what this does + -- The function is boringly complicated because of the records + -- And since we only have equality, we have to be a little careful +conDeclsNames cons + = snd (foldl do_one ([], []) cons) + where + do_one (flds_seen, acc) (ConDecl { con_name = lname, con_details = RecCon flds }) + = (map unLoc new_flds ++ flds_seen, lname : [f | f <- new_flds] ++ acc) + where + new_flds = [ f | (f,_) <- flds, not (unLoc f `elem` flds_seen) ] + + do_one (flds_seen, acc) c + = (flds_seen, (con_name c):acc) + +conDetailsTys details = map getBangType (hsConArgs details) +\end{code} + + +\begin{code} +instance (OutputableBndr name) => Outputable (ConDecl name) where + ppr = pprConDecl + +pprConDecl (ConDecl con expl tvs cxt details ResTyH98) + = sep [pprHsForAll expl tvs cxt, ppr_details con details] + where + ppr_details con (InfixCon t1 t2) = hsep [ppr t1, pprHsVar con, ppr t2] + ppr_details con (PrefixCon tys) = hsep (pprHsVar con : map ppr tys) + ppr_details con (RecCon fields) = ppr con <+> ppr_fields fields + +pprConDecl (ConDecl con expl tvs cxt details (ResTyGADT res_ty)) + = sep [pprHsForAll expl tvs cxt, ppr con <+> ppr_details details] + where + ppr_details (PrefixCon arg_tys) = dcolon <+> ppr (foldr mk_fun_ty res_ty arg_tys) + ppr_details (RecCon fields) = ppr fields <+> dcolon <+> ppr res_ty + + mk_fun_ty a b = noLoc (HsFunTy a b) + +ppr_fields fields = braces (sep (punctuate comma (map ppr_field fields))) +ppr_field (n, ty) = ppr n <+> dcolon <+> ppr ty +\end{code} + +%************************************************************************ +%* * +\subsection[InstDecl]{An instance declaration +%* * +%************************************************************************ + +\begin{code} +type LInstDecl name = Located (InstDecl name) + +data InstDecl name + = InstDecl (LHsType name) -- Context => Class Instance-type + -- Using a polytype means that the renamer conveniently + -- figures out the quantified type variables for us. + (LHsBinds name) + [LSig name] -- User-supplied pragmatic info + +instance (OutputableBndr name) => Outputable (InstDecl name) where + + ppr (InstDecl inst_ty binds uprags) + = vcat [hsep [ptext SLIT("instance"), ppr inst_ty, ptext SLIT("where")], + nest 4 (ppr uprags), + nest 4 (pprLHsBinds binds) ] +\end{code} + +%************************************************************************ +%* * +\subsection[DefaultDecl]{A @default@ declaration} +%* * +%************************************************************************ + +There can only be one default declaration per module, but it is hard +for the parser to check that; we pass them all through in the abstract +syntax, and that restriction must be checked in the front end. 
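As a small illustration (a sketch, not part of the original text): a source declaration such as

    default (Integer, Double)

arrives here as a DefaultDecl carrying the two located types, so checking that a module has at most one default declaration is just a pass over the collected [LDefaultDecl name] list in the front end.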
+ +\begin{code} +type LDefaultDecl name = Located (DefaultDecl name) + +data DefaultDecl name + = DefaultDecl [LHsType name] + +instance (OutputableBndr name) + => Outputable (DefaultDecl name) where + + ppr (DefaultDecl tys) + = ptext SLIT("default") <+> parens (interpp'SP tys) +\end{code} + +%************************************************************************ +%* * +\subsection{Foreign function interface declaration} +%* * +%************************************************************************ + +\begin{code} + +-- foreign declarations are distinguished as to whether they define or use a +-- Haskell name +-- +-- * the Boolean value indicates whether the pre-standard deprecated syntax +-- has been used +-- +type LForeignDecl name = Located (ForeignDecl name) + +data ForeignDecl name + = ForeignImport (Located name) (LHsType name) ForeignImport Bool -- defines name + | ForeignExport (Located name) (LHsType name) ForeignExport Bool -- uses name + +-- specification of an imported external entity in dependence on the calling +-- convention +-- +data ForeignImport = -- import of a C entity + -- + -- * the two strings specifying a header file or library + -- may be empty, which indicates the absence of a + -- header or object specification (both are not used + -- in the case of `CWrapper' and when `CFunction' + -- has a dynamic target) + -- + -- * the calling convention is irrelevant for code + -- generation in the case of `CLabel', but is needed + -- for pretty printing + -- + -- * `Safety' is irrelevant for `CLabel' and `CWrapper' + -- + CImport CCallConv -- ccall or stdcall + Safety -- safe or unsafe + FastString -- name of C header + FastString -- name of library object + CImportSpec -- details of the C entity + + -- import of a .NET function + -- + | DNImport DNCallSpec + +-- details of an external C entity +-- +data CImportSpec = CLabel CLabelString -- import address of a C label + | CFunction CCallTarget -- static or dynamic function + | CWrapper -- wrapper to expose closures + -- (former f.e.d.) 
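-- As an illustrative sketch (not part of the original source): a declaration
-- such as
--
--     foreign import ccall unsafe "math.h sin" c_sin :: Double -> Double
--
-- would be represented, roughly, as
--
--     ForeignImport c_sin' ty'
--       (CImport CCallConv PlayRisky (mkFastString "math.h") nilFS
--                (CFunction (StaticTarget (mkFastString "sin"))))
--       False
--
-- where c_sin' and ty' stand for the located name and type; compare cvtForD
-- and parse_ccall_impent in the Convert code above, which build this shape
-- from Template Haskell ForeignD declarations.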
+ +-- specification of an externally exported entity in dependence on the calling +-- convention +-- +data ForeignExport = CExport CExportSpec -- contains the calling convention + | DNExport -- presently unused + +-- abstract type imported from .NET +-- +data FoType = DNType -- In due course we'll add subtype stuff + deriving (Eq) -- Used for equality instance for TyClDecl + + +-- pretty printing of foreign declarations +-- + +instance OutputableBndr name => Outputable (ForeignDecl name) where + ppr (ForeignImport n ty fimport _) = + ptext SLIT("foreign import") <+> ppr fimport <+> + ppr n <+> dcolon <+> ppr ty + ppr (ForeignExport n ty fexport _) = + ptext SLIT("foreign export") <+> ppr fexport <+> + ppr n <+> dcolon <+> ppr ty + +instance Outputable ForeignImport where + ppr (DNImport spec) = + ptext SLIT("dotnet") <+> ppr spec + ppr (CImport cconv safety header lib spec) = + ppr cconv <+> ppr safety <+> + char '"' <> pprCEntity header lib spec <> char '"' + where + pprCEntity header lib (CLabel lbl) = + ptext SLIT("static") <+> ftext header <+> char '&' <> + pprLib lib <> ppr lbl + pprCEntity header lib (CFunction (StaticTarget lbl)) = + ptext SLIT("static") <+> ftext header <+> char '&' <> + pprLib lib <> ppr lbl + pprCEntity header lib (CFunction (DynamicTarget)) = + ptext SLIT("dynamic") + pprCEntity _ _ (CWrapper) = ptext SLIT("wrapper") + -- + pprLib lib | nullFS lib = empty + | otherwise = char '[' <> ppr lib <> char ']' + +instance Outputable ForeignExport where + ppr (CExport (CExportStatic lbl cconv)) = + ppr cconv <+> char '"' <> ppr lbl <> char '"' + ppr (DNExport ) = + ptext SLIT("dotnet") <+> ptext SLIT("\"<unused>\"") + +instance Outputable FoType where + ppr DNType = ptext SLIT("type dotnet") +\end{code} + + +%************************************************************************ +%* * +\subsection{Transformation rules} +%* * +%************************************************************************ + +\begin{code} +type LRuleDecl name = Located (RuleDecl name) + +data RuleDecl name + = HsRule -- Source rule + RuleName -- Rule name + Activation + [RuleBndr name] -- Forall'd vars; after typechecking this includes tyvars + (Located (HsExpr name)) -- LHS + NameSet -- Free-vars from the LHS + (Located (HsExpr name)) -- RHS + NameSet -- Free-vars from the RHS + +data RuleBndr name + = RuleBndr (Located name) + | RuleBndrSig (Located name) (LHsType name) + +collectRuleBndrSigTys :: [RuleBndr name] -> [LHsType name] +collectRuleBndrSigTys bndrs = [ty | RuleBndrSig _ ty <- bndrs] + +instance OutputableBndr name => Outputable (RuleDecl name) where + ppr (HsRule name act ns lhs fv_lhs rhs fv_rhs) + = sep [text "{-# RULES" <+> doubleQuotes (ftext name) <+> ppr act, + nest 4 (pp_forall <+> pprExpr (unLoc lhs)), + nest 4 (equals <+> pprExpr (unLoc rhs) <+> text "#-}") ] + where + pp_forall | null ns = empty + | otherwise = text "forall" <+> fsep (map ppr ns) <> dot + +instance OutputableBndr name => Outputable (RuleBndr name) where + ppr (RuleBndr name) = ppr name + ppr (RuleBndrSig name ty) = ppr name <> dcolon <> ppr ty +\end{code} + + +%************************************************************************ +%* * +\subsection[DeprecDecl]{Deprecations} +%* * +%************************************************************************ + +We use exported entities for things to deprecate. 
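For example (an illustrative sketch, not taken from the original text): a pragma

    {-# DEPRECATED f "Use g instead" #-}

becomes a Deprecation pairing the exported name f with its DeprecTxt message, and the Outputable instance below prints it back in exactly that pragma form.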
+ +\begin{code} +type LDeprecDecl name = Located (DeprecDecl name) + +data DeprecDecl name = Deprecation name DeprecTxt + +instance OutputableBndr name => Outputable (DeprecDecl name) where + ppr (Deprecation thing txt) + = hsep [text "{-# DEPRECATED", ppr thing, doubleQuotes (ppr txt), text "#-}"] +\end{code} diff --git a/compiler/hsSyn/HsExpr.hi-boot-5 b/compiler/hsSyn/HsExpr.hi-boot-5 new file mode 100644 index 0000000000..05e2eb5394 --- /dev/null +++ b/compiler/hsSyn/HsExpr.hi-boot-5 @@ -0,0 +1,14 @@ +__interface HsExpr 1 0 where +__export HsExpr HsExpr pprExpr Match GRHSs LHsExpr LMatch pprPatBind pprFunBind ; + +1 data HsExpr i ; +1 data Match a ; +1 data GRHSs a ; + +1 type LHsExpr a = SrcLoc.Located (HsExpr a) ; +1 type LMatch a = SrcLoc.Located (Match a) ; + +1 pprExpr :: __forall [i] {Outputable.OutputableBndr i} => HsExpr.HsExpr i -> Outputable.SDoc ; +1 pprPatBind :: __forall [i] {Outputable.OutputableBndr i} => HsPat.LPat i -> HsExpr.GRHSs i -> Outputable.SDoc ; +1 pprFunBind :: __forall [i] {Outputable.OutputableBndr i} => i -> [HsExpr.LMatch i] -> Outputable.SDoc ; + diff --git a/compiler/hsSyn/HsExpr.hi-boot-6 b/compiler/hsSyn/HsExpr.hi-boot-6 new file mode 100644 index 0000000000..40e18ef971 --- /dev/null +++ b/compiler/hsSyn/HsExpr.hi-boot-6 @@ -0,0 +1,22 @@ +module HsExpr where + +data HsExpr i +data HsSplice i +data MatchGroup a +data GRHSs a + +type LHsExpr a = SrcLoc.Located (HsExpr a) +type SyntaxExpr a = HsExpr a +type PostTcExpr = HsExpr Var.Id + +pprExpr :: (Outputable.OutputableBndr i) => + HsExpr.HsExpr i -> Outputable.SDoc + +pprSplice :: (Outputable.OutputableBndr i) => + HsExpr.HsSplice i -> Outputable.SDoc + +pprPatBind :: (Outputable.OutputableBndr b, Outputable.OutputableBndr i) => + HsPat.LPat b -> HsExpr.GRHSs i -> Outputable.SDoc + +pprFunBind :: (Outputable.OutputableBndr i) => + i -> HsExpr.MatchGroup i -> Outputable.SDoc diff --git a/compiler/hsSyn/HsExpr.lhs b/compiler/hsSyn/HsExpr.lhs new file mode 100644 index 0000000000..dbdd24c3c5 --- /dev/null +++ b/compiler/hsSyn/HsExpr.lhs @@ -0,0 +1,975 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[HsExpr]{Abstract Haskell syntax: expressions} + +\begin{code} +module HsExpr where + +#include "HsVersions.h" + +-- friends: +import HsDecls ( HsGroup ) +import HsPat ( LPat ) +import HsLit ( HsLit(..), HsOverLit ) +import HsTypes ( LHsType, PostTcType ) +import HsImpExp ( isOperator, pprHsVar ) +import HsBinds ( HsLocalBinds, DictBinds, ExprCoFn, isEmptyLocalBinds ) + +-- others: +import Type ( Type, pprParendType ) +import Var ( TyVar, Id ) +import Name ( Name ) +import BasicTypes ( IPName, Boxity, tupleParens, Arity, Fixity(..) ) +import SrcLoc ( Located(..), unLoc ) +import Outputable +import FastString +\end{code} + + +%************************************************************************ +%* * +\subsection{Expressions proper} +%* * +%************************************************************************ + +\begin{code} +type LHsExpr id = Located (HsExpr id) + +------------------------- +-- PostTcExpr is an evidence expression attached to the +-- syntax tree by the type checker (c.f. 
postTcType) +-- We use a PostTcTable where there are a bunch of pieces of +-- evidence, more than is convenient to keep individually +type PostTcExpr = HsExpr Id +type PostTcTable = [(Name, Id)] + +noPostTcExpr :: PostTcExpr +noPostTcExpr = HsLit (HsString FSLIT("noPostTcExpr")) + +noPostTcTable :: PostTcTable +noPostTcTable = [] + +------------------------- +-- SyntaxExpr is like PostTcExpr, but it's filled in a little earlier, +-- by the renamer. It's used for rebindable syntax. +-- E.g. (>>=) is filled in before the renamer by the appropriate Name +-- for (>>=), and then instantiated by the type checker with its +-- type args tec + +type SyntaxExpr id = HsExpr id + +noSyntaxExpr :: SyntaxExpr id -- Before renaming, and sometimes after, + -- (if the syntax slot makes no sense) +noSyntaxExpr = HsLit (HsString FSLIT("noSyntaxExpr")) + + +type SyntaxTable id = [(Name, SyntaxExpr id)] +-- *** Currently used only for CmdTop (sigh) *** +-- * Before the renamer, this list is noSyntaxTable +-- +-- * After the renamer, it takes the form [(std_name, HsVar actual_name)] +-- For example, for the 'return' op of a monad +-- normal case: (GHC.Base.return, HsVar GHC.Base.return) +-- with rebindable syntax: (GHC.Base.return, return_22) +-- where return_22 is whatever "return" is in scope +-- +-- * After the type checker, it takes the form [(std_name, <expression>)] +-- where <expression> is the evidence for the method + +noSyntaxTable :: SyntaxTable id +noSyntaxTable = [] + + +------------------------- +data HsExpr id + = HsVar id -- variable + | HsIPVar (IPName id) -- implicit parameter + | HsOverLit (HsOverLit id) -- Overloaded literals + | HsLit HsLit -- Simple (non-overloaded) literals + + | HsLam (MatchGroup id) -- Currently always a single match + + | HsApp (LHsExpr id) -- Application + (LHsExpr id) + + -- Operator applications: + -- NB Bracketed ops such as (+) come out as Vars. + + -- NB We need an expr for the operator in an OpApp/Section since + -- the typechecker may need to apply the operator to a few types. + + | OpApp (LHsExpr id) -- left operand + (LHsExpr id) -- operator + Fixity -- Renamer adds fixity; bottom until then + (LHsExpr id) -- right operand + + | NegApp (LHsExpr id) -- negated expr + (SyntaxExpr id) -- Name of 'negate' + + | HsPar (LHsExpr id) -- parenthesised expr + + | SectionL (LHsExpr id) -- operand + (LHsExpr id) -- operator + | SectionR (LHsExpr id) -- operator + (LHsExpr id) -- operand + + | HsCase (LHsExpr id) + (MatchGroup id) + + | HsIf (LHsExpr id) -- predicate + (LHsExpr id) -- then part + (LHsExpr id) -- else part + + | HsLet (HsLocalBinds id) -- let(rec) + (LHsExpr id) + + | HsDo (HsStmtContext Name) -- The parameterisation is unimportant + -- because in this context we never use + -- the PatGuard or ParStmt variant + [LStmt id] -- "do":one or more stmts + (LHsExpr id) -- The body; the last expression in the 'do' + -- of [ body | ... ] in a list comp + PostTcType -- Type of the whole expression + + | ExplicitList -- syntactic list + PostTcType -- Gives type of components of list + [LHsExpr id] + + | ExplicitPArr -- syntactic parallel array: [:e1, ..., en:] + PostTcType -- type of elements of the parallel array + [LHsExpr id] + + | ExplicitTuple -- tuple + [LHsExpr id] + -- NB: Unit is ExplicitTuple [] + -- for tuples, we can get the types + -- direct from the components + Boxity + + + -- Record construction + | RecordCon (Located id) -- The constructor. 
After type checking + -- it's the dataConWrapId of the constructor + PostTcExpr -- Data con Id applied to type args + (HsRecordBinds id) + + -- Record update + | RecordUpd (LHsExpr id) + (HsRecordBinds id) + PostTcType -- Type of *input* record + PostTcType -- Type of *result* record (may differ from + -- type of input record) + + | ExprWithTySig -- e :: type + (LHsExpr id) + (LHsType id) + + | ExprWithTySigOut -- TRANSLATION + (LHsExpr id) + (LHsType Name) -- Retain the signature for round-tripping purposes + + | ArithSeq -- arithmetic sequence + PostTcExpr + (ArithSeqInfo id) + + | PArrSeq -- arith. sequence for parallel array + PostTcExpr -- [:e1..e2:] or [:e1, e2..e3:] + (ArithSeqInfo id) + + | HsSCC FastString -- "set cost centre" (_scc_) annotation + (LHsExpr id) -- expr whose cost is to be measured + + | HsCoreAnn FastString -- hdaume: core annotation + (LHsExpr id) + + ----------------------------------------------------------- + -- MetaHaskell Extensions + | HsBracket (HsBracket id) + + | HsBracketOut (HsBracket Name) -- Output of the type checker is the *original* + [PendingSplice] -- renamed expression, plus *typechecked* splices + -- to be pasted back in by the desugarer + + | HsSpliceE (HsSplice id) + + ----------------------------------------------------------- + -- Arrow notation extension + + | HsProc (LPat id) -- arrow abstraction, proc + (LHsCmdTop id) -- body of the abstraction + -- always has an empty stack + + --------------------------------------- + -- The following are commands, not expressions proper + + | HsArrApp -- Arrow tail, or arrow application (f -< arg) + (LHsExpr id) -- arrow expression, f + (LHsExpr id) -- input expression, arg + PostTcType -- type of the arrow expressions f, + -- of the form a t t', where arg :: t + HsArrAppType -- higher-order (-<<) or first-order (-<) + Bool -- True => right-to-left (f -< arg) + -- False => left-to-right (arg >- f) + + | HsArrForm -- Command formation, (| e cmd1 .. cmdn |) + (LHsExpr id) -- the operator + -- after type-checking, a type abstraction to be + -- applied to the type of the local environment tuple + (Maybe Fixity) -- fixity (filled in by the renamer), for forms that + -- were converted from OpApp's by the renamer + [LHsCmdTop id] -- argument commands +\end{code} + + +These constructors only appear temporarily in the parser. +The renamer translates them into the Right Thing. + +\begin{code} + | EWildPat -- wildcard + + | EAsPat (Located id) -- as pattern + (LHsExpr id) + + | ELazyPat (LHsExpr id) -- ~ pattern + + | HsType (LHsType id) -- Explicit type argument; e.g f {| Int |} x y +\end{code} + +Everything from here on appears only in typechecker output. + +\begin{code} + | TyLam -- TRANSLATION + [TyVar] + (LHsExpr id) + | TyApp -- TRANSLATION + (LHsExpr id) -- generated by Spec + [Type] + + -- DictLam and DictApp are "inverses" + | DictLam + [id] + (LHsExpr id) + | DictApp + (LHsExpr id) + [id] + + | HsCoerce ExprCoFn -- TRANSLATION + (HsExpr id) + +type PendingSplice = (Name, LHsExpr Id) -- Typechecked splices, waiting to be + -- pasted back in by the desugarer +\end{code} + +A @Dictionary@, unless of length 0 or 1, becomes a tuple. 
A +@ClassDictLam dictvars methods expr@ is, therefore: +\begin{verbatim} +\ x -> case x of ( dictvars-and-methods-tuple ) -> expr +\end{verbatim} + +\begin{code} +instance OutputableBndr id => Outputable (HsExpr id) where + ppr expr = pprExpr expr +\end{code} + +\begin{code} +pprExpr :: OutputableBndr id => HsExpr id -> SDoc + +pprExpr e = pprDeeper (ppr_expr e) + +pprBinds :: OutputableBndr id => HsLocalBinds id -> SDoc +pprBinds b = pprDeeper (ppr b) + +ppr_lexpr :: OutputableBndr id => LHsExpr id -> SDoc +ppr_lexpr e = ppr_expr (unLoc e) + +ppr_expr (HsVar v) = pprHsVar v +ppr_expr (HsIPVar v) = ppr v +ppr_expr (HsLit lit) = ppr lit +ppr_expr (HsOverLit lit) = ppr lit + +ppr_expr (HsApp e1 e2) + = let (fun, args) = collect_args e1 [e2] in + (ppr_lexpr fun) <+> (sep (map pprParendExpr args)) + where + collect_args (L _ (HsApp fun arg)) args = collect_args fun (arg:args) + collect_args fun args = (fun, args) + +ppr_expr (OpApp e1 op fixity e2) + = case unLoc op of + HsVar v -> pp_infixly v + _ -> pp_prefixly + where + pp_e1 = pprParendExpr e1 -- Add parens to make precedence clear + pp_e2 = pprParendExpr e2 + + pp_prefixly + = hang (ppr op) 4 (sep [pp_e1, pp_e2]) + + pp_infixly v + = sep [pp_e1, hsep [pprInfix v, pp_e2]] + +ppr_expr (NegApp e _) = char '-' <+> pprParendExpr e + +ppr_expr (HsPar e) = parens (ppr_lexpr e) + +ppr_expr (SectionL expr op) + = case unLoc op of + HsVar v -> pp_infixly v + _ -> pp_prefixly + where + pp_expr = pprParendExpr expr + + pp_prefixly = hang (hsep [text " \\ x_ ->", ppr op]) + 4 (hsep [pp_expr, ptext SLIT("x_ )")]) + pp_infixly v = parens (sep [pp_expr, pprInfix v]) + +ppr_expr (SectionR op expr) + = case unLoc op of + HsVar v -> pp_infixly v + _ -> pp_prefixly + where + pp_expr = pprParendExpr expr + + pp_prefixly = hang (hsep [text "( \\ x_ ->", ppr op, ptext SLIT("x_")]) + 4 ((<>) pp_expr rparen) + pp_infixly v + = parens (sep [pprInfix v, pp_expr]) + +ppr_expr (HsLam matches) + = pprMatches LambdaExpr matches + +ppr_expr (HsCase expr matches) + = sep [ sep [ptext SLIT("case"), nest 4 (ppr expr), ptext SLIT("of")], + nest 2 (pprMatches CaseAlt matches) ] + +ppr_expr (HsIf e1 e2 e3) + = sep [hsep [ptext SLIT("if"), nest 2 (ppr e1), ptext SLIT("then")], + nest 4 (ppr e2), + ptext SLIT("else"), + nest 4 (ppr e3)] + +-- special case: let ... in let ... 
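-- (Editorial note, not in the original source: this special case makes a
--  chain like "let x = e1 in let y = e2 in body" print as a flat sequence,
--      let x = e1 in
--      let y = e2 in body
--  rather than indenting each inner 'let' further under the preceding 'in'.)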
+ppr_expr (HsLet binds expr@(L _ (HsLet _ _))) + = sep [hang (ptext SLIT("let")) 2 (hsep [pprBinds binds, ptext SLIT("in")]), + ppr_lexpr expr] + +ppr_expr (HsLet binds expr) + = sep [hang (ptext SLIT("let")) 2 (pprBinds binds), + hang (ptext SLIT("in")) 2 (ppr expr)] + +ppr_expr (HsDo do_or_list_comp stmts body _) = pprDo do_or_list_comp stmts body + +ppr_expr (ExplicitList _ exprs) + = brackets (fsep (punctuate comma (map ppr_lexpr exprs))) + +ppr_expr (ExplicitPArr _ exprs) + = pa_brackets (fsep (punctuate comma (map ppr_lexpr exprs))) + +ppr_expr (ExplicitTuple exprs boxity) + = tupleParens boxity (sep (punctuate comma (map ppr_lexpr exprs))) + +ppr_expr (RecordCon con_id con_expr rbinds) + = pp_rbinds (ppr con_id) rbinds + +ppr_expr (RecordUpd aexp rbinds _ _) + = pp_rbinds (pprParendExpr aexp) rbinds + +ppr_expr (ExprWithTySig expr sig) + = hang (nest 2 (ppr_lexpr expr) <+> dcolon) + 4 (ppr sig) +ppr_expr (ExprWithTySigOut expr sig) + = hang (nest 2 (ppr_lexpr expr) <+> dcolon) + 4 (ppr sig) + +ppr_expr (ArithSeq expr info) = brackets (ppr info) +ppr_expr (PArrSeq expr info) = pa_brackets (ppr info) + +ppr_expr EWildPat = char '_' +ppr_expr (ELazyPat e) = char '~' <> pprParendExpr e +ppr_expr (EAsPat v e) = ppr v <> char '@' <> pprParendExpr e + +ppr_expr (HsSCC lbl expr) + = sep [ ptext SLIT("_scc_") <+> doubleQuotes (ftext lbl), pprParendExpr expr ] + +ppr_expr (TyLam tyvars expr) + = hang (hsep [ptext SLIT("/\\"), + hsep (map (pprBndr LambdaBind) tyvars), + ptext SLIT("->")]) + 4 (ppr_lexpr expr) + +ppr_expr (TyApp expr [ty]) + = hang (ppr_lexpr expr) 4 (pprParendType ty) + +ppr_expr (TyApp expr tys) + = hang (ppr_lexpr expr) + 4 (brackets (interpp'SP tys)) + +ppr_expr (DictLam dictvars expr) + = hang (hsep [ptext SLIT("\\{-dict-}"), + hsep (map (pprBndr LambdaBind) dictvars), + ptext SLIT("->")]) + 4 (ppr_lexpr expr) + +ppr_expr (DictApp expr [dname]) + = hang (ppr_lexpr expr) 4 (ppr dname) + +ppr_expr (DictApp expr dnames) + = hang (ppr_lexpr expr) + 4 (brackets (interpp'SP dnames)) + +ppr_expr (HsCoerce co_fn e) = ppr_expr e + +ppr_expr (HsType id) = ppr id + +ppr_expr (HsSpliceE s) = pprSplice s +ppr_expr (HsBracket b) = pprHsBracket b +ppr_expr (HsBracketOut e []) = ppr e +ppr_expr (HsBracketOut e ps) = ppr e $$ ptext SLIT("pending") <+> ppr ps + +ppr_expr (HsProc pat (L _ (HsCmdTop cmd _ _ _))) + = hsep [ptext SLIT("proc"), ppr pat, ptext SLIT("->"), ppr cmd] + +ppr_expr (HsArrApp arrow arg _ HsFirstOrderApp True) + = hsep [ppr_lexpr arrow, ptext SLIT("-<"), ppr_lexpr arg] +ppr_expr (HsArrApp arrow arg _ HsFirstOrderApp False) + = hsep [ppr_lexpr arg, ptext SLIT(">-"), ppr_lexpr arrow] +ppr_expr (HsArrApp arrow arg _ HsHigherOrderApp True) + = hsep [ppr_lexpr arrow, ptext SLIT("-<<"), ppr_lexpr arg] +ppr_expr (HsArrApp arrow arg _ HsHigherOrderApp False) + = hsep [ppr_lexpr arg, ptext SLIT(">>-"), ppr_lexpr arrow] + +ppr_expr (HsArrForm (L _ (HsVar v)) (Just _) [arg1, arg2]) + = sep [pprCmdArg (unLoc arg1), hsep [pprInfix v, pprCmdArg (unLoc arg2)]] +ppr_expr (HsArrForm op _ args) + = hang (ptext SLIT("(|") <> ppr_lexpr op) + 4 (sep (map (pprCmdArg.unLoc) args) <> ptext SLIT("|)")) + +pprCmdArg :: OutputableBndr id => HsCmdTop id -> SDoc +pprCmdArg (HsCmdTop cmd@(L _ (HsArrForm _ Nothing [])) _ _ _) + = ppr_lexpr cmd +pprCmdArg (HsCmdTop cmd _ _ _) + = parens (ppr_lexpr cmd) + +-- Put a var in backquotes if it's not an operator already +pprInfix :: Outputable name => name -> SDoc +pprInfix v | isOperator ppr_v = ppr_v + | otherwise = char '`' <> ppr_v <> char '`' + where + 
ppr_v = ppr v + +-- add parallel array brackets around a document +-- +pa_brackets :: SDoc -> SDoc +pa_brackets p = ptext SLIT("[:") <> p <> ptext SLIT(":]") +\end{code} + +Parenthesize unless very simple: +\begin{code} +pprParendExpr :: OutputableBndr id => LHsExpr id -> SDoc +pprParendExpr expr + = let + pp_as_was = ppr_lexpr expr + -- Using ppr_expr here avoids the call to 'deeper' + -- Not sure if that's always right. + in + case unLoc expr of + HsLit l -> ppr l + HsOverLit l -> ppr l + + HsVar _ -> pp_as_was + HsIPVar _ -> pp_as_was + ExplicitList _ _ -> pp_as_was + ExplicitPArr _ _ -> pp_as_was + ExplicitTuple _ _ -> pp_as_was + HsPar _ -> pp_as_was + HsBracket _ -> pp_as_was + HsBracketOut _ [] -> pp_as_was + + _ -> parens pp_as_was +\end{code} + +%************************************************************************ +%* * +\subsection{Commands (in arrow abstractions)} +%* * +%************************************************************************ + +We re-use HsExpr to represent these. + +\begin{code} +type HsCmd id = HsExpr id + +type LHsCmd id = LHsExpr id + +data HsArrAppType = HsHigherOrderApp | HsFirstOrderApp +\end{code} + +The legal constructors for commands are: + + = HsArrApp ... -- as above + + | HsArrForm ... -- as above + + | HsApp (HsCmd id) + (HsExpr id) + + | HsLam (Match id) -- kappa + + -- the renamer turns this one into HsArrForm + | OpApp (HsExpr id) -- left operand + (HsCmd id) -- operator + Fixity -- Renamer adds fixity; bottom until then + (HsCmd id) -- right operand + + | HsPar (HsCmd id) -- parenthesised command + + | HsCase (HsExpr id) + [Match id] -- bodies are HsCmd's + SrcLoc + + | HsIf (HsExpr id) -- predicate + (HsCmd id) -- then part + (HsCmd id) -- else part + SrcLoc + + | HsLet (HsLocalBinds id) -- let(rec) + (HsCmd id) + + | HsDo (HsStmtContext Name) -- The parameterisation is unimportant + -- because in this context we never use + -- the PatGuard or ParStmt variant + [Stmt id] -- HsExpr's are really HsCmd's + PostTcType -- Type of the whole expression + SrcLoc + +Top-level command, introducing a new arrow. +This may occur inside a proc (where the stack is empty) or as an +argument of a command-forming operator. + +\begin{code} +type LHsCmdTop id = Located (HsCmdTop id) + +data HsCmdTop id + = HsCmdTop (LHsCmd id) + [PostTcType] -- types of inputs on the command's stack + PostTcType -- return type of the command + (SyntaxTable id) + -- after type checking: + -- names used in the command's desugaring +\end{code} + +%************************************************************************ +%* * +\subsection{Record binds} +%* * +%************************************************************************ + +\begin{code} +type HsRecordBinds id = [(Located id, LHsExpr id)] + +recBindFields :: HsRecordBinds id -> [id] +recBindFields rbinds = [unLoc field | (field,_) <- rbinds] + +pp_rbinds :: OutputableBndr id => SDoc -> HsRecordBinds id -> SDoc +pp_rbinds thing rbinds + = hang thing + 4 (braces (sep (punctuate comma (map (pp_rbind) rbinds)))) + where + pp_rbind (v, e) = hsep [pprBndr LetBind (unLoc v), char '=', ppr e] +\end{code} + + + +%************************************************************************ +%* * +\subsection{@Match@, @GRHSs@, and @GRHS@ datatypes} +%* * +%************************************************************************ + +@Match@es are sets of pattern bindings and right hand sides for +functions, patterns or case branches. 
For example, if a function @g@ +is defined as: +\begin{verbatim} +g (x,y) = y +g ((x:ys),y) = y+1, +\end{verbatim} +then \tr{g} has two @Match@es: @(x,y) = y@ and @((x:ys),y) = y+1@. + +It is always the case that each element of an @[Match]@ list has the +same number of @pats@s inside it. This corresponds to saying that +a function defined by pattern matching must have the same number of +patterns in each equation. + +\begin{code} +data MatchGroup id + = MatchGroup + [LMatch id] -- The alternatives + PostTcType -- The type is the type of the entire group + -- t1 -> ... -> tn -> tr + -- where there are n patterns + +type LMatch id = Located (Match id) + +data Match id + = Match + [LPat id] -- The patterns + (Maybe (LHsType id)) -- A type signature for the result of the match + -- Nothing after typechecking + (GRHSs id) + +matchGroupArity :: MatchGroup id -> Arity +matchGroupArity (MatchGroup (match:matches) _) + = ASSERT( all ((== n_pats) . length . hsLMatchPats) matches ) + -- Assertion just checks that all the matches have the same number of pats + n_pats + where + n_pats = length (hsLMatchPats match) + +hsLMatchPats :: LMatch id -> [LPat id] +hsLMatchPats (L _ (Match pats _ _)) = pats + +-- GRHSs are used both for pattern bindings and for Matches +data GRHSs id + = GRHSs [LGRHS id] -- Guarded RHSs + (HsLocalBinds id) -- The where clause + +type LGRHS id = Located (GRHS id) + +data GRHS id = GRHS [LStmt id] -- Guards + (LHsExpr id) -- Right hand side +\end{code} + +We know the list must have at least one @Match@ in it. + +\begin{code} +pprMatches :: (OutputableBndr id) => HsMatchContext id -> MatchGroup id -> SDoc +pprMatches ctxt (MatchGroup matches _) = vcat (map (pprMatch ctxt) (map unLoc matches)) + +-- Exported to HsBinds, which can't see the defn of HsMatchContext +pprFunBind :: (OutputableBndr id) => id -> MatchGroup id -> SDoc +pprFunBind fun matches = pprMatches (FunRhs fun) matches + +-- Exported to HsBinds, which can't see the defn of HsMatchContext +pprPatBind :: (OutputableBndr bndr, OutputableBndr id) + => LPat bndr -> GRHSs id -> SDoc +pprPatBind pat grhss = sep [ppr pat, nest 4 (pprGRHSs PatBindRhs grhss)] + + +pprMatch :: OutputableBndr id => HsMatchContext id -> Match id -> SDoc +pprMatch ctxt (Match pats maybe_ty grhss) + = pp_name ctxt <+> sep [sep (map ppr pats), + ppr_maybe_ty, + nest 2 (pprGRHSs ctxt grhss)] + where + pp_name (FunRhs fun) = ppr fun -- Not pprBndr; the AbsBinds will + -- have printed the signature + pp_name LambdaExpr = char '\\' + pp_name other = empty + + ppr_maybe_ty = case maybe_ty of + Just ty -> dcolon <+> ppr ty + Nothing -> empty + + +pprGRHSs :: OutputableBndr id => HsMatchContext id -> GRHSs id -> SDoc +pprGRHSs ctxt (GRHSs grhss binds) + = vcat (map (pprGRHS ctxt . unLoc) grhss) + $$ + (if isEmptyLocalBinds binds then empty + else text "where" $$ nest 4 (pprBinds binds)) + +pprGRHS :: OutputableBndr id => HsMatchContext id -> GRHS id -> SDoc + +pprGRHS ctxt (GRHS [] expr) + = pp_rhs ctxt expr + +pprGRHS ctxt (GRHS guards expr) + = sep [char '|' <+> interpp'SP guards, pp_rhs ctxt expr] + +pp_rhs ctxt rhs = matchSeparator ctxt <+> pprDeeper (ppr rhs) +\end{code} + +%************************************************************************ +%* * +\subsection{Do stmts and list comprehensions} +%* * +%************************************************************************ + +\begin{code} +type LStmt id = Located (Stmt id) + +-- The SyntaxExprs in here are used *only* for do-notation, which +-- has rebindable syntax. Otherwise they are unused. 
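-- (Editorial example, not in the original source: under rebindable syntax the
--  (>>=) slot of a BindStmt holds whatever '>>=' is in scope at that point,
--  rather than GHC.Base.>>=; in ordinary code it simply names the standard
--  operator, as described for SyntaxTable above.)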
+data Stmt id + = BindStmt (LPat id) + (LHsExpr id) + (SyntaxExpr id) -- The (>>=) operator + (SyntaxExpr id) -- The fail operator + -- The fail operator is noSyntaxExpr + -- if the pattern match can't fail + + | ExprStmt (LHsExpr id) + (SyntaxExpr id) -- The (>>) operator + PostTcType -- Element type of the RHS (used for arrows) + + | LetStmt (HsLocalBinds id) + + -- ParStmts only occur in a list comprehension + | ParStmt [([LStmt id], [id])] -- After renaming, the ids are the binders + -- bound by the stmts and used subsequently + + -- Recursive statement (see Note [RecStmt] below) + | RecStmt [LStmt id] + --- The next two fields are only valid after renaming + [id] -- The ids are a subset of the variables bound by the stmts + -- that are used in stmts that follow the RecStmt + + [id] -- Ditto, but these variables are the "recursive" ones, that + -- are used before they are bound in the stmts of the RecStmt + -- From a type-checking point of view, these ones have to be monomorphic + + --- These fields are only valid after typechecking + [PostTcExpr] -- These expressions correspond + -- 1-to-1 with the "recursive" [id], and are the expresions that + -- should be returned by the recursion. They may not quite be the + -- Ids themselves, because the Id may be *polymorphic*, but + -- the returned thing has to be *monomorphic*. + (DictBinds id) -- Method bindings of Ids bound by the RecStmt, + -- and used afterwards +\end{code} + +ExprStmts are a bit tricky, because what they mean +depends on the context. Consider the following contexts: + + A do expression of type (m res_ty) + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + * ExprStmt E any_ty: do { ....; E; ... } + E :: m any_ty + Translation: E >> ... + + A list comprehensions of type [elt_ty] + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + * ExprStmt E Bool: [ .. | .... E ] + [ .. | ..., E, ... ] + [ .. | .... | ..., E | ... ] + E :: Bool + Translation: if E then fail else ... + + A guard list, guarding a RHS of type rhs_ty + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + * ExprStmt E Bool: f x | ..., E, ... = ...rhs... + E :: Bool + Translation: if E then fail else ... + +Array comprehensions are handled like list comprehensions -=chak + +Note [RecStmt] +~~~~~~~~~~~~~~ +Example: + HsDo [ BindStmt x ex + + , RecStmt [a::forall a. a -> a, b] + [a::Int -> Int, c] + [ BindStmt b (return x) + , LetStmt a = ea + , BindStmt c ec ] + + , return (a b) ] + +Here, the RecStmt binds a,b,c; but + - Only a,b are used in the stmts *following* the RecStmt, + This 'a' is *polymorphic' + - Only a,c are used in the stmts *inside* the RecStmt + *before* their bindings + This 'a' is monomorphic + +Nota Bene: the two a's have different types, even though they +have the same Name. 
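As a rough surface-level counterpart (an editorial sketch, not taken from this
source), the representation above corresponds to an @mdo@ expression of the shape
\begin{verbatim}
mdo x <- ex
    b <- return x
    let a = ea
    c <- ec
    return (a b)
\end{verbatim}
where the renamer works out which contiguous statements form the recursive
segment and wraps just those in the @RecStmt@.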
+ + +\begin{code} +instance OutputableBndr id => Outputable (Stmt id) where + ppr stmt = pprStmt stmt + +pprStmt (BindStmt pat expr _ _) = hsep [ppr pat, ptext SLIT("<-"), ppr expr] +pprStmt (LetStmt binds) = hsep [ptext SLIT("let"), pprBinds binds] +pprStmt (ExprStmt expr _ _) = ppr expr +pprStmt (ParStmt stmtss) = hsep (map (\stmts -> ptext SLIT("| ") <> ppr stmts) stmtss) +pprStmt (RecStmt segment _ _ _ _) = ptext SLIT("rec") <+> braces (vcat (map ppr segment)) + +pprDo :: OutputableBndr id => HsStmtContext any -> [LStmt id] -> LHsExpr id -> SDoc +pprDo DoExpr stmts body = hang (ptext SLIT("do")) 2 (vcat (map ppr stmts) $$ ppr body) +pprDo (MDoExpr _) stmts body = hang (ptext SLIT("mdo")) 3 (vcat (map ppr stmts) $$ ppr body) +pprDo ListComp stmts body = pprComp brackets stmts body +pprDo PArrComp stmts body = pprComp pa_brackets stmts body + +pprComp :: OutputableBndr id => (SDoc -> SDoc) -> [LStmt id] -> LHsExpr id -> SDoc +pprComp brack quals body + = brack $ + hang (ppr body <+> char '|') + 4 (interpp'SP quals) +\end{code} + +%************************************************************************ +%* * + Template Haskell quotation brackets +%* * +%************************************************************************ + +\begin{code} +data HsSplice id = HsSplice -- $z or $(f 4) + id -- The id is just a unique name to + (LHsExpr id) -- identify this splice point + +instance OutputableBndr id => Outputable (HsSplice id) where + ppr = pprSplice + +pprSplice :: OutputableBndr id => HsSplice id -> SDoc +pprSplice (HsSplice n e) = char '$' <> brackets (ppr n) <> pprParendExpr e + + +data HsBracket id = ExpBr (LHsExpr id) -- [| expr |] + | PatBr (LPat id) -- [p| pat |] + | DecBr (HsGroup id) -- [d| decls |] + | TypBr (LHsType id) -- [t| type |] + | VarBr id -- 'x, ''T + +instance OutputableBndr id => Outputable (HsBracket id) where + ppr = pprHsBracket + + +pprHsBracket (ExpBr e) = thBrackets empty (ppr e) +pprHsBracket (PatBr p) = thBrackets (char 'p') (ppr p) +pprHsBracket (DecBr d) = thBrackets (char 'd') (ppr d) +pprHsBracket (TypBr t) = thBrackets (char 't') (ppr t) +pprHsBracket (VarBr n) = char '\'' <> ppr n + -- Infelicity: can't show ' vs '', because + -- we can't ask n what its OccName is, because the + -- pretty-printer for HsExpr doesn't ask for NamedThings + -- But the pretty-printer for names will show the OccName class + +thBrackets pp_kind pp_body = char '[' <> pp_kind <> char '|' <+> + pp_body <+> ptext SLIT("|]") +\end{code} + +%************************************************************************ +%* * +\subsection{Enumerations and list comprehensions} +%* * +%************************************************************************ + +\begin{code} +data ArithSeqInfo id + = From (LHsExpr id) + | FromThen (LHsExpr id) + (LHsExpr id) + | FromTo (LHsExpr id) + (LHsExpr id) + | FromThenTo (LHsExpr id) + (LHsExpr id) + (LHsExpr id) +\end{code} + +\begin{code} +instance OutputableBndr id => Outputable (ArithSeqInfo id) where + ppr (From e1) = hcat [ppr e1, pp_dotdot] + ppr (FromThen e1 e2) = hcat [ppr e1, comma, space, ppr e2, pp_dotdot] + ppr (FromTo e1 e3) = hcat [ppr e1, pp_dotdot, ppr e3] + ppr (FromThenTo e1 e2 e3) + = hcat [ppr e1, comma, space, ppr e2, pp_dotdot, ppr e3] + +pp_dotdot = ptext SLIT(" .. 
") +\end{code} + + +%************************************************************************ +%* * +\subsection{HsMatchCtxt} +%* * +%************************************************************************ + +\begin{code} +data HsMatchContext id -- Context of a Match + = FunRhs id -- Function binding for f + | CaseAlt -- Guard on a case alternative + | LambdaExpr -- Pattern of a lambda + | ProcExpr -- Pattern of a proc + | PatBindRhs -- Pattern binding + | RecUpd -- Record update [used only in DsExpr to tell matchWrapper + -- what sort of runtime error message to generate] + | StmtCtxt (HsStmtContext id) -- Pattern of a do-stmt or list comprehension + deriving () + +data HsStmtContext id + = ListComp + | DoExpr + | MDoExpr PostTcTable -- Recursive do-expression + -- (tiresomely, it needs table + -- of its return/bind ops) + | PArrComp -- Parallel array comprehension + | PatGuard (HsMatchContext id) -- Pattern guard for specified thing + | ParStmtCtxt (HsStmtContext id) -- A branch of a parallel stmt +\end{code} + +\begin{code} +isDoExpr :: HsStmtContext id -> Bool +isDoExpr DoExpr = True +isDoExpr (MDoExpr _) = True +isDoExpr other = False +\end{code} + +\begin{code} +matchSeparator (FunRhs _) = ptext SLIT("=") +matchSeparator CaseAlt = ptext SLIT("->") +matchSeparator LambdaExpr = ptext SLIT("->") +matchSeparator ProcExpr = ptext SLIT("->") +matchSeparator PatBindRhs = ptext SLIT("=") +matchSeparator (StmtCtxt _) = ptext SLIT("<-") +matchSeparator RecUpd = panic "unused" +\end{code} + +\begin{code} +pprMatchContext (FunRhs fun) = ptext SLIT("the definition of") <+> quotes (ppr fun) +pprMatchContext CaseAlt = ptext SLIT("a case alternative") +pprMatchContext RecUpd = ptext SLIT("a record-update construct") +pprMatchContext PatBindRhs = ptext SLIT("a pattern binding") +pprMatchContext LambdaExpr = ptext SLIT("a lambda abstraction") +pprMatchContext ProcExpr = ptext SLIT("an arrow abstraction") +pprMatchContext (StmtCtxt ctxt) = ptext SLIT("a pattern binding in") $$ pprStmtContext ctxt + +pprMatchRhsContext (FunRhs fun) = ptext SLIT("a right-hand side of function") <+> quotes (ppr fun) +pprMatchRhsContext CaseAlt = ptext SLIT("the body of a case alternative") +pprMatchRhsContext PatBindRhs = ptext SLIT("the right-hand side of a pattern binding") +pprMatchRhsContext LambdaExpr = ptext SLIT("the body of a lambda") +pprMatchRhsContext ProcExpr = ptext SLIT("the body of a proc") +pprMatchRhsContext RecUpd = panic "pprMatchRhsContext" + +pprStmtContext (ParStmtCtxt c) = sep [ptext SLIT("a parallel branch of"), pprStmtContext c] +pprStmtContext (PatGuard ctxt) = ptext SLIT("a pattern guard for") $$ pprMatchContext ctxt +pprStmtContext DoExpr = ptext SLIT("a 'do' expression") +pprStmtContext (MDoExpr _) = ptext SLIT("an 'mdo' expression") +pprStmtContext ListComp = ptext SLIT("a list comprehension") +pprStmtContext PArrComp = ptext SLIT("an array comprehension") + +-- Used for the result statement of comprehension +-- e.g. the 'e' in [ e | ... 
] +-- or the 'r' in f x = r +pprStmtResultContext (PatGuard ctxt) = pprMatchRhsContext ctxt +pprStmtResultContext other = ptext SLIT("the result of") <+> pprStmtContext other + + +-- Used to generate the string for a *runtime* error message +matchContextErrString (FunRhs fun) = "function " ++ showSDoc (ppr fun) +matchContextErrString CaseAlt = "case" +matchContextErrString PatBindRhs = "pattern binding" +matchContextErrString RecUpd = "record update" +matchContextErrString LambdaExpr = "lambda" +matchContextErrString ProcExpr = "proc" +matchContextErrString (StmtCtxt (ParStmtCtxt c)) = matchContextErrString (StmtCtxt c) +matchContextErrString (StmtCtxt (PatGuard _)) = "pattern guard" +matchContextErrString (StmtCtxt DoExpr) = "'do' expression" +matchContextErrString (StmtCtxt (MDoExpr _)) = "'mdo' expression" +matchContextErrString (StmtCtxt ListComp) = "list comprehension" +matchContextErrString (StmtCtxt PArrComp) = "array comprehension" +\end{code} diff --git a/compiler/hsSyn/HsExpr.lhs-boot b/compiler/hsSyn/HsExpr.lhs-boot new file mode 100644 index 0000000000..503701bf66 --- /dev/null +++ b/compiler/hsSyn/HsExpr.lhs-boot @@ -0,0 +1,27 @@ +\begin{code} +module HsExpr where + +import SrcLoc ( Located ) +import Outputable ( SDoc, OutputableBndr ) +import {-# SOURCE #-} HsPat ( LPat ) + +data HsExpr i +data HsSplice i +data MatchGroup a +data GRHSs a + +type LHsExpr a = Located (HsExpr a) +type SyntaxExpr a = HsExpr a + +pprExpr :: (OutputableBndr i) => + HsExpr i -> SDoc + +pprSplice :: (OutputableBndr i) => + HsSplice i -> SDoc + +pprPatBind :: (OutputableBndr b, OutputableBndr i) => + LPat b -> GRHSs i -> SDoc + +pprFunBind :: (OutputableBndr i) => + i -> MatchGroup i -> SDoc +\end{code} diff --git a/compiler/hsSyn/HsImpExp.lhs b/compiler/hsSyn/HsImpExp.lhs new file mode 100644 index 0000000000..220afb7499 --- /dev/null +++ b/compiler/hsSyn/HsImpExp.lhs @@ -0,0 +1,125 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[HsImpExp]{Abstract syntax: imports, exports, interfaces} + +\begin{code} +module HsImpExp where + +#include "HsVersions.h" + +import Module ( Module ) +import Outputable +import FastString +import SrcLoc ( Located(..) ) +import Char ( isAlpha ) +\end{code} + +%************************************************************************ +%* * +\subsection{Import and export declaration lists} +%* * +%************************************************************************ + +One per \tr{import} declaration in a module. 
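For example (illustrative only, not drawn from this patch), a declaration such as
\begin{verbatim}
import qualified Foo as Bar hiding (f, g)
\end{verbatim}
fills in the fields below as: module name @Foo@, @False@ for the {-# SOURCE #-}
flag, @True@ for qualified, @Just Bar@ for the alias, and a @Just (True, ...)@
import specification listing @f@ and @g@ as hidden names.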
+\begin{code} +type LImportDecl name = Located (ImportDecl name) + +data ImportDecl name + = ImportDecl (Located Module) -- module name + Bool -- True <=> {-# SOURCE #-} import + Bool -- True => qualified + (Maybe Module) -- as Module + (Maybe (Bool, [LIE name])) -- (True => hiding, names) +\end{code} + +\begin{code} +instance (Outputable name) => Outputable (ImportDecl name) where + ppr (ImportDecl mod from qual as spec) + = hang (hsep [ptext SLIT("import"), ppr_imp from, + pp_qual qual, ppr mod, pp_as as]) + 4 (pp_spec spec) + where + pp_qual False = empty + pp_qual True = ptext SLIT("qualified") + + pp_as Nothing = empty + pp_as (Just a) = ptext SLIT("as ") <+> ppr a + + ppr_imp True = ptext SLIT("{-# SOURCE #-}") + ppr_imp False = empty + + pp_spec Nothing = empty + pp_spec (Just (False, spec)) + = parens (interpp'SP spec) + pp_spec (Just (True, spec)) + = ptext SLIT("hiding") <+> parens (interpp'SP spec) + +ideclName (ImportDecl mod_nm _ _ _ _) = mod_nm +\end{code} + +%************************************************************************ +%* * +\subsection{Imported and exported entities} +%* * +%************************************************************************ + +\begin{code} +type LIE name = Located (IE name) + +data IE name + = IEVar name + | IEThingAbs name -- Class/Type (can't tell) + | IEThingAll name -- Class/Type plus all methods/constructors + | IEThingWith name [name] -- Class/Type plus some methods/constructors + | IEModuleContents Module -- (Export Only) +\end{code} + +\begin{code} +ieName :: IE name -> name +ieName (IEVar n) = n +ieName (IEThingAbs n) = n +ieName (IEThingWith n _) = n +ieName (IEThingAll n) = n + +ieNames :: IE a -> [a] +ieNames (IEVar n ) = [n] +ieNames (IEThingAbs n ) = [n] +ieNames (IEThingAll n ) = [n] +ieNames (IEThingWith n ns) = n:ns +ieNames (IEModuleContents _ ) = [] +\end{code} + +\begin{code} +instance (Outputable name) => Outputable (IE name) where + ppr (IEVar var) = pprHsVar var + ppr (IEThingAbs thing) = ppr thing + ppr (IEThingAll thing) = hcat [ppr thing, text "(..)"] + ppr (IEThingWith thing withs) + = ppr thing <> parens (fsep (punctuate comma (map pprHsVar withs))) + ppr (IEModuleContents mod) + = ptext SLIT("module") <+> ppr mod +\end{code} + +\begin{code} +pprHsVar :: Outputable name => name -> SDoc +pprHsVar v | isOperator ppr_v = parens ppr_v + | otherwise = ppr_v + where + ppr_v = ppr v + +isOperator :: SDoc -> Bool +isOperator ppr_v + = case showSDocUnqual ppr_v of + ('(':s) -> False -- (), (,) etc + ('[':s) -> False -- [] + ('$':c:s) -> not (isAlpha c) -- Don't treat $d as an operator + (':':c:s) -> not (isAlpha c) -- Don't treat :T as an operator + ('_':s) -> False -- Not an operator + (c:s) -> not (isAlpha c) -- Starts with non-alpha + other -> False + -- We use (showSDoc (ppr v)), rather than isSymOcc (getOccName v) simply so + -- that we don't need NamedThing in the context of all these functions. + -- Gruesome, but simple. 
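-- (Editorial examples, not in the original source: by this test
--      isOperator (text "+")   == True     -- starts with a non-alphabetic char
--      isOperator (text "map") == False    -- ordinary alphabetic name
--      isOperator (text "(,)") == False    -- parenthesised, like () and []
--      isOperator (text ":T")  == False    -- constructor-like, per the ':' case
--  so pprHsVar parenthesises "+" but leaves "map" alone.)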
+\end{code} + diff --git a/compiler/hsSyn/HsLit.lhs b/compiler/hsSyn/HsLit.lhs new file mode 100644 index 0000000000..c6d7e5dbea --- /dev/null +++ b/compiler/hsSyn/HsLit.lhs @@ -0,0 +1,96 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[HsLit]{Abstract syntax: source-language literals} + +\begin{code} +module HsLit where + +#include "HsVersions.h" + +import {-# SOURCE #-} HsExpr( SyntaxExpr ) +import Type ( Type ) +import Outputable +import FastString +import Ratio ( Rational ) +\end{code} + + +%************************************************************************ +%* * +\subsection[HsLit]{Literals} +%* * +%************************************************************************ + + +\begin{code} +data HsLit + = HsChar Char -- Character + | HsCharPrim Char -- Unboxed character + | HsString FastString -- String + | HsStringPrim FastString -- Packed string + | HsInt Integer -- Genuinely an Int; arises from TcGenDeriv, + -- and from TRANSLATION + | HsIntPrim Integer -- Unboxed Int + | HsInteger Integer Type -- Genuinely an integer; arises only from TRANSLATION + -- (overloaded literals are done with HsOverLit) + | HsRat Rational Type -- Genuinely a rational; arises only from TRANSLATION + -- (overloaded literals are done with HsOverLit) + | HsFloatPrim Rational -- Unboxed Float + | HsDoublePrim Rational -- Unboxed Double + +instance Eq HsLit where + (HsChar x1) == (HsChar x2) = x1==x2 + (HsCharPrim x1) == (HsCharPrim x2) = x1==x2 + (HsString x1) == (HsString x2) = x1==x2 + (HsStringPrim x1) == (HsStringPrim x2) = x1==x2 + (HsInt x1) == (HsInt x2) = x1==x2 + (HsIntPrim x1) == (HsIntPrim x2) = x1==x2 + (HsInteger x1 _) == (HsInteger x2 _) = x1==x2 + (HsRat x1 _) == (HsRat x2 _) = x1==x2 + (HsFloatPrim x1) == (HsFloatPrim x2) = x1==x2 + (HsDoublePrim x1) == (HsDoublePrim x2) = x1==x2 + lit1 == lit2 = False + +data HsOverLit id -- An overloaded literal + = HsIntegral Integer (SyntaxExpr id) -- Integer-looking literals; + | HsFractional Rational (SyntaxExpr id) -- Frac-looking literals + -- Before type checking, the SyntaxExpr is 'fromInteger' or 'fromRational' + -- After type checking, it is (fromInteger 3) or lit_78; that is, + -- the expression that should replace the literal. + -- This is unusual, because we're replacing 'fromInteger' with a call + -- to fromInteger. 
Reason: it allows commoning up of the fromInteger + -- calls, which wouldn't be possible if the desguarar made the application + +-- Comparison operations are needed when grouping literals +-- for compiling pattern-matching (module MatchLit) +instance Eq (HsOverLit id) where + (HsIntegral i1 _) == (HsIntegral i2 _) = i1 == i2 + (HsFractional f1 _) == (HsFractional f2 _) = f1 == f2 + l1 == l2 = False + +instance Ord (HsOverLit id) where + compare (HsIntegral i1 _) (HsIntegral i2 _) = i1 `compare` i2 + compare (HsIntegral _ _) (HsFractional _ _) = LT + compare (HsFractional f1 _) (HsFractional f2 _) = f1 `compare` f2 + compare (HsFractional f1 _) (HsIntegral _ _) = GT +\end{code} + +\begin{code} +instance Outputable HsLit where + -- Use "show" because it puts in appropriate escapes + ppr (HsChar c) = pprHsChar c + ppr (HsCharPrim c) = pprHsChar c <> char '#' + ppr (HsString s) = pprHsString s + ppr (HsStringPrim s) = pprHsString s <> char '#' + ppr (HsInt i) = integer i + ppr (HsInteger i _) = integer i + ppr (HsRat f _) = rational f + ppr (HsFloatPrim f) = rational f <> char '#' + ppr (HsDoublePrim d) = rational d <> text "##" + ppr (HsIntPrim i) = integer i <> char '#' + +instance Outputable (HsOverLit id) where + ppr (HsIntegral i _) = integer i + ppr (HsFractional f _) = rational f +\end{code} diff --git a/compiler/hsSyn/HsPat.hi-boot-5 b/compiler/hsSyn/HsPat.hi-boot-5 new file mode 100644 index 0000000000..1f02ce3d47 --- /dev/null +++ b/compiler/hsSyn/HsPat.hi-boot-5 @@ -0,0 +1,6 @@ +__interface HsPat 1 0 where +__export Pat LPat ; + +1 data Pat i ; +1 type LPat i = SrcLoc.Located (Pat i) ; + diff --git a/compiler/hsSyn/HsPat.hi-boot-6 b/compiler/hsSyn/HsPat.hi-boot-6 new file mode 100644 index 0000000000..593caf2d17 --- /dev/null +++ b/compiler/hsSyn/HsPat.hi-boot-6 @@ -0,0 +1,4 @@ +module HsPat where + +data Pat i +type LPat i = SrcLoc.Located (Pat i) diff --git a/compiler/hsSyn/HsPat.lhs b/compiler/hsSyn/HsPat.lhs new file mode 100644 index 0000000000..953d228942 --- /dev/null +++ b/compiler/hsSyn/HsPat.lhs @@ -0,0 +1,324 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[PatSyntax]{Abstract Haskell syntax---patterns} + +\begin{code} +module HsPat ( + Pat(..), InPat, OutPat, LPat, + + HsConDetails(..), hsConArgs, + + mkPrefixConPat, mkCharLitPat, mkNilPat, + + isBangHsBind, + patsAreAllCons, isConPat, isSigPat, isWildPat, + patsAreAllLits, isLitPat, isIrrefutableHsPat + ) where + +#include "HsVersions.h" + + +import {-# SOURCE #-} HsExpr ( SyntaxExpr ) + +-- friends: +import HsBinds ( DictBinds, HsBind(..), emptyLHsBinds, pprLHsBinds ) +import HsLit ( HsLit(HsCharPrim), HsOverLit ) +import HsTypes ( LHsType, PostTcType ) +import BasicTypes ( Boxity, tupleParens ) +-- others: +import PprCore ( {- instance OutputableBndr TyVar -} ) +import TysWiredIn ( nilDataCon, charDataCon, charTy ) +import Var ( TyVar ) +import DataCon ( DataCon, dataConTyCon ) +import TyCon ( isProductTyCon ) +import Outputable +import Type ( Type ) +import SrcLoc ( Located(..), unLoc, noLoc ) +\end{code} + + +\begin{code} +type InPat id = LPat id -- No 'Out' constructors +type OutPat id = LPat id -- No 'In' constructors + +type LPat id = Located (Pat id) + +data Pat id + = ------------ Simple patterns --------------- + WildPat PostTcType -- Wild card + | VarPat id -- Variable + | VarPatOut id (DictBinds id) -- Used only for overloaded Ids; the + -- bindings give its overloaded instances + | LazyPat (LPat id) -- Lazy pattern + | AsPat (Located id) (LPat id) -- As pattern + | ParPat 
(LPat id) -- Parenthesised pattern + | BangPat (LPat id) -- Bang patterng + + ------------ Lists, tuples, arrays --------------- + | ListPat [LPat id] -- Syntactic list + PostTcType -- The type of the elements + + | TuplePat [LPat id] -- Tuple + Boxity -- UnitPat is TuplePat [] + PostTcType + -- You might think that the PostTcType was redundant, but it's essential + -- data T a where + -- T1 :: Int -> T Int + -- f :: (T a, a) -> Int + -- f (T1 x, z) = z + -- When desugaring, we must generate + -- f = /\a. \v::a. case v of (t::T a, w::a) -> + -- case t of (T1 (x::Int)) -> + -- Note the (w::a), NOT (w::Int), because we have not yet + -- refined 'a' to Int. So we must know that the second component + -- of the tuple is of type 'a' not Int. See selectMatchVar + + | PArrPat [LPat id] -- Syntactic parallel array + PostTcType -- The type of the elements + + ------------ Constructor patterns --------------- + | ConPatIn (Located id) + (HsConDetails id (LPat id)) + + | ConPatOut (Located DataCon) + [TyVar] -- Existentially bound type variables + [id] -- Ditto dictionaries + (DictBinds id) -- Bindings involving those dictionaries + (HsConDetails id (LPat id)) + Type -- The type of the pattern + + ------------ Literal and n+k patterns --------------- + | LitPat HsLit -- Used for *non-overloaded* literal patterns: + -- Int#, Char#, Int, Char, String, etc. + + | NPat (HsOverLit id) -- ALWAYS positive + (Maybe (SyntaxExpr id)) -- Just (Name of 'negate') for negative + -- patterns, Nothing otherwise + (SyntaxExpr id) -- Equality checker, of type t->t->Bool + PostTcType -- Type of the pattern + + | NPlusKPat (Located id) -- n+k pattern + (HsOverLit id) -- It'll always be an HsIntegral + (SyntaxExpr id) -- (>=) function, of type t->t->Bool + (SyntaxExpr id) -- Name of '-' (see RnEnv.lookupSyntaxName) + + ------------ Generics --------------- + | TypePat (LHsType id) -- Type pattern for generic definitions + -- e.g f{| a+b |} = ... + -- These show up only in class declarations, + -- and should be a top-level pattern + + ------------ Pattern type signatures --------------- + | SigPatIn (LPat id) -- Pattern with a type signature + (LHsType id) + + | SigPatOut (LPat id) -- Pattern with a type signature + Type + + ------------ Dictionary patterns (translation only) --------------- + | DictPat -- Used when destructing Dictionaries with an explicit case + [id] -- superclass dicts + [id] -- methods +\end{code} + +HsConDetails is use both for patterns and for data type declarations + +\begin{code} +data HsConDetails id arg + = PrefixCon [arg] -- C p1 p2 p3 + | RecCon [(Located id, arg)] -- C { x = p1, y = p2 } + | InfixCon arg arg -- p1 `C` p2 + +hsConArgs :: HsConDetails id arg -> [arg] +hsConArgs (PrefixCon ps) = ps +hsConArgs (RecCon fs) = map snd fs +hsConArgs (InfixCon p1 p2) = [p1,p2] +\end{code} + + +%************************************************************************ +%* * +%* Printing patterns +%* * +%************************************************************************ + +\begin{code} +instance (OutputableBndr name) => Outputable (Pat name) where + ppr = pprPat + +pprPatBndr :: OutputableBndr name => name -> SDoc +pprPatBndr var -- Print with type info if -dppr-debug is on + = getPprStyle $ \ sty -> + if debugStyle sty then + parens (pprBndr LambdaBind var) -- Could pass the site to pprPat + -- but is it worth it? 
+ else + ppr var + +pprPat :: (OutputableBndr name) => Pat name -> SDoc +pprPat (VarPat var) = pprPatBndr var +pprPat (VarPatOut var bs) = parens (pprPatBndr var <+> braces (ppr bs)) +pprPat (WildPat _) = char '_' +pprPat (LazyPat pat) = char '~' <> ppr pat +pprPat (BangPat pat) = char '!' <> ppr pat +pprPat (AsPat name pat) = parens (hcat [ppr name, char '@', ppr pat]) +pprPat (ParPat pat) = parens (ppr pat) +pprPat (ListPat pats _) = brackets (interpp'SP pats) +pprPat (PArrPat pats _) = pabrackets (interpp'SP pats) +pprPat (TuplePat pats bx _) = tupleParens bx (interpp'SP pats) + +pprPat (ConPatIn con details) = pprUserCon con details +pprPat (ConPatOut con tvs dicts binds details _) + = getPprStyle $ \ sty -> -- Tiresome; in TcBinds.tcRhs we print out a + if debugStyle sty then -- typechecked Pat in an error message, + -- and we want to make sure it prints nicely + ppr con <+> sep [ hsep (map pprPatBndr tvs) <+> hsep (map pprPatBndr dicts), + pprLHsBinds binds, pprConArgs details] + else pprUserCon con details + +pprPat (LitPat s) = ppr s +pprPat (NPat l Nothing _ _) = ppr l +pprPat (NPat l (Just _) _ _) = char '-' <> ppr l +pprPat (NPlusKPat n k _ _) = hcat [ppr n, char '+', ppr k] +pprPat (TypePat ty) = ptext SLIT("{|") <> ppr ty <> ptext SLIT("|}") +pprPat (SigPatIn pat ty) = ppr pat <+> dcolon <+> ppr ty +pprPat (SigPatOut pat ty) = ppr pat <+> dcolon <+> ppr ty +pprPat (DictPat ds ms) = parens (sep [ptext SLIT("{-dict-}"), + brackets (interpp'SP ds), + brackets (interpp'SP ms)]) + +pprUserCon c (InfixCon p1 p2) = ppr p1 <+> ppr c <+> ppr p2 +pprUserCon c details = ppr c <+> pprConArgs details + +pprConArgs (PrefixCon pats) = interppSP pats +pprConArgs (InfixCon p1 p2) = interppSP [p1,p2] +pprConArgs (RecCon rpats) = braces (hsep (punctuate comma (map (pp_rpat) rpats))) + where + pp_rpat (v, p) = hsep [ppr v, char '=', ppr p] + + +-- add parallel array brackets around a document +-- +pabrackets :: SDoc -> SDoc +pabrackets p = ptext SLIT("[:") <> p <> ptext SLIT(":]") +\end{code} + + +%************************************************************************ +%* * +%* Building patterns +%* * +%************************************************************************ + +\begin{code} +mkPrefixConPat :: DataCon -> [OutPat id] -> Type -> OutPat id +-- Make a vanilla Prefix constructor pattern +mkPrefixConPat dc pats ty = noLoc $ ConPatOut (noLoc dc) [] [] emptyLHsBinds (PrefixCon pats) ty + +mkNilPat :: Type -> OutPat id +mkNilPat ty = mkPrefixConPat nilDataCon [] ty + +mkCharLitPat :: Char -> OutPat id +mkCharLitPat c = mkPrefixConPat charDataCon [noLoc $ LitPat (HsCharPrim c)] charTy +\end{code} + + +%************************************************************************ +%* * +%* Predicates for checking things about pattern-lists in EquationInfo * +%* * +%************************************************************************ + +\subsection[Pat-list-predicates]{Look for interesting things in patterns} + +Unlike in the Wadler chapter, where patterns are either ``variables'' +or ``constructors,'' here we distinguish between: +\begin{description} +\item[unfailable:] +Patterns that cannot fail to match: variables, wildcards, and lazy +patterns. + +These are the irrefutable patterns; the two other categories +are refutable patterns. + +\item[constructor:] +A non-literal constructor pattern (see next category). + +\item[literal patterns:] +At least the numeric ones may be overloaded. 
+\end{description} + +A pattern is in {\em exactly one} of the above three categories; `as' +patterns are treated specially, of course. + +The 1.3 report defines what ``irrefutable'' and ``failure-free'' patterns are. +\begin{code} +isWildPat (WildPat _) = True +isWildPat other = False + +patsAreAllCons :: [Pat id] -> Bool +patsAreAllCons pat_list = all isConPat pat_list + +isConPat (AsPat _ pat) = isConPat (unLoc pat) +isConPat (ConPatIn _ _) = True +isConPat (ConPatOut _ _ _ _ _ _) = True +isConPat (ListPat _ _) = True +isConPat (PArrPat _ _) = True +isConPat (TuplePat _ _ _) = True +isConPat (DictPat ds ms) = (length ds + length ms) > 1 +isConPat other = False + +isSigPat (SigPatIn _ _) = True +isSigPat (SigPatOut _ _) = True +isSigPat other = False + +patsAreAllLits :: [Pat id] -> Bool +patsAreAllLits pat_list = all isLitPat pat_list + +isLitPat (AsPat _ pat) = isLitPat (unLoc pat) +isLitPat (LitPat _) = True +isLitPat (NPat _ _ _ _) = True +isLitPat (NPlusKPat _ _ _ _) = True +isLitPat other = False + +isBangHsBind :: HsBind id -> Bool +-- In this module because HsPat is above HsBinds in the import graph +isBangHsBind (PatBind { pat_lhs = L _ (BangPat p) }) = True +isBangHsBind bind = False + +isIrrefutableHsPat :: LPat id -> Bool +-- This function returns False if it's in doubt; specifically +-- on a ConPatIn it doesn't know the size of the constructor family +-- But if it returns True, the pattern is definitely irrefutable +isIrrefutableHsPat pat + = go pat + where + go (L _ pat) = go1 pat + + go1 (WildPat _) = True + go1 (VarPat _) = True + go1 (VarPatOut _ _) = True + go1 (LazyPat pat) = True + go1 (BangPat pat) = go pat + go1 (ParPat pat) = go pat + go1 (AsPat _ pat) = go pat + go1 (SigPatIn pat _) = go pat + go1 (SigPatOut pat _) = go pat + go1 (TuplePat pats _ _) = all go pats + go1 (ListPat pats _) = False + go1 (PArrPat pats _) = False -- ? + + go1 (ConPatIn _ _) = False -- Conservative + go1 (ConPatOut (L _ con) _ _ _ details _) + = isProductTyCon (dataConTyCon con) + && all go (hsConArgs details) + + go1 (LitPat _) = False + go1 (NPat _ _ _ _) = False + go1 (NPlusKPat _ _ _ _) = False + + go1 (TypePat _) = panic "isIrrefutableHsPat: type pattern" + go1 (DictPat _ _) = panic "isIrrefutableHsPat: type pattern" +\end{code} + diff --git a/compiler/hsSyn/HsPat.lhs-boot b/compiler/hsSyn/HsPat.lhs-boot new file mode 100644 index 0000000000..d5b685c1f1 --- /dev/null +++ b/compiler/hsSyn/HsPat.lhs-boot @@ -0,0 +1,7 @@ +\begin{code} +module HsPat where +import SrcLoc( Located ) + +data Pat i +type LPat i = Located (Pat i) +\end{code} diff --git a/compiler/hsSyn/HsSyn.lhs b/compiler/hsSyn/HsSyn.lhs new file mode 100644 index 0000000000..a9982a630a --- /dev/null +++ b/compiler/hsSyn/HsSyn.lhs @@ -0,0 +1,98 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section{Haskell abstract syntax definition} + +This module glues together the pieces of the Haskell abstract syntax, +which is declared in the various \tr{Hs*} modules. This module, +therefore, is almost nothing but re-exporting. + +\begin{code} +module HsSyn ( + module HsBinds, + module HsDecls, + module HsExpr, + module HsImpExp, + module HsLit, + module HsPat, + module HsTypes, + module HsUtils, + Fixity, + + HsModule(..), HsExtCore(..) 
+ ) where + +#include "HsVersions.h" + +-- friends: +import HsDecls +import HsBinds +import HsExpr +import HsImpExp +import HsLit +import HsPat +import HsTypes +import HscTypes ( DeprecTxt ) +import BasicTypes ( Fixity ) +import HsUtils + +-- others: +import IfaceSyn ( IfaceBinding ) +import Outputable +import SrcLoc ( Located(..) ) +import Module ( Module ) +\end{code} + +All we actually declare here is the top-level structure for a module. +\begin{code} +data HsModule name + = HsModule + (Maybe (Located Module))-- Nothing => "module X where" is omitted + -- (in which case the next field is Nothing too) + (Maybe [LIE name]) -- Export list; Nothing => export list omitted, so export everything + -- Just [] => export *nothing* + -- Just [...] => as you would expect... + [LImportDecl name] -- We snaffle interesting stuff out of the + -- imported interfaces early on, adding that + -- info to TyDecls/etc; so this list is + -- often empty, downstream. + [LHsDecl name] -- Type, class, value, and interface signature decls + (Maybe DeprecTxt) -- reason/explanation for deprecation of this module + +data HsExtCore name -- Read from Foo.hcr + = HsExtCore + Module + [TyClDecl name] -- Type declarations only; just as in Haskell source, + -- so that we can infer kinds etc + [IfaceBinding] -- And the bindings +\end{code} + +\begin{code} +instance (OutputableBndr name) + => Outputable (HsModule name) where + + ppr (HsModule Nothing _ imports decls _) + = pp_nonnull imports $$ pp_nonnull decls + + ppr (HsModule (Just name) exports imports decls deprec) + = vcat [ + case exports of + Nothing -> pp_header (ptext SLIT("where")) + Just es -> vcat [ + pp_header lparen, + nest 8 (fsep (punctuate comma (map ppr es))), + nest 4 (ptext SLIT(") where")) + ], + pp_nonnull imports, + pp_nonnull decls + ] + where + pp_header rest = case deprec of + Nothing -> pp_modname <+> rest + Just d -> vcat [ pp_modname, ppr d, rest ] + + pp_modname = ptext SLIT("module") <+> ppr name + +pp_nonnull [] = empty +pp_nonnull xs = vcat (map ppr xs) +\end{code} diff --git a/compiler/hsSyn/HsTypes.lhs b/compiler/hsSyn/HsTypes.lhs new file mode 100644 index 0000000000..f1343a39ef --- /dev/null +++ b/compiler/hsSyn/HsTypes.lhs @@ -0,0 +1,370 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[HsTypes]{Abstract syntax: user-defined types} + +\begin{code} +module HsTypes ( + HsType(..), LHsType, + HsTyVarBndr(..), LHsTyVarBndr, + HsExplicitForAll(..), + HsContext, LHsContext, + HsPred(..), LHsPred, + + LBangType, BangType, HsBang(..), + getBangType, getBangStrictness, + + mkExplicitHsForAllTy, mkImplicitHsForAllTy, + hsTyVarName, hsTyVarNames, replaceTyVarName, + hsLTyVarName, hsLTyVarNames, hsLTyVarLocName, hsLTyVarLocNames, + splitHsInstDeclTy, splitHsFunType, + + -- Type place holder + PostTcType, placeHolderType, + + -- Printing + pprParendHsType, pprHsForAll, pprHsContext, ppr_hs_context, pprHsTyVarBndr + ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} HsExpr ( HsSplice, pprSplice ) + +import Type ( Type ) +import Kind ( {- instance Outputable Kind -}, Kind, + pprParendKind, pprKind, isLiftedTypeKind ) +import BasicTypes ( IPName, Boxity, tupleParens ) +import SrcLoc ( Located(..), unLoc, noSrcSpan ) +import StaticFlags ( opt_PprStyle_Debug ) +import Outputable +\end{code} + + +%************************************************************************ +%* * +\subsection{Annotating the syntax} +%* * +%************************************************************************ + +\begin{code} +type 
PostTcType = Type -- Used for slots in the abstract syntax + -- where we want to keep slot for a type + -- to be added by the type checker...but + -- before typechecking it's just bogus + +placeHolderType :: PostTcType -- Used before typechecking +placeHolderType = panic "Evaluated the place holder for a PostTcType" +\end{code} + +%************************************************************************ +%* * +\subsection{Bang annotations} +%* * +%************************************************************************ + +\begin{code} +type LBangType name = Located (BangType name) +type BangType name = HsType name -- Bangs are in the HsType data type + +data HsBang = HsNoBang -- Only used as a return value for getBangStrictness, + -- never appears on a HsBangTy + | HsStrict -- ! + | HsUnbox -- {-# UNPACK #-} ! (GHC extension, meaning "unbox") + +instance Outputable HsBang where + ppr (HsNoBang) = empty + ppr (HsStrict) = char '!' + ppr (HsUnbox) = ptext SLIT("!!") + +getBangType :: LHsType a -> LHsType a +getBangType (L _ (HsBangTy _ ty)) = ty +getBangType ty = ty + +getBangStrictness :: LHsType a -> HsBang +getBangStrictness (L _ (HsBangTy s _)) = s +getBangStrictness _ = HsNoBang +\end{code} + + +%************************************************************************ +%* * +\subsection{Data types} +%* * +%************************************************************************ + +This is the syntax for types as seen in type signatures. + +\begin{code} +type LHsContext name = Located (HsContext name) + +type HsContext name = [LHsPred name] + +type LHsPred name = Located (HsPred name) + +data HsPred name = HsClassP name [LHsType name] + | HsIParam (IPName name) (LHsType name) + +type LHsType name = Located (HsType name) + +data HsType name + = HsForAllTy HsExplicitForAll -- Renamer leaves this flag unchanged, to record the way + -- the user wrote it originally, so that the printer can + -- print it as the user wrote it + [LHsTyVarBndr name] -- With ImplicitForAll, this is the empty list + -- until the renamer fills in the variables + (LHsContext name) + (LHsType name) + + | HsTyVar name -- Type variable or type constructor + + | HsBangTy HsBang (LHsType name) -- Bang-style type annotations + + | HsAppTy (LHsType name) + (LHsType name) + + | HsFunTy (LHsType name) -- function type + (LHsType name) + + | HsListTy (LHsType name) -- Element type + + | HsPArrTy (LHsType name) -- Elem. type of parallel array: [:t:] + + | HsTupleTy Boxity + [LHsType name] -- Element types (length gives arity) + + | HsOpTy (LHsType name) (Located name) (LHsType name) + + | HsParTy (LHsType name) + -- Parenthesis preserved for the precedence re-arrangement in RnTypes + -- It's important that a * (b + c) doesn't get rearranged to (a*b) + c! + -- + -- However, NB that toHsType doesn't add HsParTys (in an effort to keep + -- interface files smaller), so when printing a HsType we may need to + -- add parens. + + | HsNumTy Integer -- Generics only + + | HsPredTy (HsPred name) -- Only used in the type of an instance + -- declaration, eg. Eq [a] -> Eq a + -- ^^^^ + -- HsPredTy + -- Note no need for location info on the + -- enclosed HsPred; the one on the type will do + + | HsKindSig (LHsType name) -- (ty :: kind) + Kind -- A type with a kind signature + + | HsSpliceTy (HsSplice name) + +data HsExplicitForAll = Explicit | Implicit + +----------------------- +-- Combine adjacent for-alls. +-- The following awkward situation can happen otherwise: +-- f :: forall a. 
((Num a) => Int) +-- might generate HsForAll (Just [a]) [] (HsForAll Nothing [Num a] t) +-- Then a isn't discovered as ambiguous, and we abstract the AbsBinds wrt [] +-- but the export list abstracts f wrt [a]. Disaster. +-- +-- A valid type must have one for-all at the top of the type, or of the fn arg types + +mkImplicitHsForAllTy ctxt ty = mkHsForAllTy Implicit [] ctxt ty +mkExplicitHsForAllTy tvs ctxt ty = mkHsForAllTy Explicit tvs ctxt ty + +mkHsForAllTy :: HsExplicitForAll -> [LHsTyVarBndr name] -> LHsContext name -> LHsType name -> HsType name +-- Smart constructor for HsForAllTy +mkHsForAllTy exp tvs (L _ []) ty = mk_forall_ty exp tvs ty +mkHsForAllTy exp tvs ctxt ty = HsForAllTy exp tvs ctxt ty + +-- mk_forall_ty makes a pure for-all type (no context) +mk_forall_ty exp tvs (L _ (HsParTy ty)) = mk_forall_ty exp tvs ty +mk_forall_ty exp1 tvs1 (L _ (HsForAllTy exp2 tvs2 ctxt ty)) = mkHsForAllTy (exp1 `plus` exp2) (tvs1 ++ tvs2) ctxt ty +mk_forall_ty exp tvs ty = HsForAllTy exp tvs (L noSrcSpan []) ty + -- Even if tvs is empty, we still make a HsForAll! + -- In the Implicit case, this signals the place to do implicit quantification + -- In the Explicit case, it prevents implicit quantification + -- (see the sigtype production in Parser.y.pp) + -- so that (forall. ty) isn't implicitly quantified + +Implicit `plus` Implicit = Implicit +exp1 `plus` exp2 = Explicit + +type LHsTyVarBndr name = Located (HsTyVarBndr name) + +data HsTyVarBndr name + = UserTyVar name + | KindedTyVar name Kind + -- *** NOTA BENE *** A "monotype" in a pragma can have + -- for-alls in it, (mostly to do with dictionaries). These + -- must be explicitly Kinded. + +hsTyVarName :: HsTyVarBndr name -> name +hsTyVarName (UserTyVar n) = n +hsTyVarName (KindedTyVar n _) = n + +hsLTyVarName :: LHsTyVarBndr name -> name +hsLTyVarName = hsTyVarName . 
unLoc + +hsTyVarNames :: [HsTyVarBndr name] -> [name] +hsTyVarNames tvs = map hsTyVarName tvs + +hsLTyVarNames :: [LHsTyVarBndr name] -> [name] +hsLTyVarNames = map hsLTyVarName + +hsLTyVarLocName :: LHsTyVarBndr name -> Located name +hsLTyVarLocName = fmap hsTyVarName + +hsLTyVarLocNames :: [LHsTyVarBndr name] -> [Located name] +hsLTyVarLocNames = map hsLTyVarLocName + +replaceTyVarName :: HsTyVarBndr name1 -> name2 -> HsTyVarBndr name2 +replaceTyVarName (UserTyVar n) n' = UserTyVar n' +replaceTyVarName (KindedTyVar n k) n' = KindedTyVar n' k +\end{code} + + +\begin{code} +splitHsInstDeclTy + :: OutputableBndr name + => HsType name + -> ([LHsTyVarBndr name], HsContext name, name, [LHsType name]) + -- Split up an instance decl type, returning the pieces + +splitHsInstDeclTy inst_ty + = case inst_ty of + HsParTy (L _ ty) -> splitHsInstDeclTy ty + HsForAllTy _ tvs cxt (L _ ty) -> split_tau tvs (unLoc cxt) ty + other -> split_tau [] [] other + -- The type vars should have been computed by now, even if they were implicit + where + split_tau tvs cxt (HsPredTy (HsClassP cls tys)) = (tvs, cxt, cls, tys) + split_tau tvs cxt (HsParTy (L _ ty)) = split_tau tvs cxt ty + +-- Splits HsType into the (init, last) parts +-- Breaks up any parens in the result type: +-- splitHsFunType (a -> (b -> c)) = ([a,b], c) +splitHsFunType :: LHsType name -> ([LHsType name], LHsType name) +splitHsFunType (L l (HsFunTy x y)) = (x:args, res) + where + (args, res) = splitHsFunType y +splitHsFunType (L _ (HsParTy ty)) = splitHsFunType ty +splitHsFunType other = ([], other) +\end{code} + + +%************************************************************************ +%* * +\subsection{Pretty printing} +%* * +%************************************************************************ + +NB: these types get printed into interface files, so + don't change the printing format lightly + +\begin{code} +instance (OutputableBndr name) => Outputable (HsType name) where + ppr ty = pprHsType ty + +instance (Outputable name) => Outputable (HsTyVarBndr name) where + ppr (UserTyVar name) = ppr name + ppr (KindedTyVar name kind) = pprHsTyVarBndr name kind + +instance OutputableBndr name => Outputable (HsPred name) where + ppr (HsClassP clas tys) = ppr clas <+> hsep (map (pprParendHsType.unLoc) tys) + ppr (HsIParam n ty) = hsep [ppr n, dcolon, ppr ty] + +pprHsTyVarBndr :: Outputable name => name -> Kind -> SDoc +pprHsTyVarBndr name kind | isLiftedTypeKind kind = ppr name + | otherwise = hsep [ppr name, dcolon, pprParendKind kind] + +pprHsForAll exp tvs cxt + | show_forall = forall_part <+> pprHsContext (unLoc cxt) + | otherwise = pprHsContext (unLoc cxt) + where + show_forall = opt_PprStyle_Debug + || (not (null tvs) && is_explicit) + is_explicit = case exp of {Explicit -> True; Implicit -> False} + forall_part = ptext SLIT("forall") <+> interppSP tvs <> dot + +pprHsContext :: (OutputableBndr name) => HsContext name -> SDoc +pprHsContext [] = empty +pprHsContext cxt = ppr_hs_context cxt <+> ptext SLIT("=>") + +ppr_hs_context [] = empty +ppr_hs_context cxt = parens (interpp'SP cxt) +\end{code} + +\begin{code} +pREC_TOP = (0 :: Int) -- type in ParseIface.y +pREC_FUN = (1 :: Int) -- btype in ParseIface.y + -- Used for LH arg of (->) +pREC_OP = (2 :: Int) -- Used for arg of any infix operator + -- (we don't keep their fixities around) +pREC_CON = (3 :: Int) -- Used for arg of type applicn: + -- always parenthesise unless atomic + +maybeParen :: Int -- Precedence of context + -> Int -- Precedence of top-level operator + -> SDoc -> SDoc -- Wrap in 
parens if (ctxt >= op) +maybeParen ctxt_prec op_prec p | ctxt_prec >= op_prec = parens p + | otherwise = p + +-- printing works more-or-less as for Types + +pprHsType, pprParendHsType :: (OutputableBndr name) => HsType name -> SDoc + +pprHsType ty = getPprStyle $ \sty -> ppr_mono_ty pREC_TOP (prepare sty ty) +pprParendHsType ty = ppr_mono_ty pREC_CON ty + +-- Before printing a type +-- (a) Remove outermost HsParTy parens +-- (b) Drop top-level for-all type variables in user style +-- since they are implicit in Haskell +prepare sty (HsParTy ty) = prepare sty (unLoc ty) +prepare sty ty = ty + +ppr_mono_lty ctxt_prec ty = ppr_mono_ty ctxt_prec (unLoc ty) + +ppr_mono_ty ctxt_prec (HsForAllTy exp tvs ctxt ty) + = maybeParen ctxt_prec pREC_FUN $ + sep [pprHsForAll exp tvs ctxt, ppr_mono_lty pREC_TOP ty] + +-- gaw 2004 +ppr_mono_ty ctxt_prec (HsBangTy b ty) = ppr b <> ppr ty +ppr_mono_ty ctxt_prec (HsTyVar name) = ppr name +ppr_mono_ty ctxt_prec (HsFunTy ty1 ty2) = ppr_fun_ty ctxt_prec ty1 ty2 +ppr_mono_ty ctxt_prec (HsTupleTy con tys) = tupleParens con (interpp'SP tys) +ppr_mono_ty ctxt_prec (HsKindSig ty kind) = parens (ppr_mono_lty pREC_TOP ty <+> dcolon <+> pprKind kind) +ppr_mono_ty ctxt_prec (HsListTy ty) = brackets (ppr_mono_lty pREC_TOP ty) +ppr_mono_ty ctxt_prec (HsPArrTy ty) = pabrackets (ppr_mono_lty pREC_TOP ty) +ppr_mono_ty ctxt_prec (HsPredTy pred) = braces (ppr pred) +ppr_mono_ty ctxt_prec (HsNumTy n) = integer n -- generics only +ppr_mono_ty ctxt_prec (HsSpliceTy s) = pprSplice s + +ppr_mono_ty ctxt_prec (HsAppTy fun_ty arg_ty) + = maybeParen ctxt_prec pREC_CON $ + hsep [ppr_mono_lty pREC_FUN fun_ty, ppr_mono_lty pREC_CON arg_ty] + +ppr_mono_ty ctxt_prec (HsOpTy ty1 op ty2) + = maybeParen ctxt_prec pREC_OP $ + ppr_mono_lty pREC_OP ty1 <+> ppr op <+> ppr_mono_lty pREC_OP ty2 + +ppr_mono_ty ctxt_prec (HsParTy ty) + = parens (ppr_mono_lty pREC_TOP ty) + -- Put the parens in where the user did + -- But we still use the precedence stuff to add parens because + -- toHsType doesn't put in any HsParTys, so we may still need them + +-------------------------- +ppr_fun_ty ctxt_prec ty1 ty2 + = let p1 = ppr_mono_lty pREC_FUN ty1 + p2 = ppr_mono_lty pREC_TOP ty2 + in + maybeParen ctxt_prec pREC_FUN $ + sep [p1, ptext SLIT("->") <+> p2] + +-------------------------- +pabrackets p = ptext SLIT("[:") <> p <> ptext SLIT(":]") +\end{code} + + diff --git a/compiler/hsSyn/HsUtils.lhs b/compiler/hsSyn/HsUtils.lhs new file mode 100644 index 0000000000..d9c45e6529 --- /dev/null +++ b/compiler/hsSyn/HsUtils.lhs @@ -0,0 +1,423 @@ +% +% (c) The University of Glasgow, 1992-2003 +% + +Here we collect a variety of helper functions that construct or +analyse HsSyn. All these functions deal with generic HsSyn; functions +which deal with the intantiated versions are located elsewhere: + + Parameterised by Module + ---------------- ------------- + RdrName parser/RdrHsSyn + Name rename/RnHsSyn + Id typecheck/TcHsSyn + +\begin{code} +module HsUtils where + +#include "HsVersions.h" + +import HsBinds +import HsExpr +import HsPat +import HsTypes +import HsLit + +import RdrName ( RdrName, getRdrName, mkRdrUnqual ) +import Var ( Id ) +import Type ( Type ) +import DataCon ( DataCon, dataConWrapId, dataConSourceArity ) +import OccName ( mkVarOccFS ) +import Name ( Name ) +import BasicTypes ( RecFlag(..) 
) +import SrcLoc +import FastString ( mkFastString ) +import Outputable +import Util ( nOfThem ) +import Bag +\end{code} + + +%************************************************************************ +%* * + Some useful helpers for constructing syntax +%* * +%************************************************************************ + +These functions attempt to construct a not-completely-useless SrcSpan +from their components, compared with the nl* functions below which +just attach noSrcSpan to everything. + +\begin{code} +mkHsPar :: LHsExpr id -> LHsExpr id +mkHsPar e = L (getLoc e) (HsPar e) + +-- gaw 2004 +mkSimpleMatch :: [LPat id] -> LHsExpr id -> LMatch id +mkSimpleMatch pats rhs + = L loc $ + Match pats Nothing (GRHSs (unguardedRHS rhs) emptyLocalBinds) + where + loc = case pats of + [] -> getLoc rhs + (pat:_) -> combineSrcSpans (getLoc pat) (getLoc rhs) + +unguardedRHS :: LHsExpr id -> [LGRHS id] +unguardedRHS rhs@(L loc _) = [L loc (GRHS [] rhs)] + +mkHsAppTy :: LHsType name -> LHsType name -> LHsType name +mkHsAppTy t1 t2 = addCLoc t1 t2 (HsAppTy t1 t2) + +mkHsApp :: LHsExpr name -> LHsExpr name -> LHsExpr name +mkHsApp e1 e2 = addCLoc e1 e2 (HsApp e1 e2) + +mkHsTyApp :: LHsExpr name -> [Type] -> LHsExpr name +mkHsTyApp expr [] = expr +mkHsTyApp expr tys = L (getLoc expr) (TyApp expr tys) + +mkHsDictApp :: LHsExpr name -> [name] -> LHsExpr name +mkHsDictApp expr [] = expr +mkHsDictApp expr dict_vars = L (getLoc expr) (DictApp expr dict_vars) + +mkHsCoerce :: ExprCoFn -> HsExpr id -> HsExpr id +mkHsCoerce co_fn e | isIdCoercion co_fn = e + | otherwise = HsCoerce co_fn e + +mkHsLam :: [LPat id] -> LHsExpr id -> LHsExpr id +mkHsLam pats body = mkHsPar (L (getLoc body) (HsLam matches)) + where + matches = mkMatchGroup [mkSimpleMatch pats body] + +mkMatchGroup :: [LMatch id] -> MatchGroup id +mkMatchGroup matches = MatchGroup matches placeHolderType + +mkHsTyLam [] expr = expr +mkHsTyLam tyvars expr = L (getLoc expr) (TyLam tyvars expr) + +mkHsDictLam [] expr = expr +mkHsDictLam dicts expr = L (getLoc expr) (DictLam dicts expr) + +mkHsDictLet :: LHsBinds Id -> LHsExpr Id -> LHsExpr Id +-- Used for the dictionary bindings gotten from TcSimplify +-- We make them recursive to be on the safe side +mkHsDictLet binds expr + | isEmptyLHsBinds binds = expr + | otherwise = L (getLoc expr) (HsLet (HsValBinds val_binds) expr) + where + val_binds = ValBindsOut [(Recursive, binds)] [] + +mkHsConApp :: DataCon -> [Type] -> [HsExpr Id] -> LHsExpr Id +-- Used for constructing dictinoary terms etc, so no locations +mkHsConApp data_con tys args + = foldl mk_app (noLoc (HsVar (dataConWrapId data_con)) `mkHsTyApp` tys) args + where + mk_app f a = noLoc (HsApp f (noLoc a)) + +mkSimpleHsAlt :: LPat id -> LHsExpr id -> LMatch id +-- A simple lambda with a single pattern, no binds, no guards; pre-typechecking +mkSimpleHsAlt pat expr + = mkSimpleMatch [pat] expr + +------------------------------- +-- These are the bits of syntax that contain rebindable names +-- See RnEnv.lookupSyntaxName + +mkHsIntegral i = HsIntegral i noSyntaxExpr +mkHsFractional f = HsFractional f noSyntaxExpr +mkHsDo ctxt stmts body = HsDo ctxt stmts body placeHolderType + +mkNPat lit neg = NPat lit neg noSyntaxExpr placeHolderType +mkNPlusKPat id lit = NPlusKPat id lit noSyntaxExpr noSyntaxExpr + +mkExprStmt expr = ExprStmt expr noSyntaxExpr placeHolderType +mkBindStmt pat expr = BindStmt pat expr noSyntaxExpr noSyntaxExpr +mkRecStmt stmts = RecStmt stmts [] [] [] emptyLHsBinds + +------------------------------- +--- A useful function 
for building @OpApps@. The operator is always a +-- variable, and we don't know the fixity yet. +mkHsOpApp e1 op e2 = OpApp e1 (noLoc (HsVar op)) (error "mkOpApp:fixity") e2 + +mkHsSplice e = HsSplice unqualSplice e + +unqualSplice = mkRdrUnqual (mkVarOccFS FSLIT("splice")) + -- A name (uniquified later) to + -- identify the splice + +mkHsString s = HsString (mkFastString s) + +------------- +userHsTyVarBndrs :: [Located name] -> [Located (HsTyVarBndr name)] +userHsTyVarBndrs bndrs = [ L loc (UserTyVar v) | L loc v <- bndrs ] +\end{code} + + +%************************************************************************ +%* * + Constructing syntax with no location info +%* * +%************************************************************************ + +\begin{code} +nlHsVar :: id -> LHsExpr id +nlHsVar n = noLoc (HsVar n) + +nlHsLit :: HsLit -> LHsExpr id +nlHsLit n = noLoc (HsLit n) + +nlVarPat :: id -> LPat id +nlVarPat n = noLoc (VarPat n) + +nlLitPat :: HsLit -> LPat id +nlLitPat l = noLoc (LitPat l) + +nlHsApp :: LHsExpr id -> LHsExpr id -> LHsExpr id +nlHsApp f x = noLoc (HsApp f x) + +nlHsIntLit n = noLoc (HsLit (HsInt n)) + +nlHsApps :: id -> [LHsExpr id] -> LHsExpr id +nlHsApps f xs = foldl nlHsApp (nlHsVar f) xs + +nlHsVarApps :: id -> [id] -> LHsExpr id +nlHsVarApps f xs = noLoc (foldl mk (HsVar f) (map HsVar xs)) + where + mk f a = HsApp (noLoc f) (noLoc a) + +nlConVarPat :: id -> [id] -> LPat id +nlConVarPat con vars = nlConPat con (map nlVarPat vars) + +nlInfixConPat :: id -> LPat id -> LPat id -> LPat id +nlInfixConPat con l r = noLoc (ConPatIn (noLoc con) (InfixCon l r)) + +nlConPat :: id -> [LPat id] -> LPat id +nlConPat con pats = noLoc (ConPatIn (noLoc con) (PrefixCon pats)) + +nlNullaryConPat :: id -> LPat id +nlNullaryConPat con = noLoc (ConPatIn (noLoc con) (PrefixCon [])) + +nlWildConPat :: DataCon -> LPat RdrName +nlWildConPat con = noLoc (ConPatIn (noLoc (getRdrName con)) + (PrefixCon (nOfThem (dataConSourceArity con) nlWildPat))) + +nlTuplePat pats box = noLoc (TuplePat pats box placeHolderType) +nlWildPat = noLoc (WildPat placeHolderType) -- Pre-typechecking + +nlHsDo :: HsStmtContext Name -> [LStmt id] -> LHsExpr id -> LHsExpr id +nlHsDo ctxt stmts body = noLoc (mkHsDo ctxt stmts body) + +nlHsOpApp e1 op e2 = noLoc (mkHsOpApp e1 op e2) + +nlHsLam match = noLoc (HsLam (mkMatchGroup [match])) +nlHsPar e = noLoc (HsPar e) +nlHsIf cond true false = noLoc (HsIf cond true false) +nlHsCase expr matches = noLoc (HsCase expr (mkMatchGroup matches)) +nlTuple exprs box = noLoc (ExplicitTuple exprs box) +nlList exprs = noLoc (ExplicitList placeHolderType exprs) + +nlHsAppTy f t = noLoc (HsAppTy f t) +nlHsTyVar x = noLoc (HsTyVar x) +nlHsFunTy a b = noLoc (HsFunTy a b) +\end{code} + + + +%************************************************************************ +%* * + Bindings; with a location at the top +%* * +%************************************************************************ + +\begin{code} +mkFunBind :: Located id -> [LMatch id] -> HsBind id +-- Not infix, with place holders for coercion and free vars +mkFunBind fn ms = FunBind { fun_id = fn, fun_infix = False, fun_matches = mkMatchGroup ms, + fun_co_fn = idCoercion, bind_fvs = placeHolderNames } + + +mkVarBind :: SrcSpan -> RdrName -> LHsExpr RdrName -> LHsBind RdrName +mkVarBind loc var rhs = mk_easy_FunBind loc var [] rhs + +------------ +mk_easy_FunBind :: SrcSpan -> RdrName -> [LPat RdrName] + -> LHsExpr RdrName -> LHsBind RdrName + +mk_easy_FunBind loc fun pats expr + = L loc $ mkFunBind (L loc fun) [mkMatch pats 
expr emptyLocalBinds] + +------------ +mk_FunBind :: SrcSpan -> RdrName + -> [([LPat RdrName], LHsExpr RdrName)] + -> LHsBind RdrName + +mk_FunBind loc fun [] = panic "TcGenDeriv:mk_FunBind" +mk_FunBind loc fun pats_and_exprs + = L loc $ mkFunBind (L loc fun) matches + where + matches = [mkMatch p e emptyLocalBinds | (p,e) <-pats_and_exprs] + +------------ +mkMatch :: [LPat id] -> LHsExpr id -> HsLocalBinds id -> LMatch id +mkMatch pats expr binds + = noLoc (Match (map paren pats) Nothing + (GRHSs (unguardedRHS expr) binds)) + where + paren p = case p of + L _ (VarPat _) -> p + L l _ -> L l (ParPat p) +\end{code} + + +%************************************************************************ +%* * + Collecting binders from HsBindGroups and HsBinds +%* * +%************************************************************************ + +Get all the binders in some HsBindGroups, IN THE ORDER OF APPEARANCE. eg. + +... +where + (x, y) = ... + f i j = ... + [a, b] = ... + +it should return [x, y, f, a, b] (remember, order important). + +\begin{code} +collectLocalBinders :: HsLocalBinds name -> [Located name] +collectLocalBinders (HsValBinds val_binds) = collectHsValBinders val_binds +collectLocalBinders (HsIPBinds _) = [] +collectLocalBinders EmptyLocalBinds = [] + +collectHsValBinders :: HsValBinds name -> [Located name] +collectHsValBinders (ValBindsIn binds sigs) = collectHsBindLocatedBinders binds +collectHsValBinders (ValBindsOut binds sigs) = foldr collect_one [] binds + where + collect_one (_,binds) acc = foldrBag (collectAcc . unLoc) acc binds + +collectAcc :: HsBind name -> [Located name] -> [Located name] +collectAcc (PatBind { pat_lhs = p }) acc = collectLocatedPatBinders p ++ acc +collectAcc (FunBind { fun_id = f }) acc = f : acc +collectAcc (VarBind { var_id = f }) acc = noLoc f : acc +collectAcc (AbsBinds { abs_exports = dbinds, abs_binds = binds }) acc + = [noLoc dp | (_,dp,_,_) <- dbinds] ++ acc + -- ++ foldr collectAcc acc binds + -- I don't think we want the binders from the nested binds + -- The only time we collect binders from a typechecked + -- binding (hence see AbsBinds) is in zonking in TcHsSyn + +collectHsBindBinders :: LHsBinds name -> [name] +collectHsBindBinders binds = map unLoc (collectHsBindLocatedBinders binds) + +collectHsBindLocatedBinders :: LHsBinds name -> [Located name] +collectHsBindLocatedBinders binds = foldrBag (collectAcc . unLoc) [] binds +\end{code} + + +%************************************************************************ +%* * + Getting binders from statements +%* * +%************************************************************************ + +\begin{code} +collectLStmtsBinders :: [LStmt id] -> [Located id] +collectLStmtsBinders = concatMap collectLStmtBinders + +collectStmtsBinders :: [Stmt id] -> [Located id] +collectStmtsBinders = concatMap collectStmtBinders + +collectLStmtBinders :: LStmt id -> [Located id] +collectLStmtBinders = collectStmtBinders . unLoc + +collectStmtBinders :: Stmt id -> [Located id] + -- Id Binders for a Stmt... [but what about pattern-sig type vars]? 
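+	-- For example (an illustrative sketch): given the statements of
+	--	do { x <- getChar; let y = [x]; return y }
+	-- the BindStmt contributes the binder x and the LetStmt contributes y,
+	-- so collectLStmtsBinders returns [x, y], in order of appearance.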
+collectStmtBinders (BindStmt pat _ _ _) = collectLocatedPatBinders pat +collectStmtBinders (LetStmt binds) = collectLocalBinders binds +collectStmtBinders (ExprStmt _ _ _) = [] +collectStmtBinders (RecStmt ss _ _ _ _) = collectLStmtsBinders ss +collectStmtBinders other = panic "collectStmtBinders" +\end{code} + + +%************************************************************************ +%* * +%* Gathering stuff out of patterns +%* * +%************************************************************************ + +This function @collectPatBinders@ works with the ``collectBinders'' +functions for @HsBinds@, etc. The order in which the binders are +collected is important; see @HsBinds.lhs@. + +It collects the bounds *value* variables in renamed patterns; type variables +are *not* collected. + +\begin{code} +collectPatBinders :: LPat a -> [a] +collectPatBinders pat = map unLoc (collectLocatedPatBinders pat) + +collectLocatedPatBinders :: LPat a -> [Located a] +collectLocatedPatBinders pat = collectl pat [] + +collectPatsBinders :: [LPat a] -> [a] +collectPatsBinders pats = map unLoc (collectLocatedPatsBinders pats) + +collectLocatedPatsBinders :: [LPat a] -> [Located a] +collectLocatedPatsBinders pats = foldr collectl [] pats + +--------------------- +collectl (L l pat) bndrs + = go pat + where + go (VarPat var) = L l var : bndrs + go (VarPatOut var bs) = L l var : collectHsBindLocatedBinders bs + ++ bndrs + go (WildPat _) = bndrs + go (LazyPat pat) = collectl pat bndrs + go (BangPat pat) = collectl pat bndrs + go (AsPat a pat) = a : collectl pat bndrs + go (ParPat pat) = collectl pat bndrs + + go (ListPat pats _) = foldr collectl bndrs pats + go (PArrPat pats _) = foldr collectl bndrs pats + go (TuplePat pats _ _) = foldr collectl bndrs pats + + go (ConPatIn c ps) = foldr collectl bndrs (hsConArgs ps) + go (ConPatOut c _ ds bs ps _) = map noLoc ds + ++ collectHsBindLocatedBinders bs + ++ foldr collectl bndrs (hsConArgs ps) + go (LitPat _) = bndrs + go (NPat _ _ _ _) = bndrs + go (NPlusKPat n _ _ _) = n : bndrs + + go (SigPatIn pat _) = collectl pat bndrs + go (SigPatOut pat _) = collectl pat bndrs + go (TypePat ty) = bndrs + go (DictPat ids1 ids2) = map noLoc ids1 ++ map noLoc ids2 + ++ bndrs +\end{code} + +\begin{code} +collectSigTysFromPats :: [InPat name] -> [LHsType name] +collectSigTysFromPats pats = foldr collect_lpat [] pats + +collectSigTysFromPat :: InPat name -> [LHsType name] +collectSigTysFromPat pat = collect_lpat pat [] + +collect_lpat pat acc = collect_pat (unLoc pat) acc + +collect_pat (SigPatIn pat ty) acc = collect_lpat pat (ty:acc) +collect_pat (TypePat ty) acc = ty:acc + +collect_pat (LazyPat pat) acc = collect_lpat pat acc +collect_pat (BangPat pat) acc = collect_lpat pat acc +collect_pat (AsPat a pat) acc = collect_lpat pat acc +collect_pat (ParPat pat) acc = collect_lpat pat acc +collect_pat (ListPat pats _) acc = foldr collect_lpat acc pats +collect_pat (PArrPat pats _) acc = foldr collect_lpat acc pats +collect_pat (TuplePat pats _ _) acc = foldr collect_lpat acc pats +collect_pat (ConPatIn c ps) acc = foldr collect_lpat acc (hsConArgs ps) +collect_pat other acc = acc -- Literals, vars, wildcard +\end{code} diff --git a/compiler/iface/BinIface.hs b/compiler/iface/BinIface.hs new file mode 100644 index 0000000000..6d02fe00c7 --- /dev/null +++ b/compiler/iface/BinIface.hs @@ -0,0 +1,1056 @@ +{-% DrIFT (Automatic class derivations for Haskell) v1.1 %-} +{-% DrIFT (Automatic class derivations for Haskell) v1.1 %-} +-- +-- (c) The University of Glasgow 2002 +-- +-- Binary 
interface file support. + +module BinIface ( writeBinIface, readBinIface, v_IgnoreHiWay ) where + +#include "HsVersions.h" + +import HscTypes +import BasicTypes +import NewDemand +import IfaceSyn +import VarEnv +import InstEnv ( OverlapFlag(..) ) +import Packages ( PackageIdH(..) ) +import Class ( DefMeth(..) ) +import CostCentre +import StaticFlags ( opt_HiVersion, v_Build_tag ) +import Kind ( Kind(..) ) +import Panic +import Binary +import Util +import Config ( cGhcUnregisterised ) + +import DATA_IOREF +import EXCEPTION ( throwDyn ) +import Monad ( when ) +import Outputable + +#include "HsVersions.h" + +-- --------------------------------------------------------------------------- +writeBinIface :: FilePath -> ModIface -> IO () +writeBinIface hi_path mod_iface + = putBinFileWithDict hi_path mod_iface + +readBinIface :: FilePath -> IO ModIface +readBinIface hi_path = getBinFileWithDict hi_path + + +-- %********************************************************* +-- %* * +-- All the Binary instances +-- %* * +-- %********************************************************* + +-- BasicTypes +{-! for IPName derive: Binary !-} +{-! for Fixity derive: Binary !-} +{-! for FixityDirection derive: Binary !-} +{-! for Boxity derive: Binary !-} +{-! for StrictnessMark derive: Binary !-} +{-! for Activation derive: Binary !-} + +-- NewDemand +{-! for Demand derive: Binary !-} +{-! for Demands derive: Binary !-} +{-! for DmdResult derive: Binary !-} +{-! for StrictSig derive: Binary !-} + +-- Class +{-! for DefMeth derive: Binary !-} + +-- HsTypes +{-! for HsPred derive: Binary !-} +{-! for HsType derive: Binary !-} +{-! for TupCon derive: Binary !-} +{-! for HsTyVarBndr derive: Binary !-} + +-- HsCore +{-! for UfExpr derive: Binary !-} +{-! for UfConAlt derive: Binary !-} +{-! for UfBinding derive: Binary !-} +{-! for UfBinder derive: Binary !-} +{-! for HsIdInfo derive: Binary !-} +{-! for UfNote derive: Binary !-} + +-- HsDecls +{-! for ConDetails derive: Binary !-} +{-! for BangType derive: Binary !-} + +-- CostCentre +{-! for IsCafCC derive: Binary !-} +{-! for IsDupdCC derive: Binary !-} +{-! for CostCentre derive: Binary !-} + + + +-- --------------------------------------------------------------------------- +-- Reading a binary interface into ParsedIface + +instance Binary ModIface where + put_ bh (ModIface { + mi_module = mod, + mi_boot = is_boot, + mi_mod_vers = mod_vers, + mi_package = _, -- we ignore the package on output + mi_orphan = orphan, + mi_deps = deps, + mi_usages = usages, + mi_exports = exports, + mi_exp_vers = exp_vers, + mi_fixities = fixities, + mi_deprecs = deprecs, + mi_decls = decls, + mi_insts = insts, + mi_rules = rules, + mi_rule_vers = rule_vers }) = do + put_ bh (show opt_HiVersion) + way_descr <- getWayDescr + put bh way_descr + put_ bh mod + put_ bh is_boot + put_ bh mod_vers + put_ bh orphan + lazyPut bh deps + lazyPut bh usages + put_ bh exports + put_ bh exp_vers + put_ bh fixities + lazyPut bh deprecs + put_ bh decls + put_ bh insts + lazyPut bh rules + put_ bh rule_vers + + get bh = do + check_ver <- get bh + let our_ver = show opt_HiVersion + when (check_ver /= our_ver) $ + -- use userError because this will be caught by readIface + -- which will emit an error msg containing the iface module name. 
+ throwDyn (ProgramError ( + "mismatched interface file versions: expected " + ++ our_ver ++ ", found " ++ check_ver)) + + check_way <- get bh + ignore_way <- readIORef v_IgnoreHiWay + way_descr <- getWayDescr + when (not ignore_way && check_way /= way_descr) $ + -- use userError because this will be caught by readIface + -- which will emit an error msg containing the iface module name. + throwDyn (ProgramError ( + "mismatched interface file ways: expected " + ++ way_descr ++ ", found " ++ check_way)) + + mod_name <- get bh + is_boot <- get bh + mod_vers <- get bh + orphan <- get bh + deps <- lazyGet bh + usages <- {-# SCC "bin_usages" #-} lazyGet bh + exports <- {-# SCC "bin_exports" #-} get bh + exp_vers <- get bh + fixities <- {-# SCC "bin_fixities" #-} get bh + deprecs <- {-# SCC "bin_deprecs" #-} lazyGet bh + decls <- {-# SCC "bin_tycldecls" #-} get bh + insts <- {-# SCC "bin_insts" #-} get bh + rules <- {-# SCC "bin_rules" #-} lazyGet bh + rule_vers <- get bh + return (ModIface { + mi_package = HomePackage, -- to be filled in properly later + mi_module = mod_name, + mi_boot = is_boot, + mi_mod_vers = mod_vers, + mi_orphan = orphan, + mi_deps = deps, + mi_usages = usages, + mi_exports = exports, + mi_exp_vers = exp_vers, + mi_fixities = fixities, + mi_deprecs = deprecs, + mi_decls = decls, + mi_globals = Nothing, + mi_insts = insts, + mi_rules = rules, + mi_rule_vers = rule_vers, + -- And build the cached values + mi_dep_fn = mkIfaceDepCache deprecs, + mi_fix_fn = mkIfaceFixCache fixities, + mi_ver_fn = mkIfaceVerCache decls }) + +GLOBAL_VAR(v_IgnoreHiWay, False, Bool) + +getWayDescr :: IO String +getWayDescr = do + tag <- readIORef v_Build_tag + if cGhcUnregisterised == "YES" then return ('u':tag) else return tag + -- if this is an unregisterised build, make sure our interfaces + -- can't be used by a registerised build. 
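+
+-- Most of the hand-written instances below follow one convention: for a
+-- multi-constructor type, 'put_' writes a one-byte tag identifying the
+-- constructor followed by that constructor's fields, and 'get' reads the
+-- tag back and dispatches on it (single-constructor types just write
+-- their fields).  A minimal sketch of that pattern, using a hypothetical
+-- example type that is not part of the interface format; only the shape
+-- of the instance matters:
+
+data ExampleDir = ExNorth | ExEast Int
+
+instance Binary ExampleDir where
+    put_ bh ExNorth    = putByte bh 0
+    put_ bh (ExEast n) = do { putByte bh 1; put_ bh n }
+    get bh = do
+        h <- getByte bh
+        case h of
+          0 -> return ExNorth
+          _ -> do { n <- get bh; return (ExEast n) }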
+ +------------------------------------------------------------------------- +-- Types from: HscTypes +------------------------------------------------------------------------- + +instance Binary Dependencies where + put_ bh deps = do put_ bh (dep_mods deps) + put_ bh (dep_pkgs deps) + put_ bh (dep_orphs deps) + + get bh = do ms <- get bh + ps <- get bh + os <- get bh + return (Deps { dep_mods = ms, dep_pkgs = ps, dep_orphs = os }) + +instance (Binary name) => Binary (GenAvailInfo name) where + put_ bh (Avail aa) = do + putByte bh 0 + put_ bh aa + put_ bh (AvailTC ab ac) = do + putByte bh 1 + put_ bh ab + put_ bh ac + get bh = do + h <- getByte bh + case h of + 0 -> do aa <- get bh + return (Avail aa) + _ -> do ab <- get bh + ac <- get bh + return (AvailTC ab ac) + +instance Binary Usage where + put_ bh usg = do + put_ bh (usg_name usg) + put_ bh (usg_mod usg) + put_ bh (usg_exports usg) + put_ bh (usg_entities usg) + put_ bh (usg_rules usg) + + get bh = do + nm <- get bh + mod <- get bh + exps <- get bh + ents <- get bh + rules <- get bh + return (Usage { usg_name = nm, usg_mod = mod, + usg_exports = exps, usg_entities = ents, + usg_rules = rules }) + +instance Binary a => Binary (Deprecs a) where + put_ bh NoDeprecs = putByte bh 0 + put_ bh (DeprecAll t) = do + putByte bh 1 + put_ bh t + put_ bh (DeprecSome ts) = do + putByte bh 2 + put_ bh ts + + get bh = do + h <- getByte bh + case h of + 0 -> return NoDeprecs + 1 -> do aa <- get bh + return (DeprecAll aa) + _ -> do aa <- get bh + return (DeprecSome aa) + +------------------------------------------------------------------------- +-- Types from: BasicTypes +------------------------------------------------------------------------- + +instance Binary Activation where + put_ bh NeverActive = do + putByte bh 0 + put_ bh AlwaysActive = do + putByte bh 1 + put_ bh (ActiveBefore aa) = do + putByte bh 2 + put_ bh aa + put_ bh (ActiveAfter ab) = do + putByte bh 3 + put_ bh ab + get bh = do + h <- getByte bh + case h of + 0 -> do return NeverActive + 1 -> do return AlwaysActive + 2 -> do aa <- get bh + return (ActiveBefore aa) + _ -> do ab <- get bh + return (ActiveAfter ab) + +instance Binary StrictnessMark where + put_ bh MarkedStrict = do + putByte bh 0 + put_ bh MarkedUnboxed = do + putByte bh 1 + put_ bh NotMarkedStrict = do + putByte bh 2 + get bh = do + h <- getByte bh + case h of + 0 -> do return MarkedStrict + 1 -> do return MarkedUnboxed + _ -> do return NotMarkedStrict + +instance Binary Boxity where + put_ bh Boxed = do + putByte bh 0 + put_ bh Unboxed = do + putByte bh 1 + get bh = do + h <- getByte bh + case h of + 0 -> do return Boxed + _ -> do return Unboxed + +instance Binary TupCon where + put_ bh (TupCon ab ac) = do + put_ bh ab + put_ bh ac + get bh = do + ab <- get bh + ac <- get bh + return (TupCon ab ac) + +instance Binary RecFlag where + put_ bh Recursive = do + putByte bh 0 + put_ bh NonRecursive = do + putByte bh 1 + get bh = do + h <- getByte bh + case h of + 0 -> do return Recursive + _ -> do return NonRecursive + +instance Binary DefMeth where + put_ bh NoDefMeth = putByte bh 0 + put_ bh DefMeth = putByte bh 1 + put_ bh GenDefMeth = putByte bh 2 + get bh = do + h <- getByte bh + case h of + 0 -> return NoDefMeth + 1 -> return DefMeth + _ -> return GenDefMeth + +instance Binary FixityDirection where + put_ bh InfixL = do + putByte bh 0 + put_ bh InfixR = do + putByte bh 1 + put_ bh InfixN = do + putByte bh 2 + get bh = do + h <- getByte bh + case h of + 0 -> do return InfixL + 1 -> do return InfixR + _ -> do return 
InfixN + +instance Binary Fixity where + put_ bh (Fixity aa ab) = do + put_ bh aa + put_ bh ab + get bh = do + aa <- get bh + ab <- get bh + return (Fixity aa ab) + +instance (Binary name) => Binary (IPName name) where + put_ bh (Dupable aa) = do + putByte bh 0 + put_ bh aa + put_ bh (Linear ab) = do + putByte bh 1 + put_ bh ab + get bh = do + h <- getByte bh + case h of + 0 -> do aa <- get bh + return (Dupable aa) + _ -> do ab <- get bh + return (Linear ab) + +------------------------------------------------------------------------- +-- Types from: Demand +------------------------------------------------------------------------- + +instance Binary DmdType where + -- Ignore DmdEnv when spitting out the DmdType + put bh (DmdType _ ds dr) = do p <- put bh ds; put bh dr; return (castBin p) + get bh = do ds <- get bh; dr <- get bh; return (DmdType emptyVarEnv ds dr) + +instance Binary Demand where + put_ bh Top = do + putByte bh 0 + put_ bh Abs = do + putByte bh 1 + put_ bh (Call aa) = do + putByte bh 2 + put_ bh aa + put_ bh (Eval ab) = do + putByte bh 3 + put_ bh ab + put_ bh (Defer ac) = do + putByte bh 4 + put_ bh ac + put_ bh (Box ad) = do + putByte bh 5 + put_ bh ad + put_ bh Bot = do + putByte bh 6 + get bh = do + h <- getByte bh + case h of + 0 -> do return Top + 1 -> do return Abs + 2 -> do aa <- get bh + return (Call aa) + 3 -> do ab <- get bh + return (Eval ab) + 4 -> do ac <- get bh + return (Defer ac) + 5 -> do ad <- get bh + return (Box ad) + _ -> do return Bot + +instance Binary Demands where + put_ bh (Poly aa) = do + putByte bh 0 + put_ bh aa + put_ bh (Prod ab) = do + putByte bh 1 + put_ bh ab + get bh = do + h <- getByte bh + case h of + 0 -> do aa <- get bh + return (Poly aa) + _ -> do ab <- get bh + return (Prod ab) + +instance Binary DmdResult where + put_ bh TopRes = do + putByte bh 0 + put_ bh RetCPR = do + putByte bh 1 + put_ bh BotRes = do + putByte bh 2 + get bh = do + h <- getByte bh + case h of + 0 -> do return TopRes + 1 -> do return RetCPR -- Really use RetCPR even if -fcpr-off + -- The wrapper was generated for CPR in + -- the imported module! 
+ _ -> do return BotRes + +instance Binary StrictSig where + put_ bh (StrictSig aa) = do + put_ bh aa + get bh = do + aa <- get bh + return (StrictSig aa) + + +------------------------------------------------------------------------- +-- Types from: CostCentre +------------------------------------------------------------------------- + +instance Binary IsCafCC where + put_ bh CafCC = do + putByte bh 0 + put_ bh NotCafCC = do + putByte bh 1 + get bh = do + h <- getByte bh + case h of + 0 -> do return CafCC + _ -> do return NotCafCC + +instance Binary IsDupdCC where + put_ bh OriginalCC = do + putByte bh 0 + put_ bh DupdCC = do + putByte bh 1 + get bh = do + h <- getByte bh + case h of + 0 -> do return OriginalCC + _ -> do return DupdCC + +instance Binary CostCentre where + put_ bh NoCostCentre = do + putByte bh 0 + put_ bh (NormalCC aa ab ac ad) = do + putByte bh 1 + put_ bh aa + put_ bh ab + put_ bh ac + put_ bh ad + put_ bh (AllCafsCC ae) = do + putByte bh 2 + put_ bh ae + get bh = do + h <- getByte bh + case h of + 0 -> do return NoCostCentre + 1 -> do aa <- get bh + ab <- get bh + ac <- get bh + ad <- get bh + return (NormalCC aa ab ac ad) + _ -> do ae <- get bh + return (AllCafsCC ae) + +------------------------------------------------------------------------- +-- IfaceTypes and friends +------------------------------------------------------------------------- + +instance Binary IfaceExtName where + put_ bh (ExtPkg mod occ) = do + putByte bh 0 + put_ bh mod + put_ bh occ + put_ bh (HomePkg mod occ vers) = do + putByte bh 1 + put_ bh mod + put_ bh occ + put_ bh vers + put_ bh (LocalTop occ) = do + putByte bh 2 + put_ bh occ + put_ bh (LocalTopSub occ _) = do -- Write LocalTopSub just like LocalTop + putByte bh 2 + put_ bh occ + + get bh = do + h <- getByte bh + case h of + 0 -> do mod <- get bh + occ <- get bh + return (ExtPkg mod occ) + 1 -> do mod <- get bh + occ <- get bh + vers <- get bh + return (HomePkg mod occ vers) + _ -> do occ <- get bh + return (LocalTop occ) + +instance Binary IfaceBndr where + put_ bh (IfaceIdBndr aa) = do + putByte bh 0 + put_ bh aa + put_ bh (IfaceTvBndr ab) = do + putByte bh 1 + put_ bh ab + get bh = do + h <- getByte bh + case h of + 0 -> do aa <- get bh + return (IfaceIdBndr aa) + _ -> do ab <- get bh + return (IfaceTvBndr ab) + +instance Binary Kind where + put_ bh LiftedTypeKind = putByte bh 0 + put_ bh UnliftedTypeKind = putByte bh 1 + put_ bh OpenTypeKind = putByte bh 2 + put_ bh ArgTypeKind = putByte bh 3 + put_ bh UbxTupleKind = putByte bh 4 + put_ bh (FunKind k1 k2) = do + putByte bh 5 + put_ bh k1 + put_ bh k2 + put_ bh (KindVar kv) = pprPanic "BinIface.put_: kind var" (ppr kv) + + get bh = do + h <- getByte bh + case h of + 0 -> return LiftedTypeKind + 1 -> return UnliftedTypeKind + 2 -> return OpenTypeKind + 3 -> return ArgTypeKind + 4 -> return UbxTupleKind + _ -> do k1 <- get bh + k2 <- get bh + return (FunKind k1 k2) + +instance Binary IfaceType where + put_ bh (IfaceForAllTy aa ab) = do + putByte bh 0 + put_ bh aa + put_ bh ab + put_ bh (IfaceTyVar ad) = do + putByte bh 1 + put_ bh ad + put_ bh (IfaceAppTy ae af) = do + putByte bh 2 + put_ bh ae + put_ bh af + put_ bh (IfaceFunTy ag ah) = do + putByte bh 3 + put_ bh ag + put_ bh ah + put_ bh (IfacePredTy aq) = do + putByte bh 5 + put_ bh aq + + -- Simple compression for common cases of TyConApp + put_ bh (IfaceTyConApp IfaceIntTc []) = putByte bh 6 + put_ bh (IfaceTyConApp IfaceCharTc []) = putByte bh 7 + put_ bh (IfaceTyConApp IfaceBoolTc []) = putByte bh 8 + put_ bh (IfaceTyConApp 
IfaceListTc [ty]) = do { putByte bh 9; put_ bh ty } + -- Unit tuple and pairs + put_ bh (IfaceTyConApp (IfaceTupTc Boxed 0) []) = putByte bh 10 + put_ bh (IfaceTyConApp (IfaceTupTc Boxed 2) [t1,t2]) = do { putByte bh 11; put_ bh t1; put_ bh t2 } + -- Generic cases + put_ bh (IfaceTyConApp (IfaceTc tc) tys) = do { putByte bh 12; put_ bh tc; put_ bh tys } + put_ bh (IfaceTyConApp tc tys) = do { putByte bh 13; put_ bh tc; put_ bh tys } + + get bh = do + h <- getByte bh + case h of + 0 -> do aa <- get bh + ab <- get bh + return (IfaceForAllTy aa ab) + 1 -> do ad <- get bh + return (IfaceTyVar ad) + 2 -> do ae <- get bh + af <- get bh + return (IfaceAppTy ae af) + 3 -> do ag <- get bh + ah <- get bh + return (IfaceFunTy ag ah) + 5 -> do ap <- get bh + return (IfacePredTy ap) + + -- Now the special cases for TyConApp + 6 -> return (IfaceTyConApp IfaceIntTc []) + 7 -> return (IfaceTyConApp IfaceCharTc []) + 8 -> return (IfaceTyConApp IfaceBoolTc []) + 9 -> do { ty <- get bh; return (IfaceTyConApp IfaceListTc [ty]) } + 10 -> return (IfaceTyConApp (IfaceTupTc Boxed 0) []) + 11 -> do { t1 <- get bh; t2 <- get bh; return (IfaceTyConApp (IfaceTupTc Boxed 2) [t1,t2]) } + 12 -> do { tc <- get bh; tys <- get bh; return (IfaceTyConApp (IfaceTc tc) tys) } + _ -> do { tc <- get bh; tys <- get bh; return (IfaceTyConApp tc tys) } + +instance Binary IfaceTyCon where + -- Int,Char,Bool can't show up here because they can't not be saturated + + put_ bh IfaceIntTc = putByte bh 1 + put_ bh IfaceBoolTc = putByte bh 2 + put_ bh IfaceCharTc = putByte bh 3 + put_ bh IfaceListTc = putByte bh 4 + put_ bh IfacePArrTc = putByte bh 5 + put_ bh (IfaceTupTc bx ar) = do { putByte bh 6; put_ bh bx; put_ bh ar } + put_ bh (IfaceTc ext) = do { putByte bh 7; put_ bh ext } + + get bh = do + h <- getByte bh + case h of + 1 -> return IfaceIntTc + 2 -> return IfaceBoolTc + 3 -> return IfaceCharTc + 4 -> return IfaceListTc + 5 -> return IfacePArrTc + 6 -> do { bx <- get bh; ar <- get bh; return (IfaceTupTc bx ar) } + _ -> do { ext <- get bh; return (IfaceTc ext) } + +instance Binary IfacePredType where + put_ bh (IfaceClassP aa ab) = do + putByte bh 0 + put_ bh aa + put_ bh ab + put_ bh (IfaceIParam ac ad) = do + putByte bh 1 + put_ bh ac + put_ bh ad + get bh = do + h <- getByte bh + case h of + 0 -> do aa <- get bh + ab <- get bh + return (IfaceClassP aa ab) + _ -> do ac <- get bh + ad <- get bh + return (IfaceIParam ac ad) + +------------------------------------------------------------------------- +-- IfaceExpr and friends +------------------------------------------------------------------------- + +instance Binary IfaceExpr where + put_ bh (IfaceLcl aa) = do + putByte bh 0 + put_ bh aa + put_ bh (IfaceType ab) = do + putByte bh 1 + put_ bh ab + put_ bh (IfaceTuple ac ad) = do + putByte bh 2 + put_ bh ac + put_ bh ad + put_ bh (IfaceLam ae af) = do + putByte bh 3 + put_ bh ae + put_ bh af + put_ bh (IfaceApp ag ah) = do + putByte bh 4 + put_ bh ag + put_ bh ah +-- gaw 2004 + put_ bh (IfaceCase ai aj al ak) = do + putByte bh 5 + put_ bh ai + put_ bh aj +-- gaw 2004 + put_ bh al + put_ bh ak + put_ bh (IfaceLet al am) = do + putByte bh 6 + put_ bh al + put_ bh am + put_ bh (IfaceNote an ao) = do + putByte bh 7 + put_ bh an + put_ bh ao + put_ bh (IfaceLit ap) = do + putByte bh 8 + put_ bh ap + put_ bh (IfaceFCall as at) = do + putByte bh 9 + put_ bh as + put_ bh at + put_ bh (IfaceExt aa) = do + putByte bh 10 + put_ bh aa + get bh = do + h <- getByte bh + case h of + 0 -> do aa <- get bh + return (IfaceLcl aa) + 1 -> do ab <- get bh 
+ return (IfaceType ab) + 2 -> do ac <- get bh + ad <- get bh + return (IfaceTuple ac ad) + 3 -> do ae <- get bh + af <- get bh + return (IfaceLam ae af) + 4 -> do ag <- get bh + ah <- get bh + return (IfaceApp ag ah) + 5 -> do ai <- get bh + aj <- get bh +-- gaw 2004 + al <- get bh + ak <- get bh +-- gaw 2004 + return (IfaceCase ai aj al ak) + 6 -> do al <- get bh + am <- get bh + return (IfaceLet al am) + 7 -> do an <- get bh + ao <- get bh + return (IfaceNote an ao) + 8 -> do ap <- get bh + return (IfaceLit ap) + 9 -> do as <- get bh + at <- get bh + return (IfaceFCall as at) + _ -> do aa <- get bh + return (IfaceExt aa) + +instance Binary IfaceConAlt where + put_ bh IfaceDefault = do + putByte bh 0 + put_ bh (IfaceDataAlt aa) = do + putByte bh 1 + put_ bh aa + put_ bh (IfaceTupleAlt ab) = do + putByte bh 2 + put_ bh ab + put_ bh (IfaceLitAlt ac) = do + putByte bh 3 + put_ bh ac + get bh = do + h <- getByte bh + case h of + 0 -> do return IfaceDefault + 1 -> do aa <- get bh + return (IfaceDataAlt aa) + 2 -> do ab <- get bh + return (IfaceTupleAlt ab) + _ -> do ac <- get bh + return (IfaceLitAlt ac) + +instance Binary IfaceBinding where + put_ bh (IfaceNonRec aa ab) = do + putByte bh 0 + put_ bh aa + put_ bh ab + put_ bh (IfaceRec ac) = do + putByte bh 1 + put_ bh ac + get bh = do + h <- getByte bh + case h of + 0 -> do aa <- get bh + ab <- get bh + return (IfaceNonRec aa ab) + _ -> do ac <- get bh + return (IfaceRec ac) + +instance Binary IfaceIdInfo where + put_ bh NoInfo = putByte bh 0 + put_ bh (HasInfo i) = do + putByte bh 1 + lazyPut bh i -- NB lazyPut + + get bh = do + h <- getByte bh + case h of + 0 -> return NoInfo + _ -> do info <- lazyGet bh -- NB lazyGet + return (HasInfo info) + +instance Binary IfaceInfoItem where + put_ bh (HsArity aa) = do + putByte bh 0 + put_ bh aa + put_ bh (HsStrictness ab) = do + putByte bh 1 + put_ bh ab + put_ bh (HsUnfold ac ad) = do + putByte bh 2 + put_ bh ac + put_ bh ad + put_ bh HsNoCafRefs = do + putByte bh 3 + put_ bh (HsWorker ae af) = do + putByte bh 4 + put_ bh ae + put_ bh af + get bh = do + h <- getByte bh + case h of + 0 -> do aa <- get bh + return (HsArity aa) + 1 -> do ab <- get bh + return (HsStrictness ab) + 2 -> do ac <- get bh + ad <- get bh + return (HsUnfold ac ad) + 3 -> do return HsNoCafRefs + _ -> do ae <- get bh + af <- get bh + return (HsWorker ae af) + +instance Binary IfaceNote where + put_ bh (IfaceSCC aa) = do + putByte bh 0 + put_ bh aa + put_ bh (IfaceCoerce ab) = do + putByte bh 1 + put_ bh ab + put_ bh IfaceInlineCall = do + putByte bh 2 + put_ bh IfaceInlineMe = do + putByte bh 3 + put_ bh (IfaceCoreNote s) = do + putByte bh 4 + put_ bh s + get bh = do + h <- getByte bh + case h of + 0 -> do aa <- get bh + return (IfaceSCC aa) + 1 -> do ab <- get bh + return (IfaceCoerce ab) + 2 -> do return IfaceInlineCall + 3 -> do return IfaceInlineMe + _ -> do ac <- get bh + return (IfaceCoreNote ac) + + +------------------------------------------------------------------------- +-- IfaceDecl and friends +------------------------------------------------------------------------- + +instance Binary IfaceDecl where + put_ bh (IfaceId name ty idinfo) = do + putByte bh 0 + put_ bh name + put_ bh ty + put_ bh idinfo + put_ bh (IfaceForeign ae af) = + error "Binary.put_(IfaceDecl): IfaceForeign" + put_ bh (IfaceData a1 a2 a3 a4 a5 a6 a7) = do + putByte bh 2 + put_ bh a1 + put_ bh a2 + put_ bh a3 + put_ bh a4 + put_ bh a5 + put_ bh a6 + put_ bh a7 + + put_ bh (IfaceSyn aq ar as at) = do + putByte bh 3 + put_ bh aq + put_ bh ar + put_ 
bh as + put_ bh at + put_ bh (IfaceClass a1 a2 a3 a4 a5 a6 a7) = do + putByte bh 4 + put_ bh a1 + put_ bh a2 + put_ bh a3 + put_ bh a4 + put_ bh a5 + put_ bh a6 + put_ bh a7 + get bh = do + h <- getByte bh + case h of + 0 -> do name <- get bh + ty <- get bh + idinfo <- get bh + return (IfaceId name ty idinfo) + 1 -> error "Binary.get(TyClDecl): ForeignType" + 2 -> do + a1 <- get bh + a2 <- get bh + a3 <- get bh + a4 <- get bh + a5 <- get bh + a6 <- get bh + a7 <- get bh + return (IfaceData a1 a2 a3 a4 a5 a6 a7) + 3 -> do + aq <- get bh + ar <- get bh + as <- get bh + at <- get bh + return (IfaceSyn aq ar as at) + _ -> do + a1 <- get bh + a2 <- get bh + a3 <- get bh + a4 <- get bh + a5 <- get bh + a6 <- get bh + a7 <- get bh + return (IfaceClass a1 a2 a3 a4 a5 a6 a7) + +instance Binary IfaceInst where + put_ bh (IfaceInst cls tys dfun flag orph) = do + put_ bh cls + put_ bh tys + put_ bh dfun + put_ bh flag + put_ bh orph + get bh = do cls <- get bh + tys <- get bh + dfun <- get bh + flag <- get bh + orph <- get bh + return (IfaceInst cls tys dfun flag orph) + +instance Binary OverlapFlag where + put_ bh NoOverlap = putByte bh 0 + put_ bh OverlapOk = putByte bh 1 + put_ bh Incoherent = putByte bh 2 + get bh = do h <- getByte bh + case h of + 0 -> return NoOverlap + 1 -> return OverlapOk + 2 -> return Incoherent + +instance Binary IfaceConDecls where + put_ bh IfAbstractTyCon = putByte bh 0 + put_ bh (IfDataTyCon cs) = do { putByte bh 1 + ; put_ bh cs } + put_ bh (IfNewTyCon c) = do { putByte bh 2 + ; put_ bh c } + get bh = do + h <- getByte bh + case h of + 0 -> return IfAbstractTyCon + 1 -> do cs <- get bh + return (IfDataTyCon cs) + _ -> do aa <- get bh + return (IfNewTyCon aa) + +instance Binary IfaceConDecl where + put_ bh (IfVanillaCon a1 a2 a3 a4 a5) = do + putByte bh 0 + put_ bh a1 + put_ bh a2 + put_ bh a3 + put_ bh a4 + put_ bh a5 + put_ bh (IfGadtCon a1 a2 a3 a4 a5 a6) = do + putByte bh 1 + put_ bh a1 + put_ bh a2 + put_ bh a3 + put_ bh a4 + put_ bh a5 + put_ bh a6 + get bh = do + h <- getByte bh + case h of + 0 -> do a1 <- get bh + a2 <- get bh + a3 <- get bh + a4 <- get bh + a5 <- get bh + return (IfVanillaCon a1 a2 a3 a4 a5) + _ -> do a1 <- get bh + a2 <- get bh + a3 <- get bh + a4 <- get bh + a5 <- get bh + a6 <- get bh + return (IfGadtCon a1 a2 a3 a4 a5 a6) + +instance Binary IfaceClassOp where + put_ bh (IfaceClassOp n def ty) = do + put_ bh n + put_ bh def + put_ bh ty + get bh = do + n <- get bh + def <- get bh + ty <- get bh + return (IfaceClassOp n def ty) + +instance Binary IfaceRule where + put_ bh (IfaceRule a1 a2 a3 a4 a5 a6 a7) = do + put_ bh a1 + put_ bh a2 + put_ bh a3 + put_ bh a4 + put_ bh a5 + put_ bh a6 + put_ bh a7 + get bh = do + a1 <- get bh + a2 <- get bh + a3 <- get bh + a4 <- get bh + a5 <- get bh + a6 <- get bh + a7 <- get bh + return (IfaceRule a1 a2 a3 a4 a5 a6 a7) + + diff --git a/compiler/iface/BuildTyCl.lhs b/compiler/iface/BuildTyCl.lhs new file mode 100644 index 0000000000..f81f2e7d07 --- /dev/null +++ b/compiler/iface/BuildTyCl.lhs @@ -0,0 +1,256 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% + +\begin{code} +module BuildTyCl ( + buildSynTyCon, buildAlgTyCon, buildDataCon, + buildClass, + mkAbstractTyConRhs, mkNewTyConRhs, mkDataTyConRhs + ) where + +#include "HsVersions.h" + +import IfaceEnv ( newImplicitBinder ) +import TcRnMonad + +import DataCon ( DataCon, isNullarySrcDataCon, dataConTyVars, + mkDataCon, dataConFieldLabels, dataConOrigArgTys ) +import Var ( tyVarKind, TyVar, Id ) +import VarSet ( isEmptyVarSet, 
intersectVarSet, elemVarSet ) +import TysWiredIn ( unitTy ) +import BasicTypes ( RecFlag, StrictnessMark(..) ) +import Name ( Name ) +import OccName ( mkDataConWrapperOcc, mkDataConWorkerOcc, mkClassTyConOcc, + mkClassDataConOcc, mkSuperDictSelOcc ) +import MkId ( mkDataConIds, mkRecordSelId, mkDictSelId ) +import Class ( mkClass, Class( classTyCon), FunDep, DefMeth(..) ) +import TyCon ( mkSynTyCon, mkAlgTyCon, visibleDataCons, tyConStupidTheta, + tyConDataCons, isNewTyCon, mkClassTyCon, TyCon( tyConTyVars ), + isRecursiveTyCon, + ArgVrcs, AlgTyConRhs(..), newTyConRhs ) +import Type ( mkArrowKinds, liftedTypeKind, typeKind, + tyVarsOfType, tyVarsOfTypes, tyVarsOfPred, + splitTyConApp_maybe, splitAppTy_maybe, getTyVar_maybe, + mkPredTys, mkTyVarTys, ThetaType, Type, + substTyWith, zipTopTvSubst, substTheta ) +import Outputable +import List ( nub ) + +\end{code} + + +\begin{code} +------------------------------------------------------ +buildSynTyCon name tvs rhs_ty arg_vrcs + = mkSynTyCon name kind tvs rhs_ty arg_vrcs + where + kind = mkArrowKinds (map tyVarKind tvs) (typeKind rhs_ty) + + +------------------------------------------------------ +buildAlgTyCon :: Name -> [TyVar] + -> ThetaType -- Stupid theta + -> AlgTyConRhs + -> ArgVrcs -> RecFlag + -> Bool -- True <=> want generics functions + -> TcRnIf m n TyCon + +buildAlgTyCon tc_name tvs stupid_theta rhs arg_vrcs is_rec want_generics + = do { let { tycon = mkAlgTyCon tc_name kind tvs arg_vrcs stupid_theta + rhs fields is_rec want_generics + ; kind = mkArrowKinds (map tyVarKind tvs) liftedTypeKind + ; fields = mkTyConSelIds tycon rhs + } + ; return tycon } + +------------------------------------------------------ +mkAbstractTyConRhs :: AlgTyConRhs +mkAbstractTyConRhs = AbstractTyCon + +mkDataTyConRhs :: [DataCon] -> AlgTyConRhs +mkDataTyConRhs cons + = DataTyCon { data_cons = cons, is_enum = all isNullarySrcDataCon cons } + +mkNewTyConRhs :: TyCon -> DataCon -> AlgTyConRhs +mkNewTyConRhs tycon con + = NewTyCon { data_con = con, + nt_rhs = rhs_ty, + nt_etad_rhs = eta_reduce tvs rhs_ty, + nt_rep = mkNewTyConRep tycon rhs_ty } + where + tvs = dataConTyVars con + rhs_ty = head (dataConOrigArgTys con) + -- Newtypes are guaranteed vanilla, so OrigArgTys will do + + eta_reduce [] ty = ([], ty) + eta_reduce (a:as) ty | null as', + Just (fun, arg) <- splitAppTy_maybe ty', + Just tv <- getTyVar_maybe arg, + tv == a, + not (a `elemVarSet` tyVarsOfType fun) + = ([], fun) -- Successful eta reduction + | otherwise + = (a:as', ty') + where + (as', ty') = eta_reduce as ty + +mkNewTyConRep :: TyCon -- The original type constructor + -> Type -- The arg type of its constructor + -> Type -- Chosen representation type +-- The "representation type" is guaranteed not to be another newtype +-- at the outermost level; but it might have newtypes in type arguments + +-- Find the representation type for this newtype TyCon +-- Remember that the representation type is the *ultimate* representation +-- type, looking through other newtypes. +-- +-- The non-recursive newtypes are easy, because they look transparent +-- to splitTyConApp_maybe, but recursive ones really are represented as +-- TyConApps (see TypeRep). 
+-- +-- The trick is to to deal correctly with recursive newtypes +-- such as newtype T = MkT T + +mkNewTyConRep tc rhs_ty + | null (tyConDataCons tc) = unitTy + -- External Core programs can have newtypes with no data constructors + | otherwise = go [tc] rhs_ty + where + -- Invariant: tcs have been seen before + go tcs rep_ty + = case splitTyConApp_maybe rep_ty of + Just (tc, tys) + | tc `elem` tcs -> unitTy -- Recursive loop + | isNewTyCon tc -> ASSERT( isRecursiveTyCon tc ) + -- Non-recursive ones have been + -- dealt with by splitTyConApp_maybe + go (tc:tcs) (substTyWith tvs tys rhs_ty) + where + (tvs, rhs_ty) = newTyConRhs tc + + other -> rep_ty + +------------------------------------------------------ +buildDataCon :: Name -> Bool -> Bool + -> [StrictnessMark] + -> [Name] -- Field labels + -> [TyVar] + -> ThetaType -- Does not include the "stupid theta" + -> [Type] -> TyCon -> [Type] + -> TcRnIf m n DataCon +-- A wrapper for DataCon.mkDataCon that +-- a) makes the worker Id +-- b) makes the wrapper Id if necessary, including +-- allocating its unique (hence monadic) +buildDataCon src_name declared_infix vanilla arg_stricts field_lbls + tyvars ctxt arg_tys tycon res_tys + = do { wrap_name <- newImplicitBinder src_name mkDataConWrapperOcc + ; work_name <- newImplicitBinder src_name mkDataConWorkerOcc + -- This last one takes the name of the data constructor in the source + -- code, which (for Haskell source anyway) will be in the DataName name + -- space, and puts it into the VarName name space + + ; let + stupid_ctxt = mkDataConStupidTheta tycon arg_tys res_tys + data_con = mkDataCon src_name declared_infix vanilla + arg_stricts field_lbls + tyvars stupid_ctxt ctxt + arg_tys tycon res_tys dc_ids + dc_ids = mkDataConIds wrap_name work_name data_con + + ; returnM data_con } + + +-- The stupid context for a data constructor should be limited to +-- the type variables mentioned in the arg_tys +mkDataConStupidTheta tycon arg_tys res_tys + | null stupid_theta = [] -- The common case + | otherwise = filter in_arg_tys stupid_theta + where + tc_subst = zipTopTvSubst (tyConTyVars tycon) res_tys + stupid_theta = substTheta tc_subst (tyConStupidTheta tycon) + -- Start by instantiating the master copy of the + -- stupid theta, taken from the TyCon + + arg_tyvars = tyVarsOfTypes arg_tys + in_arg_tys pred = not $ isEmptyVarSet $ + tyVarsOfPred pred `intersectVarSet` arg_tyvars + +------------------------------------------------------ +mkTyConSelIds :: TyCon -> AlgTyConRhs -> [Id] +mkTyConSelIds tycon rhs + = [ mkRecordSelId tycon fld + | fld <- nub (concatMap dataConFieldLabels (visibleDataCons rhs)) ] + -- We'll check later that fields with the same name + -- from different constructors have the same type. +\end{code} + + +------------------------------------------------------ +\begin{code} +buildClass :: Name -> [TyVar] -> ThetaType + -> [FunDep TyVar] -- Functional dependencies + -> [(Name, DefMeth, Type)] -- Method info + -> RecFlag -> ArgVrcs -- Info for type constructor + -> TcRnIf m n Class + +buildClass class_name tvs sc_theta fds sig_stuff tc_isrec tc_vrcs + = do { tycon_name <- newImplicitBinder class_name mkClassTyConOcc + ; datacon_name <- newImplicitBinder class_name mkClassDataConOcc + -- The class name is the 'parent' for this datacon, not its tycon, + -- because one should import the class to get the binding for + -- the datacon + ; sc_sel_names <- mapM (newImplicitBinder class_name . 
mkSuperDictSelOcc) + [1..length sc_theta] + -- We number off the superclass selectors, 1, 2, 3 etc so that we + -- can construct names for the selectors. Thus + -- class (C a, C b) => D a b where ... + -- gives superclass selectors + -- D_sc1, D_sc2 + -- (We used to call them D_C, but now we can have two different + -- superclasses both called C!) + + ; fixM (\ clas -> do { -- Only name generation inside loop + + let { op_tys = [ty | (_,_,ty) <- sig_stuff] + ; sc_tys = mkPredTys sc_theta + ; dict_component_tys = sc_tys ++ op_tys + ; sc_sel_ids = [mkDictSelId sc_name clas | sc_name <- sc_sel_names] + ; op_items = [ (mkDictSelId op_name clas, dm_info) + | (op_name, dm_info, _) <- sig_stuff ] } + -- Build the selector id and default method id + + ; dict_con <- buildDataCon datacon_name + False -- Not declared infix + True -- Is vanilla; tyvars same as tycon + (map (const NotMarkedStrict) dict_component_tys) + [{- No labelled fields -}] + tvs [{-No context-}] dict_component_tys + (classTyCon clas) (mkTyVarTys tvs) + + ; let { clas = mkClass class_name tvs fds + sc_theta sc_sel_ids op_items + tycon + + ; tycon = mkClassTyCon tycon_name clas_kind tvs + tc_vrcs rhs clas tc_isrec + -- A class can be recursive, and in the case of newtypes + -- this matters. For example + -- class C a where { op :: C b => a -> b -> Int } + -- Because C has only one operation, it is represented by + -- a newtype, and it should be a *recursive* newtype. + -- [If we don't make it a recursive newtype, we'll expand the + -- newtype like a synonym, but that will lead to an infinite type] + + ; clas_kind = mkArrowKinds (map tyVarKind tvs) liftedTypeKind + + ; rhs = case dict_component_tys of + [rep_ty] -> mkNewTyConRhs tycon dict_con + other -> mkDataTyConRhs [dict_con] + } + ; return clas + })} +\end{code} + + diff --git a/compiler/iface/IfaceEnv.lhs b/compiler/iface/IfaceEnv.lhs new file mode 100644 index 0000000000..40b7d31f13 --- /dev/null +++ b/compiler/iface/IfaceEnv.lhs @@ -0,0 +1,359 @@ +(c) The University of Glasgow 2002 + +\begin{code} +module IfaceEnv ( + newGlobalBinder, newIPName, newImplicitBinder, + lookupIfaceTop, lookupIfaceExt, + lookupOrig, lookupIfaceTc, + newIfaceName, newIfaceNames, + extendIfaceIdEnv, extendIfaceTyVarEnv, refineIfaceIdEnv, + tcIfaceLclId, tcIfaceTyVar, + + lookupAvail, ifaceExportNames, + + -- Name-cache stuff + allocateGlobalBinder, initNameCache, + ) where + +#include "HsVersions.h" + +import TcRnMonad +import IfaceType ( IfaceExtName(..), IfaceTyCon(..), ifaceTyConName ) +import TysWiredIn ( tupleTyCon, tupleCon ) +import HscTypes ( NameCache(..), HscEnv(..), GenAvailInfo(..), + IfaceExport, OrigNameCache ) +import Type ( mkOpenTvSubst, substTy ) +import TyCon ( TyCon, tyConName ) +import Unify ( TypeRefinement ) +import DataCon ( dataConWorkId, dataConName ) +import Var ( TyVar, Id, varName, setIdType, idType ) +import Name ( Name, nameUnique, nameModule, + nameOccName, nameSrcLoc, + getOccName, nameParent_maybe, + isWiredInName, mkIPName, + mkExternalName, mkInternalName ) +import NameSet ( NameSet, emptyNameSet, addListToNameSet ) +import OccName ( OccName, isTupleOcc_maybe, tcName, dataName, mapOccEnv, + lookupOccEnv, unitOccEnv, extendOccEnv, extendOccEnvList ) +import PrelNames ( gHC_PRIM, pREL_TUP ) +import Module ( Module, emptyModuleEnv, + lookupModuleEnv, extendModuleEnv_C ) +import UniqSupply ( UniqSupply, splitUniqSupply, uniqFromSupply, uniqsFromSupply ) +import FiniteMap ( emptyFM, lookupFM, addToFM ) +import BasicTypes ( IPName(..), mapIPName ) +import SrcLoc ( 
SrcLoc, noSrcLoc ) +import Maybes ( orElse ) + +import Outputable +\end{code} + + +%********************************************************* +%* * + Allocating new Names in the Name Cache +%* * +%********************************************************* + +\begin{code} +newGlobalBinder :: Module -> OccName -> Maybe Name -> SrcLoc -> TcRnIf a b Name +-- Used for source code and interface files, to make the +-- Name for a thing, given its Module and OccName +-- +-- The cache may already already have a binding for this thing, +-- because we may have seen an occurrence before, but now is the +-- moment when we know its Module and SrcLoc in their full glory + +newGlobalBinder mod occ mb_parent loc + = do { mod `seq` occ `seq` return () -- See notes with lookupOrig_help + -- ; traceIf (text "newGlobalBinder" <+> ppr mod <+> ppr occ <+> ppr loc) + ; name_supply <- getNameCache + ; let (name_supply', name) = allocateGlobalBinder + name_supply mod occ + mb_parent loc + ; setNameCache name_supply' + ; return name } + +allocateGlobalBinder + :: NameCache + -> Module -> OccName -> Maybe Name -> SrcLoc + -> (NameCache, Name) +allocateGlobalBinder name_supply mod occ mb_parent loc + = case lookupOrigNameCache (nsNames name_supply) mod occ of + -- A hit in the cache! We are at the binding site of the name. + -- This is the moment when we know the defining parent and SrcLoc + -- of the Name, so we set these fields in the Name we return. + -- + -- Then (bogus) multiple bindings of the same Name + -- get different SrcLocs can can be reported as such. + -- + -- Possible other reason: it might be in the cache because we + -- encountered an occurrence before the binding site for an + -- implicitly-imported Name. Perhaps the current SrcLoc is + -- better... but not really: it'll still just say 'imported' + -- + -- IMPORTANT: Don't mess with wired-in names. + -- Their wired-in-ness is in their NameSort + -- and their Module is correct. + + Just name | isWiredInName name -> (name_supply, name) + | otherwise -> (new_name_supply, name') + where + uniq = nameUnique name + name' = mkExternalName uniq mod occ mb_parent loc + new_cache = extend_name_cache (nsNames name_supply) mod occ name' + new_name_supply = name_supply {nsNames = new_cache} + + -- Miss in the cache! 
+ -- Build a completely new Name, and put it in the cache + Nothing -> (new_name_supply, name) + where + (us', us1) = splitUniqSupply (nsUniqs name_supply) + uniq = uniqFromSupply us1 + name = mkExternalName uniq mod occ mb_parent loc + new_cache = extend_name_cache (nsNames name_supply) mod occ name + new_name_supply = name_supply {nsUniqs = us', nsNames = new_cache} + + +newImplicitBinder :: Name -- Base name + -> (OccName -> OccName) -- Occurrence name modifier + -> TcRnIf m n Name -- Implicit name +-- Called in BuildTyCl to allocate the implicit binders of type/class decls +-- For source type/class decls, this is the first occurrence +-- For iface ones, the LoadIface has alrady allocated a suitable name in the cache +-- +-- An *implicit* name has the base-name as parent +newImplicitBinder base_name mk_sys_occ + = newGlobalBinder (nameModule base_name) + (mk_sys_occ (nameOccName base_name)) + (Just parent_name) + (nameSrcLoc base_name) + where + parent_name = case nameParent_maybe base_name of + Just parent_name -> parent_name + Nothing -> base_name + +ifaceExportNames :: [IfaceExport] -> TcRnIf gbl lcl NameSet +ifaceExportNames exports + = foldlM do_one emptyNameSet exports + where + do_one acc (mod, exports) = foldlM (do_avail mod) acc exports + do_avail mod acc avail = do { ns <- lookupAvail mod avail + ; return (addListToNameSet acc ns) } + +lookupAvail :: Module -> GenAvailInfo OccName -> TcRnIf a b [Name] +-- Find all the names arising from an import +-- Make sure the parent info is correct, even though we may not +-- yet have read the interface for this module +lookupAvail mod (Avail n) = do { n' <- lookupOrig mod n; + ; return [n'] } +lookupAvail mod (AvailTC p_occ occs) + = do { p_name <- lookupOrig mod p_occ + ; let lookup_sub occ | occ == p_occ = return p_name + | otherwise = lookup_orig mod occ (Just p_name) + ; mappM lookup_sub occs } + -- Remember that 'occs' is all the exported things, including + -- the parent. It's possible to export just class ops without + -- the class, via C( op ). If the class was exported too we'd + -- have C( C, op ) + + -- The use of lookupOrigSub here (rather than lookupOrig) + -- ensures that the subordinate names record their parent; + -- and that in turn ensures that the GlobalRdrEnv + -- has the correct parent for all the names in its range. + -- For imported things, we may only suck in the interface later, if ever. + -- Reason for all this: + -- Suppose module M exports type A.T, and constructor A.MkT + -- Then, we know that A.MkT is a subordinate name of A.T, + -- even though we aren't at the binding site of A.T + -- And it's important, because we may simply re-export A.T + -- without ever sucking in the declaration itself. + + +lookupOrig :: Module -> OccName -> TcRnIf a b Name +-- Even if we get a miss in the original-name cache, we +-- make a new External Name. +-- We fake up +-- SrcLoc to noSrcLoc +-- Parent no Nothing +-- They'll be overwritten, in due course, by LoadIface.loadDecl. +lookupOrig mod occ = lookup_orig mod occ Nothing + +lookup_orig :: Module -> OccName -> Maybe Name -> TcRnIf a b Name +-- Used when we know the parent of the thing we are looking up +lookup_orig mod occ mb_parent + = do { -- First ensure that mod and occ are evaluated + -- If not, chaos can ensue: + -- we read the name-cache + -- then pull on mod (say) + -- which does some stuff that modifies the name cache + -- This did happen, with tycon_mod in TcIface.tcIfaceAlt (DataAlt..) 
+ mod `seq` occ `seq` return () + + ; name_supply <- getNameCache + ; case lookupOrigNameCache (nsNames name_supply) mod occ of { + Just name -> returnM name ; + Nothing -> do + + { let { (us', us1) = splitUniqSupply (nsUniqs name_supply) + ; uniq = uniqFromSupply us1 + ; name = mkExternalName uniq mod occ mb_parent noSrcLoc + ; new_cache = extend_name_cache (nsNames name_supply) mod occ name + ; new_name_supply = name_supply {nsUniqs = us', nsNames = new_cache} + } + ; setNameCache new_name_supply + ; return name } + }} + +newIPName :: IPName OccName -> TcRnIf m n (IPName Name) +newIPName occ_name_ip + = getNameCache `thenM` \ name_supply -> + let + ipcache = nsIPs name_supply + in + case lookupFM ipcache key of + Just name_ip -> returnM name_ip + Nothing -> setNameCache new_ns `thenM_` + returnM name_ip + where + (us', us1) = splitUniqSupply (nsUniqs name_supply) + uniq = uniqFromSupply us1 + name_ip = mapIPName (mkIPName uniq) occ_name_ip + new_ipcache = addToFM ipcache key name_ip + new_ns = name_supply {nsUniqs = us', nsIPs = new_ipcache} + where + key = occ_name_ip -- Ensures that ?x and %x get distinct Names +\end{code} + + Local helper functions (not exported) + +\begin{code} +lookupOrigNameCache :: OrigNameCache -> Module -> OccName -> Maybe Name +lookupOrigNameCache nc mod occ + | mod == pREL_TUP || mod == gHC_PRIM, -- Boxed tuples from one, + Just tup_info <- isTupleOcc_maybe occ -- unboxed from the other + = -- Special case for tuples; there are too many + -- of them to pre-populate the original-name cache + Just (mk_tup_name tup_info) + where + mk_tup_name (ns, boxity, arity) + | ns == tcName = tyConName (tupleTyCon boxity arity) + | ns == dataName = dataConName (tupleCon boxity arity) + | otherwise = varName (dataConWorkId (tupleCon boxity arity)) + +lookupOrigNameCache nc mod occ -- The normal case + = case lookupModuleEnv nc mod of + Nothing -> Nothing + Just occ_env -> lookupOccEnv occ_env occ + +extendOrigNameCache :: OrigNameCache -> Name -> OrigNameCache +extendOrigNameCache nc name + = extend_name_cache nc (nameModule name) (nameOccName name) name + +extend_name_cache :: OrigNameCache -> Module -> OccName -> Name -> OrigNameCache +extend_name_cache nc mod occ name + = extendModuleEnv_C combine nc mod (unitOccEnv occ name) + where + combine occ_env _ = extendOccEnv occ_env occ name + +getNameCache :: TcRnIf a b NameCache +getNameCache = do { HscEnv { hsc_NC = nc_var } <- getTopEnv; + readMutVar nc_var } + +setNameCache :: NameCache -> TcRnIf a b () +setNameCache nc = do { HscEnv { hsc_NC = nc_var } <- getTopEnv; + writeMutVar nc_var nc } +\end{code} + + +\begin{code} +initNameCache :: UniqSupply -> [Name] -> NameCache +initNameCache us names + = NameCache { nsUniqs = us, + nsNames = initOrigNames names, + nsIPs = emptyFM } + +initOrigNames :: [Name] -> OrigNameCache +initOrigNames names = foldl extendOrigNameCache emptyModuleEnv names +\end{code} + + + +%************************************************************************ +%* * + Type variables and local Ids +%* * +%************************************************************************ + +\begin{code} +tcIfaceLclId :: OccName -> IfL Id +tcIfaceLclId occ + = do { lcl <- getLclEnv + ; return (lookupOccEnv (if_id_env lcl) occ + `orElse` + pprPanic "tcIfaceLclId" (ppr occ)) } + +refineIfaceIdEnv :: TypeRefinement -> IfL a -> IfL a +refineIfaceIdEnv (tv_subst, _) thing_inside + = do { env <- getLclEnv + ; let { id_env' = mapOccEnv refine_id (if_id_env env) + ; refine_id id = setIdType id (substTy subst (idType id)) + ; 
subst = mkOpenTvSubst tv_subst } + ; setLclEnv (env { if_id_env = id_env' }) thing_inside } + +extendIfaceIdEnv :: [Id] -> IfL a -> IfL a +extendIfaceIdEnv ids thing_inside + = do { env <- getLclEnv + ; let { id_env' = extendOccEnvList (if_id_env env) pairs + ; pairs = [(getOccName id, id) | id <- ids] } + ; setLclEnv (env { if_id_env = id_env' }) thing_inside } + + +tcIfaceTyVar :: OccName -> IfL TyVar +tcIfaceTyVar occ + = do { lcl <- getLclEnv + ; return (lookupOccEnv (if_tv_env lcl) occ + `orElse` + pprPanic "tcIfaceTyVar" (ppr occ)) } + +extendIfaceTyVarEnv :: [TyVar] -> IfL a -> IfL a +extendIfaceTyVarEnv tyvars thing_inside + = do { env <- getLclEnv + ; let { tv_env' = extendOccEnvList (if_tv_env env) pairs + ; pairs = [(getOccName tv, tv) | tv <- tyvars] } + ; setLclEnv (env { if_tv_env = tv_env' }) thing_inside } +\end{code} + + +%************************************************************************ +%* * + Getting from RdrNames to Names +%* * +%************************************************************************ + +\begin{code} +lookupIfaceTc :: IfaceTyCon -> IfL Name +lookupIfaceTc (IfaceTc ext) = lookupIfaceExt ext +lookupIfaceTc other_tc = return (ifaceTyConName other_tc) + +lookupIfaceExt :: IfaceExtName -> IfL Name +lookupIfaceExt (ExtPkg mod occ) = lookupOrig mod occ +lookupIfaceExt (HomePkg mod occ _) = lookupOrig mod occ +lookupIfaceExt (LocalTop occ) = lookupIfaceTop occ +lookupIfaceExt (LocalTopSub occ _) = lookupIfaceTop occ + +lookupIfaceTop :: OccName -> IfL Name +-- Look up a top-level name from the current Iface module +lookupIfaceTop occ + = do { env <- getLclEnv; lookupOrig (if_mod env) occ } + +newIfaceName :: OccName -> IfL Name +newIfaceName occ + = do { uniq <- newUnique + ; return (mkInternalName uniq occ noSrcLoc) } + +newIfaceNames :: [OccName] -> IfL [Name] +newIfaceNames occs + = do { uniqs <- newUniqueSupply + ; return [ mkInternalName uniq occ noSrcLoc + | (occ,uniq) <- occs `zip` uniqsFromSupply uniqs] } +\end{code} diff --git a/compiler/iface/IfaceSyn.lhs b/compiler/iface/IfaceSyn.lhs new file mode 100644 index 0000000000..99501a5b68 --- /dev/null +++ b/compiler/iface/IfaceSyn.lhs @@ -0,0 +1,998 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-1998 +% +%************************************************************************ +%* * +\section[HsCore]{Core-syntax unfoldings in Haskell interface files} +%* * +%************************************************************************ + +We could either use this, or parameterise @GenCoreExpr@ on @Types@ and +@TyVars@ as well. Currently trying the former... MEGA SIGH. 
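+
+As a rough illustration of the representation defined below (a sketch
+only, with placeholder names ty_x and f_ext): the Core unfolding
+	\x -> f x
+where f is an imported function, is written to the interface file as
+something like
+	IfaceLam (IfaceIdBndr (x, ty_x))
+	         (IfaceApp (IfaceExt f_ext) (IfaceLcl x))
+Here ty_x is the IfaceType of x and f_ext is the IfaceExtName chosen
+for f.  Locally-bound things appear as plain OccNames (IfaceLcl),
+while free top-level names become IfaceExtNames (IfaceExt); see
+toIfaceVar later in this file.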
+ +\begin{code} +module IfaceSyn ( + module IfaceType, -- Re-export all this + + IfaceDecl(..), IfaceClassOp(..), IfaceConDecl(..), IfaceConDecls(..), + IfaceExpr(..), IfaceAlt, IfaceNote(..), + IfaceBinding(..), IfaceConAlt(..), IfaceIdInfo(..), + IfaceInfoItem(..), IfaceRule(..), IfaceInst(..), + + -- Misc + visibleIfConDecls, + + -- Converting things to IfaceSyn + tyThingToIfaceDecl, instanceToIfaceInst, coreRuleToIfaceRule, + + -- Equality + IfaceEq(..), (&&&), bool, eqListBy, eqMaybeBy, + eqIfDecl, eqIfInst, eqIfRule, + + -- Pretty printing + pprIfaceExpr, pprIfaceDecl, pprIfaceDeclHead + ) where + +#include "HsVersions.h" + +import CoreSyn +import IfaceType + +import FunDeps ( pprFundeps ) +import NewDemand ( StrictSig, pprIfaceStrictSig ) +import TcType ( deNoteType ) +import Type ( TyThing(..), splitForAllTys, funResultTy ) +import InstEnv ( Instance(..), OverlapFlag ) +import Id ( Id, idName, idType, idInfo, idArity, isDataConWorkId_maybe, isFCallId_maybe ) +import NewDemand ( isTopSig ) +import IdInfo ( IdInfo, CafInfo(..), WorkerInfo(..), + arityInfo, cafInfo, newStrictnessInfo, + workerInfo, unfoldingInfo, inlinePragInfo ) +import TyCon ( TyCon, ArgVrcs, AlgTyConRhs(..), isRecursiveTyCon, isForeignTyCon, + isSynTyCon, isAlgTyCon, isPrimTyCon, isFunTyCon, + isTupleTyCon, tupleTyConBoxity, tyConStupidTheta, + tyConHasGenerics, tyConArgVrcs, synTyConRhs, + tyConArity, tyConTyVars, algTyConRhs, tyConExtName ) +import DataCon ( dataConName, dataConSig, dataConFieldLabels, dataConStrictMarks, + dataConTyCon, dataConIsInfix, isVanillaDataCon ) +import Class ( FunDep, DefMeth, classExtraBigSig, classTyCon ) +import OccName ( OccName, OccEnv, emptyOccEnv, + lookupOccEnv, extendOccEnv, parenSymOcc, + OccSet, unionOccSets, unitOccSet ) +import Name ( Name, NamedThing(..), nameOccName, isExternalName ) +import CostCentre ( CostCentre, pprCostCentreCore ) +import Literal ( Literal ) +import ForeignCall ( ForeignCall ) +import TysPrim ( alphaTyVars ) +import BasicTypes ( Arity, Activation(..), StrictnessMark, + RecFlag(..), boolToRecFlag, Boxity(..), + tupleParens ) +import Outputable +import FastString +import Maybes ( catMaybes ) +import Util ( lengthIs ) + +infixl 3 &&& +infix 4 `eqIfExt`, `eqIfIdInfo`, `eqIfType` +\end{code} + + +%************************************************************************ +%* * + Data type declarations +%* * +%************************************************************************ + +\begin{code} +data IfaceDecl + = IfaceId { ifName :: OccName, + ifType :: IfaceType, + ifIdInfo :: IfaceIdInfo } + + | IfaceData { ifName :: OccName, -- Type constructor + ifTyVars :: [IfaceTvBndr], -- Type variables + ifCtxt :: IfaceContext, -- The "stupid theta" + ifCons :: IfaceConDecls, -- Includes new/data info + ifRec :: RecFlag, -- Recursive or not? + ifVrcs :: ArgVrcs, + ifGeneric :: Bool -- True <=> generic converter functions available + } -- We need this for imported data decls, since the + -- imported modules may have been compiled with + -- different flags to the current compilation unit + + | IfaceSyn { ifName :: OccName, -- Type constructor + ifTyVars :: [IfaceTvBndr], -- Type variables + ifVrcs :: ArgVrcs, + ifSynRhs :: IfaceType -- synonym expansion + } + + | IfaceClass { ifCtxt :: IfaceContext, -- Context... 
+ ifName :: OccName, -- Name of the class + ifTyVars :: [IfaceTvBndr], -- Type variables + ifFDs :: [FunDep OccName], -- Functional dependencies + ifSigs :: [IfaceClassOp], -- Method signatures + ifRec :: RecFlag, -- Is newtype/datatype associated with the class recursive? + ifVrcs :: ArgVrcs -- ... and what are its argument variances ... + } + + | IfaceForeign { ifName :: OccName, -- Needs expanding when we move beyond .NET + ifExtName :: Maybe FastString } + +data IfaceClassOp = IfaceClassOp OccName DefMeth IfaceType + -- Nothing => no default method + -- Just False => ordinary polymorphic default method + -- Just True => generic default method + +data IfaceConDecls + = IfAbstractTyCon -- No info + | IfDataTyCon [IfaceConDecl] -- data type decls + | IfNewTyCon IfaceConDecl -- newtype decls + +visibleIfConDecls :: IfaceConDecls -> [IfaceConDecl] +visibleIfConDecls IfAbstractTyCon = [] +visibleIfConDecls (IfDataTyCon cs) = cs +visibleIfConDecls (IfNewTyCon c) = [c] + +data IfaceConDecl + = IfVanillaCon { + ifConOcc :: OccName, -- Constructor name + ifConInfix :: Bool, -- True <=> declared infix + ifConArgTys :: [IfaceType], -- Arg types + ifConStricts :: [StrictnessMark], -- Empty (meaning all lazy), or 1-1 corresp with arg types + ifConFields :: [OccName] } -- ...ditto... (field labels) + | IfGadtCon { + ifConOcc :: OccName, -- Constructor name + ifConTyVars :: [IfaceTvBndr], -- All tyvars + ifConCtxt :: IfaceContext, -- Non-stupid context + ifConArgTys :: [IfaceType], -- Arg types + ifConResTys :: [IfaceType], -- Result type args + ifConStricts :: [StrictnessMark] } -- Empty (meaning all lazy), or 1-1 corresp with arg types + +data IfaceInst + = IfaceInst { ifInstCls :: IfaceExtName, -- See comments with + ifInstTys :: [Maybe IfaceTyCon], -- the defn of Instance + ifDFun :: OccName, -- The dfun + ifOFlag :: OverlapFlag, -- Overlap flag + ifInstOrph :: Maybe OccName } -- See is_orph in defn of Instance + -- There's always a separate IfaceDecl for the DFun, which gives + -- its IdInfo with its full type and version number. + -- The instance declarations taken together have a version number, + -- and we don't want that to wobble gratuitously + -- If this instance decl is *used*, we'll record a usage on the dfun; + -- and if the head does not change it won't be used if it wasn't before + +data IfaceRule + = IfaceRule { + ifRuleName :: RuleName, + ifActivation :: Activation, + ifRuleBndrs :: [IfaceBndr], -- Tyvars and term vars + ifRuleHead :: IfaceExtName, -- Head of lhs + ifRuleArgs :: [IfaceExpr], -- Args of LHS + ifRuleRhs :: IfaceExpr, + ifRuleOrph :: Maybe OccName -- Just like IfaceInst + } + +data IfaceIdInfo + = NoInfo -- When writing interface file without -O + | HasInfo [IfaceInfoItem] -- Has info, and here it is + +-- Here's a tricky case: +-- * Compile with -O module A, and B which imports A.f +-- * Change function f in A, and recompile without -O +-- * When we read in old A.hi we read in its IdInfo (as a thunk) +-- (In earlier GHCs we used to drop IdInfo immediately on reading, +-- but we do not do that now. Instead it's discarded when the +-- ModIface is read into the various decl pools.) +-- * The version comparsion sees that new (=NoInfo) differs from old (=HasInfo *) +-- and so gives a new version. + +data IfaceInfoItem + = HsArity Arity + | HsStrictness StrictSig + | HsUnfold Activation IfaceExpr + | HsNoCafRefs + | HsWorker IfaceExtName Arity -- Worker, if any see IdInfo.WorkerInfo + -- for why we want arity here. 
+ -- NB: we need IfaceExtName (not just OccName) because the worker + -- can simplify to a function in another module. +-- NB: Specialisations and rules come in separately and are +-- only later attached to the Id. Partial reason: some are orphans. + +-------------------------------- +data IfaceExpr + = IfaceLcl OccName + | IfaceExt IfaceExtName + | IfaceType IfaceType + | IfaceTuple Boxity [IfaceExpr] -- Saturated; type arguments omitted + | IfaceLam IfaceBndr IfaceExpr + | IfaceApp IfaceExpr IfaceExpr + | IfaceCase IfaceExpr OccName IfaceType [IfaceAlt] + | IfaceLet IfaceBinding IfaceExpr + | IfaceNote IfaceNote IfaceExpr + | IfaceLit Literal + | IfaceFCall ForeignCall IfaceType + +data IfaceNote = IfaceSCC CostCentre + | IfaceCoerce IfaceType + | IfaceInlineCall + | IfaceInlineMe + | IfaceCoreNote String + +type IfaceAlt = (IfaceConAlt, [OccName], IfaceExpr) + -- Note: OccName, not IfaceBndr (and same with the case binder) + -- We reconstruct the kind/type of the thing from the context + -- thus saving bulk in interface files + +data IfaceConAlt = IfaceDefault + | IfaceDataAlt OccName + | IfaceTupleAlt Boxity + | IfaceLitAlt Literal + +data IfaceBinding + = IfaceNonRec IfaceIdBndr IfaceExpr + | IfaceRec [(IfaceIdBndr, IfaceExpr)] +\end{code} + + +%************************************************************************ +%* * +\subsection[HsCore-print]{Printing Core unfoldings} +%* * +%************************************************************************ + +----------------------------- Printing IfaceDecl ------------------------------------ + +\begin{code} +instance Outputable IfaceDecl where + ppr = pprIfaceDecl + +pprIfaceDecl (IfaceId {ifName = var, ifType = ty, ifIdInfo = info}) + = sep [ ppr var <+> dcolon <+> ppr ty, + nest 2 (ppr info) ] + +pprIfaceDecl (IfaceForeign {ifName = tycon}) + = hsep [ptext SLIT("foreign import type dotnet"), ppr tycon] + +pprIfaceDecl (IfaceSyn {ifName = tycon, ifTyVars = tyvars, ifSynRhs = mono_ty, ifVrcs = vrcs}) + = hang (ptext SLIT("type") <+> pprIfaceDeclHead [] tycon tyvars) + 4 (vcat [equals <+> ppr mono_ty, + pprVrcs vrcs]) + +pprIfaceDecl (IfaceData {ifName = tycon, ifGeneric = gen, ifCtxt = context, + ifTyVars = tyvars, ifCons = condecls, + ifRec = isrec, ifVrcs = vrcs}) + = hang (pp_nd <+> pprIfaceDeclHead context tycon tyvars) + 4 (vcat [pprVrcs vrcs, pprRec isrec, pprGen gen, pp_condecls tycon condecls]) + where + pp_nd = case condecls of + IfAbstractTyCon -> ptext SLIT("data") + IfDataTyCon _ -> ptext SLIT("data") + IfNewTyCon _ -> ptext SLIT("newtype") + +pprIfaceDecl (IfaceClass {ifCtxt = context, ifName = clas, ifTyVars = tyvars, + ifFDs = fds, ifSigs = sigs, ifVrcs = vrcs, ifRec = isrec}) + = hang (ptext SLIT("class") <+> pprIfaceDeclHead context clas tyvars <+> pprFundeps fds) + 4 (vcat [pprVrcs vrcs, + pprRec isrec, + sep (map ppr sigs)]) + +pprVrcs vrcs = ptext SLIT("Variances") <+> ppr vrcs +pprRec isrec = ptext SLIT("RecFlag") <+> ppr isrec +pprGen True = ptext SLIT("Generics: yes") +pprGen False = ptext SLIT("Generics: no") + +instance Outputable IfaceClassOp where + ppr (IfaceClassOp n dm ty) = ppr n <+> ppr dm <+> dcolon <+> ppr ty + +pprIfaceDeclHead :: IfaceContext -> OccName -> [IfaceTvBndr] -> SDoc +pprIfaceDeclHead context thing tyvars + = hsep [pprIfaceContext context, parenSymOcc thing (ppr thing), pprIfaceTvBndrs tyvars] + +pp_condecls tc IfAbstractTyCon = ptext SLIT("{- abstract -}") +pp_condecls tc (IfNewTyCon c) = equals <+> pprIfaceConDecl tc c +pp_condecls tc (IfDataTyCon cs) = equals <+> sep (punctuate (ptext 
SLIT(" |")) + (map (pprIfaceConDecl tc) cs)) + +pprIfaceConDecl tc (IfVanillaCon { + ifConOcc = name, ifConInfix = is_infix, + ifConArgTys = arg_tys, + ifConStricts = strs, ifConFields = fields }) + = sep [ppr name <+> sep (map pprParendIfaceType arg_tys), + if is_infix then ptext SLIT("Infix") else empty, + if null strs then empty + else nest 4 (ptext SLIT("Stricts:") <+> hsep (map ppr strs)), + if null fields then empty + else nest 4 (ptext SLIT("Fields:") <+> hsep (map ppr fields))] + +pprIfaceConDecl tc (IfGadtCon { + ifConOcc = name, + ifConTyVars = tvs, ifConCtxt = ctxt, + ifConArgTys = arg_tys, ifConResTys = res_tys, + ifConStricts = strs }) + = sep [ppr name <+> dcolon <+> pprIfaceForAllPart tvs ctxt (ppr con_tau), + if null strs then empty + else nest 4 (ptext SLIT("Stricts:") <+> hsep (map ppr strs))] + where + con_tau = foldr1 IfaceFunTy (arg_tys ++ [tc_app]) + tc_app = IfaceTyConApp (IfaceTc (LocalTop tc)) res_tys + -- Gruesome, but jsut for debug print + +instance Outputable IfaceRule where + ppr (IfaceRule { ifRuleName = name, ifActivation = act, ifRuleBndrs = bndrs, + ifRuleHead = fn, ifRuleArgs = args, ifRuleRhs = rhs }) + = sep [hsep [doubleQuotes (ftext name), ppr act, + ptext SLIT("forall") <+> pprIfaceBndrs bndrs], + nest 2 (sep [ppr fn <+> sep (map (pprIfaceExpr parens) args), + ptext SLIT("=") <+> ppr rhs]) + ] + +instance Outputable IfaceInst where + ppr (IfaceInst {ifDFun = dfun_id, ifOFlag = flag, + ifInstCls = cls, ifInstTys = mb_tcs}) + = hang (ptext SLIT("instance") <+> ppr flag + <+> ppr cls <+> brackets (pprWithCommas ppr_mb mb_tcs)) + 2 (equals <+> ppr dfun_id) + where + ppr_mb Nothing = dot + ppr_mb (Just tc) = ppr tc +\end{code} + + +----------------------------- Printing IfaceExpr ------------------------------------ + +\begin{code} +instance Outputable IfaceExpr where + ppr e = pprIfaceExpr noParens e + +pprIfaceExpr :: (SDoc -> SDoc) -> IfaceExpr -> SDoc + -- The function adds parens in context that need + -- an atomic value (e.g. 
function args) + +pprIfaceExpr add_par (IfaceLcl v) = ppr v +pprIfaceExpr add_par (IfaceExt v) = ppr v +pprIfaceExpr add_par (IfaceLit l) = ppr l +pprIfaceExpr add_par (IfaceFCall cc ty) = braces (ppr cc <+> ppr ty) +pprIfaceExpr add_par (IfaceType ty) = char '@' <+> pprParendIfaceType ty + +pprIfaceExpr add_par app@(IfaceApp _ _) = add_par (pprIfaceApp app []) +pprIfaceExpr add_par (IfaceTuple c as) = tupleParens c (interpp'SP as) + +pprIfaceExpr add_par e@(IfaceLam _ _) + = add_par (sep [char '\\' <+> sep (map ppr bndrs) <+> arrow, + pprIfaceExpr noParens body]) + where + (bndrs,body) = collect [] e + collect bs (IfaceLam b e) = collect (b:bs) e + collect bs e = (reverse bs, e) + +-- gaw 2004 +pprIfaceExpr add_par (IfaceCase scrut bndr ty [(con, bs, rhs)]) +-- gaw 2004 + = add_par (sep [ptext SLIT("case") <+> char '@' <+> pprParendIfaceType ty <+> pprIfaceExpr noParens scrut <+> ptext SLIT("of") + <+> ppr bndr <+> char '{' <+> ppr_con_bs con bs <+> arrow, + pprIfaceExpr noParens rhs <+> char '}']) + +-- gaw 2004 +pprIfaceExpr add_par (IfaceCase scrut bndr ty alts) +-- gaw 2004 + = add_par (sep [ptext SLIT("case") <+> char '@' <+> pprParendIfaceType ty <+> pprIfaceExpr noParens scrut <+> ptext SLIT("of") + <+> ppr bndr <+> char '{', + nest 2 (sep (map ppr_alt alts)) <+> char '}']) + +pprIfaceExpr add_par (IfaceLet (IfaceNonRec b rhs) body) + = add_par (sep [ptext SLIT("let {"), + nest 2 (ppr_bind (b, rhs)), + ptext SLIT("} in"), + pprIfaceExpr noParens body]) + +pprIfaceExpr add_par (IfaceLet (IfaceRec pairs) body) + = add_par (sep [ptext SLIT("letrec {"), + nest 2 (sep (map ppr_bind pairs)), + ptext SLIT("} in"), + pprIfaceExpr noParens body]) + +pprIfaceExpr add_par (IfaceNote note body) = add_par (ppr note <+> pprIfaceExpr parens body) + +ppr_alt (con, bs, rhs) = sep [ppr_con_bs con bs, + arrow <+> pprIfaceExpr noParens rhs] + +ppr_con_bs (IfaceTupleAlt tup_con) bs = tupleParens tup_con (interpp'SP bs) +ppr_con_bs con bs = ppr con <+> hsep (map ppr bs) + +ppr_bind ((b,ty),rhs) = sep [ppr b <+> dcolon <+> ppr ty, + equals <+> pprIfaceExpr noParens rhs] + +------------------ +pprIfaceApp (IfaceApp fun arg) args = pprIfaceApp fun (nest 2 (pprIfaceExpr parens arg) : args) +pprIfaceApp fun args = sep (pprIfaceExpr parens fun : args) + +------------------ +instance Outputable IfaceNote where + ppr (IfaceSCC cc) = pprCostCentreCore cc + ppr (IfaceCoerce ty) = ptext SLIT("__coerce") <+> pprParendIfaceType ty + ppr IfaceInlineCall = ptext SLIT("__inline_call") + ppr IfaceInlineMe = ptext SLIT("__inline_me") + ppr (IfaceCoreNote s) = ptext SLIT("__core_note") <+> pprHsString (mkFastString s) + +instance Outputable IfaceConAlt where + ppr IfaceDefault = text "DEFAULT" + ppr (IfaceLitAlt l) = ppr l + ppr (IfaceDataAlt d) = ppr d + ppr (IfaceTupleAlt b) = panic "ppr IfaceConAlt" + -- IfaceTupleAlt is handled by the case-alternative printer + +------------------ +instance Outputable IfaceIdInfo where + ppr NoInfo = empty + ppr (HasInfo is) = ptext SLIT("{-") <+> fsep (map ppr_hs_info is) <+> ptext SLIT("-}") + +ppr_hs_info (HsUnfold prag unf) = sep [ptext SLIT("Unfolding: ") <> ppr prag, + parens (pprIfaceExpr noParens unf)] +ppr_hs_info (HsArity arity) = ptext SLIT("Arity:") <+> int arity +ppr_hs_info (HsStrictness str) = ptext SLIT("Strictness:") <+> pprIfaceStrictSig str +ppr_hs_info HsNoCafRefs = ptext SLIT("HasNoCafRefs") +ppr_hs_info (HsWorker w a) = ptext SLIT("Worker:") <+> ppr w <+> int a +\end{code} + + +%************************************************************************ +%* * + 
Converting things to their Iface equivalents +%* * +%************************************************************************ + + +\begin{code} +tyThingToIfaceDecl :: (Name -> IfaceExtName) -> TyThing -> IfaceDecl +-- Assumption: the thing is already tidied, so that locally-bound names +-- (lambdas, for-alls) already have non-clashing OccNames +-- Reason: Iface stuff uses OccNames, and the conversion here does +-- not do tidying on the way +tyThingToIfaceDecl ext (AnId id) + = IfaceId { ifName = getOccName id, + ifType = toIfaceType ext (idType id), + ifIdInfo = info } + where + info = case toIfaceIdInfo ext (idInfo id) of + [] -> NoInfo + items -> HasInfo items + +tyThingToIfaceDecl ext (AClass clas) + = IfaceClass { ifCtxt = toIfaceContext ext sc_theta, + ifName = getOccName clas, + ifTyVars = toIfaceTvBndrs clas_tyvars, + ifFDs = map toIfaceFD clas_fds, + ifSigs = map toIfaceClassOp op_stuff, + ifRec = boolToRecFlag (isRecursiveTyCon tycon), + ifVrcs = tyConArgVrcs tycon } + where + (clas_tyvars, clas_fds, sc_theta, _, op_stuff) = classExtraBigSig clas + tycon = classTyCon clas + + toIfaceClassOp (sel_id, def_meth) + = ASSERT(sel_tyvars == clas_tyvars) + IfaceClassOp (getOccName sel_id) def_meth (toIfaceType ext op_ty) + where + -- Be careful when splitting the type, because of things + -- like class Foo a where + -- op :: (?x :: String) => a -> a + -- and class Baz a where + -- op :: (Ord a) => a -> a + (sel_tyvars, rho_ty) = splitForAllTys (idType sel_id) + op_ty = funResultTy rho_ty + + toIfaceFD (tvs1, tvs2) = (map getOccName tvs1, map getOccName tvs2) + +tyThingToIfaceDecl ext (ATyCon tycon) + | isSynTyCon tycon + = IfaceSyn { ifName = getOccName tycon, + ifTyVars = toIfaceTvBndrs tyvars, + ifVrcs = tyConArgVrcs tycon, + ifSynRhs = toIfaceType ext syn_ty } + + | isAlgTyCon tycon + = IfaceData { ifName = getOccName tycon, + ifTyVars = toIfaceTvBndrs tyvars, + ifCtxt = toIfaceContext ext (tyConStupidTheta tycon), + ifCons = ifaceConDecls (algTyConRhs tycon), + ifRec = boolToRecFlag (isRecursiveTyCon tycon), + ifVrcs = tyConArgVrcs tycon, + ifGeneric = tyConHasGenerics tycon } + + | isForeignTyCon tycon + = IfaceForeign { ifName = getOccName tycon, + ifExtName = tyConExtName tycon } + + | isPrimTyCon tycon || isFunTyCon tycon + -- Needed in GHCi for ':info Int#', for example + = IfaceData { ifName = getOccName tycon, + ifTyVars = toIfaceTvBndrs (take (tyConArity tycon) alphaTyVars), + ifCtxt = [], + ifCons = IfAbstractTyCon, + ifGeneric = False, + ifRec = NonRecursive, + ifVrcs = tyConArgVrcs tycon } + + | otherwise = pprPanic "toIfaceDecl" (ppr tycon) + where + tyvars = tyConTyVars tycon + syn_ty = synTyConRhs tycon + + ifaceConDecls (NewTyCon { data_con = con }) = IfNewTyCon (ifaceConDecl con) + ifaceConDecls (DataTyCon { data_cons = cons }) = IfDataTyCon (map ifaceConDecl cons) + ifaceConDecls AbstractTyCon = IfAbstractTyCon + -- The last case happens when a TyCon has been trimmed during tidying + -- Furthermore, tyThingToIfaceDecl is also used + -- in TcRnDriver for GHCi, when browsing a module, in which case the + -- AbstractTyCon case is perfectly sensible. 
+ + ifaceConDecl data_con + | isVanillaDataCon data_con + = IfVanillaCon {ifConOcc = getOccName (dataConName data_con), + ifConInfix = dataConIsInfix data_con, + ifConArgTys = map (toIfaceType ext) arg_tys, + ifConStricts = strict_marks, + ifConFields = map getOccName field_labels } + | otherwise + = IfGadtCon { ifConOcc = getOccName (dataConName data_con), + ifConTyVars = toIfaceTvBndrs tyvars, + ifConCtxt = toIfaceContext ext theta, + ifConArgTys = map (toIfaceType ext) arg_tys, + ifConResTys = map (toIfaceType ext) res_tys, + ifConStricts = strict_marks } + where + (tyvars, theta, arg_tys, _, res_tys) = dataConSig data_con + field_labels = dataConFieldLabels data_con + strict_marks = dataConStrictMarks data_con + +tyThingToIfaceDecl ext (ADataCon dc) + = pprPanic "toIfaceDecl" (ppr dc) -- Should be trimmed out earlier + + +-------------------------- +instanceToIfaceInst :: (Name -> IfaceExtName) -> Instance -> IfaceInst +instanceToIfaceInst ext_lhs ispec@(Instance { is_dfun = dfun_id, is_flag = oflag, + is_cls = cls, is_tcs = mb_tcs, + is_orph = orph }) + = IfaceInst { ifDFun = getOccName dfun_id, + ifOFlag = oflag, + ifInstCls = ext_lhs cls, + ifInstTys = map do_rough mb_tcs, + ifInstOrph = orph } + where + do_rough Nothing = Nothing + do_rough (Just n) = Just (toIfaceTyCon_name ext_lhs n) + +-------------------------- +toIfaceIdInfo :: (Name -> IfaceExtName) -> IdInfo -> [IfaceInfoItem] +toIfaceIdInfo ext id_info + = catMaybes [arity_hsinfo, caf_hsinfo, strict_hsinfo, + wrkr_hsinfo, unfold_hsinfo] + where + ------------ Arity -------------- + arity_info = arityInfo id_info + arity_hsinfo | arity_info == 0 = Nothing + | otherwise = Just (HsArity arity_info) + + ------------ Caf Info -------------- + caf_info = cafInfo id_info + caf_hsinfo = case caf_info of + NoCafRefs -> Just HsNoCafRefs + _other -> Nothing + + ------------ Strictness -------------- + -- No point in explicitly exporting TopSig + strict_hsinfo = case newStrictnessInfo id_info of + Just sig | not (isTopSig sig) -> Just (HsStrictness sig) + _other -> Nothing + + ------------ Worker -------------- + work_info = workerInfo id_info + has_worker = case work_info of { HasWorker _ _ -> True; other -> False } + wrkr_hsinfo = case work_info of + HasWorker work_id wrap_arity -> + Just (HsWorker (ext (idName work_id)) wrap_arity) + NoWorker -> Nothing + + ------------ Unfolding -------------- + -- The unfolding is redundant if there is a worker + unfold_info = unfoldingInfo id_info + inline_prag = inlinePragInfo id_info + rhs = unfoldingTemplate unfold_info + unfold_hsinfo | neverUnfold unfold_info + || has_worker = Nothing + | otherwise = Just (HsUnfold inline_prag (toIfaceExpr ext rhs)) + +-------------------------- +coreRuleToIfaceRule :: (Name -> IfaceExtName) -- For the LHS names + -> (Name -> IfaceExtName) -- For the RHS names + -> CoreRule -> IfaceRule +coreRuleToIfaceRule ext_lhs ext_rhs (BuiltinRule { ru_fn = fn}) + = pprTrace "toHsRule: builtin" (ppr fn) $ + bogusIfaceRule (mkIfaceExtName fn) + +coreRuleToIfaceRule ext_lhs ext_rhs + (Rule { ru_name = name, ru_fn = fn, ru_act = act, ru_bndrs = bndrs, + ru_args = args, ru_rhs = rhs, ru_orph = orph }) + = IfaceRule { ifRuleName = name, ifActivation = act, + ifRuleBndrs = map (toIfaceBndr ext_lhs) bndrs, + ifRuleHead = ext_lhs fn, + ifRuleArgs = map do_arg args, + ifRuleRhs = toIfaceExpr ext_rhs rhs, + ifRuleOrph = orph } + where + -- For type args we must remove synonyms from the outermost + -- level. 
Reason: so that when we read it back in we'll + -- construct the same ru_rough field as we have right now; + -- see tcIfaceRule + do_arg (Type ty) = IfaceType (toIfaceType ext_lhs (deNoteType ty)) + do_arg arg = toIfaceExpr ext_lhs arg + +bogusIfaceRule :: IfaceExtName -> IfaceRule +bogusIfaceRule id_name + = IfaceRule { ifRuleName = FSLIT("bogus"), ifActivation = NeverActive, + ifRuleBndrs = [], ifRuleHead = id_name, ifRuleArgs = [], + ifRuleRhs = IfaceExt id_name, ifRuleOrph = Nothing } + +--------------------- +toIfaceExpr :: (Name -> IfaceExtName) -> CoreExpr -> IfaceExpr +toIfaceExpr ext (Var v) = toIfaceVar ext v +toIfaceExpr ext (Lit l) = IfaceLit l +toIfaceExpr ext (Type ty) = IfaceType (toIfaceType ext ty) +toIfaceExpr ext (Lam x b) = IfaceLam (toIfaceBndr ext x) (toIfaceExpr ext b) +toIfaceExpr ext (App f a) = toIfaceApp ext f [a] +-- gaw 2004 +toIfaceExpr ext (Case s x ty as) = IfaceCase (toIfaceExpr ext s) (getOccName x) (toIfaceType ext ty) (map (toIfaceAlt ext) as) +toIfaceExpr ext (Let b e) = IfaceLet (toIfaceBind ext b) (toIfaceExpr ext e) +toIfaceExpr ext (Note n e) = IfaceNote (toIfaceNote ext n) (toIfaceExpr ext e) + +--------------------- +toIfaceNote ext (SCC cc) = IfaceSCC cc +toIfaceNote ext (Coerce t1 _) = IfaceCoerce (toIfaceType ext t1) +toIfaceNote ext InlineCall = IfaceInlineCall +toIfaceNote ext InlineMe = IfaceInlineMe +toIfaceNote ext (CoreNote s) = IfaceCoreNote s + +--------------------- +toIfaceBind ext (NonRec b r) = IfaceNonRec (toIfaceIdBndr ext b) (toIfaceExpr ext r) +toIfaceBind ext (Rec prs) = IfaceRec [(toIfaceIdBndr ext b, toIfaceExpr ext r) | (b,r) <- prs] + +--------------------- +toIfaceAlt ext (c,bs,r) = (toIfaceCon c, map getOccName bs, toIfaceExpr ext r) + +--------------------- +toIfaceCon (DataAlt dc) | isTupleTyCon tc = IfaceTupleAlt (tupleTyConBoxity tc) + | otherwise = IfaceDataAlt (getOccName dc) + where + tc = dataConTyCon dc + +toIfaceCon (LitAlt l) = IfaceLitAlt l +toIfaceCon DEFAULT = IfaceDefault + +--------------------- +toIfaceApp ext (App f a) as = toIfaceApp ext f (a:as) +toIfaceApp ext (Var v) as + = case isDataConWorkId_maybe v of + -- We convert the *worker* for tuples into IfaceTuples + Just dc | isTupleTyCon tc && saturated + -> IfaceTuple (tupleTyConBoxity tc) tup_args + where + val_args = dropWhile isTypeArg as + saturated = val_args `lengthIs` idArity v + tup_args = map (toIfaceExpr ext) val_args + tc = dataConTyCon dc + + other -> mkIfaceApps ext (toIfaceVar ext v) as + +toIfaceApp ext e as = mkIfaceApps ext (toIfaceExpr ext e) as + +mkIfaceApps ext f as = foldl (\f a -> IfaceApp f (toIfaceExpr ext a)) f as + +--------------------- +toIfaceVar :: (Name -> IfaceExtName) -> Id -> IfaceExpr +toIfaceVar ext v + | Just fcall <- isFCallId_maybe v = IfaceFCall fcall (toIfaceType ext (idType v)) + -- Foreign calls have special syntax + | isExternalName name = IfaceExt (ext name) + | otherwise = IfaceLcl (nameOccName name) + where + name = idName v +\end{code} + + +%************************************************************************ +%* * + Equality, for interface file version generaion only +%* * +%************************************************************************ + +Equality over IfaceSyn returns an IfaceEq, not a Bool. The new constructor is +EqBut, which gives the set of *locally-defined* things whose version must be equal +for the whole thing to be equal. So the key function is eqIfExt, which compares +IfaceExtNames. + +Of course, equality is also done modulo alpha conversion. 
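+
+For example (a schematic illustration of the definitions in the code
+below, writing {f} informally for the OccSet containing just f):
+	eqIfExt (LocalTop f) (LocalTop f)  =  EqBut {f}
+	EqBut {f} &&& EqBut {g}            =  EqBut {f,g}
+	EqBut {f} &&& NotEqual             =  NotEqual
+So a comparison yields EqBut {f,g} precisely when the only possible
+source of difference is a change in the locally-defined things f and
+g, whose versions the version-comparison code must then consult.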
+ +\begin{code} +data IfaceEq + = Equal -- Definitely exactly the same + | NotEqual -- Definitely different + | EqBut OccSet -- The same provided these local things have not changed + +bool :: Bool -> IfaceEq +bool True = Equal +bool False = NotEqual + +zapEq :: IfaceEq -> IfaceEq -- Used to forget EqBut information +zapEq (EqBut _) = Equal +zapEq other = other + +(&&&) :: IfaceEq -> IfaceEq -> IfaceEq +Equal &&& x = x +NotEqual &&& x = NotEqual +EqBut occs &&& Equal = EqBut occs +EqBut occs &&& NotEqual = NotEqual +EqBut occs1 &&& EqBut occs2 = EqBut (occs1 `unionOccSets` occs2) + +--------------------- +eqIfExt :: IfaceExtName -> IfaceExtName -> IfaceEq +-- This function is the core of the EqBut stuff +eqIfExt (ExtPkg mod1 occ1) (ExtPkg mod2 occ2) = bool (mod1==mod2 && occ1==occ2) +eqIfExt (HomePkg mod1 occ1 v1) (HomePkg mod2 occ2 v2) = bool (mod1==mod2 && occ1==occ2 && v1==v2) +eqIfExt (LocalTop occ1) (LocalTop occ2) | occ1 == occ2 = EqBut (unitOccSet occ1) +eqIfExt (LocalTopSub occ1 p1) (LocalTop occ2) | occ1 == occ2 = EqBut (unitOccSet p1) +eqIfExt (LocalTopSub occ1 p1) (LocalTopSub occ2 _) | occ1 == occ2 = EqBut (unitOccSet p1) +eqIfExt n1 n2 = NotEqual +\end{code} + + +\begin{code} +--------------------- +eqIfDecl :: IfaceDecl -> IfaceDecl -> IfaceEq +eqIfDecl (IfaceId s1 t1 i1) (IfaceId s2 t2 i2) + = bool (s1 == s2) &&& (t1 `eqIfType` t2) &&& (i1 `eqIfIdInfo` i2) + +eqIfDecl d1@(IfaceForeign {}) d2@(IfaceForeign {}) + = bool (ifName d1 == ifName d2 && ifExtName d1 == ifExtName d2) + +eqIfDecl d1@(IfaceData {}) d2@(IfaceData {}) + = bool (ifName d1 == ifName d2 && + ifRec d1 == ifRec d2 && + ifVrcs d1 == ifVrcs d2 && + ifGeneric d1 == ifGeneric d2) &&& + eqWith (ifTyVars d1) (ifTyVars d2) (\ env -> + eq_ifContext env (ifCtxt d1) (ifCtxt d2) &&& + eq_hsCD env (ifCons d1) (ifCons d2) + ) + -- The type variables of the data type do not scope + -- over the constructors (any more), but they do scope + -- over the stupid context in the IfaceConDecls + +eqIfDecl d1@(IfaceSyn {}) d2@(IfaceSyn {}) + = bool (ifName d1 == ifName d2) &&& + eqWith (ifTyVars d1) (ifTyVars d2) (\ env -> + eq_ifType env (ifSynRhs d1) (ifSynRhs d2) + ) + +eqIfDecl d1@(IfaceClass {}) d2@(IfaceClass {}) + = bool (ifName d1 == ifName d2 && + ifRec d1 == ifRec d2 && + ifVrcs d1 == ifVrcs d2) &&& + eqWith (ifTyVars d1) (ifTyVars d2) (\ env -> + eq_ifContext env (ifCtxt d1) (ifCtxt d2) &&& + eqListBy (eq_hsFD env) (ifFDs d1) (ifFDs d2) &&& + eqListBy (eq_cls_sig env) (ifSigs d1) (ifSigs d2) + ) + +eqIfDecl _ _ = NotEqual -- default case + +-- Helper +eqWith :: [IfaceTvBndr] -> [IfaceTvBndr] -> (EqEnv -> IfaceEq) -> IfaceEq +eqWith = eq_ifTvBndrs emptyEqEnv + +----------------------- +eqIfInst d1 d2 = bool (ifDFun d1 == ifDFun d2) +-- All other changes are handled via the version info on the dfun + +eqIfRule (IfaceRule n1 a1 bs1 f1 es1 rhs1 o1) + (IfaceRule n2 a2 bs2 f2 es2 rhs2 o2) + = bool (n1==n2 && a1==a2 && o1 == o2) &&& + f1 `eqIfExt` f2 &&& + eq_ifBndrs emptyEqEnv bs1 bs2 (\env -> + zapEq (eqListBy (eq_ifaceExpr env) es1 es2) &&& + -- zapEq: for the LHSs, ignore the EqBut part + eq_ifaceExpr env rhs1 rhs2) + +eq_hsCD env (IfDataTyCon c1) (IfDataTyCon c2) + = eqListBy (eq_ConDecl env) c1 c2 + +eq_hsCD env (IfNewTyCon c1) (IfNewTyCon c2) = eq_ConDecl env c1 c2 +eq_hsCD env IfAbstractTyCon IfAbstractTyCon = Equal +eq_hsCD env d1 d2 = NotEqual + +eq_ConDecl env c1@(IfVanillaCon {}) c2@(IfVanillaCon {}) + = bool (ifConOcc c1 == ifConOcc c2 && + ifConInfix c1 == ifConInfix c2 && + ifConStricts c1 == ifConStricts c2 && + 
ifConFields c1 == ifConFields c2) &&& + eq_ifTypes env (ifConArgTys c1) (ifConArgTys c2) + +eq_ConDecl env c1@(IfGadtCon {}) c2@(IfGadtCon {}) + = bool (ifConOcc c1 == ifConOcc c2 && + ifConStricts c1 == ifConStricts c2) &&& + eq_ifTvBndrs env (ifConTyVars c1) (ifConTyVars c2) (\ env -> + eq_ifContext env (ifConCtxt c1) (ifConCtxt c2) &&& + eq_ifTypes env (ifConResTys c1) (ifConResTys c2) &&& + eq_ifTypes env (ifConArgTys c1) (ifConArgTys c2)) + +eq_ConDecl env c1 c2 = NotEqual + +eq_hsFD env (ns1,ms1) (ns2,ms2) + = eqListBy (eqIfOcc env) ns1 ns2 &&& eqListBy (eqIfOcc env) ms1 ms2 + +eq_cls_sig env (IfaceClassOp n1 dm1 ty1) (IfaceClassOp n2 dm2 ty2) + = bool (n1==n2 && dm1 == dm2) &&& eq_ifType env ty1 ty2 +\end{code} + + +\begin{code} +----------------- +eqIfIdInfo NoInfo NoInfo = Equal +eqIfIdInfo (HasInfo is1) (HasInfo is2) = eqListBy eq_item is1 is2 +eqIfIdInfo i1 i2 = NotEqual + +eq_item (HsArity a1) (HsArity a2) = bool (a1 == a2) +eq_item (HsStrictness s1) (HsStrictness s2) = bool (s1 == s2) +eq_item (HsUnfold a1 u1) (HsUnfold a2 u2) = bool (a1 == a2) &&& eq_ifaceExpr emptyEqEnv u1 u2 +eq_item HsNoCafRefs HsNoCafRefs = Equal +eq_item (HsWorker wkr1 a1) (HsWorker wkr2 a2) = bool (a1==a2) &&& (wkr1 `eqIfExt` wkr2) +eq_item _ _ = NotEqual + +----------------- +eq_ifaceExpr :: EqEnv -> IfaceExpr -> IfaceExpr -> IfaceEq +eq_ifaceExpr env (IfaceLcl v1) (IfaceLcl v2) = eqIfOcc env v1 v2 +eq_ifaceExpr env (IfaceExt v1) (IfaceExt v2) = eqIfExt v1 v2 +eq_ifaceExpr env (IfaceLit l1) (IfaceLit l2) = bool (l1 == l2) +eq_ifaceExpr env (IfaceFCall c1 ty1) (IfaceFCall c2 ty2) = bool (c1==c2) &&& eq_ifType env ty1 ty2 +eq_ifaceExpr env (IfaceType ty1) (IfaceType ty2) = eq_ifType env ty1 ty2 +eq_ifaceExpr env (IfaceTuple n1 as1) (IfaceTuple n2 as2) = bool (n1==n2) &&& eqListBy (eq_ifaceExpr env) as1 as2 +eq_ifaceExpr env (IfaceLam b1 body1) (IfaceLam b2 body2) = eq_ifBndr env b1 b2 (\env -> eq_ifaceExpr env body1 body2) +eq_ifaceExpr env (IfaceApp f1 a1) (IfaceApp f2 a2) = eq_ifaceExpr env f1 f2 &&& eq_ifaceExpr env a1 a2 +eq_ifaceExpr env (IfaceNote n1 r1) (IfaceNote n2 r2) = eq_ifaceNote env n1 n2 &&& eq_ifaceExpr env r1 r2 + +eq_ifaceExpr env (IfaceCase s1 b1 ty1 as1) (IfaceCase s2 b2 ty2 as2) + = eq_ifaceExpr env s1 s2 &&& + eq_ifType env ty1 ty2 &&& + eq_ifNakedBndr env b1 b2 (\env -> eqListBy (eq_ifaceAlt env) as1 as2) + where + eq_ifaceAlt env (c1,bs1,r1) (c2,bs2,r2) + = bool (eq_ifaceConAlt c1 c2) &&& + eq_ifNakedBndrs env bs1 bs2 (\env -> eq_ifaceExpr env r1 r2) + +eq_ifaceExpr env (IfaceLet (IfaceNonRec b1 r1) x1) (IfaceLet (IfaceNonRec b2 r2) x2) + = eq_ifaceExpr env r1 r2 &&& eq_ifIdBndr env b1 b2 (\env -> eq_ifaceExpr env x1 x2) + +eq_ifaceExpr env (IfaceLet (IfaceRec as1) x1) (IfaceLet (IfaceRec as2) x2) + = eq_ifIdBndrs env bs1 bs2 (\env -> eqListBy (eq_ifaceExpr env) rs1 rs2 &&& eq_ifaceExpr env x1 x2) + where + (bs1,rs1) = unzip as1 + (bs2,rs2) = unzip as2 + + +eq_ifaceExpr env _ _ = NotEqual + +----------------- +eq_ifaceConAlt :: IfaceConAlt -> IfaceConAlt -> Bool +eq_ifaceConAlt IfaceDefault IfaceDefault = True +eq_ifaceConAlt (IfaceDataAlt n1) (IfaceDataAlt n2) = n1==n2 +eq_ifaceConAlt (IfaceTupleAlt c1) (IfaceTupleAlt c2) = c1==c2 +eq_ifaceConAlt (IfaceLitAlt l1) (IfaceLitAlt l2) = l1==l2 +eq_ifaceConAlt _ _ = False + +----------------- +eq_ifaceNote :: EqEnv -> IfaceNote -> IfaceNote -> IfaceEq +eq_ifaceNote env (IfaceSCC c1) (IfaceSCC c2) = bool (c1==c2) +eq_ifaceNote env (IfaceCoerce t1) (IfaceCoerce t2) = eq_ifType env t1 t2 +eq_ifaceNote env IfaceInlineCall IfaceInlineCall = 
Equal +eq_ifaceNote env IfaceInlineMe IfaceInlineMe = Equal +eq_ifaceNote env (IfaceCoreNote s1) (IfaceCoreNote s2) = bool (s1==s2) +eq_ifaceNote env _ _ = NotEqual +\end{code} + +\begin{code} +--------------------- +eqIfType t1 t2 = eq_ifType emptyEqEnv t1 t2 + +------------------- +eq_ifType env (IfaceTyVar n1) (IfaceTyVar n2) = eqIfOcc env n1 n2 +eq_ifType env (IfaceAppTy s1 t1) (IfaceAppTy s2 t2) = eq_ifType env s1 s2 &&& eq_ifType env t1 t2 +eq_ifType env (IfacePredTy st1) (IfacePredTy st2) = eq_ifPredType env st1 st2 +eq_ifType env (IfaceTyConApp tc1 ts1) (IfaceTyConApp tc2 ts2) = tc1 `eqIfTc` tc2 &&& eq_ifTypes env ts1 ts2 +eq_ifType env (IfaceForAllTy tv1 t1) (IfaceForAllTy tv2 t2) = eq_ifTvBndr env tv1 tv2 (\env -> eq_ifType env t1 t2) +eq_ifType env (IfaceFunTy s1 t1) (IfaceFunTy s2 t2) = eq_ifType env s1 s2 &&& eq_ifType env t1 t2 +eq_ifType env _ _ = NotEqual + +------------------- +eq_ifTypes env = eqListBy (eq_ifType env) + +------------------- +eq_ifContext env a b = eqListBy (eq_ifPredType env) a b + +------------------- +eq_ifPredType env (IfaceClassP c1 tys1) (IfaceClassP c2 tys2) = c1 `eqIfExt` c2 &&& eq_ifTypes env tys1 tys2 +eq_ifPredType env (IfaceIParam n1 ty1) (IfaceIParam n2 ty2) = bool (n1 == n2) &&& eq_ifType env ty1 ty2 +eq_ifPredType env _ _ = NotEqual + +------------------- +eqIfTc (IfaceTc tc1) (IfaceTc tc2) = tc1 `eqIfExt` tc2 +eqIfTc IfaceIntTc IfaceIntTc = Equal +eqIfTc IfaceCharTc IfaceCharTc = Equal +eqIfTc IfaceBoolTc IfaceBoolTc = Equal +eqIfTc IfaceListTc IfaceListTc = Equal +eqIfTc IfacePArrTc IfacePArrTc = Equal +eqIfTc (IfaceTupTc bx1 ar1) (IfaceTupTc bx2 ar2) = bool (bx1==bx2 && ar1==ar2) +eqIfTc _ _ = NotEqual +\end{code} + +----------------------------------------------------------- + Support code for equality checking +----------------------------------------------------------- + +\begin{code} +------------------------------------ +type EqEnv = OccEnv OccName -- Tracks the mapping from L-variables to R-variables + +eqIfOcc :: EqEnv -> OccName -> OccName -> IfaceEq +eqIfOcc env n1 n2 = case lookupOccEnv env n1 of + Just n1 -> bool (n1 == n2) + Nothing -> bool (n1 == n2) + +extendEqEnv :: EqEnv -> OccName -> OccName -> EqEnv +extendEqEnv env n1 n2 | n1 == n2 = env + | otherwise = extendOccEnv env n1 n2 + +emptyEqEnv :: EqEnv +emptyEqEnv = emptyOccEnv + +------------------------------------ +type ExtEnv bndr = EqEnv -> bndr -> bndr -> (EqEnv -> IfaceEq) -> IfaceEq + +eq_ifNakedBndr :: ExtEnv OccName +eq_ifBndr :: ExtEnv IfaceBndr +eq_ifTvBndr :: ExtEnv IfaceTvBndr +eq_ifIdBndr :: ExtEnv IfaceIdBndr + +eq_ifNakedBndr env n1 n2 k = k (extendEqEnv env n1 n2) + +eq_ifBndr env (IfaceIdBndr b1) (IfaceIdBndr b2) k = eq_ifIdBndr env b1 b2 k +eq_ifBndr env (IfaceTvBndr b1) (IfaceTvBndr b2) k = eq_ifTvBndr env b1 b2 k +eq_ifBndr _ _ _ _ = NotEqual + +eq_ifTvBndr env (v1, k1) (v2, k2) k = bool (k1 == k2) &&& k (extendEqEnv env v1 v2) +eq_ifIdBndr env (v1, t1) (v2, t2) k = eq_ifType env t1 t2 &&& k (extendEqEnv env v1 v2) + +eq_ifBndrs :: ExtEnv [IfaceBndr] +eq_ifIdBndrs :: ExtEnv [IfaceIdBndr] +eq_ifTvBndrs :: ExtEnv [IfaceTvBndr] +eq_ifNakedBndrs :: ExtEnv [OccName] +eq_ifBndrs = eq_bndrs_with eq_ifBndr +eq_ifIdBndrs = eq_bndrs_with eq_ifIdBndr +eq_ifTvBndrs = eq_bndrs_with eq_ifTvBndr +eq_ifNakedBndrs = eq_bndrs_with eq_ifNakedBndr + +eq_bndrs_with eq env [] [] k = k env +eq_bndrs_with eq env (b1:bs1) (b2:bs2) k = eq env b1 b2 (\env -> eq_bndrs_with eq env bs1 bs2 k) +eq_bndrs_with eq env _ _ _ = NotEqual +\end{code} + +\begin{code} +eqListBy :: 
(a->a->IfaceEq) -> [a] -> [a] -> IfaceEq +eqListBy eq [] [] = Equal +eqListBy eq (x:xs) (y:ys) = eq x y &&& eqListBy eq xs ys +eqListBy eq xs ys = NotEqual + +eqMaybeBy :: (a->a->IfaceEq) -> Maybe a -> Maybe a -> IfaceEq +eqMaybeBy eq Nothing Nothing = Equal +eqMaybeBy eq (Just x) (Just y) = eq x y +eqMaybeBy eq x y = NotEqual +\end{code} diff --git a/compiler/iface/IfaceType.lhs b/compiler/iface/IfaceType.lhs new file mode 100644 index 0000000000..76438ddb23 --- /dev/null +++ b/compiler/iface/IfaceType.lhs @@ -0,0 +1,390 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-1998 +% + + This module defines interface types and binders + +\begin{code} +module IfaceType ( + IfaceType(..), IfaceKind, IfacePredType(..), IfaceTyCon(..), + IfaceContext, IfaceBndr(..), IfaceTvBndr, IfaceIdBndr, + + IfaceExtName(..), mkIfaceExtName, isLocalIfaceExtName, + ifaceTyConName, interactiveExtNameFun, + + -- Conversion from Type -> IfaceType + toIfaceType, toIfacePred, toIfaceContext, + toIfaceBndr, toIfaceIdBndr, toIfaceTvBndrs, + toIfaceTyCon, toIfaceTyCon_name, + + -- Printing + pprIfaceType, pprParendIfaceType, pprIfaceContext, + pprIfaceIdBndr, pprIfaceTvBndr, pprIfaceTvBndrs, pprIfaceBndrs, + tOP_PREC, tYCON_PREC, noParens, maybeParen, pprIfaceForAllPart + + ) where + +#include "HsVersions.h" + +import Kind ( Kind(..) ) +import TypeRep ( TyThing(..), Type(..), PredType(..), ThetaType ) +import TyCon ( TyCon, isTupleTyCon, tyConArity, tupleTyConBoxity, tyConName ) +import Var ( isId, tyVarKind, idType ) +import TysWiredIn ( listTyConName, parrTyConName, tupleTyCon, intTyConName, charTyConName, boolTyConName ) +import OccName ( OccName, parenSymOcc ) +import Name ( Name, getName, getOccName, nameModule, nameOccName, + wiredInNameTyThing_maybe ) +import Module ( Module ) +import BasicTypes ( IPName(..), Arity, Version, mapIPName, tupleParens, Boxity ) +import Outputable +import FastString +\end{code} + + +%************************************************************************ +%* * + IfaceExtName +%* * +%************************************************************************ + +\begin{code} +data IfaceExtName + = ExtPkg Module OccName -- From an external package; no version # + -- Also used for wired-in things regardless + -- of whether they are home-pkg or not + + | HomePkg Module OccName Version -- From another module in home package; + -- has version #; in all other respects, + -- HomePkg and ExtPkg are the same + + | LocalTop OccName -- Top-level from the same module as + -- the enclosing IfaceDecl + + | LocalTopSub -- Same as LocalTop, but for a class method or constr + OccName -- Class-meth/constr name + OccName -- Parent class/datatype name + -- LocalTopSub is written into iface files as LocalTop; the parent + -- info is only used when computing version information in MkIface + +isLocalIfaceExtName :: IfaceExtName -> Bool +isLocalIfaceExtName (LocalTop _) = True +isLocalIfaceExtName (LocalTopSub _ _) = True +isLocalIfaceExtName other = False + +mkIfaceExtName name = ExtPkg (nameModule name) (nameOccName name) + -- Local helper for wired-in names + +ifaceExtOcc :: IfaceExtName -> OccName +ifaceExtOcc (ExtPkg _ occ) = occ +ifaceExtOcc (HomePkg _ occ _) = occ +ifaceExtOcc (LocalTop occ) = occ +ifaceExtOcc (LocalTopSub occ _) = occ + +interactiveExtNameFun :: PrintUnqualified -> Name-> IfaceExtName +interactiveExtNameFun print_unqual name + | print_unqual mod occ = LocalTop occ + | otherwise = ExtPkg mod occ + where + mod = nameModule name + occ = nameOccName name +\end{code} + + 
+%************************************************************************ +%* * + Local (nested) binders +%* * +%************************************************************************ + +\begin{code} +data IfaceBndr -- Local (non-top-level) binders + = IfaceIdBndr IfaceIdBndr + | IfaceTvBndr IfaceTvBndr + +type IfaceIdBndr = (OccName, IfaceType) -- OccName, because always local +type IfaceTvBndr = (OccName, IfaceKind) + +------------------------------- +type IfaceKind = Kind -- Re-use the Kind type, but no KindVars in it + +data IfaceType + = IfaceTyVar OccName -- Type variable only, not tycon + | IfaceAppTy IfaceType IfaceType + | IfaceForAllTy IfaceTvBndr IfaceType + | IfacePredTy IfacePredType + | IfaceTyConApp IfaceTyCon [IfaceType] -- Not necessarily saturated + -- Includes newtypes, synonyms, tuples + | IfaceFunTy IfaceType IfaceType + +data IfacePredType -- NewTypes are handled as ordinary TyConApps + = IfaceClassP IfaceExtName [IfaceType] + | IfaceIParam (IPName OccName) IfaceType + +type IfaceContext = [IfacePredType] + +data IfaceTyCon -- Abbreviations for common tycons with known names + = IfaceTc IfaceExtName -- The common case + | IfaceIntTc | IfaceBoolTc | IfaceCharTc + | IfaceListTc | IfacePArrTc + | IfaceTupTc Boxity Arity + +ifaceTyConName :: IfaceTyCon -> Name -- Works for all except IfaceTc +ifaceTyConName IfaceIntTc = intTyConName +ifaceTyConName IfaceBoolTc = boolTyConName +ifaceTyConName IfaceCharTc = charTyConName +ifaceTyConName IfaceListTc = listTyConName +ifaceTyConName IfacePArrTc = parrTyConName +ifaceTyConName (IfaceTupTc bx ar) = getName (tupleTyCon bx ar) +ifaceTyConName (IfaceTc ext) = pprPanic "ifaceTyConName" (ppr ext) +\end{code} + + +%************************************************************************ +%* * + Functions over IFaceTypes +%* * +%************************************************************************ + + +\begin{code} +splitIfaceSigmaTy :: IfaceType -> ([IfaceTvBndr], IfaceContext, IfaceType) +-- Mainly for printing purposes +splitIfaceSigmaTy ty + = (tvs,theta,tau) + where + (tvs, rho) = split_foralls ty + (theta, tau) = split_rho rho + + split_foralls (IfaceForAllTy tv ty) + = case split_foralls ty of { (tvs, rho) -> (tv:tvs, rho) } + split_foralls rho = ([], rho) + + split_rho (IfaceFunTy (IfacePredTy st) ty) + = case split_rho ty of { (sts, tau) -> (st:sts, tau) } + split_rho tau = ([], tau) +\end{code} + +%************************************************************************ +%* * + Pretty-printing +%* * +%************************************************************************ + +Precedence +~~~~~~~~~~ +@ppr_ty@ takes an @Int@ that is the precedence of the context. +The precedence levels are: +\begin{description} +\item[tOP_PREC] No parens required. +\item[fUN_PREC] Left hand argument of a function arrow. +\item[tYCON_PREC] Argument of a type constructor. 
+\end{description} + +\begin{code} +tOP_PREC = (0 :: Int) -- type in ParseIface.y +fUN_PREC = (1 :: Int) -- btype in ParseIface.y +tYCON_PREC = (2 :: Int) -- atype in ParseIface.y + +noParens :: SDoc -> SDoc +noParens pp = pp + +maybeParen ctxt_prec inner_prec pretty + | ctxt_prec < inner_prec = pretty + | otherwise = parens pretty +\end{code} + + +----------------------------- Printing binders ------------------------------------ + +\begin{code} +-- These instances are used only when printing for the user, either when +-- debugging, or in GHCi when printing the results of a :info command +instance Outputable IfaceExtName where + ppr (ExtPkg mod occ) = pprExt mod occ + ppr (HomePkg mod occ vers) = pprExt mod occ <> braces (ppr vers) + ppr (LocalTop occ) = ppr occ -- Do we want to distinguish these + ppr (LocalTopSub occ _) = ppr occ -- from an ordinary occurrence? + +pprExt :: Module -> OccName -> SDoc +-- No need to worry about printing unqualified becuase that was handled +-- in the transiation to IfaceSyn +pprExt mod occ = ppr mod <> dot <> ppr occ + +instance Outputable IfaceBndr where + ppr (IfaceIdBndr bndr) = pprIfaceIdBndr bndr + ppr (IfaceTvBndr bndr) = char '@' <+> pprIfaceTvBndr bndr + +pprIfaceBndrs :: [IfaceBndr] -> SDoc +pprIfaceBndrs bs = sep (map ppr bs) + +pprIfaceIdBndr (name, ty) = hsep [ppr name, dcolon, ppr ty] + +pprIfaceTvBndr :: IfaceTvBndr -> SDoc +pprIfaceTvBndr (tv, LiftedTypeKind) = ppr tv +pprIfaceTvBndr (tv, kind) = parens (ppr tv <> dcolon <> ppr kind) + +pprIfaceTvBndrs :: [IfaceTvBndr] -> SDoc +pprIfaceTvBndrs tyvars = hsep (map pprIfaceTvBndr tyvars) +\end{code} + +----------------------------- Printing IfaceType ------------------------------------ + +\begin{code} +--------------------------------- +instance Outputable IfaceType where + ppr ty = pprIfaceTypeForUser ty + +pprIfaceTypeForUser ::IfaceType -> SDoc +-- Drop top-level for-alls; if that's not what you want, use pprIfaceType dire +pprIfaceTypeForUser ty + = pprIfaceForAllPart [] theta (pprIfaceType tau) + where + (_tvs, theta, tau) = splitIfaceSigmaTy ty + +pprIfaceType, pprParendIfaceType ::IfaceType -> SDoc +pprIfaceType = ppr_ty tOP_PREC +pprParendIfaceType = ppr_ty tYCON_PREC + + +ppr_ty :: Int -> IfaceType -> SDoc +ppr_ty ctxt_prec (IfaceTyVar tyvar) = ppr tyvar +ppr_ty ctxt_prec (IfaceTyConApp tc tys) = ppr_tc_app ctxt_prec tc tys +ppr_ty ctxt_prec (IfacePredTy st) = ppr st + + -- Function types +ppr_ty ctxt_prec (IfaceFunTy ty1 ty2) + = -- We don't want to lose synonyms, so we mustn't use splitFunTys here. 
+ maybeParen ctxt_prec fUN_PREC $ + sep (ppr_ty fUN_PREC ty1 : ppr_fun_tail ty2) + where + ppr_fun_tail (IfaceFunTy ty1 ty2) + = (arrow <+> ppr_ty fUN_PREC ty1) : ppr_fun_tail ty2 + ppr_fun_tail other_ty + = [arrow <+> pprIfaceType other_ty] + +ppr_ty ctxt_prec (IfaceAppTy ty1 ty2) + = maybeParen ctxt_prec tYCON_PREC $ + ppr_ty fUN_PREC ty1 <+> pprParendIfaceType ty2 + +ppr_ty ctxt_prec ty@(IfaceForAllTy _ _) + = maybeParen ctxt_prec fUN_PREC (pprIfaceForAllPart tvs theta (pprIfaceType tau)) + where + (tvs, theta, tau) = splitIfaceSigmaTy ty + +------------------- +pprIfaceForAllPart :: [IfaceTvBndr] -> IfaceContext -> SDoc -> SDoc +pprIfaceForAllPart tvs ctxt doc + = sep [ppr_tvs, pprIfaceContext ctxt, doc] + where + ppr_tvs | null tvs = empty + | otherwise = ptext SLIT("forall") <+> pprIfaceTvBndrs tvs <> dot + +------------------- +ppr_tc_app ctxt_prec tc [] = ppr_tc tc +ppr_tc_app ctxt_prec IfaceListTc [ty] = brackets (pprIfaceType ty) +ppr_tc_app ctxt_prec IfacePArrTc [ty] = pabrackets (pprIfaceType ty) +ppr_tc_app ctxt_prec (IfaceTupTc bx arity) tys + | arity == length tys + = tupleParens bx (sep (punctuate comma (map pprIfaceType tys))) +ppr_tc_app ctxt_prec tc tys + = maybeParen ctxt_prec tYCON_PREC + (sep [ppr_tc tc, nest 4 (sep (map pprParendIfaceType tys))]) + +ppr_tc :: IfaceTyCon -> SDoc +-- Wrap infix type constructors in parens +ppr_tc tc@(IfaceTc ext_nm) = parenSymOcc (ifaceExtOcc ext_nm) (ppr tc) +ppr_tc tc = ppr tc + +------------------- +instance Outputable IfacePredType where + -- Print without parens + ppr (IfaceIParam ip ty) = hsep [ppr ip, dcolon, ppr ty] + ppr (IfaceClassP cls ts) = parenSymOcc (ifaceExtOcc cls) (ppr cls) + <+> sep (map pprParendIfaceType ts) + +instance Outputable IfaceTyCon where + ppr (IfaceTc ext) = ppr ext + ppr other_tc = ppr (ifaceTyConName other_tc) + +------------------- +pprIfaceContext :: IfaceContext -> SDoc +-- Prints "(C a, D b) =>", including the arrow +pprIfaceContext [] = empty +pprIfaceContext theta = ppr_preds theta <+> ptext SLIT("=>") + +ppr_preds [pred] = ppr pred -- No parens +ppr_preds preds = parens (sep (punctuate comma (map ppr preds))) + +------------------- +pabrackets p = ptext SLIT("[:") <> p <> ptext SLIT(":]") +\end{code} + +%************************************************************************ +%* * + Conversion from Type to IfaceType +%* * +%************************************************************************ + +\begin{code} +---------------- +toIfaceTvBndr tyvar = (getOccName tyvar, tyVarKind tyvar) +toIfaceIdBndr ext id = (getOccName id, toIfaceType ext (idType id)) +toIfaceTvBndrs tyvars = map toIfaceTvBndr tyvars + +toIfaceBndr ext var + | isId var = IfaceIdBndr (toIfaceIdBndr ext var) + | otherwise = IfaceTvBndr (toIfaceTvBndr var) + +--------------------- +toIfaceType :: (Name -> IfaceExtName) -> Type -> IfaceType +-- Synonyms are retained in the interface type +toIfaceType ext (TyVarTy tv) = IfaceTyVar (getOccName tv) +toIfaceType ext (AppTy t1 t2) = IfaceAppTy (toIfaceType ext t1) (toIfaceType ext t2) +toIfaceType ext (FunTy t1 t2) = IfaceFunTy (toIfaceType ext t1) (toIfaceType ext t2) +toIfaceType ext (TyConApp tc tys) = IfaceTyConApp (toIfaceTyCon ext tc) (toIfaceTypes ext tys) +toIfaceType ext (ForAllTy tv t) = IfaceForAllTy (toIfaceTvBndr tv) (toIfaceType ext t) +toIfaceType ext (PredTy st) = IfacePredTy (toIfacePred ext st) +toIfaceType ext (NoteTy other_note ty) = toIfaceType ext ty + +---------------- +-- A little bit of (perhaps optional) trickiness here. 
When +-- compiling Data.Tuple, the tycons are not TupleTyCons, although +-- they have a wired-in name. But we'd like to dump them into the Iface +-- as a tuple tycon, to save lookups when reading the interface +-- Hence a tuple tycon may 'miss' in toIfaceTyCon, but then +-- toIfaceTyCon_name will still catch it. + +toIfaceTyCon :: (Name -> IfaceExtName) -> TyCon -> IfaceTyCon +toIfaceTyCon ext tc + | isTupleTyCon tc = IfaceTupTc (tupleTyConBoxity tc) (tyConArity tc) + | otherwise = toIfaceTyCon_name ext (tyConName tc) + +toIfaceTyCon_name :: (Name -> IfaceExtName) -> Name -> IfaceTyCon +toIfaceTyCon_name ext nm + | Just (ATyCon tc) <- wiredInNameTyThing_maybe nm + = toIfaceWiredInTyCon ext tc nm + | otherwise + = IfaceTc (ext nm) + +toIfaceWiredInTyCon :: (Name -> IfaceExtName) -> TyCon -> Name -> IfaceTyCon +toIfaceWiredInTyCon ext tc nm + | isTupleTyCon tc = IfaceTupTc (tupleTyConBoxity tc) (tyConArity tc) + | nm == intTyConName = IfaceIntTc + | nm == boolTyConName = IfaceBoolTc + | nm == charTyConName = IfaceCharTc + | nm == listTyConName = IfaceListTc + | nm == parrTyConName = IfacePArrTc + | otherwise = IfaceTc (ext nm) + +---------------- +toIfaceTypes ext ts = map (toIfaceType ext) ts + +---------------- +toIfacePred ext (ClassP cls ts) = IfaceClassP (ext (getName cls)) (toIfaceTypes ext ts) +toIfacePred ext (IParam ip t) = IfaceIParam (mapIPName getOccName ip) (toIfaceType ext t) + +---------------- +toIfaceContext :: (Name -> IfaceExtName) -> ThetaType -> IfaceContext +toIfaceContext ext cs = map (toIfacePred ext) cs +\end{code} + diff --git a/compiler/iface/LoadIface.lhs b/compiler/iface/LoadIface.lhs new file mode 100644 index 0000000000..8c496f76ef --- /dev/null +++ b/compiler/iface/LoadIface.lhs @@ -0,0 +1,582 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section{Dealing with interface files} + +\begin{code} +module LoadIface ( + loadInterface, loadHomeInterface, loadWiredInHomeIface, + loadSrcInterface, loadSysInterface, loadOrphanModules, + findAndReadIface, readIface, -- Used when reading the module's old interface + loadDecls, ifaceStats, discardDeclPrags, + initExternalPackageState + ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} TcIface( tcIfaceDecl, tcIfaceRule, tcIfaceInst ) + +import Packages ( PackageState(..), PackageIdH(..), isHomePackage ) +import DynFlags ( DynFlags(..), DynFlag( Opt_IgnoreInterfacePragmas ), + isOneShot ) +import IfaceSyn ( IfaceDecl(..), IfaceConDecl(..), IfaceClassOp(..), + IfaceConDecls(..), IfaceIdInfo(..) 
) +import IfaceEnv ( newGlobalBinder ) +import HscTypes ( ModIface(..), TyThing, emptyModIface, EpsStats(..), + addEpsInStats, ExternalPackageState(..), + PackageTypeEnv, emptyTypeEnv, HscEnv(..), + lookupIfaceByModule, emptyPackageIfaceTable, + IsBootInterface, mkIfaceFixCache, + implicitTyThings + ) + +import BasicTypes ( Version, Fixity(..), FixityDirection(..), + isMarkedStrict ) +import TcRnMonad + +import PrelNames ( gHC_PRIM ) +import PrelInfo ( ghcPrimExports ) +import PrelRules ( builtinRules ) +import Rules ( extendRuleBaseList, mkRuleBase ) +import InstEnv ( emptyInstEnv, extendInstEnvList ) +import Name ( Name {-instance NamedThing-}, getOccName, + nameModule, nameIsLocalOrFrom, isWiredInName ) +import NameEnv +import MkId ( seqId ) +import Module ( Module, ModLocation(ml_hi_file), emptyModuleEnv, + addBootSuffix_maybe, + extendModuleEnv, lookupModuleEnv, moduleString + ) +import OccName ( OccName, mkOccEnv, lookupOccEnv, mkClassTyConOcc, mkClassDataConOcc, + mkSuperDictSelOcc, mkDataConWrapperOcc, mkDataConWorkerOcc ) +import SrcLoc ( importedSrcLoc ) +import Maybes ( MaybeErr(..) ) +import FastString ( mkFastString ) +import ErrUtils ( Message ) +import Finder ( findModule, findPackageModule, FindResult(..), cantFindError ) +import Outputable +import BinIface ( readBinIface ) +import Panic ( ghcError, tryMost, showException, GhcException(..) ) +import List ( nub ) +\end{code} + + +%************************************************************************ +%* * + loadSrcInterface, loadOrphanModules, loadHomeInterface + + These three are called from TcM-land +%* * +%************************************************************************ + +\begin{code} +loadSrcInterface :: SDoc -> Module -> IsBootInterface -> RnM ModIface +-- This is called for each 'import' declaration in the source code +-- On a failure, fail in the monad with an error message + +loadSrcInterface doc mod want_boot + = do { mb_iface <- initIfaceTcRn $ + loadInterface doc mod (ImportByUser want_boot) + ; case mb_iface of + Failed err -> failWithTc (elaborate err) + Succeeded iface -> return iface + } + where + elaborate err = hang (ptext SLIT("Failed to load interface for") <+> + quotes (ppr mod) <> colon) 4 err + +--------------- +loadOrphanModules :: [Module] -> TcM () +loadOrphanModules mods + | null mods = returnM () + | otherwise = initIfaceTcRn $ + do { traceIf (text "Loading orphan modules:" <+> + fsep (map ppr mods)) + ; mappM_ load mods + ; returnM () } + where + load mod = loadSysInterface (mk_doc mod) mod + mk_doc mod = ppr mod <+> ptext SLIT("is a orphan-instance module") + +--------------- +loadHomeInterface :: SDoc -> Name -> TcRn ModIface +loadHomeInterface doc name + = do { +#ifdef DEBUG + -- Should not be called with a name from the module being compiled + this_mod <- getModule + ; ASSERT2( not (nameIsLocalOrFrom this_mod name), ppr name <+> parens doc ) +#endif + initIfaceTcRn $ loadSysInterface doc (nameModule name) + } + +--------------- +loadWiredInHomeIface :: Name -> IfM lcl () +-- A IfM function to load the home interface for a wired-in thing, +-- so that we're sure that we see its instance declarations and rules +loadWiredInHomeIface name + = ASSERT( isWiredInName name ) + do { loadSysInterface doc (nameModule name); return () } + where + doc = ptext SLIT("Need home interface for wired-in thing") <+> ppr name + +--------------- +loadSysInterface :: SDoc -> Module -> IfM lcl ModIface +-- A wrapper for loadInterface that Throws an exception if it fails +loadSysInterface doc mod_name + = 
do { mb_iface <- loadInterface doc mod_name ImportBySystem + ; case mb_iface of + Failed err -> ghcError (ProgramError (showSDoc err)) + Succeeded iface -> return iface } +\end{code} + + +%********************************************************* +%* * + loadInterface + + The main function to load an interface + for an imported module, and put it in + the External Package State +%* * +%********************************************************* + +\begin{code} +loadInterface :: SDoc -> Module -> WhereFrom + -> IfM lcl (MaybeErr Message ModIface) + +-- If it can't find a suitable interface file, we +-- a) modify the PackageIfaceTable to have an empty entry +-- (to avoid repeated complaints) +-- b) return (Left message) +-- +-- It's not necessarily an error for there not to be an interface +-- file -- perhaps the module has changed, and that interface +-- is no longer used + +loadInterface doc_str mod from + = do { -- Read the state + (eps,hpt) <- getEpsAndHpt + + ; traceIf (text "Considering whether to load" <+> ppr mod <+> ppr from) + + -- Check whether we have the interface already + ; case lookupIfaceByModule hpt (eps_PIT eps) mod of { + Just iface + -> returnM (Succeeded iface) ; -- Already loaded + -- The (src_imp == mi_boot iface) test checks that the already-loaded + -- interface isn't a boot iface. This can conceivably happen, + -- if an earlier import had a before we got to real imports. I think. + other -> do + + { let { hi_boot_file = case from of + ImportByUser usr_boot -> usr_boot + ImportBySystem -> sys_boot + + ; mb_dep = lookupModuleEnv (eps_is_boot eps) mod + ; sys_boot = case mb_dep of + Just (_, is_boot) -> is_boot + Nothing -> False + -- The boot-ness of the requested interface, + } -- based on the dependencies in directly-imported modules + + -- READ THE MODULE IN + ; let explicit | ImportByUser _ <- from = True + | otherwise = False + ; read_result <- findAndReadIface explicit doc_str mod hi_boot_file + ; dflags <- getDOpts + ; case read_result of { + Failed err -> do + { let fake_iface = emptyModIface HomePackage mod + + ; updateEps_ $ \eps -> + eps { eps_PIT = extendModuleEnv (eps_PIT eps) (mi_module fake_iface) fake_iface } + -- Not found, so add an empty iface to + -- the EPS map so that we don't look again + + ; returnM (Failed err) } ; + + -- Found and parsed! + Succeeded (iface, file_path) -- Sanity check: + | ImportBySystem <- from, -- system-importing... + isHomePackage (mi_package iface), -- ...a home-package module + Nothing <- mb_dep -- ...that we know nothing about + -> returnM (Failed (badDepMsg mod)) + + | otherwise -> + + let + loc_doc = text file_path + in + initIfaceLcl mod loc_doc $ do + + -- Load the new ModIface into the External Package State + -- Even home-package interfaces loaded by loadInterface + -- (which only happens in OneShot mode; in Batch/Interactive + -- mode, home-package modules are loaded one by one into the HPT) + -- are put in the EPS. + -- + -- The main thing is to add the ModIface to the PIT, but + -- we also take the + -- IfaceDecls, IfaceInst, IfaceRules + -- out of the ModIface and put them into the big EPS pools + + -- NB: *first* we do loadDecl, so that the provenance of all the locally-defined + --- names is done correctly (notably, whether this is an .hi file or .hi-boot file). 
+ -- If we do loadExport first the wrong info gets into the cache (unless we + -- explicitly tag each export which seems a bit of a bore) + + ; ignore_prags <- doptM Opt_IgnoreInterfacePragmas + ; new_eps_decls <- loadDecls ignore_prags (mi_decls iface) + ; new_eps_insts <- mapM tcIfaceInst (mi_insts iface) + ; new_eps_rules <- if ignore_prags + then return [] + else mapM tcIfaceRule (mi_rules iface) + + ; let { final_iface = iface { mi_decls = panic "No mi_decls in PIT", + mi_insts = panic "No mi_insts in PIT", + mi_rules = panic "No mi_rules in PIT" } } + + ; updateEps_ $ \ eps -> + eps { eps_PIT = extendModuleEnv (eps_PIT eps) mod final_iface, + eps_PTE = addDeclsToPTE (eps_PTE eps) new_eps_decls, + eps_rule_base = extendRuleBaseList (eps_rule_base eps) new_eps_rules, + eps_inst_env = extendInstEnvList (eps_inst_env eps) new_eps_insts, + eps_stats = addEpsInStats (eps_stats eps) (length new_eps_decls) + (length new_eps_insts) (length new_eps_rules) } + + ; return (Succeeded final_iface) + }}}} + +badDepMsg mod + = hang (ptext SLIT("Interface file inconsistency:")) + 2 (sep [ptext SLIT("home-package module") <+> quotes (ppr mod) <+> ptext SLIT("is mentioned,"), + ptext SLIT("but does not appear in the dependencies of the interface")]) + +----------------------------------------------------- +-- Loading type/class/value decls +-- We pass the full Module name here, replete with +-- its package info, so that we can build a Name for +-- each binder with the right package info in it +-- All subsequent lookups, including crucially lookups during typechecking +-- the declaration itself, will find the fully-glorious Name +----------------------------------------------------- + +addDeclsToPTE :: PackageTypeEnv -> [(Name,TyThing)] -> PackageTypeEnv +addDeclsToPTE pte things = extendNameEnvList pte things + +loadDecls :: Bool + -> [(Version, IfaceDecl)] + -> IfL [(Name,TyThing)] +loadDecls ignore_prags ver_decls + = do { mod <- getIfModule + ; thingss <- mapM (loadDecl ignore_prags mod) ver_decls + ; return (concat thingss) + } + +loadDecl :: Bool -- Don't load pragmas into the decl pool + -> Module + -> (Version, IfaceDecl) + -> IfL [(Name,TyThing)] -- The list can be poked eagerly, but the + -- TyThings are forkM'd thunks +loadDecl ignore_prags mod (_version, decl) + = do { -- Populate the name cache with final versions of all + -- the names associated with the decl + main_name <- mk_new_bndr mod Nothing (ifName decl) + ; implicit_names <- mapM (mk_new_bndr mod (Just main_name)) (ifaceDeclSubBndrs decl) + + -- Typecheck the thing, lazily + -- NB. firstly, the laziness is there in case we never need the + -- declaration (in one-shot mode), and secondly it is there so that + -- we don't look up the occurrence of a name before calling mk_new_bndr + -- on the binder. This is important because we must get the right name + -- which includes its nameParent. + ; thing <- forkM doc (bumpDeclStats main_name >> tcIfaceDecl stripped_decl) + ; let mini_env = mkOccEnv [(getOccName t, t) | t <- implicitTyThings thing] + lookup n = case lookupOccEnv mini_env (getOccName n) of + Just thing -> thing + Nothing -> pprPanic "loadDecl" (ppr main_name <+> ppr n) + + ; returnM ((main_name, thing) : [(n, lookup n) | n <- implicit_names]) } + -- We build a list from the *known* names, with (lookup n) thunks + -- as the TyThings. 
That way we can extend the PTE without poking the + -- thunks + where + stripped_decl | ignore_prags = discardDeclPrags decl + | otherwise = decl + + -- mk_new_bndr allocates in the name cache the final canonical + -- name for the thing, with the correct + -- * parent + -- * location + -- imported name, to fix the module correctly in the cache + mk_new_bndr mod mb_parent occ + = newGlobalBinder mod occ mb_parent + (importedSrcLoc (moduleString mod)) + + doc = ptext SLIT("Declaration for") <+> ppr (ifName decl) + +discardDeclPrags :: IfaceDecl -> IfaceDecl +discardDeclPrags decl@(IfaceId {ifIdInfo = HasInfo _}) = decl { ifIdInfo = NoInfo } +discardDeclPrags decl = decl + +bumpDeclStats :: Name -> IfL () -- Record that one more declaration has actually been used +bumpDeclStats name + = do { traceIf (text "Loading decl for" <+> ppr name) + ; updateEps_ (\eps -> let stats = eps_stats eps + in eps { eps_stats = stats { n_decls_out = n_decls_out stats + 1 } }) + } + +----------------- +ifaceDeclSubBndrs :: IfaceDecl -> [OccName] +-- *Excludes* the 'main' name, but *includes* the implicitly-bound names +-- Deeply revolting, because it has to predict what gets bound, +-- especially the question of whether there's a wrapper for a datacon + +ifaceDeclSubBndrs (IfaceClass {ifCtxt = sc_ctxt, ifName = cls_occ, ifSigs = sigs }) + = [tc_occ, dc_occ, dcww_occ] ++ + [op | IfaceClassOp op _ _ <- sigs] ++ + [mkSuperDictSelOcc n cls_occ | n <- [1..n_ctxt]] + where + n_ctxt = length sc_ctxt + n_sigs = length sigs + tc_occ = mkClassTyConOcc cls_occ + dc_occ = mkClassDataConOcc cls_occ + dcww_occ | is_newtype = mkDataConWrapperOcc dc_occ -- Newtypes have wrapper but no worker + | otherwise = mkDataConWorkerOcc dc_occ -- Otherwise worker but no wrapper + is_newtype = n_sigs + n_ctxt == 1 -- Sigh + +ifaceDeclSubBndrs (IfaceData {ifCons = IfAbstractTyCon}) + = [] +-- Newtype +ifaceDeclSubBndrs (IfaceData {ifCons = IfNewTyCon (IfVanillaCon { ifConOcc = con_occ, + ifConFields = fields})}) + = fields ++ [con_occ, mkDataConWrapperOcc con_occ] + -- Wrapper, no worker; see MkId.mkDataConIds + +ifaceDeclSubBndrs (IfaceData {ifCons = IfDataTyCon cons}) + = nub (concatMap fld_occs cons) -- Eliminate duplicate fields + ++ concatMap dc_occs cons + where + fld_occs (IfVanillaCon { ifConFields = fields }) = fields + fld_occs (IfGadtCon {}) = [] + dc_occs con_decl + | has_wrapper = [con_occ, work_occ, wrap_occ] + | otherwise = [con_occ, work_occ] + where + con_occ = ifConOcc con_decl + strs = ifConStricts con_decl + wrap_occ = mkDataConWrapperOcc con_occ + work_occ = mkDataConWorkerOcc con_occ + has_wrapper = any isMarkedStrict strs -- See MkId.mkDataConIds (sigh) + -- ToDo: may miss strictness in existential dicts + +ifaceDeclSubBndrs _other = [] + +\end{code} + + +%********************************************************* +%* * +\subsection{Reading an interface file} +%* * +%********************************************************* + +\begin{code} +findAndReadIface :: Bool -- True <=> explicit user import + -> SDoc -> Module + -> IsBootInterface -- True <=> Look for a .hi-boot file + -- False <=> Look for .hi file + -> TcRnIf gbl lcl (MaybeErr Message (ModIface, FilePath)) + -- Nothing <=> file not found, or unreadable, or illegible + -- Just x <=> successfully found and parsed + + -- It *doesn't* add an error to the monad, because + -- sometimes it's ok to fail... 
see notes with loadInterface + +findAndReadIface explicit doc_str mod_name hi_boot_file + = do { traceIf (sep [hsep [ptext SLIT("Reading"), + if hi_boot_file + then ptext SLIT("[boot]") + else empty, + ptext SLIT("interface for"), + ppr mod_name <> semi], + nest 4 (ptext SLIT("reason:") <+> doc_str)]) + + -- Check for GHC.Prim, and return its static interface + ; dflags <- getDOpts + ; let base_pkg = basePackageId (pkgState dflags) + ; if mod_name == gHC_PRIM + then returnM (Succeeded (ghcPrimIface{ mi_package = base_pkg }, + "<built in interface for GHC.Prim>")) + else do + + -- Look for the file + ; hsc_env <- getTopEnv + ; mb_found <- ioToIOEnv (findHiFile hsc_env explicit mod_name hi_boot_file) + ; case mb_found of { + Failed err -> do + { traceIf (ptext SLIT("...not found")) + ; dflags <- getDOpts + ; returnM (Failed (cantFindError dflags mod_name err)) } ; + + Succeeded (file_path, pkg) -> do + + -- Found file, so read it + { traceIf (ptext SLIT("readIFace") <+> text file_path) + ; read_result <- readIface mod_name file_path hi_boot_file + ; case read_result of + Failed err -> returnM (Failed (badIfaceFile file_path err)) + Succeeded iface + | mi_module iface /= mod_name -> + return (Failed (wrongIfaceModErr iface mod_name file_path)) + | otherwise -> + returnM (Succeeded (iface{mi_package=pkg}, file_path)) + -- Don't forget to fill in the package name... + }}} + +findHiFile :: HscEnv -> Bool -> Module -> IsBootInterface + -> IO (MaybeErr FindResult (FilePath, PackageIdH)) +findHiFile hsc_env explicit mod_name hi_boot_file + = do { + -- In interactive or --make mode, we are *not allowed* to demand-load + -- a home package .hi file. So don't even look for them. + -- This helps in the case where you are sitting in eg. ghc/lib/std + -- and start up GHCi - it won't complain that all the modules it tries + -- to load are found in the home location. + let { home_allowed = isOneShot (ghcMode (hsc_dflags hsc_env)) } ; + maybe_found <- if home_allowed + then findModule hsc_env mod_name explicit + else findPackageModule hsc_env mod_name explicit; + + case maybe_found of + Found loc pkg -> return (Succeeded (path, pkg)) + where + path = addBootSuffix_maybe hi_boot_file (ml_hi_file loc) + + err -> return (Failed err) + } +\end{code} + +@readIface@ tries just the one file. 
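(An editorial aside, not part of this patch: findAndReadIface above and readIface below share one shape. First try to locate the interface file, then try to parse it, and hand the first Failed result straight back to the caller, which is what lets loadInterface bale out cleanly. A minimal stand-alone sketch of that shape follows; MaybeErr is redefined here to keep the sketch self-contained, and the locate/parse steps are hypothetical stand-ins, not the real Finder or BinIface code.)

\begin{code}
-- Sketch only: mirrors the Failed/Succeeded early-bale-out style of
-- findAndReadIface and readIface; not code from this patch.
module FindThenRead where

import System.Directory ( doesFileExist )

data MaybeErr err val = Failed err | Succeeded val   -- GHC's real one lives in Maybes

findThenRead :: FilePath -> IO (MaybeErr String String)
findThenRead path = do
  exists <- doesFileExist path                       -- the "find" step (stand-in)
  if not exists
     then return (Failed ("not found: " ++ path))    -- bale out early, like loadInterface
     else do contents <- readFile path               -- the "read/parse" step (stand-in)
             return (Succeeded contents)
\end{code}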
+ +\begin{code} +readIface :: Module -> String -> IsBootInterface + -> TcRnIf gbl lcl (MaybeErr Message ModIface) + -- Failed err <=> file not found, or unreadable, or illegible + -- Succeeded iface <=> successfully found and parsed + +readIface wanted_mod file_path is_hi_boot_file + = do { dflags <- getDOpts + ; ioToIOEnv $ do + { res <- tryMost (readBinIface file_path) + ; case res of + Right iface + | wanted_mod == actual_mod -> return (Succeeded iface) + | otherwise -> return (Failed err) + where + actual_mod = mi_module iface + err = hiModuleNameMismatchWarn wanted_mod actual_mod + + Left exn -> return (Failed (text (showException exn))) + }} +\end{code} + + +%********************************************************* +%* * + Wired-in interface for GHC.Prim +%* * +%********************************************************* + +\begin{code} +initExternalPackageState :: ExternalPackageState +initExternalPackageState + = EPS { + eps_is_boot = emptyModuleEnv, + eps_PIT = emptyPackageIfaceTable, + eps_PTE = emptyTypeEnv, + eps_inst_env = emptyInstEnv, + eps_rule_base = mkRuleBase builtinRules, + -- Initialise the EPS rule pool with the built-in rules + eps_stats = EpsStats { n_ifaces_in = 0, n_decls_in = 0, n_decls_out = 0 + , n_insts_in = 0, n_insts_out = 0 + , n_rules_in = length builtinRules, n_rules_out = 0 } + } +\end{code} + + +%********************************************************* +%* * + Wired-in interface for GHC.Prim +%* * +%********************************************************* + +\begin{code} +ghcPrimIface :: ModIface +ghcPrimIface + = (emptyModIface HomePackage gHC_PRIM) { + mi_exports = [(gHC_PRIM, ghcPrimExports)], + mi_decls = [], + mi_fixities = fixities, + mi_fix_fn = mkIfaceFixCache fixities + } + where + fixities = [(getOccName seqId, Fixity 0 InfixR)] + -- seq is infixr 0 +\end{code} + +%********************************************************* +%* * +\subsection{Statistics} +%* * +%********************************************************* + +\begin{code} +ifaceStats :: ExternalPackageState -> SDoc +ifaceStats eps + = hcat [text "Renamer stats: ", msg] + where + stats = eps_stats eps + msg = vcat + [int (n_ifaces_in stats) <+> text "interfaces read", + hsep [ int (n_decls_out stats), text "type/class/variable imported, out of", + int (n_decls_in stats), text "read"], + hsep [ int (n_insts_out stats), text "instance decls imported, out of", + int (n_insts_in stats), text "read"], + hsep [ int (n_rules_out stats), text "rule decls imported, out of", + int (n_rules_in stats), text "read"] + ] +\end{code} + + +%********************************************************* +%* * +\subsection{Errors} +%* * +%********************************************************* + +\begin{code} +badIfaceFile file err + = vcat [ptext SLIT("Bad interface file:") <+> text file, + nest 4 err] + +hiModuleNameMismatchWarn :: Module -> Module -> Message +hiModuleNameMismatchWarn requested_mod read_mod = + hsep [ ptext SLIT("Something is amiss; requested module name") + , ppr requested_mod + , ptext SLIT("differs from name found in the interface file") + , ppr read_mod + ] + +wrongIfaceModErr iface mod_name file_path + = sep [ptext SLIT("Interface file") <+> iface_file, + ptext SLIT("contains module") <+> quotes (ppr (mi_module iface)) <> comma, + ptext SLIT("but we were expecting module") <+> quotes (ppr mod_name), + sep [ptext SLIT("Probable cause: the source code which generated"), + nest 2 iface_file, + ptext SLIT("has an incompatible module name") + ] + ] + where iface_file = doubleQuotes 
(text file_path) +\end{code} diff --git a/compiler/iface/MkIface.lhs b/compiler/iface/MkIface.lhs new file mode 100644 index 0000000000..cafb6b6692 --- /dev/null +++ b/compiler/iface/MkIface.lhs @@ -0,0 +1,1066 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-1998 +% + +\begin{code} +module MkIface ( + pprModIface, showIface, -- Print the iface in Foo.hi + + mkUsageInfo, -- Construct the usage info for a module + + mkIface, -- Build a ModIface from a ModGuts, + -- including computing version information + + writeIfaceFile, -- Write the interface file + + checkOldIface -- See if recompilation is required, by + -- comparing version information + ) where +\end{code} + + ----------------------------------------------- + MkIface.lhs deals with versioning + ----------------------------------------------- + +Here's the version-related info in an interface file + + module Foo 8 -- module-version + 3 -- export-list-version + 2 -- rule-version + Usages: -- Version info for what this compilation of Foo imported + Baz 3 -- Module version + [4] -- The export-list version if Foo depended on it + (g,2) -- Function and its version + (T,1) -- Type and its version + + <version> f :: Int -> Int {- Unfolding: \x -> Wib.t[2] x -} + -- The [2] says that f's unfolding + -- mentions verison 2 of Wib.t + + ----------------------------------------------- + Basic idea + ----------------------------------------------- + +Basic idea: + * In the mi_usages information in an interface, we record the + version number of each free variable of the module + + * In mkIface, we compute the version number of each exported thing A.f + by comparing its A.f's info with its new info, and bumping its + version number if it differs. If A.f mentions B.g, and B.g's version + number has changed, then we count A.f as having changed too. + + * In checkOldIface we compare the mi_usages for the module with + the actual version info for all each thing recorded in mi_usages + + +Fixities +~~~~~~~~ +We count A.f as changing if its fixity changes + +Rules +~~~~~ +If a rule changes, we want to recompile any module that might be +affected by that rule. For non-orphan rules, this is relatively easy. +If module M defines f, and a rule for f, just arrange that the version +number for M.f changes if any of the rules for M.f change. Any module +that does not depend on M.f can't be affected by the rule-change +either. + +Orphan rules (ones whose 'head function' is not defined in M) are +harder. Here's what we do. + + * We have a per-module orphan-rule version number which changes if + any orphan rule changes. (It's unaffected by non-orphan rules.) + + * We record usage info for any orphan module 'below' this one, + giving the orphan-rule version number. We recompile if this + changes. + +The net effect is that if an orphan rule changes, we recompile every +module above it. That's very conservative, but it's devilishly hard +to know what it might affect, so we just have to be conservative. + +Instance decls +~~~~~~~~~~~~~~ +In an iface file we have + module A where + instance Eq a => Eq [a] = dfun29 + dfun29 :: ... + +We have a version number for dfun29, covering its unfolding +etc. Suppose we are compiling a module M that imports A only +indirectly. If typechecking M uses this instance decl, we record the +dependency on A.dfun29 as if it were a free variable of the module +(via the tcg_inst_usages accumulator). That means that A will appear +in M's usage list. 
If the shape of the instance declaration changes, +then so will dfun29's version, triggering a recompilation. + +Adding an instance declaration, or changing an instance decl that is +not currently used, is more tricky. (This really only makes a +difference when we have overlapping instance decls, because then the +new instance decl might kick in to override the old one.) We handle +this in a very similar way that we handle rules above. + + * For non-orphan instance decls, identify one locally-defined tycon/class + mentioned in the decl. Treat the instance decl as part of the defn of that + tycon/class, so that if the shape of the instance decl changes, so does the + tycon/class; that in turn will force recompilation of anything that uses + that tycon/class. + + * For orphan instance decls, act the same way as for orphan rules. + Indeed, we use the same global orphan-rule version number. + +mkUsageInfo +~~~~~~~~~~~ +mkUsageInfo figures out what the ``usage information'' for this +moudule is; that is, what it must record in its interface file as the +things it uses. + +We produce a line for every module B below the module, A, currently being +compiled: + import B <n> ; +to record the fact that A does import B indirectly. This is used to decide +to look to look for B.hi rather than B.hi-boot when compiling a module that +imports A. This line says that A imports B, but uses nothing in it. +So we'll get an early bale-out when compiling A if B's version changes. + +The usage information records: + +\begin{itemize} +\item (a) anything reachable from its body code +\item (b) any module exported with a @module Foo@ +\item (c) anything reachable from an exported item +\end{itemize} + +Why (b)? Because if @Foo@ changes then this module's export list +will change, so we must recompile this module at least as far as +making a new interface file --- but in practice that means complete +recompilation. + +Why (c)? Consider this: +\begin{verbatim} + module A( f, g ) where | module B( f ) where + import B( f ) | f = h 3 + g = ... | h = ... +\end{verbatim} + +Here, @B.f@ isn't used in A. Should we nevertheless record @B.f@ in +@A@'s usages? Our idea is that we aren't going to touch A.hi if it is +*identical* to what it was before. If anything about @B.f@ changes +than anyone who imports @A@ should be recompiled in case they use +@B.f@ (they'll get an early exit if they don't). So, if anything +about @B.f@ changes we'd better make sure that something in A.hi +changes, and the convenient way to do that is to record the version +number @B.f@ in A.hi in the usage list. If B.f changes that'll force a +complete recompiation of A, which is overkill but it's the only way to +write a new, slightly different, A.hi. + +But the example is tricker. Even if @B.f@ doesn't change at all, +@B.h@ may do so, and this change may not be reflected in @f@'s version +number. But with -O, a module that imports A must be recompiled if +@B.h@ changes! So A must record a dependency on @B.h@. So we treat +the occurrence of @B.f@ in the export list *just as if* it were in the +code of A, and thereby haul in all the stuff reachable from it. + + *** Conclusion: if A mentions B.f in its export list, + behave just as if A mentioned B.f in its source code, + and slurp in B.f and all its transitive closure *** + +[NB: If B was compiled with -O, but A isn't, we should really *still* +haul in all the unfoldings for B, in case the module that imports A *is* +compiled with -O. I think this is the case.] 
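(Another editorial aside, not part of this patch: the change-propagation story above can be compressed into a few lines. In the sketch below, Decl, declUses and the Map plumbing are invented for illustration, and fixities, rules, orphans and export lists are ignored. The point is just that a definition gets a bumped version when its own body changed, or when anything it mentions was, transitively, bumped; everything else keeps its old version, which is what lets A.hi stay byte-for-byte identical when nothing relevant moved.)

\begin{code}
-- Sketch only; not code from this patch.
module VersionSketch ( bumpVersions ) where

import qualified Data.Map as M
import qualified Data.Set as S

type Name    = String
type Version = Int

-- A declaration: a stand-in for its definition, plus the names it mentions.
data Decl = Decl { declBody :: String, declUses :: [Name] }
  deriving (Eq, Show)

-- Names whose new definition differs from the old one, or which are new.
directlyChanged :: M.Map Name (Version, Decl) -> M.Map Name Decl -> S.Set Name
directlyChanged old new = M.keysSet (M.filterWithKey differs new)
  where differs n d = fmap snd (M.lookup n old) /= Just d

-- Propagate: anything that mentions a changed name counts as changed too.
-- (The real code does this per strongly-connected component; this naive
-- fixpoint just grows the changed set until it stops growing.)
allChanged :: M.Map Name Decl -> S.Set Name -> S.Set Name
allChanged new seed
  | grown == seed = seed
  | otherwise     = allChanged new grown
  where
    grown     = S.union seed (M.keysSet (M.filter touches new))
    touches d = any (`S.member` seed) (declUses d)

-- The new version table: bump exactly the changed names, keep the rest.
bumpVersions :: M.Map Name (Version, Decl) -> M.Map Name Decl -> M.Map Name Version
bumpVersions old new = M.mapWithKey pick new
  where
    changed  = allChanged new (directlyChanged old new)
    oldVer n = fst <$> M.lookup n old
    pick n _
      | n `S.member` changed = maybe 1 (+1) (oldVer n)   -- changed (or brand new): bump
      | otherwise            = maybe 1 id   (oldVer n)   -- unchanged: keep old version
\end{code}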
+ + +\begin{code} +#include "HsVersions.h" + +import HsSyn +import Packages ( isHomeModule, PackageIdH(..) ) +import IfaceSyn ( IfaceDecl(..), IfaceClassOp(..), IfaceConDecl(..), + IfaceRule(..), IfaceInst(..), IfaceExtName(..), + eqIfDecl, eqIfRule, eqIfInst, IfaceEq(..), (&&&), bool, + eqMaybeBy, eqListBy, visibleIfConDecls, + tyThingToIfaceDecl, instanceToIfaceInst, coreRuleToIfaceRule ) +import LoadIface ( readIface, loadInterface ) +import BasicTypes ( Version, initialVersion, bumpVersion ) +import TcRnMonad +import HscTypes ( ModIface(..), ModDetails(..), + ModGuts(..), IfaceExport, + HscEnv(..), hscEPS, Dependencies(..), FixItem(..), + ModSummary(..), msHiFilePath, + mkIfaceDepCache, mkIfaceFixCache, mkIfaceVerCache, + typeEnvElts, + GenAvailInfo(..), availName, + ExternalPackageState(..), + Usage(..), IsBootInterface, + Deprecs(..), IfaceDeprecs, Deprecations, + lookupIfaceByModule + ) + + +import Packages ( HomeModules ) +import DynFlags ( GhcMode(..), DynFlags(..), DynFlag(..), dopt ) +import StaticFlags ( opt_HiVersion ) +import Name ( Name, nameModule, nameOccName, nameParent, + isExternalName, isInternalName, nameParent_maybe, isWiredInName, + isImplicitName, NamedThing(..) ) +import NameEnv +import NameSet +import OccName ( OccName, OccEnv, mkOccEnv, lookupOccEnv, emptyOccEnv, + extendOccEnv_C, + OccSet, emptyOccSet, elemOccSet, occSetElts, + extendOccSet, extendOccSetList, + isEmptyOccSet, intersectOccSet, intersectsOccSet, + occNameFS, isTcOcc ) +import Module ( Module, moduleFS, + ModLocation(..), mkModuleFS, moduleString, + ModuleEnv, emptyModuleEnv, lookupModuleEnv, + extendModuleEnv_C + ) +import Outputable +import Util ( createDirectoryHierarchy, directoryOf ) +import Util ( sortLe, seqList ) +import Binary ( getBinFileWithDict ) +import BinIface ( writeBinIface, v_IgnoreHiWay ) +import Unique ( Unique, Uniquable(..) ) +import ErrUtils ( dumpIfSet_dyn, showPass ) +import Digraph ( stronglyConnComp, SCC(..) ) +import SrcLoc ( SrcSpan ) +import FiniteMap +import FastString + +import DATA_IOREF ( writeIORef ) +import Monad ( when ) +import List ( insert ) +import Maybes ( orElse, mapCatMaybes, isNothing, isJust, + expectJust, MaybeErr(..) ) +\end{code} + + + +%************************************************************************ +%* * +\subsection{Completing an interface} +%* * +%************************************************************************ + +\begin{code} +mkIface :: HscEnv + -> Maybe ModIface -- The old interface, if we have it + -> ModGuts -- Usages, deprecations, etc + -> ModDetails -- The trimmed, tidied interface + -> IO (ModIface, -- The new one, complete with decls and versions + Bool) -- True <=> there was an old Iface, and the new one + -- is identical, so no need to write it + +mkIface hsc_env maybe_old_iface + (ModGuts{ mg_module = this_mod, + mg_boot = is_boot, + mg_usages = usages, + mg_deps = deps, + mg_home_mods = home_mods, + mg_rdr_env = rdr_env, + mg_fix_env = fix_env, + mg_deprecs = src_deprecs }) + (ModDetails{ md_insts = insts, + md_rules = rules, + md_types = type_env, + md_exports = exports }) + +-- NB: notice that mkIface does not look at the bindings +-- only at the TypeEnv. 
The previous Tidy phase has +-- put exactly the info into the TypeEnv that we want +-- to expose in the interface + + = do { eps <- hscEPS hsc_env + ; let { ext_nm_rhs = mkExtNameFn hsc_env home_mods eps this_mod + ; ext_nm_lhs = mkLhsNameFn this_mod + + ; decls = [ tyThingToIfaceDecl ext_nm_rhs thing + | thing <- typeEnvElts type_env, + not (isImplicitName (getName thing)) ] + -- Don't put implicit Ids and class tycons in the interface file + + ; fixities = [(occ,fix) | FixItem occ fix _ <- nameEnvElts fix_env] + ; deprecs = mkIfaceDeprec src_deprecs + ; iface_rules = map (coreRuleToIfaceRule ext_nm_lhs ext_nm_rhs) rules + ; iface_insts = map (instanceToIfaceInst ext_nm_lhs) insts + + ; intermediate_iface = ModIface { + mi_module = this_mod, + mi_package = HomePackage, + mi_boot = is_boot, + mi_deps = deps, + mi_usages = usages, + mi_exports = mkIfaceExports exports, + mi_insts = sortLe le_inst iface_insts, + mi_rules = sortLe le_rule iface_rules, + mi_fixities = fixities, + mi_deprecs = deprecs, + mi_globals = Just rdr_env, + + -- Left out deliberately: filled in by addVersionInfo + mi_mod_vers = initialVersion, + mi_exp_vers = initialVersion, + mi_rule_vers = initialVersion, + mi_orphan = False, -- Always set by addVersionInfo, but + -- it's a strict field, so we can't omit it. + mi_decls = deliberatelyOmitted "decls", + mi_ver_fn = deliberatelyOmitted "ver_fn", + + -- And build the cached values + mi_dep_fn = mkIfaceDepCache deprecs, + mi_fix_fn = mkIfaceFixCache fixities } + + -- Add version information + ; (new_iface, no_change_at_all, pp_diffs, pp_orphs) + = _scc_ "versioninfo" + addVersionInfo maybe_old_iface intermediate_iface decls + } + + -- Debug printing + ; when (isJust pp_orphs && dopt Opt_WarnOrphans dflags) + (printDump (expectJust "mkIface" pp_orphs)) + ; when (dopt Opt_D_dump_hi_diffs dflags) (printDump pp_diffs) + ; dumpIfSet_dyn dflags Opt_D_dump_hi "FINAL INTERFACE" + (pprModIface new_iface) + + ; return (new_iface, no_change_at_all) } + where + r1 `le_rule` r2 = ifRuleName r1 <= ifRuleName r2 + i1 `le_inst` i2 = ifDFun i1 <= ifDFun i2 + + dflags = hsc_dflags hsc_env + deliberatelyOmitted x = panic ("Deliberately omitted: " ++ x) + + +----------------------------- +writeIfaceFile :: ModLocation -> ModIface -> IO () +writeIfaceFile location new_iface + = do createDirectoryHierarchy (directoryOf hi_file_path) + writeBinIface hi_file_path new_iface + where hi_file_path = ml_hi_file location + + +----------------------------- +mkExtNameFn :: HscEnv -> HomeModules -> ExternalPackageState -> Module -> Name -> IfaceExtName +mkExtNameFn hsc_env hmods eps this_mod + = ext_nm + where + hpt = hsc_HPT hsc_env + pit = eps_PIT eps + + ext_nm name + | mod == this_mod = case nameParent_maybe name of + Nothing -> LocalTop occ + Just par -> LocalTopSub occ (nameOccName par) + | isWiredInName name = ExtPkg mod occ + | isHomeModule hmods mod = HomePkg mod occ vers + | otherwise = ExtPkg mod occ + where + mod = nameModule name + occ = nameOccName name + par_occ = nameOccName (nameParent name) + -- The version of the *parent* is the one want + vers = lookupVersion mod par_occ + + lookupVersion :: Module -> OccName -> Version + -- Even though we're looking up a home-package thing, in + -- one-shot mode the imported interfaces may be in the PIT + lookupVersion mod occ + = mi_ver_fn iface occ `orElse` + pprPanic "lookupVers1" (ppr mod <+> ppr occ) + where + iface = lookupIfaceByModule hpt pit mod `orElse` + pprPanic "lookupVers2" (ppr mod <+> ppr occ) + + +--------------------- +-- 
mkLhsNameFn ignores versioning info altogether +-- It is used for the LHS of instance decls and rules, where we +-- there's no point in recording version info +mkLhsNameFn :: Module -> Name -> IfaceExtName +mkLhsNameFn this_mod name + | isInternalName name = pprTrace "mkLhsNameFn: unexpected internal" (ppr name) $ + LocalTop occ -- Should not happen + | mod == this_mod = LocalTop occ + | otherwise = ExtPkg mod occ + where + mod = nameModule name + occ = nameOccName name + + +----------------------------- +-- Compute version numbers for local decls + +addVersionInfo :: Maybe ModIface -- The old interface, read from M.hi + -> ModIface -- The new interface decls (lacking decls) + -> [IfaceDecl] -- The new decls + -> (ModIface, + Bool, -- True <=> no changes at all; no need to write new Iface + SDoc, -- Differences + Maybe SDoc) -- Warnings about orphans + +addVersionInfo Nothing new_iface new_decls +-- No old interface, so definitely write a new one! + = (new_iface { mi_orphan = anyNothing ifInstOrph (mi_insts new_iface) + || anyNothing ifRuleOrph (mi_rules new_iface), + mi_decls = [(initialVersion, decl) | decl <- new_decls], + mi_ver_fn = \n -> Just initialVersion }, + False, + ptext SLIT("No old interface file"), + pprOrphans orph_insts orph_rules) + where + orph_insts = filter (isNothing . ifInstOrph) (mi_insts new_iface) + orph_rules = filter (isNothing . ifRuleOrph) (mi_rules new_iface) + +addVersionInfo (Just old_iface@(ModIface { mi_mod_vers = old_mod_vers, + mi_exp_vers = old_exp_vers, + mi_rule_vers = old_rule_vers, + mi_decls = old_decls, + mi_ver_fn = old_decl_vers, + mi_fix_fn = old_fixities })) + new_iface@(ModIface { mi_fix_fn = new_fixities }) + new_decls + + | no_change_at_all = (old_iface, True, ptext SLIT("Interface file unchanged"), pp_orphs) + | otherwise = (final_iface, False, vcat [ptext SLIT("Interface file has changed"), + nest 2 pp_diffs], pp_orphs) + where + final_iface = new_iface { mi_mod_vers = bump_unless no_output_change old_mod_vers, + mi_exp_vers = bump_unless no_export_change old_exp_vers, + mi_rule_vers = bump_unless no_rule_change old_rule_vers, + mi_orphan = not (null new_orph_rules && null new_orph_insts), + mi_decls = decls_w_vers, + mi_ver_fn = mkIfaceVerCache decls_w_vers } + + decls_w_vers = [(add_vers decl, decl) | decl <- new_decls] + + ------------------- + (old_non_orph_insts, old_orph_insts) = mkOrphMap ifInstOrph (mi_insts old_iface) + (new_non_orph_insts, new_orph_insts) = mkOrphMap ifInstOrph (mi_insts new_iface) + same_insts occ = eqMaybeBy (eqListBy eqIfInst) + (lookupOccEnv old_non_orph_insts occ) + (lookupOccEnv new_non_orph_insts occ) + + (old_non_orph_rules, old_orph_rules) = mkOrphMap ifRuleOrph (mi_rules old_iface) + (new_non_orph_rules, new_orph_rules) = mkOrphMap ifRuleOrph (mi_rules new_iface) + same_rules occ = eqMaybeBy (eqListBy eqIfRule) + (lookupOccEnv old_non_orph_rules occ) + (lookupOccEnv new_non_orph_rules occ) + ------------------- + -- Computing what changed + no_output_change = no_decl_change && no_rule_change && + no_export_change && no_deprec_change + no_export_change = mi_exports new_iface == mi_exports old_iface -- Kept sorted + no_decl_change = isEmptyOccSet changed_occs + no_rule_change = not (changedWrt changed_occs (eqListBy eqIfRule old_orph_rules new_orph_rules) + || changedWrt changed_occs (eqListBy eqIfInst old_orph_insts new_orph_insts)) + no_deprec_change = mi_deprecs new_iface == mi_deprecs old_iface + + -- If the usages havn't changed either, we don't need to write the interface file + no_other_changes = 
mi_usages new_iface == mi_usages old_iface && + mi_deps new_iface == mi_deps old_iface + no_change_at_all = no_output_change && no_other_changes + + pp_diffs = vcat [pp_change no_export_change "Export list" + (ppr old_exp_vers <+> arrow <+> ppr (mi_exp_vers final_iface)), + pp_change no_rule_change "Rules" + (ppr old_rule_vers <+> arrow <+> ppr (mi_rule_vers final_iface)), + pp_change no_deprec_change "Deprecations" empty, + pp_change no_other_changes "Usages" empty, + pp_decl_diffs] + pp_change True what info = empty + pp_change False what info = text what <+> ptext SLIT("changed") <+> info + + ------------------- + old_decl_env = mkOccEnv [(ifName decl, decl) | (_,decl) <- old_decls] + same_fixity n = bool (old_fixities n == new_fixities n) + + ------------------- + -- Adding version info + new_version = bumpVersion old_mod_vers + add_vers decl | occ `elemOccSet` changed_occs = new_version + | otherwise = expectJust "add_vers" (old_decl_vers occ) + -- If it's unchanged, there jolly well + where -- should be an old version number + occ = ifName decl + + ------------------- + changed_occs :: OccSet + changed_occs = computeChangedOccs eq_info + + eq_info :: [(OccName, IfaceEq)] + eq_info = map check_eq new_decls + check_eq new_decl | Just old_decl <- lookupOccEnv old_decl_env occ + = (occ, new_decl `eqIfDecl` old_decl &&& + eq_indirects new_decl) + | otherwise {- No corresponding old decl -} + = (occ, NotEqual) + where + occ = ifName new_decl + + eq_indirects :: IfaceDecl -> IfaceEq + -- When seeing if two decls are the same, remember to + -- check whether any relevant fixity or rules have changed + eq_indirects (IfaceId {ifName = occ}) = eq_ind_occ occ + eq_indirects (IfaceClass {ifName = cls_occ, ifSigs = sigs}) + = same_insts cls_occ &&& + eq_ind_occs [op | IfaceClassOp op _ _ <- sigs] + eq_indirects (IfaceData {ifName = tc_occ, ifCons = cons}) + = same_insts tc_occ &&& same_fixity tc_occ &&& -- The TyCon can have a fixity too + eq_ind_occs (map ifConOcc (visibleIfConDecls cons)) + eq_indirects other = Equal -- Synonyms and foreign declarations + + eq_ind_occ :: OccName -> IfaceEq -- For class ops and Ids; check fixity and rules + eq_ind_occ occ = same_fixity occ &&& same_rules occ + eq_ind_occs = foldr ((&&&) . 
eq_ind_occ) Equal + + ------------------- + -- Diffs + pp_decl_diffs :: SDoc -- Nothing => no changes + pp_decl_diffs + | isEmptyOccSet changed_occs = empty + | otherwise + = vcat [ptext SLIT("Changed occs:") <+> ppr (occSetElts changed_occs), + ptext SLIT("Version change for these decls:"), + nest 2 (vcat (map show_change new_decls))] + + eq_env = mkOccEnv eq_info + show_change new_decl + | not (occ `elemOccSet` changed_occs) = empty + | otherwise + = vcat [ppr occ <+> ppr (old_decl_vers occ) <+> arrow <+> ppr new_version, + nest 2 why] + where + occ = ifName new_decl + why = case lookupOccEnv eq_env occ of + Just (EqBut occs) -> sep [ppr occ <> colon, ptext SLIT("Free vars (only) changed:"), + nest 2 (braces (fsep (map ppr (occSetElts + (occs `intersectOccSet` changed_occs)))))] + Just NotEqual + | Just old_decl <- lookupOccEnv old_decl_env occ + -> vcat [ptext SLIT("Old:") <+> ppr old_decl, + ptext SLIT("New:") <+> ppr new_decl] + | otherwise + -> ppr occ <+> ptext SLIT("only in new interface") + other -> pprPanic "MkIface.show_change" (ppr occ) + + pp_orphs = pprOrphans new_orph_insts new_orph_rules + +pprOrphans insts rules + | null insts && null rules = Nothing + | otherwise + = Just $ vcat [ + if null insts then empty else + hang (ptext SLIT("Warning: orphan instances:")) + 2 (vcat (map ppr insts)), + if null rules then empty else + hang (ptext SLIT("Warning: orphan rules:")) + 2 (vcat (map ppr rules)) + ] + +computeChangedOccs :: [(OccName, IfaceEq)] -> OccSet +computeChangedOccs eq_info + = foldl add_changes emptyOccSet (stronglyConnComp edges) + where + edges :: [((OccName,IfaceEq), Unique, [Unique])] + edges = [ (node, getUnique occ, map getUnique occs) + | node@(occ, iface_eq) <- eq_info + , let occs = case iface_eq of + EqBut occ_set -> occSetElts occ_set + other -> [] ] + + -- Changes in declarations + add_changes :: OccSet -> SCC (OccName, IfaceEq) -> OccSet + add_changes so_far (AcyclicSCC (occ, iface_eq)) + | changedWrt so_far iface_eq -- This one has changed + = extendOccSet so_far occ + add_changes so_far (CyclicSCC pairs) + | changedWrt so_far (foldr1 (&&&) (map snd pairs)) -- One of this group has changed + = extendOccSetList so_far (map fst pairs) + add_changes so_far other = so_far + +changedWrt :: OccSet -> IfaceEq -> Bool +changedWrt so_far Equal = False +changedWrt so_far NotEqual = True +changedWrt so_far (EqBut kids) = so_far `intersectsOccSet` kids + +---------------------- +-- mkOrphMap partitions instance decls or rules into +-- (a) an OccEnv for ones that are not orphans, +-- mapping the local OccName to a list of its decls +-- (b) a list of orphan decls +mkOrphMap :: (decl -> Maybe OccName) -- (Just occ) for a non-orphan decl, keyed by occ + -- Nothing for an orphan decl + -> [decl] -- Sorted into canonical order + -> (OccEnv [decl], -- Non-orphan decls associated with their key; + -- each sublist in canonical order + [decl]) -- Orphan decls; in canonical order +mkOrphMap get_key decls + = foldl go (emptyOccEnv, []) decls + where + go (non_orphs, orphs) d + | Just occ <- get_key d + = (extendOccEnv_C (\ ds _ -> d:ds) non_orphs occ [d], orphs) + | otherwise = (non_orphs, d:orphs) + +anyNothing :: (a -> Maybe b) -> [a] -> Bool +anyNothing p [] = False +anyNothing p (x:xs) = isNothing (p x) || anyNothing p xs + +---------------------- +mkIfaceDeprec :: Deprecations -> IfaceDeprecs +mkIfaceDeprec NoDeprecs = NoDeprecs +mkIfaceDeprec (DeprecAll t) = DeprecAll t +mkIfaceDeprec (DeprecSome env) = DeprecSome (sortLe (<=) (nameEnvElts env)) + +---------------------- 
+bump_unless :: Bool -> Version -> Version +bump_unless True v = v -- True <=> no change +bump_unless False v = bumpVersion v +\end{code} + + +%********************************************************* +%* * +\subsection{Keeping track of what we've slurped, and version numbers} +%* * +%********************************************************* + + +\begin{code} +mkUsageInfo :: HscEnv + -> HomeModules + -> ModuleEnv (Module, Bool, SrcSpan) + -> [(Module, IsBootInterface)] + -> NameSet -> IO [Usage] +mkUsageInfo hsc_env hmods dir_imp_mods dep_mods used_names + = do { eps <- hscEPS hsc_env + ; let usages = mk_usage_info (eps_PIT eps) hsc_env hmods + dir_imp_mods dep_mods used_names + ; usages `seqList` return usages } + -- seq the list of Usages returned: occasionally these + -- don't get evaluated for a while and we can end up hanging on to + -- the entire collection of Ifaces. + +mk_usage_info pit hsc_env hmods dir_imp_mods dep_mods proto_used_names + = mapCatMaybes mkUsage dep_mods + -- ToDo: do we need to sort into canonical order? + where + hpt = hsc_HPT hsc_env + + used_names = mkNameSet $ -- Eliminate duplicates + [ nameParent n -- Just record usage on the 'main' names + | n <- nameSetToList proto_used_names + , not (isWiredInName n) -- Don't record usages for wired-in names + , isExternalName n -- Ignore internal names + ] + + -- ent_map groups together all the things imported and used + -- from a particular module in this package + ent_map :: ModuleEnv [OccName] + ent_map = foldNameSet add_mv emptyModuleEnv used_names + add_mv name mv_map = extendModuleEnv_C add_item mv_map mod [occ] + where + occ = nameOccName name + mod = nameModule name + add_item occs _ = occ:occs + + depend_on_exports mod = case lookupModuleEnv dir_imp_mods mod of + Just (_,no_imp,_) -> not no_imp + Nothing -> True + + -- We want to create a Usage for a home module if + -- a) we used something from; has something in used_names + -- b) we imported all of it, even if we used nothing from it + -- (need to recompile if its export list changes: export_vers) + -- c) is a home-package orphan module (need to recompile if its + -- instance decls change: rules_vers) + mkUsage :: (Module, Bool) -> Maybe Usage + mkUsage (mod_name, _) + | isNothing maybe_iface -- We can't depend on it if we didn't + || not (isHomeModule hmods mod) -- even open the interface! + || (null used_occs + && isNothing export_vers + && not orphan_mod) + = Nothing -- Record no usage info + + | otherwise + = Just (Usage { usg_name = mod, + usg_mod = mod_vers, + usg_exports = export_vers, + usg_entities = ent_vers, + usg_rules = rules_vers }) + where + maybe_iface = lookupIfaceByModule hpt pit mod_name + -- In one-shot mode, the interfaces for home-package + -- modules accumulate in the PIT not HPT. Sigh. 
+ + Just iface = maybe_iface + mod = mi_module iface + orphan_mod = mi_orphan iface + version_env = mi_ver_fn iface + mod_vers = mi_mod_vers iface + rules_vers = mi_rule_vers iface + export_vers | depend_on_exports mod = Just (mi_exp_vers iface) + | otherwise = Nothing + + -- The sort is to put them into canonical order + used_occs = lookupModuleEnv ent_map mod `orElse` [] + ent_vers :: [(OccName,Version)] + ent_vers = [ (occ, version_env occ `orElse` initialVersion) + | occ <- sortLe (<=) used_occs] +\end{code} + +\begin{code} +mkIfaceExports :: NameSet -> [(Module, [GenAvailInfo OccName])] + -- Group by module and sort by occurrence + -- This keeps the list in canonical order +mkIfaceExports exports + = [ (mkModuleFS fs, eltsFM avails) + | (fs, avails) <- fmToList groupFM + ] + where + groupFM :: FiniteMap FastString (FiniteMap FastString (GenAvailInfo OccName)) + -- Deliberately use the FastString so we + -- get a canonical ordering + groupFM = foldl add emptyFM (nameSetToList exports) + + add env name = addToFM_C add_avail env mod_fs + (unitFM avail_fs avail) + where + occ = nameOccName name + mod_fs = moduleFS (nameModule name) + avail | Just p <- nameParent_maybe name = AvailTC (nameOccName p) [occ] + | isTcOcc occ = AvailTC occ [occ] + | otherwise = Avail occ + avail_fs = occNameFS (availName avail) + add_avail avail_fm _ = addToFM_C add_item avail_fm avail_fs avail + + add_item (AvailTC p occs) _ = AvailTC p (List.insert occ occs) + add_item (Avail n) _ = pprPanic "MkIface.addAvail" (ppr n <+> ppr name) +\end{code} + + +%************************************************************************ +%* * + Load the old interface file for this module (unless + we have it aleady), and check whether it is up to date + +%* * +%************************************************************************ + +\begin{code} +checkOldIface :: HscEnv + -> ModSummary + -> Bool -- Source unchanged + -> Maybe ModIface -- Old interface from compilation manager, if any + -> IO (RecompileRequired, Maybe ModIface) + +checkOldIface hsc_env mod_summary source_unchanged maybe_iface + = do { showPass (hsc_dflags hsc_env) + ("Checking old interface for " ++ moduleString (ms_mod mod_summary)) ; + + ; initIfaceCheck hsc_env $ + check_old_iface mod_summary source_unchanged maybe_iface + } + +check_old_iface mod_summary source_unchanged maybe_iface + = -- CHECK WHETHER THE SOURCE HAS CHANGED + ifM (not source_unchanged) + (traceHiDiffs (nest 4 (text "Source file changed or recompilation check turned off"))) + `thenM_` + + -- If the source has changed and we're in interactive mode, avoid reading + -- an interface; just return the one we might have been supplied with. 
+ getGhcMode `thenM` \ ghc_mode -> + if (ghc_mode == Interactive || ghc_mode == JustTypecheck) + && not source_unchanged then + returnM (outOfDate, maybe_iface) + else + + case maybe_iface of { + Just old_iface -> -- Use the one we already have + checkVersions source_unchanged old_iface `thenM` \ recomp -> + returnM (recomp, Just old_iface) + + ; Nothing -> + + -- Try and read the old interface for the current module + -- from the .hi file left from the last time we compiled it + let + iface_path = msHiFilePath mod_summary + in + readIface (ms_mod mod_summary) iface_path False `thenM` \ read_result -> + case read_result of { + Failed err -> -- Old interface file not found, or garbled; give up + traceIf (text "FYI: cannot read old interface file:" + $$ nest 4 err) `thenM_` + returnM (outOfDate, Nothing) + + ; Succeeded iface -> + + -- We have got the old iface; check its versions + checkVersions source_unchanged iface `thenM` \ recomp -> + returnM (recomp, Just iface) + }} +\end{code} + +@recompileRequired@ is called from the HscMain. It checks whether +a recompilation is required. It needs access to the persistent state, +finder, etc, because it may have to load lots of interface files to +check their versions. + +\begin{code} +type RecompileRequired = Bool +upToDate = False -- Recompile not required +outOfDate = True -- Recompile required + +checkVersions :: Bool -- True <=> source unchanged + -> ModIface -- Old interface + -> IfG RecompileRequired +checkVersions source_unchanged iface + | not source_unchanged + = returnM outOfDate + | otherwise + = do { traceHiDiffs (text "Considering whether compilation is required for" <+> + ppr (mi_module iface) <> colon) + + -- Source code unchanged and no errors yet... carry on + + -- First put the dependent-module info, read from the old interface, into the envt, + -- so that when we look for interfaces we look for the right one (.hi or .hi-boot) + -- + -- It's just temporary because either the usage check will succeed + -- (in which case we are done with this module) or it'll fail (in which + -- case we'll compile the module from scratch anyhow). + -- + -- We do this regardless of compilation mode + ; updateEps_ $ \eps -> eps { eps_is_boot = mod_deps } + + ; checkList [checkModUsage u | u <- mi_usages iface] + } + where + -- This is a bit of a hack really + mod_deps :: ModuleEnv (Module, IsBootInterface) + mod_deps = mkModDeps (dep_mods (mi_deps iface)) + +checkModUsage :: Usage -> IfG RecompileRequired +-- Given the usage information extracted from the old +-- M.hi file for the module being compiled, figure out +-- whether M needs to be recompiled. + +checkModUsage (Usage { usg_name = mod_name, usg_mod = old_mod_vers, + usg_rules = old_rule_vers, + usg_exports = maybe_old_export_vers, + usg_entities = old_decl_vers }) + = -- Load the imported interface is possible + let + doc_str = sep [ptext SLIT("need version info for"), ppr mod_name] + in + traceHiDiffs (text "Checking usages for module" <+> ppr mod_name) `thenM_` + + loadInterface doc_str mod_name ImportBySystem `thenM` \ mb_iface -> + -- Load the interface, but don't complain on failure; + -- Instead, get an Either back which we can test + + case mb_iface of { + Failed exn -> (out_of_date (sep [ptext SLIT("Can't find version number for module"), + ppr mod_name])); + -- Couldn't find or parse a module mentioned in the + -- old interface file. 
Don't complain -- it might just be that + -- the current module doesn't need that import and it's been deleted + + Succeeded iface -> + let + new_mod_vers = mi_mod_vers iface + new_decl_vers = mi_ver_fn iface + new_export_vers = mi_exp_vers iface + new_rule_vers = mi_rule_vers iface + in + -- CHECK MODULE + checkModuleVersion old_mod_vers new_mod_vers `thenM` \ recompile -> + if not recompile then + returnM upToDate + else + + -- CHECK EXPORT LIST + if checkExportList maybe_old_export_vers new_export_vers then + out_of_date_vers (ptext SLIT(" Export list changed")) + (expectJust "checkModUsage" maybe_old_export_vers) + new_export_vers + else + + -- CHECK RULES + if old_rule_vers /= new_rule_vers then + out_of_date_vers (ptext SLIT(" Rules changed")) + old_rule_vers new_rule_vers + else + + -- CHECK ITEMS ONE BY ONE + checkList [checkEntityUsage new_decl_vers u | u <- old_decl_vers] `thenM` \ recompile -> + if recompile then + returnM outOfDate -- This one failed, so just bail out now + else + up_to_date (ptext SLIT(" Great! The bits I use are up to date")) + } + +------------------------ +checkModuleVersion old_mod_vers new_mod_vers + | new_mod_vers == old_mod_vers + = up_to_date (ptext SLIT("Module version unchanged")) + + | otherwise + = out_of_date_vers (ptext SLIT(" Module version has changed")) + old_mod_vers new_mod_vers + +------------------------ +checkExportList Nothing new_vers = upToDate +checkExportList (Just v) new_vers = v /= new_vers + +------------------------ +checkEntityUsage new_vers (name,old_vers) + = case new_vers name of + + Nothing -> -- We used it before, but it ain't there now + out_of_date (sep [ptext SLIT("No longer exported:"), ppr name]) + + Just new_vers -- It's there, but is it up to date? + | new_vers == old_vers -> traceHiDiffs (text " Up to date" <+> ppr name <+> parens (ppr new_vers)) `thenM_` + returnM upToDate + | otherwise -> out_of_date_vers (ptext SLIT(" Out of date:") <+> ppr name) + old_vers new_vers + +up_to_date msg = traceHiDiffs msg `thenM_` returnM upToDate +out_of_date msg = traceHiDiffs msg `thenM_` returnM outOfDate +out_of_date_vers msg old_vers new_vers + = out_of_date (hsep [msg, ppr old_vers, ptext SLIT("->"), ppr new_vers]) + +---------------------- +checkList :: [IfG RecompileRequired] -> IfG RecompileRequired +-- This helper is used in two places +checkList [] = returnM upToDate +checkList (check:checks) = check `thenM` \ recompile -> + if recompile then + returnM outOfDate + else + checkList checks +\end{code} + +%************************************************************************ +%* * + Printing interfaces +%* * +%************************************************************************ + +\begin{code} +showIface :: FilePath -> IO () +-- Read binary interface, and print it out +showIface filename = do + -- skip the version check; we don't want to worry about profiled vs. + -- non-profiled interfaces, for example. 
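+  -- (v_IgnoreHiWay is, presumably, the global flag consulted when the binary
+  -- interface is read; setting it lets us dump a .hi file regardless of the
+  -- "way" (e.g. profiled or not) it was built for.)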
+ writeIORef v_IgnoreHiWay True + iface <- Binary.getBinFileWithDict filename + printDump (pprModIface iface) + where +\end{code} + + +\begin{code} +pprModIface :: ModIface -> SDoc +-- Show a ModIface +pprModIface iface + = vcat [ ptext SLIT("interface") + <+> ppr_package (mi_package iface) + <+> ppr (mi_module iface) <+> pp_boot + <+> ppr (mi_mod_vers iface) <+> pp_sub_vers + <+> (if mi_orphan iface then ptext SLIT("[orphan module]") else empty) + <+> int opt_HiVersion + <+> ptext SLIT("where") + , vcat (map pprExport (mi_exports iface)) + , pprDeps (mi_deps iface) + , vcat (map pprUsage (mi_usages iface)) + , pprFixities (mi_fixities iface) + , vcat (map pprIfaceDecl (mi_decls iface)) + , vcat (map ppr (mi_insts iface)) + , vcat (map ppr (mi_rules iface)) + , pprDeprecs (mi_deprecs iface) + ] + where + pp_boot | mi_boot iface = ptext SLIT("[boot]") + | otherwise = empty + ppr_package HomePackage = empty + ppr_package (ExtPackage id) = doubleQuotes (ppr id) + + exp_vers = mi_exp_vers iface + rule_vers = mi_rule_vers iface + + pp_sub_vers | exp_vers == initialVersion && rule_vers == initialVersion = empty + | otherwise = brackets (ppr exp_vers <+> ppr rule_vers) +\end{code} + +When printing export lists, we print like this: + Avail f f + AvailTC C [C, x, y] C(x,y) + AvailTC C [x, y] C!(x,y) -- Exporting x, y but not C + +\begin{code} +pprExport :: IfaceExport -> SDoc +pprExport (mod, items) + = hsep [ ptext SLIT("export"), ppr mod, hsep (map pp_avail items) ] + where + pp_avail :: GenAvailInfo OccName -> SDoc + pp_avail (Avail occ) = ppr occ + pp_avail (AvailTC _ []) = empty + pp_avail (AvailTC n (n':ns)) + | n==n' = ppr n <> pp_export ns + | otherwise = ppr n <> char '|' <> pp_export (n':ns) + + pp_export [] = empty + pp_export names = braces (hsep (map ppr names)) + +pprUsage :: Usage -> SDoc +pprUsage usage + = hsep [ptext SLIT("import"), ppr (usg_name usage), + int (usg_mod usage), + pp_export_version (usg_exports usage), + int (usg_rules usage), + pp_versions (usg_entities usage) ] + where + pp_versions nvs = hsep [ ppr n <+> int v | (n,v) <- nvs ] + pp_export_version Nothing = empty + pp_export_version (Just v) = int v + +pprDeps :: Dependencies -> SDoc +pprDeps (Deps { dep_mods = mods, dep_pkgs = pkgs, dep_orphs = orphs}) + = vcat [ptext SLIT("module dependencies:") <+> fsep (map ppr_mod mods), + ptext SLIT("package dependencies:") <+> fsep (map ppr pkgs), + ptext SLIT("orphans:") <+> fsep (map ppr orphs) + ] + where + ppr_mod (mod_name, boot) = ppr mod_name <+> ppr_boot boot + ppr_boot True = text "[boot]" + ppr_boot False = empty + +pprIfaceDecl :: (Version, IfaceDecl) -> SDoc +pprIfaceDecl (ver, decl) + = ppr_vers ver <+> ppr decl + where + -- Print the version for the decl + ppr_vers v | v == initialVersion = empty + | otherwise = int v + +pprFixities :: [(OccName, Fixity)] -> SDoc +pprFixities [] = empty +pprFixities fixes = ptext SLIT("fixities") <+> pprWithCommas pprFix fixes + where + pprFix (occ,fix) = ppr fix <+> ppr occ + +pprDeprecs NoDeprecs = empty +pprDeprecs (DeprecAll txt) = ptext SLIT("Deprecate all") <+> doubleQuotes (ftext txt) +pprDeprecs (DeprecSome prs) = ptext SLIT("Deprecate") <+> vcat (map pprDeprec prs) + where + pprDeprec (name, txt) = ppr name <+> doubleQuotes (ftext txt) +\end{code} diff --git a/compiler/iface/TcIface.hi-boot-5 b/compiler/iface/TcIface.hi-boot-5 new file mode 100644 index 0000000000..3647edfa22 --- /dev/null +++ b/compiler/iface/TcIface.hi-boot-5 @@ -0,0 +1,5 @@ +__interface TcIface 1 0 where +__export TcIface tcImportDecl ; +1 
tcImportDecl :: Name.Name -> TcRnTypes.IfG TypeRep.TyThing ; + + diff --git a/compiler/iface/TcIface.hi-boot-6 b/compiler/iface/TcIface.hi-boot-6 new file mode 100644 index 0000000000..b03830c03d --- /dev/null +++ b/compiler/iface/TcIface.hi-boot-6 @@ -0,0 +1,7 @@ +module TcIface where + +tcIfaceDecl :: IfaceSyn.IfaceDecl -> TcRnTypes.IfL TypeRep.TyThing +tcIfaceInst :: IfaceSyn.IfaceInst -> TcRnTypes.IfL InstEnv.Instance +tcIfaceRule :: IfaceSyn.IfaceRule -> TcRnTypes.IfL CoreSyn.CoreRule + + diff --git a/compiler/iface/TcIface.lhs b/compiler/iface/TcIface.lhs new file mode 100644 index 0000000000..b902c8c5fe --- /dev/null +++ b/compiler/iface/TcIface.lhs @@ -0,0 +1,977 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[TcIfaceSig]{Type checking of type signatures in interface files} + +\begin{code} +module TcIface ( + tcImportDecl, checkWiredInTyCon, tcHiBootIface, typecheckIface, + tcIfaceDecl, tcIfaceInst, tcIfaceRule, tcIfaceGlobal, + tcExtCoreBindings + ) where + +#include "HsVersions.h" + +import IfaceSyn +import LoadIface ( loadInterface, loadWiredInHomeIface, + loadDecls, findAndReadIface ) +import IfaceEnv ( lookupIfaceTop, lookupIfaceExt, newGlobalBinder, + extendIfaceIdEnv, extendIfaceTyVarEnv, newIPName, + tcIfaceTyVar, tcIfaceLclId, lookupIfaceTc, refineIfaceIdEnv, + newIfaceName, newIfaceNames, ifaceExportNames ) +import BuildTyCl ( buildSynTyCon, buildAlgTyCon, buildDataCon, buildClass, + mkAbstractTyConRhs, mkDataTyConRhs, mkNewTyConRhs ) +import TcRnMonad +import Type ( liftedTypeKind, splitTyConApp, mkTyConApp, + mkTyVarTys, ThetaType ) +import TypeRep ( Type(..), PredType(..) ) +import TyCon ( TyCon, tyConName ) +import HscTypes ( ExternalPackageState(..), + TyThing(..), tyThingClass, tyThingTyCon, + ModIface(..), ModDetails(..), HomeModInfo(..), + emptyModDetails, lookupTypeEnv, lookupType, typeEnvIds ) +import InstEnv ( Instance(..), mkImportedInstance ) +import Unify ( coreRefineTys ) +import CoreSyn +import CoreUtils ( exprType ) +import CoreUnfold +import CoreLint ( lintUnfolding ) +import WorkWrap ( mkWrapper ) +import Id ( Id, mkVanillaGlobal, mkLocalId ) +import MkId ( mkFCallId ) +import IdInfo ( IdInfo, CafInfo(..), WorkerInfo(..), + setUnfoldingInfoLazily, setAllStrictnessInfo, setWorkerInfo, + setArityInfo, setInlinePragInfo, setCafInfo, + vanillaIdInfo, newStrictnessInfo ) +import Class ( Class ) +import TyCon ( tyConDataCons, isTupleTyCon, mkForeignTyCon ) +import DataCon ( DataCon, dataConWorkId, dataConTyVars, dataConInstArgTys, isVanillaDataCon ) +import TysWiredIn ( tupleCon, tupleTyCon, listTyCon, intTyCon, boolTyCon, charTyCon, parrTyCon ) +import Var ( TyVar, mkTyVar, tyVarKind ) +import Name ( Name, nameModule, nameIsLocalOrFrom, isWiredInName, + wiredInNameTyThing_maybe, nameParent ) +import NameEnv +import OccName ( OccName ) +import Module ( Module, lookupModuleEnv ) +import UniqSupply ( initUs_ ) +import Outputable +import ErrUtils ( Message ) +import Maybes ( MaybeErr(..) ) +import SrcLoc ( noSrcLoc ) +import Util ( zipWithEqual, dropList, equalLength ) +import DynFlags ( DynFlag(..), isOneShot ) +\end{code} + +This module takes + + IfaceDecl -> TyThing + IfaceType -> Type + etc + +An IfaceDecl is populated with RdrNames, and these are not renamed to +Names before typechecking, because there should be no scope errors etc. + + -- For (b) consider: f = $(...h....) + -- where h is imported, and calls f via an hi-boot file. + -- This is bad! But it is not seen as a staging error, because h + -- is indeed imported. 
We don't want the type-checker to black-hole + -- when simplifying and compiling the splice! + -- + -- Simple solution: discard any unfolding that mentions a variable + -- bound in this module (and hence not yet processed). + -- The discarding happens when forkM finds a type error. + +%************************************************************************ +%* * +%* tcImportDecl is the key function for "faulting in" * +%* imported things +%* * +%************************************************************************ + +The main idea is this. We are chugging along type-checking source code, and +find a reference to GHC.Base.map. We call tcLookupGlobal, which doesn't find +it in the EPS type envt. So it + 1 loads GHC.Base.hi + 2 gets the decl for GHC.Base.map + 3 typechecks it via tcIfaceDecl + 4 and adds it to the type env in the EPS + +Note that DURING STEP 4, we may find that map's type mentions a type +constructor that also + +Notice that for imported things we read the current version from the EPS +mutable variable. This is important in situations like + ...$(e1)...$(e2)... +where the code that e1 expands to might import some defns that +also turn out to be needed by the code that e2 expands to. + +\begin{code} +tcImportDecl :: Name -> TcM TyThing +-- Entry point for *source-code* uses of importDecl +tcImportDecl name + | Just thing <- wiredInNameTyThing_maybe name + = do { initIfaceTcRn (loadWiredInHomeIface name) + ; return thing } + | otherwise + = do { traceIf (text "tcImportDecl" <+> ppr name) + ; mb_thing <- initIfaceTcRn (importDecl name) + ; case mb_thing of + Succeeded thing -> return thing + Failed err -> failWithTc err } + +checkWiredInTyCon :: TyCon -> TcM () +-- Ensure that the home module of the TyCon (and hence its instances) +-- are loaded. It might not be a wired-in tycon (see the calls in TcUnify), +-- in which case this is a no-op. 
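+-- (A wired-in TyCon is one the compiler knows about directly, e.g. Bool or
+-- Int; its definition never comes from an interface file, but its instances
+-- and rules still live in its home module's interface, hence the load.)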
+checkWiredInTyCon tc + | not (isWiredInName tc_name) + = return () + | otherwise + = do { mod <- getModule + ; if nameIsLocalOrFrom mod tc_name then + -- Don't look for (non-existent) Float.hi when + -- compiling Float.lhs, which mentions Float of course + return () + else -- A bit yukky to call initIfaceTcRn here + initIfaceTcRn (loadWiredInHomeIface tc_name) + } + where + tc_name = tyConName tc + +importDecl :: Name -> IfM lcl (MaybeErr Message TyThing) +-- Get the TyThing for this Name from an interface file +-- It's not a wired-in thing -- the caller caught that +importDecl name + = ASSERT( not (isWiredInName name) ) + do { traceIf nd_doc + + -- Load the interface, which should populate the PTE + ; mb_iface <- loadInterface nd_doc (nameModule name) ImportBySystem + ; case mb_iface of { + Failed err_msg -> return (Failed err_msg) ; + Succeeded iface -> do + + -- Now look it up again; this time we should find it + { eps <- getEps + ; case lookupTypeEnv (eps_PTE eps) name of + Just thing -> return (Succeeded thing) + Nothing -> return (Failed not_found_msg) + }}} + where + nd_doc = ptext SLIT("Need decl for") <+> ppr name + not_found_msg = hang (ptext SLIT("Can't find interface-file declaration for") <+> ppr (nameParent name)) + 2 (vcat [ptext SLIT("Probable cause: bug in .hi-boot file, or inconsistent .hi file"), + ptext SLIT("Use -ddump-if-trace to get an idea of which file caused the error")]) +\end{code} + +%************************************************************************ +%* * + Type-checking a complete interface +%* * +%************************************************************************ + +Suppose we discover we don't need to recompile. Then we must type +check the old interface file. This is a bit different to the +incremental type checking we do as we suck in interface files. Instead +we do things similarly as when we are typechecking source decls: we +bring into scope the type envt for the interface all at once, using a +knot. Remember, the decls aren't necessarily in dependency order -- +and even if they were, the type decls might be mutually recursive. + +\begin{code} +typecheckIface :: ModIface -- Get the decls from here + -> TcRnIf gbl lcl ModDetails +typecheckIface iface + = initIfaceTc iface $ \ tc_env_var -> do + -- The tc_env_var is freshly allocated, private to + -- type-checking this particular interface + { -- Get the right set of decls and rules. If we are compiling without -O + -- we discard pragmas before typechecking, so that we don't "see" + -- information that we shouldn't. From a versioning point of view + -- It's not actually *wrong* to do so, but in fact GHCi is unable + -- to handle unboxed tuples, so it must not see unfoldings. 
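+        -- ignore_prags is consulted twice below: loadDecls is passed the flag
+        -- so that it can drop pragmas, and the rule list is emptied.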
+ ignore_prags <- doptM Opt_IgnoreInterfacePragmas + + -- Load & typecheck the decls + ; decl_things <- loadDecls ignore_prags (mi_decls iface) + + ; let type_env = mkNameEnv decl_things + ; writeMutVar tc_env_var type_env + + -- Now do those rules and instances + ; let { rules | ignore_prags = [] + | otherwise = mi_rules iface + ; dfuns = mi_insts iface + } + ; dfuns <- mapM tcIfaceInst dfuns + ; rules <- mapM tcIfaceRule rules + + -- Exports + ; exports <- ifaceExportNames (mi_exports iface) + + -- Finished + ; return (ModDetails { md_types = type_env, + md_insts = dfuns, + md_rules = rules, + md_exports = exports }) + } +\end{code} + + +%************************************************************************ +%* * + Type and class declarations +%* * +%************************************************************************ + +\begin{code} +tcHiBootIface :: Module -> TcRn ModDetails +-- Load the hi-boot iface for the module being compiled, +-- if it indeed exists in the transitive closure of imports +-- Return the ModDetails, empty if no hi-boot iface +tcHiBootIface mod + = do { traceIf (text "loadHiBootInterface" <+> ppr mod) + + ; mode <- getGhcMode + ; if not (isOneShot mode) + -- In --make and interactive mode, if this module has an hs-boot file + -- we'll have compiled it already, and it'll be in the HPT + -- + -- We check wheher the interface is a *boot* interface. + -- It can happen (when using GHC from Visual Studio) that we + -- compile a module in TypecheckOnly mode, with a stable, + -- fully-populated HPT. In that case the boot interface isn't there + -- (it's been replaced by the mother module) so we can't check it. + -- And that's fine, because if M's ModInfo is in the HPT, then + -- it's been compiled once, and we don't need to check the boot iface + then do { hpt <- getHpt + ; case lookupModuleEnv hpt mod of + Just info | mi_boot (hm_iface info) + -> return (hm_details info) + other -> return emptyModDetails } + else do + + -- OK, so we're in one-shot mode. + -- In that case, we're read all the direct imports by now, + -- so eps_is_boot will record if any of our imports mention us by + -- way of hi-boot file + { eps <- getEps + ; case lookupModuleEnv (eps_is_boot eps) mod of { + Nothing -> return emptyModDetails ; -- The typical case + + Just (_, False) -> failWithTc moduleLoop ; + -- Someone below us imported us! + -- This is a loop with no hi-boot in the way + + Just (mod, True) -> -- There's a hi-boot interface below us + + do { read_result <- findAndReadIface + True -- Explicit import? + need mod + True -- Hi-boot file + + ; case read_result of + Failed err -> failWithTc (elaborate err) + Succeeded (iface, _path) -> typecheckIface iface + }}}} + where + need = ptext SLIT("Need the hi-boot interface for") <+> ppr mod + <+> ptext SLIT("to compare against the Real Thing") + + moduleLoop = ptext SLIT("Circular imports: module") <+> quotes (ppr mod) + <+> ptext SLIT("depends on itself") + + elaborate err = hang (ptext SLIT("Could not find hi-boot interface for") <+> + quotes (ppr mod) <> colon) 4 err +\end{code} + + +%************************************************************************ +%* * + Type and class declarations +%* * +%************************************************************************ + +When typechecking a data type decl, we *lazily* (via forkM) typecheck +the constructor argument types. This is in the hope that we may never +poke on those argument types, and hence may never need to load the +interface files for types mentioned in the arg types. 
+ +E.g. + data Foo.S = MkS Baz.T +Mabye we can get away without even loading the interface for Baz! + +This is not just a performance thing. Suppose we have + data Foo.S = MkS Baz.T + data Baz.T = MkT Foo.S +(in different interface files, of course). +Now, first we load and typecheck Foo.S, and add it to the type envt. +If we do explore MkS's argument, we'll load and typecheck Baz.T. +If we explore MkT's argument we'll find Foo.S already in the envt. + +If we typechecked constructor args eagerly, when loading Foo.S we'd try to +typecheck the type Baz.T. So we'd fault in Baz.T... and then need Foo.S... +which isn't done yet. + +All very cunning. However, there is a rather subtle gotcha which bit +me when developing this stuff. When we typecheck the decl for S, we +extend the type envt with S, MkS, and all its implicit Ids. Suppose +(a bug, but it happened) that the list of implicit Ids depended in +turn on the constructor arg types. Then the following sequence of +events takes place: + * we build a thunk <t> for the constructor arg tys + * we build a thunk for the extended type environment (depends on <t>) + * we write the extended type envt into the global EPS mutvar + +Now we look something up in the type envt + * that pulls on <t> + * which reads the global type envt out of the global EPS mutvar + * but that depends in turn on <t> + +It's subtle, because, it'd work fine if we typechecked the constructor args +eagerly -- they don't need the extended type envt. They just get the extended +type envt by accident, because they look at it later. + +What this means is that the implicitTyThings MUST NOT DEPEND on any of +the forkM stuff. + + +\begin{code} +tcIfaceDecl :: IfaceDecl -> IfL TyThing + +tcIfaceDecl (IfaceId {ifName = occ_name, ifType = iface_type, ifIdInfo = info}) + = do { name <- lookupIfaceTop occ_name + ; ty <- tcIfaceType iface_type + ; info <- tcIdInfo name ty info + ; return (AnId (mkVanillaGlobal name ty info)) } + +tcIfaceDecl (IfaceData {ifName = occ_name, + ifTyVars = tv_bndrs, + ifCtxt = ctxt, + ifCons = rdr_cons, + ifVrcs = arg_vrcs, ifRec = is_rec, + ifGeneric = want_generic }) + = do { tc_name <- lookupIfaceTop occ_name + ; bindIfaceTyVars tv_bndrs $ \ tyvars -> do + + { tycon <- fixM ( \ tycon -> do + { stupid_theta <- tcIfaceCtxt ctxt + ; cons <- tcIfaceDataCons tycon tyvars rdr_cons + ; buildAlgTyCon tc_name tyvars stupid_theta + cons arg_vrcs is_rec want_generic + }) + ; traceIf (text "tcIfaceDecl4" <+> ppr tycon) + ; return (ATyCon tycon) + }} + +tcIfaceDecl (IfaceSyn {ifName = occ_name, ifTyVars = tv_bndrs, + ifSynRhs = rdr_rhs_ty, ifVrcs = arg_vrcs}) + = bindIfaceTyVars tv_bndrs $ \ tyvars -> do + { tc_name <- lookupIfaceTop occ_name + ; rhs_ty <- tcIfaceType rdr_rhs_ty + ; return (ATyCon (buildSynTyCon tc_name tyvars rhs_ty arg_vrcs)) + } + +tcIfaceDecl (IfaceClass {ifCtxt = rdr_ctxt, ifName = occ_name, ifTyVars = tv_bndrs, + ifFDs = rdr_fds, ifSigs = rdr_sigs, + ifVrcs = tc_vrcs, ifRec = tc_isrec }) + = bindIfaceTyVars tv_bndrs $ \ tyvars -> do + { cls_name <- lookupIfaceTop occ_name + ; ctxt <- tcIfaceCtxt rdr_ctxt + ; sigs <- mappM tc_sig rdr_sigs + ; fds <- mappM tc_fd rdr_fds + ; cls <- buildClass cls_name tyvars ctxt fds sigs tc_isrec tc_vrcs + ; return (AClass cls) } + where + tc_sig (IfaceClassOp occ dm rdr_ty) + = do { op_name <- lookupIfaceTop occ + ; op_ty <- forkM (mk_doc op_name rdr_ty) (tcIfaceType rdr_ty) + -- Must be done lazily for just the same reason as the + -- context of a data decl: the type sig might mention the + -- class being defined + 
; return (op_name, dm, op_ty) } + + mk_doc op_name op_ty = ptext SLIT("Class op") <+> sep [ppr op_name, ppr op_ty] + + tc_fd (tvs1, tvs2) = do { tvs1' <- mappM tcIfaceTyVar tvs1 + ; tvs2' <- mappM tcIfaceTyVar tvs2 + ; return (tvs1', tvs2') } + +tcIfaceDecl (IfaceForeign {ifName = rdr_name, ifExtName = ext_name}) + = do { name <- lookupIfaceTop rdr_name + ; return (ATyCon (mkForeignTyCon name ext_name + liftedTypeKind 0 [])) } + +tcIfaceDataCons tycon tc_tyvars if_cons + = case if_cons of + IfAbstractTyCon -> return mkAbstractTyConRhs + IfDataTyCon cons -> do { data_cons <- mappM tc_con_decl cons + ; return (mkDataTyConRhs data_cons) } + IfNewTyCon con -> do { data_con <- tc_con_decl con + ; return (mkNewTyConRhs tycon data_con) } + where + tc_con_decl (IfVanillaCon { ifConOcc = occ, ifConInfix = is_infix, ifConArgTys = args, + ifConStricts = stricts, ifConFields = field_lbls}) + = do { name <- lookupIfaceTop occ + -- Read the argument types, but lazily to avoid faulting in + -- the component types unless they are really needed + ; arg_tys <- forkM (mk_doc name) (mappM tcIfaceType args) + ; lbl_names <- mappM lookupIfaceTop field_lbls + ; buildDataCon name is_infix True {- Vanilla -} + stricts lbl_names + tc_tyvars [] arg_tys tycon + (mkTyVarTys tc_tyvars) -- Vanilla => we know result tys + } + + tc_con_decl (IfGadtCon { ifConTyVars = con_tvs, + ifConOcc = occ, ifConCtxt = ctxt, + ifConArgTys = args, ifConResTys = ress, + ifConStricts = stricts}) + = bindIfaceTyVars con_tvs $ \ con_tyvars -> do + { name <- lookupIfaceTop occ + ; theta <- tcIfaceCtxt ctxt -- Laziness seems not worth the bother here + -- At one stage I thought that this context checking *had* + -- to be lazy, because of possible mutual recursion between the + -- type and the classe: + -- E.g. + -- class Real a where { toRat :: a -> Ratio Integer } + -- data (Real a) => Ratio a = ... + -- But now I think that the laziness in checking class ops breaks + -- the loop, so no laziness needed + + -- Read the argument types, but lazily to avoid faulting in + -- the component types unless they are really needed + ; arg_tys <- forkM (mk_doc name) (mappM tcIfaceType args) + ; res_tys <- forkM (mk_doc name) (mappM tcIfaceType ress) + + ; buildDataCon name False {- Not infix -} False {- Not vanilla -} + stricts [{- No fields -}] + con_tyvars theta + arg_tys tycon res_tys + } + mk_doc con_name = ptext SLIT("Constructor") <+> ppr con_name +\end{code} + + +%************************************************************************ +%* * + Instances +%* * +%************************************************************************ + +\begin{code} +tcIfaceInst :: IfaceInst -> IfL Instance +tcIfaceInst (IfaceInst { ifDFun = dfun_occ, ifOFlag = oflag, + ifInstCls = cls, ifInstTys = mb_tcs, + ifInstOrph = orph }) + = do { dfun <- forkM (ptext SLIT("Dict fun") <+> ppr dfun_occ) $ + tcIfaceExtId (LocalTop dfun_occ) + ; cls' <- lookupIfaceExt cls + ; mb_tcs' <- mapM do_tc mb_tcs + ; return (mkImportedInstance cls' mb_tcs' orph dfun oflag) } + where + do_tc Nothing = return Nothing + do_tc (Just tc) = do { tc' <- lookupIfaceTc tc; return (Just tc') } +\end{code} + + +%************************************************************************ +%* * + Rules +%* * +%************************************************************************ + +We move a IfaceRule from eps_rules to eps_rule_base when all its LHS free vars +are in the type environment. 
However, remember that typechecking a Rule may +(as a side effect) augment the type envt, and so we may need to iterate the process. + +\begin{code} +tcIfaceRule :: IfaceRule -> IfL CoreRule +tcIfaceRule (IfaceRule {ifRuleName = name, ifActivation = act, ifRuleBndrs = bndrs, + ifRuleHead = fn, ifRuleArgs = args, ifRuleRhs = rhs, + ifRuleOrph = orph }) + = do { fn' <- lookupIfaceExt fn + ; ~(bndrs', args', rhs') <- + -- Typecheck the payload lazily, in the hope it'll never be looked at + forkM (ptext SLIT("Rule") <+> ftext name) $ + bindIfaceBndrs bndrs $ \ bndrs' -> + do { args' <- mappM tcIfaceExpr args + ; rhs' <- tcIfaceExpr rhs + ; return (bndrs', args', rhs') } + ; mb_tcs <- mapM ifTopFreeName args + ; returnM (Rule { ru_name = name, ru_fn = fn', ru_act = act, + ru_bndrs = bndrs', ru_args = args', + ru_rhs = rhs', ru_orph = orph, + ru_rough = mb_tcs, + ru_local = isLocalIfaceExtName fn }) } + where + -- This function *must* mirror exactly what Rules.topFreeName does + -- We could have stored the ru_rough field in the iface file + -- but that would be redundant, I think. + -- The only wrinkle is that we must not be deceived by + -- type syononyms at the top of a type arg. Since + -- we can't tell at this point, we are careful not + -- to write them out in coreRuleToIfaceRule + ifTopFreeName :: IfaceExpr -> IfL (Maybe Name) + ifTopFreeName (IfaceType (IfaceTyConApp tc _ )) + = do { n <- lookupIfaceTc tc + ; return (Just n) } + ifTopFreeName (IfaceApp f a) = ifTopFreeName f + ifTopFreeName (IfaceExt ext) = do { n <- lookupIfaceExt ext + ; return (Just n) } + ifTopFreeName other = return Nothing +\end{code} + + +%************************************************************************ +%* * + Types +%* * +%************************************************************************ + +\begin{code} +tcIfaceType :: IfaceType -> IfL Type +tcIfaceType (IfaceTyVar n) = do { tv <- tcIfaceTyVar n; return (TyVarTy tv) } +tcIfaceType (IfaceAppTy t1 t2) = do { t1' <- tcIfaceType t1; t2' <- tcIfaceType t2; return (AppTy t1' t2') } +tcIfaceType (IfaceFunTy t1 t2) = do { t1' <- tcIfaceType t1; t2' <- tcIfaceType t2; return (FunTy t1' t2') } +tcIfaceType (IfaceTyConApp tc ts) = do { tc' <- tcIfaceTyCon tc; ts' <- tcIfaceTypes ts; return (mkTyConApp tc' ts') } +tcIfaceType (IfaceForAllTy tv t) = bindIfaceTyVar tv $ \ tv' -> do { t' <- tcIfaceType t; return (ForAllTy tv' t') } +tcIfaceType (IfacePredTy st) = do { st' <- tcIfacePredType st; return (PredTy st') } + +tcIfaceTypes tys = mapM tcIfaceType tys + +----------------------------------------- +tcIfacePredType :: IfacePredType -> IfL PredType +tcIfacePredType (IfaceClassP cls ts) = do { cls' <- tcIfaceClass cls; ts' <- tcIfaceTypes ts; return (ClassP cls' ts') } +tcIfacePredType (IfaceIParam ip t) = do { ip' <- newIPName ip; t' <- tcIfaceType t; return (IParam ip' t') } + +----------------------------------------- +tcIfaceCtxt :: IfaceContext -> IfL ThetaType +tcIfaceCtxt sts = mappM tcIfacePredType sts +\end{code} + + +%************************************************************************ +%* * + Core +%* * +%************************************************************************ + +\begin{code} +tcIfaceExpr :: IfaceExpr -> IfL CoreExpr +tcIfaceExpr (IfaceType ty) + = tcIfaceType ty `thenM` \ ty' -> + returnM (Type ty') + +tcIfaceExpr (IfaceLcl name) + = tcIfaceLclId name `thenM` \ id -> + returnM (Var id) + +tcIfaceExpr (IfaceExt gbl) + = tcIfaceExtId gbl `thenM` \ id -> + returnM (Var id) + +tcIfaceExpr (IfaceLit lit) + = returnM (Lit lit) + 
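+-- A foreign call is rebuilt on the spot with a fresh Unique; there is
+-- nothing to look up in any environment for it.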
+tcIfaceExpr (IfaceFCall cc ty) + = tcIfaceType ty `thenM` \ ty' -> + newUnique `thenM` \ u -> + returnM (Var (mkFCallId u cc ty')) + +tcIfaceExpr (IfaceTuple boxity args) + = mappM tcIfaceExpr args `thenM` \ args' -> + let + -- Put the missing type arguments back in + con_args = map (Type . exprType) args' ++ args' + in + returnM (mkApps (Var con_id) con_args) + where + arity = length args + con_id = dataConWorkId (tupleCon boxity arity) + + +tcIfaceExpr (IfaceLam bndr body) + = bindIfaceBndr bndr $ \ bndr' -> + tcIfaceExpr body `thenM` \ body' -> + returnM (Lam bndr' body') + +tcIfaceExpr (IfaceApp fun arg) + = tcIfaceExpr fun `thenM` \ fun' -> + tcIfaceExpr arg `thenM` \ arg' -> + returnM (App fun' arg') + +tcIfaceExpr (IfaceCase scrut case_bndr ty alts) + = tcIfaceExpr scrut `thenM` \ scrut' -> + newIfaceName case_bndr `thenM` \ case_bndr_name -> + let + scrut_ty = exprType scrut' + case_bndr' = mkLocalId case_bndr_name scrut_ty + tc_app = splitTyConApp scrut_ty + -- NB: Won't always succeed (polymoprhic case) + -- but won't be demanded in those cases + -- NB: not tcSplitTyConApp; we are looking at Core here + -- look through non-rec newtypes to find the tycon that + -- corresponds to the datacon in this case alternative + in + extendIfaceIdEnv [case_bndr'] $ + mappM (tcIfaceAlt tc_app) alts `thenM` \ alts' -> + tcIfaceType ty `thenM` \ ty' -> + returnM (Case scrut' case_bndr' ty' alts') + +tcIfaceExpr (IfaceLet (IfaceNonRec bndr rhs) body) + = tcIfaceExpr rhs `thenM` \ rhs' -> + bindIfaceId bndr $ \ bndr' -> + tcIfaceExpr body `thenM` \ body' -> + returnM (Let (NonRec bndr' rhs') body') + +tcIfaceExpr (IfaceLet (IfaceRec pairs) body) + = bindIfaceIds bndrs $ \ bndrs' -> + mappM tcIfaceExpr rhss `thenM` \ rhss' -> + tcIfaceExpr body `thenM` \ body' -> + returnM (Let (Rec (bndrs' `zip` rhss')) body') + where + (bndrs, rhss) = unzip pairs + +tcIfaceExpr (IfaceNote note expr) + = tcIfaceExpr expr `thenM` \ expr' -> + case note of + IfaceCoerce to_ty -> tcIfaceType to_ty `thenM` \ to_ty' -> + returnM (Note (Coerce to_ty' + (exprType expr')) expr') + IfaceInlineCall -> returnM (Note InlineCall expr') + IfaceInlineMe -> returnM (Note InlineMe expr') + IfaceSCC cc -> returnM (Note (SCC cc) expr') + IfaceCoreNote n -> returnM (Note (CoreNote n) expr') + +------------------------- +tcIfaceAlt _ (IfaceDefault, names, rhs) + = ASSERT( null names ) + tcIfaceExpr rhs `thenM` \ rhs' -> + returnM (DEFAULT, [], rhs') + +tcIfaceAlt _ (IfaceLitAlt lit, names, rhs) + = ASSERT( null names ) + tcIfaceExpr rhs `thenM` \ rhs' -> + returnM (LitAlt lit, [], rhs') + +-- A case alternative is made quite a bit more complicated +-- by the fact that we omit type annotations because we can +-- work them out. True enough, but its not that easy! 
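+-- For a vanilla constructor we can simply instantiate the argument types;
+-- in the general (GADT-style) case we must also rebind the constructor's
+-- type variables and refine the Id environment before typechecking the RHS.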
+tcIfaceAlt (tycon, inst_tys) (IfaceDataAlt data_occ, arg_occs, rhs) + = do { let tycon_mod = nameModule (tyConName tycon) + ; con <- tcIfaceDataCon (ExtPkg tycon_mod data_occ) + ; ASSERT2( con `elem` tyConDataCons tycon, + ppr con $$ ppr tycon $$ ppr (tyConDataCons tycon) ) + + if isVanillaDataCon con then + tcVanillaAlt con inst_tys arg_occs rhs + else + do { -- General case + arg_names <- newIfaceNames arg_occs + ; let tyvars = [ mkTyVar name (tyVarKind tv) + | (name,tv) <- arg_names `zip` dataConTyVars con] + arg_tys = dataConInstArgTys con (mkTyVarTys tyvars) + id_names = dropList tyvars arg_names + arg_ids = ASSERT2( equalLength id_names arg_tys, + ppr (con, arg_names, rhs) $$ ppr tyvars $$ ppr arg_tys ) + zipWith mkLocalId id_names arg_tys + + Just refine = coreRefineTys con tyvars (mkTyConApp tycon inst_tys) + + ; rhs' <- extendIfaceTyVarEnv tyvars $ + extendIfaceIdEnv arg_ids $ + refineIfaceIdEnv refine $ + -- You might think that we don't need to refine the envt here, + -- but we do: \(x::a) -> case y of + -- MkT -> case x of { True -> ... } + -- In the "case x" we need to know x's type, because we use that + -- to find which module to look for "True" in. Sigh. + tcIfaceExpr rhs + ; return (DataAlt con, tyvars ++ arg_ids, rhs') }} + +tcIfaceAlt (tycon, inst_tys) (IfaceTupleAlt boxity, arg_occs, rhs) + = ASSERT( isTupleTyCon tycon ) + do { let [data_con] = tyConDataCons tycon + ; tcVanillaAlt data_con inst_tys arg_occs rhs } + +tcVanillaAlt data_con inst_tys arg_occs rhs + = do { arg_names <- newIfaceNames arg_occs + ; let arg_tys = dataConInstArgTys data_con inst_tys + ; let arg_ids = ASSERT2( equalLength arg_names arg_tys, + ppr data_con <+> ppr inst_tys <+> ppr arg_occs $$ ppr rhs ) + zipWith mkLocalId arg_names arg_tys + ; rhs' <- extendIfaceIdEnv arg_ids (tcIfaceExpr rhs) + ; returnM (DataAlt data_con, arg_ids, rhs') } +\end{code} + + +\begin{code} +tcExtCoreBindings :: [IfaceBinding] -> IfL [CoreBind] -- Used for external core +tcExtCoreBindings [] = return [] +tcExtCoreBindings (b:bs) = do_one b (tcExtCoreBindings bs) + +do_one :: IfaceBinding -> IfL [CoreBind] -> IfL [CoreBind] +do_one (IfaceNonRec bndr rhs) thing_inside + = do { rhs' <- tcIfaceExpr rhs + ; bndr' <- newExtCoreBndr bndr + ; extendIfaceIdEnv [bndr'] $ do + { core_binds <- thing_inside + ; return (NonRec bndr' rhs' : core_binds) }} + +do_one (IfaceRec pairs) thing_inside + = do { bndrs' <- mappM newExtCoreBndr bndrs + ; extendIfaceIdEnv bndrs' $ do + { rhss' <- mappM tcIfaceExpr rhss + ; core_binds <- thing_inside + ; return (Rec (bndrs' `zip` rhss') : core_binds) }} + where + (bndrs,rhss) = unzip pairs +\end{code} + + +%************************************************************************ +%* * + IdInfo +%* * +%************************************************************************ + +\begin{code} +tcIdInfo :: Name -> Type -> IfaceIdInfo -> IfL IdInfo +tcIdInfo name ty NoInfo = return vanillaIdInfo +tcIdInfo name ty (HasInfo info) = foldlM tcPrag init_info info + where + -- Set the CgInfo to something sensible but uninformative before + -- we start; default assumption is that it has CAFs + init_info = vanillaIdInfo + + tcPrag info HsNoCafRefs = returnM (info `setCafInfo` NoCafRefs) + tcPrag info (HsArity arity) = returnM (info `setArityInfo` arity) + tcPrag info (HsStrictness str) = returnM (info `setAllStrictnessInfo` Just str) + + -- The next two are lazy, so they don't transitively suck stuff in + tcPrag info (HsWorker nm arity) = tcWorkerInfo ty info nm arity + tcPrag info (HsUnfold inline_prag expr) 
+ = tcPragExpr name expr `thenM` \ maybe_expr' -> + let + -- maybe_expr' doesn't get looked at if the unfolding + -- is never inspected; so the typecheck doesn't even happen + unfold_info = case maybe_expr' of + Nothing -> noUnfolding + Just expr' -> mkTopUnfolding expr' + in + returnM (info `setUnfoldingInfoLazily` unfold_info + `setInlinePragInfo` inline_prag) +\end{code} + +\begin{code} +tcWorkerInfo ty info wkr arity + = do { mb_wkr_id <- forkM_maybe doc (tcIfaceExtId wkr) + + -- We return without testing maybe_wkr_id, but as soon as info is + -- looked at we will test it. That's ok, because its outside the + -- knot; and there seems no big reason to further defer the + -- tcIfaceId lookup. (Contrast with tcPragExpr, where postponing walking + -- over the unfolding until it's actually used does seem worth while.) + ; us <- newUniqueSupply + + ; returnM (case mb_wkr_id of + Nothing -> info + Just wkr_id -> add_wkr_info us wkr_id info) } + where + doc = text "Worker for" <+> ppr wkr + add_wkr_info us wkr_id info + = info `setUnfoldingInfoLazily` mk_unfolding us wkr_id + `setWorkerInfo` HasWorker wkr_id arity + + mk_unfolding us wkr_id = mkTopUnfolding (initUs_ us (mkWrapper ty strict_sig) wkr_id) + + -- We are relying here on strictness info always appearing + -- before worker info, fingers crossed .... + strict_sig = case newStrictnessInfo info of + Just sig -> sig + Nothing -> pprPanic "Worker info but no strictness for" (ppr wkr) +\end{code} + +For unfoldings we try to do the job lazily, so that we never type check +an unfolding that isn't going to be looked at. + +\begin{code} +tcPragExpr :: Name -> IfaceExpr -> IfL (Maybe CoreExpr) +tcPragExpr name expr + = forkM_maybe doc $ + tcIfaceExpr expr `thenM` \ core_expr' -> + + -- Check for type consistency in the unfolding + ifOptM Opt_DoCoreLinting ( + get_in_scope_ids `thenM` \ in_scope -> + case lintUnfolding noSrcLoc in_scope core_expr' of + Nothing -> returnM () + Just fail_msg -> pprPanic "Iface Lint failure" (doc <+> fail_msg) + ) `thenM_` + + returnM core_expr' + where + doc = text "Unfolding of" <+> ppr name + get_in_scope_ids -- Urgh; but just for linting + = setLclEnv () $ + do { env <- getGblEnv + ; case if_rec_types env of { + Nothing -> return [] ; + Just (_, get_env) -> do + { type_env <- get_env + ; return (typeEnvIds type_env) }}} +\end{code} + + + +%************************************************************************ +%* * + Getting from Names to TyThings +%* * +%************************************************************************ + +\begin{code} +tcIfaceGlobal :: Name -> IfL TyThing +tcIfaceGlobal name + | Just thing <- wiredInNameTyThing_maybe name + -- Wired-in things include TyCons, DataCons, and Ids + = do { loadWiredInHomeIface name; return thing } + -- Even though we are in an interface file, we want to make + -- sure its instances are loaded (imagine f :: Double -> Double) + -- and its RULES are loaded too + | otherwise + = do { (eps,hpt) <- getEpsAndHpt + ; case lookupType hpt (eps_PTE eps) name of { + Just thing -> return thing ; + Nothing -> do + + { env <- getGblEnv + ; case if_rec_types env of { + Just (mod, get_type_env) + | nameIsLocalOrFrom mod name + -> do -- It's defined in the module being compiled + { type_env <- setLclEnv () get_type_env -- yuk + ; case lookupNameEnv type_env name of + Just thing -> return thing + Nothing -> pprPanic "tcIfaceGlobal (local): not found:" + (ppr name $$ ppr type_env) } + + ; other -> do + + { mb_thing <- importDecl name -- It's imported; go get it + ; case 
mb_thing of + Failed err -> failIfM err + Succeeded thing -> return thing + }}}}} + +tcIfaceTyCon :: IfaceTyCon -> IfL TyCon +tcIfaceTyCon IfaceIntTc = tcWiredInTyCon intTyCon +tcIfaceTyCon IfaceBoolTc = tcWiredInTyCon boolTyCon +tcIfaceTyCon IfaceCharTc = tcWiredInTyCon charTyCon +tcIfaceTyCon IfaceListTc = tcWiredInTyCon listTyCon +tcIfaceTyCon IfacePArrTc = tcWiredInTyCon parrTyCon +tcIfaceTyCon (IfaceTupTc bx ar) = tcWiredInTyCon (tupleTyCon bx ar) +tcIfaceTyCon (IfaceTc ext_nm) = do { name <- lookupIfaceExt ext_nm + ; thing <- tcIfaceGlobal name + ; return (check_tc (tyThingTyCon thing)) } + where +#ifdef DEBUG + check_tc tc = case toIfaceTyCon (error "urk") tc of + IfaceTc _ -> tc + other -> pprTrace "check_tc" (ppr tc) tc +#else + check_tc tc = tc +#endif + +-- Even though we are in an interface file, we want to make +-- sure the instances and RULES of this tycon are loaded +-- Imagine: f :: Double -> Double +tcWiredInTyCon :: TyCon -> IfL TyCon +tcWiredInTyCon tc = do { loadWiredInHomeIface (tyConName tc) + ; return tc } + +tcIfaceClass :: IfaceExtName -> IfL Class +tcIfaceClass rdr_name = do { name <- lookupIfaceExt rdr_name + ; thing <- tcIfaceGlobal name + ; return (tyThingClass thing) } + +tcIfaceDataCon :: IfaceExtName -> IfL DataCon +tcIfaceDataCon gbl = do { name <- lookupIfaceExt gbl + ; thing <- tcIfaceGlobal name + ; case thing of + ADataCon dc -> return dc + other -> pprPanic "tcIfaceExtDC" (ppr gbl $$ ppr name$$ ppr thing) } + +tcIfaceExtId :: IfaceExtName -> IfL Id +tcIfaceExtId gbl = do { name <- lookupIfaceExt gbl + ; thing <- tcIfaceGlobal name + ; case thing of + AnId id -> return id + other -> pprPanic "tcIfaceExtId" (ppr gbl $$ ppr name$$ ppr thing) } +\end{code} + +%************************************************************************ +%* * + Bindings +%* * +%************************************************************************ + +\begin{code} +bindIfaceBndr :: IfaceBndr -> (CoreBndr -> IfL a) -> IfL a +bindIfaceBndr (IfaceIdBndr bndr) thing_inside + = bindIfaceId bndr thing_inside +bindIfaceBndr (IfaceTvBndr bndr) thing_inside + = bindIfaceTyVar bndr thing_inside + +bindIfaceBndrs :: [IfaceBndr] -> ([CoreBndr] -> IfL a) -> IfL a +bindIfaceBndrs [] thing_inside = thing_inside [] +bindIfaceBndrs (b:bs) thing_inside + = bindIfaceBndr b $ \ b' -> + bindIfaceBndrs bs $ \ bs' -> + thing_inside (b':bs') + +----------------------- +bindIfaceId :: (OccName, IfaceType) -> (Id -> IfL a) -> IfL a +bindIfaceId (occ, ty) thing_inside + = do { name <- newIfaceName occ + ; ty' <- tcIfaceType ty + ; let { id = mkLocalId name ty' } + ; extendIfaceIdEnv [id] (thing_inside id) } + +bindIfaceIds :: [(OccName, IfaceType)] -> ([Id] -> IfL a) -> IfL a +bindIfaceIds bndrs thing_inside + = do { names <- newIfaceNames occs + ; tys' <- mappM tcIfaceType tys + ; let { ids = zipWithEqual "tcCoreValBndr" mkLocalId names tys' } + ; extendIfaceIdEnv ids (thing_inside ids) } + where + (occs,tys) = unzip bndrs + + +----------------------- +newExtCoreBndr :: (OccName, IfaceType) -> IfL Id +newExtCoreBndr (occ, ty) + = do { mod <- getIfModule + ; name <- newGlobalBinder mod occ Nothing noSrcLoc + ; ty' <- tcIfaceType ty + ; return (mkLocalId name ty') } + +----------------------- +bindIfaceTyVar :: IfaceTvBndr -> (TyVar -> IfL a) -> IfL a +bindIfaceTyVar (occ,kind) thing_inside + = do { name <- newIfaceName occ + ; let tyvar = mk_iface_tyvar name kind + ; extendIfaceTyVarEnv [tyvar] (thing_inside tyvar) } + +bindIfaceTyVars :: [IfaceTvBndr] -> ([TyVar] -> IfL a) -> IfL a +bindIfaceTyVars 
bndrs thing_inside + = do { names <- newIfaceNames occs + ; let tyvars = zipWith mk_iface_tyvar names kinds + ; extendIfaceTyVarEnv tyvars (thing_inside tyvars) } + where + (occs,kinds) = unzip bndrs + +mk_iface_tyvar name kind = mkTyVar name kind +\end{code} + diff --git a/compiler/iface/TcIface.lhs-boot b/compiler/iface/TcIface.lhs-boot new file mode 100644 index 0000000000..25191fcaae --- /dev/null +++ b/compiler/iface/TcIface.lhs-boot @@ -0,0 +1,13 @@ +\begin{code} +module TcIface where +import IfaceSyn ( IfaceDecl, IfaceInst, IfaceRule ) +import TypeRep ( TyThing ) +import TcRnTypes ( IfL ) +import InstEnv ( Instance ) +import CoreSyn ( CoreRule ) + +tcIfaceDecl :: IfaceDecl -> IfL TyThing +tcIfaceInst :: IfaceInst -> IfL Instance +tcIfaceRule :: IfaceRule -> IfL CoreRule +\end{code} + diff --git a/compiler/ilxGen/Entry.ilx b/compiler/ilxGen/Entry.ilx new file mode 100644 index 0000000000..674c83141a --- /dev/null +++ b/compiler/ilxGen/Entry.ilx @@ -0,0 +1,53 @@ +.assembly test { } +.assembly extern 'mscorlib' { } +.assembly extern ilx 'std' { } +// ENTRYPOINT +.class MainMain { + .method public static void Main(class [mscorlib]System.String[]) { + .entrypoint + ldstr "LOG: *** loading main value" call void [mscorlib]System.Console::WriteLine(class [mscorlib]System.String) + ldsfld thunk<(func ( /* unit skipped */ ) --> class [std]PrelBase_Z0T)> class Main::'Main_main' + + ldstr "LOG: *** evaluating main value" + call void [mscorlib]System.Console::WriteLine(class [mscorlib]System.String) + callfunc () --> (func ( /* unit skipped */ ) --> class [std]PrelBase_Z0T) + ldstr "LOG: *** calling main value" + call void [mscorlib]System.Console::WriteLine(class [mscorlib]System.String) + // ldunit + callfunc ( /* unit skipped */ ) --> class [std]PrelBase_Z0T + + pop + + +// HACK HACK HACK +// Call the "finalizers" for stdin, stdout and stderr, because COM+ doesn't +// guarantee that finalizers will be run. WE DON'T GUARANTEE TO RUN ANY +// OTHER FINALIZERS... 
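+// (The same finalizer thunk, PrelHandle_stdHandleFinalizzer, is applied in
+// turn to stdin, stdout and stderr below.)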
+ + ldstr "LOG: ***calling critical finalizers manually in main()" + call void [mscorlib]System.Console::WriteLine(class [mscorlib]System.String) + +ldsfld thunk<(func (thunk<class [std]PrelIOBase_MVar<class [std]PrelIOBase_Handlezuzu>>) --> (func (/* unit skipped */) --> class [std]PrelBase_Z0T))> [std]'PrelHandle'::'PrelHandle_stdHandleFinalizzer' +ldsfld thunk<class [std]PrelIOBase_MVar<class [std]PrelIOBase_Handle__>> [std]'PrelHandle'::'PrelHandle_stdin' + callfunc () (thunk<class [std]PrelIOBase_MVar<class [std]PrelIOBase_Handle__>>) --> (func ( /* unit skipped */ ) --> class [std]PrelBase_Z0T) + callfunc ( /* unit skipped */ ) --> class [std]PrelBase_Z0T + pop + +ldsfld thunk<(func (thunk<class [std]PrelIOBase_MVar<class [std]PrelIOBase_Handlezuzu>>) --> (func (/* unit skipped */) --> class [std]PrelBase_Z0T))> [std]'PrelHandle'::'PrelHandle_stdHandleFinalizzer' +ldsfld thunk<class [std]PrelIOBase_MVar<class [std]PrelIOBase_Handle__>> [std]'PrelHandle'::'PrelHandle_stdout' + callfunc () (thunk<class [std]PrelIOBase_MVar<class [std]PrelIOBase_Handle__>>) --> (func ( /* unit skipped */ ) --> class [std]PrelBase_Z0T) + callfunc ( /* unit skipped */ ) --> class [std]PrelBase_Z0T + pop + +ldsfld thunk<(func (thunk<class [std]PrelIOBase_MVar<class [std]PrelIOBase_Handlezuzu>>) --> (func (/* unit skipped */) --> class [std]PrelBase_Z0T))> [std]'PrelHandle'::'PrelHandle_stdHandleFinalizzer' +ldsfld thunk<class [std]PrelIOBase_MVar<class [std]PrelIOBase_Handle__>> [std]'PrelHandle'::'PrelHandle_stderr' + callfunc () (thunk<class [std]PrelIOBase_MVar<class [std]PrelIOBase_Handle__>>) --> (func ( /* unit skipped */ ) --> class [std]PrelBase_Z0T) + callfunc ( /* unit skipped */ ) --> class [std]PrelBase_Z0T + pop + + ldstr "LOG: exit main()\n" + call void [mscorlib]System.Console::WriteLine(class [mscorlib]System.String) + ret + } +} + diff --git a/compiler/ilxGen/IlxGen.lhs b/compiler/ilxGen/IlxGen.lhs new file mode 100644 index 0000000000..19e9f76ecf --- /dev/null +++ b/compiler/ilxGen/IlxGen.lhs @@ -0,0 +1,2403 @@ +% +\section{Generate .NET extended IL} + +\begin{code} +module IlxGen( ilxGen ) where + +#include "HsVersions.h" + +import Char ( ord, chr ) +import StgSyn +import Id ( idType, idName, isDeadBinder, idArity ) +import Var ( Var, Id, TyVar, isId, isTyVar, tyVarKind, tyVarName ) +import VarEnv +import VarSet ( isEmptyVarSet ) +import TyCon ( TyCon, tyConPrimRep, isUnboxedTupleTyCon, tyConDataCons, + tyConTyVars, isDataTyCon, isAlgTyCon, tyConArity + ) +import Type ( liftedTypeKind, openTypeKind, unliftedTypeKind, + isUnLiftedType, isTyVarTy, mkTyVarTy, predTypeRep, pprType, + splitForAllTys, splitFunTys, applyTy, applyTys, eqKind, tyVarsOfTypes + ) +import TypeRep ( Type(..) ) +import DataCon ( isUnboxedTupleCon, dataConTyCon, dataConRepType, dataConRepArgTys, DataCon(..) ) +import Literal ( Literal(..) ) +import PrelNames -- Lots of keys +import PrimOp ( PrimOp(..) ) +import ForeignCall ( CCallConv(..), ForeignCall(..), CCallSpec(..), CCallTarget(..), DNCallSpec(..) ) +import TysWiredIn ( mkTupleTy, tupleCon ) +import PrimRep ( PrimRep(..) ) +import Name ( nameModule, nameOccName, isExternalName, isInternalName, NamedThing(getName) ) +import Subst ( substTyWith ) + +import Module ( Module, PackageName, ModuleName, moduleName, + modulePackage, basePackage, + isHomeModule, isVanillaModule, + pprModuleName, mkHomeModule, mkModuleName + ) + +import UniqFM +import BasicTypes ( Boxity(..) 
) +import CStrings ( CLabelString, pprCLabelString ) +import Outputable +import Char ( ord ) +import List ( partition, elem, insertBy,any ) +import UniqSet + +import TysPrim ( foreignObjPrimTyCon, weakPrimTyCon, byteArrayPrimTyCon, mutableByteArrayPrimTyCon ) + +-- opt_SimplDoEtaReduction is used to help with assembly naming conventions for different +-- versions of compiled Haskell code. We add a ".O" to all assembly and module +-- names when this is set (because that's clue that -O was set). +-- One day this will be configured by the command line. +import DynFlags ( opt_InPackage, opt_SimplDoEtaReduction ) + +import Util ( lengthIs, equalLength ) + +\end{code} + + + +%************************************************************************ +%* * +\subsection{Main driver} +%* * +%************************************************************************ + +\begin{code} +ilxGen :: Module -> [TyCon] -> [(StgBinding,[Id])] -> SDoc + -- The TyCons should include those arising from classes +ilxGen mod tycons binds_w_srts + = vcat [ text ".module '" <> (ppr (moduleName mod)) <> hscOptionQual <> text "o'", + text ".assembly extern 'mscorlib' {}", + vcat (map (ilxImportPackage topenv) (uniqSetToList import_packages)), + vcat (map (ilxImportModule topenv) (uniqSetToList import_modules)), + vcat (map (ilxImportTyCon topenv) (uniqSetToList import_tycons)), + vcat (map (ilxImportCCall topenv) (map snd (ufmToList import_ccalls))), + vcat (map (ilxTyCon topenv) data_tycons), + vcat (map (ilxBindClosures topenv) binds), + ilxTopBind mod topenv toppairs + ] + where + binds = map fst binds_w_srts + toppairs = ilxPairs binds + topenv = extendIlxEnvWithTops (emptyIlxEnv False mod) mod toppairs + -- Generate info from class decls as well + (import_packages,import_modules,import_tycons,import_ccalls) = importsBinds topenv binds (importsPrelude emptyImpInfo) + data_tycons = filter isDataTyCon tycons +\end{code} + +%************************************************************************ +%* * +\subsection{Find Imports} +%* * +%************************************************************************ + +\begin{code} + +importsBinds :: IlxEnv -> [StgBinding] -> ImportsInfo -> ImportsInfo +importsBinds env binds = foldR (importsBind env) binds + +importsNone :: ImportsInfo -> ImportsInfo +importsNone sofar = sofar + +importsBind :: IlxEnv -> StgBinding -> ImportsInfo -> ImportsInfo +importsBind env (StgNonRec _ b rhs) = importsRhs env rhs.importsVar env b +importsBind env (StgRec _ pairs) = foldR (\(b,rhs) -> importsRhs env rhs . importsVar env b) pairs + +importsRhs :: IlxEnv -> StgRhs -> ImportsInfo -> ImportsInfo +importsRhs env (StgRhsCon _ con args) = importsDataCon env con . importsStgArgs env args +importsRhs env (StgRhsClosure _ _ _ _ args body) = importsExpr env body. importsVars env args + +importsExpr :: IlxEnv -> StgExpr -> ImportsInfo -> ImportsInfo +importsExpr env (StgLit _) = importsNone +importsExpr env (StgApp f args) = importsVar env f.importsStgArgs env args +importsExpr env (StgConApp con args) = importsDataCon env con.importsStgArgs env args +importsExpr env (StgOpApp (StgFCallOp (CCall (CCallSpec (StaticTarget c) cc _)) _) args rty) + = addCCallInfo (c,cc, map stgArgType tm_args, rty) . importsStgArgs env args + where + (ty_args,tm_args) = splitTyArgs1 args + +importsExpr env (StgOpApp _ args res_ty) = importsType env res_ty. importsStgArgs env args + + +importsExpr env (StgSCC _ expr) = importsExpr env expr +importsExpr env (StgCase scrut _ _ bndr _ alts) + = importsExpr env scrut. 
imports_alts alts. importsVar env bndr + where + imports_alts (StgAlgAlts _ alg_alts deflt) -- The Maybe TyCon part is dealt with + -- by the case-binder's type + = foldR imports_alg_alt alg_alts . imports_deflt deflt + where + imports_alg_alt (con, bndrs, _, rhs) + = importsExpr env rhs . importsDataCon env con. importsVars env bndrs + + imports_alts (StgPrimAlts _ alg_alts deflt) + = foldR imports_prim_alt alg_alts . imports_deflt deflt + where + imports_prim_alt (_, rhs) = importsExpr env rhs + imports_deflt StgNoDefault = importsNone + imports_deflt (StgBindDefault rhs) = importsExpr env rhs + + +importsExpr env (StgLetNoEscape _ _ bind body) = importsExpr env (StgLet bind body) +importsExpr env (StgLet bind body) + = importsBind env bind . importsExpr env body + +importsApp env v args = importsVar env v. importsStgArgs env args +importsStgArgs env args = foldR (importsStgArg env) args + +importsStgArg :: IlxEnv -> StgArg -> ImportsInfo -> ImportsInfo +importsStgArg env (StgTypeArg ty) = importsType env ty +importsStgArg env (StgVarArg v) = importsVar env v +importsStgArg env _ = importsNone + +importsVars env vs = foldR (importsVar env) vs +importsVar env v = importsName env (idName v). importsType env (idType v) + +importsName env n + | isInternalName n = importsNone + | ilxEnvModule env == nameModule n = importsNone + | isHomeModule (nameModule n) = addModuleImpInfo (moduleName (nameModule n)) +-- See HACK below + | isVanillaModule (nameModule n) && not inPrelude = importsPrelude + | isVanillaModule (nameModule n) && inPrelude = addModuleImpInfo (moduleName (nameModule n)) +-- End HACK + | otherwise = addPackageImpInfo (modulePackage (nameModule n)) + + +importsPrelude | inPrelude = addModuleImpInfo (mkModuleName "PrelGHC") + | otherwise = addPackageImpInfo basePackage + + +importsType :: IlxEnv -> Type -> ImportsInfo -> ImportsInfo +importsType env ty = importsType2 env (deepIlxRepType ty) + +importsType2 :: IlxEnv -> Type -> ImportsInfo -> ImportsInfo +importsType2 env (AppTy f x) = importsType2 env f . importsType2 env x +importsType2 env (TyVarTy _) = importsNone +importsType2 env (TyConApp tc args) =importsTyCon env tc . importsTypeArgs2 env args +importsType2 env (FunTy arg res) = importsType env arg . importsType2 env res +importsType2 env (ForAllTy tv body_ty) = importsType2 env body_ty +importsType2 env (NoteTy _ ty) = importsType2 env ty +importsType2 _ _ = panic "IlxGen.lhs: importsType2 ty" +importsTypeArgs2 env tys = foldR (importsType2 env) tys + +importsDataCon env dcon = importsTyCon env (dataConTyCon dcon) + +importsTyCon env tc | (not (isDataTyCon tc) || + isInternalName (getName tc) || + ilxEnvModule env == nameModule (getName tc)) = importsNone +importsTyCon env tc | otherwise = importsName env (getName tc) . addTyConImpInfo tc . + foldR (importsTyConDataCon env) (tyConDataCons tc) + + +importsTyConDataCon :: IlxEnv -> DataCon -> ImportsInfo -> ImportsInfo +importsTyConDataCon env dcon = foldR (importsTyConDataConType env) (filter (not . isVoidIlxRepType) (dataConRepArgTys dcon)) + +importsTyConDataConType :: IlxEnv -> Type -> ImportsInfo -> ImportsInfo +importsTyConDataConType env ty = importsTyConDataConType2 env (deepIlxRepType ty) + +importsTyConDataConType2 :: IlxEnv -> Type -> ImportsInfo -> ImportsInfo +importsTyConDataConType2 env (AppTy f x) = importsTyConDataConType2 env f . importsTyConDataConType2 env x +importsTyConDataConType2 env (TyVarTy _) = importsNone +importsTyConDataConType2 env (TyConApp tc args) = importsTyConDataConTypeTyCon env tc . 
importsTyConDataConTypeArgs2 env args +importsTyConDataConType2 env (FunTy arg res) = importsTyConDataConType env arg . importsTyConDataConType2 env res +importsTyConDataConType2 env (ForAllTy tv body_ty) = importsTyConDataConType2 env body_ty +importsTyConDataConType2 env (NoteTy _ ty) = importsTyConDataConType2 env ty +importsTyConDataConType2 _ _ = panic "IlxGen.lhs: importsTyConDataConType2 ty" +importsTyConDataConTypeArgs2 env tys = foldR (importsTyConDataConType2 env) tys + +importsTyConDataConTypeTyCon env tc | (not (isDataTyCon tc) || + isInternalName (getName tc) || + ilxEnvModule env == nameModule (getName tc)) = importsNone +importsTyConDataConTypeTyCon env tc | otherwise = importsName env (getName tc) + + +type StaticCCallInfo = (CLabelString,CCallConv,[Type],Type) +type ImportsInfo = (UniqSet PackageName, UniqSet ModuleName, UniqSet TyCon, UniqFM StaticCCallInfo) + -- (Packages, Modules, Datatypes, Imported CCalls) + +emptyImpInfo :: ImportsInfo +emptyImpInfo = (emptyUniqSet, emptyUniqSet, emptyUniqSet, emptyUFM) +addPackageImpInfo p (w,x,y,z) = (addOneToUniqSet w p, x, y,z) +addModuleImpInfo m (w,x,y,z) = (w, addOneToUniqSet x m, y,z) +addTyConImpInfo tc (w,x,y,z) = (w, x, addOneToUniqSet y tc,z) +addCCallInfo info@(nm,a,b,c) (w,x,y,z) = (w, x, y,addToUFM z nm info) + +ilxImportTyCon :: IlxEnv -> TyCon -> SDoc +ilxImportTyCon env tycon | isDataTyCon tycon = ilxTyConDef True env tycon +ilxImportTyCon _ _ | otherwise = empty + +ilxImportPackage :: IlxEnv -> PackageName -> SDoc +ilxImportPackage _ p = text ".assembly extern" <+> singleQuotes (ppr p <> hscOptionQual) <+> text "{ }" + +ilxImportModule :: IlxEnv -> ModuleName -> SDoc +ilxImportModule _ m = text ".module extern" <+> singleQuotes (ppr m <> hscOptionQual <> text "o") + +-- Emit a P/Invoke declaration for the imported C function +-- TODO: emit the right DLL name +ilxImportCCall :: IlxEnv -> StaticCCallInfo -> SDoc +ilxImportCCall env (c,cc,args,ret) = + text ".method static assembly pinvokeimpl" <+> + parens (doubleQuotes (text "HSstd_cbits.dll") <+> text "cdecl") <+> retdoc <+> singleQuotes (pprCLabelString c) <+> + pprCValArgTys ilxTypeL env (map deepIlxRepType (filter (not. isVoidIlxRepType) args)) <+> + text "unmanaged preservesig { }" + where + retdoc = + if isVoidIlxRepType ret then text "void" + else ilxTypeR env (deepIlxRepType ret) + + +\end{code} + +%************************************************************************ +%* * +\subsection{Type declarations} +%* * +%************************************************************************ + +\begin{code} + + +ilxTyCon :: IlxEnv -> TyCon -> SDoc +ilxTyCon env tycon = ilxTyConDef False env tycon + +-- filter to get only dataTyCons? +ilxTyConDef importing env tycon = + vcat [empty $$ line, + text ".classunion" <+> (if importing then text "import" else empty) <+> tycon_ref <+> tyvars_text <+> super_text <+> alts_text] + where + tycon_ref = nameReference env (getName tycon) <> (ppr tycon) + super_text = if importing then empty else text "extends thunk" <> angleBrackets (text "class" <+> tycon_ref) + tyvars = tyConTyVars tycon + (ilx_tvs, _) = categorizeTyVars tyvars + alts_env = extendIlxEnvWithFormalTyVars env ilx_tvs + tyvars_text = pprTyVarBinders alts_env ilx_tvs + alts = vcat (map (pprIlxDataCon alts_env) (tyConDataCons tycon)) + alts_text = nest 2 (braces alts) + +pprIlxDataCon env dcon = + text ".alternative" <+> pprId dcon <+> + parens (pprSepWithCommas (ilxTypeL env) (map deepIlxRepType (filter (not. 
isVoidIlxRepType) (dataConRepArgTys dcon)))) +\end{code} + + +%************************************************************************ +%* * +\subsection{Getting the .closures and literals out} * +%************************************************************************ + +\begin{code} + +ilxBindClosures :: IlxEnv -> StgBinding -> SDoc +ilxBindClosures env (StgNonRec _ b rhs) = ilxRhsClosures env (b,rhs) +ilxBindClosures env (StgRec _ pairs) + = vcat (map (ilxRhsClosures new_env) pairs) + where + new_env = extendIlxEnvWithBinds env pairs + +--------------- +ilxRhsClosures _ (_, StgRhsCon _ _ _) + = empty + +ilxRhsClosures env (bndr, StgRhsClosure _ _ fvs upd args rhs) + = vcat [ilxExprClosures next_env rhs, + + empty $$ line, + kind_text <+> singleQuotes cloname <+> free_vs_text, + nest 2 (braces ( + nest 2 (vcat [empty, + vcat [text ".apply" <+> closure_sig_text, + body_text + ], + empty + ]) + )) + ] + where + kind_of_thing = case upd of + Updatable -> ASSERT( null args ) ".thunk" + otherwise -> ".closure" + kind_text = text kind_of_thing + + cloname = ilxEnvQualifyByModule env (ppr bndr) + next_env = ilxPlaceStgRhsClosure env bndr + (free_vs_text,env_with_fvs) = pprFreeBinders next_env fvs + + + closure_sig_text = + vcat [ text "()", + (case args of + [] -> empty + otherwise -> args_text), + text "-->" <+> rty_text] + + (args_text,env_with_args) = pprArgBinders env_with_fvs args + + -- Find the type returned, from the no. of args and the type of "bndr" + rty_text = + case retType env_with_fvs (idIlxRepType bndr) args of + Just (env,ty) -> + if isVoidIlxRepType ty then (text "void") + else ilxTypeR env ty + Nothing -> trace "WARNING! IlxGen.trace could not find return type - see generated ILX for context where this occurs." (text "// Could not find return type:" <+> ilxTypeR env_with_fvs (idIlxRepType bndr)<+> text ", non representation: " <+> ilxTypeR env_with_fvs (idType bndr)) + + -- strip off leading ForAll and Fun type constructions + -- up to the given number of arguments, extending the environment as + -- we go. + retType env ty [] = Just (env, ty) + retType env (ForAllTy tv ty) (arg:args) = retType (extendIlxEnvWithTyArgs env [tv]) ty args + retType env (FunTy l r) (arg:args) = retType env r args + retType _ _ _ = Nothing + + -- Code for the local variables + locals = ilxExprLocals env_with_args rhs + + env_with_locals = extendIlxEnvWithLocals env_with_args locals + + -- Code for the body of the main apply method + body_code = vcat [empty, + pprIlxLocals env_with_args locals, + ilxExpr (IlxEEnv env_with_locals (mkUniqSet (filter (not.isTyVar) args))) rhs Return, + empty + ] + + body_text = nest 2 (braces (text ".maxstack 100" <+> nest 2 body_code)) + + +pprIlxLocals env [] = empty +pprIlxLocals env vs + = text ".locals" <+> parens (pprSepWithCommas (pprIlxLocal env) (filter nonVoidLocal vs)) + where + nonVoidLocal (LocalId v,_) = not (isVoidIlxRepId v) + nonVoidLocal _ = True + +pprIlxLocal env (LocalId v,_) = ilxTypeL env (idIlxRepType v) <+> pprId v +pprIlxLocal env (LocalSDoc (ty,doc,pin),_) = ilxTypeL env (deepIlxRepType ty) <+> (if pin then text "pinned" else empty) <+> doc + + +pprFreeBinders env fvs + = (ilx_tvs_text <+> vs_text, env2) + where + (free_ilx_tvs, _,free_vs) = categorizeVars fvs + real_free_vs = filter (not . 
isVoidIlxRepId) free_vs + -- ignore the higher order type parameters for the moment + env1 = extendIlxEnvWithFreeTyVars env free_ilx_tvs + ilx_tvs_text = pprTyVarBinders env1 free_ilx_tvs + vs_text = parens (pprSepWithCommas ppr_id real_free_vs) + ppr_id v = ilxTypeL env1 (idIlxRepType v) <+> pprId v + env2 = extendIlxEnvWithFreeVars env1 real_free_vs + +pprIdBinder env v = parens (ilxTypeL env (idIlxRepType v) <+> pprId v) + + -- Declarations for the arguments of the main apply method +pprArgBinders env [] = (empty,env) +pprArgBinders env (arg:args) + = (arg_text <+> rest_text, res_env) + where + (arg_text,env') = pprArgBinder env arg + (rest_text,res_env) = pprArgBinders env' args + +-- We could probably omit some void argument binders, but +-- don't... +pprArgBinder env arg + | isVoidIlxRepId arg = (text "()", extendIlxEnvWithArgs env [arg]) + | otherwise + = if isTyVar arg then + let env' = extendIlxEnvWithTyArgs env [arg] in + (pprTyVarBinder env' arg, env') + else (pprIdBinder env arg,extendIlxEnvWithArgs env [arg]) + +-------------- +-- Compute local variables used by generated method. +-- The names of some generated locals are recorded as SDocs. + +data LocalSpec = LocalId Id | LocalSDoc (Type, SDoc, Bool) -- flag is for pinning + +ilxExprLocals :: IlxEnv -> StgExpr -> [(LocalSpec,Maybe (IlxEnv,StgRhs))] +ilxExprLocals env (StgLet bind body) = ilxBindLocals env bind ++ ilxExprLocals env body +ilxExprLocals env (StgLetNoEscape _ _ bind body) = ilxBindLocals env bind ++ ilxExprLocals env body -- TO DO???? +ilxExprLocals env (StgCase scrut _ _ bndr _ alts) + = ilxExprLocals (ilxPlaceStgCaseScrut env) scrut ++ + (if isDeadBinder bndr then [] else [(LocalId bndr,Nothing)]) ++ + ilxAltsLocals env alts +ilxExprLocals env (StgOpApp (StgFCallOp fcall _) args _) + = concat (ilxMapPlaceArgs 0 ilxCCallArgLocals env args) +ilxExprLocals _ _ = [] + +-- Generate locals to use for pinning arguments as we cross the boundary +-- to C. +ilxCCallArgLocals env (StgVarArg v) | pinCCallArg v = + [(LocalSDoc (idType v, ilxEnvQualifyByExact env (ppr v) <> text "pin", True), Nothing)] +ilxCCallArgLocals _ _ | otherwise = [] + +ilxBindLocals env (StgNonRec _ b rhs) = [(LocalId b,Just (env, rhs))] +ilxBindLocals env (StgRec _ pairs) = map (\(x,y) -> (LocalId x,Just (env, y))) pairs + +ilxAltsLocals env (StgAlgAlts _ alts deflt) = ilxDefltLocals env deflt ++ concat (ilxMapPlaceAlts ilxAlgAltLocals env alts) +ilxAltsLocals env (StgPrimAlts _ alts deflt) = ilxDefltLocals env deflt ++ concat (ilxMapPlaceAlts ilxPrimAltLocals env alts) + +ilxAlgAltLocals env (_, bndrs, _, rhs) = map (\x -> (LocalId x,Nothing)) (filter (\v -> isId v && not (isDeadBinder v)) bndrs) ++ ilxExprLocals env rhs +ilxPrimAltLocals env (_, rhs) = ilxExprLocals env rhs + +ilxDefltLocals _ StgNoDefault = [] +ilxDefltLocals env (StgBindDefault rhs) = ilxExprLocals (ilxPlaceStgBindDefault env) rhs + +-------------- +ilxExprClosures :: IlxEnv -> StgExpr -> SDoc +ilxExprClosures env (StgApp _ args) + = vcat (ilxMapPlaceArgs 0 (ilxArgClosures) env args) -- get strings +ilxExprClosures env (StgConApp _ args) + = vcat (ilxMapPlaceArgs 0 (ilxArgClosures) env args) -- get strings +ilxExprClosures env (StgOpApp _ args _) + = vcat (ilxMapPlaceArgs 0 (ilxArgClosures) env args) -- get strings +ilxExprClosures env (StgLet bind body) + = ilxBindClosures env bind $$ ilxExprClosures (extendIlxEnvWithBinds env (ilxPairs1 bind)) body +ilxExprClosures env (StgLetNoEscape _ _ bind body) -- TO DO???? 
+ = ilxBindClosures env bind $$ ilxExprClosures (extendIlxEnvWithBinds env (ilxPairs1 bind)) body +ilxExprClosures env (StgCase scrut _ _ _ _ alts) + = ilxExprClosures (ilxPlaceStgCaseScrut env) scrut $$ ilxAltsClosures env alts +ilxExprClosures env (StgLit lit) + = ilxGenLit env lit +ilxExprClosures _ _ + = empty + +ilxAltsClosures env (StgAlgAlts _ alts deflt) + = vcat [ilxExprClosures (ilxPlaceAlt env i) rhs | (i,(_, _, _, rhs)) <- [1..] `zip` alts] + $$ + ilxDefltClosures env deflt + +ilxAltsClosures env (StgPrimAlts _ alts deflt) + = vcat [ilxExprClosures (ilxPlaceAlt env i) rhs | (i,(_, rhs)) <- [1..] `zip` alts] + $$ + vcat [ ilxGenLit (ilxPlacePrimAltLit env i) lit | (i,(lit,_)) <- [1..] `zip` alts] + $$ + ilxDefltClosures env deflt + +ilxDefltClosures env (StgBindDefault rhs) = ilxExprClosures (ilxPlaceStgBindDefault env) rhs +ilxDefltClosures _ StgNoDefault = empty + +ilxArgClosures env (StgLitArg lit) = ilxGenLit env lit +ilxArgClosures _ _ = empty + + + +ilxGenLit env (MachStr fs) + = vcat [text ".field static assembly char " <+> singleQuotes nm <+> text "at" <+> nm <> text "L", + text ".data" <+> nm <> text "L" <+> text "= char *(" <> pprFSInILStyle fs <> text ")" + ] + where + nm = ilxEnvQualifyByExact env (text "string") + +ilxGenLit _ _ = empty + +\end{code} + + +%************************************************************************ +%* * +\subsection{Generating code} +%* * +%************************************************************************ + + +\begin{code} + +-- Environment when generating expressions +data IlxEEnv = IlxEEnv IlxEnv (UniqSet Id) + +data Sequel = Return | Jump IlxLabel + +ilxSequel Return = text "ret" +ilxSequel (Jump lbl) = text "br" <+> pprIlxLabel lbl + +isReturn Return = True +isReturn (Jump _) = False + + +ilxExpr :: IlxEEnv -> StgExpr + -> Sequel -- What to do at the end + -> SDoc + +ilxExpr (IlxEEnv env _) (StgApp fun args) sequel + = ilxFunApp env fun args (isReturn sequel) $$ ilxSequel sequel + +-- ilxExpr eenv (StgLit lit) sequel +ilxExpr (IlxEEnv env _) (StgLit lit) sequel + = pushLit env lit $$ ilxSequel sequel + +-- ilxExpr eenv (StgConApp data_con args) sequel +ilxExpr (IlxEEnv env _) (StgConApp data_con args) sequel + = text " /* ilxExpr:StgConApp */ " <+> ilxConApp env data_con args $$ ilxSequel sequel + +-- ilxExpr eenv (StgPrimApp primop args _) sequel +ilxExpr (IlxEEnv env _) (StgOpApp (StgFCallOp fcall _) args ret_ty) sequel + = ilxFCall env fcall args ret_ty $$ ilxSequel sequel + +ilxExpr (IlxEEnv env _) (StgOpApp (StgPrimOp primop) args ret_ty) sequel + = ilxPrimOpTable primop args env $$ ilxSequel sequel + +--BEGIN TEMPORARY +-- The following are versions of a peephole optimizations for "let t = \[] t2[fvs] in t" +-- I think would be subsumed by a general treatmenet of let-no-rec bindings?? +ilxExpr eenv@(IlxEEnv env _) (StgLet (StgNonRec _ bndr (StgRhsClosure _ _ _ _ [] rhs)) (StgApp fun [])) sequel + | (bndr == fun && null (ilxExprLocals env rhs)) -- TO DO??? + = ilxExpr eenv rhs sequel +ilxExpr eenv@(IlxEEnv env _) (StgLetNoEscape _ _ (StgNonRec _ bndr (StgRhsClosure _ _ _ _ [] rhs)) (StgApp fun [])) sequel + | (bndr == fun && null (ilxExprLocals env rhs)) -- TO DO??? + = ilxExpr eenv rhs sequel +--END TEMPORARY + +ilxExpr eenv (StgLet bind body) sequel + = ilxBind eenv bind $$ ilxExpr eenv body sequel + + +ilxExpr eenv (StgLetNoEscape _ _ bind body) sequel -- TO DO??? + = ilxBind eenv bind $$ ilxExpr eenv body sequel + +-- StgCase: Special case 1 to avoid spurious branch. 
+ilxExpr eenv@(IlxEEnv env live) (StgCase (StgApp fun args) live_in_case _live_in_alts bndr _ alts) sequel + = vcat [ilxWipe env (uniqSetToList (live `minusUniqSet` live_in_case)), + ilxFunApp (ilxPlaceStgCaseScrut env) fun args False, + --ilxWipe env (uniqSetToList (live_in_case `minusUniqSet` _live_in_alts)), + --ilxAlts (IlxEEnv env _live_in_alts) bndr alts sequel + ilxAlts (IlxEEnv env live_in_case) bndr alts sequel + ] + +-- StgCase: Special case 2 to avoid spurious branch. +ilxExpr eenv@(IlxEEnv env live) (StgCase (StgOpApp (StgPrimOp primop) args ret_ty) live_in_case _live_in_alts bndr _ alts) sequel + = vcat [ilxWipe env (uniqSetToList (live `minusUniqSet` live_in_case)), + ilxPrimOpTable primop args (ilxPlaceStgCaseScrut env), + --ilxWipe env (uniqSetToList (live_in_case `minusUniqSet` _live_in_alts)), + --ilxAlts (IlxEEnv env _live_in_alts) bndr alts sequel + ilxAlts (IlxEEnv env live_in_case) bndr alts sequel + ] + +-- StgCase: Normal case. +ilxExpr eenv@(IlxEEnv env live) (StgCase scrut live_in_case _live_in_alts bndr _ alts) sequel + = vcat [ilxWipe env (uniqSetToList (live `minusUniqSet` live_in_case)), + ilxExpr (IlxEEnv (ilxPlaceStgCaseScrut env) live_in_case) scrut (Jump join_lbl), + ilxLabel join_lbl, + --ilxWipe env (uniqSetToList (live_in_case `minusUniqSet` _live_in_alts)), + --ilxAlts (IlxEEnv env _live_in_alts) bndr alts sequel + ilxAlts (IlxEEnv env live_in_case) bndr alts sequel + ] + where + join_lbl = mkJoinLabel bndr + +ilxExpr _ _ _ + = panic "ilxExpr: Patterns not matched:(IlxEEnv _ _) (StgSCC _ _) _ (IlxEEnv _ _) (StgLam _ _ _) _" + + +-- Wipe out locals and arguments that are no longer in use, to +-- prevent space leaks. If the VM is implemented 100% correctly then +-- this should probably not be needed, as the live variable analysis +-- in the JIT would tell the GC that these locals and arguments are +-- no longer live. However I'm putting it in here so we can +-- check out if it helps. +-- +-- Also, in any case this doesn't capture everything we need. e.g. +-- when making a call: +-- case f x of ... +-- where x is not used in the alternatives, then the variable x +-- is no longer live from the point it is transferred to the call +-- onwards. We should expunge "live_in_case - live_in_alts" right +-- before making the call, not after returning from the call.... +-- +-- Strictly speaking we also don't need to do this for primitive +-- values such as integers and addresses, i.e. things not +-- mapped down to GC'able objects. 
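+-- To make the bookkeeping above concrete: the set wiped on entry to a case
+-- is simply (live before the case) minus (live within the case). A minimal,
+-- Prelude-only sketch of that arithmetic (the name below is invented purely
+-- for illustration and is not used elsewhere):
+illustrateWipeSet :: Eq a => [a] -> [a] -> [a]
+illustrateWipeSet live_before live_in_case
+  = filter (`notElem` live_in_case) live_before
+-- e.g. illustrateWipeSet ["f","x","y"] ["f"] == ["x","y"]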
+ilxWipe env ids + = vcat (map (ilxWipeOne env) (filter (not.isVoidIlxRepId) ids)) + +ilxWipeOne env id + = case lookupIlxVarEnv env id of + Just Local -> text "ldloca " <+> pprId id <+> text "initobj.any" <+> (ilxTypeL env (idIlxRepType id)) + Just Arg -> text "deadarg " <+> pprId id <+> text "," <+> (ilxTypeL env (idIlxRepType id)) + Just (CloVar _) -> ilxComment (text "not yet wiping closure variable" <+> pprId id ) + _ -> ilxComment (text "cannot wipe non-local/non-argument" <+> pprId id ) + where + + +---------------------- + +ilxAlts :: IlxEEnv -> Id -> StgCaseAlts -> Sequel -> SDoc +ilxAlts (IlxEEnv env live) bndr alts sequel + -- At the join label, the result is on top + -- of the stack + = vcat [store_in_bndr, + do_case_analysis alts + ] + where + scrut_rep_ty = deepIlxRepType (idType bndr) + + store_in_bndr | isDeadBinder bndr = empty + | isVoidIlxRepId bndr + = ilxComment (text "ignoring store of zero-rep value to be analyzed") + | otherwise = text "dup" $$ (text "stloc" <+> pprId bndr) + + do_case_analysis (StgAlgAlts _ [] deflt) + = do_deflt deflt + + do_case_analysis (StgAlgAlts _ args deflt) + = do_alg_alts ([1..] `zip` args) deflt + + do_case_analysis (StgPrimAlts _ alts deflt) + = do_prim_alts ([1..] `zip` alts) $$ do_deflt deflt + + do_alg_alts [(i, alt@(data_con,bndrs,used_flags, rhs))] StgNoDefault | isUnboxedTupleCon data_con + -- Collapse the analysis of unboxed tuples where + -- some or all elements are zero-sized + -- + -- TO DO: add bndrs to set of live variables + = case bndrs' of + [h] -> bind_collapse bndrs used_flags <+> do_rhs_no_pop alt_env rhs + _ -> bind_components alt_env dcon' bndrs 0 used_flags <+> do_rhs alt_env rhs + where + bndrs' = filter (not. isVoidIlxRepId) bndrs + -- Replacement unboxed tuple type constructor, used if any of the + -- arguments have zero-size and more than one remains. 
+ dcon' = tupleCon Unboxed (length bndrs') + + alt_env = IlxEEnv (ilxPlaceAlt env i) live + --alt_env = IlxEEnv (ilxPlaceAlt env i) + + bind_collapse [] _ = panic "bind_collapse: unary element not found" + bind_collapse (h:t) (is_used:used_flags) + | isVoidIlxRepId h = ilxComment (text "zero-rep binding eliminated") <+> (bind_collapse t used_flags) + | not is_used = ilxComment (text "not used") <+> text "pop" + | otherwise = text "stloc" <+> pprId h + + + do_alg_alts [(i, alt@(data_con,bndrs,used_flags, rhs))] StgNoDefault + = vcat [text "castdata" <+> sep [ilxTypeR env scrut_rep_ty <> comma, + ilxConRef env data_con], + do_alg_alt (IlxEEnv (ilxPlaceAlt env i) live) alt + ] + + do_alg_alts alts deflt + = vcat [text "datacase" <+> sep [ilxTypeR env scrut_rep_ty,text ",", + pprSepWithCommas pp_case labels_w_alts], + do_deflt deflt, + vcat (map do_labelled_alg_alt labels_w_alts) + ] + where + pp_case (i, (lbl, (data_con, _, _, _))) = parens (ilxConRef env data_con <> comma <> pprIlxLabel lbl) + labels_w_alts = [(i,(mkAltLabel bndr i, alt)) | (i, alt) <- alts] + + do_prim_alts [] = empty + do_prim_alts ((i, (lit,alt)) : alts) + = vcat [text "dup", pushLit (ilxPlacePrimAltLit env i) lit, text "bne.un" <+> pprIlxLabel lbl, + do_rhs (IlxEEnv (ilxPlaceAlt env i) live) alt, + ilxLabel lbl, do_prim_alts alts] + where + lbl = mkAltLabel bndr i + + do_labelled_alg_alt (i,(lbl, alt)) + = ilxLabel lbl $$ do_alg_alt (IlxEEnv (ilxPlaceAlt env i) live) alt + + do_alg_alt alt_eenv (data_con, bndrs, used_flags, rhs) + = vcat [bind_components alt_eenv data_con bndrs 0 used_flags, + do_rhs alt_eenv rhs + ] + + bind_components alt_eenv data_con [] n _ = empty + bind_components alt_eenv data_con (h:t) n (is_used:used_flags) + | isVoidIlxRepId h + -- don't increase the count in this case + = ilxComment (text "zero-rep binding eliminated") + <+> bind_components alt_eenv data_con t n used_flags + | otherwise + = bind_component alt_eenv data_con h is_used n + <+> bind_components alt_eenv data_con t (n + 1) used_flags + + bind_component alt_eenv@(IlxEEnv alt_env _) data_con bndr is_used reduced_fld_no + | not is_used + = ilxComment (text "not used") + | isVoidIlxRepId bndr + = ilxComment (text "ignoring bind of zero-rep variable") + | otherwise = vcat [text "dup", + ld_data alt_env data_con reduced_fld_no bndr, + text "stloc" <+> pprId bndr] + + do_deflt (StgBindDefault rhs) = do_rhs (IlxEEnv (ilxPlaceStgBindDefault env) live) rhs + do_deflt StgNoDefault = empty + + do_rhs alt_eenv rhs + | isVoidIlxRepId bndr = do_rhs_no_pop alt_eenv rhs -- void on the stack, nothing to pop + | otherwise = text "pop" $$ do_rhs_no_pop alt_eenv rhs -- drop the value + + do_rhs_no_pop alt_env rhs = ilxExpr alt_env rhs sequel + + ld_data alt_env data_con reduced_fld_no bndr + | isUnboxedTupleCon data_con + = text "ldfld" <+> sep [text "!" <> integer reduced_fld_no, + ilxTypeR alt_env scrut_rep_ty <> text "::fld" <> integer reduced_fld_no] + | otherwise + = text "lddata" <+> sep [ilxTypeR alt_env scrut_rep_ty <> comma, + ilxConRef env data_con <> comma, + integer reduced_fld_no] + + +------------------------- + +ilxBestTermArity = 3 +ilxBestTypeArity = 7 + + +-- Constants of unlifted types are represented as +-- applications to no arguments. +ilxFunApp env fun [] _ | isUnLiftedType (idType fun) + = pushId env fun + +ilxFunApp env fun args tail_call + = -- For example: + -- ldloc f function of type forall a. 
a->a + -- ldloc x arg of type Int + -- .tail callfunc <Int32> (!0) --> !0 + -- + vcat [pushId env fun,ilxFunAppAfterPush env fun args tail_call] + +ilxFunAppAfterPush env fun args tail_call + = -- For example: + -- ldloc f function of type forall a. a->a + -- ldloc x arg of type Int + -- .tail callfunc <Int32> (!0) --> !0 + -- + vcat [ilxFunAppArgs env 0 (idIlxRepType fun) args tail_call known_clo] + where + known_clo :: KnownClosure + known_clo = + case lookupIlxBindEnv env fun of + Just (_, StgRhsClosure _ _ _ Updatable _ _) -> Nothing + Just (place, StgRhsClosure _ _ fvs _ args _) -> Just (place,fun,args,fvs) + _ -> Nothing -- trace (show fun ++ " --> " ++ show (idArity fun)) + +type KnownClosure = Maybe ( IlxEnv -- Of the binding site of the function + , Id -- The function + , [Var] -- Binders + , [Var]) -- Free vars of the closure + +-- Push as many arguments as ILX allows us to in one go, and call the function +-- Recurse until we're done. +-- The function is already on the stack +ilxFunAppArgs :: IlxEnv + -> Int -- Number of args already pushed (zero is a special case; + -- otherwise used only for place generation) + -> Type -- Type of the function + -> [StgArg] -- The arguments + -> Bool -- True <=> tail call please + -> KnownClosure -- Information about the function we're calling + -> SDoc + +ilxFunAppArgs env num_sofar funty args tail_call known_clo + = vcat [vcat (ilxMapPlaceArgs num_sofar pushArgWithVoids env now_args), + call_instr <+> (if num_sofar == 0 then text "() /* first step in every Haskell app. is to a thunk */ " else empty) + <+> now_args_text + <+> text "-->" + <+> later_ty_text, + later + ] + where + now_args_text = + case now_arg_tys of + [] -> empty + _ -> hsep (map (pprIlxArgInfo env_after_now_tyvs) now_arg_tys) + + later_ty_text + | isVoidIlxRepType later_ty = text "void" + | otherwise = ilxTypeR env_after_now_tyvs later_ty + + (now_args,now_arg_tys,env_after_now_tyvs,later_args,later_ty) = + case args of + (StgTypeArg v:rest) -> get_type_args ilxBestTypeArity args env funty + _ -> get_term_args 0 ilxBestTermArity args env funty + + -- Only apply up to maxArity real (non-type) arguments + -- at a time. ILX should, in principle, allow us to apply + -- arbitrary numbers, but you will get more succinct + -- (and perhaps more efficient) IL code + -- if you apply in clumps according to its maxArity setting. + -- This is because it has to unwind the stack and store it away + -- in local variables to do the partial applications. + -- + -- Similarly, ILX only allows one type application at a time, at + -- least until we implement unwinding the stack for this case. + -- + -- NB: In the future we may have to be more careful + -- all the way through + -- this file to bind type variables as we move through + -- type abstractions and "forall" types. This would apply + -- especially if the type variables were ever bound by expressions + -- involving the type variables. 
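+         -- To illustrate the clumping described above (comment-only sketch,
+         -- hypothetical helper name): with ilxBestTermArity = 3, a call with
+         -- seven term arguments is typically emitted as three successive
+         -- calls taking the arguments in groups (a1,a2,a3), (a4,a5,a6), (a7),
+         -- as if the argument list had been split by
+         --     clumps n xs = if null xs then [] else take n xs : clumps n (drop n xs)
+         -- so that clumps 3 [1..7] == [[1,2,3],[4,5,6],[7]].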
+ + -- This part strips off at most "max" term applications or one type application + get_type_args 0 args env funty = ([],[],env,args,funty) + get_type_args max args env (NoteTy _ ty) = + trace "IlxGen Internal Error: non representation type passed to get_args" (get_type_args max args env ty) + get_type_args max ((arg@(StgTypeArg v)):rest) env (ForAllTy tv rem_funty) + = if isIlxTyVar tv then + let env2 = extendIlxEnvWithFormalTyVars env [tv] in + let rest_ty = deepIlxRepType (substTyWith [tv] [v] rem_funty) in + let (now,now_tys,env3,later,later_ty) = get_type_args (max - 1) rest env rest_ty in + let arg_ty = mkTyVarTy tv in + (arg:now,(arg,arg_ty):now_tys,env2, later, later_ty) + else + get_type_args max rest env rem_funty -- ? subst?? + get_type_args _ (StgTypeArg _:_) _ _ = trace "IlxGen Internal Error: get_type_args could not get ForAllTy for corresponding arg" ([],[],env,[],funty) + get_type_args _ args env funty = ([],[],env,args,funty) + + get_term_args n max args env (NoteTy _ ty) + -- Skip NoteTy types + = trace "IlxGen Internal Error: non representation type passed to get_term_args" (get_term_args n max args env ty) + get_term_args n 0 args env funty + -- Stop if we've hit the maximum number of ILX arguments to apply n one hit. + = ([],[],env,args,funty) + get_term_args n max args env funty + | (case known_clo of + Just (_,_,needed,_) -> needed `lengthIs` n + Nothing -> False) + -- Stop if we have the optimal number for a direct call + = ([],[],env,args,funty) + get_term_args _ _ (args@(StgTypeArg _:_)) env funty + -- Stop if we hit a type arg. + = ([],[],env,args,funty) + get_term_args n max (h:t) env (FunTy dom ran) + -- Take an argument. + = let (now,now_tys,env2,later,later_ty) = get_term_args (n+1) (max - 1) t env ran in + (h:now, (h,dom):now_tys,env2,later,later_ty) + get_term_args _ max (h:t) env funty = trace "IlxGen Internal Error: get_term_args could not get FunTy or ForAllTy for corresponding arg" ([],[],env,[],funty) + get_term_args _ max args env funty = ([],[],env,args,funty) + + -- Are there any remaining arguments? + done = case later_args of + [] -> True + _ -> False + + -- If so, generate the subsequent calls. + later = if done then text "// done" + else ilxFunAppArgs env (num_sofar + length now_args) later_ty later_args tail_call Nothing + + -- Work out whether to issue a direct call a known closure (callclo) or + -- an indirect call (callfunc). Basically, see if the identifier has + -- been let-bound, and then check we are applying exactly the right + -- number of arguments. Also check that it's not a thunk (actually, this + -- is done up above). + -- + -- The nasty "all" check makes sure that + -- the set of type variables in scope at the callsite is a superset + -- of the set of type variables needed for the direct call. This is + -- is needed because not all of the type variables captured by a + -- let-bound binding will get propogated down to the callsite, and + -- the ILX system of polymorphism demands that the free type variables + -- get reapplied when we issue the direct "callclo". The + -- type variables are in reality also "bound up" in the closure that is + -- passed as the first argument, so when we do an indirect call + -- to that closure we're fine, which is why we don't need them in + -- the "callfunc" case. 
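+         -- In short (descriptive note): the guard below only selects the
+         -- direct callclo path when we are applying exactly the closure's own
+         -- argument list (equalLength needed now_args) and every ILX type
+         -- variable captured by the closure is still in scope at this call
+         -- site (all ... elemIlxTyEnv ... env); in every other situation the
+         -- generic callfunc instruction is used.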
+ basic_call_instr = + case known_clo of + Just (known_env,fun,needed,fvs) | (equalLength needed now_args) && + all (\x -> elemIlxTyEnv x env) free_ilx_tvs -> + vcat [text "callclo class", + nameReference env (idName fun) <+> singleQuotes (ilxEnvQualifyByModule env (ppr fun)), + pprTypeArgs ilxTypeR env (map mkTyVarTy free_ilx_tvs)] + <> text "," + where + (free_ilx_tvs, free_non_ilx_tvs,free_vs) = categorizeVars fvs + otherwise -> text "callfunc" + call_instr = + if (tail_call && done) then text "tail." <+> basic_call_instr + else basic_call_instr + + +-------------------------- +-- Print the arg info at the call site +-- For type args we are, at the moment, required to +-- give both the actual and the formal (bound). The formal +-- bound is always System.Object at the moment (bounds are +-- not properly implemented in ILXASM in any case, and nor do +-- we plan on making use og them) For +-- non-type args the actuals are on the stack, and we just give the +-- formal type. +pprIlxArgInfo env (StgTypeArg arg,ty) = + angleBrackets (ilxTypeR env (deepIlxRepType arg) <+> ilxComment (text "actual for tyvar")) <+> text "<class [mscorlib] System.Object>" +pprIlxArgInfo env (_,ty) = + parens (ilxTypeL env ty) + + +---------------------------- +-- Code for a binding +ilxBind :: IlxEEnv -> StgBinding -> SDoc +ilxBind eenv@(IlxEEnv env _) bind = + vcat [vcat (map (ilxRhs env rec) pairs), + vcat (map (ilxFixupRec env rec) pairs)] + where + rec = ilxRecIds1 bind + pairs = ilxPairs1 bind + + +---------------------------- +-- Allocate a closure or constructor. Fix up recursive definitions. +ilxRhs :: IlxEnv -> [Id] -> (Id, StgRhs) -> SDoc + +ilxRhs env rec (bndr, _) | isVoidIlxRepId bndr + = empty + +ilxRhs env rec (bndr, StgRhsCon _ con args) + = vcat [text " /* ilxRhs:StgRhsCon */ " <+> ilxConApp env con args, + text "stloc" <+> pprId bndr + ] + +ilxRhs env rec (bndr, StgRhsClosure _ _ fvs upd args rhs) + = -- Assume .closure v<any A>(int64,!A) { + -- .apply <any B> (int32) (B) { ... } + -- } + -- Then + -- let v = \B (x:int32) (y:B). ... 
+ -- becomes: + -- newclo v<int32>(int64,!0) + -- stloc v + vcat [vcat (map pushFv free_vs), + (if null free_non_ilx_tvs then empty else (ilxComment (text "ignored some higher order type arguments in application - code will be non-verifiable"))), + text "newclo" <+> clotext, + text "stloc" <+> pprId bndr + ] + where + pushFv id = if elem id rec then text "ldnull" else pushId env id + (free_ilx_tvs, free_non_ilx_tvs,free_vs) = categorizeVars fvs + clotext = pprIlxNamedTyConApp env (ilxEnvQualifyByModule env (ppr bndr)) (map mkTyVarTy free_ilx_tvs) + +ilxFixupRec env rec (bndr, _) | isVoidIlxRepId bndr = ilxComment (text "no recursive fixup for void-rep-id") + +ilxFixupRec env rec (bndr, StgRhsCon _ con args) + = text "// no recursive fixup" + +ilxFixupRec env rec (bndr, StgRhsClosure _ _ fvs upd args rhs) + = vcat [vcat (map fixFv rec)] + where + fixFv recid = if elem recid fvs then + vcat [pushId env bndr, + pushId env recid, + text "stclofld" <+> clotext <> text "," <+> pprId recid] + else text "//no fixup needed for" <+> pprId recid + (free_ilx_tvs, free_non_ilx_tvs,free_vs) = categorizeVars fvs + clotext = pprIlxNamedTyConApp env (ilxEnvQualifyByModule env (ppr bndr)) (map mkTyVarTy free_ilx_tvs) + + + +--------------------------------------------- +-- Code for a top-level binding in a module +ilxPairs binds = concat (map ilxPairs1 binds) + +ilxPairs1 (StgNonRec _ bndr rhs) = [(bndr,rhs)] +ilxPairs1 (StgRec _ pairs) = pairs + +ilxRecIds1 (StgNonRec _ bndr rhs) = [] +ilxRecIds1 (StgRec _ pairs) = map fst pairs + +--------------------------------------------- +-- Code for a top-level binding in a module +-- TODO: fix up recursions amongst CAF's +-- e.g. +-- x = S x +-- for infinity... +-- +-- For the moment I've put in a completely spurious "reverse"... +-- +-- Consider: make fixing up of CAF's part of ILX? i.e. +-- put static, constant, allocated datastructures into ILX. 
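+-- Concretely (illustrative example only): stableSortBy/usedBy below perform
+-- an insertion sort of the (Id, StgRhs) pairs, and usedBy orders a binding
+-- before any constructor binding that mentions it. So for hypothetical
+-- top-level bindings
+--     p = C q
+--     q = C r
+-- 'q' is placed before 'p' in the generated .cctor, so its static field is
+-- initialised before it is read while allocating 'p'.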
+ +stableSortBy :: (a -> a -> Ordering) -> [a] -> [a] +stableSortBy f (h:t) = insertBy f h (stableSortBy f t) +stableSortBy f [] = [] + +usedBy :: (Id,StgRhs) -> (Id,StgRhs) -> Ordering +usedBy (m,_) (_,StgRhsCon _ data_con args) | any (isArg m) args = LT +usedBy (m,_) (n,_) | m == n = EQ +usedBy (m,_) (_,_) = GT + +isArg m (StgVarArg n) = (n == m) +isArg m _ = False + + +ilxTopBind :: Module -> IlxEnv -> [(Id,StgRhs)] -> SDoc +--ilxTopBind mod env (StgNonRec _ bndr rhs) = +--ilxTopRhs env (bndr,rhs) +ilxTopBind mod env pairs = + vcat [text ".class" <+> pprId mod, + nest 2 (braces (nest 2 (vcat [empty,cctor, flds, empty])))] + where + cctor = vcat [text ".method static rtspecialname specialname void .cctor()", + nest 2 (braces + (nest 2 (vcat [text ".maxstack 100", + text "ldstr \"LOG: initializing module" <+> pprId mod <+> text "\" call void ['mscorlib']System.Console::WriteLine(class [mscorlib]System.String)", + vcat (map (ilxTopRhs mod env) (stableSortBy usedBy pairs)), + text "ldstr \"LOG: initialized module" <+> pprId mod <+> text "\" call void ['mscorlib']System.Console::WriteLine(class [mscorlib]System.String)", + text "ret", + empty])))] + flds = vcat (map (ilxTopRhsStorage mod env) pairs) + +--ilxTopRhs mod env (bndr, _) | isVoidIlxRepId bndr +-- = empty + +ilxTopRhs mod env (bndr, StgRhsClosure _ _ fvs upd args rhs) + = vcat [vcat (map (pushId env) free_vs), + (if null free_non_ilx_tvs then empty else (ilxComment (text "ignored some higher order type arguments in application - code will be non verifiable...."))), + text "newclo" <+> pprIlxNamedTyConApp env (ilxEnvQualifyByModule env (ppr bndr)) (map mkTyVarTy free_ilx_tvs), + text "stsfld" <+> pprFieldRef env (mod,bndTy,bndr) + ] + where + (free_ilx_tvs, free_non_ilx_tvs,free_vs) = categorizeVars fvs + bndTy = idIlxRepType bndr + +ilxTopRhs mod env (bndr, StgRhsCon _ data_con args) + = vcat [ text " /* ilxTopRhs: StgRhsCon */ " <+> ilxConApp env data_con args, + text "stsfld" <+> pprFieldRef env (mod,bndTy,bndr) + ] + where + bndTy = idIlxRepType bndr + +pprFieldRef env (mod,ty,id) + = ilxTypeL env ty <+> moduleReference env mod <+> pprId mod <> text "::" <> pprId id + +ilxTopRhsStorage mod env (bndr, StgRhsClosure _ _ _ _ _ _) + = text ".field public static " <+> ilxTypeL env bndTy <+> pprId bndr + where + bndTy = idIlxRepType bndr +ilxTopRhsStorage mod env (bndr, StgRhsCon _ _ _) + = text ".field public static " <+> ilxTypeL env bndTy <+> pprId bndr + where + bndTy = idIlxRepType bndr + +-------------------------------------- +-- Push an argument +pushArgWithVoids = pushArg_aux True +pushArg = pushArg_aux False + +pushArg_aux voids env (StgTypeArg ty) = empty +pushArg_aux voids env (StgVarArg var) = pushId_aux voids env var +pushArg_aux voids env (StgLitArg lit) = pushLit env lit + + +mapi f l = mapi_aux f l 0 + +mapi_aux f [] n = [] +mapi_aux f (h:t) n = f n h : mapi_aux f t (n+1) + +-------------------------------------- +-- Push an Id +pushId = pushId_aux False + +pushId_aux :: Bool -> IlxEnv -> Id -> SDoc +pushId_aux voids _ id | isVoidIlxRepId id = + /* if voids then text "ldunit" else */ ilxComment (text "pushId: void rep skipped") +pushId_aux _ env var + = case lookupIlxVarEnv env var of + Just Arg -> text "ldarg" <+> pprId var + Just (CloVar n) -> text "ldenv" <+> int n + Just Local -> text "ldloc" <+> pprId var + Just (Top m) -> + vcat [ilxComment (text "pushId (Top) " <+> pprId m), + text "ldsfld" <+> ilxTypeL env (idIlxRepType var) + <+> moduleReference env m <+> pprId (moduleName m) <> text "::" <> pprId var] + + 
Nothing -> + vcat [ilxComment (text "pushId (import) " <+> pprIlxTopVar env var), + text "ldsfld" <+> ilxTypeL env (idIlxRepType var) + <+> pprIlxTopVar env var] + +-------------------------------------- +-- Push a literal +pushLit env (MachChar c) = text "ldc.i4" <+> int c +pushLit env (MachStr s) = text "ldsflda char " <+> ilxEnvQualifyByExact env (text "string") -- pprFSInILStyle s +pushLit env (MachInt i) = text "ldc.i4" <+> integer i +pushLit env (MachInt64 i) = text "ldc.i8" <+> integer i +pushLit env (MachWord w) = text "ldc.i4" <+> integer w <+> text "conv.u4" +pushLit env (MachWord64 w) = text "ldc.i8" <+> integer w <+> text "conv.u8" +pushLit env (MachFloat f) = text "ldc.r4" <+> rational f +pushLit env (MachDouble f) = text "ldc.r8" <+> rational f +pushLit env (MachNullAddr) = text "ldc.i4 0" +pushLit env (MachLabel l _) = trace "WARNING: Cannot compile MachLabel to ILX in IlxGen.lhs" (text "// MachLabel!!! Not valid in ILX!!") + +pprIlxTopVar env v + | isExternalName n = (nameReference env n) <> pprId (nameModule n) <> text "::" <> singleQuotes (ppr (nameModule n) <> text "_" <> ppr (nameOccName n)) + | otherwise = pprId (nameOccName n) + where + n = idName v + +\end{code} + + +%************************************************************************ +%* * +\subsection{Printing types} +%* * +%************************************************************************ + + +\begin{code} + +isVoidIlxRepType (NoteTy _ ty) = isVoidIlxRepType ty +isVoidIlxRepType (TyConApp tc _) | (tyConPrimRep tc == VoidRep) = True +isVoidIlxRepType (TyConApp tc tys) + = isUnboxedTupleTyCon tc && null (filter (not. isVoidIlxRepType) tys) +isVoidIlxRepType _ = False + +isVoidIlxRepId id = isVoidIlxRepType (idType id) + + + +-- Get rid of all NoteTy and NewTy artifacts +deepIlxRepType :: Type -> Type +deepIlxRepType (FunTy l r) + = FunTy (deepIlxRepType l) (deepIlxRepType r) + +deepIlxRepType ty@(TyConApp tc tys) + = -- collapse UnboxedTupleTyCon down when it contains VoidRep types. + -- e.g. (# State#, Int#, Int# #) ===> (# Int#, Int# #) + if isUnboxedTupleTyCon tc then + let tys' = map deepIlxRepType (filter (not. isVoidIlxRepType) tys) in + case tys' of + [h] -> h + _ -> mkTupleTy Unboxed (length tys') tys' + else + TyConApp tc (map deepIlxRepType tys) +deepIlxRepType (AppTy f x) = AppTy (deepIlxRepType f) (deepIlxRepType x) +deepIlxRepType (ForAllTy b ty) = ForAllTy b (deepIlxRepType ty) +deepIlxRepType (NoteTy _ ty) = deepIlxRepType ty +deepIlxRepType (PredTy p) = deepIlxRepType (predTypeRep p) +deepIlxRepType ty@(TyVarTy tv) = ty + +idIlxRepType id = deepIlxRepType (idType id) + +-------------------------- +-- Some primitive type constructors are not thunkable. +-- Everything else needs to be marked thunkable. +ilxTypeL :: IlxEnv -> Type -> SDoc + +ilxTypeL env ty | isUnLiftedType ty || isVoidIlxRepType ty = ilxTypeR env ty +ilxTypeL env ty = text "thunk" <> angleBrackets (ilxTypeR env ty) + + +-------------------------- +-- Print non-thunkable version of type. +-- + +ilxTypeR :: IlxEnv -> Type -> SDoc +ilxTypeR env ty | isVoidIlxRepType ty = text "/* unit skipped */" +ilxTypeR env ty@(AppTy f _) | isTyVarTy f = ilxComment (text "type app:" <+> pprType ty) <+> (text "class [mscorlib]System.Object") +ilxTypeR env ty@(AppTy f x) = trace "ilxTypeR: should I be beta reducing types?!" 
(ilxComment (text "ilxTypeR: should I be beta reducing types?!") <+> ilxTypeR env (applyTy f x)) +ilxTypeR env (TyVarTy tv) = ilxTyVar env tv + +-- The following is a special rule for types constructed out of +-- higher kinds, e.g. Monad f or Functor f. +-- +-- The code below is not as general as it should be, but as I +-- have no idea if this approach will even work, I'm going to +-- just try it out on some simple cases arising from the prelude. +ilxTypeR env ty@(TyConApp tc (h:t)) | isAlgTyCon tc && null (tyConTyVars tc) + = ilxComment (text "what on earth? 2") <+> (ilxTypeR env (TyConApp tc t)) +ilxTypeR env ty@(TyConApp tc (h:t)) | isAlgTyCon tc && not (isIlxTyVar (hd (tyConTyVars tc))) + = ilxTypeR env (TyConApp tc t) +ilxTypeR env (TyConApp tc args) = ilxTyConApp env tc args + + -- nb. the only legitimate place for VoidIlxRepTypes to occur in normalized IlxRepTypes + -- is on the left of an arrow + -- We could probably eliminate all but a final occurrence of these. +ilxTypeR env (FunTy arg res)| isVoidIlxRepType res + = pprIlxFunTy (ilxTypeL env arg) (text "void") +ilxTypeR env (FunTy arg res) + = pprIlxFunTy (ilxTypeL env arg) (ilxTypeR env res) + +ilxTypeR env ty@(ForAllTy tv body_ty) | isIlxTyVar tv + = parens (text "forall" <+> pprTyVarBinders env' [tv] <+> nest 2 (ilxTypeR env' body_ty)) + where + env' = extendIlxEnvWithFormalTyVars env [tv] + +ilxTypeR env ty@(ForAllTy tv body_ty) | otherwise + = ilxComment (text "higher order type var " <+> pprId tv) <+> + pprIlxFunTy (text "class [mscorlib]System.Object") (ilxTypeR env body_ty) + +ilxTypeR env (NoteTy _ ty) + = trace "WARNING! non-representation type given to ilxTypeR: see generated ILX for context where this occurs" + (vcat [text "/* WARNING! non-representation type given to ilxTypeR! */", + ilxTypeR env ty ]) + +pprIlxFunTy dom ran = parens (hsep [text "func",parens dom,text "-->", ran]) + +ilxTyConApp env tcon args = + case lookupUFM tyPrimConTable (getUnique tcon) of + Just f -> f args env + Nothing -> + (if isUnboxedTupleTyCon tcon then pprIlxUnboxedTupleTyConApp else pprIlxBoxedTyConApp) + env tcon args + +pprIlxTyCon env tcon = nameReference env (getName tcon) <> ppr tcon +pprIlxUnboxedTupleTyConApp env tcon args + = text "/* unboxed */ value class" <+> pprIlxTyCon env tcon' <> pprTypeArgs ilxTypeL env non_void + where + non_void = filter (not . isVoidIlxRepType) args + tcon' = dataConTyCon (tupleCon Unboxed (length non_void)) +pprIlxBoxedTyConApp env tcon args + = pprIlxNamedTyConApp env (pprIlxTyCon env tcon) args +pprIlxNamedTyConApp env tcon_text args + = text "class" <+> tcon_text <> pprTypeArgs ilxTypeR env args + +-- Returns e.g: <Int32, Bool> +-- Void-sized type arguments are _always_ eliminated, everywhere. +-- If the type constructor is an unboxed tuple type then it should already have +-- been adjusted to be the correct constructor. +pprTypeArgs f env tys = pprTypeArgs_aux f env (filter (not . isVoidIlxRepType) tys) + +pprTypeArgs_aux f env [] = empty +pprTypeArgs_aux f env tys = angleBrackets (pprSepWithCommas (f env) tys) + + +pprTyVarBinders :: IlxEnv -> [TyVar] -> SDoc +-- Returns e.g: <class [mscorlib]System.Object> <class [mscorlib]System.Object> +-- plus a new environment with the type variables added. 
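+-- (Note on the definition below: it renders the binders as a single
+-- angle-bracketed, comma-separated list and returns only the SDoc; the
+-- accompanying environment extension is done separately by the callers,
+-- e.g. via extendIlxEnvWithFormalTyVars.)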
+pprTyVarBinders env [] = empty +pprTyVarBinders env tvs = angleBrackets (pprSepWithCommas (pprTyVarBinder_aux env) tvs) + +pprTyVarBinder :: IlxEnv -> TyVar -> SDoc +pprTyVarBinder env tv = + if isIlxTyVar tv then + angleBrackets (pprTyVarBinder_aux env tv) + else + ilxComment (text "higher order tyvar" <+> pprId tv <+> + text ":" <+> ilxTypeR env (tyVarKind tv)) <+> + ilxComment (text "omitted") + -- parens (text "class [mscorlib]System.Object" <+> pprId tv) + + +pprTyVarBinder_aux env tv = + ilxComment (text "tyvar" <+> pprId tv <+> text ":" <+> + ilxTypeR env (tyVarKind tv)) <+> + (text "class [mscorlib]System.Object") + +-- Only a subset of Haskell types can be generalized using the type quantification +-- of ILX +isIlxForAllKind h = + ( h `eqKind` liftedTypeKind) || + ( h `eqKind` unliftedTypeKind) || + ( h `eqKind` openTypeKind) + +isIlxTyVar v = isTyVar v && isIlxForAllKind (tyVarKind v) + +categorizeVars fvs = (ilx_tvs, non_ilx_tvs, vs) + where + (tvs, vs) = partition isTyVar fvs + (ilx_tvs, non_ilx_tvs) = categorizeTyVars tvs + +categorizeTyVars tyvs = partition isIlxTyVar tyvs + +pprValArgTys ppr_ty env tys = parens (pprSepWithCommas (ppr_ty env) tys) + +pprId id = singleQuotes (ppr id) + +\end{code} + +%************************************************************************ +%* * +\subsection{IlxEnv} +%* * +%************************************************************************ + +\begin{code} +type IlxTyEnv = [TyVar] +emptyIlxTyEnv = [] + +-- Nb. There is currently no distinction between the kinds of type variables. +-- We may need to add this to print out correct numbers, esp. for +-- "forall" types +extendIlxTyEnvWithFreeTyVars env tyvars = env ++ mkIlxTyEnv tyvars -- bound by .closure x<...> in a closure declared with type parameters +extendIlxTyEnvWithFormalTyVars env tyvars = env ++ mkIlxTyEnv tyvars -- bound by "forall <...>" in a type +extendIlxTyEnvWithTyArgs env tyvars = env ++ mkIlxTyEnv tyvars -- bound by "<...>" in a closure implementing a universal type + +formalIlxTyEnv tyvars = mkIlxTyEnv tyvars +mkIlxTyEnv tyvars = [ v | v <- tyvars, isIlxTyVar v ] + +data HowBound = Top Module -- Bound in a modules + | Arg -- Arguments to the enclosing closure + | CloVar Int -- A free variable of the enclosing closure + -- The int is the index of the field in the + -- environment + | Local -- Local let binding + +-- The SDoc prints a unique name for the syntactic block we're currently processing, +-- e.g. Foo_bar_baz when inside closure baz inside closure bar inside module Foo. +data IlxEnv = IlxEnv (Module, IlxTyEnv, IdEnv HowBound,IdEnv (IlxEnv, StgRhs), Place,Bool) +type Place = (SDoc,SDoc) + +ilxTyVar env tv + = go 0 (ilxEnvTyEnv env) + where + go n [] + = pprTrace "ilxTyVar" (pprId tv <+> text "tv_env = { " + <+> pprSepWithCommas + (\x -> pprId x <+> text ":" <+> ilxTypeR env (tyVarKind x)) + (ilxEnvTyEnv env) <+> text "}") + (char '!' <> pprId tv) + go n (x:xs) + = {- pprTrace "go" (ppr (tyVarName tv) <+> ppr (tyVarName x)) -} + (if tyVarName x== tyVarName tv then char '!' <> int n <+> ilxComment (char '!' 
<> pprId tv) + else go (n+1) xs) + +emptyIlxEnv :: Bool -> Module -> IlxEnv +emptyIlxEnv trace mod = IlxEnv (mod, emptyIlxTyEnv, emptyVarEnv, emptyVarEnv, (ppr mod,empty),trace) + +nextPlace place sdoc = place <> sdoc +usePlace place sdoc = place <> sdoc + +ilxEnvModule (IlxEnv (m, _, _, _, _,_)) = m +ilxEnvSetPlace (IlxEnv (m, tv_env, id_env, bind_env, (mod,exact),tr)) sdoc + = IlxEnv (m, tv_env, id_env, bind_env, (mod, sdoc),tr) +ilxEnvNextPlace (IlxEnv (m, tv_env, id_env, bind_env, (mod,exact),tr)) sdoc + = IlxEnv (m, tv_env, id_env, bind_env, (mod, nextPlace exact sdoc),tr) +ilxEnvQualifyByModule (IlxEnv (_, _, _, _,(mod,_),_)) sdoc = usePlace mod sdoc +ilxEnvQualifyByExact (IlxEnv (_, _, _, _,(mod,exact),_)) sdoc = usePlace mod sdoc <> usePlace exact sdoc + +ilxPlaceStgBindDefault env = ilxEnvNextPlace env (text "D") +ilxPlaceStgRhsClosure env bndr = ilxEnvSetPlace env (ppr bndr) -- binders are already unique +ilxPlaceStgCaseScrut env = ilxEnvNextPlace env (text "S") + +ilxPlaceAlt :: IlxEnv -> Int -> IlxEnv +ilxPlaceAlt env i = ilxEnvNextPlace env (text "a" <> int i) +ilxPlacePrimAltLit env i = ilxEnvNextPlace env (text "P" <> int i) +ilxMapPlaceArgs start f env args = [ f (ilxEnvNextPlace env (text "A" <> int i)) a | (i,a) <- [start..] `zip` args ] +ilxMapPlaceAlts f env alts = [ f (ilxPlaceAlt env i) alt | (i,alt) <- [1..] `zip` alts ] + +extendIlxEnvWithFreeTyVars (IlxEnv (mod, tv_env, id_env, bind_env, place,tr)) tyvars + = IlxEnv (mod, extendIlxTyEnvWithFreeTyVars tv_env tyvars,id_env, bind_env, place,tr) + +extendIlxEnvWithFormalTyVars (IlxEnv (mod, tv_env, id_env, bind_env, place,tr)) tyvars + = IlxEnv (mod, extendIlxTyEnvWithFormalTyVars tv_env tyvars,id_env, bind_env, place,tr) + +extendIlxEnvWithTyArgs (IlxEnv (mod, tv_env, id_env, bind_env, place,tr)) tyvars + = IlxEnv (mod, extendIlxTyEnvWithTyArgs tv_env tyvars,id_env, bind_env, place,tr) + +extendIlxEnvWithArgs :: IlxEnv -> [Var] -> IlxEnv +extendIlxEnvWithArgs (IlxEnv (mod, tv_env, id_env, bind_env, place,tr)) args + = IlxEnv (mod, extendIlxTyEnvWithTyArgs tv_env [tv | tv <- args, isIlxTyVar tv], + extendVarEnvList id_env [(v,Arg) | v <- args, not (isIlxTyVar v)], + bind_env, place,tr) + +extendIlxEnvWithFreeVars (IlxEnv (mod, tv_env, id_env, bind_env, place,tr)) args + = IlxEnv (mod, + extendIlxTyEnvWithFreeTyVars tv_env [tv | tv <- args, isIlxTyVar tv], + extendVarEnvList id_env (clovs 0 args), + bind_env, + place,tr) + where + clovs _ [] = [] + clovs n (x:xs) = if not (isIlxTyVar x) then (x,CloVar n):clovs (n+1) xs else clovs n xs + +extendIlxEnvWithBinds env@(IlxEnv (mod, tv_env, id_env, bind_env, place,tr)) bnds + = IlxEnv (mod, tv_env, id_env, + extendVarEnvList bind_env [(v,(env,rhs)) | (v,rhs) <- bnds], + place,tr) + +extendIlxEnvWithLocals (IlxEnv (m, tv_env, id_env, bind_env, p,tr)) locals + = IlxEnv (m, tv_env, + extendVarEnvList id_env [(v,Local) | (LocalId v,_) <- locals], + extendVarEnvList bind_env [(v,(env,rhs)) | (LocalId v,Just (env,rhs)) <- locals], + p,tr) +extendIlxEnvWithTops env@(IlxEnv (m, tv_env, id_env, bind_env, place,tr)) mod binds + = IlxEnv (m, tv_env, + extendVarEnvList id_env [(bndr,Top mod) | (bndr,rhs) <- binds], + extendVarEnvList bind_env [(bndr,(env, rhs)) | (bndr,rhs) <- binds], + place,tr) + +formalIlxEnv (IlxEnv (m, tv_env, id_env, bind_env, place, tr)) tyvars + = IlxEnv (m, formalIlxTyEnv tyvars, id_env, bind_env, place, tr) + +ilxEnvTyEnv :: IlxEnv -> IlxTyEnv +ilxEnvTyEnv (IlxEnv (_, tv_env, _,_,_,_)) = tv_env +elemIlxTyEnv var env = elem var (ilxEnvTyEnv env ) +elemIlxVarEnv 
var (IlxEnv (_, _, id_env,_,_,_)) = elemVarEnv var id_env +lookupIlxVarEnv (IlxEnv (_, _, id_env,_,_,_)) var = lookupVarEnv id_env var +lookupIlxBindEnv (IlxEnv (_, _, _, bind_env,_,_)) var = lookupVarEnv bind_env var + +\end{code} + + +\begin{code} +type IlxLabel = SDoc + +pprIlxLabel lbl = lbl + +mkJoinLabel :: Id -> IlxLabel +mkJoinLabel v = text "J_" <> ppr v + +mkAltLabel :: Id -> Int -> IlxLabel +mkAltLabel v n = text "A" <> int n <> ppr v + +ilxLabel :: IlxLabel -> SDoc +ilxLabel lbl = line $$ (pprIlxLabel lbl <> colon) +\end{code} + + +%************************************************************************ +%* * +\subsection{Local pretty helper functions} +%* * +%************************************************************************ + +\begin{code} +pprSepWithCommas :: (a -> SDoc) -> [a] -> SDoc +pprSepWithCommas pp xs = sep (punctuate comma (map pp xs)) +ilxComment pp = text "/*" <+> pp <+> text "*/" +singleQuotes pp = char '\'' <> pp <> char '\'' + +line = text "// ----------------------------------" + +hscOptionQual = text ".i_" + +nameReference env n + | isInternalName n = empty + | ilxEnvModule env == nameModule n = text "" + | isHomeModule (nameModule n) = moduleNameReference (moduleName (nameModule n)) +-- HACK: no Vanilla modules should be around, but they are!! This +-- gets things working for the scenario "standard library linked as one +-- assembly with multiple modules + a one module program running on top of this" +-- Same applies to all other mentions of Vailla modules in this file + | isVanillaModule (nameModule n) && not inPrelude = basePackageReference + | isVanillaModule (nameModule n) && inPrelude = moduleNameReference (moduleName (nameModule n)) +-- end hack + | otherwise = packageReference (modulePackage (nameModule n)) + +packageReference p = brackets (singleQuotes (ppr p <> hscOptionQual)) +moduleNameReference m = brackets ((text ".module") <+> (singleQuotes (pprModuleName m <> hscOptionQual <> text "o"))) + +moduleReference env m + | ilxEnvModule env == m = text "" + | isHomeModule m = moduleNameReference (moduleName m) + -- See hack above + | isVanillaModule m && not inPrelude = basePackageReference + | isVanillaModule m && inPrelude = moduleNameReference (moduleName m) + -- end hack + | otherwise = packageReference (modulePackage m) + +basePackageReference = packageReference basePackage +inPrelude = basePackage == opt_InPackage + +------------------------------------------------ +-- This code is copied from absCSyn/CString.lhs, +-- and modified to do the correct thing! It's +-- still a mess though. Also, still have to do the +-- right thing for embedded nulls. + +pprFSInILStyle :: FastString -> SDoc +pprFSInILStyle fs = doubleQuotes (text (stringToC (unpackFS fs))) + +stringToC :: String -> String +-- Convert a string to the form required by C in a C literal string +-- Tthe hassle is what to do w/ strings like "ESC 0"... +stringToC "" = "" +stringToC [c] = charToC c +stringToC (c:cs) + -- if we have something "octifiable" in "c", we'd better "octify" + -- the rest of the string, too. + = if (c < ' ' || c > '~') + then (charToC c) ++ (concat (map char_to_C cs)) + else (charToC c) ++ (stringToC cs) + where + char_to_C c | c == '\n' = "\\n" -- use C escapes when we can + | c == '\a' = "\\a" + | c == '\b' = "\\b" -- ToDo: chk some of these... 
+ | c == '\r' = "\\r" + | c == '\t' = "\\t" + | c == '\f' = "\\f" + | c == '\v' = "\\v" + | otherwise = '\\' : (trigraph (ord c)) + +charToC :: Char -> String +-- Convert a character to the form reqd in a C character literal +charToC c = if (c >= ' ' && c <= '~') -- non-portable... + then case c of + '\'' -> "\\'" + '\\' -> "\\\\" + '"' -> "\\\"" + '\n' -> "\\n" + '\a' -> "\\a" + '\b' -> "\\b" + '\r' -> "\\r" + '\t' -> "\\t" + '\f' -> "\\f" + '\v' -> "\\v" + _ -> [c] + else '\\' : (trigraph (ord c)) + +trigraph :: Int -> String +trigraph n + = [chr ((n `div` 100) `rem` 10 + ord '0'), + chr ((n `div` 10) `rem` 10 + ord '0'), + chr (n `rem` 10 + ord '0')] + + +\end{code} + +%************************************************************************ +%* * +\subsection{PrimOps and Constructors} +%* * +%************************************************************************ + +\begin{code} +---------------------------- +-- Allocate a fresh constructor + +ilxConApp env data_con args + | isUnboxedTupleCon data_con + = let tm_args' = filter (not. isVoidIlxRepType . stgArgType) tm_args in + case tm_args' of + [h] -> + -- Collapse the construction of an unboxed tuple type where + -- every element is zero-sized + vcat (ilxMapPlaceArgs 0 pushArg env tm_args') + _ -> + -- Minimize the construction of an unboxed tuple type, which + -- may contain zero-sized elements. Recompute all the + -- bits and pieces from the simpler case below for the new data + -- type constructor.... + let data_con' = tupleCon Unboxed (length tm_args') in + let rep_ty_args' = filter (not . isVoidIlxRepType) rep_ty_args in + + let tycon' = dataConTyCon data_con' in + let (formal_tyvars', formal_tau_ty') = splitForAllTys (dataConRepType data_con') in + let (formal_arg_tys', _) = splitFunTys formal_tau_ty' in + let formal_env' = formalIlxEnv env formal_tyvars' in + + vcat [vcat (ilxMapPlaceArgs 0 pushArg env tm_args'), + sep [text "newobj void ", + ilxTyConApp env tycon' rep_ty_args', + text "::.ctor", + pprValArgTys ilxTypeR formal_env' (map deepIlxRepType formal_arg_tys') + ] + ] + | otherwise + -- Now all other constructions + = -- Assume C :: forall a. a -> T a -> T a + -- ldloc x arg of type Int + -- ldloc y arg of type T Int + -- newdata classunion T<Int32>, C(!0, T <!0>) + -- + let tycon = dataConTyCon data_con in + let (formal_tyvars, formal_tau_ty) = splitForAllTys (dataConRepType data_con) in + let (formal_arg_tys, _) = splitFunTys formal_tau_ty in + + vcat [vcat (ilxMapPlaceArgs 0 pushArg env tm_args), + sep [ text "newdata", + nest 2 (ilxTyConApp env tycon rep_ty_args <> comma), + nest 2 (ilxConRef env data_con) + ] + ] + where + tycon = dataConTyCon data_con + rep_ty_args = map deepIlxRepType ty_args + (ty_args,tm_args) = if isAlgTyCon tycon then splitTyArgs (tyConTyVars tycon) args else splitTyArgs1 args + +-- Split some type arguments off, throwing away the higher kinded ones for the moment. +-- Base the higher-kinded checks off a corresponding list of formals. 
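+-- A Prelude-only picture of the traversal below (names are invented for
+-- illustration; Bool stands in for "this formal is an ILX type variable",
+-- and Left/Right stand in for type/term arguments):
+illustrateSplitLeading :: [Bool] -> [Either ty arg] -> ([ty], [Either ty arg])
+illustrateSplitLeading (keep:fs) (Left ty : rest)
+  | keep      = let (tys, args) = illustrateSplitLeading fs rest in (ty:tys, args)
+  | otherwise = illustrateSplitLeading fs rest   -- higher-kinded formal: dropped
+illustrateSplitLeading _ args = ([], args)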
+splitTyArgs :: [Var] -- Formals + -> [StgArg] -- Actuals + -> ([Type], [StgArg]) +splitTyArgs (htv:ttv) (StgTypeArg h:t) + | isIlxTyVar htv = ((h:l), r) + | otherwise = trace "splitTyArgs: threw away higher kinded type arg" (l, r) + where (l,r) = splitTyArgs ttv t +splitTyArgs _ l = ([],l) + +-- Split some type arguments off, where none should be higher kinded +splitTyArgs1 :: [StgArg] -> ([Type], [StgArg]) +splitTyArgs1 (StgTypeArg ty : args) = (ty:tys, args') + where + (tys, args') = splitTyArgs1 args +splitTyArgs1 args = ([], args) + +ilxConRef env data_con + | isUnboxedTupleCon data_con + = let data_con' = tupleCon Unboxed (length non_void_args)in + pprId data_con' <> arg_text + | otherwise + = pprId data_con <> arg_text + where + arg_text = pprValArgTys ilxTypeL env' (map deepIlxRepType non_void_args) + non_void_args = filter (not . isVoidIlxRepType) arg_tys + (tyvars, tau_ty) = splitForAllTys (dataConRepType data_con) + (arg_tys, _) = splitFunTys tau_ty + env' = formalIlxEnv env tyvars + + + + +\end{code} + + +%************************************************************************ +%* * +\subsection{PrimOps and Prim Representations} * +%************************************************************************ + +\begin{code} + +ilxPrimApp env op args ret_ty = ilxPrimOpTable op args env + + +type IlxTyFrag = IlxEnv -> SDoc +ilxType s env = text s + +ilxLift ty env = text "thunk" <> angleBrackets (ty env) + +ilxTypeSeq :: [IlxTyFrag] -> IlxTyFrag +ilxTypeSeq ops env = hsep (map (\x -> x env) ops) + +tyPrimConTable :: UniqFM ([Type] -> IlxTyFrag) +tyPrimConTable = + listToUFM [(addrPrimTyConKey, (\_ -> repAddr)), +-- (fileStreamPrimTyConKey, (\_ -> repFileStream)), + (foreignObjPrimTyConKey, (\_ -> repForeign)), + (stablePtrPrimTyConKey, (\[ty] -> repStablePtr {- (ilxTypeL2 ty) -})), + (stableNamePrimTyConKey, (\[ty] -> repStableName {- (ilxTypeL2 ty) -} )), + (charPrimTyConKey, (\_ -> repChar)), + (wordPrimTyConKey, (\_ -> repWord)), + (byteArrayPrimTyConKey, (\_ -> repByteArray)), + (intPrimTyConKey, (\_ -> repInt)), + (int64PrimTyConKey, (\_ -> repInt64)), + (word64PrimTyConKey, (\_ -> repWord64)), + (floatPrimTyConKey, (\_ -> repFloat)), + (doublePrimTyConKey, (\_ -> repDouble)), + -- These can all also accept unlifted parameter types so we explicitly lift. 
+ (arrayPrimTyConKey, (\[ty] -> repArray (ilxTypeL2 ty))), + (mutableArrayPrimTyConKey, (\[_, ty] -> repMutArray (ilxTypeL2 ty))), + (weakPrimTyConKey, (\[ty] -> repWeak (ilxTypeL2 ty))), + (mVarPrimTyConKey, (\[_, ty] -> repMVar (ilxTypeL2 ty))), + (mutVarPrimTyConKey, (\[ty1, ty2] -> repMutVar (ilxTypeL2 ty1) (ilxTypeL2 ty2))), + (mutableByteArrayPrimTyConKey, (\_ -> repByteArray)), + (threadIdPrimTyConKey, (\_ -> repThread)), + (bcoPrimTyConKey, (\_ -> repBCO)) + ] + +ilxTypeL2 :: Type -> IlxTyFrag +ilxTypeL2 ty env = ilxTypeL env ty +ilxTypeR2 :: Type -> IlxTyFrag +ilxTypeR2 ty env = ilxTypeR env ty + +ilxMethTyVarA = ilxType "!!0" +ilxMethTyVarB = ilxType "!!1" +prelGHCReference :: IlxTyFrag +prelGHCReference env = + if ilxEnvModule env == mkHomeModule (mkModuleName "PrelGHC") then empty + else if inPrelude then moduleNameReference (mkModuleName "PrelGHC") + else basePackageReference + +prelBaseReference :: IlxTyFrag +prelBaseReference env = + if ilxEnvModule env == mkHomeModule (mkModuleName "PrelBase") then empty + else if inPrelude then moduleNameReference (mkModuleName "PrelBase") + else basePackageReference + +repThread = ilxType "class [mscorlib]System.Threading.Thread /* ThreadId# */ " +repByteArray = ilxType "unsigned int8[] /* ByteArr# */ " +--repFileStream = text "void * /* FileStream# */ " -- text "class [mscorlib]System.IO.FileStream" +repInt = ilxType "int32" +repWord = ilxType "unsigned int32" +repAddr =ilxType "/* Addr */ void *" +repInt64 = ilxType "int64" +repWord64 = ilxType "unsigned int64" +repFloat = ilxType "float32" +repDouble = ilxType "float64" +repChar = ilxType "/* Char */ unsigned int8" +repForeign = ilxTypeSeq [ilxType "class ",prelGHCReference,ilxType "PrelGHC_Foreignzh"] +repInteger = ilxUnboxedPairRep repInt repByteArray +repIntegerPair = ilxUnboxedQuadRep repInt repByteArray repInt repByteArray +repArray ty = ilxTypeSeq [ty,ilxType "[]"] +repMutArray ty = ilxTypeSeq [ty,ilxType "[]"] +repMVar ty = ilxTypeSeq [ilxType "class ",prelGHCReference,ilxType "PrelGHC_MVarzh",ilxTyParams [ty]] +repMutVar _ ty2 = ilxTypeSeq [ilxType "class ",prelGHCReference,ilxType "PrelGHC_MutVarzh",ilxTyParams [ty2]] +repWeak ty1 = ilxTypeSeq [ilxType "class ",prelGHCReference,ilxType "PrelGHC_Weakzh",ilxTyParams [ty1]] +repStablePtr {- ty1 -} = ilxTypeSeq [ilxType "class ",prelGHCReference,ilxType "PrelGHC_StablePtrzh" {- ,ilxTyParams [ty1] -} ] +repStableName {- ty1 -} = ilxTypeSeq [ilxType "class ",prelGHCReference,ilxType "PrelGHC_StableNamezh" {- ,ilxTyParams [ty1] -} ] +classWeak = ilxTypeSeq [ilxType "class ",prelGHCReference,ilxType "PrelGHC_Weakzh"] +repBCO = ilxTypeSeq [ilxType "class ",prelGHCReference,ilxType "PrelGHC_BCOzh"] + +ilxTyPair l r = ilxTyParams [l,r] +ilxTyTriple l m r = ilxTyParams [l,m,r] +ilxTyQuad l m1 m2 r = ilxTyParams [l,m1,m2,r] +ilxUnboxedEmptyRep = ilxTypeSeq [ilxType "value class",prelGHCReference,ilxType "PrelGHC_Z1H"] +ilxUnboxedPairRep l r = ilxTypeSeq [ilxType "value class",prelGHCReference,ilxType "PrelGHC_Z2H",ilxTyPair l r] +ilxUnboxedTripleRep l m r = ilxTypeSeq [ilxType "value class",prelGHCReference,ilxType "PrelGHC_Z3H",ilxTyTriple l m r] +ilxUnboxedQuadRep l m1 m2 r = ilxTypeSeq [ilxType "value class",prelGHCReference,ilxType "PrelGHC_Z4H",ilxTyQuad l m1 m2 r] + +ilxTyIO b = ilxTypeSeq [ilxType "(func ( /* unit skipped */ ) --> ", b, ilxType ")"] + +ilxTyParams :: [IlxTyFrag] -> IlxTyFrag +ilxTyParams [] env = empty +ilxTyParams l env = angleBrackets (ilxTyParamsAux l env) + where + ilxTyParamsAux [] env = empty + ilxTyParamsAux 
[h] env = h env + ilxTyParamsAux (h:t) env = h env <> text "," <+> ilxTyParamsAux t env + ilxTyParams [] env = empty + + +type IlxOpFrag = IlxEnv -> SDoc +ilxOp :: String -> IlxOpFrag +ilxOp s env = text s +ilxOpSeq :: [IlxOpFrag] -> IlxOpFrag +ilxOpSeq ops env = hsep (map (\x -> x env) ops) + +ilxParams :: [IlxOpFrag] -> IlxOpFrag +ilxParams l env = parens (ilxParamsAux l env) + where + ilxParamsAux [] env = empty + ilxParamsAux [h] env = h env + ilxParamsAux (h:t) env = h env <> text "," <+> ilxParamsAux t env + + +ilxMethodRef rty cls nm tyargs args = + ilxOpSeq [rty,cls,ilxOp "::",ilxOp nm, + ilxTyParams tyargs,ilxParams args] + +ilxCall m = ilxOpSeq [ilxOp "call", m] + +ilxSupportClass = ilxOpSeq [prelGHCReference, ilxOp "'GHC.support'"] +ilxSuppMeth rty nm tyargs args = ilxMethodRef rty ilxSupportClass nm tyargs args + +ilxCallSuppMeth rty nm tyargs args = ilxCall (ilxSuppMeth rty nm tyargs args) + +ilxMkBool :: IlxOpFrag +ilxMkBool = ilxOpSeq [ilxOp "call class",prelBaseReference, + ilxOp "PrelBase_Bool", + prelGHCReference,ilxOp "GHC.support::mkBool(bool)"] +ilxCgt = ilxOpSeq [ilxOp "cgt",ilxMkBool] +ilxCge = ilxOpSeq [ilxOp "clt ldc.i4 0 ceq ",ilxMkBool] +ilxClt = ilxOpSeq [ilxOp "clt ",ilxMkBool] +ilxCle = ilxOpSeq [ilxOp "cgt ldc.i4 0 ceq ",ilxMkBool] +ilxCeq = ilxOpSeq [ilxOp "ceq ",ilxMkBool] +ilxCne = ilxOpSeq [ilxOp "ceq ldc.i4 0 ceq " ,ilxMkBool] +ilxCgtUn = ilxOpSeq [ilxOp "cgt.un ",ilxMkBool] +ilxCgeUn = ilxOpSeq [ilxOp "clt.un ldc.i4 0 ceq ",ilxMkBool] +ilxCltUn = ilxOpSeq [ilxOp "clt.un ",ilxMkBool] +ilxCleUn = ilxOpSeq [ilxOp "cgt.un ldc.i4 0 ceq ",ilxMkBool] + +ilxAddrOfForeignOp = ilxOpSeq [ilxOp "ldfld void *" , repForeign, ilxOp "::contents"] +ilxAddrOfByteArrOp = ilxOp "ldc.i4 0 ldelema unsigned int8" + +ilxPrimOpTable :: PrimOp -> [StgArg] -> IlxOpFrag +ilxPrimOpTable op + = case op of + CharGtOp -> simp_op ilxCgt + CharGeOp -> simp_op ilxCge + CharEqOp -> simp_op ilxCeq + CharNeOp -> simp_op ilxCne + CharLtOp -> simp_op ilxClt + CharLeOp -> simp_op ilxCle + + OrdOp -> simp_op (ilxOp "conv.i4") -- chars represented by UInt32 (u4) + ChrOp -> simp_op (ilxOp "conv.u4") + + IntGtOp -> simp_op ilxCgt + IntGeOp -> simp_op ilxCge + IntEqOp -> simp_op ilxCeq + IntNeOp -> simp_op ilxCne + IntLtOp -> simp_op ilxClt + IntLeOp -> simp_op ilxCle + + Narrow8IntOp -> simp_op (ilxOp"conv.i1") + Narrow16IntOp -> simp_op (ilxOp "conv.i2") + Narrow32IntOp -> simp_op (ilxOp "conv.i4") + Narrow8WordOp -> simp_op (ilxOp "conv.u1") + Narrow16WordOp -> simp_op (ilxOp "conv.u2") + Narrow32WordOp -> simp_op (ilxOp "conv.u4") + + WordGtOp -> simp_op ilxCgtUn + WordGeOp -> simp_op ilxCgeUn + WordEqOp -> simp_op ilxCeq + WordNeOp -> simp_op ilxCne + WordLtOp -> simp_op ilxCltUn + WordLeOp -> simp_op ilxCleUn + + AddrGtOp -> simp_op ilxCgt + AddrGeOp -> simp_op ilxCge + AddrEqOp -> simp_op ilxCeq + AddrNeOp -> simp_op ilxCne + AddrLtOp -> simp_op ilxClt + AddrLeOp -> simp_op ilxCle + + FloatGtOp -> simp_op ilxCgt + FloatGeOp -> simp_op ilxCge + FloatEqOp -> simp_op ilxCeq + FloatNeOp -> simp_op ilxCne + FloatLtOp -> simp_op ilxClt + FloatLeOp -> simp_op ilxCle + + DoubleGtOp -> simp_op ilxCgt + DoubleGeOp -> simp_op ilxCge + DoubleEqOp -> simp_op ilxCeq + DoubleNeOp -> simp_op ilxCne + DoubleLtOp -> simp_op ilxClt + DoubleLeOp -> simp_op ilxCle + + -- Int#-related ops: + IntAddOp -> simp_op (ilxOp "add") + IntSubOp -> simp_op (ilxOp "sub") + IntMulOp -> simp_op (ilxOp "mul") + IntQuotOp -> simp_op (ilxOp "div") + IntNegOp -> simp_op (ilxOp "neg") + IntRemOp -> simp_op (ilxOp "rem") + + -- 
Addr# ops: + AddrAddOp -> simp_op (ilxOp "add") + AddrSubOp -> simp_op (ilxOp "sub") + AddrRemOp -> simp_op (ilxOp "rem") + Int2AddrOp -> warn_op "int2Addr" (simp_op (ilxOp "/* PrimOp int2Addr */ ")) + Addr2IntOp -> warn_op "addr2Int" (simp_op (ilxOp "/* PrimOp addr2Int */ ")) + + -- Word#-related ops: + WordAddOp -> simp_op (ilxOp "add") + WordSubOp -> simp_op (ilxOp "sub") + WordMulOp -> simp_op (ilxOp "mul") + WordQuotOp -> simp_op (ilxOp "div") + WordRemOp -> simp_op (ilxOp "rem") + + ISllOp -> simp_op (ilxOp "shl") + ISraOp -> simp_op (ilxOp "shr") + ISrlOp -> simp_op (ilxOp "shr.un") + IntAddCOp -> simp_op (ilxCallSuppMeth (ilxUnboxedPairRep repInt repInt) "IntAddCOp" [] [repInt, repInt]) + IntSubCOp -> simp_op (ilxCallSuppMeth (ilxUnboxedPairRep repInt repInt) "IntSubCOp" [] [repInt, repInt]) + IntGcdOp -> simp_op (ilxCallSuppMeth repInt "IntGcdOp" [] [repInt, repInt]) + + + -- Word#-related ops: + AndOp -> simp_op (ilxOp "and") + OrOp -> simp_op (ilxOp "or") + NotOp -> simp_op (ilxOp "not") + XorOp -> simp_op (ilxOp "xor") + SllOp -> simp_op (ilxOp "shl") + SrlOp -> simp_op (ilxOp "shr") + Word2IntOp -> simp_op (ilxOp "conv.i4") + Int2WordOp -> simp_op (ilxOp "conv.u4") + + -- Float#-related ops: + FloatAddOp -> simp_op (ilxOp "add") + FloatSubOp -> simp_op (ilxOp "sub") + FloatMulOp -> simp_op (ilxOp "mul") + FloatDivOp -> simp_op (ilxOp "div") + FloatNegOp -> simp_op (ilxOp "neg") + Float2IntOp -> simp_op (ilxOp "conv.i4") + Int2FloatOp -> simp_op (ilxOp "conv.r4") + + DoubleAddOp -> simp_op (ilxOp "add") + DoubleSubOp -> simp_op (ilxOp "sub") + DoubleMulOp -> simp_op (ilxOp "mul") + DoubleDivOp -> simp_op (ilxOp "div") + DoubleNegOp -> simp_op (ilxOp "neg") + Double2IntOp -> simp_op (ilxOp "conv.i4") + Int2DoubleOp -> simp_op (ilxOp "conv.r4") + Double2FloatOp -> simp_op (ilxOp "conv.r4") + Float2DoubleOp -> simp_op (ilxOp "conv.r8") + DoubleDecodeOp -> simp_op (ilxCallSuppMeth (ilxUnboxedTripleRep repInt repInt repByteArray) "decodeDouble" [] [ilxType "float64"]) + FloatDecodeOp -> simp_op (ilxCallSuppMeth (ilxUnboxedTripleRep repInt repInt repByteArray) "decodeFloat" [] [ilxType "float32"]) + + FloatExpOp -> simp_op (ilxOp "conv.r8 call float64 [mscorlib]System.Math::Exp(float64) conv.r4") + FloatLogOp -> simp_op (ilxOp "conv.r8 call float64 [mscorlib]System.Math::Log(float64) conv.r4") + FloatSqrtOp -> simp_op (ilxOp "conv.r8 call float64 [mscorlib]System.Math::Sqrt(float64) conv.r4") + FloatSinOp -> simp_op (ilxOp "conv.r8 call float64 [mscorlib]System.Math::Sin(float64) conv.r4") + FloatCosOp -> simp_op (ilxOp "conv.r8 call float64 [mscorlib]System.Math::Cos(float64) conv.r4") + FloatTanOp -> simp_op (ilxOp "conv.r8 call float64 [mscorlib]System.Math::Tan(float64) conv.r4") + FloatAsinOp -> simp_op (ilxOp "conv.r8 call float64 [mscorlib]System.Math::Asin(float64) conv.r4") + FloatAcosOp -> simp_op (ilxOp "conv.r8 call float64 [mscorlib]System.Math::Acos(float64) conv.r4") + FloatAtanOp -> simp_op (ilxOp "conv.r8 call float64 [mscorlib]System.Math::Atan(float64) conv.r4") + FloatSinhOp -> simp_op (ilxOp "conv.r8 call float64 [mscorlib]System.Math::Sinh(float64) conv.r4") + FloatCoshOp -> simp_op (ilxOp "conv.r8 call float64 [mscorlib]System.Math::Cosh(float64) conv.r4") + FloatTanhOp -> simp_op (ilxOp "conv.r8 call float64 [mscorlib]System.Math::Tanh(float64) conv.r4") + FloatPowerOp -> simp_op (ilxOp "call float64 [mscorlib]System.Math::Pow(float64, float64) conv.r4") -- ** op, make use of implicit cast to r8... 
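+ -- Note: each entry in this table pairs an IL instruction fragment with one of
+ -- the combinators (simp_op, ty1_op, ty2_op, arg2_op, ...) defined after the
+ -- table.  simp_op, for instance, pushes the primop's term arguments and then
+ -- emits the given instruction, so applying IntAddOp to x and y comes out
+ -- roughly as "<push x> <push y> add".  That sketch is only illustrative; the
+ -- combinator definitions below are authoritative.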
+ + DoubleExpOp -> simp_op (ilxOp "call float64 [mscorlib]System.Math::Exp(float64)") + DoubleLogOp -> simp_op (ilxOp "call float64 [mscorlib]System.Math::Log(float64)") + DoubleSqrtOp -> simp_op (ilxOp "call float64 [mscorlib]System.Math::Sqrt(float64)") + + DoubleSinOp -> simp_op (ilxOp "call float64 [mscorlib]System.Math::Sin(float64)") + DoubleCosOp -> simp_op (ilxOp "call float64 [mscorlib]System.Math::Cos(float64)") + DoubleTanOp -> simp_op (ilxOp "call float64 [mscorlib]System.Math::Tan(float64)") + + DoubleAsinOp -> simp_op (ilxOp "call float64 [mscorlib]System.Math::Asin(float64)") + DoubleAcosOp -> simp_op (ilxOp "call float64 [mscorlib]System.Math::Acos(float64)") + DoubleAtanOp -> simp_op (ilxOp "call float64 [mscorlib]System.Math::Atan(float64)") + + DoubleSinhOp -> simp_op (ilxOp "call float64 [mscorlib]System.Math::Sinh(float64)") + DoubleCoshOp -> simp_op (ilxOp "call float64 [mscorlib]System.Math::Cosh(float64)") + DoubleTanhOp -> simp_op (ilxOp "call float64 [mscorlib]System.Math::Tanh(float64)") + + DoublePowerOp -> simp_op (ilxOp "call float64 [mscorlib]System.Math::Pow(float64, float64)") + + -- Integer (and related...) ops: bail out to support routines + IntegerAndOp -> simp_op (ilxCallSuppMeth repInteger "IntegerAndOp" [] [repInt, repByteArray, repInt, repByteArray]) + IntegerOrOp -> simp_op (ilxCallSuppMeth repInteger "IntegerOrOp" [] [repInt, repByteArray, repInt, repByteArray]) + IntegerXorOp -> simp_op (ilxCallSuppMeth repInteger "IntegerXorOp" [] [repInt, repByteArray, repInt, repByteArray]) + IntegerComplementOp -> simp_op (ilxCallSuppMeth repInteger "IntegerComplementOp" [] [repInt, repByteArray]) + IntegerAddOp -> simp_op (ilxCallSuppMeth repInteger "IntegerAddOp" [] [repInt, repByteArray, repInt, repByteArray]) + IntegerSubOp -> simp_op (ilxCallSuppMeth repInteger "IntegerSubOp" [] [repInt, repByteArray, repInt, repByteArray]) + IntegerMulOp -> simp_op (ilxCallSuppMeth repInteger "IntegerMulOp" [] [repInt, repByteArray, repInt, repByteArray]) + IntegerGcdOp -> simp_op (ilxCallSuppMeth repInteger "IntegerGcdOp" [] [repInt, repByteArray, repInt, repByteArray]) + IntegerQuotRemOp -> simp_op (ilxCallSuppMeth repIntegerPair "IntegerQuotRemOp" [] [repInt, repByteArray, repInt, repByteArray]) + IntegerDivModOp -> simp_op (ilxCallSuppMeth repIntegerPair "IntegerDivModOp" [] [repInt, repByteArray, repInt, repByteArray]) + IntegerIntGcdOp -> simp_op (ilxCallSuppMeth repInt "IntegerIntGcdOp" [] [repInt, repByteArray, repInt]) + IntegerDivExactOp -> simp_op (ilxCallSuppMeth repInteger "IntegerDivExactOp" [] [repInt, repByteArray, repInt, repByteArray]) + IntegerQuotOp -> simp_op (ilxCallSuppMeth repInteger "IntegerQuotOp" [] [repInt, repByteArray, repInt, repByteArray]) + IntegerRemOp -> simp_op (ilxCallSuppMeth repInteger "IntegerRemOp" [] [repInt, repByteArray, repInt, repByteArray]) + IntegerCmpOp -> simp_op (ilxCallSuppMeth repInt "IntegerCmpOp" [] [repInt, repByteArray, repInt, repByteArray]) + IntegerCmpIntOp -> simp_op (ilxCallSuppMeth repInt "IntegerCmpIntOp" [] [repInt, repByteArray, repInt]) + Integer2IntOp -> simp_op (ilxCallSuppMeth repInt "Integer2IntOp" [] [repInt, repByteArray]) + Integer2WordOp -> simp_op (ilxCallSuppMeth repWord "Integer2WordOp" [] [repInt, repByteArray]) + Int2IntegerOp -> simp_op (ilxCallSuppMeth repInteger "Int2IntegerOp" [] [repInt]) + Word2IntegerOp -> simp_op (ilxCallSuppMeth repInteger "Word2IntegerOp" [] [repWord]) +-- IntegerToInt64Op -> simp_op (ilxCallSuppMeth repInt64 "IntegerToInt64Op" [] [repInt,repByteArray]) + 
Int64ToIntegerOp -> simp_op (ilxCallSuppMeth repInteger "Int64ToIntegerOp" [] [repInt64]) +-- IntegerToWord64Op -> simp_op (ilxCallSuppMeth repWord64 "IntegerToWord64Op" [] [repInt,repByteArray]) + Word64ToIntegerOp -> simp_op (ilxCallSuppMeth repInteger "Word64ToIntegerOp" [] [repWord64]) + + + + IndexByteArrayOp_Char -> simp_op (ilxOp "ldelem.u1") + IndexByteArrayOp_WideChar -> simp_op (ilxOp "ldelem.u4") + IndexByteArrayOp_Int -> simp_op (ilxOp "ldelem.i4") + IndexByteArrayOp_Word -> simp_op (ilxOp "ldelem.u4") + IndexByteArrayOp_Addr -> simp_op (ilxOp "ldelem.u") + IndexByteArrayOp_Float -> simp_op (ilxOp "ldelem.r4") + IndexByteArrayOp_Double -> simp_op (ilxOp "ldelem.r8") + IndexByteArrayOp_StablePtr -> simp_op (ilxOp "ldelem.ref") + IndexByteArrayOp_Int8 -> simp_op (ilxOp "ldelem.i1") + IndexByteArrayOp_Int16 -> simp_op (ilxOp "ldelem.i2") + IndexByteArrayOp_Int32 -> simp_op (ilxOp "ldelem.i4") + IndexByteArrayOp_Int64 -> simp_op (ilxOp "ldelem.i8") + IndexByteArrayOp_Word8 -> simp_op (ilxOp "ldelem.u1") + IndexByteArrayOp_Word16 -> simp_op (ilxOp "ldelem.u2") + IndexByteArrayOp_Word32 -> simp_op (ilxOp "ldelem.u4") + IndexByteArrayOp_Word64 -> simp_op (ilxOp "ldelem.u8") + + {- should be monadic??? -} + ReadByteArrayOp_Char -> simp_op (ilxOp "ldelem.u1") + ReadByteArrayOp_WideChar -> simp_op (ilxOp "ldelem.u4") + ReadByteArrayOp_Int -> simp_op (ilxOp "ldelem.i4") + ReadByteArrayOp_Word -> simp_op (ilxOp "ldelem.u4") + ReadByteArrayOp_Addr -> simp_op (ilxOp "ldelem.u") + ReadByteArrayOp_Float -> simp_op (ilxOp "ldelem.r4") + ReadByteArrayOp_Double -> simp_op (ilxOp "ldelem.r8") + ReadByteArrayOp_StablePtr -> simp_op (ilxOp "ldelem.ref") + ReadByteArrayOp_Int8 -> simp_op (ilxOp "ldelem.i1") + ReadByteArrayOp_Int16 -> simp_op (ilxOp "ldelem.i2") + ReadByteArrayOp_Int32 -> simp_op (ilxOp "ldelem.i4") + ReadByteArrayOp_Int64 -> simp_op (ilxOp "ldelem.i8") + ReadByteArrayOp_Word8 -> simp_op (ilxOp "ldelem.u1") + ReadByteArrayOp_Word16 -> simp_op (ilxOp "ldelem.u2") + ReadByteArrayOp_Word32 -> simp_op (ilxOp "ldelem.u4") + ReadByteArrayOp_Word64 -> simp_op (ilxOp "ldelem.u8") + {- MutByteArr# s -> Int# -> State# s -> (# State# s, Char# #) -} + {- ByteArr# -> Int# -> Char# -} + + + WriteByteArrayOp_Char -> simp_op (ilxOp "stelem.u1") + WriteByteArrayOp_WideChar -> simp_op (ilxOp "stelem.u4") + WriteByteArrayOp_Int -> simp_op (ilxOp "stelem.i4") + WriteByteArrayOp_Word -> simp_op (ilxOp "stelem.u4") + WriteByteArrayOp_Addr -> simp_op (ilxOp "stelem.u") + WriteByteArrayOp_Float -> simp_op (ilxOp "stelem.r4") + WriteByteArrayOp_Double -> simp_op (ilxOp "stelem.r8") + WriteByteArrayOp_StablePtr -> simp_op (ilxOp "stelem.ref") + WriteByteArrayOp_Int8 -> simp_op (ilxOp "stelem.i1") + WriteByteArrayOp_Int16 -> simp_op (ilxOp "stelem.i2") + WriteByteArrayOp_Int32 -> simp_op (ilxOp "stelem.i4") + WriteByteArrayOp_Int64 -> simp_op (ilxOp "stelem.i8") + WriteByteArrayOp_Word8 -> simp_op (ilxOp "stelem.u1") + WriteByteArrayOp_Word16 -> simp_op (ilxOp "stelem.u2") + WriteByteArrayOp_Word32 -> simp_op (ilxOp "stelem.u4") + WriteByteArrayOp_Word64 -> simp_op (ilxOp "stelem.i8 /* nb. 
no stelem.u8 */") + {- MutByteArr# s -> Int# -> Char# -> State# s -> State# s -} + + IndexOffAddrOp_Char -> simp_op (ilxOp "sizeof unsigned int8 mul add ldind.u1") + IndexOffAddrOp_WideChar -> simp_op (ilxOp "sizeof int32 mul add ldind.u4") + IndexOffAddrOp_Int -> simp_op (ilxOp "sizeof int32 mul add ldind.i4") + IndexOffAddrOp_Word -> simp_op (ilxOp "sizeof int32 mul add ldind.u4") + IndexOffAddrOp_Addr -> simp_op (ilxOp "sizeof native unsigned int mul add ldind.i") + IndexOffAddrOp_StablePtr -> simp_op (ilxOp "sizeof native unsigned int mul add ldind.ref") + IndexOffAddrOp_Float -> simp_op (ilxOp "sizeof float32 mul add ldind.r4") + IndexOffAddrOp_Double -> simp_op (ilxOp "sizeof float64 mul add ldind.r8") + IndexOffAddrOp_Int8 -> simp_op (ilxOp "sizeof int8 mul add ldind.i1") + IndexOffAddrOp_Int16 -> simp_op (ilxOp "sizeof int16 mul add ldind.i2") + IndexOffAddrOp_Int32 -> simp_op (ilxOp "sizeof int32 mul add ldind.i4") + IndexOffAddrOp_Int64 -> simp_op (ilxOp "sizeof int64 mul add ldind.i8") + IndexOffAddrOp_Word8 -> simp_op (ilxOp "sizeof unsigned int8 mul add ldind.u1") + IndexOffAddrOp_Word16 -> simp_op (ilxOp "sizeof unsigned int16 mul add ldind.u2") + IndexOffAddrOp_Word32 -> simp_op (ilxOp "sizeof unsigned int32 mul add ldind.u4") + IndexOffAddrOp_Word64 -> simp_op (ilxOp "sizeof int64 mul add ldind.u8") + + -- ForeignObj: load the address inside the object first + -- TODO: is this remotely right? + EqForeignObj -> warn_op "eqForeignObj" (simp_op (ilxOp "pop /* PrimOp eqForeignObj */ ")) + IndexOffForeignObjOp_Char -> arg2_op (\fobj n -> ilxOpSeq [fobj, ilxAddrOfForeignOp, n, ilxOp "sizeof unsigned int8 mul add ldind.u1"]) + IndexOffForeignObjOp_WideChar -> arg2_op (\fobj n -> ilxOpSeq [fobj, ilxAddrOfForeignOp, n, ilxOp "sizeof int32 mul add ldind.u4"]) + IndexOffForeignObjOp_Int -> arg2_op (\fobj n -> ilxOpSeq [fobj, ilxAddrOfForeignOp, n, ilxOp "sizeof int32 mul add ldind.i4"]) + IndexOffForeignObjOp_Word -> arg2_op (\fobj n -> ilxOpSeq [fobj, ilxAddrOfForeignOp, n, ilxOp "sizeof unsigned int32 mul add ldind.u4"]) + IndexOffForeignObjOp_Addr -> arg2_op (\fobj n -> ilxOpSeq [fobj, ilxAddrOfForeignOp, n, ilxOp "sizeof native unsigned int mul add ldind.i "]) + IndexOffForeignObjOp_StablePtr -> ty1_arg2_op (\ty fobj n -> ilxOpSeq [fobj, ilxAddrOfForeignOp, n, ilxOp "sizeof native unsigned int mul add ldind.ref "]) + IndexOffForeignObjOp_Float -> arg2_op (\fobj n -> ilxOpSeq [fobj, ilxAddrOfForeignOp, n, ilxOp "sizeof float32 mul add ldind.r4"]) + IndexOffForeignObjOp_Double -> arg2_op (\fobj n -> ilxOpSeq [fobj, ilxAddrOfForeignOp, n, ilxOp "sizeof float64 mul add ldind.r8"]) + IndexOffForeignObjOp_Int8 -> arg2_op (\fobj n -> ilxOpSeq [fobj, ilxAddrOfForeignOp, n, ilxOp "sizeof int8 mul add ldind.i1"]) + IndexOffForeignObjOp_Int16 -> arg2_op (\fobj n -> ilxOpSeq [fobj, ilxAddrOfForeignOp, n, ilxOp "sizeof int16 mul add ldind.i2"]) + IndexOffForeignObjOp_Int32 -> arg2_op (\fobj n -> ilxOpSeq [fobj, ilxAddrOfForeignOp, n, ilxOp "sizeof int32 mul add ldind.i4"]) + IndexOffForeignObjOp_Int64 -> arg2_op (\fobj n -> ilxOpSeq [fobj, ilxAddrOfForeignOp, n, ilxOp "sizeof int64 mul add ldind.i8"]) + IndexOffForeignObjOp_Word8 -> arg2_op (\fobj n -> ilxOpSeq [fobj, ilxAddrOfForeignOp, n, ilxOp "sizeof unsigned int8 mul add ldind.u1"]) + IndexOffForeignObjOp_Word16 -> arg2_op (\fobj n -> ilxOpSeq [fobj, ilxAddrOfForeignOp, n, ilxOp "sizeof unsigned int16 mul add ldind.u2"]) + IndexOffForeignObjOp_Word32 -> arg2_op (\fobj n -> ilxOpSeq [fobj, ilxAddrOfForeignOp, n, ilxOp "sizeof unsigned 
int32 mul add ldind.u4"]) + IndexOffForeignObjOp_Word64 -> arg2_op (\fobj n -> ilxOpSeq [fobj, ilxAddrOfForeignOp, n, ilxOp "sizeof unsigned int64 mul add ldind.u8"]) + + ReadOffAddrOp_Char -> simp_op (ilxOp "sizeof unsigned int8 mul add ldind.u1") + ReadOffAddrOp_WideChar -> simp_op (ilxOp "sizeof int32 mul add ldind.u4") + ReadOffAddrOp_Int -> simp_op (ilxOp "sizeof int32 mul add ldind.i4") + ReadOffAddrOp_Word -> simp_op (ilxOp "sizeof unsigned int32 mul add ldind.u4") + ReadOffAddrOp_Addr -> simp_op (ilxOp "sizeof native unsigned int mul add ldind.i") + ReadOffAddrOp_Float -> simp_op (ilxOp "sizeof float32 mul add ldind.r4") + ReadOffAddrOp_Double -> simp_op (ilxOp "sizeof float64 mul add ldind.r8") + ReadOffAddrOp_StablePtr -> simp_op (ilxOp "sizeof native unsigned int mul add ldind.ref") + ReadOffAddrOp_Int8 -> simp_op (ilxOp "sizeof int8 mul add ldind.i1") + ReadOffAddrOp_Int16 -> simp_op (ilxOp "sizeof int16 mul add ldind.i2") + ReadOffAddrOp_Int32 -> simp_op (ilxOp "sizeof int32 mul add ldind.i4") + ReadOffAddrOp_Int64 -> simp_op (ilxOp "sizeof int64 mul add ldind.i8") + ReadOffAddrOp_Word8 -> simp_op (ilxOp "sizeof unsigned int8 mul add ldind.u1") + ReadOffAddrOp_Word16 -> simp_op (ilxOp "sizeof unsigned int16 mul add ldind.u2") + ReadOffAddrOp_Word32 -> simp_op (ilxOp "sizeof unsigned int32 mul add ldind.u4") + ReadOffAddrOp_Word64 -> simp_op (ilxOp "sizeof unsigned int64 mul add ldind.u8") + {- Addr# -> Int# -> Char# -> State# s -> State# s -} + + WriteOffAddrOp_Char -> ty1_arg4_op (\sty addr n v s -> ilxOpSeq [addr, n, ilxOp "add", v, ilxOp "stind.u1"]) + WriteOffAddrOp_WideChar -> ty1_arg4_op (\sty addr n v s -> ilxOpSeq [addr, n, ilxOp "sizeof int32 mul add", v, ilxOp "stind.u4"]) + WriteOffAddrOp_Int -> ty1_arg4_op (\sty addr n v s -> ilxOpSeq [addr, n, ilxOp "sizeof int32 mul add", v, ilxOp "stind.i4"]) + WriteOffAddrOp_Word -> ty1_arg4_op (\sty addr n v s -> ilxOpSeq [addr, n, ilxOp "sizeof int32 mul add", v, ilxOp "stind.u4"]) + WriteOffAddrOp_Addr -> ty1_arg4_op (\sty addr n v s -> ilxOpSeq [addr, n, ilxOp "sizeof native unsigned int mul add", v, ilxOp "stind.i"]) + WriteOffAddrOp_ForeignObj -> ty1_arg4_op (\sty addr n v s -> ilxOpSeq [addr, n, ilxOp "sizeof native unsigned int mul add", v, ilxOp "stind.ref"]) + WriteOffAddrOp_Float -> ty1_arg4_op (\sty addr n v s -> ilxOpSeq [addr, n, ilxOp "sizeof float32 mul add", v,ilxOp "stind.r4"]) + WriteOffAddrOp_StablePtr -> ty2_arg4_op (\ty1 sty addr n v s -> ilxOpSeq [addr, n, ilxOp "sizeof native unsigned int mul add", v, ilxOp "stind.ref"]) + WriteOffAddrOp_Double -> ty1_arg4_op (\sty addr n v s -> ilxOpSeq [addr,n,ilxOp "sizeof float64 mul add",v,ilxOp "stind.r8"]) + WriteOffAddrOp_Int8 -> ty1_arg4_op (\sty addr n v s -> ilxOpSeq [addr,n,ilxOp "sizeof int8 mul add",v,ilxOp "stind.i1"]) + WriteOffAddrOp_Int16 -> ty1_arg4_op (\sty addr n v s -> ilxOpSeq [addr,n,ilxOp "sizeof int16 mul add",v,ilxOp "stind.i2"]) + WriteOffAddrOp_Int32 -> ty1_arg4_op (\sty addr n v s -> ilxOpSeq [addr,n,ilxOp "sizeof int32 mul add",v,ilxOp "stind.i4"]) + WriteOffAddrOp_Int64 -> ty1_arg4_op (\sty addr n v s -> ilxOpSeq [addr,n,ilxOp "sizeof int64 mul add",v,ilxOp "stind.i8"]) + WriteOffAddrOp_Word8 -> ty1_arg4_op (\sty addr n v s -> ilxOpSeq [addr,n,ilxOp "sizeof unsigned int8 mul add",v,ilxOp "stind.u1"]) + WriteOffAddrOp_Word16 -> ty1_arg4_op (\sty addr n v s -> ilxOpSeq [addr,n,ilxOp "sizeof unsigned int16 mul add",v,ilxOp "stind.u2"]) + WriteOffAddrOp_Word32 -> ty1_arg4_op (\sty addr n v s -> ilxOpSeq [addr,n,ilxOp "sizeof unsigned int32 mul 
add",v,ilxOp "stind.u4"]) + WriteOffAddrOp_Word64 -> ty1_arg4_op (\sty addr n v s -> ilxOpSeq [addr,n,ilxOp "sizeof unsigned int64 mul add",v,ilxOp "stind.u8"]) + {- Addr# -> Int# -> Char# -> State# s -> State# s -} + + {- should be monadic??? -} + NewPinnedByteArrayOp_Char -> warn_op "newPinnedByteArray" (simp_op (ilxOp "newarr [mscorlib]System.Byte ")) + NewByteArrayOp_Char -> simp_op (ilxOp "newarr [mscorlib]System.Byte") +-- NewByteArrayOp_Int -> simp_op (ilxOp "newarr [mscorlib]System.Int32") +-- NewByteArrayOp_Word -> simp_op (ilxOp "newarr [mscorlib]System.UInt32") +-- NewByteArrayOp_Addr -> simp_op (ilxOp "newarr [mscorlib]System.UInt64") +-- NewByteArrayOp_Float -> simp_op (ilxOp "newarr [mscorlib]System.Single") +-- NewByteArrayOp_Double -> simp_op (ilxOp "newarr [mscorlib]System.Double") +-- NewByteArrayOp_StablePtr -> simp_op (ilxOp "newarr [mscorlib]System.UInt32") +-- NewByteArrayOp_Int64 -> simp_op (ilxOp "newarr [mscorlib]System.Int64") TODO: there is no unique for this one -} +-- NewByteArrayOp_Word64 -> simp_op (ilxOp "newarr [mscorlib]System.UInt64") -} + {- Int# -> State# s -> (# State# s, MutByteArr# s #) -} + ByteArrayContents_Char -> warn_op "byteArrayContents" (simp_op ilxAddrOfByteArrOp) + + UnsafeFreezeByteArrayOp -> ty1_op (\ty1 -> ilxOp "nop ") + {- MutByteArr# s -> State# s -> (# State# s, ByteArr# #) -} + SizeofByteArrayOp -> simp_op (ilxOp "ldlen") + {- ByteArr# -> Int# -} + + SameMutableByteArrayOp -> ty1_op (\ty1 -> ilxCeq) + {- MutByteArr# s -> MutByteArr# s -> Bool -} + SizeofMutableByteArrayOp -> ty1_op (\ty1 -> ilxOp "ldlen") + {- MutByteArr# s -> Int# -} + + SameMutVarOp -> ty2_op (\ty1 ty2 -> ilxCeq) + {- MutVar# s a -> MutVar# s a -> Bool -} + NewMutVarOp -> ty2_op (\ty1 ty2 -> ilxOpSeq [ilxOp "newobj void" , repMutVar ty1 ty2 , ilxOp "::.ctor(!0)"]) + {- a -> State# s -> (# State# s, MutVar# s a #) -} + ReadMutVarOp -> ty2_op (\ty1 ty2 -> ilxOpSeq [ilxOp "ldfld !0" , repMutVar ty1 ty2 , ilxOp "::contents"]) + {- MutVar# s a -> State# s -> (# State# s, a #) -} + WriteMutVarOp -> ty2_op (\ty1 ty2 -> ilxOpSeq [ilxOp "stfld !0" , repMutVar ty1 ty2 , ilxOp "::contents"]) + {- MutVar# s a -> a -> State# s -> State# s -} + + NewArrayOp -> ty2_op (\ty1 ty2 -> ilxCallSuppMeth (ilxType "!!0[]") "newArray" [ty1] [repInt,ilxMethTyVarA]) + {- Int# -> a -> State# s -> (# State# s, MutArr# s a #) -} + IndexArrayOp -> ty1_op (\ty1 -> ilxOp "ldelem.ref") + {- Array# a -> Int# -> (# a #) -} + WriteArrayOp -> ty2_op (\ty1 ty2 -> ilxOp "stelem.ref") + {- MutArr# s a -> Int# -> a -> State# s -> State# s -} + ReadArrayOp -> ty2_op (\ty1 ty2 -> ilxOp "ldelem.ref") + {- MutArr# s a -> Int# -> State# s -> (# State# s, a #) -} + UnsafeFreezeArrayOp -> ty2_op (\ty1 ty2 -> ilxOp "nop") + {- MutArr# s a -> State# s -> (# State# s, Array# a #) -} + UnsafeThawArrayOp -> ty2_op (\ty1 ty2 -> ilxOp "nop") + {- Array# a -> State# s -> (# State# s, MutArr# s a #) -} + + SameMutableArrayOp -> ty2_op (\ty1 ty2 -> ilxCeq) + {- MutArr# s a -> MutArr# s a -> Bool -} + + + RaiseOp -> ty2_op (\ty1 ty2 -> ilxOp "throw") + CatchOp -> ty2_op (\ty1 ty2 -> + ilxCallSuppMeth ilxMethTyVarA "'catch'" [ty1,ty2] [ilxLift (ilxTyIO (ilxType "!!0")), + ilxOp "thunk<(func (!!1) --> (func ( /* unit skipped */ ) --> !!0))>"]) + {- (State# RealWorld -> (# State# RealWorld, a #) ) + -> (b -> State# RealWorld -> (# State# RealWorld, a #) ) + -> State# RealWorld + -> (# State# RealWorld, a #) + -} + + BlockAsyncExceptionsOp -> ty1_op (\ty1 -> + ilxCallSuppMeth ilxMethTyVarA "blockAsyncExceptions" [ty1] [ilxLift 
(ilxTyIO (ilxType "!!0"))]) + + {- (State# RealWorld -> (# State# RealWorld, a #)) + -> (State# RealWorld -> (# State# RealWorld, a #)) + -} + + UnblockAsyncExceptionsOp -> ty1_op (\ty1 -> + ilxCallSuppMeth ilxMethTyVarA "unblockAsyncExceptions" [ty1] [ilxLift (ilxTyIO (ilxType "!!0"))]) + + {- + State# RealWorld -> (# State# RealWorld, a #)) + -> (State# RealWorld -> (# State# RealWorld, a #)) + -} + + NewMVarOp -> ty2_op (\sty ty -> + ilxOpSeq [ilxOp "newobj void " , repMVar ty , ilxOp "::.ctor()"]) + {- State# s -> (# State# s, MVar# s a #) -} + + TakeMVarOp -> ty2_op (\sty ty -> + ilxCallSuppMeth ilxMethTyVarA "takeMVar" [ty] [repMVar ilxMethTyVarA]) + {- MVar# s a -> State# s -> (# State# s, a #) -} + + -- These aren't yet right + TryTakeMVarOp -> ty2_op (\sty ty -> + ilxCallSuppMeth (ilxUnboxedPairRep repInt ilxMethTyVarA) "tryTakeMVar" [ty] [repMVar ilxMethTyVarA]) + {- MVar# s a -> State# s -> (# State# s, a #) -} + + TryPutMVarOp -> ty2_op (\sty ty -> + ilxCallSuppMeth repInt "tryPutMVar" [ty] [repMVar ilxMethTyVarA,ilxMethTyVarA]) + {- MVar# s a -> State# s -> (# State# s, a #) -} + + PutMVarOp -> ty2_op (\sty ty -> + ilxCallSuppMeth (ilxOp "void") "putMVar" [ty] [repMVar ilxMethTyVarA, ilxMethTyVarA]) + {- MVar# s a -> a -> State# s -> State# s -} + + SameMVarOp -> ty2_op (\sty ty -> ilxCeq) + {- MVar# s a -> MVar# s a -> Bool -} + +-- TakeMaybeMVarOp -> ty2_op (\sty ty -> +-- (ilxCallSuppMeth (ilxUnboxedPairRep repInt ilxMethTyVarA) "tryTakeMVar" [ty] [repMVar ilxMethTyVarA])) +-- {- MVar# s a -> State# s -> (# State# s, Int#, a #) -} + + IsEmptyMVarOp -> ty2_op (\sty ty -> + ilxCallSuppMeth repInt "isEmptyMVar" [ty] [repMVar ilxMethTyVarA]) + {- MVar# s a -> State# s -> (# State# s, Int# #) -} + + TouchOp -> warn_op "touch" (ty1_op (\ty1 -> ilxOp "pop /* PrimOp touch */ ")) + + {- a -> Int# -} + DataToTagOp -> ty1_op (\ty1 -> + ilxCallSuppMeth repInt "dataToTag" [ty1] [ilxMethTyVarA]) + {- a -> Int# -} + + TagToEnumOp -> ty1_op (\ty1 -> + ilxCallSuppMeth ilxMethTyVarA "tagToEnum" [ty1] [repInt]) + {- Int# -> a -} + + MakeStablePtrOp -> ty1_op (\ty1 -> ilxOpSeq [ilxOp "box", ty1, ilxOp "newobj void", repStablePtr {- ty1 -}, ilxOp "::.ctor(class [mscorlib]System.Object)"]) + {- a -> State# RealWorld -> (# State# RealWorld, StablePtr# a #) -} + MakeStableNameOp -> ty1_op (\ty1 -> ilxOpSeq [ilxOp "pop newobj void", repStableName {- ty1 -}, ilxOp "::.ctor()"]) + -- primOpInfo MakeStableNameOp = mkGenPrimOp SLIT("makeStableName#") [alphaTyVar] [alphaTy, mkStatePrimTy realWorldTy] ((mkTupleTy Unboxed 2 [mkStatePrimTy realWorldTy, mkStableNamePrimTy alphaTy])) + + EqStableNameOp -> ty1_op (\ty1 -> ilxOp "ceq") + -- [alphaTyVar] [mkStableNamePrimTy alphaTy, mkStableNamePrimTy alphaTy] (intPrimTy) + StableNameToIntOp -> warn_op "StableNameToIntOp" (ty1_op (\ty1 -> ilxOp "pop ldc.i4 0")) + -- [alphaTyVar] [mkStableNamePrimTy alphaTy] (intPrimTy) + + DeRefStablePtrOp -> ty1_op (\ty1 -> ilxOpSeq [ilxOp "ldfld class [mscorlib]System.Object", repStablePtr {- ty1 -}, ilxOp "::contents"]) + {- StablePtr# a -> State# RealWorld -> (# State# RealWorld, a #) -} + + EqStablePtrOp -> ty1_op (\ty1 -> ilxOp "ceq") + {- StablePtr# a -> StablePtr# a -> Int# -} + + -- The 3rd argument to MkWeakOp is always a IO Monad action, i.e. 
passed as () --> () + MkWeakOp -> ty3_op (\ty1 ty2 ty3 -> ilxCall (ilxMethodRef (repWeak ilxMethTyVarB) classWeak "bake" [ilxLift ty1,ilxLift ty2] [ilxMethTyVarA, ilxMethTyVarB, ilxLift (ilxTyIO ilxUnboxedEmptyRep)])) + {- o -> b -> c -> State# RealWorld -> (# State# RealWorld, Weak# b #) -} + + DeRefWeakOp -> ty1_op (\ty1 -> ilxCall (ilxMethodRef (ilxUnboxedPairRep repInt ilxMethTyVarA) classWeak "deref" [ty1] [repWeak ilxMethTyVarA])) + FinalizeWeakOp -> ty1_op (\ty1 -> ilxCall (ilxMethodRef (ilxUnboxedPairRep repInt (ilxTyIO ilxUnboxedEmptyRep)) classWeak "finalizer" [ty1] [repWeak ilxMethTyVarA])) + {- Weak# a -> State# RealWorld -> (# State# RealWorld, Int#, + State# RealWorld -> (# State# RealWorld, Unit #)) #) -} + + MkForeignObjOp -> simp_op (ilxOpSeq [ilxOp "newobj void", repForeign, ilxOp "::.ctor(void *)"]) + WriteForeignObjOp -> ty1_op (\sty -> ilxOpSeq [ilxOp "stfld void *", repForeign, ilxOp "::contents"]) + ForeignObjToAddrOp -> simp_op ilxAddrOfForeignOp + YieldOp -> simp_op (ilxOpSeq [ilxOp "call class [mscorlib]System.Threading.Thread class [mscorlib]System.Threading.Thread::get_CurrentThread() + call instance void class [mscorlib]System.Threading.Thread::Suspend()"]) + MyThreadIdOp -> simp_op (ilxOpSeq [ilxOp "call default class [mscorlib]System.Threading.Thread class [mscorlib]System.Threading.Thread::get_CurrentThread() "]) + -- This pushes a THUNK across as the exception value. + -- This is the correct Haskell semantics... TODO: we should probably + -- push across an HaskellThreadAbortException object that wraps this + -- thunk, but which is still actually an exception of + -- an appropriate type. + KillThreadOp -> ty1_op (\ty -> ilxOpSeq [ilxOp "call instance void class [mscorlib]System.Threading.Thread::Abort(class [mscorlib]System.Object) "]) + {- ThreadId# -> a -> State# RealWorld -> State# RealWorld -} + + ForkOp -> warn_op "ForkOp" (simp_op (ilxOp "/* ForkOp skipped... */ newobj void [mscorlib]System.Object::.ctor() throw")) + ParOp -> warn_op "ParOp" (simp_op (ilxOp "/* ParOp skipped... */ newobj void [mscorlib]System.Object::.ctor() throw")) + DelayOp -> simp_op (ilxOp "call void class [mscorlib]System.Threading.Thread::Sleep(int32) ") + {- Int# -> State# s -> State# s -} + + WaitReadOp -> warn_op "WaitReadOp" (simp_op (ilxOp "/* WaitReadOp skipped... */ pop")) + WaitWriteOp -> warn_op "WaitWriteOp" (simp_op (ilxOp " /* WaitWriteOp skipped... */ newobj void [mscorlib]System.Object::.ctor() throw")) + ParAtForNowOp -> warn_op "ParAtForNowOp" (simp_op (ilxOp " /* ParAtForNowOp skipped... */ newobj void [mscorlib]System.Object::.ctor() throw")) + ParAtRelOp -> warn_op "ParAtRelOp" (simp_op (ilxOp " /* ParAtRelOp skipped... */ newobj void [mscorlib]System.Object::.ctor() throw")) + ParAtAbsOp -> warn_op "ParAtAbsOp" (simp_op (ilxOp " /* ParAtAbsOp skipped... */ newobj void [mscorlib]System.Object::.ctor() throw")) + ParAtOp -> warn_op "ParAtOp" (simp_op (ilxOp " /* ParAtOp skipped... */ newobj void [mscorlib]System.Object::.ctor() throw")) + ParLocalOp -> warn_op "ParLocalOp" (simp_op (ilxOp " /* ParLocalOp skipped... */ newobj void [mscorlib]System.Object::.ctor() throw")) + ParGlobalOp -> warn_op "ParGlobalOp" (simp_op (ilxOp " /* ParGlobalOp skipped... 
*/ newobj void [mscorlib]System.Object::.ctor() throw")) + SeqOp -> warn_op "SeqOp" (simp_op (ilxOp " newobj void [mscorlib]System.Object::.ctor() throw ")) + AddrToHValueOp -> warn_op "AddrToHValueOp" (simp_op (ilxOp "newobj void [mscorlib]System.Object::.ctor() throw")) +-- ReallyUnsafePtrEqualityOp -> simp_op (ilxOp "ceq") + + MkApUpd0_Op -> warn_op "MkApUpd0_Op" (simp_op (ilxOp " newobj void [mscorlib]System.Object::.ctor() throw")) + NewBCOOp -> warn_op "NewBCOOp" (simp_op (ilxOp " newobj void [mscorlib]System.Object::.ctor() throw")) + -- ("newBCO#") [alphaTyVar, deltaTyVar] [byteArrayPrimTy, byteArrayPrimTy, mkArrayPrimTy alphaTy, byteArrayPrimTy, mkStatePrimTy deltaTy] ((mkTupleTy Unboxed 2 [mkStatePrimTy deltaTy, bcoPrimTy])) + _ -> pprPanic "Unimplemented primop" (ppr op) + + +ty1_op :: (IlxTyFrag -> IlxOpFrag) -> [StgArg] -> IlxOpFrag +ty1_op op ((StgTypeArg ty1):rest) = + ilxOpSeq [getArgsStartingAt 1 rest, + op (ilxTypeR2 (deepIlxRepType ty1))] + +ty2_op :: (IlxTyFrag -> IlxTyFrag -> IlxOpFrag) -> [StgArg] -> IlxOpFrag +ty2_op op ((StgTypeArg ty1):(StgTypeArg ty2):rest) = + ilxOpSeq [getArgsStartingAt 2 rest, + op (ilxTypeR2 (deepIlxRepType ty1)) + (ilxTypeR2 (deepIlxRepType ty2))] + +ty3_op :: (IlxTyFrag -> IlxTyFrag -> IlxTyFrag -> IlxOpFrag) -> [StgArg] -> IlxOpFrag +ty3_op op ((StgTypeArg ty1):(StgTypeArg ty2):(StgTypeArg ty3):rest) = + ilxOpSeq [getArgsStartingAt 3 rest, + op (ilxTypeR2 (deepIlxRepType ty1)) + (ilxTypeR2 (deepIlxRepType ty2)) + (ilxTypeR2 (deepIlxRepType ty3))] + +arg2_op :: (IlxTyFrag -> IlxOpFrag -> IlxOpFrag) -> [StgArg] -> IlxOpFrag +arg2_op op [a1, a2] = + op (getAsArg 1 a1) + (getAsArg 2 a2) + +ty1_arg2_op :: (IlxTyFrag -> IlxOpFrag -> IlxOpFrag -> IlxOpFrag) -> [StgArg] -> IlxOpFrag +ty1_arg2_op op [(StgTypeArg ty1), a1, a2] = + op (ilxTypeR2 (deepIlxRepType ty1)) + (getAsArg 1 a1) + (getAsArg 2 a2) + +ty1_arg4_op :: (IlxTyFrag -> IlxOpFrag -> IlxOpFrag -> IlxOpFrag -> IlxOpFrag -> IlxOpFrag) -> [StgArg] -> IlxOpFrag +ty1_arg4_op op [(StgTypeArg ty1), a1, a2, a3, a4] = + op (ilxTypeR2 (deepIlxRepType ty1)) + (getAsArg 1 a1) + (getAsArg 2 a2) + (getAsArg 3 a3) + (getAsArg 4 a4) + +ty2_arg4_op :: (IlxTyFrag -> IlxTyFrag -> IlxOpFrag -> IlxOpFrag -> IlxOpFrag -> IlxOpFrag -> IlxOpFrag) -> [StgArg] -> IlxOpFrag +ty2_arg4_op op [(StgTypeArg ty1), (StgTypeArg ty2),a1, a2, a3, a4] = + op (ilxTypeR2 (deepIlxRepType ty1)) + (ilxTypeR2 (deepIlxRepType ty2)) + (getAsArg 2 a1) + (getAsArg 3 a2) + (getAsArg 4 a3) + (getAsArg 5 a4) + +hd (h:t) = h + +getAsArg n a env = hd (ilxMapPlaceArgs n pushArg env [a]) +getArgsStartingAt n a env = vcat (ilxMapPlaceArgs n pushArg env a) + +simp_op :: IlxOpFrag -> [StgArg] -> IlxOpFrag +simp_op op args env = vcat (ilxMapPlaceArgs 0 pushArg env args) $$ op env +warn_op warning f args = trace ("WARNING! IlxGen cannot translate primop " ++ warning) (f args) +\end{code} + +%************************************************************************ +%* * +\subsection{C Calls} +%* * +%************************************************************************ + +\begin{code} +-- Call the P/Invoke stub wrapper generated in the import section. +-- We eliminate voids in and around an IL C Call. +-- We also do some type-directed translation for pinning Haskell-managed blobs +-- of data as we throw them across the boundary. 
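+-- As an illustration only (names hypothetical): for a C call that takes a
+-- MutableByteArray# and an Int# and returns an Int#, the intention is to pin
+-- the byte array, pass an interior pointer to its data, drop the void
+-- State#/() arguments, and emit IL along the lines of
+--    <push the array> dup stloc '<pinned local>' ldc.i4 0 ldelema unsigned int8
+--    <push the Int#>
+--    call int32 cFunc(void *, int32)
+-- The local name and the label cFunc are made up for the example; the
+-- translation implemented below is the definitive one.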
+ilxFCall env (CCall (CCallSpec (StaticTarget c) cconv gc)) args ret_ty + = ilxComment ((text "C call") <+> pprCLabelString c) <+> + vcat [vcat (ilxMapPlaceArgs 0 pushCArg env args), + text "call" <+> retdoc <+> pprCLabelString c <+> tyarg_doc + <+> pprCValArgTys ilxTypeL env (map deepIlxRepType (filter (not. isVoidIlxRepType) (map stgArgType tm_args))) ] + where + retdoc | isVoidIlxRepType ret_ty = text "void" + | otherwise = ilxTypeR env (deepIlxRepType ret_ty) + (ty_args,tm_args) = splitTyArgs1 args + tyarg_doc | not (isEmptyVarSet (tyVarsOfTypes ty_args)) = text "/* type variable found */" + | otherwise = pprTypeArgs ilxTypeR env ty_args + +ilxFCall env (DNCall (DNCallSpec call_instr)) args ret_ty + = ilxComment (text "IL call") <+> + vcat [vcat (ilxMapPlaceArgs 0 pushEvalArg env tm_args), + ptext call_instr + -- In due course we'll need to pass the type arguments + -- and to do that we'll need to have more than just a string + -- for call_instr + ] + where + (ty_args,tm_args) = splitTyArgs1 args + +-- Push and argument and force its evaluation if necessary. +pushEvalArg _ (StgTypeArg _) = empty +pushEvalArg env (StgVarArg arg) = ilxFunApp env arg [] False +pushEvalArg env (StgLitArg lit) = pushLit env lit + + +hasTyCon (TyConApp tc _) tc2 = tc == tc2 +hasTyCon _ _ = False + +isByteArrayCArgTy ty = hasTyCon ty byteArrayPrimTyCon || hasTyCon ty mutableByteArrayPrimTyCon +isByteArrayCArg v = isByteArrayCArgTy (deepIlxRepType (idType v)) + +isForeignObjCArgTy ty = hasTyCon ty foreignObjPrimTyCon +isForeignObjCArg v = isForeignObjCArgTy (deepIlxRepType (idType v)) + +pinCCallArg v = isByteArrayCArg v || isForeignObjCArg v + +pinCArg env arg v = pushArg env arg <+> text "dup stloc" <+> singleQuotes (ilxEnvQualifyByExact env (ppr v) <> text "pin") +pushCArg env arg@(StgVarArg v) | isByteArrayCArg v = pinCArg env arg v <+> ilxAddrOfByteArrOp env +pushCArg env arg@(StgVarArg v) | isForeignObjCArg v = pinCArg env arg v <+> ilxAddrOfForeignOp env +pushCArg env arg | otherwise = pushArg env arg + +pprCValArgTys f env tys = parens (pprSepWithCommas (pprCValArgTy f env) tys) +pprCValArgTy f env ty | isByteArrayCArgTy ty = text "void *" <+> ilxComment (text "interior pointer into ByteArr#") +pprCValArgTy f env ty | isForeignObjCArgTy ty = text "void *" <+> ilxComment (text "foreign object") +pprCValArgTy f env ty | otherwise = f env ty + + +foldR :: (a -> b -> b) -> [a] -> b -> b +-- foldR _ [] z = z +-- foldR f (x:xs) z = f x (foldR f xs z) +{-# INLINE foldR #-} +foldR k xs z = go xs + where + go [] = z + go (y:ys) = y `k` go ys + +\end{code} + diff --git a/compiler/ilxGen/Makefile.stdlib b/compiler/ilxGen/Makefile.stdlib new file mode 100644 index 0000000000..bab993346e --- /dev/null +++ b/compiler/ilxGen/Makefile.stdlib @@ -0,0 +1,82 @@ +PrelAll_SRC=Array.lhs Maybe.lhs PrelDynamic.lhs PrelIOBase.lhs PrelShow.lhs \ +CPUTime.lhs Monad.lhs PrelEnum.lhs PrelList.lhs PrelStable.lhs \ +Char.lhs Numeric.lhs PrelErr.lhs PrelTup.lhs \ +Complex.lhs PrelAddr.lhs PrelException.lhs PrelMaybe.lhs PrelWeak.lhs \ +Directory.lhs PrelArr.lhs PrelFloat.lhs PrelNum.lhs Prelude.lhs \ +IO.lhs PrelArrExtra.lhs PrelForeign.lhs PrelPack.lhs Random.lhs \ +Ix.lhs PrelBase.lhs PrelHandle.lhs PrelRead.lhs Ratio.lhs \ +List.lhs PrelByteArr.lhs PrelHugs.lhs PrelReal.lhs System.lhs \ +Locale.lhs PrelConc.lhs PrelIO.lhs PrelST.lhs Time.lhs + +PrelAll_ILX=$(patsubst %.lhs,%.ilx,$(PrelAll_SRC)) +CLEAN_FILES += $(PrelAll_ILX) +PrelAll_ILX_FWD=$(patsubst %.lhs,%.ilx.fwd.ok,$(PrelAll_SRC)) +PrelAll_IL=$(patsubst 
%.lhs,%.il,$(PrelAll_SRC)) PrelGHC.il +PrelAll_MOD=$(patsubst %.il,%.mod,$(PrelAll_IL)) + + +%.ilx %.ilx.fwd: %.lhs + $(HC_PRE_OPTS) + $(HC) $(HC_OPTS) -Onot -D__ILX__ --ilx $*.lhs -o $*.ilx + $(HC_POST_OPTS) + + +CORRUN= +LOCALRUN=./ +ifeq ($(HOSTNAME),msrc-hilda) +CORRUN=cmd /c "devvs && " +LOCALRUN=.\\ +endif + +ILXASM=/devel/fcom/src/bin/ilxasmx.exe -l /devel/fcom/src/ilxasm --no-ilasm --box-everything +ILASM=$(CORRUN)ilasm +AL=$(CORRUN)al + +%.ilx.fwd.ok: %.ilx.fwd + if diff -q $< $@; then true; else cp $< $@; fi + +%.mod : %.il + $(ILASM) /QUIET /DLL /OUT=$@ $< + +PrelGHC.il: ../../compiler/ilxGen/PrelGHC.il + cp $< $@ + +PrelAll.dll : ilxasm-stdlib.mod $(PrelAll_MOD) + $(AL) ilxasm-stdlib.mod $(PrelAll_MOD) -out:$@ + +%.ilx_with_fwd: %.ilx $(PrelAll_ILX_FWD) + cat $(PrelAll_ILX_FWD) $*.ilx > $@ + +%.il : %.ilx_with_fwd /devel/fcom/src/bin/ilxasmx.exe + $(ILXASM) --no-stdlib -o $@ $*.ilx_with_fwd + +ilxasm-stdlib.il : /devel/fcom/src/bin/ilxasmx.exe /devel/fcom/src/ilxasm/stdlib-func-by-mcalli.ilx + rm -f tmp.ilx + touch tmp.ilx + $(ILXASM) -o $@ tmp.ilx + rm -f tmp.ilx + + +#-------------------- +# For validation only: + +PrelAll.il: $(PrelAll_IL) ilxasm-stdlib.il + cat ilxasm-stdlib.il $(PrelAll_IL) > $@ + +%.mvl: %.il + make -C ../../compiler/ilxGen/tests ilvalidx + ILVALID_HOME=/devel/fcom/src /devel/fcom/src/bin/ilvalidx.exe $*.il + + +ilxasm: + make -C ../../compiler/ilxGen/tests ilxasmx + +ilvalid: + $(MAKE) -C /devel/fcom/src bin/ilvalidx.exe + + +ghc: + make -C ../../compiler/ilxGen/tests ghc + + +.PRECIOUS: %.ilx.fwd %.ilx.fwd.ok %.il %.ilx_with_fwd diff --git a/compiler/ilxGen/tests/Makefile b/compiler/ilxGen/tests/Makefile new file mode 100644 index 0000000000..423839c9e8 --- /dev/null +++ b/compiler/ilxGen/tests/Makefile @@ -0,0 +1,130 @@ + +TOP = ../../.. +include $(TOP)/mk/boilerplate.mk + +WAYS=$(GhcLibWays) + +#----------------------------------------------------------------------------- +# Setting the standard variables +# + +HC = $(GHC_INPLACE) +SRC_HC_OPTS+=-cpp -fglasgow-exts + +#----------------------------------------------------------------------------- +# +CORENV_DEBUG= +CORENV_RETAIL= +LOCALRUN=./ +ifeq ($(HOSTNAME),MSRC-HILDA) +CORENV_DEBUG="call devcorb2gen.bat checked" +CORENV_RETAIL="call devcorb2gen.bat free" +LOCALRUN=.\\ +endif + +ghc: + $(MAKE) -C ../.. + +ilx: + $(MAKE) -C $(ILX2IL_HOME) ilxdefault + +prel: ilx + $(MAKE) -C ../../../lib/std std.$(ilx_way).dll std.$(ilx_way).vlb + +#======================================================================== +# 1. From Haskell to ILX and then to IL - see build.mk + +#------------------------------------------------------------------------ +# 2. 
From IL to .EXE + +%.$(ilx_way).exe : %.$(ilx_way).il ../Entry.$(ilx_way).il + cat $*.$(ilx_way).il ../Entry.$(ilx_way).il > $@.tmp +# echo "call devcorb2gen free" > tmp.bat + echo "ilasm /DEBUG /QUIET /OUT=$@ $@.tmp" >> tmp.bat + cmd /c tmp.bat + +../Entry.$(hs2ilx_suffix)_o: ../Entry.ilx + sed -e "s|ilx std|ilx std.$(hs2ilx_suffix)|g" ../Entry.ilx > $@.tmp + mv $@.tmp $@ + + +%.$(ilx_way).mvl: %.$(ilx_way).il + (ILVALID_HOME=c:\\devel\\fcom\\src\\ ILVALID_MSCORLIB=mscorlib.vlb $(ILVALID) c:\\devel\\fcom\\src\\bin\\msilxlib$(ilx2il_suffix).vlb $(TOP)/lib/std/std.$(ilx_way).vlb $<) 2>&1 + + +#------------------------------------------------------------------------ +# From .HS to .EXE without using ILX +# Used to run performance comparisons against native code GHC + +%.Onot.exe: %.hs + $(GHC_INPLACE) -Onot -o $@ $< + +%.O.exe: %.hs + $(GHC_INPLACE) -O -o $@ $< + +WIN_TOP_ABS = $(subst /,\,$(FPTOOLS_TOP_ABS)) +WIN_ILX2IL_HOME = $(subst /,\,$(ILX2IL_HOME)) + +app.config: + echo "<configuration>" > $@ + echo "<runtime>" >> $@ + echo "<assemblyBinding xmlns=\"urn:schemas-microsoft-com:asm.v1\">" >> $@ + echo "<probing privatePath=\"$(WIN_TOP_ABS)\\ghc\\lib\\std;$(WIN_ILX2IL_HOME)\\bin\"/>" >> $@ + echo "</assemblyBinding>" >> $@ + echo "</runtime>" >> $@ + echo "</configuration>" >> $@ + +%.run: %.exe app.config + time -p $< + +#------------------------------------------------------------------------ +# Running: + +HSstd_cbits.dll: $(DLL_PEN)/HSstd_cbits.dll + cp $< $@ + +%.cordbg.run: HSstd_cbits.dll %.exe + cp app.config $@.config +# echo "call devcorb2gen fastchecked" > $@.bat + echo "$(LOCALRUN)$(subst /,\\,$*).exe 2>&1" >> $@.bat + time -p cmd /c $(subst /,\\,$@).bat + rm $@.bat + +%.debug.run: HSstd_cbits.dll %.exe + cp app.config $@.config +# echo "call devcorb2gen fastchecked" > $@.bat + echo "$(LOCALRUN)$(subst /,\\,$*).exe 2>&1" >> $@.bat + time -p cmd /c $(subst /,\\,$@).bat + rm $@.bat + +%.retail.run: HSstd_cbits.dll %.exe + cp app.config $@.config +# echo "call devcorb2gen free" > $@.bat + echo "$(LOCALRUN)$(subst /,\\,$*).exe 2>&1" >> $@.bat + time -p cmd /c $(subst /,\\,$@).bat + rm $@.bat + + +%.run: %.exe + time -p $< + + +#-------------------- + +%.mvl: %.nolib.il + ILVALID_HOME=$(ILX2IL_HOME) $(ILVALID) $*.nolib.il + +ci: + (cd $(ILX2IL_HOME); $(CVS) ci -m "") + (cd ../..; cvs ci -m "") + (cd ../../../lib/std; $(CVS) ci -m "") + +upd: + (cd $(ILX2IL_HOME); $(CVS) up) + (cd ../..; $(CVS) up) + (cd ../../../lib/std; $(CVS) up) + + +.PHONY: %.run + +include $(TOP)/mk/target.mk diff --git a/compiler/ilxGen/tests/PrelNum.hs b/compiler/ilxGen/tests/PrelNum.hs new file mode 100644 index 0000000000..ca23e149ff --- /dev/null +++ b/compiler/ilxGen/tests/PrelNum.hs @@ -0,0 +1,120 @@ + + + + + + + + + + + + + + + + +{-# OPTIONS -fglasgow-exts -fno-implicit-prelude #-} + +module PrelNum where + +import {-# SOURCE #-} PrelErr +import PrelBase +import PrelList +import PrelEnum +import PrelShow + +infixl 7 * +infixl 6 +, - + +default () -- Double isn't available yet, + -- and we shouldn't be using defaults anyway + + + + + + + + + +class (Eq a, Show a) => Num a where + (+), (-), (*) :: a -> a -> a + negate :: a -> a + abs, signum :: a -> a + fromInteger :: Integer -> a + fromInt :: Int -> a -- partain: Glasgow extension + + x - y = x + negate y + negate x = 0 - x + fromInt (I# i#) = fromInteger (S# i#) + -- Go via the standard class-op if the + -- non-standard one ain't provided + + + + + +subtract :: (Num a) => a -> a -> a +{-# INLINE subtract #-} +subtract x y = y - x + +ord_0 :: Num a => 
a +ord_0 = fromInt (ord '0') + + + + + + + + + + +instance Num Int where + (+) x y = plusInt x y + (-) x y = minusInt x y + negate x = negateInt x + (*) x y = timesInt x y + abs n = if n `geInt` 0 then n else (negateInt n) + + signum n | n `ltInt` 0 = negateInt 1 + | n `eqInt` 0 = 0 + | otherwise = 1 + + fromInt n = n + + + + +-- These can't go in PrelBase with the defn of Int, because +-- we don't have pairs defined at that time! + +quotRemInt :: Int -> Int -> (Int, Int) +a@(I# _) `quotRemInt` b@(I# _) = (a `quotInt` b, a `remInt` b) + -- OK, so I made it a little stricter. Shoot me. (WDP 94/10) + +divModInt :: Int -> Int -> (Int, Int) +divModInt x@(I# _) y@(I# _) = (x `divInt` y, x `modInt` y) + -- Stricter. Sorry if you don't like it. (WDP 94/10) + + + + + + + + + + +data Integer + = S# Int# -- small integers + | J# Int# ByteArray# -- large integers + + + + + +zeroInteger :: Integer +zeroInteger = S# 0# + diff --git a/compiler/ilxGen/tests/build.mk b/compiler/ilxGen/tests/build.mk new file mode 100644 index 0000000000..285fd5de4e --- /dev/null +++ b/compiler/ilxGen/tests/build.mk @@ -0,0 +1,121 @@ +# 1. To make standard library: +# +# e.g. from lib/std directory: +# $(MAKE) way=ilx-Onot-mono std.ilx-Onot.mono.dll std.ilx-Onot.mono.vlb +# $(MAKE) way=ilx-O-mono std.ilx-O.mono.dll std.ilx-O.mono.vlb +# $(MAKE) way=ilx-Onot-generic std.ilx-Onot.generic.dll +# +# 2. To make tests: +# +# e.g. from ilxGen/tests directory: +# +# $ make -n way=ilx-Onot-mono test1.ilx-Onot.mono.retail.run +# +# $ make -n way=ilx-Onot-mono test1-nostdlib.ilx-Onot.mono.retail.run HC_OPTS="-fno-implicit-prelude -fglasgow-exts" +# + + +# Add all the ILX ways so dependencies get made correctly. +# (n.b. Actually we only need to add "ilx-Onot" and "ilx-O" for the +# GHC --> ILX dependencies, as these are the portions of the ILX +# ways that are relevant in terms of GHC options, +# but we list some of the others anyway. Also note that +# there are no dependencies required for the ILX --> IL or +# IL --> CLR phases as these operate on the "standalone" +# ILX and IL files). +# +#GhcLibWays+= ilx-Onot-mono ilx-Onot ilx-O ilx-O-mono +GhcLibWays+=i +GhcWithIlx=YES + +ILXized=YES + +GhcHcOpts+=-DILX -DNO_BIG_TUPLES +GhcLibHcOpts+=-optI--mono -optI--add-suffix-to-assembly -optImsilxlib -optI--suffix-to-add -optI.mono + +# Each set of args below defines one ILX way. 
+#ALL_WAYS+=ilx-Onot-generic +#WAY_ilx-Onot-generic_NAME=ILX with Haskell Optimizer Off to run on Generic CLR +#WAY_ilx-Onot-generic_HC_OPTS=-buildtag ilx-Onot $(GHC_ILX_OPTS) -Onot +#WAY_ilx-Onot-generic_ILX2IL_OPTS=--generic +#WAY_ilx-Onot-generic_ILX=YES + +#ALL_WAYS+=ilx-Onot-fullgeneric-verifiable +#WAY_ilx-Onot-fullgeneric-verifiable_NAME=ILX with Haskell Optimizer Off to run on Generic CLR +#WAY_ilx-Onot-fullgeneric-verifiable_HC_OPTS=-buildtag ilx-Onot $(GHC_ILX_OPTS) -Onot +#WAY_ilx-Onot-fullgeneric-verifiable_ILX2IL_OPTS=--fullgeneric --verifiable +#WAY_ilx-Onot-fullgeneric-verifiable_ILX=YES + +#ALL_WAYS+=ilx-Onot-repgeneric-verifiable +#WAY_ilx-Onot-repgeneric-verifiable_NAME=ILX with Haskell Optimizer Off to run on Generic CLR +#WAY_ilx-Onot-repgeneric-verifiable_HC_OPTS=-buildtag ilx-Onot $(GHC_ILX_OPTS) -Onot +#WAY_ilx-Onot-repgeneric-verifiable_ILX2IL_OPTS=--repgeneric --verifiable +#WAY_ilx-Onot-repgeneric-verifiable_ILX=YES + +#ALL_WAYS+=ilx-O-generic +#WAY_ilx-O-generic_NAME=ILX with Haskell Optimizer On to run on Generic CLR +#WAY_ilx-O-generic_HC_OPTS=-buildtag ilx-O $(GHC_ILX_OPTS) -O +#WAY_ilx-O-generic_ILX2IL_OPTS=--generic +#WAY_ilx-O-generic_ILX=YES + +#ALL_WAYS+=ilx-Onot-mono +#WAY_ilx-Onot-mono_NAME=ILX with Haskell Optimizer Off to run on V1 CLR +#WAY_ilx-Onot-mono_HC_OPTS=-buildtag ilx-Onot $(GHC_ILX_OPTS) -Onot +#WAY_ilx-Onot-mono_ILX2IL_OPTS=--mono +#WAY_ilx-Onot-mono_ILX=YES + +#ALL_WAYS+=ilx-Onot-mono-verifiable +#WAY_ilx-Onot-mono-verifiable_NAME=ILX with Haskell Optimizer Off to run on V1 CLR, verifiable code (CURRENTLY WILL NOT RUN BECAUSE OF LACK OF HIGHER KINDED TYPE PARAMETERS BUT IS USEFUL TO FIND BUGS USING THE VERIFIER) +#WAY_ilx-Onot-mono-verifiable_HC_OPTS=-buildtag ilx-Onot $(GHC_ILX_OPTS) -Onot +#WAY_ilx-Onot-mono-verifiable_ILX2IL_OPTS=--mono --verifiable +#WAY_ilx-Onot-mono-verifiable_ILX=YES + +#ALL_WAYS+=ilx-O-mono +#WAY_ilx-O-mono_NAME=ILX with Haskell Optimizer On to run on V1 CLR +#WAY_ilx-O-mono_HC_OPTS=-buildtag ilx-O $(GHC_ILX_OPTS) -O +#WAY_ilx-O-mono_ILX2IL_OPTS=--mono +#WAY_ilx-O-mono_ILX=YES + +#ALL_WAYS+=ilx-Onot-generic-traced +#WAY_ilx-Onot-generic-traced_NAME=ILX with Haskell Optimizer Off to run on Generic CLR +#WAY_ilx-Onot-generic-traced_HC_OPTS=-buildtag ilx-Onot $(GHC_ILX_OPTS) -Onot +#WAY_ilx-Onot-generic-traced_ILX2IL_OPTS=--generic --traced +#WAY_ilx-Onot-generic-traced_ILX=YES + +#ALL_WAYS+=ilx-O-generic-traced +#WAY_ilx-O-generic-traced_NAME=ILX with Haskell Optimizer On to run on Generic CLR +#WAY_ilx-O-generic-traced_HC_OPTS=-buildtag ilx-O $(GHC_ILX_OPTS) -O +#WAY_ilx-O-generic-traced_ILX2IL_OPTS=--generic --traced +#WAY_ilx-O-generic-traced_ILX=YES + +#ALL_WAYS+=ilx-Onot-mono-traced +#WAY_ilx-Onot-mono-traced_NAME=ILX with Haskell Optimizer Off to run on V1 CLR +#WAY_ilx-Onot-mono-traced_HC_OPTS=-buildtag ilx-Onot $(GHC_ILX_OPTS) -Onot +#WAY_ilx-Onot-mono-traced_ILX2IL_OPTS=--mono --traced +#WAY_ilx-Onot-mono-traced_ILX=YES + +#ALL_WAYS+=ilx-O-mono-traced +#WAY_ilx-O-mono-traced_NAME=ILX with Haskell Optimizer On to run on V1 CLR +#WAY_ilx-O-mono-traced_HC_OPTS=-buildtag ilx-O $(GHC_ILX_OPTS) -O +#WAY_ilx-O-mono-traced_ILX2IL_OPTS=--mono --traced +#WAY_ilx-O-mono-traced_ILX=YES + +# Put a "." after the Haskell portion of the way. Way names can't contain +# dots for some reason elsewhere in the Make system. But we need to be able +# to split out the Haskell portion of the way from the ILX portion (e.g. --generic) +# and the runtime portion (e.g. --retail). 
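+# For example (illustrative), with way=ilx-O-mono the definitions below give:
+#   ilx_way       = ilx-O.mono
+#   ilx2il_suffix = .mono
+#   hs2ilx_suffix = ilx-O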
+ilx_way=$(subst ilx-Onot-,ilx-Onot.,$(subst ilx-O-,ilx-O.,$(way))) +ilx2il_suffix=$(subst ilx-Onot.,.,$(subst ilx-O.,.,$(ilx_way))) +hs2ilx_suffix=$(subst $(ilx2il_suffix),,$(ilx_way)) +HS_ILX=$(subst $(way),$(hs2ilx_suffix),$(HS_OBJS)) +#HS_IL=$(subst $(hs2ilx_suffix)_o,$(ilx_way).il,$(HS_ILX)) +HS_IL=$(subst .o,.il,$(HS_ILX)) + +ILVALID=C:/devel/fcom/bin/ilvalid.exe +ILVERIFY=C:/devel/fcom/bin/ilverify.exe + +%.$(ilx_way).mvl : %.$(ilx_way).il $(HS_IL) + ((ILVALID_HOME=c:\\devel\\fcom\\src\\ ILVALID_MSCORLIB=mscorlib.vlb $(ILVALID) c:\\devel\\fcom\\src\\bin\\msilxlib$(ilx2il_suffix).vlb $(addprefix --other-il-module ,$(filter-out $*.$(ilx_way).il,$(HS_IL))) $<) 2>&1) | tee $@ + +%.$(ilx_way).mvr : %.$(ilx_way).il $(HS_IL) + ((ILVALID_HOME=c:\\devel\\fcom\\src\\ ILVALID_MSCORLIB=mscorlib.vlb $(ILVERIFY) c:\\devel\\fcom\\src\\bin\\msilxlib$(ilx2il_suffix).vlb $(addprefix --other-il-module ,$(filter-out $<,$(HS_IL))) $<) 2>&1) | tee $@ diff --git a/compiler/ilxGen/tests/foo.hs b/compiler/ilxGen/tests/foo.hs new file mode 100644 index 0000000000..d66608ba22 --- /dev/null +++ b/compiler/ilxGen/tests/foo.hs @@ -0,0 +1,9 @@ +{-# OPTIONS -fglasgow-exts #-} +module Foo where +import PrelGHC +import PrelNum +import PrelBase +integer2Intx :: Integer -> Int +integer2Intx (S# i) = I# i +integer2Intx (J# s d) = case (integer2Int# s d) of { n# -> I# n# } + diff --git a/compiler/ilxGen/tests/life.hs b/compiler/ilxGen/tests/life.hs new file mode 100644 index 0000000000..d6bcd16f9f --- /dev/null +++ b/compiler/ilxGen/tests/life.hs @@ -0,0 +1,360 @@ +-------------------------------- +-- The Game of Life -- +-------------------------------- + +generations x = 30 + +data L a = N | C1 a (L a) + +data Tuple2 a b = T2 a b + +data Tuple3 a b c = T3 a b c + + +main = putStr (listChar_string + (append1 (C1 '\FF' N) + (life1 (generations ()) (start ())))) + +listChar_string :: L Char -> String +listChar_string N = [] +listChar_string (C1 x xs) = x : listChar_string xs + +start :: a -> L (L Int) +start x = (C1 N + (C1 N + (C1 N + (C1 N + (C1 N + (C1 N + (C1 N + (C1 N + (C1 N + (C1 N + (C1 N + (C1 N + (C1 N + (C1 N + (C1 + (C1 0 + (C1 0 + (C1 0 + (C1 1 + (C1 1 + (C1 1 + (C1 1 + (C1 1 + (C1 0 + (C1 1 + (C1 1 + (C1 1 + (C1 1 + (C1 1 + (C1 0 + (C1 1 + (C1 1 + (C1 1 + (C1 1 + (C1 1 + (C1 0 + (C1 1 + (C1 1 + (C1 1 + (C1 1 + (C1 1 + (C1 0 N))))))))))))))))))))))))))) N))))))))))))))) + +-- Calculating the next generation + +gen1 :: Int -> L (L Int) -> L (L Int) +gen1 n board = map1 row1 (shift1 (copy1 n 0) board) + +row1 :: Tuple3 (L Int) (L Int) (L Int) -> L Int +row1 (T3 last this next) + = zipWith31 elt1 (shift2 0 last) + (shift2 0 this) + (shift2 0 next) + + +elt1 :: Tuple3 Int Int Int + -> (Tuple3 Int Int Int) + -> (Tuple3 Int Int Int) -> Int +elt1 (T3 a b c) (T3 d e f) (T3 g h i) + = if (not (eq tot 2)) + && (not (eq tot 3)) + then 0 + else if (eq tot 3) then 1 else e + where tot = a `plus` b `plus` c `plus` d + `plus` f `plus` g `plus` h `plus` i + +eq :: Int -> Int -> Bool +eq x y = x == y + +plus :: Int -> Int -> Int +plus x y = x + y + +shiftr1 :: L Int -> L (L Int) -> L (L Int) +shiftr1 x xs = append2 (C1 x N) (init1 xs) + +shiftl1 :: L Int -> L (L Int) -> L (L Int) +shiftl1 x xs = append2 (tail1 xs) (C1 x N) + +shift1 :: L Int -> L (L Int) + -> L (Tuple3 (L Int) (L Int) (L Int)) +shift1 x xs = zip31 (shiftr1 x xs) xs (shiftl1 x xs) + +shiftr2 :: Int -> L Int -> L Int +shiftr2 x xs = append3 (C1 x N) (init2 xs) + +shiftl2 :: Int -> L Int -> L Int +shiftl2 x xs = append3 (tail2 xs) (C1 x N) + +shift2 :: Int -> L 
Int -> L (Tuple3 Int Int Int) +shift2 x xs = zip32 (shiftr2 x xs) xs (shiftl2 x xs) + +-- copy + +copy1 :: Int -> Int -> L Int +copy1 0 x = N +copy1 n x = C1 x (copy1 (n-1) x) + +copy2 :: Int -> L Int -> L (L Int) +copy2 0 x = N +copy2 n x = C1 x (copy2 (n-1) x) + +copy3 :: Int -> Char -> L Char +copy3 0 x = N +copy3 n x = C1 x (copy3 (n-1) x) + +-- Displaying one generation + +disp1 :: (Tuple2 (L Char) (L (L Int))) -> L Char +disp1 (T2 gen xss) + = append1 gen + (append1 (C1 '\n' (C1 '\n' N)) + (foldr_1 (glue1 (C1 '\n' N)) N + (map4 (compose2 concat1 (map2 star1)) xss))) + +star1 :: Int -> L Char +star1 i = case i of + 0 -> C1 ' ' (C1 ' ' N) + 1 -> C1 ' ' (C1 'o' N) + +glue1 :: L Char -> L Char -> L Char -> L Char +glue1 s xs ys = append1 xs (append1 s ys) + +-- Generating and displaying a sequence of generations + +life1 :: Int -> L (L Int) -> L Char +life1 n xss + = foldr_1 (glue1 (copy3 (n+2) '\VT')) N + (map5 disp1 + (zip1_ (map6 (string_ListChar.show) (ints 0)) + gens)) + where + gens = take3 (740::Int) (iterate1 (gen1 n) (initial1 n xss)) + +ints :: Int -> L Int +ints x = C1 x (ints (x+1)) + +string_ListChar :: String -> L Char +string_ListChar [] = N +string_ListChar (x:xs) = C1 x (string_ListChar xs) + +initial1 :: Int -> L (L Int) -> L (L Int) +initial1 n xss = take1 n (append2 (map3 (compose1 (take2 n) + (`append3` (copy1 n 0))) xss) + (copy2 n (copy1 n 0))) + +iterate1 :: (L (L Int) -> L (L Int)) + -> L (L Int) -> L (L (L Int)) +iterate1 f x = C1 x (iterate1 f (f x)) + +-- versions of built in functions + +-- take +take1 :: Int -> L (L Int) -> L (L Int) +take1 0 _ = N +take1 _ N = N +--should be:take1 (n+1) (C1 x xs) = C1 x (take1 n xs) +take1 n (C1 x xs) | n < 0 = error "Main.take1" + | otherwise = C1 x (take1 (n-1) xs) + +take2 :: Int -> L Int -> L Int +take2 0 _ = N +take2 _ N = N +--should be:take2 (n+1) (C1 x xs) = C1 x (take2 n xs) +take2 n (C1 x xs) | n < 0 = error "Main.take2" + | otherwise = C1 x (take2 (n-1) xs) + +take3 :: Int -> L (L (L Int)) + -> L (L (L Int)) +take3 0 _ = N +take3 _ N = N +take3 n (C1 x xs) = C1 x (take3 (n-1) xs) + +-- init + +init1 :: L (L Int) -> L (L Int) +init1 (C1 x N) = N +init1 (C1 x xs) = C1 x (init1 xs) +init1 N = error "init1 got a bad list" + +init2 :: L Int -> L Int +init2 (C1 x N) = N +init2 (C1 x xs) = C1 x (init2 xs) +init2 N = error "init1 got a bad list" + +-- tail + +tail1 :: L (L Int) -> L (L Int) +tail1 (C1 _ xs) = xs +tail1 N = error "tail1 got a bad list" + +tail2 :: L Int -> L Int +tail2 (C1 _ xs) = xs +tail2 N = error "tail2 got a bad list" + +-- maps + +map1 :: (Tuple3 (L Int) (L Int) (L Int) -> L Int) -> + L (Tuple3 (L Int) (L Int) (L Int)) + -> L (L Int) +map1 f N = N +map1 f (C1 x xs) = C1 (f x) (map1 f xs) + +map2 :: (Int -> L Char) -> L Int -> L (L Char) +map2 f N = N +map2 f (C1 x xs) = C1 (f x) (map2 f xs) + +map3 :: (L Int -> L Int) -> L (L Int) -> L (L Int) +map3 f N = N +map3 f (C1 x xs) = C1 (f x) (map3 f xs) + +map4 :: (L Int -> L Char) + -> L (L Int) -> L (L Char) +map4 f N = N +map4 f (C1 x xs) = C1 (f x) (map4 f xs) + +map5 :: (Tuple2 (L Char) (L (L Int)) -> L Char) + -> L (Tuple2 (L Char) (L (L Int))) + -> L (L Char) +map5 f N = N +map5 f (C1 x xs) = C1 (f x) (map5 f xs) + +map6 :: (Int -> L Char) -> L Int -> L (L Char) +map6 f N = N +map6 f (C1 x xs) = C1 (f x) (map6 f xs) + +-- compose + +compose2 :: (L (L Char) -> L Char) + -> (L Int -> L (L Char)) + -> L Int -> L Char +compose2 f g xs = f (g xs) + +compose1 :: (L Int -> L Int) + -> (L Int -> L Int) -> L Int -> L Int +compose1 f g xs = f (g xs) + +-- 
concat + +concat1 :: L (L Char) -> L Char +concat1 = foldr_1 append1 N + +-- foldr + +foldr_1 :: (L Char -> L Char -> L Char) + -> L Char -> L (L Char) -> L Char +foldr_1 f a N = a +foldr_1 f a (C1 x xs) = f x (foldr_1 f a xs) + +-- appends + +append1 :: L Char -> L Char -> L Char +append1 N ys = ys +append1 (C1 x xs) ys = C1 x (append1 xs ys) + +append2 :: L (L Int) -> L (L Int) -> L (L Int) +append2 N ys = ys +append2 (C1 x xs) ys = C1 x (append2 xs ys) + +append3 :: L Int -> L Int -> L Int +append3 N ys = ys +append3 (C1 x xs) ys = C1 x (append3 xs ys) + +-- zips + +pzip f (C1 x1 xs) (C1 y1 ys) + = C1 (f x1 y1) (pzip f xs ys) +pzip f _ _ = N + + +zip1_ :: L (L Char) + -> L (L (L Int)) + -> L (Tuple2 (L Char) (L (L Int))) +zip1_ = pzip T2 + +zip2_ :: L (L Int) + -> L (L Int) + -> L (Tuple2 (L Int) (L Int)) +zip2_ = pzip T2 + +zip3d :: L Int -> (Tuple2 (L Int) (L Int)) + -> (Tuple3 (L Int) (L Int) (L Int)) +zip3d x (T2 y z) = T3 x y z + +zip3_ :: L (L Int) + -> L (Tuple2 (L Int) (L Int)) + -> L (Tuple3 (L Int) (L Int) (L Int)) +zip3_ = pzip zip3d + +zip4_ :: L Int + -> L Int + -> L (Tuple2 Int Int) +zip4_ = pzip T2 + +zip5d :: Int -> (Tuple2 Int Int) -> (Tuple3 Int Int Int) +zip5d x (T2 y z) = T3 x y z + +zip5_ :: L Int + -> L (Tuple2 Int Int) + -> L (Tuple3 Int Int Int) +zip5_ = pzip zip5d + +zip6_ :: L (Tuple3 Int Int Int) + -> L (Tuple3 Int Int Int) + -> L (Tuple2 (Tuple3 Int Int Int) + (Tuple3 Int Int Int)) +zip6_ = pzip T2 + +zip31 :: L (L Int) -> L (L Int) + -> L (L Int) + -> L (Tuple3 (L Int) (L Int) (L Int)) +zip31 as bs cs + = zip3_ as (zip2_ bs cs) + +zip32 :: L Int -> L Int -> L Int + -> L (Tuple3 Int Int Int) +zip32 as bs cs + = zip5_ as (zip4_ bs cs) + +-- zipWith + +zipWith21 :: ((Tuple3 Int Int Int) + -> (Tuple2 (Tuple3 Int Int Int) + (Tuple3 Int Int Int)) -> Int) + -> L (Tuple3 Int Int Int) + -> L (Tuple2 (Tuple3 Int Int Int) + (Tuple3 Int Int Int)) + -> L Int +zipWith21 = pzip + +zipWith31 :: ((Tuple3 Int Int Int) + -> (Tuple3 Int Int Int) + -> (Tuple3 Int Int Int) -> Int) + -> L (Tuple3 Int Int Int) + -> L (Tuple3 Int Int Int) + -> L (Tuple3 Int Int Int) -> L Int +zipWith31 z as bs cs + = zipWith21 z' as (zip6_ bs cs) + where z' a (T2 b c) = z a b c diff --git a/compiler/ilxGen/tests/reduce.ml b/compiler/ilxGen/tests/reduce.ml new file mode 100644 index 0000000000..cad379b522 --- /dev/null +++ b/compiler/ilxGen/tests/reduce.ml @@ -0,0 +1,101 @@ + + +type kind = + ARROW of kind * kind + | TYP + +type tycon = + | TyVar of int + | FUN + | LIST + | STRING + +type typ = + TyForall of kind * typ + | TyApp of tycon * typ list + +type exp = + | AbsTm of typ * exp + | Var of int + | App of exp * exp + | String of string + | AbsTy of kind * exp + | AppTy of exp * typ + +type ttyp = + | TTyFun of ttyp * ttyp + | TTyList of ttyp + | TTyString + | TTyAny + | TTyVar of int + | TTyForall of ttyp + +type texp = + | TAbsTm of ttyp * texp + | TVar of int + | TApp of texp * texp + | TString of string + | TLetTy of texp * texp + | TCast of texp * ttyp + + | TAppTy of texp * ttyp + | TAbsTy of texp + + +let (-->) x y = TyApp (FUN, [x;y]) +let (--->) x y = TTyFun (x,y) + +let rec trans_kind = function + ARROW (k1,k2) -> (trans_kind k1 ---> trans_kind k2) + | TYP -> (TTyForall TANY ---> TTyAny) + +let rec trans_typ_arg_aux = function + (* TyForall (k,ty) -> TAbsTm (trans_kind k, TAbsTy (trans_typ ty)) ??? 
*) + | TyApp (TyVar tv, args) -> failwith "unreduced" + | ty -> TAbsTm (trans_kind k, TAbsTy (trans_typ ty))failwith "unreduced" + | +let rec trans_typ_arg env = function + | TyApp (FUN, []) -> + TAbsTm + (trans_kind TYP, + TLetTy (TVar 0, + TAbsTm + (trans_kind TYP, + TLetTy (TVar 0, + TAbsTm + (TTyForall TANY, + TAppTy (TVar 0, TTyFun (TTyVar 0, TTyVar 1))))))) + | TyApp (TyVar tv, args) -> + try List.assoc (tv,args) env + with Not_found -> failwith "trans_typ: unreduced type variable" + | ty -> TAbsTm (TTyForall TANY, TAppTy (TVar 0, trans_typ env ty)) +(* + | TyApp (STRING, []) -> TAbsTm (TTyForall TANY, TAppTy (TVar 0, TTyString)) + | TyApp (FUN, [l;r]) -> TAbsTm (TTyForall TANY, TAppTy (TVar 0, TTyFun (trans_typ l, trans_typ r))) +*) + + +let rec trans_typ env = function + TyForall (k,ty) -> (trans_kind k ---> TTyAny) + | TyApp (TyVar tv, args) -> + try List.assoc (tv,args) env + with Not_found -> failwith "trans_typ: unreduced type variable" + | TyApp (FUN, [l;r]) -> TTyFun (trans_typ env l, trans_typ env r) + | TyApp (STRING, []) -> TTyString + | _ -> failwith "trans_typ: badly formed input type" + + +let rec trans_exp env = function + | AbsTm (ty,e) -> TAbsTm(trans_typ ty, trans_exp e) + | Var n -> TVar n + | App (l,r) -> TApp(trans_exp l, trans_exp r) + | String s -> TString s + | AbsTy (k,e) -> TAbsTm(trans_kind k, reduce env e) + | AppTy (tm,ty) -> TAppTy(trans_exp tm, trans_typ_arg env ty) + + +open Format;; + + +let rec pp_print_exp pps = function + L e -> fprintf pps "\ diff --git a/compiler/ilxGen/tests/test1-nostdlib.hs b/compiler/ilxGen/tests/test1-nostdlib.hs new file mode 100644 index 0000000000..1e9053ea41 --- /dev/null +++ b/compiler/ilxGen/tests/test1-nostdlib.hs @@ -0,0 +1,4 @@ +module Test1_nostdlib where +foreign import "ilxHello" unsafe ilxHello :: () + +ilx_main_no_stdlib = ilxHello diff --git a/compiler/ilxGen/tests/test1.hs b/compiler/ilxGen/tests/test1.hs new file mode 100644 index 0000000000..10f307e08e --- /dev/null +++ b/compiler/ilxGen/tests/test1.hs @@ -0,0 +1 @@ +main = putStr "HELLO HELLO Hello world WORLD WORLD.\n" diff --git a/compiler/ilxGen/tests/test10.hs b/compiler/ilxGen/tests/test10.hs new file mode 100644 index 0000000000..46c384d9e0 --- /dev/null +++ b/compiler/ilxGen/tests/test10.hs @@ -0,0 +1,45 @@ + +data N = Z | S N + +choose1 n1 = + case n1 of + Z -> "even\n" + S Z -> "odd\n" + S (S m) -> choose1 m +choose2 n1 n2 = + case n1 of + Z -> choose1 n2 + S Z -> "odd\n" + S (S m) -> choose2 m n2 +choose3 n1 n2 n3 = + case n1 of + Z -> choose2 n2 n3 + S Z -> "odd\n" + S (S m) -> choose3 m n2 n3 + +choose4 n1 n2 n3 n4 = + case n1 of + Z -> choose3 n2 n3 n4 + S Z -> "odd\n" + S (S m) -> choose4 m n2 n3 n4 + +choose5 n1 n2 n3 n4 n5 = + case n1 of + Z -> choose4 n2 n3 n4 n5 + S Z -> "odd\n" + S (S m) -> choose5 m n2 n3 n4 n5 + +add n m = + case n of + Z -> m + S nn -> S (add nn m) + +n1 = S Z +n2 = add n1 n1 +n4 = add n2 n2 +n6 = add n2 n4 + + + +main = putStr (choose5 n6 n4 n2 n2 n1) + diff --git a/compiler/ilxGen/tests/test11.hs b/compiler/ilxGen/tests/test11.hs new file mode 100644 index 0000000000..ce53f71389 --- /dev/null +++ b/compiler/ilxGen/tests/test11.hs @@ -0,0 +1,61 @@ +{-# OPTIONS -fglasgow-exts #-} + +import PrelGHC + +class EEq a where + (===), (/==) :: a -> a -> Bool + +-- x /= y = not (x == y) +-- x == y = not (x /= y) +-- x /= y = True + (/==) x y = mynot ((===) x y) + x === y = True + +data EOrdering = ELT | EEQ | EGT + +mynot True = False +mynot False = True + +{- +class (EEq a) => EOrd a where + ecompare :: a -> a -> EOrdering + 
(<<), (<<=), (>>>=), (>>>):: a -> a -> Bool + emax, emin :: a -> a -> a + +-- An instance of Ord should define either compare or <= +-- Using compare can be more efficient for complex types. + ecompare x y + | x === y = EEQ + | x <<= y = ELT -- NB: must be '<=' not '<' to validate the + -- above claim about the minimal things that can + -- be defined for an instance of Ord + | otherwise = EGT + + x <<= y = case ecompare x y of { EGT -> False; _other -> True } + x << y = case ecompare x y of { ELT -> True; _other -> False } + x >>>= y = case ecompare x y of { ELT -> False; _other -> True } + x >>> y = case ecompare x y of { EGT -> True; _other -> False } + + -- These two default methods use '>' rather than compare + -- because the latter is often more expensive + emax x y = if x >>> y then x else y + emin x y = if x >>> y then y else x +-} + +data EInt = EI Int# + +ezeroInt, eoneInt, etwoInt, emaxInt, eminInt :: EInt +ezeroInt = EI 0# +eoneInt = EI 1# +etwoInt = EI 2# +eminInt = EI (-2147483648#) -- GHC <= 2.09 had this at -2147483647 +emaxInt = EI 2147483647# +eeqInt (EI x) (EI y) = x ==# y +eneInt (EI x) (EI y) = x /=# y + +instance EEq EInt where + (===) x y = x `eeqInt` y + (/==) x y = x `eneInt` y + +main = putStr (if (ezeroInt === eoneInt) then "no!\n" else "yes!\n") + diff --git a/compiler/ilxGen/tests/test12.hs b/compiler/ilxGen/tests/test12.hs new file mode 100644 index 0000000000..216c792f32 --- /dev/null +++ b/compiler/ilxGen/tests/test12.hs @@ -0,0 +1,44 @@ +class NewFunctor f where + new_fmap :: (a -> b) -> f a -> f b + +data N a = Z a | S (N a) + +nmap f (Z x) = Z (f x) +nmap f (S n) = S (nmap f n) + +tag (Z x) = x +tag (S n) = tag n + +instance NewFunctor N where + new_fmap = nmap + +--class Strange f where +-- zero :: a -> f a +-- suc :: f a -> f a +-- tag :: f a -> a + + +--class FMonad m where +-- (>>=) :: m a -> (a -> m b) -> m b +-- (>>) :: m a -> m b -> m b +-- return :: a -> m a +-- fail :: String -> m a +-- +-- m >> k = m >>= \_ -> k +-- fail s = error s + + + + +--instance Strange N +-- where +-- zero x = Z x +-- suc y = S y +-- tag n = gettag n + +twice :: NewFunctor f => (a -> a) -> f a -> f a +twice f x = new_fmap f (new_fmap f x) + +main = putStr (tag (nmap (\x -> x) (Z "hello world\n"))) +--main = putStr (tag (nmap (\x -> x) (Z "hello world\n"))) +-- main = putStr (tag {- (twice (\x -> x) -} (Z "hello world\n")) diff --git a/compiler/ilxGen/tests/test13.hs b/compiler/ilxGen/tests/test13.hs new file mode 100644 index 0000000000..559c8674fa --- /dev/null +++ b/compiler/ilxGen/tests/test13.hs @@ -0,0 +1,20 @@ +class NewFunctor f where + inj :: a -> f a + surj :: f a -> a + +data N a = Z a + +ninj x = (Z x) +nsurj (Z x) = x + +instance NewFunctor N where + inj = ninj + surj = nsurj + +twice :: NewFunctor f => a -> f (f a) +twice x = inj(inj x) + +undo :: NewFunctor f => f (f a) -> a +undo x = surj(surj x) + +main = putStr (undo (Z (Z "hello world\n"))) diff --git a/compiler/ilxGen/tests/test14.hs b/compiler/ilxGen/tests/test14.hs new file mode 100644 index 0000000000..86b5d1c821 --- /dev/null +++ b/compiler/ilxGen/tests/test14.hs @@ -0,0 +1,11 @@ +class EMonad m where + aaaaa :: m a -> (a -> m b) -> m b + bbbbb :: m a -> m b -> m b + + bbbbb m k = aaaaa m (\_ -> k) + -- = \M \A \B -> \m:(M A) -> \k:(M B) -> aaaaa M A B m (\_:A -> k: M B) + -- Free types must include "A"!!! 
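+ -- An illustrative (hypothetical) instance, kept in a comment so the test
+ -- itself is unchanged:
+ --
+ --     data Id a = MkId a
+ --     instance EMonad Id where
+ --       aaaaa (MkId x) k = k x
+ --
+ -- Here 'bbbbb' is taken from the default method above, so the code generated
+ -- for the default must abstract over the argument type A (the 'a' in 'm a')
+ -- as well as the result type B, which appears to be the point of this test.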
+ +main = putStr "hello world\n" + + diff --git a/compiler/ilxGen/tests/test15.hs b/compiler/ilxGen/tests/test15.hs new file mode 100644 index 0000000000..3e522d757c --- /dev/null +++ b/compiler/ilxGen/tests/test15.hs @@ -0,0 +1,18 @@ + +{-# OPTIONS -fglasgow-exts -fno-implicit-prelude #-} + + +import PrelBase +import PrelList +import PrelEnum +import PrelShow +import PrelIO + + +bbuild :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a] +{-# INLINE 2 bbuild #-} +bbuild g = g (:) [] + +main = putStr "hello world\n" + + diff --git a/compiler/ilxGen/tests/test16.hs b/compiler/ilxGen/tests/test16.hs new file mode 100644 index 0000000000..0e8b9974a9 --- /dev/null +++ b/compiler/ilxGen/tests/test16.hs @@ -0,0 +1,5 @@ + + +data MMaybe a = No | Yes a + +main = putStr "hello world\n"
\ No newline at end of file diff --git a/compiler/ilxGen/tests/test17.hs b/compiler/ilxGen/tests/test17.hs new file mode 100644 index 0000000000..5e551b2dcd --- /dev/null +++ b/compiler/ilxGen/tests/test17.hs @@ -0,0 +1,44 @@ +{-# OPTIONS -fno-implicit-prelude #-} + +module Test17 where + +import PrelGHC +import PrelBase + +data Exception = IOException IOError | OtherExc + +data IOError + = IOError + String + +tthrow :: Exception -> a + +tthrow exception = raise# exception +ccatchException (IO m) k = IO (\s -> catch# m (\ex -> unIO (k ex)) s) + + +ccatch :: IO a -> (IOError -> IO a) -> IO a +ccatch m k = ccatchException m handler + where handler (IOException err) = k err + handler other = tthrow other + +ccatchNonIO :: IO a -> (Exception -> IO a) -> IO a +ccatchNonIO m k = ccatchException m handler + where handler (IOException err) = ioError err + handler other = k other + +newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #)) + +unIO :: IO a -> (State# RealWorld -> (# State# RealWorld, a #)) +unIO (IO a) = a + +ioError :: IOError -> IO a +ioError err = IO (\s -> tthrow (IOException err) s) + + + +blockAsyncExceptions :: IO a -> IO a +blockAsyncExceptions (IO io) = IO (blockAsyncExceptions# io) + +unblockAsyncExceptions :: IO a -> IO a +unblockAsyncExceptions (IO io) = IO (unblockAsyncExceptions# io) diff --git a/compiler/ilxGen/tests/test18.hs b/compiler/ilxGen/tests/test18.hs new file mode 100644 index 0000000000..12ca7413f1 --- /dev/null +++ b/compiler/ilxGen/tests/test18.hs @@ -0,0 +1,129 @@ +{-# OPTIONS -fno-implicit-prelude #-} + +module Test18 where + +import PrelGHC +import PrelBase + +eftCharFB c n x y = go x + where + go x | x ># y = n + | otherwise = C# (chr# x) `c` go (x +# 1#) + + +eftIntFB c n x y | x ># y = n + | otherwise = go x + where + go x = I# x `c` if x ==# y then n else go (x +# 1#) + +eftIntList x y | x ># y = [] + | otherwise = go x + where + go x = I# x : if x ==# y then [] else go (x +# 1#) + + +efdCharFB c n x1 x2 + | delta >=# 0# = go_up_char_fb c n x1 delta 255# + | otherwise = go_dn_char_fb c n x1 delta 0# + where + delta = x2 -# x1 + +efdCharList x1 x2 + | delta >=# 0# = go_up_char_list x1 delta 255# + | otherwise = go_dn_char_list x1 delta 0# + where + delta = x2 -# x1 + +efdtCharFB c n x1 x2 lim + | delta >=# 0# = go_up_char_fb c n x1 delta lim + | otherwise = go_dn_char_fb c n x1 delta lim + where + delta = x2 -# x1 + +efdtCharList x1 x2 lim + | delta >=# 0# = go_up_char_list x1 delta lim + | otherwise = go_dn_char_list x1 delta lim + where + delta = x2 -# x1 + +go_up_char_fb c n x delta lim + = go_up x + where + go_up x | x ># lim = n + | otherwise = C# (chr# x) `c` go_up (x +# delta) + +go_dn_char_fb c n x delta lim + = go_dn x + where + go_dn x | x <# lim = n + | otherwise = C# (chr# x) `c` go_dn (x +# delta) + +go_up_char_list x delta lim + = go_up x + where + go_up x | x ># lim = [] + | otherwise = C# (chr# x) : go_up (x +# delta) + + +go_dn_char_list x delta lim + = go_dn x + where + go_dn x | x <# lim = [] + | otherwise = C# (chr# x) : go_dn (x +# delta) + +efdtIntFB c n x1 x2 y + | delta >=# 0# = if x1 ># y then n else go_up_int_fb c n x1 delta lim + | otherwise = if x1 <# y then n else go_dn_int_fb c n x1 delta lim + where + delta = x2 -# x1 + lim = y -# delta + +efdtIntList x1 x2 y + | delta >=# 0# = if x1 ># y then [] else go_up_int_list x1 delta lim + | otherwise = if x1 <# y then [] else go_dn_int_list x1 delta lim + where + delta = x2 -# x1 + lim = y -# delta + +efdIntFB c n x1 x2 + | delta >=# 0# = go_up_int_fb c n x1 delta ( 
2147483647# -# delta) + | otherwise = go_dn_int_fb c n x1 delta ((-2147483648#) -# delta) + where + delta = x2 -# x1 + +efdIntList x1 x2 + | delta >=# 0# = go_up_int_list x1 delta ( 2147483647# -# delta) + | otherwise = go_dn_int_list x1 delta ((-2147483648#) -# delta) + where + delta = x2 -# x1 + +-- In all of these, the (x +# delta) is guaranteed not to overflow + +go_up_int_fb c n x delta lim + = go_up x + where + go_up x | x ># lim = I# x `c` n + | otherwise = I# x `c` go_up (x +# delta) + +go_dn_int_fb c n x delta lim + = go_dn x + where + go_dn x | x <# lim = I# x `c` n + | otherwise = I# x `c` go_dn (x +# delta) + +go_up_int_list x delta lim + = go_up x + where + go_up x | x ># lim = [I# x] + | otherwise = I# x : go_up (x +# delta) + +go_dn_int_list x delta lim + = go_dn x + where + go_dn x | x <# lim = [I# x] + | otherwise = I# x : go_dn (x +# delta) +eftInt = eftIntList +efdInt = efdIntList +efdtInt = efdtIntList + + diff --git a/compiler/ilxGen/tests/test19.hs b/compiler/ilxGen/tests/test19.hs new file mode 100644 index 0000000000..a292599031 --- /dev/null +++ b/compiler/ilxGen/tests/test19.hs @@ -0,0 +1,37 @@ + +{-# OPTIONS -fno-implicit-prelude -#include "cbits/stgio.h" #-} + + +module Test19 where + +import PrelST +import PrelBase +import PrelErr + +newtype IIO a = IIO (State# RealWorld -> (# State# RealWorld, a #)) + +unIIO :: IIO a -> (State# RealWorld -> (# State# RealWorld, a #)) +unIIO (IIO a) = a + +instance Functor IIO where + fmap f x = x >>= (return . f) + +instance Monad IIO where + {-# INLINE return #-} + {-# INLINE (>>) #-} + {-# INLINE (>>=) #-} + m >> k = m >>= \ _ -> k + return x = returnIIO x + + m >>= k = bindIIO m k + fail s = error s -- not ioError? + + +bindIIO :: IIO a -> (a -> IIO b) -> IIO b +bindIIO (IIO m) k = IIO ( \ s -> + case m s of + (# new_s, a #) -> unIIO (k a) new_s + ) + +returnIIO :: a -> IIO a +returnIIO x = IIO (\ s -> (# s, x #)) diff --git a/compiler/ilxGen/tests/test1b.hs b/compiler/ilxGen/tests/test1b.hs new file mode 100644 index 0000000000..c4b2336df1 --- /dev/null +++ b/compiler/ilxGen/tests/test1b.hs @@ -0,0 +1,104 @@ +-- To start: +-- source /bin/devghc + +-- To compile GHC +-- make ilxGen/IlxGen.o hsc + +-- To compile ILXASM +-- (cd /devel/fcom/src; make bin/ilxasm.exe) + +-- To compile to ILX +-- (cd ilxGen/tests; ../../../driver/ghc-inplace --ilx test.hs) + + + +-- To generate a complete ILX file, including preludes for GHC and ILX: +-- (cd ilxGen/tests/; cat prelude.ilx test.ilx /devel/fcom/src/ilxasm/stdlib-func.ilx > test.full.ilx) + +-- Run ILXASM to get a IL +-- ( cd ilxGen/tests/; /devel/fcom/src/bin/ilxasm.exe --no-ilasm --no-stdlib test.full.ilx > test.il) + +-- To compile IL to .EXE or .DLL: +-- With build of VS (e.g. Don & Andrew) +-- ( cd ilxGen/tests/; cmd /C "c:\\bin\\devvs.bat && ilasm test.il") +-- With Lightning SDK, where env. variables are on path (e.g. Reuben): +-- ( cd ilxGen/tests/; ilasm test.il) + +-- To validate .EXE: +-- (cd /devel/fcom/src; make bin/ilvalid.exe mscorlib.vlb) +-- (export ILVALID_HOME=/devel/fcom/src; cd ilxGen/tests/; /devel/fcom/src/bin/ilvalid.exe test.il) + +-- To run unverifiable code: +-- With build of VS (e.g. Don & Andrew) +-- (cd ilxGen/tests/; cmd /C "c:\\bin\\devvs.bat && .\test.exe") +-- With Lightning SDK, where env. variables are on path (e.g. 
Reuben): +-- (cd ilxGen/tests/; ./test.exe) + +-- To compile ILX to verifiable code and verify +-- (cd /devel/fcom/src; make bin/ilxasm.exe bin/ilverify.exe) && (cd ilxGen/tests/; export ILVALID_HOME=/devel/fcom/src; cat prelude.ilx test.ilx /devel/fcom/src/assem/stdlib-func.ilx > test.full.ilx && cd ilxGen/tests/; /devel/fcom/src/bin/ilxasm.exe --no-ilasm test.full.ilx > test.safe.il && /devel/fcom/src/bin/ilverify.exe test.safe.il) + +-- (cd ilxGen/tests/; cmd /C "c:\\bin\\devvs.bat && .\test.safe.exe") + +--append:: [Char] -> [Char] -> [Char] +--append [] l2 = l2 +--append (h:t) l2 = h:append t l2 + +data N = Z | S N + +chooseN n = + case n of + Z -> "even\n" + S Z -> "odd\n" + S (S m) -> chooseN m + +signN n = + case n of + Z -> Z + S Z -> S Z + S (S m) -> signN m +add n m = + case n of + Z -> m + S nn -> S (add nn m) + +mul n m = + case n of + Z -> Z + S nn -> add m (mul nn m) + +pow n m = + case m of + Z -> S Z + S mm -> mul n (pow n mm) + +sq n = mul n n + +n1 = S Z +n2 = add n1 n1 +n4 = add n2 n2 +n6 = add n2 n4 +n8 = add n2 n6 +n10 = add n2 n8 +n11 = add n1 n10 +n12 = add n1 n11 +n13 = add n1 n12 +n14 = add n1 n13 +n15 = add n1 n14 +n16 = add n1 n15 +n17 = add n1 n16 +n18 = add n1 n17 +n19 = add n1 n18 +n20 = add n1 n18 + +bign = pow n2 n19 +bign1 = add bign n1 + +foldn f n acc = + case n of + Z -> acc + S x -> foldn f x (f n acc) + +main = putStr (chooseN (foldn (\x y -> add (signN x) y) (pow n2 n4) n1)) + + + diff --git a/compiler/ilxGen/tests/test2.hs b/compiler/ilxGen/tests/test2.hs new file mode 100644 index 0000000000..8b1f5b5eb6 --- /dev/null +++ b/compiler/ilxGen/tests/test2.hs @@ -0,0 +1,88 @@ +-- To start: +-- source /bin/devghc + +-- To compile GHC +-- make ilxGen/IlxGen.o hsc + +-- To compile ILXASM +-- (cd /devel/fcom/src; make bin/ilxasm.exe) + +-- To compile to ILX +-- (cd ilxGen/tests; ../../../driver/ghc-inplace --ilx test.hs) + + + +-- To generate a complete ILX file, including preludes for GHC and ILX: +-- (cd ilxGen/tests/; cat prelude.ilx test.ilx /devel/fcom/src/ilxasm/stdlib-func.ilx > test.full.ilx) + +-- Run ILXASM to get a IL +-- ( cd ilxGen/tests/; /devel/fcom/src/bin/ilxasm.exe --no-ilasm --no-stdlib test.full.ilx > test.il) + +-- To compile IL to .EXE or .DLL: +-- With build of VS (e.g. Don & Andrew) +-- ( cd ilxGen/tests/; cmd /C "c:\\bin\\devvs.bat && ilasm test.il") +-- With Lightning SDK, where env. variables are on path (e.g. Reuben): +-- ( cd ilxGen/tests/; ilasm test.il) + +-- To validate .EXE: +-- (cd /devel/fcom/src; make bin/ilvalid.exe mscorlib.vlb) +-- (export ILVALID_HOME=/devel/fcom/src; cd ilxGen/tests/; /devel/fcom/src/bin/ilvalid.exe test.il) + +-- To run unverifiable code: +-- With build of VS (e.g. Don & Andrew) +-- (cd ilxGen/tests/; cmd /C "c:\\bin\\devvs.bat && .\test.exe") +-- With Lightning SDK, where env. variables are on path (e.g. 
Reuben): +-- (cd ilxGen/tests/; ./test.exe) + +-- To compile ILX to verifiable code and verify +-- (cd /devel/fcom/src; make bin/ilxasm.exe bin/ilverify.exe) && (cd ilxGen/tests/; export ILVALID_HOME=/devel/fcom/src; cat prelude.ilx test.ilx /devel/fcom/src/assem/stdlib-func.ilx > test.full.ilx && cd ilxGen/tests/; /devel/fcom/src/bin/ilxasm.exe --no-ilasm test.full.ilx > test.safe.il && /devel/fcom/src/bin/ilverify.exe test.safe.il) + +-- (cd ilxGen/tests/; cmd /C "c:\\bin\\devvs.bat && .\test.safe.exe") + +--append:: [Char] -> [Char] -> [Char] +--append [] l2 = l2 +--append (h:t) l2 = h:append t l2 + +data N = Z | S N + +chooseN n = + case n of + Z -> "even\n" + S Z -> "odd\n" + S (S m) -> chooseN m + +add n m = + case n of + Z -> m + S nn -> S (add nn m) + +mul n m = + case n of + Z -> Z + S nn -> add m (mul nn m) + +pow n m = + case m of + Z -> S Z + S mm -> mul n (pow n mm) + +sq n = mul n n + +n1 = S Z +n2 = add n1 n1 +n4 = add n2 n2 +n6 = add n2 n4 +n8 = add n2 n6 +n10 = add n2 n8 +n16 = add n6 n10 +n17 = add n1 n16 +n18 = add n8 n10 +n19 = add n1 n18 +n20 = add n4 n16 + +bign = pow n2 n10 +bign1 = add bign n1 + +main = putStr (chooseN bign1) + + diff --git a/compiler/ilxGen/tests/test20.hs b/compiler/ilxGen/tests/test20.hs new file mode 100644 index 0000000000..157a16da1d --- /dev/null +++ b/compiler/ilxGen/tests/test20.hs @@ -0,0 +1,9 @@ + +data N = Z | S N + +res Z x y = (# x, y #) +res (S n) x y = res n x y + +(# x, y #) = res (S Z) "no!" "hello world\n" + +main = putStr y diff --git a/compiler/ilxGen/tests/test21.hs b/compiler/ilxGen/tests/test21.hs new file mode 100644 index 0000000000..1870f22b97 --- /dev/null +++ b/compiler/ilxGen/tests/test21.hs @@ -0,0 +1,13 @@ +{-# OPTIONS -fno-implicit-prelude #-} + +import PrelIOBase +import PrelIO +import PrelBase +import PrelAddr + +foreign import "libHS_cbits" "getErrStr__" unsafe ggetErrStr__ :: Int -> IO Addr + +main = putStr (uunsafePerformIO (ggetErrStr__ 4)) + +uunsafePerformIO :: IO Addr -> [Char] +uunsafePerformIO (IO m) = case m realWorld# of (# _, (A# r) #) -> (unpackCString# r) diff --git a/compiler/ilxGen/tests/test2b.hs b/compiler/ilxGen/tests/test2b.hs new file mode 100644 index 0000000000..08a391f799 --- /dev/null +++ b/compiler/ilxGen/tests/test2b.hs @@ -0,0 +1,2 @@ +foreign import "ilxHello" unsafe ilxHello :: IO () +main = ilxHello diff --git a/compiler/ilxGen/tests/test2c.hs b/compiler/ilxGen/tests/test2c.hs new file mode 100644 index 0000000000..d01df051f8 --- /dev/null +++ b/compiler/ilxGen/tests/test2c.hs @@ -0,0 +1,14 @@ +import PrelIOBase + + +bindIO2 :: IO () -> IO () -> IO () +bindIO2 m (IO k) = IO ( \ s -> k s ) + +foreign import "ilxHello" unsafe ilxHello :: IO () + +data N = S N | Z + +f Z = bindIO2 +f (S x) = f x + +main = f(S Z) ilxHello ilxHello diff --git a/compiler/ilxGen/tests/test2d.hs b/compiler/ilxGen/tests/test2d.hs new file mode 100644 index 0000000000..8126127a32 --- /dev/null +++ b/compiler/ilxGen/tests/test2d.hs @@ -0,0 +1,7 @@ +foreign import ccall "libHS_cbits.so" "get_prog_argc" unsafe primArgc :: Int + +foreign import "ilxHello" unsafe ilxHello :: IO () +foreign import "ilxBad" unsafe ilxBad :: IO () + + +main = if (primArgc == 0) then ilxHello else ilxBad diff --git a/compiler/ilxGen/tests/test3.hs b/compiler/ilxGen/tests/test3.hs new file mode 100644 index 0000000000..0254ee41c4 --- /dev/null +++ b/compiler/ilxGen/tests/test3.hs @@ -0,0 +1,24 @@ +foreign import "ilxHello" unsafe ilxHello :: IO () +foreign import "ilxBad" unsafe ilxBad :: IO () + +class Eqq a where + eqq :: a -> Bool + 
eqq2 :: a -> Bool + +-- x /= y = not (x == y) +-- x == y = not (x /= y) +-- x /= y = True + eqq x = False + eqq2 x = True + + +data Unit = Unit + +instance Eqq Unit +-- where +-- eqq Unit = True +-- eqq2 Unit = False + +choose x = if eqq x then ilxHello else if eqq2 x then ilxBad else ilxBad + +main = choose Unit diff --git a/compiler/ilxGen/tests/test4.hs b/compiler/ilxGen/tests/test4.hs new file mode 100644 index 0000000000..080c6521e3 --- /dev/null +++ b/compiler/ilxGen/tests/test4.hs @@ -0,0 +1,47 @@ +class Eqq a where + evenN :: a -> Bool + oddN :: a -> Bool + evenN x = False + oddN x = True + + +data N = Z | S N + +instance Eqq N + where + evenN Z = True + evenN (S x) = oddN x + oddN Z = False + oddN (S x) = evenN x + +choose x = if evenN x then "hello world (evenN)\n" else if oddN x then "hello world (oddN)\n" else "no!\n" + +add n m = + case n of + Z -> m + S nn -> S (add nn m) + +mul n m = + case n of + Z -> Z + S nn -> add m (mul nn m) + +pow n m = + case m of + Z -> S Z + S mm -> mul n (pow n mm) + +n1 = S Z +n2 = add n1 n1 +n4 = add n2 n2 +n6 = add n2 n4 +n8 = add n2 n6 +n10 = add n2 n8 +n16 = add n6 n10 +n18 = add n8 n10 +n20 = add n4 n16 + +bign = pow n2 n16 +bign1 = add bign n1 + +main = putStr (choose bign1) diff --git a/compiler/ilxGen/tests/test5.hs b/compiler/ilxGen/tests/test5.hs new file mode 100644 index 0000000000..13d6028c02 --- /dev/null +++ b/compiler/ilxGen/tests/test5.hs @@ -0,0 +1,5 @@ +data One a = One a + +choose (One x) = x +main = putStr (choose (One "hello world\n")) + diff --git a/compiler/ilxGen/tests/test6.hs b/compiler/ilxGen/tests/test6.hs new file mode 100644 index 0000000000..17e51ab51d --- /dev/null +++ b/compiler/ilxGen/tests/test6.hs @@ -0,0 +1,8 @@ +data List a = Cons a (List a) + +hdL (Cons x y) = x +tlL (Cons x y) = y + +test = Cons "hello world\n" test +main = putStr (hdL (tlL test)) + diff --git a/compiler/ilxGen/tests/test7.hs b/compiler/ilxGen/tests/test7.hs new file mode 100644 index 0000000000..c146038052 --- /dev/null +++ b/compiler/ilxGen/tests/test7.hs @@ -0,0 +1,8 @@ +data List a = Cons a (List a) + +hdL (Cons x y) = x +tlL (Cons x y) = y + +mk f x = f x (mk f x) +main = putStr (hdL (tlL (mk Cons "hello world!\n"))) + diff --git a/compiler/ilxGen/tests/test8.hs b/compiler/ilxGen/tests/test8.hs new file mode 100644 index 0000000000..94a7e1f83d --- /dev/null +++ b/compiler/ilxGen/tests/test8.hs @@ -0,0 +1,8 @@ +data Inf a = A (Inf a) + +hd (A x) = x + +choose (A (A x)) = "hello world\n" +mk f = f (mk f) +main = putStr (choose (hd (mk A))) + diff --git a/compiler/ilxGen/tests/test9.hs b/compiler/ilxGen/tests/test9.hs new file mode 100644 index 0000000000..311b65c4e1 --- /dev/null +++ b/compiler/ilxGen/tests/test9.hs @@ -0,0 +1,10 @@ +data Tree a = Node (Tree a) (Tree a) + +left (Node x y) = x +right (Node x y) = y + +choose (Node (Node _ _) (Node _ _)) = "hello world!\n" + +mk f = f (mk f) (mk f) +main = putStr (choose (mk Node)) + diff --git a/compiler/ilxGen/tests/yes.hs b/compiler/ilxGen/tests/yes.hs new file mode 100644 index 0000000000..1dc4f085fd --- /dev/null +++ b/compiler/ilxGen/tests/yes.hs @@ -0,0 +1,5 @@ + +foreign import "ilxHello" unsafe ilxHello :: IO () + +main :: IO () +main = ilxHello >> main
\ No newline at end of file diff --git a/compiler/ilxGen/tests/yes2.hs b/compiler/ilxGen/tests/yes2.hs new file mode 100644 index 0000000000..7fa20c5b7d --- /dev/null +++ b/compiler/ilxGen/tests/yes2.hs @@ -0,0 +1,18 @@ + +import PrelIOBase +foreign import "ilxHello" unsafe ilxHello :: IO () + + + +seqIO :: IO () -> IO () -> IO () +seqIO (IO m) (IO k) = IO ( \ s -> + case m s of + (# new_s, a #) -> k new_s + ) + + +yes () = seqIO ilxHello (yes ()) + +main :: IO () +main = yes () + diff --git a/compiler/javaGen/Java.lhs b/compiler/javaGen/Java.lhs new file mode 100644 index 0000000000..368be03fc1 --- /dev/null +++ b/compiler/javaGen/Java.lhs @@ -0,0 +1,169 @@ +Anbstract syntax for Java subset that is the target of Mondrian. +The syntax has been taken from "The Java Language Specification". + +(c) Erik Meijer & Arjan van IJzendoorn + +November 1999 + +Major reworking to be usable for the intermeduate (GOO) language +for the backend of GHC and to target languauges like Java sucessfully. +-- Andy Gill + +\begin{code} +module Java where + +\end{code} + +%************************************************************************ +%* * +\subsection{Java type declararations} +%* * +%************************************************************************ + +\begin{code} +data CompilationUnit + = Package PackageName [Decl] + deriving (Show) + +data Decl + = Import PackageName + | Field [Modifier] Name (Maybe Expr) + | Constructor [Modifier] TypeName [Parameter] [Statement] + | Method [Modifier] Name [Parameter] [Exception] [Statement] + | Comment [String] + | Interface [Modifier] TypeName [TypeName] [Decl] + | Class [Modifier] TypeName [TypeName] [TypeName] [Decl] + deriving (Show) + +data Parameter + = Parameter [Modifier] Name + deriving (Show) + +data Statement + = Skip + | Return Expr -- This always comes last in a list + -- of statements, and it is understood + -- you might change this to something + -- else (like a variable assignment) + -- if this is not top level statements. + | Block [Statement] + | ExprStatement Expr -- You are never interested in the result + -- of an ExprStatement + | Declaration Decl -- variable = inner Field, Class = innerclass + | IfThenElse [(Expr,Statement)] (Maybe Statement) + | Switch Expr [(Expr, [Statement])] (Maybe [Statement]) + deriving (Show) + +data Expr + = Var Name + | Literal Lit + | Cast Type Expr + | Access Expr Name + | Assign Expr Expr + | InstanceOf Expr Type + | Call Expr Name [Expr] + | Op Expr String Expr + | Raise TypeName [Expr] + | New Type [Expr] (Maybe [Decl]) -- anonymous innerclass + deriving (Show) + +data Modifier + = Public | Protected | Private + | Static + | Abstract | Final | Native | Synchronized | Transient | Volatile + deriving (Show, Eq, Ord) + +-- A type is used to refer in general to the shape of things, +-- or a specific class. Never use a name to refer to a class, +-- always use a type. + +data Type + = PrimType PrimType + | ArrayType Type + | Type TypeName + deriving (Show, Eq) + +data PrimType + = PrimInt + | PrimBoolean + | PrimChar + | PrimLong + | PrimFloat + | PrimDouble + | PrimByte + | PrimVoid + deriving (Show, Eq) + +type PackageName = String -- A package name + -- like "java.awt.Button" + +type Exception = TypeName -- A class name that must be an exception. + +type TypeName = String -- a fully qualified type name + -- like "java.lang.Object". + -- has type "Type <the name>" + +data Name = Name String Type + deriving Show -- A class name or method etc, + -- at defintion time, + -- this generally not a qualified name. 
+ + -- The type is shape of the box require + -- to store an access to this thing. + -- So variables might be Int or Object. + + -- ** method calls store the returned + -- ** type, not a complete arg x result type. + -- + -- Thinking: + -- ... foo1.foo2(...).foo3 ... + -- here you want to know the *result* + -- after calling foo1, then foo2, + -- then foo3. + +instance Eq Name where + (Name nm _) == (Name nm' _) = nm == nm' + + +instance Ord Name where + (Name nm _) `compare` (Name nm' _) = nm `compare` nm' + + +data Lit + = IntLit Integer -- unboxed + | CharLit Int -- unboxed + | StringLit String -- java string + deriving Show + +addModifier :: Modifier -> Decl -> Decl +addModifier = \m -> \d -> + case d of + { Import n -> Import n + ; Field ms n e -> Field (m:ms) n e + ; Constructor ms n as ss -> Constructor (m:ms) n as ss + ; Method ms n as ts ss -> Method (m:ms) n as ts ss + ; Comment ss -> Comment ss + ; Interface ms n xs ds -> Interface (m:ms) n xs ds + ; Class ms n xs is ds -> Class (m:ms) n xs is ds + } + +changeNameType :: Type -> Name -> Name +changeNameType ty (Name n _) = Name n ty + +areSimple :: [Expr] -> Bool +areSimple = \es -> all isSimple es + +isSimple :: Expr -> Bool +isSimple = \e -> + case e of + { Cast t e -> isSimple e + ; Access e n -> isSimple e + ; Assign l r -> isSimple l && isSimple r + ; InstanceOf e t -> isSimple e + ; Call e n es -> isSimple e && areSimple es + ; Op e1 o e2 -> False + ; New n es Nothing -> areSimple es + ; New n es (Just ds) -> False + ; otherwise -> True + } +\end{code} diff --git a/compiler/javaGen/JavaGen.lhs b/compiler/javaGen/JavaGen.lhs new file mode 100644 index 0000000000..a3925b18e8 --- /dev/null +++ b/compiler/javaGen/JavaGen.lhs @@ -0,0 +1,1166 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-2000 +% +\section{Generate Java} + +Name mangling for Java. +~~~~~~~~~~~~~~~~~~~~~~ + +Haskell has a number of namespaces. The Java translator uses +the standard Haskell mangles (see OccName.lhs), and some extra +mangles. + +All names are hidden inside packages. + +module name: + - becomes a first level java package. + - can not clash with java, because haskell modules are upper case, + java default packages are lower case. + +function names: + - these turn into classes + - java keywords (eg. private) have the suffix "zdk" ($k) added. + +data *types* + - These have a base class, so need to appear in the + same name space as other object. for example data Foo = Foo + - We add a postfix to types: "zdc" ($c) + - Types are upper case, so never clash with keywords + +data constructors + - There are tWO classes for each Constructor + (1) - Class with the payload extends the relevent datatype baseclass. + - This class has the prefix zdw ($w) + (2) - Constructor *wrapper* just use their own name. + - Constructors are upper case, so never clash with keywords + - So Foo would become 2 classes. + * Foo -- the constructor wrapper + * zdwFoo -- the worker, with the payload + + +$i for instances. +$k for keyword nameclash avoidance. + +\begin{code} +module JavaGen( javaGen ) where + +import Java + +import Literal ( Literal(..) ) +import Id ( Id, isDataConWorkId_maybe, isId, idName, isDeadBinder, idPrimRep + , isPrimOpId_maybe ) +import Name ( NamedThing(..), getOccString, isExternalName, isInternalName + , nameModule ) +import PrimRep ( PrimRep(..) 
) +import DataCon ( DataCon, dataConRepArity, dataConRepArgTys, dataConWorkId ) +import qualified Type +import qualified CoreSyn +import CoreSyn ( CoreBind, CoreExpr, CoreAlt, CoreBndr, + Bind(..), AltCon(..), collectBinders, isValArg + ) +import TysWiredIn ( boolTy, trueDataCon, falseDataCon ) +import qualified CoreUtils +import Module ( Module, moduleString ) +import TyCon ( TyCon, isDataTyCon, tyConDataCons ) +import Outputable + +import Maybe +import PrimOp +import Util ( lengthIs, notNull ) + +#include "HsVersions.h" + +\end{code} + + +\begin{code} +javaGen :: Module -> [Module] -> [TyCon] -> [CoreBind] -> CompilationUnit + +javaGen mod import_mods tycons binds + = liftCompilationUnit package + where + decls = [Import "haskell.runtime.*"] ++ + [Import (moduleString mod) | mod <- import_mods] ++ + concat (map javaTyCon (filter isDataTyCon tycons)) ++ + concat (map javaTopBind binds) + package = Package (moduleString mod) decls +\end{code} + + +%************************************************************************ +%* * +\subsection{Type declarations} +%* * +%************************************************************************ + +\begin{code} +javaTyCon :: TyCon -> [Decl] +-- public class List {} +-- +-- public class $wCons extends List { +-- Object f1; Object f2 +-- } +-- public class $wNil extends List {} + +javaTyCon tycon + = tycon_jclass : concat (map constr_class constrs) + where + constrs = tyConDataCons tycon + tycon_jclass_jname = javaTyConTypeName tycon ++ "zdc" + tycon_jclass = Class [Public] (shortName tycon_jclass_jname) [] [] [] + + constr_class data_con + = [ Class [Public] constr_jname [tycon_jclass_jname] [] + (field_decls ++ [cons_meth,debug_meth]) + ] + where + constr_jname = shortName (javaConstrWkrName data_con) + + field_names = constrToFields data_con + field_decls = [ Field [Public] n Nothing + | n <- field_names + ] + + cons_meth = mkCons constr_jname field_names + + debug_meth = Method [Public] (Name "toString" stringType) + [] + [] + ( [ Declaration (Field [] txt Nothing) ] + ++ [ ExprStatement + (Assign (Var txt) + (mkStr + ("( " ++ + getOccString data_con ++ + " ") + ) + ) + ] + ++ [ ExprStatement + (Assign (Var txt) + (Op (Var txt) + "+" + (Op (Var n) "+" litSp) + ) + ) + | n <- field_names + ] + ++ [ Return (Op (Var txt) + "+" + (mkStr ")") + ) + ] + ) + + litSp = mkStr " " + txt = Name "__txt" stringType + + +-- This checks to see the type is reasonable to call new with. +-- primitives might use a static method later. +mkNew :: Type -> [Expr] -> Expr +mkNew t@(PrimType primType) _ = error "new primitive -- fix it???" +mkNew t@(Type _) es = New t es Nothing +mkNew _ _ = error "new with strange arguments" + +constrToFields :: DataCon -> [Name] +constrToFields cons = + [ fieldName i t + | (i,t) <- zip [1..] 
(map primRepToType + (map Type.typePrimRep + (dataConRepArgTys cons) + ) + ) + ] + +mkCons :: TypeName -> [Name] -> Decl +mkCons name args = Constructor [Public] name + [ Parameter [] n | n <- args ] + [ ExprStatement (Assign + (Access this n) + (Var n) + ) + | n <- args ] + +mkStr :: String -> Expr +mkStr str = Literal (StringLit str) +\end{code} + +%************************************************************************ +%* * +\subsection{Bindings} +%* * +%************************************************************************ + +\begin{code} +javaTopBind :: CoreBind -> [Decl] +javaTopBind (NonRec bndr rhs) = [java_top_bind bndr rhs] +javaTopBind (Rec prs) = [java_top_bind bndr rhs | (bndr,rhs) <- prs] + +java_top_bind :: Id -> CoreExpr -> Decl +-- public class f implements Code { +-- public Object ENTER() { ...translation of rhs... } +-- } +java_top_bind bndr rhs + = Class [Public] (shortName (javaIdTypeName bndr)) + [] [codeName] [enter_meth] + where + enter_meth = Method [Public] + enterName + [vmArg] + [excName] + (javaExpr vmRETURN rhs) +\end{code} + +%************************************************************************ +%* * +\subsection{Expressions} +%* * +%************************************************************************ + +\begin{code} +javaVar :: Id -> Expr +javaVar v | isExternalName (idName v) = mkNew (javaIdType v) [] + | otherwise = Var (javaName v) + +javaLit :: Literal.Literal -> Expr +javaLit (MachInt i) = Literal (IntLit (fromInteger i)) +javaLit (MachChar c) = Literal (CharLit c) +javaLit (MachStr fs) = Literal (StringLit str) + where + str = concatMap renderString (unpackFS fs) ++ "\\000" + -- This should really handle all the chars 0..31. + renderString '\NUL' = "\\000" + renderString other = [other] + +javaLit other = pprPanic "javaLit" (ppr other) + +-- Pass in the 'shape' of the result. +javaExpr :: (Expr -> Statement) -> CoreExpr -> [Statement] +-- Generate code to apply the value of +-- the expression to the arguments aleady on the stack +javaExpr r (CoreSyn.Var v) = [r (javaVar v)] +javaExpr r (CoreSyn.Lit l) = [r (javaLit l)] +javaExpr r (CoreSyn.App f a) = javaApp r f [a] +javaExpr r e@(CoreSyn.Lam _ _) = javaLam r (collectBinders e) +javaExpr r (CoreSyn.Case e x alts) = javaCase r e x alts +javaExpr r (CoreSyn.Let bind body) = javaBind bind ++ javaExpr r body +javaExpr r (CoreSyn.Note _ e) = javaExpr r e + +javaCase :: (Expr -> Statement) -> CoreExpr -> Id -> [CoreAlt] -> [Statement] +-- case e of x { Nil -> r1 +-- Cons p q -> r2 } +-- ==> +-- final Object x = VM.WHNF(...code for e...) +-- else if x instance_of Nil { +-- ...translation of r1... +-- } else if x instance_of Cons { +-- final Object p = ((Cons) x).f1 +-- final Object q = ((Cons) x).f2 +-- ...translation of r2... +-- } else throw java.lang.Exception + +-- This first special case happens a lot, typically +-- during dictionary deconstruction. +-- We need to access at least *one* field, to check to see +-- if we have correct constructor. +-- If we've got the wrong one, this is _|_, and the +-- casting will catch this with an exception. 
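+-- An illustrative sketch (not verbatim generator output): for a hypothetical
+-- one-alternative dictionary selection
+--      case d of { MkEq eq -> eq }
+-- the clause below should produce Java roughly of the form
+--      final Object d = VM.WHNF(VM.POP());       // scrutinee, pushed just above
+--      final Object eq = ((Test.zdwMkEq) d).f1;  // the cast checks the constructor
+--      ...translation of the right-hand side...
+-- where Test.zdwMkEq is the worker class of the made-up constructor MkEq.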
+ +javaCase r e x [(DataAlt d,bs,rhs)] | notNull bs + = java_expr PushExpr e ++ + [ var [Final] (javaName x) + (whnf primRep (vmPOP (primRepToType primRep))) ] ++ + bind_args d bs ++ + javaExpr r rhs + where + primRep = idPrimRep x + whnf PtrRep = vmWHNF -- needs evaluation + whnf _ = id -- anything else does notg + + bind_args d bs = [var [Final] (javaName b) + (Access (Cast (javaConstrWkrType d) (javaVar x) + ) f + ) + | (b,f) <- filter isId bs `zip` (constrToFields d) + , not (isDeadBinder b) + ] + +javaCase r e x alts + | isIfThenElse && isPrimCmp + = javaIfThenElse r (fromJust maybePrim) tExpr fExpr + | otherwise + = java_expr PushExpr e ++ + [ var [Final] (javaName x) + (whnf primRep (vmPOP (primRepToType primRep))) + , IfThenElse (map mk_alt con_alts) (Just default_code) + ] + where + isIfThenElse = CoreUtils.exprType e `Type.eqType` boolTy + -- also need to check that x is not free in + -- any of the branches. + maybePrim = findCmpPrim e [] + isPrimCmp = isJust maybePrim + (_,_,tExpr) = CoreUtils.findAlt (DataAlt trueDataCon) alts + (_,_,fExpr) = CoreUtils.findAlt (DataAlt falseDataCon) alts + + primRep = idPrimRep x + whnf PtrRep = vmWHNF -- needs evaluation + whnf _ = id + + (con_alts, maybe_default) = CoreUtils.findDefault alts + default_code = case maybe_default of + Nothing -> ExprStatement (Raise excName [Literal (StringLit "case failure")]) + Just rhs -> Block (javaExpr r rhs) + + mk_alt (DataAlt d, bs, rhs) = (instanceOf x d, Block (bind_args d bs ++ javaExpr r rhs)) + mk_alt (LitAlt lit, bs, rhs) = (eqLit lit , Block (javaExpr r rhs)) + + + eqLit (MachInt n) = Op (Literal (IntLit n)) + + "==" + (Var (javaName x)) + eqLit (MachChar n) = Op (Literal (CharLit n)) + "==" + (Var (javaName x)) + eqLit other = pprPanic "eqLit" (ppr other) + + bind_args d bs = [var [Final] (javaName b) + (Access (Cast (javaConstrWkrType d) (javaVar x) + ) f + ) + | (b,f) <- filter isId bs `zip` (constrToFields d) + , not (isDeadBinder b) + ] + +javaIfThenElse r cmp tExpr fExpr +{- + - Now what we need to do is generate code for the if/then/else. + - [all arguments are already check for simpleness (Var or Lit).] + - + - if (<prim> arg1 arg2 arg3 ...) { + - trueCode + - } else { + - falseCode + - } + -} + = [IfThenElse [(cmp,j_tExpr)] (Just j_fExpr)] + where + j_tExpr, j_fExpr :: Statement + j_tExpr = Block (javaExpr r tExpr) + j_fExpr = Block (javaExpr r fExpr) + +javaBind (NonRec x rhs) +{- + x = ...rhs_x... + ==> + final Object x = new Thunk( new Code() { ...code for rhs_x... } ) +-} + + = java_expr (SetVar name) rhs + where + name = case coreTypeToType rhs of + ty@(PrimType _) -> javaName x `withType` ty + _ -> javaName x `withType` codeType + +javaBind (Rec prs) +{- rec { x = ...rhs_x...; y = ...rhs_y... } + ==> + class x implements Code { + Code x, y; + public Object ENTER() { ...code for rhs_x...} + } + ...ditto for y... + + final x x_inst = new x(); + ...ditto for y... + + final Thunk x = new Thunk( x_inst ); + ...ditto for y... + + x_inst.x = x; + x_inst.y = y; + ...ditto for y... 
+-} + = (map mk_class prs) ++ (map mk_inst prs) ++ + (map mk_thunk prs) ++ concat (map mk_knot prs) + where + mk_class (b,r) = Declaration (Class [] class_name [] [codeName] stmts) + where + class_name = javaIdTypeName b + stmts = [Field [] (javaName b `withType` codeType) Nothing | (b,_) <- prs] ++ + [Method [Public] enterName [vmArg] [excName] (javaExpr vmRETURN r)] + + mk_inst (b,r) = var [Final] name (mkNew ty []) + where + name@(Name _ ty) = javaInstName b + + mk_thunk (b,r) = var [Final] (javaName b `withType` codeType) + (mkNew thunkType [Var (javaInstName b)]) + + mk_knot (b,_) = [ ExprStatement (Assign lhs rhs) + | (b',_) <- prs, + let lhs = Access (Var (javaInstName b)) (javaName b'), + let rhs = Var (javaName b') + ] + +javaLam :: (Expr -> Statement) -> ([CoreBndr], CoreExpr) -> [Statement] +javaLam r (bndrs, body) + | null val_bndrs = javaExpr r body + | otherwise + = vmCOLLECT (length val_bndrs) this + ++ [var [Final] n (vmPOP t) | n@(Name _ t) <- val_bndrs] + ++ javaExpr r body + where + val_bndrs = map javaName (filter isId bndrs) + +javaApp :: (Expr -> Statement) -> CoreExpr -> [CoreExpr] -> [Statement] +javaApp r (CoreSyn.App f a) as + | isValArg a = javaApp r f (a:as) + | otherwise = javaApp r f as +javaApp r (CoreSyn.Var f) as + = case isDataConWorkId_maybe f of { + Just dc | as `lengthIs` dataConRepArity dc + -- NOTE: Saturated constructors never returning a primitive at this point + -- + -- We push the arguments backwards, because we are using + -- the (ugly) semantics of the order of evaluation of arguments, + -- to avoid making up local names. Oh to have a namesupply... + -- + -> javaArgs (reverse as) ++ + [r (New (javaIdType f) + (javaPops as) + Nothing + ) + ] + | otherwise -> + -- build a local + let stmts = + vmCOLLECT (dataConRepArity dc) this ++ + [ vmRETURN + (New (javaIdType f) + [ vmPOP ty | (Name _ ty) <- constrToFields dc ] + Nothing + ) + ] + in javaArgs (reverse as) ++ [r (newCode stmts)] + ; other -> java_apply r (CoreSyn.Var f) as + } + +javaApp r f as = java_apply r f as + +-- This means, given a expression an a list of arguments, +-- generate code for "pushing the arguments on the stack, +-- and the executing the expression." + +java_apply :: (Expr -> Statement) -> CoreExpr -> [CoreExpr] -> [Statement] +java_apply r f as = javaArgs as ++ javaExpr r f + +-- This generates statements that have the net effect +-- of pushing values (perhaps thunks) onto the stack. + +javaArgs :: [CoreExpr] -> [Statement] +javaArgs args = concat [ java_expr PushExpr a | a <- args, isValArg a] + +javaPops :: [CoreExpr] -> [Expr] +javaPops args = [ vmPOP (primRepToType (Type.typePrimRep (CoreUtils.exprType a))) + | a <- args + , isValArg a + ] + + +-- The result is a list of statments that have the effect of +-- pushing onto the stack (via one of the VM.PUSH* commands) +-- the argument, (or returning, or setting a variable) +-- perhaps thunked. + +{- This is mixing two things. + (1) Optimizations for things like primitives, whnf calls, etc. + (2) If something needs a thunk constructor round it. + - Seperate them at some point! + -} +data ExprRetStyle = SetVar Name | PushExpr | ReturnExpr + +java_expr :: ExprRetStyle -> CoreExpr -> [Statement] +java_expr _ (CoreSyn.Type t) = pprPanic "java_expr" (ppr t) +java_expr ret e + | isPrimCall = [push (fromJust maybePrim)] + -- This is a shortcut, + -- basic names and literals do not need a code block + -- to compute the value. 
+ | isPrim primty && CoreUtils.exprIsTrivial e = javaExpr push e + | isPrim primty = + let expr = javaExpr vmRETURN e + code = access (vmWHNF (newCode expr)) (primRepToType primty) + in [push code] + | otherwise = + let expr = javaExpr vmRETURN e + code = newCode expr + code' = if CoreUtils.exprIsValue e + || CoreUtils.exprIsTrivial e + || isPrim primty + then code + else newThunk code + in [push code'] + where + maybePrim = findFnPrim e [] + isPrimCall = isJust maybePrim + + push e = case ret of + SetVar name -> var [Final] name e + PushExpr -> vmPUSH e + ReturnExpr -> vmRETURN e + corety = CoreUtils.exprType e + primty = Type.typePrimRep corety + isPrim PtrRep = False -- only this needs updated + isPrim _ = True + +coreTypeToType = primRepToType . Type.typePrimRep . CoreUtils.exprType + +renameForKeywords :: (NamedThing name) => name -> String +renameForKeywords name + | str `elem` keywords = "zdk" ++ str + | otherwise = str + where + str = getOccString name + +keywords :: [String] +keywords = + [ "return" + , "if" + , "then" + , "else" + , "class" + , "instance" + , "import" + , "throw" + , "try" + ] + +\end{code} + +%************************************************************************ +%* * +\subsection{Helper functions} +%* * +%************************************************************************ + +\begin{code} +true, this,javaNull :: Expr +this = Var thisName +true = Var (Name "true" (PrimType PrimBoolean)) +javaNull = Var (Name "null" objectType) + +vmCOLLECT :: Int -> Expr -> [Statement] +vmCOLLECT 0 e = [] +vmCOLLECT n e = [ExprStatement + (Call varVM collectName + [ Literal (IntLit (toInteger n)) + , e + ] + ) + ] + +vmPOP :: Type -> Expr +vmPOP ty = Call varVM (Name ("POP" ++ suffix ty) ty) [] + +vmPUSH :: Expr -> Statement +vmPUSH e = ExprStatement + (Call varVM (Name ("PUSH" ++ suffix (exprType e)) void) [e]) + +vmRETURN :: Expr -> Statement +vmRETURN e = Return ( + case ty of + PrimType _ -> Call varVM (Name ("RETURN" ++ suffix ty) + valueType + ) [e] + _ -> e) + where + ty = exprType e + +var :: [Modifier] -> Name -> Expr -> Statement +var ms field_name@(Name _ ty) value + | exprType value == ty = Declaration (Field ms field_name (Just value)) + | otherwise = var ms field_name (Cast ty value) + +vmWHNF :: Expr -> Expr +vmWHNF e = Call varVM whnfName [e] + +suffix :: Type -> String +suffix (PrimType t) = primName t +suffix _ = "" + +primName :: PrimType -> String +primName PrimInt = "int" +primName PrimChar = "char" +primName PrimByte = "byte" +primName PrimBoolean = "boolean" +primName _ = error "unsupported primitive" + +varVM :: Expr +varVM = Var vmName + +instanceOf :: Id -> DataCon -> Expr +instanceOf x data_con + = InstanceOf (Var (javaName x)) (javaConstrWkrType data_con) + +newCode :: [Statement] -> Expr +newCode [Return e] = e +newCode stmts = New codeType [] (Just [Method [Public] enterName [vmArg] [excName] stmts]) + +newThunk :: Expr -> Expr +newThunk e = New thunkType [e] Nothing + +vmArg :: Parameter +vmArg = Parameter [Final] vmName + +-- This is called with boolean compares, checking +-- to see if we can do an obvious shortcut. 
+-- If there is, we return a (GOO) expression for doing this, + +-- So if, we have case (#< x y) of { True -> e1; False -> e2 }, +-- we will call findCmpFn with (#< x y), this return Just (Op x "<" y) + +findCmpPrim :: CoreExpr -> [Expr] -> Maybe Expr +findCmpPrim (CoreSyn.App f a) as = + case a of + CoreSyn.Var v -> findCmpPrim f (javaVar v:as) + CoreSyn.Lit l -> findCmpPrim f (javaLit l:as) + _ -> Nothing +findCmpPrim (CoreSyn.Var p) as = + case isPrimOpId_maybe p of + Just prim -> find_cmp_prim prim as + Nothing -> Nothing +findCmpPrim _ as = Nothing + +find_cmp_prim cmpPrim args@[a,b] = + case cmpPrim of + IntGtOp -> fn ">" + IntGeOp -> fn ">=" + IntEqOp -> fn "==" + IntNeOp -> fn "/=" + IntLtOp -> fn "<" + IntLeOp -> fn "<=" + _ -> Nothing + where + fn op = Just (Op a op b) +find_cmp_prim _ _ = Nothing + +findFnPrim :: CoreExpr -> [Expr] -> Maybe Expr +findFnPrim (CoreSyn.App f a) as = + case a of + CoreSyn.Var v -> findFnPrim f (javaVar v:as) + CoreSyn.Lit l -> findFnPrim f (javaLit l:as) + _ -> Nothing +findFnPrim (CoreSyn.Var p) as = + case isPrimOpId_maybe p of + Just prim -> find_fn_prim prim as + Nothing -> Nothing +findFnPrim _ as = Nothing + +find_fn_prim cmpPrim args@[a,b] = + case cmpPrim of + IntAddOp -> fn "+" + IntSubOp -> fn "-" + IntMulOp -> fn "*" + _ -> Nothing + where + fn op = Just (Op a op b) +find_fn_prim _ _ = Nothing +\end{code} + +%************************************************************************ +%* * +\subsection{Haskell to Java Types} +%* * +%************************************************************************ + +\begin{code} +exprType (Var (Name _ t)) = t +exprType (Literal lit) = litType lit +exprType (Cast t _) = t +exprType (New t _ _) = t +exprType (Call _ (Name _ t) _) = t +exprType (Access _ (Name _ t)) = t +exprType (Raise t _) = error "do not know the type of raise!" +exprType (Op _ op _) | op `elem` ["==","/=","<","<=","=>",">"] + = PrimType PrimBoolean +exprType (Op x op _) | op `elem` ["+","-","*"] + = exprType x +exprType expr = error ("can't figure out an expression type: " ++ show expr) + +litType (IntLit i) = PrimType PrimInt +litType (CharLit i) = PrimType PrimChar +litType (StringLit i) = stringType -- later, might use char array? +\end{code} + +%************************************************************************ +%* * +\subsection{Name mangling} +%* * +%************************************************************************ + +\begin{code} +codeName, excName, thunkName :: TypeName +codeName = "haskell.runtime.Code" +thunkName = "haskell.runtime.Thunk" +excName = "java.lang.Exception" + +enterName, vmName,thisName,collectName, whnfName :: Name +enterName = Name "ENTER" objectType +vmName = Name "VM" vmType +thisName = Name "this" (Type "<this>") +collectName = Name "COLLECT" void +whnfName = Name "WHNF" objectType + +fieldName :: Int -> Type -> Name -- Names for fields of a constructor +fieldName n ty = Name ("f" ++ show n) ty + +withType :: Name -> Type -> Name +withType (Name n _) t = Name n t + +-- This maps (local only) names Ids to Names, +-- using the same string as the Id. +javaName :: Id -> Name +javaName n + | isExternalName (idName n) = error "useing javaName on global" + | otherwise = Name (getOccString n) + (primRepToType (idPrimRep n)) + +-- TypeName's are almost always global. This would typically return something +-- like Test.foo or Test.Foozdc or PrelBase.foldr. +-- Local might use locally bound types, (which do not have '.' in them). 
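+-- Worked examples of the convention (module and identifier names are made up
+-- for illustration):
+--      top-level 'foo'    defined in module Test  ==>  "Test.foo"
+--      top-level 'return' defined in module Test  ==>  "Test.zdkreturn"
+--                                  (see renameForKeywords and 'keywords' above)
+--      a local (internal) binder 'x'              ==>  "x"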
+ +javaIdTypeName :: Id -> TypeName +javaIdTypeName n + | isInternalName n' = renameForKeywords n' + | otherwise = moduleString (nameModule n') ++ "." ++ renameForKeywords n' + where + n' = getName n + +-- There is no such thing as a local type constructor. + +javaTyConTypeName :: TyCon -> TypeName +javaTyConTypeName n = (moduleString (nameModule n') ++ "." ++ renameForKeywords n') + where + n' = getName n + +-- this is used for getting the name of a class when defining it. +shortName :: TypeName -> TypeName +shortName = reverse . takeWhile (/= '.') . reverse + +-- The function that makes the constructor name +-- The constructor "Foo ..." in module Test, +-- would return the name "Test.Foo". + +javaConstrWkrName :: DataCon -> TypeName +javaConstrWkrName = javaIdTypeName . dataConWorkId + +-- Makes x_inst for Rec decls +-- They are *never* is primitive +-- and always have local (type) names. +javaInstName :: Id -> Name +javaInstName n = Name (renameForKeywords n ++ "zdi_inst") + (Type (renameForKeywords n)) +\end{code} + +%************************************************************************ +%* * +\subsection{Types and type mangling} +%* * +%************************************************************************ + +\begin{code} +-- Haskell RTS types +codeType, thunkType, valueType :: Type +codeType = Type codeName +thunkType = Type thunkName +valueType = Type "haskell.runtime.Value" +vmType = Type "haskell.runtime.VMEngine" + +-- Basic Java types +objectType, stringType :: Type +objectType = Type "java.lang.Object" +stringType = Type "java.lang.String" + +void :: Type +void = PrimType PrimVoid + +inttype :: Type +inttype = PrimType PrimInt + +chartype :: Type +chartype = PrimType PrimChar + +bytetype :: Type +bytetype = PrimType PrimByte + +-- This lets you get inside a possible "Value" type, +-- to access the internal unboxed object. +access :: Expr -> Type -> Expr +access expr (PrimType prim) = accessPrim (Cast valueType expr) prim +access expr other = expr + +accessPrim expr PrimInt = Call expr (Name "intValue" inttype) [] +accessPrim expr PrimChar = Call expr (Name "charValue" chartype) [] +accessPrim expr PrimByte = Call expr (Name "byteValue" bytetype) [] +accessPrim expr other = pprPanic "accessPrim" (text (show other)) + +-- This is where we map from typename to types, +-- allowing to match possible primitive types. +mkType :: TypeName -> Type +mkType "PrelGHC.Intzh" = inttype +mkType "PrelGHC.Charzh" = chartype +mkType other = Type other + +-- Turns a (global) Id into a Type (fully qualified name). +javaIdType :: Id -> Type +javaIdType = mkType . javaIdTypeName + +javaLocalIdType :: Id -> Type +javaLocalIdType = primRepToType . idPrimRep + +primRepToType ::PrimRep -> Type +primRepToType PtrRep = objectType +primRepToType IntRep = inttype +primRepToType CharRep = chartype +primRepToType Int8Rep = bytetype +primRepToType AddrRep = objectType +primRepToType other = pprPanic "primRepToType" (ppr other) + +-- The function that makes the constructor name +javaConstrWkrType :: DataCon -> Type +javaConstrWkrType con = Type (javaConstrWkrName con) +\end{code} + +%************************************************************************ +%* * +\subsection{Class Lifting} +%* * +%************************************************************************ + +This is a very simple class lifter. It works by carrying inwards a +list of bound variables (things that might need to be passed to a +lifted inner class). 
+ * Any variable references is check with this list, and if it is + bound, then it is not top level, external reference. + * This means that for the purposes of lifting, it might be free + inside a lifted inner class. + * We remember these "free inside the inner class" values, and + use this list (which is passed, via the monad, outwards) + when lifting. + +\begin{code} +type Bound = [Name] +type Frees = [Name] + +combine :: [Name] -> [Name] -> [Name] +combine [] names = names +combine names [] = names +combine (name:names) (name':names') + | name < name' = name : combine names (name':names') + | name > name' = name' : combine (name:names) names' + | name == name = name : combine names names' + | otherwise = error "names are not a total order" + +both :: [Name] -> [Name] -> [Name] +both [] names = [] +both names [] = [] +both (name:names) (name':names') + | name < name' = both names (name':names') + | name > name' = both (name:names) names' + | name == name = name : both names names' + | otherwise = error "names are not a total order" + +combineEnv :: Env -> [Name] -> Env +combineEnv (Env bound env) new = Env (bound `combine` new) env + +addTypeMapping :: TypeName -> TypeName -> [Name] -> Env -> Env +addTypeMapping origName newName frees (Env bound env) + = Env bound ((origName,(newName,frees)) : env) + +-- This a list of bound vars (with types) +-- and a mapping from old class name +-- to inner class name (with a list of frees that need passed +-- to the inner class.) + +data Env = Env Bound [(TypeName,(TypeName,[Name]))] + +newtype LifterM a = + LifterM { unLifterM :: + TypeName -> -- this class name + Int -> -- uniq supply + ( a -- * + , Frees -- frees + , [Decl] -- lifted classes + , Int -- The uniqs + ) + } + +instance Monad LifterM where + return a = LifterM (\ n s -> (a,[],[],s)) + (LifterM m) >>= fn = LifterM (\ n s -> + case m n s of + (a,frees,lifted,s) + -> case unLifterM (fn a) n s of + (a,frees2,lifted2,s) -> ( a + , combine frees frees2 + , lifted ++ lifted2 + , s) + ) + +liftAccess :: Env -> Name -> LifterM () +liftAccess env@(Env bound _) name + | name `elem` bound = LifterM (\ n s -> ((),[name],[],s)) + | otherwise = return () + +scopedName :: TypeName -> LifterM a -> LifterM a +scopedName name (LifterM m) = + LifterM (\ _ s -> + case m name 1 of + (a,frees,lifted,_) -> (a,frees,lifted,s) + ) + +genAnonInnerClassName :: LifterM TypeName +genAnonInnerClassName = LifterM (\ n s -> + ( n ++ "$" ++ show s + , [] + , [] + , s + 1 + ) + ) + +genInnerClassName :: TypeName -> LifterM TypeName +genInnerClassName name = LifterM (\ n s -> + ( n ++ "$" ++ name + , [] + , [] + , s + ) + ) + +getFrees :: LifterM a -> LifterM (a,Frees) +getFrees (LifterM m) = LifterM (\ n s -> + case m n s of + (a,frees,lifted,n) -> ((a,frees),frees,lifted,n) + ) + +rememberClass :: Decl -> LifterM () +rememberClass decl = LifterM (\ n s -> ((),[],[decl],s)) + + +liftCompilationUnit :: CompilationUnit -> CompilationUnit +liftCompilationUnit (Package name ds) = + Package name (concatMap liftCompilationUnit' ds) + +liftCompilationUnit' :: Decl -> [Decl] +liftCompilationUnit' decl = + case unLifterM (liftDecls True (Env [] []) [decl]) [] 1 of + (ds,_,ds',_) -> ds ++ ds' + + +-- The bound vars for the current class have +-- already be captured before calling liftDecl, +-- because they are in scope everywhere inside the class. 
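+
+-- A minimal sketch of running a LifterM action by hand, the way
+-- liftCompilationUnit' does above.  This helper is an illustrative addition,
+-- not part of the original module; the name runLifterSketch and the class
+-- name "Example" are made up.  It keeps the result together with every
+-- inner class that was lifted out along the way, discarding the frees and
+-- the final uniq supply.
+runLifterSketch :: LifterM a -> (a, [Decl])
+runLifterSketch (LifterM m) =
+    case m "Example" 1 of
+      (a, _frees, lifted, _uniqs) -> (a, lifted)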
+ +liftDecl :: Bool -> Env -> Decl -> LifterM Decl +liftDecl = \ top env decl -> + case decl of + { Import n -> return (Import n) + ; Field mfs n e -> + do { e <- liftMaybeExpr env e + ; return (Field mfs (liftName env n) e) + } + ; Constructor mfs n as ss -> + do { let newBound = getBoundAtParameters as + ; (ss,_) <- liftStatements (combineEnv env newBound) ss + ; return (Constructor mfs n (liftParameters env as) ss) + } + ; Method mfs n as ts ss -> + do { let newBound = getBoundAtParameters as + ; (ss,_) <- liftStatements (combineEnv env newBound) ss + ; return (Method mfs (liftName env n) (liftParameters env as) ts ss) + } + ; Comment s -> return (Comment s) + ; Interface mfs n is ms -> error "interfaces not supported" + ; Class mfs n x is ms -> + do { let newBound = getBoundAtDecls ms + ; ms <- scopedName n + (liftDecls False (combineEnv env newBound) ms) + ; return (Class mfs n x is ms) + } + } + +liftDecls :: Bool -> Env -> [Decl] -> LifterM [Decl] +liftDecls top env = mapM (liftDecl top env) + +getBoundAtDecls :: [Decl] -> Bound +getBoundAtDecls = foldr combine [] . map getBoundAtDecl + +getBoundAtDecl :: Decl -> Bound +getBoundAtDecl (Field _ n _) = [n] +getBoundAtDecl _ = [] + +getBoundAtParameters :: [Parameter] -> Bound +getBoundAtParameters = foldr combine [] . map getBoundAtParameter + +-- TODO +getBoundAtParameter :: Parameter -> Bound +getBoundAtParameter (Parameter _ n) = [n] + + +liftStatement :: Env -> Statement -> LifterM (Statement,Env) +liftStatement = \ env stmt -> + case stmt of + { Skip -> return (stmt,env) + ; Return e -> do { e <- liftExpr env e + ; return (Return e,env) + } + ; Block ss -> do { (ss,env) <- liftStatements env ss + ; return (Block ss,env) + } + ; ExprStatement e -> do { e <- liftExpr env e + ; return (ExprStatement e,env) + } + ; Declaration decl@(Field mfs n e) -> + do { e <- liftMaybeExpr env e + ; return ( Declaration (Field mfs (liftName env n) e) + , env `combineEnv` getBoundAtDecl decl + ) + } + ; Declaration decl@(Class mfs n x is ms) -> + do { innerName <- genInnerClassName n + ; frees <- liftClass env innerName ms x is + ; return ( Declaration (Comment ["lifted " ++ n]) + , addTypeMapping n innerName frees env + ) + } + ; Declaration d -> error "general Decl not supported" + ; IfThenElse ecs s -> ifthenelse env ecs s + ; Switch e as d -> error "switch not supported" + } + +ifthenelse :: Env + -> [(Expr,Statement)] + -> (Maybe Statement) + -> LifterM (Statement,Env) +ifthenelse env pairs may_stmt = + do { let (exprs,stmts) = unzip pairs + ; exprs <- liftExprs env exprs + ; (stmts,_) <- liftStatements env stmts + ; may_stmt <- case may_stmt of + Just stmt -> do { (stmt,_) <- liftStatement env stmt + ; return (Just stmt) + } + Nothing -> return Nothing + ; return (IfThenElse (zip exprs stmts) may_stmt,env) + } + +liftStatements :: Env -> [Statement] -> LifterM ([Statement],Env) +liftStatements env [] = return ([],env) +liftStatements env (s:ss) = + do { (s,env) <- liftStatement env s + ; (ss,env) <- liftStatements env ss + ; return (s:ss,env) + } + +liftExpr :: Env -> Expr -> LifterM Expr +liftExpr = \ env expr -> + case expr of + { Var n -> do { liftAccess env n + ; return (Var (liftName env n)) + } + ; Literal l -> return expr + ; Cast t e -> do { e <- liftExpr env e + ; return (Cast (liftType env t) e) + } + ; Access e n -> do { e <- liftExpr env e + -- do not consider n as an access, because + -- this is a indirection via a reference + ; return (Access e n) + } + ; Assign l r -> do { l <- liftExpr env l + ; r <- liftExpr env r + ; return 
(Assign l r) + } + ; InstanceOf e t -> do { e <- liftExpr env e + ; return (InstanceOf e (liftType env t)) + } + ; Raise n es -> do { es <- liftExprs env es + ; return (Raise n es) + } + ; Call e n es -> do { e <- liftExpr env e + ; es <- mapM (liftExpr env) es + ; return (Call e n es) + } + ; Op e1 o e2 -> do { e1 <- liftExpr env e1 + ; e2 <- liftExpr env e2 + ; return (Op e1 o e2) + } + ; New n es ds -> new env n es ds + } + +liftParameter env (Parameter ms n) = Parameter ms (liftName env n) +liftParameters env = map (liftParameter env) + +liftName env (Name n t) = Name n (liftType env t) + +liftExprs :: Env -> [Expr] -> LifterM [Expr] +liftExprs = mapM . liftExpr + + +liftMaybeExpr :: Env -> (Maybe Expr) -> LifterM (Maybe Expr) +liftMaybeExpr env Nothing = return Nothing +liftMaybeExpr env (Just stmt) = do { stmt <- liftExpr env stmt + ; return (Just stmt) + } + + + +new :: Env -> Type -> [Expr] -> Maybe [Decl] -> LifterM Expr +new env@(Env _ pairs) typ args Nothing = + do { args <- liftExprs env args + ; return (liftNew env typ args) + } +new env typ [] (Just inner) = + -- anon. inner class + do { innerName <- genAnonInnerClassName + ; frees <- liftClass env innerName inner [] [unType typ] + ; return (New (Type (innerName)) + (map Var frees) + Nothing) + } + where unType (Type name) = name + unType _ = error "incorrect type style" +new env typ _ (Just inner) = error "cant handle inner class with args" + + +liftClass :: Env -> TypeName -> [Decl] -> [TypeName] -> [TypeName] -> LifterM [ Name ] +liftClass env@(Env bound _) innerName inner xs is = + do { let newBound = getBoundAtDecls inner + ; (inner,frees) <- + getFrees (liftDecls False (env `combineEnv` newBound) inner) + ; let trueFrees = filter (\ (Name xs _) -> xs /= "VM") (both frees bound) + ; let freeDefs = [ Field [Final] n Nothing | n <- trueFrees ] + ; let cons = mkCons innerName trueFrees + ; let innerClass = Class [] innerName xs is (freeDefs ++ [cons] ++ inner) + ; rememberClass innerClass + ; return trueFrees + } + +liftType :: Env -> Type -> Type +liftType (Env _ env) typ@(Type name) + = case lookup name env of + Nothing -> typ + Just (nm,_) -> Type nm +liftType _ typ = typ + +liftNew :: Env -> Type -> [Expr] -> Expr +liftNew (Env _ env) typ@(Type name) exprs + = case lookup name env of + Nothing -> New typ exprs Nothing + Just (nm,args) | null exprs + -> New (Type nm) (map Var args) Nothing + _ -> error "pre-lifted constructor with arguments" +\end{code} diff --git a/compiler/javaGen/PrintJava.lhs b/compiler/javaGen/PrintJava.lhs new file mode 100644 index 0000000000..eb2811d38f --- /dev/null +++ b/compiler/javaGen/PrintJava.lhs @@ -0,0 +1,224 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-1998 +% +\section{Generate Java} + +\begin{code} +module PrintJava( compilationUnit ) where + +import Java +import Outputable +import Char( toLower ) +\end{code} + +\begin{code} +indent :: SDoc -> SDoc +indent = nest 2 +\end{code} + +%************************************************************************ +%* * +\subsection{Pretty printer} +%* * +%************************************************************************ + +\begin{code} +compilationUnit :: CompilationUnit -> SDoc +compilationUnit (Package n ds) = package n (decls ds) + +package = \n -> \ds -> + text "package" <+> packagename n <> text ";" + $$ + ds + +decls [] = empty +decls (d:ds) = decl d $$ decls ds + +decl = \d -> + case d of + { Import n -> importDecl (packagename n) + ; Field mfs n e -> field (modifiers mfs) (nameTy n) (name n) e + ; Constructor mfs 
n as ss -> constructor (modifiers mfs) (typename n) (parameters as) (statements ss) + ; Method mfs n as ts ss -> method (modifiers mfs) (nameTy n) (name n) (parameters as) (throws ts) (statements ss) + ; Comment s -> comment s + ; Interface mfs n is ms -> interface (modifiers mfs) (typename n) (extends is) (decls ms) + ; Class mfs n x is ms -> clazz (modifiers mfs) (typename n) (extends x) (implements is) (decls ms) + } + +importDecl n = text "import" <+> n <> text ";" + +field = \mfs -> \t -> \n -> \e -> + case e of + { Nothing -> mfs <+> t <+> n <> text ";" + ; Just e -> lay [mfs <+> t <+> n <+> text "=", indent (expr e <> text ";")] + where + lay | isSimple e = hsep + | otherwise = sep + } + +constructor = \mfs -> \n -> \as -> \ss -> + mfs <+> n <+> parens (hsep (punctuate comma as)) <+> text "{" + $$ indent ss + $$ text "}" + +method = \mfs -> \t -> \n -> \as -> \ts -> \ss -> + mfs <+> t <+> n <+> parens (hsep (punctuate comma as)) <+> ts <+> text "{" + $$ indent ss + $$ text "}" + +comment = \ss -> + text "/**" + $$ indent (vcat [ text s | s <- ss]) + $$ text "**/" + +interface = \mfs -> \n -> \xs -> \ms -> + mfs <+> n <+> xs <+> text "{" + $$ indent ms + $$ text "}" + +clazz = \mfs -> \n -> \x -> \is -> \ms -> + mfs <+> text "class" <+> n <+> x <+> is <+> text "{" + $$ indent ms + $$ text "}" + +modifiers mfs = hsep (map modifier mfs) + +modifier mf = text $ map toLower (show mf) + +extends [] = empty +extends xs = text "extends" <+> hsep (punctuate comma (map typename xs)) + +implements [] = empty +implements xs = text "implements" <+> hsep (punctuate comma (map typename xs)) + +throws [] = empty +throws xs = text "throws" <+> hsep (punctuate comma (map typename xs)) + +name (Name n t) = text n + +nameTy (Name n t) = typ t + +typename n = text n +packagename n = text n + +parameters as = map parameter as + +parameter (Parameter mfs n) = modifiers mfs <+> nameTy n <+> name n + +typ (PrimType s) = primtype s +typ (Type n) = typename n +typ (ArrayType t) = typ t <> text "[]" + +primtype PrimInt = text "int" +primtype PrimBoolean = text "boolean" +primtype PrimChar = text "char" +primtype PrimLong = text "long" +primtype PrimFloat = text "float" +primtype PrimDouble = text "double" +primtype PrimByte = text "byte" +primtype PrimVoid = text "void" + +statements ss = vcat (map statement ss) + +statement = \s -> + case s of + { Skip -> skip + ; Return e -> returnStat (expr e) + ; Block ss -> vcat [statement s | s <- ss] + ; ExprStatement e -> exprStatement (expr e) + ; Declaration d -> declStatement (decl d) + ; IfThenElse ecs s -> ifthenelse [ (expr e, statement s) | (e,s) <- ecs ] (maybe Nothing (Just .statement) s) + ; Switch e as d -> switch (expr e) (arms as) (deflt d) + } + +skip = empty + +returnStat e = sep [text "return", indent e <> semi] + +exprStatement e = e <> semi + +declStatement d = d + +ifthenelse ((e,s):ecs) ms = sep [ text "if" <+> parens e <+> text "{", + indent s, + thenelse ecs ms] + +thenelse ((e,s):ecs) ms = sep [ text "} else if" <+> parens e <+> text "{", + indent s, + thenelse ecs ms] + +thenelse [] Nothing = text "}" +thenelse [] (Just s) = sep [text "} else {", indent s, text "}"] + +switch = \e -> \as -> \d -> + text "switch" <+> parens e <+> text "{" + $$ indent (as $$ d) + $$ text "}" + +deflt Nothing = empty +deflt (Just ss) = text "default:" $$ indent (statements ss) + +arms [] = empty +arms ((e,ss):as) = text "case" <+> expr e <> colon + $$ indent (statements ss) + $$ arms as + +maybeExpr Nothing = Nothing +maybeExpr (Just e) = Just (expr e) + +expr = \e 
-> + case e of + { Var n -> name n + ; Literal l -> literal l + ; Cast t e -> cast (typ t) e + ; Access e n -> expr e <> text "." <> name n + ; Assign l r -> assign (expr l) r + ; New n es ds -> new (typ n) es (maybeClass ds) + ; Raise n es -> text "raise" <+> text n + <+> parens (hsep (punctuate comma (map expr es))) + ; Call e n es -> call (expr e) (name n) es + ; Op e1 o e2 -> op e1 o e2 + ; InstanceOf e t -> expr e <+> text "instanceof" <+> typ t + } + +op = \e1 -> \o -> \e2 -> + ( if isSimple e1 + then expr e1 + else parens (expr e1) + ) + <+> + text o + <+> + ( if isSimple e2 + then expr e2 + else parens (expr e2) + ) + +assign = \l -> \r -> + if isSimple r + then l <+> text "=" <+> (expr r) + else l <+> text "=" $$ indent (expr r) + +cast = \t -> \e -> + if isSimple e + then parens (parens t <> expr e) + else parens (parens t $$ indent (expr e)) + +new n [] (Just ds) = sep [text "new" <+> n <+> text "()" <+> text "{", + indent ds, + text "}"] +new n es Nothing = text "new" <+> n <> parens (hsep (punctuate comma (map expr es))) + + +call e n es = e <> dot <> n <> parens (hsep (punctuate comma (map expr es))) + +literal = \l -> + case l of + { IntLit i -> text (show i) + ; CharLit c -> text "(char)" <+> text (show c) + ; StringLit s -> text ("\"" ++ s ++ "\"") -- strings are already printable + } + +maybeClass Nothing = Nothing +maybeClass (Just ds) = Just (decls ds) +\end{code} diff --git a/compiler/main/CmdLineParser.hs b/compiler/main/CmdLineParser.hs new file mode 100644 index 0000000000..e34b8c0857 --- /dev/null +++ b/compiler/main/CmdLineParser.hs @@ -0,0 +1,139 @@ +----------------------------------------------------------------------------- +-- +-- Command-line parser +-- +-- This is an abstract command-line parser used by both StaticFlags and +-- DynFlags. 
+-- +-- (c) The University of Glasgow 2005 +-- +----------------------------------------------------------------------------- + +module CmdLineParser ( + processArgs, OptKind(..), + CmdLineP(..), getCmdLineState, putCmdLineState + ) where + +#include "HsVersions.h" + +import Util ( maybePrefixMatch, notNull, removeSpaces ) +#ifdef DEBUG +import Panic ( assertPanic ) +#endif + +data OptKind m + = NoArg (m ()) -- flag with no argument + | HasArg (String -> m ()) -- flag has an argument (maybe prefix) + | SepArg (String -> m ()) -- flag has a separate argument + | Prefix (String -> m ()) -- flag is a prefix only + | OptPrefix (String -> m ()) -- flag may be a prefix + | AnySuffix (String -> m ()) -- flag is a prefix, pass whole arg to fn + | PassFlag (String -> m ()) -- flag with no arg, pass flag to fn + | PrefixPred (String -> Bool) (String -> m ()) + | AnySuffixPred (String -> Bool) (String -> m ()) + +processArgs :: Monad m + => [(String, OptKind m)] -- cmdline parser spec + -> [String] -- args + -> m ( + [String], -- spare args + [String] -- errors + ) +processArgs spec args = process spec args [] [] + where + process _spec [] spare errs = + return (reverse spare, reverse errs) + + process spec args@(('-':arg):args') spare errs = + case findArg spec arg of + Just (rest,action) -> + case processOneArg action rest args of + Left err -> process spec args' spare (err:errs) + Right (action,rest) -> do + action >> process spec rest spare errs + Nothing -> + process spec args' (('-':arg):spare) errs + + process spec (arg:args) spare errs = + process spec args (arg:spare) errs + + +processOneArg :: OptKind m -> String -> [String] + -> Either String (m (), [String]) +processOneArg action rest (dash_arg@('-':arg):args) = + case action of + NoArg a -> ASSERT(null rest) Right (a, args) + + HasArg f -> + if rest /= "" + then Right (f rest, args) + else case args of + [] -> missingArgErr dash_arg + (arg1:args1) -> Right (f arg1, args1) + + SepArg f -> + case args of + [] -> unknownFlagErr dash_arg + (arg1:args1) -> Right (f arg1, args1) + + Prefix f -> + if rest /= "" + then Right (f rest, args) + else unknownFlagErr dash_arg + + PrefixPred p f -> + if rest /= "" + then Right (f rest, args) + else unknownFlagErr dash_arg + + OptPrefix f -> Right (f rest, args) + + AnySuffix f -> Right (f dash_arg, args) + + AnySuffixPred p f -> Right (f dash_arg, args) + + PassFlag f -> + if rest /= "" + then unknownFlagErr dash_arg + else Right (f dash_arg, args) + + +findArg :: [(String,OptKind a)] -> String -> Maybe (String,OptKind a) +findArg spec arg + = case [ (removeSpaces rest, k) + | (pat,k) <- spec, + Just rest <- [maybePrefixMatch pat arg], + arg_ok k rest arg ] + of + [] -> Nothing + (one:_) -> Just one + +arg_ok (NoArg _) rest arg = null rest +arg_ok (HasArg _) rest arg = True +arg_ok (SepArg _) rest arg = null rest +arg_ok (Prefix _) rest arg = notNull rest +arg_ok (PrefixPred p _) rest arg = notNull rest && p rest +arg_ok (OptPrefix _) rest arg = True +arg_ok (PassFlag _) rest arg = null rest +arg_ok (AnySuffix _) rest arg = True +arg_ok (AnySuffixPred p _) rest arg = p arg + +unknownFlagErr :: String -> Either String a +unknownFlagErr f = Left ("unrecognised flag: " ++ f) + +missingArgErr :: String -> Either String a +missingArgErr f = Left ("missing argument for flag: " ++ f) + +-- ----------------------------------------------------------------------------- +-- A state monad for use in the command-line parser + +newtype CmdLineP s a = CmdLineP { runCmdLine :: s -> (a, s) } + +instance Monad (CmdLineP 
s) where + return a = CmdLineP $ \s -> (a, s) + m >>= k = CmdLineP $ \s -> let + (a, s') = runCmdLine m s + in runCmdLine (k a) s' + +getCmdLineState = CmdLineP $ \s -> (s,s) +putCmdLineState s = CmdLineP $ \_ -> ((),s) diff --git a/compiler/main/CodeOutput.lhs b/compiler/main/CodeOutput.lhs new file mode 100644 index 0000000000..d1b293353a --- /dev/null +++ b/compiler/main/CodeOutput.lhs @@ -0,0 +1,303 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-1998 +% +\section{Code output phase} + +\begin{code} +module CodeOutput( codeOutput, outputForeignStubs ) where + +#include "HsVersions.h" + +#ifndef OMIT_NATIVE_CODEGEN +import UniqSupply ( mkSplitUniqSupply ) +import AsmCodeGen ( nativeCodeGen ) +#endif + +#ifdef ILX +import IlxGen ( ilxGen ) +#endif + +#ifdef JAVA +import JavaGen ( javaGen ) +import qualified PrintJava +import OccurAnal ( occurAnalyseBinds ) +#endif + +import Finder ( mkStubPaths ) +import PprC ( writeCs ) +import CmmLint ( cmmLint ) +import Packages +import Util +import FastString ( unpackFS ) +import Cmm ( Cmm ) +import HscTypes +import DynFlags +import ErrUtils ( dumpIfSet_dyn, showPass, ghcExit ) +import Outputable +import Pretty ( Mode(..), printDoc ) +import Module ( Module, ModLocation(..) ) +import List ( nub ) +import Maybes ( firstJust ) + +import Distribution.Package ( showPackageId ) +import Directory ( doesFileExist ) +import Monad ( when ) +import IO +\end{code} + +%************************************************************************ +%* * +\subsection{Steering} +%* * +%************************************************************************ + +\begin{code} +codeOutput :: DynFlags + -> Module + -> ModLocation + -> ForeignStubs + -> [PackageId] + -> [Cmm] -- Compiled C-- + -> IO (Bool{-stub_h_exists-}, Bool{-stub_c_exists-}) + +codeOutput dflags this_mod location foreign_stubs pkg_deps flat_abstractC + = + -- You can have C (c_output) or assembly-language (ncg_output), + -- but not both. [Allowing for both gives a space leak on + -- flat_abstractC. WDP 94/10] + + -- Dunno if the above comment is still meaningful now. JRS 001024. 
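+
+    -- In outline (a descriptive note added here, not in the original
+    -- source): lint the C-- first when Opt_DoCmmLinting is set, then write
+    -- any foreign-export stub files, and finally dispatch on hscTarget
+    -- below (HscC -> outputC, HscAsm -> outputAsm, HscJava/HscILX only when
+    -- compiled in, and HscInterpreted produces no output file at all).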
+ + do { when (dopt Opt_DoCmmLinting dflags) $ do + { showPass dflags "CmmLint" + ; let lints = map cmmLint flat_abstractC + ; case firstJust lints of + Just err -> do { printDump err + ; ghcExit dflags 1 + } + Nothing -> return () + } + + ; showPass dflags "CodeOutput" + ; let filenm = hscOutName dflags + ; stubs_exist <- outputForeignStubs dflags this_mod location foreign_stubs + ; case hscTarget dflags of { + HscInterpreted -> return (); + HscAsm -> outputAsm dflags filenm flat_abstractC; + HscC -> outputC dflags filenm this_mod location + flat_abstractC stubs_exist pkg_deps + foreign_stubs; + HscJava -> +#ifdef JAVA + outputJava dflags filenm mod_name tycons core_binds; +#else + panic "Java support not compiled into this ghc"; +#endif + HscILX -> +#ifdef ILX + let tycons = typeEnvTyCons type_env in + outputIlx dflags filenm mod_name tycons stg_binds; +#else + panic "ILX support not compiled into this ghc"; +#endif + } + ; return stubs_exist + } + +doOutput :: String -> (Handle -> IO ()) -> IO () +doOutput filenm io_action = bracket (openFile filenm WriteMode) hClose io_action +\end{code} + + +%************************************************************************ +%* * +\subsection{C} +%* * +%************************************************************************ + +\begin{code} +outputC dflags filenm mod location flat_absC + (stub_h_exists, _) packages foreign_stubs + = do + -- figure out which header files to #include in the generated .hc file: + -- + -- * extra_includes from packages + -- * -#include options from the cmdline and OPTIONS pragmas + -- * the _stub.h file, if there is one. + -- + pkg_configs <- getExplicitPackagesAnd dflags packages + let pkg_names = map (showPackageId.package) pkg_configs + + c_includes <- getPackageCIncludes pkg_configs + let cmdline_includes = cmdlineHcIncludes dflags -- -#include options + + ffi_decl_headers + = case foreign_stubs of + NoStubs -> [] + ForeignStubs _ _ fdhs _ -> map unpackFS (nub fdhs) + -- Remove duplicates, because distinct foreign import decls + -- may cite the same #include. Order doesn't matter. 
+ + all_headers = c_includes + ++ reverse cmdline_includes + ++ ffi_decl_headers + + let cc_injects = unlines (map mk_include all_headers) + mk_include h_file = + case h_file of + '"':_{-"-} -> "#include "++h_file + '<':_ -> "#include "++h_file + _ -> "#include \""++h_file++"\"" + + doOutput filenm $ \ h -> do + hPutStr h ("/* GHC_PACKAGES " ++ unwords pkg_names ++ "\n*/\n") + hPutStr h cc_injects + when stub_h_exists $ + hPutStrLn h ("#include \"" ++ (filenameOf stub_h) ++ "\"") + writeCs dflags h flat_absC + where + (_, stub_h) = mkStubPaths dflags mod location +\end{code} + + +%************************************************************************ +%* * +\subsection{Assembler} +%* * +%************************************************************************ + +\begin{code} +outputAsm dflags filenm flat_absC + +#ifndef OMIT_NATIVE_CODEGEN + + = do ncg_uniqs <- mkSplitUniqSupply 'n' + ncg_output_d <- _scc_ "NativeCodeGen" + nativeCodeGen dflags flat_absC ncg_uniqs + dumpIfSet_dyn dflags Opt_D_dump_asm "Asm code" (docToSDoc ncg_output_d) + _scc_ "OutputAsm" doOutput filenm $ + \f -> printDoc LeftMode f ncg_output_d + where + +#else /* OMIT_NATIVE_CODEGEN */ + + = pprPanic "This compiler was built without a native code generator" + (text "Use -fvia-C instead") + +#endif +\end{code} + + +%************************************************************************ +%* * +\subsection{Java} +%* * +%************************************************************************ + +\begin{code} +#ifdef JAVA +outputJava dflags filenm mod tycons core_binds + = doOutput filenm (\ f -> printForUser f alwaysQualify pp_java) + -- User style printing for now to keep indentation + where + occ_anal_binds = occurAnalyseBinds core_binds + -- Make sure we have up to date dead-var information + java_code = javaGen mod [{- Should be imports-}] tycons occ_anal_binds + pp_java = PrintJava.compilationUnit java_code +#endif +\end{code} + + +%************************************************************************ +%* * +\subsection{Ilx} +%* * +%************************************************************************ + +\begin{code} +#ifdef ILX +outputIlx dflags filename mod tycons stg_binds + = doOutput filename (\ f -> printForC f pp_ilx) + where + pp_ilx = ilxGen mod tycons stg_binds +#endif +\end{code} + + +%************************************************************************ +%* * +\subsection{Foreign import/export} +%* * +%************************************************************************ + +\begin{code} +outputForeignStubs :: DynFlags -> Module -> ModLocation -> ForeignStubs + -> IO (Bool, -- Header file created + Bool) -- C file created +outputForeignStubs dflags mod location stubs + | NoStubs <- stubs = do + -- When compiling External Core files, may need to use stub + -- files from a previous compilation + stub_c_exists <- doesFileExist stub_c + stub_h_exists <- doesFileExist stub_h + return (stub_h_exists, stub_c_exists) + + | ForeignStubs h_code c_code _ _ <- stubs + = do + let + stub_c_output_d = pprCode CStyle c_code + stub_c_output_w = showSDoc stub_c_output_d + + -- Header file protos for "foreign export"ed functions. 
+ stub_h_output_d = pprCode CStyle h_code + stub_h_output_w = showSDoc stub_h_output_d + -- in + + createDirectoryHierarchy (directoryOf stub_c) + + dumpIfSet_dyn dflags Opt_D_dump_foreign + "Foreign export header file" stub_h_output_d + + -- we need the #includes from the rts package for the stub files + let rtsid = rtsPackageId (pkgState dflags) + rts_includes + | ExtPackage pid <- rtsid = + let rts_pkg = getPackageDetails (pkgState dflags) pid in + concatMap mk_include (includes rts_pkg) + | otherwise = [] + mk_include i = "#include \"" ++ i ++ "\"\n" + + stub_h_file_exists + <- outputForeignStubs_help stub_h stub_h_output_w + ("#include \"HsFFI.h\"\n" ++ cplusplus_hdr) cplusplus_ftr + + dumpIfSet_dyn dflags Opt_D_dump_foreign + "Foreign export stubs" stub_c_output_d + + stub_c_file_exists + <- outputForeignStubs_help stub_c stub_c_output_w + ("#define IN_STG_CODE 0\n" ++ + "#include \"Rts.h\"\n" ++ + rts_includes ++ + cplusplus_hdr) + cplusplus_ftr + -- We're adding the default hc_header to the stub file, but this + -- isn't really HC code, so we need to define IN_STG_CODE==0 to + -- avoid the register variables etc. being enabled. + + return (stub_h_file_exists, stub_c_file_exists) + where + (stub_c, stub_h) = mkStubPaths dflags mod location + +cplusplus_hdr = "#ifdef __cplusplus\nextern \"C\" {\n#endif\n" +cplusplus_ftr = "#ifdef __cplusplus\n}\n#endif\n" + +-- Don't use doOutput for dumping the f. export stubs +-- since it is more than likely that the stubs file will +-- turn out to be empty, in which case no file should be created. +outputForeignStubs_help fname "" header footer = return False +outputForeignStubs_help fname doc_str header footer + = do writeFile fname (header ++ doc_str ++ '\n':footer ++ "\n") + return True +\end{code} + diff --git a/compiler/main/Constants.lhs b/compiler/main/Constants.lhs new file mode 100644 index 0000000000..43db93249a --- /dev/null +++ b/compiler/main/Constants.lhs @@ -0,0 +1,150 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[Constants]{Info about this compilation} + +\begin{code} +module Constants (module Constants) where + +-- This magical #include brings in all the everybody-knows-these magic +-- constants unfortunately, we need to be *explicit* about which one +-- we want; if we just hope a -I... will get the right one, we could +-- be in trouble. + +#include "HsVersions.h" +#include "../includes/MachRegs.h" +#include "../includes/Constants.h" +#include "../includes/MachDeps.h" +#include "../includes/DerivedConstants.h" + +-- import Util +\end{code} + +All pretty arbitrary: + +\begin{code} +mAX_TUPLE_SIZE = (62 :: Int) -- Should really match the number + -- of decls in Data.Tuple +mAX_CONTEXT_REDUCTION_DEPTH = (20 :: Int) +\end{code} + + +\begin{code} +-- specialised fun/thunk/constr closure types +mAX_SPEC_THUNK_SIZE = (MAX_SPEC_THUNK_SIZE :: Int) +mAX_SPEC_FUN_SIZE = (MAX_SPEC_FUN_SIZE :: Int) +mAX_SPEC_CONSTR_SIZE = (MAX_SPEC_CONSTR_SIZE :: Int) + +-- pre-compiled thunk types +mAX_SPEC_SELECTEE_SIZE = (MAX_SPEC_SELECTEE_SIZE :: Int) +mAX_SPEC_AP_SIZE = (MAX_SPEC_AP_SIZE :: Int) + +-- closure sizes: these do NOT include the header (see below for header sizes) +mIN_PAYLOAD_SIZE = (MIN_PAYLOAD_SIZE::Int) +\end{code} + +\begin{code} +mIN_INTLIKE, mAX_INTLIKE :: Int +mIN_INTLIKE = MIN_INTLIKE +mAX_INTLIKE = MAX_INTLIKE + +mIN_CHARLIKE, mAX_CHARLIKE :: Int +mIN_CHARLIKE = MIN_CHARLIKE +mAX_CHARLIKE = MAX_CHARLIKE +\end{code} + +A section of code-generator-related MAGIC CONSTANTS. 
+ +\begin{code} +mAX_FAMILY_SIZE_FOR_VEC_RETURNS = (MAX_VECTORED_RTN::Int) -- pretty arbitrary +-- If you change this, you may need to change runtimes/standard/Update.lhc +\end{code} + +\begin{code} +mAX_Vanilla_REG = (MAX_VANILLA_REG :: Int) +mAX_Float_REG = (MAX_FLOAT_REG :: Int) +mAX_Double_REG = (MAX_DOUBLE_REG :: Int) +mAX_Long_REG = (MAX_LONG_REG :: Int) + +mAX_Real_Vanilla_REG = (MAX_REAL_VANILLA_REG :: Int) +mAX_Real_Float_REG = (MAX_REAL_FLOAT_REG :: Int) +mAX_Real_Double_REG = (MAX_REAL_DOUBLE_REG :: Int) +#ifdef MAX_REAL_LONG_REG +mAX_Real_Long_REG = (MAX_REAL_LONG_REG :: Int) +#else +mAX_Real_Long_REG = (0::Int) +#endif +\end{code} + +Closure header sizes. + +\begin{code} +sTD_HDR_SIZE = (STD_HDR_SIZE :: Int) +pROF_HDR_SIZE = (PROF_HDR_SIZE :: Int) +gRAN_HDR_SIZE = (GRAN_HDR_SIZE :: Int) +\end{code} + +Info Table sizes. + +\begin{code} +sTD_ITBL_SIZE = (STD_ITBL_SIZE :: Int) +rET_ITBL_SIZE = (RET_ITBL_SIZE :: Int) +pROF_ITBL_SIZE = (PROF_ITBL_SIZE :: Int) +gRAN_ITBL_SIZE = (GRAN_ITBL_SIZE :: Int) +tICKY_ITBL_SIZE = (TICKY_ITBL_SIZE :: Int) +\end{code} + +Size of a double in StgWords. + +\begin{code} +dOUBLE_SIZE = SIZEOF_DOUBLE :: Int +wORD64_SIZE = 8 :: Int +iNT64_SIZE = wORD64_SIZE +\end{code} + +This tells the native code generator the size of the spill +area is has available. + +\begin{code} +rESERVED_C_STACK_BYTES = (RESERVED_C_STACK_BYTES :: Int) +\end{code} + +The amount of (Haskell) stack to leave free for saving registers when +returning to the scheduler. + +\begin{code} +rESERVED_STACK_WORDS = (RESERVED_STACK_WORDS :: Int) +\end{code} + +Size of a word, in bytes + +\begin{code} +wORD_SIZE = (SIZEOF_HSWORD :: Int) +wORD_SIZE_IN_BITS = wORD_SIZE * 8 :: Int +\end{code} + +Size of a C int, in bytes. May be smaller than wORD_SIZE. + +\begin{code} +cINT_SIZE = (SIZEOF_INT :: Int) +\end{code} + +Size of a storage manager block (in bytes). + +\begin{code} +bLOCK_SIZE = (BLOCK_SIZE :: Int) +bLOCK_SIZE_W = (bLOCK_SIZE `quot` wORD_SIZE :: Int) +\end{code} + +Number of bits to shift a bitfield left by in an info table. + +\begin{code} +bITMAP_BITS_SHIFT = (BITMAP_BITS_SHIFT :: Int) +\end{code} + +Constants derived from headers in ghc/includes, generated by the program +../includes/mkDerivedConstants.c. + +\begin{code} +#include "../includes/GHCConstants.h" +\end{code} diff --git a/compiler/main/DriverMkDepend.hs b/compiler/main/DriverMkDepend.hs new file mode 100644 index 0000000000..80d906c4a7 --- /dev/null +++ b/compiler/main/DriverMkDepend.hs @@ -0,0 +1,342 @@ +----------------------------------------------------------------------------- +-- +-- Makefile Dependency Generation +-- +-- (c) The University of Glasgow 2005 +-- +----------------------------------------------------------------------------- + +module DriverMkDepend ( + doMkDependHS + ) where + +#include "HsVersions.h" + +import qualified GHC +import GHC ( Session, ModSummary(..) ) +import DynFlags ( DynFlags( verbosity, opt_dep ), getOpts ) +import Util ( escapeSpaces, splitFilename, joinFileExt ) +import HscTypes ( HscEnv, IsBootInterface, msObjFilePath, msHsFilePath ) +import Packages ( PackageIdH(..) ) +import SysTools ( newTempName ) +import qualified SysTools +import Module ( Module, ModLocation(..), mkModule, + addBootSuffix_maybe ) +import Digraph ( SCC(..) ) +import Finder ( findModule, FindResult(..) 
) +import Util ( global, consIORef ) +import Outputable +import Panic +import SrcLoc ( unLoc ) +import CmdLineParser + +#if __GLASGOW_HASKELL__ <= 408 +import Panic ( catchJust, ioErrors ) +#endif +import ErrUtils ( debugTraceMsg, printErrorsAndWarnings ) + +import DATA_IOREF ( IORef, readIORef, writeIORef ) +import EXCEPTION + +import System ( ExitCode(..), exitWith ) +import Directory +import IO +import Monad ( when ) +import Maybe ( isJust ) + +----------------------------------------------------------------- +-- +-- The main function +-- +----------------------------------------------------------------- + +doMkDependHS :: Session -> [FilePath] -> IO () +doMkDependHS session srcs + = do { -- Initialisation + dflags <- GHC.getSessionDynFlags session + ; files <- beginMkDependHS dflags + + -- Do the downsweep to find all the modules + ; targets <- mapM (\s -> GHC.guessTarget s Nothing) srcs + ; GHC.setTargets session targets + ; excl_mods <- readIORef v_Dep_exclude_mods + ; r <- GHC.depanal session excl_mods True {- Allow dup roots -} + ; case r of + Nothing -> exitWith (ExitFailure 1) + Just mod_summaries -> do { + + -- Sort into dependency order + -- There should be no cycles + let sorted = GHC.topSortModuleGraph False mod_summaries Nothing + + -- Print out the dependencies if wanted + ; debugTraceMsg dflags 2 (text "Module dependencies" $$ ppr sorted) + + -- Prcess them one by one, dumping results into makefile + -- and complaining about cycles + ; mapM (processDeps session excl_mods (mkd_tmp_hdl files)) sorted + + -- Tidy up + ; endMkDependHS dflags files }} + +----------------------------------------------------------------- +-- +-- beginMkDependHs +-- Create a temporary file, +-- find the Makefile, +-- slurp through it, etc +-- +----------------------------------------------------------------- + +data MkDepFiles + = MkDep { mkd_make_file :: FilePath, -- Name of the makefile + mkd_make_hdl :: Maybe Handle, -- Handle for the open makefile + mkd_tmp_file :: FilePath, -- Name of the temporary file + mkd_tmp_hdl :: Handle } -- Handle of the open temporary file + +beginMkDependHS :: DynFlags -> IO MkDepFiles + +beginMkDependHS dflags = do + -- slurp in the mkdependHS-style options + let flags = getOpts dflags opt_dep + _ <- processArgs dep_opts flags + + -- open a new temp file in which to stuff the dependency info + -- as we go along. 
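+  -- As a concrete illustration (an added note; the module names are made
+  -- up), the dependency block that ends up between the two markers in the
+  -- Makefile looks like
+  --
+  --     # DO NOT DELETE: Beginning of Haskell dependencies
+  --     A.o : A.hs
+  --     A.o : B.hi
+  --     # DO NOT DELETE: End of Haskell dependencies
+  --
+  -- writeDependency below produces the individual "target : dep" lines.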
+ tmp_file <- newTempName dflags "dep" + tmp_hdl <- openFile tmp_file WriteMode + + -- open the makefile + makefile <- readIORef v_Dep_makefile + exists <- doesFileExist makefile + mb_make_hdl <- + if not exists + then return Nothing + else do + makefile_hdl <- openFile makefile ReadMode + + -- slurp through until we get the magic start string, + -- copying the contents into dep_makefile + let slurp = do + l <- hGetLine makefile_hdl + if (l == depStartMarker) + then return () + else do hPutStrLn tmp_hdl l; slurp + + -- slurp through until we get the magic end marker, + -- throwing away the contents + let chuck = do + l <- hGetLine makefile_hdl + if (l == depEndMarker) + then return () + else chuck + + catchJust ioErrors slurp + (\e -> if isEOFError e then return () else ioError e) + catchJust ioErrors chuck + (\e -> if isEOFError e then return () else ioError e) + + return (Just makefile_hdl) + + + -- write the magic marker into the tmp file + hPutStrLn tmp_hdl depStartMarker + + return (MkDep { mkd_make_file = makefile, mkd_make_hdl = mb_make_hdl, + mkd_tmp_file = tmp_file, mkd_tmp_hdl = tmp_hdl}) + + +----------------------------------------------------------------- +-- +-- processDeps +-- +----------------------------------------------------------------- + +processDeps :: Session + -> [Module] + -> Handle -- Write dependencies to here + -> SCC ModSummary + -> IO () +-- Write suitable dependencies to handle +-- Always: +-- this.o : this.hs +-- +-- If the dependency is on something other than a .hi file: +-- this.o this.p_o ... : dep +-- otherwise +-- this.o ... : dep.hi +-- this.p_o ... : dep.p_hi +-- ... +-- (where .o is $osuf, and the other suffixes come from +-- the cmdline -s options). +-- +-- For {-# SOURCE #-} imports the "hi" will be "hi-boot". + +processDeps session excl_mods hdl (CyclicSCC nodes) + = -- There shouldn't be any cycles; report them + throwDyn (ProgramError (showSDoc $ GHC.cyclicModuleErr nodes)) + +processDeps session excl_mods hdl (AcyclicSCC node) + = do { extra_suffixes <- readIORef v_Dep_suffixes + ; hsc_env <- GHC.sessionHscEnv session + ; include_pkg_deps <- readIORef v_Dep_include_pkg_deps + ; let src_file = msHsFilePath node + obj_file = msObjFilePath node + obj_files = insertSuffixes obj_file extra_suffixes + + do_imp is_boot imp_mod + = do { mb_hi <- findDependency hsc_env src_file imp_mod + is_boot include_pkg_deps + ; case mb_hi of { + Nothing -> return () ; + Just hi_file -> do + { let hi_files = insertSuffixes hi_file extra_suffixes + write_dep (obj,hi) = writeDependency hdl [obj] hi + + -- Add one dependency for each suffix; + -- e.g. 
A.o : B.hi + -- A.x_o : B.x_hi + ; mapM_ write_dep (obj_files `zip` hi_files) }}} + + + -- Emit std dependency of the object(s) on the source file + -- Something like A.o : A.hs + ; writeDependency hdl obj_files src_file + + -- Emit a dependency for each import + + -- SOURCE imports + ; mapM_ (do_imp True) + (filter (`notElem` excl_mods) (map unLoc (ms_srcimps node))) + + -- regular imports + ; mapM_ (do_imp False) + (filter (`notElem` excl_mods) (map unLoc (ms_imps node))) + } + + +findDependency :: HscEnv + -> FilePath -- Importing module: used only for error msg + -> Module -- Imported module + -> IsBootInterface -- Source import + -> Bool -- Record dependency on package modules + -> IO (Maybe FilePath) -- Interface file file +findDependency hsc_env src imp is_boot include_pkg_deps + = do { -- Find the module; this will be fast because + -- we've done it once during downsweep + r <- findModule hsc_env imp True {-explicit-} + ; case r of + Found loc pkg + -- Not in this package: we don't need a dependency + | ExtPackage _ <- pkg, not include_pkg_deps + -> return Nothing + + -- Home package: just depend on the .hi or hi-boot file + | otherwise + -> return (Just (addBootSuffix_maybe is_boot (ml_hi_file loc))) + + _ -> panic "findDependency" + } + +----------------------------- +writeDependency :: Handle -> [FilePath] -> FilePath -> IO () +-- (writeDependency h [t1,t2] dep) writes to handle h the dependency +-- t1 t2 : dep +writeDependency hdl targets dep + = hPutStrLn hdl (unwords (map escapeSpaces targets) ++ " : " + ++ escapeSpaces dep) + +----------------------------- +insertSuffixes + :: FilePath -- Original filename; e.g. "foo.o" + -> [String] -- Extra suffices e.g. ["x","y"] + -> [FilePath] -- Zapped filenames e.g. ["foo.o", "foo.x_o", "foo.y_o"] + -- Note that that the extra bit gets inserted *before* the old suffix + -- We assume the old suffix contains no dots, so we can strip it with removeSuffix + + -- NOTE: we used to have this comment + -- In order to construct hi files with alternate suffixes, we + -- now have to find the "basename" of the hi file. This is + -- difficult because we can't just split the hi filename + -- at the last dot - the hisuf might have dots in it. So we + -- check whether the hi filename ends in hisuf, and if it does, + -- we strip off hisuf, otherwise we strip everything after the + -- last dot. + -- But I'm not sure we care about hisufs with dots in them. + -- Lots of other things will break first! 
+ +insertSuffixes file_name extras + = file_name : [ basename `joinFileExt` (extra ++ "_" ++ suffix) | extra <- extras ] + where + (basename, suffix) = splitFilename file_name + + +----------------------------------------------------------------- +-- +-- endMkDependHs +-- Complete the makefile, close the tmp file etc +-- +----------------------------------------------------------------- + +endMkDependHS :: DynFlags -> MkDepFiles -> IO () + +endMkDependHS dflags + (MkDep { mkd_make_file = makefile, mkd_make_hdl = makefile_hdl, + mkd_tmp_file = tmp_file, mkd_tmp_hdl = tmp_hdl }) + = do + -- write the magic marker into the tmp file + hPutStrLn tmp_hdl depEndMarker + + case makefile_hdl of + Nothing -> return () + Just hdl -> do + + -- slurp the rest of the original makefile and copy it into the output + let slurp = do + l <- hGetLine hdl + hPutStrLn tmp_hdl l + slurp + + catchJust ioErrors slurp + (\e -> if isEOFError e then return () else ioError e) + + hClose hdl + + hClose tmp_hdl -- make sure it's flushed + + -- Create a backup of the original makefile + when (isJust makefile_hdl) + (SysTools.copy dflags ("Backing up " ++ makefile) + makefile (makefile++".bak")) + + -- Copy the new makefile in place + SysTools.copy dflags "Installing new makefile" tmp_file makefile + + +----------------------------------------------------------------- +-- +-- Flags +-- +----------------------------------------------------------------- + + -- Flags +GLOBAL_VAR(v_Dep_makefile, "Makefile", String); +GLOBAL_VAR(v_Dep_include_pkg_deps, False, Bool); +GLOBAL_VAR(v_Dep_exclude_mods, [], [Module]); +GLOBAL_VAR(v_Dep_suffixes, [], [String]); +GLOBAL_VAR(v_Dep_warnings, True, Bool); + +depStartMarker = "# DO NOT DELETE: Beginning of Haskell dependencies" +depEndMarker = "# DO NOT DELETE: End of Haskell dependencies" + +-- for compatibility with the old mkDependHS, we accept options of the form +-- -optdep-f -optdep.depend, etc. +dep_opts = + [ ( "s", SepArg (consIORef v_Dep_suffixes) ) + , ( "f", SepArg (writeIORef v_Dep_makefile) ) + , ( "w", NoArg (writeIORef v_Dep_warnings False) ) + , ( "-include-prelude", NoArg (writeIORef v_Dep_include_pkg_deps True) ) + , ( "-include-pkg-deps", NoArg (writeIORef v_Dep_include_pkg_deps True) ) + , ( "-exclude-module=", Prefix (consIORef v_Dep_exclude_mods . mkModule) ) + , ( "x", Prefix (consIORef v_Dep_exclude_mods . 
mkModule) ) + ] diff --git a/compiler/main/DriverPhases.hs b/compiler/main/DriverPhases.hs new file mode 100644 index 0000000000..6e945314cb --- /dev/null +++ b/compiler/main/DriverPhases.hs @@ -0,0 +1,229 @@ +----------------------------------------------------------------------------- +-- $Id: DriverPhases.hs,v 1.38 2005/05/17 11:01:59 simonmar Exp $ +-- +-- GHC Driver +-- +-- (c) The University of Glasgow 2002 +-- +----------------------------------------------------------------------------- + +module DriverPhases ( + HscSource(..), isHsBoot, hscSourceString, + Phase(..), + happensBefore, eqPhase, anyHsc, isStopLn, + startPhase, -- :: String -> Phase + phaseInputExt, -- :: Phase -> String + + isHaskellishSuffix, + isHaskellSrcSuffix, + isObjectSuffix, + isCishSuffix, + isExtCoreSuffix, + isDynLibSuffix, + isHaskellUserSrcSuffix, + isSourceSuffix, + + isHaskellishFilename, + isHaskellSrcFilename, + isObjectFilename, + isCishFilename, + isExtCoreFilename, + isDynLibFilename, + isHaskellUserSrcFilename, + isSourceFilename -- :: FilePath -> Bool + ) where + +import Util ( suffixOf ) +import Panic ( panic ) + +----------------------------------------------------------------------------- +-- Phases + +{- + Phase of the | Suffix saying | Flag saying | (suffix of) + compilation system | ``start here''| ``stop after''| output file + + literate pre-processor | .lhs | - | - + C pre-processor (opt.) | - | -E | - + Haskell compiler | .hs | -C, -S | .hc, .s + C compiler (opt.) | .hc or .c | -S | .s + assembler | .s or .S | -c | .o + linker | other | - | a.out +-} + +data HscSource + = HsSrcFile | HsBootFile | ExtCoreFile + deriving( Eq, Ord, Show ) + -- Ord needed for the finite maps we build in CompManager + + +hscSourceString :: HscSource -> String +hscSourceString HsSrcFile = "" +hscSourceString HsBootFile = "[boot]" +hscSourceString ExtCoreFile = "[ext core]" + +isHsBoot :: HscSource -> Bool +isHsBoot HsBootFile = True +isHsBoot other = False + +data Phase + = Unlit HscSource + | Cpp HscSource + | HsPp HscSource + | Hsc HscSource + | Cc + | HCc -- Haskellised C (as opposed to vanilla C) compilation + | Mangle -- assembly mangling, now done by a separate script. + | SplitMangle -- after mangler if splitting + | SplitAs + | As + | CmmCpp -- pre-process Cmm source + | Cmm -- parse & compile Cmm code + + -- The final phase is a pseudo-phase that tells the pipeline to stop. + -- There is no runPhase case for it. + | StopLn -- Stop, but linking will follow, so generate .o file + deriving (Eq, Show) + +anyHsc :: Phase +anyHsc = Hsc (panic "anyHsc") + +isStopLn :: Phase -> Bool +isStopLn StopLn = True +isStopLn other = False + +eqPhase :: Phase -> Phase -> Bool +-- Equality of constructors, ignoring the HscSource field +-- NB: the HscSource field can be 'bot'; see anyHsc above +eqPhase (Unlit _) (Unlit _) = True +eqPhase (Cpp _) (Cpp _) = True +eqPhase (HsPp _) (HsPp _) = True +eqPhase (Hsc _) (Hsc _) = True +eqPhase Cc Cc = True +eqPhase HCc HCc = True +eqPhase Mangle Mangle = True +eqPhase SplitMangle SplitMangle = True +eqPhase SplitAs SplitAs = True +eqPhase As As = True +eqPhase CmmCpp CmmCpp = True +eqPhase Cmm Cmm = True +eqPhase StopLn StopLn = True +eqPhase _ _ = False + +-- Partial ordering on phases: we want to know which phases will occur before +-- which others. This is used for sanity checking, to ensure that the +-- pipeline will stop at some point (see DriverPipeline.runPipeline). 
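+-- For example (an illustrative note, not from the original source): for a
+-- plain .hs file, nextPhase below chains
+--   Cpp -> HsPp -> Hsc -> HCc -> Mangle -> SplitMangle -> As -> SplitAs -> StopLn
+-- so
+--   Cpp HsSrcFile `happensBefore` As  ==  True
+--   As `happensBefore` Cpp HsSrcFile  ==  False
+-- and StopLn `happensBefore` p is False for every p.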
+StopLn `happensBefore` y = False +x `happensBefore` y = after_x `eqPhase` y || after_x `happensBefore` y + where + after_x = nextPhase x + +nextPhase :: Phase -> Phase +-- A conservative approximation the next phase, used in happensBefore +nextPhase (Unlit sf) = Cpp sf +nextPhase (Cpp sf) = HsPp sf +nextPhase (HsPp sf) = Hsc sf +nextPhase (Hsc sf) = HCc +nextPhase HCc = Mangle +nextPhase Mangle = SplitMangle +nextPhase SplitMangle = As +nextPhase As = SplitAs +nextPhase SplitAs = StopLn +nextPhase Cc = As +nextPhase CmmCpp = Cmm +nextPhase Cmm = HCc +nextPhase StopLn = panic "nextPhase: nothing after StopLn" + +-- the first compilation phase for a given file is determined +-- by its suffix. +startPhase "lhs" = Unlit HsSrcFile +startPhase "lhs-boot" = Unlit HsBootFile +startPhase "hs" = Cpp HsSrcFile +startPhase "hs-boot" = Cpp HsBootFile +startPhase "hscpp" = HsPp HsSrcFile +startPhase "hspp" = Hsc HsSrcFile +startPhase "hcr" = Hsc ExtCoreFile +startPhase "hc" = HCc +startPhase "c" = Cc +startPhase "cpp" = Cc +startPhase "C" = Cc +startPhase "cc" = Cc +startPhase "cxx" = Cc +startPhase "raw_s" = Mangle +startPhase "split_s" = SplitMangle +startPhase "s" = As +startPhase "S" = As +startPhase "o" = StopLn +startPhase "cmm" = CmmCpp +startPhase "cmmcpp" = Cmm +startPhase _ = StopLn -- all unknown file types + +-- This is used to determine the extension for the output from the +-- current phase (if it generates a new file). The extension depends +-- on the next phase in the pipeline. +phaseInputExt (Unlit HsSrcFile) = "lhs" +phaseInputExt (Unlit HsBootFile) = "lhs-boot" +phaseInputExt (Unlit ExtCoreFile) = "lhcr" +phaseInputExt (Cpp _) = "lpp" -- intermediate only +phaseInputExt (HsPp _) = "hscpp" -- intermediate only +phaseInputExt (Hsc _) = "hspp" -- intermediate only + -- NB: as things stand, phaseInputExt (Hsc x) must not evaluate x + -- because runPipeline uses the StopBefore phase to pick the + -- output filename. That could be fixed, but watch out. 
+phaseInputExt HCc = "hc" +phaseInputExt Cc = "c" +phaseInputExt Mangle = "raw_s" +phaseInputExt SplitMangle = "split_s" -- not really generated +phaseInputExt As = "s" +phaseInputExt SplitAs = "split_s" -- not really generated +phaseInputExt CmmCpp = "cmm" +phaseInputExt Cmm = "cmmcpp" +phaseInputExt StopLn = "o" +#ifdef ILX +phaseInputExt Ilx2Il = "ilx" +phaseInputExt Ilasm = "il" +#endif + +haskellish_src_suffixes = haskellish_user_src_suffixes ++ + [ "hspp", "hscpp", "hcr", "cmm" ] +haskellish_suffixes = haskellish_src_suffixes ++ ["hc", "raw_s"] +cish_suffixes = [ "c", "cpp", "C", "cc", "cxx", "s", "S" ] +extcoreish_suffixes = [ "hcr" ] +haskellish_user_src_suffixes = [ "hs", "lhs", "hs-boot", "lhs-boot" ] -- Will not be deleted as temp files + +-- Use the appropriate suffix for the system on which +-- the GHC-compiled code will run +#if mingw32_TARGET_OS || cygwin32_TARGET_OS +objish_suffixes = [ "o", "O", "obj", "OBJ" ] +#else +objish_suffixes = [ "o" ] +#endif + +#ifdef mingw32_TARGET_OS +dynlib_suffixes = ["dll", "DLL"] +#elif defined(darwin_TARGET_OS) +dynlib_suffixes = ["dylib"] +#else +dynlib_suffixes = ["so"] +#endif + +isHaskellishSuffix s = s `elem` haskellish_suffixes +isHaskellSrcSuffix s = s `elem` haskellish_src_suffixes +isCishSuffix s = s `elem` cish_suffixes +isExtCoreSuffix s = s `elem` extcoreish_suffixes +isObjectSuffix s = s `elem` objish_suffixes +isHaskellUserSrcSuffix s = s `elem` haskellish_user_src_suffixes +isDynLibSuffix s = s `elem` dynlib_suffixes + +isSourceSuffix suff = isHaskellishSuffix suff || isCishSuffix suff + +isHaskellishFilename f = isHaskellishSuffix (suffixOf f) +isHaskellSrcFilename f = isHaskellSrcSuffix (suffixOf f) +isCishFilename f = isCishSuffix (suffixOf f) +isExtCoreFilename f = isExtCoreSuffix (suffixOf f) +isObjectFilename f = isObjectSuffix (suffixOf f) +isHaskellUserSrcFilename f = isHaskellUserSrcSuffix (suffixOf f) +isDynLibFilename f = isDynLibSuffix (suffixOf f) +isSourceFilename f = isSourceSuffix (suffixOf f) + + diff --git a/compiler/main/DriverPipeline.hs b/compiler/main/DriverPipeline.hs new file mode 100644 index 0000000000..e20bc56940 --- /dev/null +++ b/compiler/main/DriverPipeline.hs @@ -0,0 +1,1405 @@ +----------------------------------------------------------------------------- +-- +-- GHC Driver +-- +-- (c) The University of Glasgow 2005 +-- +----------------------------------------------------------------------------- + +module DriverPipeline ( + -- Run a series of compilation steps in a pipeline, for a + -- collection of source files. + oneShot, compileFile, + + -- Interfaces for the batch-mode driver + staticLink, + + -- Interfaces for the compilation manager (interpreted/batch-mode) + preprocess, + compile, CompResult(..), + link, + + -- DLL building + doMkDLL, + + ) where + +#include "HsVersions.h" + +import Packages +import HeaderInfo +import DriverPhases +import SysTools ( newTempName, addFilesToClean, getSysMan, copy ) +import qualified SysTools +import HscMain +import Finder +import HscTypes +import Outputable +import Module +import ErrUtils +import DynFlags +import StaticFlags ( v_Ld_inputs, opt_Static, WayName(..) ) +import Config +import Panic +import Util +import StringBuffer ( hGetStringBuffer ) +import BasicTypes ( SuccessFlag(..) ) +import Maybes ( expectJust ) +import ParserCoreUtils ( getCoreModuleName ) +import SrcLoc ( unLoc ) +import SrcLoc ( Located(..) ) + +import EXCEPTION +import DATA_IOREF ( readIORef, writeIORef, IORef ) +import GLAEXTS ( Int(..) 
) + +import Directory +import System +import IO +import Monad +import Data.List ( isSuffixOf ) +import Maybe + + +-- --------------------------------------------------------------------------- +-- Pre-process + +-- Just preprocess a file, put the result in a temp. file (used by the +-- compilation manager during the summary phase). +-- +-- We return the augmented DynFlags, because they contain the result +-- of slurping in the OPTIONS pragmas + +preprocess :: DynFlags -> (FilePath, Maybe Phase) -> IO (DynFlags, FilePath) +preprocess dflags (filename, mb_phase) = + ASSERT2(isJust mb_phase || isHaskellSrcFilename filename, text filename) + runPipeline anyHsc dflags (filename, mb_phase) Temporary Nothing{-no ModLocation-} + +-- --------------------------------------------------------------------------- +-- Compile + +-- Compile a single module, under the control of the compilation manager. +-- +-- This is the interface between the compilation manager and the +-- compiler proper (hsc), where we deal with tedious details like +-- reading the OPTIONS pragma from the source file, and passing the +-- output of hsc through the C compiler. + +-- NB. No old interface can also mean that the source has changed. + +compile :: HscEnv + -> ModSummary + -> Maybe Linkable -- Just linkable <=> source unchanged + -> Maybe ModIface -- Old interface, if available + -> Int -> Int + -> IO CompResult + +data CompResult + = CompOK ModDetails -- New details + ModIface -- New iface + (Maybe Linkable) -- a Maybe, for the same reasons as hm_linkable + + | CompErrs + + +compile hsc_env mod_summary maybe_old_linkable old_iface mod_index nmods = do + + let dflags0 = ms_hspp_opts mod_summary + this_mod = ms_mod mod_summary + src_flavour = ms_hsc_src mod_summary + + have_object + | Just l <- maybe_old_linkable, isObjectLinkable l = True + | otherwise = False + + -- FIXME: We need to know whether or not we're recompiling the file. Move this to HscMain? + --showPass dflags0 ("Compiling " ++ showModMsg have_object mod_summary) + + let location = ms_location mod_summary + let input_fn = expectJust "compile:hs" (ml_hs_file location) + let input_fnpp = ms_hspp_file mod_summary + + debugTraceMsg dflags0 2 (text "compile: input file" <+> text input_fnpp) + + let (basename, _) = splitFilename input_fn + + -- We add the directory in which the .hs files resides) to the import path. + -- This is needed when we try to compile the .hc file later, if it + -- imports a _stub.h file that we created here. + let current_dir = directoryOf basename + old_paths = includePaths dflags0 + dflags = dflags0 { includePaths = current_dir : old_paths } + + -- Figure out what lang we're generating + let hsc_lang = hscMaybeAdjustTarget dflags StopLn src_flavour (hscTarget dflags) + -- ... and what the next phase should be + let next_phase = hscNextPhase dflags src_flavour hsc_lang + -- ... 
and what file to generate the output into + output_fn <- getOutputFilename dflags next_phase + Temporary basename next_phase (Just location) + + let dflags' = dflags { hscTarget = hsc_lang, + hscOutName = output_fn, + extCoreName = basename ++ ".hcr" } + + -- -no-recomp should also work with --make + let do_recomp = dopt Opt_RecompChecking dflags + source_unchanged = isJust maybe_old_linkable && do_recomp + hsc_env' = hsc_env { hsc_dflags = dflags' } + object_filename = ml_obj_file location + + let getStubLinkable False = return [] + getStubLinkable True + = do stub_o <- compileStub dflags' this_mod location + return [ DotO stub_o ] + + handleBatch (HscNoRecomp, iface, details) + = ASSERT (isJust maybe_old_linkable) + return (CompOK details iface maybe_old_linkable) + handleBatch (HscRecomp hasStub, iface, details) + | isHsBoot src_flavour + = return (CompOK details iface Nothing) + | otherwise + = do stub_unlinked <- getStubLinkable hasStub + (hs_unlinked, unlinked_time) <- + case hsc_lang of + HscNothing + -> return ([], ms_hs_date mod_summary) + -- We're in --make mode: finish the compilation pipeline. + _other + -> do runPipeline StopLn dflags (output_fn,Nothing) Persistent + (Just location) + -- The object filename comes from the ModLocation + o_time <- getModificationTime object_filename + return ([DotO object_filename], o_time) + let linkable = LM unlinked_time this_mod + (hs_unlinked ++ stub_unlinked) + return (CompOK details iface (Just linkable)) + + handleInterpreted (InteractiveNoRecomp, iface, details) + = ASSERT (isJust maybe_old_linkable) + return (CompOK details iface maybe_old_linkable) + handleInterpreted (InteractiveRecomp hasStub comp_bc, iface, details) + = do stub_unlinked <- getStubLinkable hasStub + let hs_unlinked = [BCOs comp_bc] + unlinked_time = ms_hs_date mod_summary + -- Why do we use the timestamp of the source file here, + -- rather than the current time? This works better in + -- the case where the local clock is out of sync + -- with the filesystem's clock. It's just as accurate: + -- if the source is modified, then the linkable will + -- be out of date. + let linkable = LM unlinked_time this_mod + (hs_unlinked ++ stub_unlinked) + return (CompOK details iface (Just linkable)) + + let runCompiler compiler handle + = do mbResult <- compiler hsc_env' mod_summary + source_unchanged old_iface + (Just (mod_index, nmods)) + case mbResult of + Nothing -> return CompErrs + Just result -> handle result + -- run the compiler + case hsc_lang of + HscInterpreted | not (isHsBoot src_flavour) -- We can't compile boot files to + -- bytecode so don't even try. + -> runCompiler hscCompileInteractive handleInterpreted + HscNothing + -> runCompiler hscCompileNothing handleBatch + _other + -> runCompiler hscCompileBatch handleBatch + +----------------------------------------------------------------------------- +-- stub .h and .c files (for foreign export support) + +-- The _stub.c file is derived from the haskell source file, possibly taking +-- into account the -stubdir option. +-- +-- Consequently, we derive the _stub.o filename from the haskell object +-- filename. +-- +-- This isn't necessarily the same as the object filename we +-- would get if we just compiled the _stub.c file using the pipeline. +-- For example: +-- +-- ghc src/A.hs -odir obj +-- +-- results in obj/A.o, and src/A_stub.c. If we compile src/A_stub.c with +-- -odir obj, we would get obj/src/A_stub.o, which is wrong; we want +-- obj/A_stub.o. 
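+-- Concretely (an added illustration, assuming splitFilename splits at the
+-- last dot and joinFileExt re-attaches an extension): with
+-- ml_obj_file location == "obj/A.o", the code below computes
+--   splitFilename "obj/A.o"                ==  ("obj/A", "o")
+--   "obj/A" ++ "_stub" `joinFileExt` "o"   ==  "obj/A_stub.o"
+-- and that name is then handed to runPipeline as the SpecificFile output.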
+ +compileStub :: DynFlags -> Module -> ModLocation -> IO FilePath +compileStub dflags mod location = do + let (o_base, o_ext) = splitFilename (ml_obj_file location) + stub_o = o_base ++ "_stub" `joinFileExt` o_ext + + -- compile the _stub.c file w/ gcc + let (stub_c,_) = mkStubPaths dflags mod location + runPipeline StopLn dflags (stub_c,Nothing) + (SpecificFile stub_o) Nothing{-no ModLocation-} + + return stub_o + + +-- --------------------------------------------------------------------------- +-- Link + +link :: GhcMode -- interactive or batch + -> DynFlags -- dynamic flags + -> Bool -- attempt linking in batch mode? + -> HomePackageTable -- what to link + -> IO SuccessFlag + +-- For the moment, in the batch linker, we don't bother to tell doLink +-- which packages to link -- it just tries all that are available. +-- batch_attempt_linking should only be *looked at* in batch mode. It +-- should only be True if the upsweep was successful and someone +-- exports main, i.e., we have good reason to believe that linking +-- will succeed. + +#ifdef GHCI +link Interactive dflags batch_attempt_linking hpt + = do -- Not Linking...(demand linker will do the job) + return Succeeded +#endif + +link JustTypecheck dflags batch_attempt_linking hpt + = return Succeeded + +link BatchCompile dflags batch_attempt_linking hpt + | batch_attempt_linking + = do + let + home_mod_infos = moduleEnvElts hpt + + -- the packages we depend on + pkg_deps = concatMap (dep_pkgs . mi_deps . hm_iface) home_mod_infos + + -- the linkables to link + linkables = map (expectJust "link".hm_linkable) home_mod_infos + + debugTraceMsg dflags 3 (text "link: linkables are ..." $$ vcat (map ppr linkables)) + + -- check for the -no-link flag + if isNoLink (ghcLink dflags) + then do debugTraceMsg dflags 3 (text "link(batch): linking omitted (-c flag given).") + return Succeeded + else do + + let getOfiles (LM _ _ us) = map nameOfObject (filter isObject us) + obj_files = concatMap getOfiles linkables + + exe_file = exeFileName dflags + + -- if the modification time on the executable is later than the + -- modification times on all of the objects, then omit linking + -- (unless the -no-recomp flag was given). + e_exe_time <- IO.try $ getModificationTime exe_file + let linking_needed + | Left _ <- e_exe_time = True + | Right t <- e_exe_time = + any (t <) (map linkableTime linkables) + + if dopt Opt_RecompChecking dflags && not linking_needed + then do debugTraceMsg dflags 1 (text exe_file <+> ptext SLIT("is up to date, linking not required.")) + return Succeeded + else do + + debugTraceMsg dflags 1 (ptext SLIT("Linking") <+> text exe_file + <+> text "...") + + -- Don't showPass in Batch mode; doLink will do that for us. + let link = case ghcLink dflags of + MkDLL -> doMkDLL + StaticLink -> staticLink + link dflags obj_files pkg_deps + + debugTraceMsg dflags 3 (text "link: done") + + -- staticLink only returns if it succeeds + return Succeeded + + | otherwise + = do debugTraceMsg dflags 3 (text "link(batch): upsweep (partially) failed OR" $$ + text " Main.main not exported; not linking.") + return Succeeded + + +-- ----------------------------------------------------------------------------- +-- Compile files in one-shot mode. 
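One-shot mode covers the classic per-file invocations (ghc -c, ghc -S, ghc -E and friends), as opposed to --make: each argument goes through the pipeline on its own and the results are handed to doLink. As a rough sketch, assuming a dflags that the command-line front end has already set up with ghcLink = NoLink, a "ghc -c Foo.hs B.c" run amounts to:

    -- hypothetical call; the Maybe Phase is the -x override, Nothing
    -- means the starting phase is inferred from the file suffix
    oneShot dflags StopLn [("Foo.hs", Nothing), ("B.c", Nothing)]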
+ +oneShot :: DynFlags -> Phase -> [(String, Maybe Phase)] -> IO () +oneShot dflags stop_phase srcs = do + o_files <- mapM (compileFile dflags stop_phase) srcs + doLink dflags stop_phase o_files + +compileFile :: DynFlags -> Phase -> (FilePath, Maybe Phase) -> IO FilePath +compileFile dflags stop_phase (src, mb_phase) = do + exists <- doesFileExist src + when (not exists) $ + throwDyn (CmdLineError ("does not exist: " ++ src)) + + let + split = dopt Opt_SplitObjs dflags + mb_o_file = outputFile dflags + ghc_link = ghcLink dflags -- Set by -c or -no-link + + -- When linking, the -o argument refers to the linker's output. + -- otherwise, we use it as the name for the pipeline's output. + output + | StopLn <- stop_phase, not (isNoLink ghc_link) = Persistent + -- -o foo applies to linker + | Just o_file <- mb_o_file = SpecificFile o_file + -- -o foo applies to the file we are compiling now + | otherwise = Persistent + + stop_phase' = case stop_phase of + As | split -> SplitAs + other -> stop_phase + + (_, out_file) <- runPipeline stop_phase' dflags + (src, mb_phase) output Nothing{-no ModLocation-} + return out_file + + +doLink :: DynFlags -> Phase -> [FilePath] -> IO () +doLink dflags stop_phase o_files + | not (isStopLn stop_phase) + = return () -- We stopped before the linking phase + + | otherwise + = case ghcLink dflags of + NoLink -> return () + StaticLink -> staticLink dflags o_files link_pkgs + MkDLL -> doMkDLL dflags o_files link_pkgs + where + -- Always link in the haskell98 package for static linking. Other + -- packages have to be specified via the -package flag. + link_pkgs + | ExtPackage h98_id <- haskell98PackageId (pkgState dflags) = [h98_id] + | otherwise = [] + + +-- --------------------------------------------------------------------------- +-- Run a compilation pipeline, consisting of multiple phases. + +-- This is the interface to the compilation pipeline, which runs +-- a series of compilation steps on a single source file, specifying +-- at which stage to stop. + +-- The DynFlags can be modified by phases in the pipeline (eg. by +-- GHC_OPTIONS pragmas), and the changes affect later phases in the +-- pipeline. + +data PipelineOutput + = Temporary + -- output should be to a temporary file: we're going to + -- run more compilation steps on this output later + | Persistent + -- we want a persistent file, i.e. a file in the current directory + -- derived from the input filename, but with the appropriate extension. + -- eg. in "ghc -c Foo.hs" the output goes into ./Foo.o. + | SpecificFile FilePath + -- the output must go into the specified file. + +runPipeline + :: Phase -- When to stop + -> DynFlags -- Dynamic flags + -> (FilePath,Maybe Phase) -- Input filename (and maybe -x suffix) + -> PipelineOutput -- Output filename + -> Maybe ModLocation -- A ModLocation, if this is a Haskell module + -> IO (DynFlags, FilePath) -- (final flags, output filename) + +runPipeline stop_phase dflags (input_fn, mb_phase) output maybe_loc + = do + let (basename, suffix) = splitFilename input_fn + + -- If we were given a -x flag, then use that phase to start from + start_phase + | Just x_phase <- mb_phase = x_phase + | otherwise = startPhase suffix + + -- We want to catch cases of "you can't get there from here" before + -- we start the pipeline, because otherwise it will just run off the + -- end. + -- + -- There is a partial ordering on phases, where A < B iff A occurs + -- before B in a normal compilation pipeline. 
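To make the check below concrete (a sketch, not an exhaustive account of the ordering): an object file's start phase is already the terminal phase, so it cannot be driven "backwards" to an earlier stop phase.

    -- startPhase "o" gives StopLn, and StopLn does not happen before As,
    -- so something like "ghc -S Foo.o" is rejected here with the
    -- "cannot compile this file to desired target" UsageError.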
+ + when (not (start_phase `happensBefore` stop_phase)) $ + throwDyn (UsageError + ("cannot compile this file to desired target: " + ++ input_fn)) + + -- this is a function which will be used to calculate output file names + -- as we go along (we partially apply it to some of its inputs here) + let get_output_fn = getOutputFilename dflags stop_phase output basename + + -- Execute the pipeline... + (dflags', output_fn, maybe_loc) <- + pipeLoop dflags start_phase stop_phase input_fn + basename suffix get_output_fn maybe_loc + + -- Sometimes, a compilation phase doesn't actually generate any output + -- (eg. the CPP phase when -fcpp is not turned on). If we end on this + -- stage, but we wanted to keep the output, then we have to explicitly + -- copy the file. + case output of + Temporary -> + return (dflags', output_fn) + _other -> + do final_fn <- get_output_fn stop_phase maybe_loc + when (final_fn /= output_fn) $ + copy dflags ("Copying `" ++ output_fn ++ "' to `" ++ final_fn + ++ "'") output_fn final_fn + return (dflags', final_fn) + + + +pipeLoop :: DynFlags -> Phase -> Phase + -> FilePath -> String -> Suffix + -> (Phase -> Maybe ModLocation -> IO FilePath) + -> Maybe ModLocation + -> IO (DynFlags, FilePath, Maybe ModLocation) + +pipeLoop dflags phase stop_phase + input_fn orig_basename orig_suff + orig_get_output_fn maybe_loc + + | phase `eqPhase` stop_phase -- All done + = return (dflags, input_fn, maybe_loc) + + | not (phase `happensBefore` stop_phase) + -- Something has gone wrong. We'll try to cover all the cases when + -- this could happen, so if we reach here it is a panic. + -- eg. it might happen if the -C flag is used on a source file that + -- has {-# OPTIONS -fasm #-}. + = panic ("pipeLoop: at phase " ++ show phase ++ + " but I wanted to stop at phase " ++ show stop_phase) + + | otherwise + = do { (next_phase, dflags', maybe_loc, output_fn) + <- runPhase phase stop_phase dflags orig_basename + orig_suff input_fn orig_get_output_fn maybe_loc + ; pipeLoop dflags' next_phase stop_phase output_fn + orig_basename orig_suff orig_get_output_fn maybe_loc } + +getOutputFilename + :: DynFlags -> Phase -> PipelineOutput -> String + -> Phase{-next phase-} -> Maybe ModLocation -> IO FilePath +getOutputFilename dflags stop_phase output basename + = func + where + hcsuf = hcSuf dflags + odir = objectDir dflags + osuf = objectSuf dflags + keep_hc = dopt Opt_KeepHcFiles dflags + keep_raw_s = dopt Opt_KeepRawSFiles dflags + keep_s = dopt Opt_KeepSFiles dflags + + myPhaseInputExt HCc = hcsuf + myPhaseInputExt StopLn = osuf + myPhaseInputExt other = phaseInputExt other + + func next_phase maybe_location + | is_last_phase, Persistent <- output = persistent_fn + | is_last_phase, SpecificFile f <- output = return f + | keep_this_output = persistent_fn + | otherwise = newTempName dflags suffix + where + is_last_phase = next_phase `eqPhase` stop_phase + + -- sometimes, we keep output from intermediate stages + keep_this_output = + case next_phase of + StopLn -> True + Mangle | keep_raw_s -> True + As | keep_s -> True + HCc | keep_hc -> True + _other -> False + + suffix = myPhaseInputExt next_phase + + -- persistent object files get put in odir + persistent_fn + | StopLn <- next_phase = return odir_persistent + | otherwise = return persistent + + persistent = basename `joinFileExt` suffix + + odir_persistent + | Just loc <- maybe_location = ml_obj_file loc + | Just d <- odir = d `joinFileName` persistent + | otherwise = persistent + + +-- 
----------------------------------------------------------------------------- +-- Each phase in the pipeline returns the next phase to execute, and the +-- name of the file in which the output was placed. +-- +-- We must do things dynamically this way, because we often don't know +-- what the rest of the phases will be until part-way through the +-- compilation: for example, an {-# OPTIONS -fasm #-} at the beginning +-- of a source file can change the latter stages of the pipeline from +-- taking the via-C route to using the native code generator. + +runPhase :: Phase -- Do this phase first + -> Phase -- Stop just before this phase + -> DynFlags + -> String -- basename of original input source + -> String -- its extension + -> FilePath -- name of file which contains the input to this phase. + -> (Phase -> Maybe ModLocation -> IO FilePath) + -- how to calculate the output filename + -> Maybe ModLocation -- the ModLocation, if we have one + -> IO (Phase, -- next phase + DynFlags, -- new dynamic flags + Maybe ModLocation, -- the ModLocation, if we have one + FilePath) -- output filename + + -- Invariant: the output filename always contains the output + -- Interesting case: Hsc when there is no recompilation to do + -- Then the output filename is still a .o file + +------------------------------------------------------------------------------- +-- Unlit phase + +runPhase (Unlit sf) _stop dflags _basename _suff input_fn get_output_fn maybe_loc + = do let unlit_flags = getOpts dflags opt_L + -- The -h option passes the file name for unlit to put in a #line directive + output_fn <- get_output_fn (Cpp sf) maybe_loc + + SysTools.runUnlit dflags + (map SysTools.Option unlit_flags ++ + [ SysTools.Option "-h" + , SysTools.Option input_fn + , SysTools.FileOption "" input_fn + , SysTools.FileOption "" output_fn + ]) + + return (Cpp sf, dflags, maybe_loc, output_fn) + +------------------------------------------------------------------------------- +-- Cpp phase : (a) gets OPTIONS out of file +-- (b) runs cpp if necessary + +runPhase (Cpp sf) _stop dflags0 basename suff input_fn get_output_fn maybe_loc + = do src_opts <- getOptionsFromFile input_fn + (dflags,unhandled_flags) <- parseDynamicFlags dflags0 (map unLoc src_opts) + checkProcessArgsResult unhandled_flags (basename `joinFileExt` suff) + + if not (dopt Opt_Cpp dflags) then + -- no need to preprocess CPP, just pass input file along + -- to the next phase of the pipeline. + return (HsPp sf, dflags, maybe_loc, input_fn) + else do + output_fn <- get_output_fn (HsPp sf) maybe_loc + doCpp dflags True{-raw-} False{-no CC opts-} input_fn output_fn + return (HsPp sf, dflags, maybe_loc, output_fn) + +------------------------------------------------------------------------------- +-- HsPp phase + +runPhase (HsPp sf) _stop dflags basename suff input_fn get_output_fn maybe_loc + = do if not (dopt Opt_Pp dflags) then + -- no need to preprocess, just pass input file along + -- to the next phase of the pipeline. 
+ return (Hsc sf, dflags, maybe_loc, input_fn) + else do + let hspp_opts = getOpts dflags opt_F + let orig_fn = basename `joinFileExt` suff + output_fn <- get_output_fn (Hsc sf) maybe_loc + SysTools.runPp dflags + ( [ SysTools.Option orig_fn + , SysTools.Option input_fn + , SysTools.FileOption "" output_fn + ] ++ + map SysTools.Option hspp_opts + ) + return (Hsc sf, dflags, maybe_loc, output_fn) + +----------------------------------------------------------------------------- +-- Hsc phase + +-- Compilation of a single module, in "legacy" mode (_not_ under +-- the direction of the compilation manager). +runPhase (Hsc src_flavour) stop dflags0 basename suff input_fn get_output_fn _maybe_loc + = do -- normal Hsc mode, not mkdependHS + + -- we add the current directory (i.e. the directory in which + -- the .hs files resides) to the import path, since this is + -- what gcc does, and it's probably what you want. + let current_dir = directoryOf basename + + paths = includePaths dflags0 + dflags = dflags0 { includePaths = current_dir : paths } + + -- gather the imports and module name + (hspp_buf,mod_name) <- + case src_flavour of + ExtCoreFile -> do { -- no explicit imports in ExtCore input. + ; m <- getCoreModuleName input_fn + ; return (Nothing, mkModule m) } + + other -> do { buf <- hGetStringBuffer input_fn + ; (_,_,L _ mod_name) <- getImports dflags buf input_fn + ; return (Just buf, mod_name) } + + -- Build a ModLocation to pass to hscMain. + -- The source filename is rather irrelevant by now, but it's used + -- by hscMain for messages. hscMain also needs + -- the .hi and .o filenames, and this is as good a way + -- as any to generate them, and better than most. (e.g. takes + -- into accout the -osuf flags) + location1 <- mkHomeModLocation2 dflags mod_name basename suff + + -- Boot-ify it if necessary + let location2 | isHsBoot src_flavour = addBootSuffixLocn location1 + | otherwise = location1 + + + -- Take -ohi into account if present + -- This can't be done in mkHomeModuleLocation because + -- it only applies to the module being compiles + let ohi = outputHi dflags + location3 | Just fn <- ohi = location2{ ml_hi_file = fn } + | otherwise = location2 + + -- Take -o into account if present + -- Very like -ohi, but we must *only* do this if we aren't linking + -- (If we're linking then the -o applies to the linked thing, not to + -- the object file for one module.) + -- Note the nasty duplication with the same computation in compileFile above + let expl_o_file = outputFile dflags + location4 | Just ofile <- expl_o_file + , isNoLink (ghcLink dflags) + = location3 { ml_obj_file = ofile } + | otherwise = location3 + + -- Make the ModSummary to hand to hscMain + src_timestamp <- getModificationTime (basename `joinFileExt` suff) + let + unused_field = panic "runPhase:ModSummary field" + -- Some fields are not looked at by hscMain + mod_summary = ModSummary { ms_mod = mod_name, + ms_hsc_src = src_flavour, + ms_hspp_file = input_fn, + ms_hspp_opts = dflags, + ms_hspp_buf = hspp_buf, + ms_location = location4, + ms_hs_date = src_timestamp, + ms_obj_date = Nothing, + ms_imps = unused_field, + ms_srcimps = unused_field } + + o_file = ml_obj_file location4 -- The real object file + + + -- Figure out if the source has changed, for recompilation avoidance. + -- + -- Setting source_unchanged to True means that M.o seems + -- to be up to date wrt M.hs; so no need to recompile unless imports have + -- changed (which the compiler itself figures out). 
+ -- Setting source_unchanged to False tells the compiler that M.o is out of + -- date wrt M.hs (or M.o doesn't exist) so we must recompile regardless. + let do_recomp = dopt Opt_RecompChecking dflags + source_unchanged <- + if not do_recomp || not (isStopLn stop) + -- Set source_unchanged to False unconditionally if + -- (a) recompilation checker is off, or + -- (b) we aren't going all the way to .o file (e.g. ghc -S) + then return False + -- Otherwise look at file modification dates + else do o_file_exists <- doesFileExist o_file + if not o_file_exists + then return False -- Need to recompile + else do t2 <- getModificationTime o_file + if t2 > src_timestamp + then return True + else return False + + -- get the DynFlags + let hsc_lang = hscMaybeAdjustTarget dflags stop src_flavour (hscTarget dflags) + let next_phase = hscNextPhase dflags src_flavour hsc_lang + output_fn <- get_output_fn next_phase (Just location4) + + let dflags' = dflags { hscTarget = hsc_lang, + hscOutName = output_fn, + extCoreName = basename ++ ".hcr" } + + hsc_env <- newHscEnv dflags' + + -- Tell the finder cache about this module + addHomeModuleToFinder hsc_env mod_name location4 + + -- run the compiler! + mbResult <- hscCompileOneShot hsc_env + mod_summary source_unchanged + Nothing -- No iface + Nothing -- No "module i of n" progress info + + case mbResult of + Nothing -> throwDyn (PhaseFailed "hsc" (ExitFailure 1)) + Just HscNoRecomp + -> do SysTools.touch dflags' "Touching object file" o_file + -- The .o file must have a later modification date + -- than the source file (else we wouldn't be in HscNoRecomp) + -- but we touch it anyway, to keep 'make' happy (we think). + return (StopLn, dflags', Just location4, o_file) + Just (HscRecomp hasStub) + -> do when hasStub $ + do stub_o <- compileStub dflags' mod_name location4 + consIORef v_Ld_inputs stub_o + -- In the case of hs-boot files, generate a dummy .o-boot + -- stamp file for the benefit of Make + when (isHsBoot src_flavour) $ + SysTools.touch dflags' "Touching object file" o_file + return (next_phase, dflags', Just location4, output_fn) + +----------------------------------------------------------------------------- +-- Cmm phase + +runPhase CmmCpp stop dflags basename suff input_fn get_output_fn maybe_loc + = do + output_fn <- get_output_fn Cmm maybe_loc + doCpp dflags False{-not raw-} True{-include CC opts-} input_fn output_fn + return (Cmm, dflags, maybe_loc, output_fn) + +runPhase Cmm stop dflags basename suff input_fn get_output_fn maybe_loc + = do + let hsc_lang = hscMaybeAdjustTarget dflags stop HsSrcFile (hscTarget dflags) + let next_phase = hscNextPhase dflags HsSrcFile hsc_lang + output_fn <- get_output_fn next_phase maybe_loc + + let dflags' = dflags { hscTarget = hsc_lang, + hscOutName = output_fn, + extCoreName = basename ++ ".hcr" } + + ok <- hscCmmFile dflags' input_fn + + when (not ok) $ throwDyn (PhaseFailed "cmm" (ExitFailure 1)) + + return (next_phase, dflags, maybe_loc, output_fn) + +----------------------------------------------------------------------------- +-- Cc phase + +-- we don't support preprocessing .c files (with -E) now. Doing so introduces +-- way too many hacks, and I can't say I've ever used it anyway. 
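One detail in the Cc phase below that is easy to misread: the include directories are interleaved with separate "-I" arguments rather than glued into single "-Idir" strings. A worked reduction of that foldr, assuming the combined path list were ["incl", "/usr/lib/ghc/include"]:

    foldr (\ x xs -> "-I" : x : xs) [] ["incl", "/usr/lib/ghc/include"]
        -- == ["-I", "incl", "-I", "/usr/lib/ghc/include"]

Each element then becomes its own SysTools.Option when the gcc argument list is assembled.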
+ +runPhase cc_phase stop dflags basename suff input_fn get_output_fn maybe_loc + | cc_phase `eqPhase` Cc || cc_phase `eqPhase` HCc + = do let cc_opts = getOpts dflags opt_c + hcc = cc_phase `eqPhase` HCc + + let cmdline_include_paths = includePaths dflags + + -- HC files have the dependent packages stamped into them + pkgs <- if hcc then getHCFilePackages input_fn else return [] + + -- add package include paths even if we're just compiling .c + -- files; this is the Value Add(TM) that using ghc instead of + -- gcc gives you :) + pkg_include_dirs <- getPackageIncludePath dflags pkgs + let include_paths = foldr (\ x xs -> "-I" : x : xs) [] + (cmdline_include_paths ++ pkg_include_dirs) + + let (md_c_flags, md_regd_c_flags) = machdepCCOpts dflags + let pic_c_flags = picCCOpts dflags + + let verb = getVerbFlag dflags + + pkg_extra_cc_opts <- getPackageExtraCcOpts dflags pkgs + + let split_objs = dopt Opt_SplitObjs dflags + split_opt | hcc && split_objs = [ "-DUSE_SPLIT_MARKERS" ] + | otherwise = [ ] + + let excessPrecision = dopt Opt_ExcessPrecision dflags + + let cc_opt | optLevel dflags >= 2 = "-O2" + | otherwise = "-O" + + -- Decide next phase + + let mangle = dopt Opt_DoAsmMangling dflags + next_phase + | hcc && mangle = Mangle + | otherwise = As + output_fn <- get_output_fn next_phase maybe_loc + + let + more_hcc_opts = +#if i386_TARGET_ARCH + -- on x86 the floating point regs have greater precision + -- than a double, which leads to unpredictable results. + -- By default, we turn this off with -ffloat-store unless + -- the user specified -fexcess-precision. + (if excessPrecision then [] else [ "-ffloat-store" ]) ++ +#endif + -- gcc's -fstrict-aliasing allows two accesses to memory + -- to be considered non-aliasing if they have different types. + -- This interacts badly with the C code we generate, which is + -- very weakly typed, being derived from C--. + ["-fno-strict-aliasing"] + + + + SysTools.runCc dflags ( + -- force the C compiler to interpret this file as C when + -- compiling .hc files, by adding the -x c option. + -- Also useful for plain .c files, just in case GHC saw a + -- -x c option. 
+ [ SysTools.Option "-x", SysTools.Option "c"] ++ + [ SysTools.FileOption "" input_fn + , SysTools.Option "-o" + , SysTools.FileOption "" output_fn + ] + ++ map SysTools.Option ( + md_c_flags + ++ pic_c_flags + ++ (if hcc && mangle + then md_regd_c_flags + else []) + ++ (if hcc + then more_hcc_opts + else []) + ++ [ verb, "-S", "-Wimplicit", cc_opt ] + ++ [ "-D__GLASGOW_HASKELL__="++cProjectVersionInt ] + ++ cc_opts + ++ split_opt + ++ include_paths + ++ pkg_extra_cc_opts + )) + + return (next_phase, dflags, maybe_loc, output_fn) + + -- ToDo: postprocess the output from gcc + +----------------------------------------------------------------------------- +-- Mangle phase + +runPhase Mangle stop dflags _basename _suff input_fn get_output_fn maybe_loc + = do let mangler_opts = getOpts dflags opt_m + +#if i386_TARGET_ARCH + machdep_opts <- return [ show (stolen_x86_regs dflags) ] +#else + machdep_opts <- return [] +#endif + + let split = dopt Opt_SplitObjs dflags + next_phase + | split = SplitMangle + | otherwise = As + output_fn <- get_output_fn next_phase maybe_loc + + SysTools.runMangle dflags (map SysTools.Option mangler_opts + ++ [ SysTools.FileOption "" input_fn + , SysTools.FileOption "" output_fn + ] + ++ map SysTools.Option machdep_opts) + + return (next_phase, dflags, maybe_loc, output_fn) + +----------------------------------------------------------------------------- +-- Splitting phase + +runPhase SplitMangle stop dflags _basename _suff input_fn get_output_fn maybe_loc + = do -- tmp_pfx is the prefix used for the split .s files + -- We also use it as the file to contain the no. of split .s files (sigh) + split_s_prefix <- SysTools.newTempName dflags "split" + let n_files_fn = split_s_prefix + + SysTools.runSplit dflags + [ SysTools.FileOption "" input_fn + , SysTools.FileOption "" split_s_prefix + , SysTools.FileOption "" n_files_fn + ] + + -- Save the number of split files for future references + s <- readFile n_files_fn + let n_files = read s :: Int + writeIORef v_Split_info (split_s_prefix, n_files) + + -- Remember to delete all these files + addFilesToClean [ split_s_prefix ++ "__" ++ show n ++ ".s" + | n <- [1..n_files]] + + return (SplitAs, dflags, maybe_loc, "**splitmangle**") + -- we don't use the filename + +----------------------------------------------------------------------------- +-- As phase + +runPhase As stop dflags _basename _suff input_fn get_output_fn maybe_loc + = do let as_opts = getOpts dflags opt_a + let cmdline_include_paths = includePaths dflags + + output_fn <- get_output_fn StopLn maybe_loc + + -- we create directories for the object file, because it + -- might be a hierarchical module. + createDirectoryHierarchy (directoryOf output_fn) + + SysTools.runAs dflags + (map SysTools.Option as_opts + ++ [ SysTools.Option ("-I" ++ p) | p <- cmdline_include_paths ] + ++ [ SysTools.Option "-c" + , SysTools.FileOption "" input_fn + , SysTools.Option "-o" + , SysTools.FileOption "" output_fn + ]) + + return (StopLn, dflags, maybe_loc, output_fn) + + +runPhase SplitAs stop dflags basename _suff _input_fn get_output_fn maybe_loc + = do + output_fn <- get_output_fn StopLn maybe_loc + + let (base_o, _) = splitFilename output_fn + split_odir = base_o ++ "_split" + osuf = objectSuf dflags + + createDirectoryHierarchy split_odir + + -- remove M_split/ *.o, because we're going to archive M_split/ *.o + -- later and we don't want to pick up any old objects. 
+ fs <- getDirectoryContents split_odir + mapM_ removeFile $ map (split_odir `joinFileName`) + $ filter (osuf `isSuffixOf`) fs + + let as_opts = getOpts dflags opt_a + + (split_s_prefix, n) <- readIORef v_Split_info + + let split_s n = split_s_prefix ++ "__" ++ show n `joinFileExt` "s" + split_obj n = split_odir `joinFileName` + filenameOf base_o ++ "__" ++ show n + `joinFileExt` osuf + + let assemble_file n + = SysTools.runAs dflags + (map SysTools.Option as_opts ++ + [ SysTools.Option "-c" + , SysTools.Option "-o" + , SysTools.FileOption "" (split_obj n) + , SysTools.FileOption "" (split_s n) + ]) + + mapM_ assemble_file [1..n] + + -- and join the split objects into a single object file: + let ld_r args = SysTools.runLink dflags ([ + SysTools.Option "-nostdlib", + SysTools.Option "-nodefaultlibs", + SysTools.Option "-Wl,-r", + SysTools.Option ld_x_flag, + SysTools.Option "-o", + SysTools.FileOption "" output_fn ] ++ args) + ld_x_flag | null cLD_X = "" + | otherwise = "-Wl,-x" + + if cLdIsGNULd == "YES" + then do + let script = split_odir `joinFileName` "ld.script" + writeFile script $ + "INPUT(" ++ unwords (map split_obj [1..n]) ++ ")" + ld_r [SysTools.FileOption "" script] + else do + ld_r (map (SysTools.FileOption "" . split_obj) [1..n]) + + return (StopLn, dflags, maybe_loc, output_fn) + + +----------------------------------------------------------------------------- +-- MoveBinary sort-of-phase +-- After having produced a binary, move it somewhere else and generate a +-- wrapper script calling the binary. Currently, we need this only in +-- a parallel way (i.e. in GUM), because PVM expects the binary in a +-- central directory. +-- This is called from staticLink below, after linking. I haven't made it +-- a separate phase to minimise interfering with other modules, and +-- we don't need the generality of a phase (MoveBinary is always +-- done after linking and makes only sense in a parallel setup) -- HWL + +runPhase_MoveBinary input_fn + = do + sysMan <- getSysMan + pvm_root <- getEnv "PVM_ROOT" + pvm_arch <- getEnv "PVM_ARCH" + let + pvm_executable_base = "=" ++ input_fn + pvm_executable = pvm_root ++ "/bin/" ++ pvm_arch ++ "/" ++ pvm_executable_base + -- nuke old binary; maybe use configur'ed names for cp and rm? 
+ system ("rm -f " ++ pvm_executable) + -- move the newly created binary into PVM land + system ("cp -p " ++ input_fn ++ " " ++ pvm_executable) + -- generate a wrapper script for running a parallel prg under PVM + writeFile input_fn (mk_pvm_wrapper_script pvm_executable pvm_executable_base sysMan) + return True + +-- generates a Perl skript starting a parallel prg under PVM +mk_pvm_wrapper_script :: String -> String -> String -> String +mk_pvm_wrapper_script pvm_executable pvm_executable_base sysMan = unlines $ + [ + "eval 'exec perl -S $0 ${1+\"$@\"}'", + " if $running_under_some_shell;", + "# =!=!=!=!=!=!=!=!=!=!=!", + "# This script is automatically generated: DO NOT EDIT!!!", + "# Generated by Glasgow Haskell Compiler", + "# ngoqvam choHbogh vaj' vIHoHnISbej !!!!", + "#", + "$pvm_executable = '" ++ pvm_executable ++ "';", + "$pvm_executable_base = '" ++ pvm_executable_base ++ "';", + "$SysMan = '" ++ sysMan ++ "';", + "", + {- ToDo: add the magical shortcuts again iff we actually use them -- HWL + "# first, some magical shortcuts to run "commands" on the binary", + "# (which is hidden)", + "if ($#ARGV == 1 && $ARGV[0] eq '+RTS' && $ARGV[1] =~ /^--((size|file|strip|rm|nm).*)/ ) {", + " local($cmd) = $1;", + " system("$cmd $pvm_executable");", + " exit(0); # all done", + "}", -} + "", + "# Now, run the real binary; process the args first", + "$ENV{'PE'} = $pvm_executable_base;", -- ++ pvm_executable_base, + "$debug = '';", + "$nprocessors = 0; # the default: as many PEs as machines in PVM config", + "@nonPVM_args = ();", + "$in_RTS_args = 0;", + "", + "args: while ($a = shift(@ARGV)) {", + " if ( $a eq '+RTS' ) {", + " $in_RTS_args = 1;", + " } elsif ( $a eq '-RTS' ) {", + " $in_RTS_args = 0;", + " }", + " if ( $a eq '-d' && $in_RTS_args ) {", + " $debug = '-';", + " } elsif ( $a =~ /^-qN(\\d+)/ && $in_RTS_args ) {", + " $nprocessors = $1;", + " } elsif ( $a =~ /^-qp(\\d+)/ && $in_RTS_args ) {", + " $nprocessors = $1;", + " } else {", + " push(@nonPVM_args, $a);", + " }", + "}", + "", + "local($return_val) = 0;", + "# Start the parallel execution by calling SysMan", + "system(\"$SysMan $debug $pvm_executable $nprocessors @nonPVM_args\");", + "$return_val = $?;", + "# ToDo: fix race condition moving files and flushing them!!", + "system(\"cp $ENV{'HOME'}/$pvm_executable_base.???.gr .\") if -f \"$ENV{'HOME'}/$pvm_executable_base.002.gr\";", + "exit($return_val);" + ] + +----------------------------------------------------------------------------- +-- Complain about non-dynamic flags in OPTIONS pragmas + +checkProcessArgsResult flags filename + = do when (notNull flags) (throwDyn (ProgramError ( + showSDoc (hang (text filename <> char ':') + 4 (text "unknown flags in {-# OPTIONS #-} pragma:" <+> + hsep (map text flags))) + ))) + +----------------------------------------------------------------------------- +-- Look for the /* GHC_PACKAGES ... */ comment at the top of a .hc file + +getHCFilePackages :: FilePath -> IO [PackageId] +getHCFilePackages filename = + EXCEPTION.bracket (openFile filename ReadMode) hClose $ \h -> do + l <- hGetLine h + case l of + '/':'*':' ':'G':'H':'C':'_':'P':'A':'C':'K':'A':'G':'E':'S':rest -> + return (map stringToPackageId (words rest)) + _other -> + return [] + +----------------------------------------------------------------------------- +-- Static linking, of .o files + +-- The list of packages passed to link is the list of packages on +-- which this program depends, as discovered by the compilation +-- manager. 
It is combined with the list of packages that the user +-- specifies on the command line with -package flags. +-- +-- In one-shot linking mode, we can't discover the package +-- dependencies (because we haven't actually done any compilation or +-- read any interface files), so the user must explicitly specify all +-- the packages. + +staticLink :: DynFlags -> [FilePath] -> [PackageId] -> IO () +staticLink dflags o_files dep_packages = do + let verb = getVerbFlag dflags + output_fn = exeFileName dflags + + -- get the full list of packages to link with, by combining the + -- explicit packages with the auto packages and all of their + -- dependencies, and eliminating duplicates. + + pkg_lib_paths <- getPackageLibraryPath dflags dep_packages + let pkg_lib_path_opts = map ("-L"++) pkg_lib_paths + + let lib_paths = libraryPaths dflags + let lib_path_opts = map ("-L"++) lib_paths + + pkg_link_opts <- getPackageLinkOpts dflags dep_packages + +#ifdef darwin_TARGET_OS + pkg_framework_paths <- getPackageFrameworkPath dflags dep_packages + let pkg_framework_path_opts = map ("-F"++) pkg_framework_paths + + let framework_paths = frameworkPaths dflags + framework_path_opts = map ("-F"++) framework_paths + + pkg_frameworks <- getPackageFrameworks dflags dep_packages + let pkg_framework_opts = concat [ ["-framework", fw] | fw <- pkg_frameworks ] + + let frameworks = cmdlineFrameworks dflags + framework_opts = concat [ ["-framework", fw] | fw <- reverse frameworks ] + -- reverse because they're added in reverse order from the cmd line +#endif + + -- probably _stub.o files + extra_ld_inputs <- readIORef v_Ld_inputs + + -- opts from -optl-<blah> (including -l<blah> options) + let extra_ld_opts = getOpts dflags opt_l + + let ways = wayNames dflags + + -- Here are some libs that need to be linked at the *end* of + -- the command line, because they contain symbols that are referred to + -- by the RTS. We can't therefore use the ordinary way opts for these. 
+ let + debug_opts | WayDebug `elem` ways = [ +#if defined(HAVE_LIBBFD) + "-lbfd", "-liberty" +#endif + ] + | otherwise = [] + + let + thread_opts | WayThreaded `elem` ways = [ +#if !defined(mingw32_TARGET_OS) && !defined(freebsd_TARGET_OS) + "-lpthread" +#endif +#if defined(osf3_TARGET_OS) + , "-lexc" +#endif + ] + | otherwise = [] + + let (md_c_flags, _) = machdepCCOpts dflags + SysTools.runLink dflags ( + [ SysTools.Option verb + , SysTools.Option "-o" + , SysTools.FileOption "" output_fn + ] + ++ map SysTools.Option ( + md_c_flags + ++ o_files + ++ extra_ld_inputs + ++ lib_path_opts + ++ extra_ld_opts +#ifdef darwin_TARGET_OS + ++ framework_path_opts + ++ framework_opts +#endif + ++ pkg_lib_path_opts + ++ pkg_link_opts +#ifdef darwin_TARGET_OS + ++ pkg_framework_path_opts + ++ pkg_framework_opts +#endif + ++ debug_opts + ++ thread_opts + )) + + -- parallel only: move binary to another dir -- HWL + when (WayPar `elem` ways) + (do success <- runPhase_MoveBinary output_fn + if success then return () + else throwDyn (InstallationError ("cannot move binary to PVM dir"))) + + +exeFileName :: DynFlags -> FilePath +exeFileName dflags + | Just s <- outputFile dflags = +#if defined(mingw32_HOST_OS) + if null (suffixOf s) + then s `joinFileExt` "exe" + else s +#else + s +#endif + | otherwise = +#if defined(mingw32_HOST_OS) + "main.exe" +#else + "a.out" +#endif + +----------------------------------------------------------------------------- +-- Making a DLL (only for Win32) + +doMkDLL :: DynFlags -> [String] -> [PackageId] -> IO () +doMkDLL dflags o_files dep_packages = do + let verb = getVerbFlag dflags + let static = opt_Static + let no_hs_main = dopt Opt_NoHsMain dflags + let o_file = outputFile dflags + let output_fn = case o_file of { Just s -> s; Nothing -> "HSdll.dll"; } + + pkg_lib_paths <- getPackageLibraryPath dflags dep_packages + let pkg_lib_path_opts = map ("-L"++) pkg_lib_paths + + let lib_paths = libraryPaths dflags + let lib_path_opts = map ("-L"++) lib_paths + + pkg_link_opts <- getPackageLinkOpts dflags dep_packages + + -- probably _stub.o files + extra_ld_inputs <- readIORef v_Ld_inputs + + -- opts from -optdll-<blah> + let extra_ld_opts = getOpts dflags opt_dll + + let pstate = pkgState dflags + rts_id | ExtPackage id <- rtsPackageId pstate = id + | otherwise = panic "staticLink: rts package missing" + base_id | ExtPackage id <- basePackageId pstate = id + | otherwise = panic "staticLink: base package missing" + rts_pkg = getPackageDetails pstate rts_id + base_pkg = getPackageDetails pstate base_id + + let extra_os = if static || no_hs_main + then [] + else [ head (libraryDirs rts_pkg) ++ "/Main.dll_o", + head (libraryDirs base_pkg) ++ "/PrelMain.dll_o" ] + + let (md_c_flags, _) = machdepCCOpts dflags + SysTools.runMkDLL dflags + ([ SysTools.Option verb + , SysTools.Option "-o" + , SysTools.FileOption "" output_fn + ] + ++ map SysTools.Option ( + md_c_flags + ++ o_files + ++ extra_os + ++ [ "--target=i386-mingw32" ] + ++ extra_ld_inputs + ++ lib_path_opts + ++ extra_ld_opts + ++ pkg_lib_path_opts + ++ pkg_link_opts + ++ (if "--def" `elem` (concatMap words extra_ld_opts) + then [ "" ] + else [ "--export-all" ]) + )) + +-- ----------------------------------------------------------------------------- +-- Running CPP + +doCpp :: DynFlags -> Bool -> Bool -> FilePath -> FilePath -> IO () +doCpp dflags raw include_cc_opts input_fn output_fn = do + let hscpp_opts = getOpts dflags opt_P + let cmdline_include_paths = includePaths dflags + + pkg_include_dirs <- getPackageIncludePath 
dflags [] + let include_paths = foldr (\ x xs -> "-I" : x : xs) [] + (cmdline_include_paths ++ pkg_include_dirs) + + let verb = getVerbFlag dflags + + let cc_opts + | not include_cc_opts = [] + | otherwise = (optc ++ md_c_flags) + where + optc = getOpts dflags opt_c + (md_c_flags, _) = machdepCCOpts dflags + + let cpp_prog args | raw = SysTools.runCpp dflags args + | otherwise = SysTools.runCc dflags (SysTools.Option "-E" : args) + + let target_defs = + [ "-D" ++ HOST_OS ++ "_BUILD_OS=1", + "-D" ++ HOST_ARCH ++ "_BUILD_ARCH=1", + "-D" ++ TARGET_OS ++ "_HOST_OS=1", + "-D" ++ TARGET_ARCH ++ "_HOST_ARCH=1" ] + -- remember, in code we *compile*, the HOST is the same our TARGET, + -- and BUILD is the same as our HOST. + + cpp_prog ([SysTools.Option verb] + ++ map SysTools.Option include_paths + ++ map SysTools.Option hsSourceCppOpts + ++ map SysTools.Option hscpp_opts + ++ map SysTools.Option cc_opts + ++ map SysTools.Option target_defs + ++ [ SysTools.Option "-x" + , SysTools.Option "c" + , SysTools.Option input_fn + -- We hackily use Option instead of FileOption here, so that the file + -- name is not back-slashed on Windows. cpp is capable of + -- dealing with / in filenames, so it works fine. Furthermore + -- if we put in backslashes, cpp outputs #line directives + -- with *double* backslashes. And that in turn means that + -- our error messages get double backslashes in them. + -- In due course we should arrange that the lexer deals + -- with these \\ escapes properly. + , SysTools.Option "-o" + , SysTools.FileOption "" output_fn + ]) + +cHaskell1Version = "5" -- i.e., Haskell 98 + +-- Default CPP defines in Haskell source +hsSourceCppOpts = + [ "-D__HASKELL1__="++cHaskell1Version + , "-D__GLASGOW_HASKELL__="++cProjectVersionInt + , "-D__HASKELL98__" + , "-D__CONCURRENT_HASKELL__" + ] + + +-- ----------------------------------------------------------------------------- +-- Misc. + +hscNextPhase :: DynFlags -> HscSource -> HscTarget -> Phase +hscNextPhase dflags HsBootFile hsc_lang = StopLn +hscNextPhase dflags other hsc_lang = + case hsc_lang of + HscC -> HCc + HscAsm | dopt Opt_SplitObjs dflags -> SplitMangle + | otherwise -> As + HscNothing -> StopLn + HscInterpreted -> StopLn + _other -> StopLn + + +hscMaybeAdjustTarget :: DynFlags -> Phase -> HscSource -> HscTarget -> HscTarget +hscMaybeAdjustTarget dflags stop HsBootFile current_hsc_lang + = HscNothing -- No output (other than Foo.hi-boot) for hs-boot files +hscMaybeAdjustTarget dflags stop other current_hsc_lang + = hsc_lang + where + keep_hc = dopt Opt_KeepHcFiles dflags + hsc_lang + -- don't change the lang if we're interpreting + | current_hsc_lang == HscInterpreted = current_hsc_lang + + -- force -fvia-C if we are being asked for a .hc file + | HCc <- stop = HscC + | keep_hc = HscC + -- otherwise, stick to the plan + | otherwise = current_hsc_lang + +GLOBAL_VAR(v_Split_info, ("",0), (String,Int)) + -- The split prefix and number of files diff --git a/compiler/main/DynFlags.hs b/compiler/main/DynFlags.hs new file mode 100644 index 0000000000..78acb98375 --- /dev/null +++ b/compiler/main/DynFlags.hs @@ -0,0 +1,1344 @@ +----------------------------------------------------------------------------- +-- +-- Dynamic flags +-- +-- Most flags are dynamic flags, which means they can change from +-- compilation to compilation using OPTIONS_GHC pragmas, and in a +-- multi-session GHC each session can be using different dynamic +-- flags. Dynamic flags can also be set at the prompt in GHCi. 
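For instance (illustrative only; the flag spellings actually accepted are whatever dynamic_flags and fFlags below register):

    {-# OPTIONS_GHC -fglasgow-exts #-}            -- per-module, via a pragma
    Prelude> :set -fno-monomorphism-restriction   -- per-session, at the GHCi prompt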
+-- +-- (c) The University of Glasgow 2005 +-- +----------------------------------------------------------------------------- + +module DynFlags ( + -- Dynamic flags + DynFlag(..), + DynFlags(..), + HscTarget(..), + GhcMode(..), isOneShot, + GhcLink(..), isNoLink, + PackageFlag(..), + Option(..), + + -- Configuration of the core-to-core and stg-to-stg phases + CoreToDo(..), + StgToDo(..), + SimplifierSwitch(..), + SimplifierMode(..), FloatOutSwitches(..), + getCoreToDo, getStgToDo, + + -- Manipulating DynFlags + defaultDynFlags, -- DynFlags + initDynFlags, -- DynFlags -> IO DynFlags + + dopt, -- DynFlag -> DynFlags -> Bool + dopt_set, dopt_unset, -- DynFlags -> DynFlag -> DynFlags + getOpts, -- (DynFlags -> [a]) -> IO [a] + getVerbFlag, + updOptLevel, + setTmpDir, + + -- parsing DynFlags + parseDynamicFlags, + allFlags, + + -- misc stuff + machdepCCOpts, picCCOpts, + ) where + +#include "HsVersions.h" + +import Module ( Module, mkModule ) +import PrelNames ( mAIN ) +import StaticFlags ( opt_Static, opt_PIC, + WayName(..), v_Ways, v_Build_tag, v_RTS_Build_tag ) +import {-# SOURCE #-} Packages (PackageState) +import DriverPhases ( Phase(..), phaseInputExt ) +import Config +import CmdLineParser +import Panic ( panic, GhcException(..) ) +import Util ( notNull, splitLongestPrefix, split, normalisePath ) +import SrcLoc ( SrcSpan ) + +import DATA_IOREF ( readIORef ) +import EXCEPTION ( throwDyn ) +import Monad ( when ) +#ifdef mingw32_TARGET_OS +import Data.List ( isPrefixOf ) +#endif +import Maybe ( fromJust ) +import Char ( isDigit, isUpper ) +import Outputable +import System.IO ( hPutStrLn, stderr ) +import {-# SOURCE #-} ErrUtils ( Severity(..), Message, mkLocMessage ) + +-- ----------------------------------------------------------------------------- +-- DynFlags + +data DynFlag + + -- debugging flags + = Opt_D_dump_cmm + | Opt_D_dump_asm + | Opt_D_dump_cpranal + | Opt_D_dump_deriv + | Opt_D_dump_ds + | Opt_D_dump_flatC + | Opt_D_dump_foreign + | Opt_D_dump_inlinings + | Opt_D_dump_occur_anal + | Opt_D_dump_parsed + | Opt_D_dump_rn + | Opt_D_dump_simpl + | Opt_D_dump_simpl_iterations + | Opt_D_dump_spec + | Opt_D_dump_prep + | Opt_D_dump_stg + | Opt_D_dump_stranal + | Opt_D_dump_tc + | Opt_D_dump_types + | Opt_D_dump_rules + | Opt_D_dump_cse + | Opt_D_dump_worker_wrapper + | Opt_D_dump_rn_trace + | Opt_D_dump_rn_stats + | Opt_D_dump_opt_cmm + | Opt_D_dump_simpl_stats + | Opt_D_dump_tc_trace + | Opt_D_dump_if_trace + | Opt_D_dump_splices + | Opt_D_dump_BCOs + | Opt_D_dump_vect + | Opt_D_source_stats + | Opt_D_verbose_core2core + | Opt_D_verbose_stg2stg + | Opt_D_dump_hi + | Opt_D_dump_hi_diffs + | Opt_D_dump_minimal_imports + | Opt_D_faststring_stats + | Opt_DoCoreLinting + | Opt_DoStgLinting + | Opt_DoCmmLinting + + | Opt_WarnIsError -- -Werror; makes warnings fatal + | Opt_WarnDuplicateExports + | Opt_WarnHiShadows + | Opt_WarnIncompletePatterns + | Opt_WarnIncompletePatternsRecUpd + | Opt_WarnMissingFields + | Opt_WarnMissingMethods + | Opt_WarnMissingSigs + | Opt_WarnNameShadowing + | Opt_WarnOverlappingPatterns + | Opt_WarnSimplePatterns + | Opt_WarnTypeDefaults + | Opt_WarnUnusedBinds + | Opt_WarnUnusedImports + | Opt_WarnUnusedMatches + | Opt_WarnDeprecations + | Opt_WarnDodgyImports + | Opt_WarnOrphans + + -- language opts + | Opt_AllowOverlappingInstances + | Opt_AllowUndecidableInstances + | Opt_AllowIncoherentInstances + | Opt_MonomorphismRestriction + | Opt_GlasgowExts + | Opt_FFI + | Opt_PArr -- syntactic support for parallel arrays + | Opt_Arrows -- Arrow-notation syntax + | 
Opt_TH + | Opt_ImplicitParams + | Opt_Generics + | Opt_ImplicitPrelude + | Opt_ScopedTypeVariables + | Opt_BangPatterns + + -- optimisation opts + | Opt_Strictness + | Opt_FullLaziness + | Opt_CSE + | Opt_IgnoreInterfacePragmas + | Opt_OmitInterfacePragmas + | Opt_DoLambdaEtaExpansion + | Opt_IgnoreAsserts + | Opt_IgnoreBreakpoints + | Opt_DoEtaReduction + | Opt_CaseMerge + | Opt_UnboxStrictFields + + -- misc opts + | Opt_Cpp + | Opt_Pp + | Opt_RecompChecking + | Opt_DryRun + | Opt_DoAsmMangling + | Opt_ExcessPrecision + | Opt_ReadUserPackageConf + | Opt_NoHsMain + | Opt_SplitObjs + | Opt_StgStats + | Opt_HideAllPackages + + -- keeping stuff + | Opt_KeepHiDiffs + | Opt_KeepHcFiles + | Opt_KeepSFiles + | Opt_KeepRawSFiles + | Opt_KeepTmpFiles + + deriving (Eq) + +data DynFlags = DynFlags { + ghcMode :: GhcMode, + ghcLink :: GhcLink, + coreToDo :: Maybe [CoreToDo], -- reserved for -Ofile + stgToDo :: Maybe [StgToDo], -- similarly + hscTarget :: HscTarget, + hscOutName :: String, -- name of the output file + extCoreName :: String, -- name of the .core output file + verbosity :: Int, -- verbosity level + optLevel :: Int, -- optimisation level + maxSimplIterations :: Int, -- max simplifier iterations + ruleCheck :: Maybe String, + stolen_x86_regs :: Int, + cmdlineHcIncludes :: [String], -- -#includes + importPaths :: [FilePath], + mainModIs :: Module, + mainFunIs :: Maybe String, + + -- ways + wayNames :: [WayName], -- way flags from the cmd line + buildTag :: String, -- the global "way" (eg. "p" for prof) + rtsBuildTag :: String, -- the RTS "way" + + -- paths etc. + objectDir :: Maybe String, + hiDir :: Maybe String, + stubDir :: Maybe String, + + objectSuf :: String, + hcSuf :: String, + hiSuf :: String, + + outputFile :: Maybe String, + outputHi :: Maybe String, + + includePaths :: [String], + libraryPaths :: [String], + frameworkPaths :: [String], -- used on darwin only + cmdlineFrameworks :: [String], -- ditto + tmpDir :: String, -- no trailing '/' + + -- options for particular phases + opt_L :: [String], + opt_P :: [String], + opt_F :: [String], + opt_c :: [String], + opt_m :: [String], + opt_a :: [String], + opt_l :: [String], + opt_dll :: [String], + opt_dep :: [String], + + -- commands for particular phases + pgm_L :: String, + pgm_P :: (String,[Option]), + pgm_F :: String, + pgm_c :: (String,[Option]), + pgm_m :: (String,[Option]), + pgm_s :: (String,[Option]), + pgm_a :: (String,[Option]), + pgm_l :: (String,[Option]), + pgm_dll :: (String,[Option]), + + -- ** Package flags + extraPkgConfs :: [FilePath], + -- The -package-conf flags given on the command line, in the order + -- they appeared. + + packageFlags :: [PackageFlag], + -- The -package and -hide-package flags from the command-line + + -- ** Package state + pkgState :: PackageState, + + -- hsc dynamic flags + flags :: [DynFlag], + + -- message output + log_action :: Severity -> SrcSpan -> PprStyle -> Message -> IO () + } + +data HscTarget + = HscC + | HscAsm + | HscJava + | HscILX + | HscInterpreted + | HscNothing + deriving (Eq, Show) + +data GhcMode + = BatchCompile -- | @ghc --make Main@ + | Interactive -- | @ghc --interactive@ + | OneShot -- | @ghc -c Foo.hs@ + | JustTypecheck -- | Development environemnts, refactorer, etc. 
+ | MkDepend + deriving Eq + +isOneShot :: GhcMode -> Bool +isOneShot OneShot = True +isOneShot _other = False + +data GhcLink -- What to do in the link step, if there is one + = -- Only relevant for modes + -- DoMake and StopBefore StopLn + NoLink -- Don't link at all + | StaticLink -- Ordinary linker [the default] + | MkDLL -- Make a DLL + +isNoLink :: GhcLink -> Bool +isNoLink NoLink = True +isNoLink other = False + +data PackageFlag + = ExposePackage String + | HidePackage String + | IgnorePackage String + +defaultHscTarget + | cGhcWithNativeCodeGen == "YES" = HscAsm + | otherwise = HscC + +initDynFlags dflags = do + -- someday these will be dynamic flags + ways <- readIORef v_Ways + build_tag <- readIORef v_Build_tag + rts_build_tag <- readIORef v_RTS_Build_tag + return dflags{ + wayNames = ways, + buildTag = build_tag, + rtsBuildTag = rts_build_tag + } + +defaultDynFlags = + DynFlags { + ghcMode = OneShot, + ghcLink = StaticLink, + coreToDo = Nothing, + stgToDo = Nothing, + hscTarget = defaultHscTarget, + hscOutName = "", + extCoreName = "", + verbosity = 0, + optLevel = 0, + maxSimplIterations = 4, + ruleCheck = Nothing, + stolen_x86_regs = 4, + cmdlineHcIncludes = [], + importPaths = ["."], + mainModIs = mAIN, + mainFunIs = Nothing, + + wayNames = panic "ways", + buildTag = panic "buildTag", + rtsBuildTag = panic "rtsBuildTag", + + objectDir = Nothing, + hiDir = Nothing, + stubDir = Nothing, + + objectSuf = phaseInputExt StopLn, + hcSuf = phaseInputExt HCc, + hiSuf = "hi", + + outputFile = Nothing, + outputHi = Nothing, + includePaths = [], + libraryPaths = [], + frameworkPaths = [], + cmdlineFrameworks = [], + tmpDir = cDEFAULT_TMPDIR, + + opt_L = [], + opt_P = [], + opt_F = [], + opt_c = [], + opt_a = [], + opt_m = [], + opt_l = [], + opt_dll = [], + opt_dep = [], + + pgm_L = panic "pgm_L", + pgm_P = panic "pgm_P", + pgm_F = panic "pgm_F", + pgm_c = panic "pgm_c", + pgm_m = panic "pgm_m", + pgm_s = panic "pgm_s", + pgm_a = panic "pgm_a", + pgm_l = panic "pgm_l", + pgm_dll = panic "pgm_mkdll", + + extraPkgConfs = [], + packageFlags = [], + pkgState = panic "pkgState", + + flags = [ + Opt_RecompChecking, + Opt_ReadUserPackageConf, + + Opt_ImplicitPrelude, + Opt_MonomorphismRestriction, + Opt_Strictness, + -- strictness is on by default, but this only + -- applies to -O. + Opt_CSE, -- similarly for CSE. + Opt_FullLaziness, -- ...and for full laziness + + Opt_DoLambdaEtaExpansion, + -- This one is important for a tiresome reason: + -- we want to make sure that the bindings for data + -- constructors are eta-expanded. This is probably + -- a good thing anyway, but it seems fragile. + + Opt_DoAsmMangling, + + -- and the default no-optimisation options: + Opt_IgnoreInterfacePragmas, + Opt_OmitInterfacePragmas + + ] ++ standardWarnings, + + log_action = \severity srcSpan style msg -> + case severity of + SevInfo -> hPutStrLn stderr (show (msg style)) + SevFatal -> hPutStrLn stderr (show (msg style)) + _ -> hPutStrLn stderr ('\n':show ((mkLocMessage srcSpan msg) style)) + } + +{- + Verbosity levels: + + 0 | print errors & warnings only + 1 | minimal verbosity: print "compiling M ... done." for each module. 
+ 2 | equivalent to -dshow-passes + 3 | equivalent to existing "ghc -v" + 4 | "ghc -v -ddump-most" + 5 | "ghc -v -ddump-all" +-} + +dopt :: DynFlag -> DynFlags -> Bool +dopt f dflags = f `elem` (flags dflags) + +dopt_set :: DynFlags -> DynFlag -> DynFlags +dopt_set dfs f = dfs{ flags = f : flags dfs } + +dopt_unset :: DynFlags -> DynFlag -> DynFlags +dopt_unset dfs f = dfs{ flags = filter (/= f) (flags dfs) } + +getOpts :: DynFlags -> (DynFlags -> [a]) -> [a] +getOpts dflags opts = reverse (opts dflags) + -- We add to the options from the front, so we need to reverse the list + +getVerbFlag :: DynFlags -> String +getVerbFlag dflags + | verbosity dflags >= 3 = "-v" + | otherwise = "" + +setObjectDir f d = d{ objectDir = f} +setHiDir f d = d{ hiDir = f} +setStubDir f d = d{ stubDir = f} + +setObjectSuf f d = d{ objectSuf = f} +setHiSuf f d = d{ hiSuf = f} +setHcSuf f d = d{ hcSuf = f} + +setOutputFile f d = d{ outputFile = f} +setOutputHi f d = d{ outputHi = f} + +-- XXX HACK: Prelude> words "'does not' work" ===> ["'does","not'","work"] +-- Config.hs should really use Option. +setPgmP f d = let (pgm:args) = words f in d{ pgm_P = (pgm, map Option args)} + +setPgmL f d = d{ pgm_L = f} +setPgmF f d = d{ pgm_F = f} +setPgmc f d = d{ pgm_c = (f,[])} +setPgmm f d = d{ pgm_m = (f,[])} +setPgms f d = d{ pgm_s = (f,[])} +setPgma f d = d{ pgm_a = (f,[])} +setPgml f d = d{ pgm_l = (f,[])} +setPgmdll f d = d{ pgm_dll = (f,[])} + +addOptL f d = d{ opt_L = f : opt_L d} +addOptP f d = d{ opt_P = f : opt_P d} +addOptF f d = d{ opt_F = f : opt_F d} +addOptc f d = d{ opt_c = f : opt_c d} +addOptm f d = d{ opt_m = f : opt_m d} +addOpta f d = d{ opt_a = f : opt_a d} +addOptl f d = d{ opt_l = f : opt_l d} +addOptdll f d = d{ opt_dll = f : opt_dll d} +addOptdep f d = d{ opt_dep = f : opt_dep d} + +addCmdlineFramework f d = d{ cmdlineFrameworks = f : cmdlineFrameworks d} + +-- ----------------------------------------------------------------------------- +-- Command-line options + +-- When invoking external tools as part of the compilation pipeline, we +-- pass these a sequence of options on the command-line. Rather than +-- just using a list of Strings, we use a type that allows us to distinguish +-- between filepaths and 'other stuff'. [The reason being, of course, that +-- this type gives us a handle on transforming filenames, and filenames only, +-- to whatever format they're expected to be on a particular platform.] + +data Option + = FileOption -- an entry that _contains_ filename(s) / filepaths. 
+ String -- a non-filepath prefix that shouldn't be + -- transformed (e.g., "/out=") + String -- the filepath/filename portion + | Option String + +----------------------------------------------------------------------------- +-- Setting the optimisation level + +updOptLevel :: Int -> DynFlags -> DynFlags +-- Set dynflags appropriate to the optimisation level +updOptLevel n dfs + = if (n >= 1) + then dfs2{ hscTarget = HscC, optLevel = n } -- turn on -fvia-C with -O + else dfs2{ optLevel = n } + where + dfs1 = foldr (flip dopt_unset) dfs remove_dopts + dfs2 = foldr (flip dopt_set) dfs1 extra_dopts + + extra_dopts + | n == 0 = opt_0_dopts + | otherwise = opt_1_dopts + + remove_dopts + | n == 0 = opt_1_dopts + | otherwise = opt_0_dopts + +opt_0_dopts = [ + Opt_IgnoreInterfacePragmas, + Opt_OmitInterfacePragmas + ] + +opt_1_dopts = [ + Opt_IgnoreAsserts, + Opt_DoEtaReduction, + Opt_CaseMerge + ] + +-- ----------------------------------------------------------------------------- +-- Standard sets of warning options + +standardWarnings + = [ Opt_WarnDeprecations, + Opt_WarnOverlappingPatterns, + Opt_WarnMissingFields, + Opt_WarnMissingMethods, + Opt_WarnDuplicateExports + ] + +minusWOpts + = standardWarnings ++ + [ Opt_WarnUnusedBinds, + Opt_WarnUnusedMatches, + Opt_WarnUnusedImports, + Opt_WarnIncompletePatterns, + Opt_WarnDodgyImports + ] + +minusWallOpts + = minusWOpts ++ + [ Opt_WarnTypeDefaults, + Opt_WarnNameShadowing, + Opt_WarnMissingSigs, + Opt_WarnHiShadows, + Opt_WarnOrphans + ] + +-- ----------------------------------------------------------------------------- +-- CoreToDo: abstraction of core-to-core passes to run. + +data CoreToDo -- These are diff core-to-core passes, + -- which may be invoked in any order, + -- as many times as you like. + + = CoreDoSimplify -- The core-to-core simplifier. + SimplifierMode + [SimplifierSwitch] + -- Each run of the simplifier can take a different + -- set of simplifier-specific flags. 
+ | CoreDoFloatInwards + | CoreDoFloatOutwards FloatOutSwitches + | CoreLiberateCase + | CoreDoPrintCore + | CoreDoStaticArgs + | CoreDoStrictness + | CoreDoWorkerWrapper + | CoreDoSpecialising + | CoreDoSpecConstr + | CoreDoOldStrictness + | CoreDoGlomBinds + | CoreCSE + | CoreDoRuleCheck Int{-CompilerPhase-} String -- Check for non-application of rules + -- matching this string + + | CoreDoNothing -- useful when building up lists of these things + +data SimplifierMode -- See comments in SimplMonad + = SimplGently + | SimplPhase Int + +data SimplifierSwitch + = MaxSimplifierIterations Int + | NoCaseOfCase + +data FloatOutSwitches + = FloatOutSw Bool -- True <=> float lambdas to top level + Bool -- True <=> float constants to top level, + -- even if they do not escape a lambda + + +-- The core-to-core pass ordering is derived from the DynFlags: + +getCoreToDo :: DynFlags -> [CoreToDo] +getCoreToDo dflags + | Just todo <- coreToDo dflags = todo -- set explicitly by user + | otherwise = core_todo + where + opt_level = optLevel dflags + max_iter = maxSimplIterations dflags + strictness = dopt Opt_Strictness dflags + full_laziness = dopt Opt_FullLaziness dflags + cse = dopt Opt_CSE dflags + rule_check = ruleCheck dflags + + core_todo = + if opt_level == 0 then + [ + CoreDoSimplify (SimplPhase 0) [ + MaxSimplifierIterations max_iter + ] + ] + else {- opt_level >= 1 -} [ + + -- initial simplify: mk specialiser happy: minimum effort please + CoreDoSimplify SimplGently [ + -- Simplify "gently" + -- Don't inline anything till full laziness has bitten + -- In particular, inlining wrappers inhibits floating + -- e.g. ...(case f x of ...)... + -- ==> ...(case (case x of I# x# -> fw x#) of ...)... + -- ==> ...(case x of I# x# -> case fw x# of ...)... + -- and now the redex (f x) isn't floatable any more + -- Similarly, don't apply any rules until after full + -- laziness. Notably, list fusion can prevent floating. + + NoCaseOfCase, + -- Don't do case-of-case transformations. + -- This makes full laziness work better + MaxSimplifierIterations max_iter + ], + + -- Specialisation is best done before full laziness + -- so that overloaded functions have all their dictionary lambdas manifest + CoreDoSpecialising, + + if full_laziness then CoreDoFloatOutwards (FloatOutSw False False) + else CoreDoNothing, + + CoreDoFloatInwards, + + CoreDoSimplify (SimplPhase 2) [ + -- Want to run with inline phase 2 after the specialiser to give + -- maximum chance for fusion to work before we inline build/augment + -- in phase 1. This made a difference in 'ansi' where an + -- overloaded function wasn't inlined till too late. + MaxSimplifierIterations max_iter + ], + case rule_check of { Just pat -> CoreDoRuleCheck 2 pat; Nothing -> CoreDoNothing }, + + CoreDoSimplify (SimplPhase 1) [ + -- Need inline-phase2 here so that build/augment get + -- inlined. I found that spectral/hartel/genfft lost some useful + -- strictness in the function sumcode' if augment is not inlined + -- before strictness analysis runs + MaxSimplifierIterations max_iter + ], + case rule_check of { Just pat -> CoreDoRuleCheck 1 pat; Nothing -> CoreDoNothing }, + + CoreDoSimplify (SimplPhase 0) [ + -- Phase 0: allow all Ids to be inlined now + -- This gets foldr inlined before strictness analysis + + MaxSimplifierIterations 3 + -- At least 3 iterations because otherwise we land up with + -- huge dead expressions because of an infelicity in the + -- simpifier. + -- let k = BIG in foldr k z xs + -- ==> let k = BIG in letrec go = \xs -> ...(k x).... 
in go xs + -- ==> let k = BIG in letrec go = \xs -> ...(BIG x).... in go xs + -- Don't stop now! + + ], + case rule_check of { Just pat -> CoreDoRuleCheck 0 pat; Nothing -> CoreDoNothing }, + +#ifdef OLD_STRICTNESS + CoreDoOldStrictness +#endif + if strictness then CoreDoStrictness else CoreDoNothing, + CoreDoWorkerWrapper, + CoreDoGlomBinds, + + CoreDoSimplify (SimplPhase 0) [ + MaxSimplifierIterations max_iter + ], + + if full_laziness then + CoreDoFloatOutwards (FloatOutSw False -- Not lambdas + True) -- Float constants + else CoreDoNothing, + -- nofib/spectral/hartel/wang doubles in speed if you + -- do full laziness late in the day. It only happens + -- after fusion and other stuff, so the early pass doesn't + -- catch it. For the record, the redex is + -- f_el22 (f_el21 r_midblock) + + + -- We want CSE to follow the final full-laziness pass, because it may + -- succeed in commoning up things floated out by full laziness. + -- CSE used to rely on the no-shadowing invariant, but it doesn't any more + + if cse then CoreCSE else CoreDoNothing, + + CoreDoFloatInwards, + +-- Case-liberation for -O2. This should be after +-- strictness analysis and the simplification which follows it. + + case rule_check of { Just pat -> CoreDoRuleCheck 0 pat; Nothing -> CoreDoNothing }, + + if opt_level >= 2 then + CoreLiberateCase + else + CoreDoNothing, + if opt_level >= 2 then + CoreDoSpecConstr + else + CoreDoNothing, + + -- Final clean-up simplification: + CoreDoSimplify (SimplPhase 0) [ + MaxSimplifierIterations max_iter + ] + ] + +-- ----------------------------------------------------------------------------- +-- StgToDo: abstraction of stg-to-stg passes to run. + +data StgToDo + = StgDoMassageForProfiling -- should be (next to) last + -- There's also setStgVarInfo, but its absolute "lastness" + -- is so critical that it is hardwired in (no flag). + | D_stg_stats + +getStgToDo :: DynFlags -> [StgToDo] +getStgToDo dflags + | Just todo <- stgToDo dflags = todo -- set explicitly by user + | otherwise = todo2 + where + stg_stats = dopt Opt_StgStats dflags + + todo1 = if stg_stats then [D_stg_stats] else [] + + todo2 | WayProf `elem` wayNames dflags + = StgDoMassageForProfiling : todo1 + | otherwise + = todo1 + +-- ----------------------------------------------------------------------------- +-- DynFlags parser + +allFlags :: [String] +allFlags = map ('-':) $ + [ name | (name, optkind) <- dynamic_flags, ok optkind ] ++ + map ("fno-"++) flags ++ + map ("f"++) flags + where ok (PrefixPred _ _) = False + ok _ = True + flags = map fst fFlags + +dynamic_flags :: [(String, OptKind DynP)] +dynamic_flags = [ + ( "n" , NoArg (setDynFlag Opt_DryRun) ) + , ( "cpp" , NoArg (setDynFlag Opt_Cpp)) + , ( "F" , NoArg (setDynFlag Opt_Pp)) + , ( "#include" , HasArg (addCmdlineHCInclude) ) + , ( "v" , OptPrefix (setVerbosity) ) + + ------- Specific phases -------------------------------------------- + , ( "pgmL" , HasArg (upd . setPgmL) ) + , ( "pgmP" , HasArg (upd . setPgmP) ) + , ( "pgmF" , HasArg (upd . setPgmF) ) + , ( "pgmc" , HasArg (upd . setPgmc) ) + , ( "pgmm" , HasArg (upd . setPgmm) ) + , ( "pgms" , HasArg (upd . setPgms) ) + , ( "pgma" , HasArg (upd . setPgma) ) + , ( "pgml" , HasArg (upd . setPgml) ) + , ( "pgmdll" , HasArg (upd . setPgmdll) ) + + , ( "optL" , HasArg (upd . addOptL) ) + , ( "optP" , HasArg (upd . addOptP) ) + , ( "optF" , HasArg (upd . addOptF) ) + , ( "optc" , HasArg (upd . addOptc) ) + , ( "optm" , HasArg (upd . addOptm) ) + , ( "opta" , HasArg (upd . addOpta) ) + , ( "optl" , HasArg (upd . 
addOptl) ) + , ( "optdll" , HasArg (upd . addOptdll) ) + , ( "optdep" , HasArg (upd . addOptdep) ) + + , ( "split-objs" , NoArg (if can_split + then setDynFlag Opt_SplitObjs + else return ()) ) + + -------- Linking ---------------------------------------------------- + , ( "c" , NoArg (upd $ \d -> d{ ghcLink=NoLink } )) + , ( "no-link" , NoArg (upd $ \d -> d{ ghcLink=NoLink } )) -- Dep. + , ( "-mk-dll" , NoArg (upd $ \d -> d{ ghcLink=MkDLL } )) + + ------- Libraries --------------------------------------------------- + , ( "L" , Prefix addLibraryPath ) + , ( "l" , AnySuffix (\s -> do upd (addOptl s) + upd (addOptdll s))) + + ------- Frameworks -------------------------------------------------- + -- -framework-path should really be -F ... + , ( "framework-path" , HasArg addFrameworkPath ) + , ( "framework" , HasArg (upd . addCmdlineFramework) ) + + ------- Output Redirection ------------------------------------------ + , ( "odir" , HasArg (upd . setObjectDir . Just)) + , ( "o" , SepArg (upd . setOutputFile . Just)) + , ( "ohi" , HasArg (upd . setOutputHi . Just )) + , ( "osuf" , HasArg (upd . setObjectSuf)) + , ( "hcsuf" , HasArg (upd . setHcSuf)) + , ( "hisuf" , HasArg (upd . setHiSuf)) + , ( "hidir" , HasArg (upd . setHiDir . Just)) + , ( "tmpdir" , HasArg (upd . setTmpDir)) + , ( "stubdir" , HasArg (upd . setStubDir . Just)) + + ------- Keeping temporary files ------------------------------------- + , ( "keep-hc-file" , AnySuffix (\_ -> setDynFlag Opt_KeepHcFiles)) + , ( "keep-s-file" , AnySuffix (\_ -> setDynFlag Opt_KeepSFiles)) + , ( "keep-raw-s-file", AnySuffix (\_ -> setDynFlag Opt_KeepRawSFiles)) + , ( "keep-tmp-files" , AnySuffix (\_ -> setDynFlag Opt_KeepTmpFiles)) + + ------- Miscellaneous ---------------------------------------------- + , ( "no-hs-main" , NoArg (setDynFlag Opt_NoHsMain)) + , ( "main-is" , SepArg setMainIs ) + + ------- recompilation checker -------------------------------------- + , ( "recomp" , NoArg (setDynFlag Opt_RecompChecking) ) + , ( "no-recomp" , NoArg (unSetDynFlag Opt_RecompChecking) ) + + ------- Packages ---------------------------------------------------- + , ( "package-conf" , HasArg extraPkgConf_ ) + , ( "no-user-package-conf", NoArg (unSetDynFlag Opt_ReadUserPackageConf) ) + , ( "package-name" , HasArg ignorePackage ) -- for compatibility + , ( "package" , HasArg exposePackage ) + , ( "hide-package" , HasArg hidePackage ) + , ( "hide-all-packages", NoArg (setDynFlag Opt_HideAllPackages) ) + , ( "ignore-package" , HasArg ignorePackage ) + , ( "syslib" , HasArg exposePackage ) -- for compatibility + + ------ HsCpp opts --------------------------------------------------- + , ( "D", AnySuffix (upd . addOptP) ) + , ( "U", AnySuffix (upd . 
addOptP) ) + + ------- Include/Import Paths ---------------------------------------- + , ( "I" , Prefix addIncludePath) + , ( "i" , OptPrefix addImportPath ) + + ------ Debugging ---------------------------------------------------- + , ( "dstg-stats", NoArg (setDynFlag Opt_StgStats)) + + , ( "ddump-cmm", setDumpFlag Opt_D_dump_cmm) + , ( "ddump-asm", setDumpFlag Opt_D_dump_asm) + , ( "ddump-cpranal", setDumpFlag Opt_D_dump_cpranal) + , ( "ddump-deriv", setDumpFlag Opt_D_dump_deriv) + , ( "ddump-ds", setDumpFlag Opt_D_dump_ds) + , ( "ddump-flatC", setDumpFlag Opt_D_dump_flatC) + , ( "ddump-foreign", setDumpFlag Opt_D_dump_foreign) + , ( "ddump-inlinings", setDumpFlag Opt_D_dump_inlinings) + , ( "ddump-occur-anal", setDumpFlag Opt_D_dump_occur_anal) + , ( "ddump-parsed", setDumpFlag Opt_D_dump_parsed) + , ( "ddump-rn", setDumpFlag Opt_D_dump_rn) + , ( "ddump-simpl", setDumpFlag Opt_D_dump_simpl) + , ( "ddump-simpl-iterations", setDumpFlag Opt_D_dump_simpl_iterations) + , ( "ddump-spec", setDumpFlag Opt_D_dump_spec) + , ( "ddump-prep", setDumpFlag Opt_D_dump_prep) + , ( "ddump-stg", setDumpFlag Opt_D_dump_stg) + , ( "ddump-stranal", setDumpFlag Opt_D_dump_stranal) + , ( "ddump-tc", setDumpFlag Opt_D_dump_tc) + , ( "ddump-types", setDumpFlag Opt_D_dump_types) + , ( "ddump-rules", setDumpFlag Opt_D_dump_rules) + , ( "ddump-cse", setDumpFlag Opt_D_dump_cse) + , ( "ddump-worker-wrapper", setDumpFlag Opt_D_dump_worker_wrapper) + , ( "ddump-rn-trace", NoArg (setDynFlag Opt_D_dump_rn_trace)) + , ( "ddump-if-trace", NoArg (setDynFlag Opt_D_dump_if_trace)) + , ( "ddump-tc-trace", setDumpFlag Opt_D_dump_tc_trace) + , ( "ddump-splices", setDumpFlag Opt_D_dump_splices) + , ( "ddump-rn-stats", NoArg (setDynFlag Opt_D_dump_rn_stats)) + , ( "ddump-opt-cmm", setDumpFlag Opt_D_dump_opt_cmm) + , ( "ddump-simpl-stats", setDumpFlag Opt_D_dump_simpl_stats) + , ( "ddump-bcos", setDumpFlag Opt_D_dump_BCOs) + , ( "dsource-stats", setDumpFlag Opt_D_source_stats) + , ( "dverbose-core2core", setDumpFlag Opt_D_verbose_core2core) + , ( "dverbose-stg2stg", setDumpFlag Opt_D_verbose_stg2stg) + , ( "ddump-hi-diffs", NoArg (setDynFlag Opt_D_dump_hi_diffs)) + , ( "ddump-hi", setDumpFlag Opt_D_dump_hi) + , ( "ddump-minimal-imports", NoArg (setDynFlag Opt_D_dump_minimal_imports)) + , ( "ddump-vect", setDumpFlag Opt_D_dump_vect) + , ( "dcore-lint", NoArg (setDynFlag Opt_DoCoreLinting)) + , ( "dstg-lint", NoArg (setDynFlag Opt_DoStgLinting)) + , ( "dcmm-lint", NoArg (setDynFlag Opt_DoCmmLinting)) + , ( "dshow-passes", NoArg (do unSetDynFlag Opt_RecompChecking + setVerbosity "2") ) + , ( "dfaststring-stats", NoArg (setDynFlag Opt_D_faststring_stats)) + + ------ Machine dependant (-m<blah>) stuff --------------------------- + + , ( "monly-2-regs", NoArg (upd (\s -> s{stolen_x86_regs = 2}) )) + , ( "monly-3-regs", NoArg (upd (\s -> s{stolen_x86_regs = 3}) )) + , ( "monly-4-regs", NoArg (upd (\s -> s{stolen_x86_regs = 4}) )) + + ------ Warning opts ------------------------------------------------- + , ( "W" , NoArg (mapM_ setDynFlag minusWOpts) ) + , ( "Werror" , NoArg (setDynFlag Opt_WarnIsError) ) + , ( "Wall" , NoArg (mapM_ setDynFlag minusWallOpts) ) + , ( "Wnot" , NoArg (mapM_ unSetDynFlag minusWallOpts) ) /* DEPREC */ + , ( "w" , NoArg (mapM_ unSetDynFlag minusWallOpts) ) + + ------ Optimisation flags ------------------------------------------ + , ( "O" , NoArg (upd (setOptLevel 1))) + , ( "Onot" , NoArg (upd (setOptLevel 0))) + , ( "O" , PrefixPred (all isDigit) + (\f -> upd (setOptLevel (read f)))) + + , ( 
"fmax-simplifier-iterations", + PrefixPred (all isDigit) + (\n -> upd (\dfs -> + dfs{ maxSimplIterations = read n })) ) + + , ( "frule-check", + SepArg (\s -> upd (\dfs -> dfs{ ruleCheck = Just s }))) + + ------ Compiler flags ----------------------------------------------- + + , ( "fno-code", NoArg (setTarget HscNothing)) + , ( "fasm", AnySuffix (\_ -> setTarget HscAsm) ) + , ( "fvia-c", NoArg (setTarget HscC) ) + , ( "fvia-C", NoArg (setTarget HscC) ) + , ( "filx", NoArg (setTarget HscILX) ) + + , ( "fglasgow-exts", NoArg (mapM_ setDynFlag glasgowExtsFlags) ) + , ( "fno-glasgow-exts", NoArg (mapM_ unSetDynFlag glasgowExtsFlags) ) + + -- the rest of the -f* and -fno-* flags + , ( "fno-", PrefixPred (\f -> isFFlag f) (\f -> unSetDynFlag (getFFlag f)) ) + , ( "f", PrefixPred (\f -> isFFlag f) (\f -> setDynFlag (getFFlag f)) ) + ] + +-- these -f<blah> flags can all be reversed with -fno-<blah> + +fFlags = [ + ( "warn-duplicate-exports", Opt_WarnDuplicateExports ), + ( "warn-hi-shadowing", Opt_WarnHiShadows ), + ( "warn-incomplete-patterns", Opt_WarnIncompletePatterns ), + ( "warn-incomplete-record-updates", Opt_WarnIncompletePatternsRecUpd ), + ( "warn-missing-fields", Opt_WarnMissingFields ), + ( "warn-missing-methods", Opt_WarnMissingMethods ), + ( "warn-missing-signatures", Opt_WarnMissingSigs ), + ( "warn-name-shadowing", Opt_WarnNameShadowing ), + ( "warn-overlapping-patterns", Opt_WarnOverlappingPatterns ), + ( "warn-simple-patterns", Opt_WarnSimplePatterns ), + ( "warn-type-defaults", Opt_WarnTypeDefaults ), + ( "warn-unused-binds", Opt_WarnUnusedBinds ), + ( "warn-unused-imports", Opt_WarnUnusedImports ), + ( "warn-unused-matches", Opt_WarnUnusedMatches ), + ( "warn-deprecations", Opt_WarnDeprecations ), + ( "warn-orphans", Opt_WarnOrphans ), + ( "fi", Opt_FFI ), -- support `-ffi'... + ( "ffi", Opt_FFI ), -- ...and also `-fffi' + ( "arrows", Opt_Arrows ), -- arrow syntax + ( "parr", Opt_PArr ), + ( "th", Opt_TH ), + ( "implicit-prelude", Opt_ImplicitPrelude ), + ( "scoped-type-variables", Opt_ScopedTypeVariables ), + ( "bang-patterns", Opt_BangPatterns ), + ( "monomorphism-restriction", Opt_MonomorphismRestriction ), + ( "implicit-params", Opt_ImplicitParams ), + ( "allow-overlapping-instances", Opt_AllowOverlappingInstances ), + ( "allow-undecidable-instances", Opt_AllowUndecidableInstances ), + ( "allow-incoherent-instances", Opt_AllowIncoherentInstances ), + ( "generics", Opt_Generics ), + ( "strictness", Opt_Strictness ), + ( "full-laziness", Opt_FullLaziness ), + ( "cse", Opt_CSE ), + ( "ignore-interface-pragmas", Opt_IgnoreInterfacePragmas ), + ( "omit-interface-pragmas", Opt_OmitInterfacePragmas ), + ( "do-lambda-eta-expansion", Opt_DoLambdaEtaExpansion ), + ( "ignore-asserts", Opt_IgnoreAsserts ), + ( "ignore-breakpoints", Opt_IgnoreBreakpoints), + ( "do-eta-reduction", Opt_DoEtaReduction ), + ( "case-merge", Opt_CaseMerge ), + ( "unbox-strict-fields", Opt_UnboxStrictFields ), + ( "excess-precision", Opt_ExcessPrecision ), + ( "asm-mangling", Opt_DoAsmMangling ) + ] + +glasgowExtsFlags = [ + Opt_GlasgowExts, + Opt_FFI, + Opt_TH, + Opt_ImplicitParams, + Opt_ScopedTypeVariables, + Opt_BangPatterns ] + +isFFlag f = f `elem` (map fst fFlags) +getFFlag f = fromJust (lookup f fFlags) + +-- ----------------------------------------------------------------------------- +-- Parsing the dynamic flags. 
+ +parseDynamicFlags :: DynFlags -> [String] -> IO (DynFlags,[String]) +parseDynamicFlags dflags args = do + let ((leftover,errs),dflags') + = runCmdLine (processArgs dynamic_flags args) dflags + when (not (null errs)) $ do + throwDyn (UsageError (unlines errs)) + return (dflags', leftover) + + +type DynP = CmdLineP DynFlags + +upd :: (DynFlags -> DynFlags) -> DynP () +upd f = do + dfs <- getCmdLineState + putCmdLineState $! (f dfs) + +setDynFlag, unSetDynFlag :: DynFlag -> DynP () +setDynFlag f = upd (\dfs -> dopt_set dfs f) +unSetDynFlag f = upd (\dfs -> dopt_unset dfs f) + +setDumpFlag :: DynFlag -> OptKind DynP +setDumpFlag dump_flag + = NoArg (unSetDynFlag Opt_RecompChecking >> setDynFlag dump_flag) + -- Whenver we -ddump, switch off the recompilation checker, + -- else you don't see the dump! + +setVerbosity "" = upd (\dfs -> dfs{ verbosity = 3 }) +setVerbosity n + | all isDigit n = upd (\dfs -> dfs{ verbosity = read n }) + | otherwise = throwDyn (UsageError "can't parse verbosity flag (-v<n>)") + +addCmdlineHCInclude a = upd (\s -> s{cmdlineHcIncludes = a : cmdlineHcIncludes s}) + +extraPkgConf_ p = upd (\s -> s{ extraPkgConfs = p : extraPkgConfs s }) + +exposePackage p = + upd (\s -> s{ packageFlags = ExposePackage p : packageFlags s }) +hidePackage p = + upd (\s -> s{ packageFlags = HidePackage p : packageFlags s }) +ignorePackage p = + upd (\s -> s{ packageFlags = IgnorePackage p : packageFlags s }) + +-- we can only switch between HscC, HscAsmm, and HscILX with dynamic flags +-- (-fvia-C, -fasm, -filx respectively). +setTarget l = upd (\dfs -> case hscTarget dfs of + HscC -> dfs{ hscTarget = l } + HscAsm -> dfs{ hscTarget = l } + HscILX -> dfs{ hscTarget = l } + _ -> dfs) + +setOptLevel :: Int -> DynFlags -> DynFlags +setOptLevel n dflags + | hscTarget dflags == HscInterpreted && n > 0 + = dflags + -- not in IO any more, oh well: + -- putStr "warning: -O conflicts with --interactive; -O ignored.\n" + | otherwise + = updOptLevel n dflags + + +setMainIs :: String -> DynP () +setMainIs arg + | not (null main_fn) -- The arg looked like "Foo.baz" + = upd $ \d -> d{ mainFunIs = Just main_fn, + mainModIs = mkModule main_mod } + + | isUpper (head main_mod) -- The arg looked like "Foo" + = upd $ \d -> d{ mainModIs = mkModule main_mod } + + | otherwise -- The arg looked like "baz" + = upd $ \d -> d{ mainFunIs = Just main_mod } + where + (main_mod, main_fn) = splitLongestPrefix arg (== '.') + +----------------------------------------------------------------------------- +-- Paths & Libraries + +-- -i on its own deletes the import paths +addImportPath "" = upd (\s -> s{importPaths = []}) +addImportPath p = upd (\s -> s{importPaths = importPaths s ++ splitPathList p}) + + +addLibraryPath p = + upd (\s -> s{libraryPaths = libraryPaths s ++ splitPathList p}) + +addIncludePath p = + upd (\s -> s{includePaths = includePaths s ++ splitPathList p}) + +addFrameworkPath p = + upd (\s -> s{frameworkPaths = frameworkPaths s ++ splitPathList p}) + +split_marker = ':' -- not configurable (ToDo) + +splitPathList :: String -> [String] +splitPathList s = filter notNull (splitUp s) + -- empty paths are ignored: there might be a trailing + -- ':' in the initial list, for example. Empty paths can + -- cause confusion when they are translated into -I options + -- for passing to gcc. + where +#ifndef mingw32_TARGET_OS + splitUp xs = split split_marker xs +#else + -- Windows: 'hybrid' support for DOS-style paths in directory lists. 
+ -- + -- That is, if "foo:bar:baz" is used, this interpreted as + -- consisting of three entries, 'foo', 'bar', 'baz'. + -- However, with "c:/foo:c:\\foo;x:/bar", this is interpreted + -- as 3 elts, "c:/foo", "c:\\foo", "x:/bar" + -- + -- Notice that no attempt is made to fully replace the 'standard' + -- split marker ':' with the Windows / DOS one, ';'. The reason being + -- that this will cause too much breakage for users & ':' will + -- work fine even with DOS paths, if you're not insisting on being silly. + -- So, use either. + splitUp [] = [] + splitUp (x:':':div:xs) | div `elem` dir_markers + = ((x:':':div:p): splitUp rs) + where + (p,rs) = findNextPath xs + -- we used to check for existence of the path here, but that + -- required the IO monad to be threaded through the command-line + -- parser which is quite inconvenient. The + splitUp xs = cons p (splitUp rs) + where + (p,rs) = findNextPath xs + + cons "" xs = xs + cons x xs = x:xs + + -- will be called either when we've consumed nought or the + -- "<Drive>:/" part of a DOS path, so splitting is just a Q of + -- finding the next split marker. + findNextPath xs = + case break (`elem` split_markers) xs of + (p, d:ds) -> (p, ds) + (p, xs) -> (p, xs) + + split_markers :: [Char] + split_markers = [':', ';'] + + dir_markers :: [Char] + dir_markers = ['/', '\\'] +#endif + +-- ----------------------------------------------------------------------------- +-- tmpDir, where we store temporary files. + +setTmpDir :: FilePath -> DynFlags -> DynFlags +setTmpDir dir dflags = dflags{ tmpDir = canonicalise dir } + where +#if !defined(mingw32_HOST_OS) + canonicalise p = normalisePath p +#else + -- Canonicalisation of temp path under win32 is a bit more + -- involved: (a) strip trailing slash, + -- (b) normalise slashes + -- (c) just in case, if there is a prefix /cygdrive/x/, change to x: + -- + canonicalise path = normalisePath (xltCygdrive (removeTrailingSlash path)) + + -- if we're operating under cygwin, and TMP/TEMP is of + -- the form "/cygdrive/drive/path", translate this to + -- "drive:/path" (as GHC isn't a cygwin app and doesn't + -- understand /cygdrive paths.) + xltCygdrive path + | "/cygdrive/" `isPrefixOf` path = + case drop (length "/cygdrive/") path of + drive:xs@('/':_) -> drive:':':xs + _ -> path + | otherwise = path + + -- strip the trailing backslash (awful, but we only do this once). + removeTrailingSlash path = + case last path of + '/' -> init path + '\\' -> init path + _ -> path +#endif + +----------------------------------------------------------------------------- +-- Via-C compilation stuff + +machdepCCOpts :: DynFlags -> ([String], -- flags for all C compilations + [String]) -- for registerised HC compilations +machdepCCOpts dflags +#if alpha_TARGET_ARCH + = ( ["-w", "-mieee" +#ifdef HAVE_THREADED_RTS_SUPPORT + , "-D_REENTRANT" +#endif + ], [] ) + -- For now, to suppress the gcc warning "call-clobbered + -- register used for global register variable", we simply + -- disable all warnings altogether using the -w flag. Oh well. + +#elif hppa_TARGET_ARCH + -- ___HPUX_SOURCE, not _HPUX_SOURCE, is #defined if -ansi! + -- (very nice, but too bad the HP /usr/include files don't agree.) + = ( ["-D_HPUX_SOURCE"], [] ) + +#elif m68k_TARGET_ARCH + -- -fno-defer-pop : for the .hc files, we want all the pushing/ + -- popping of args to routines to be explicit; if we let things + -- be deferred 'til after an STGJUMP, imminent death is certain! 
+ -- + -- -fomit-frame-pointer : *don't* + -- It's better to have a6 completely tied up being a frame pointer + -- rather than let GCC pick random things to do with it. + -- (If we want to steal a6, then we would try to do things + -- as on iX86, where we *do* steal the frame pointer [%ebp].) + = ( [], ["-fno-defer-pop", "-fno-omit-frame-pointer"] ) + +#elif i386_TARGET_ARCH + -- -fno-defer-pop : basically the same game as for m68k + -- + -- -fomit-frame-pointer : *must* in .hc files; because we're stealing + -- the fp (%ebp) for our register maps. + = let n_regs = stolen_x86_regs dflags + sta = opt_Static + in + ( [ if sta then "-DDONT_WANT_WIN32_DLL_SUPPORT" else "" +-- , if suffixMatch "mingw32" cTARGETPLATFORM then "-mno-cygwin" else "" + ], + [ "-fno-defer-pop", +#ifdef HAVE_GCC_MNO_OMIT_LFPTR + -- Some gccs are configured with + -- -momit-leaf-frame-pointer on by default, and it + -- apparently takes precedence over + -- -fomit-frame-pointer, so we disable it first here. + "-mno-omit-leaf-frame-pointer", +#endif + "-fomit-frame-pointer", + -- we want -fno-builtin, because when gcc inlines + -- built-in functions like memcpy() it tends to + -- run out of registers, requiring -monly-n-regs + "-fno-builtin", + "-DSTOLEN_X86_REGS="++show n_regs ] + ) + +#elif ia64_TARGET_ARCH + = ( [], ["-fomit-frame-pointer", "-G0"] ) + +#elif x86_64_TARGET_ARCH + = ( [], ["-fomit-frame-pointer", + "-fno-asynchronous-unwind-tables", + -- the unwind tables are unnecessary for HC code, + -- and get in the way of -split-objs. Another option + -- would be to throw them away in the mangler, but this + -- is easier. + "-fno-unit-at-a-time", + -- unit-at-a-time doesn't do us any good, and screws + -- up -split-objs by moving the split markers around. + -- It's only turned on with -O2, but put it here just + -- in case someone uses -optc-O2. + "-fno-builtin" + -- calling builtins like strlen() using the FFI can + -- cause gcc to run out of regs, so use the external + -- version. + ] ) + +#elif mips_TARGET_ARCH + = ( ["-static"], [] ) + +#elif sparc_TARGET_ARCH + = ( [], ["-w"] ) + -- For now, to suppress the gcc warning "call-clobbered + -- register used for global register variable", we simply + -- disable all warnings altogether using the -w flag. Oh well. + +#elif powerpc_apple_darwin_TARGET + -- -no-cpp-precomp: + -- Disable Apple's precompiling preprocessor. It's a great thing + -- for "normal" programs, but it doesn't support register variable + -- declarations. + = ( [], ["-no-cpp-precomp"] ) +#else + = ( [], [] ) +#endif + +picCCOpts :: DynFlags -> [String] +picCCOpts dflags +#if darwin_TARGET_OS + -- Apple prefers to do things the other way round. + -- PIC is on by default. + -- -mdynamic-no-pic: + -- Turn off PIC code generation. + -- -fno-common: + -- Don't generate "common" symbols - these are unwanted + -- in dynamic libraries. 
+ + | opt_PIC + = ["-fno-common"] + | otherwise + = ["-mdynamic-no-pic"] +#elif mingw32_TARGET_OS + -- no -fPIC for Windows + = [] +#else + | opt_PIC + = ["-fPIC"] + | otherwise + = [] +#endif + +-- ----------------------------------------------------------------------------- +-- Splitting + +can_split :: Bool +can_split = +#if defined(i386_TARGET_ARCH) \ + || defined(x86_64_TARGET_ARCH) \ + || defined(alpha_TARGET_ARCH) \ + || defined(hppa_TARGET_ARCH) \ + || defined(m68k_TARGET_ARCH) \ + || defined(mips_TARGET_ARCH) \ + || defined(powerpc_TARGET_ARCH) \ + || defined(rs6000_TARGET_ARCH) \ + || defined(sparc_TARGET_ARCH) + True +#else + False +#endif + diff --git a/compiler/main/ErrUtils.hi-boot-6 b/compiler/main/ErrUtils.hi-boot-6 new file mode 100644 index 0000000000..fd98ca3950 --- /dev/null +++ b/compiler/main/ErrUtils.hi-boot-6 @@ -0,0 +1,11 @@ +module ErrUtils where + +data Severity + = SevInfo + | SevWarning + | SevError + | SevFatal + +type Message = Outputable.SDoc + +mkLocMessage :: SrcLoc.SrcSpan -> Message -> Message diff --git a/compiler/main/ErrUtils.lhs b/compiler/main/ErrUtils.lhs new file mode 100644 index 0000000000..90e5dc87b6 --- /dev/null +++ b/compiler/main/ErrUtils.lhs @@ -0,0 +1,260 @@ +% +% (c) The AQUA Project, Glasgow University, 1994-1998 +% +\section[ErrsUtils]{Utilities for error reporting} + +\begin{code} +module ErrUtils ( + Message, mkLocMessage, printError, + Severity(..), + + ErrMsg, WarnMsg, + errMsgSpans, errMsgContext, errMsgShortDoc, errMsgExtraInfo, + Messages, errorsFound, emptyMessages, + mkErrMsg, mkWarnMsg, mkPlainErrMsg, mkLongErrMsg, + printErrorsAndWarnings, printBagOfErrors, printBagOfWarnings, + + ghcExit, + doIfSet, doIfSet_dyn, + dumpIfSet, dumpIfSet_core, dumpIfSet_dyn, dumpIfSet_dyn_or, mkDumpDoc, + + -- * Messages during compilation + putMsg, + errorMsg, + fatalErrorMsg, + compilationProgressMsg, + showPass, + debugTraceMsg, + ) where + +#include "HsVersions.h" + +import Bag ( Bag, bagToList, isEmptyBag, emptyBag ) +import SrcLoc ( SrcSpan ) +import Util ( sortLe, global ) +import Outputable +import qualified Pretty +import SrcLoc ( srcSpanStart, noSrcSpan ) +import DynFlags ( DynFlags(..), DynFlag(..), dopt ) +import StaticFlags ( opt_ErrorSpans ) +import System ( ExitCode(..), exitWith ) +import DATA_IOREF +import IO ( hPutStrLn, stderr ) +import DYNAMIC + + +-- ----------------------------------------------------------------------------- +-- Basic error messages: just render a message with a source location. + +type Message = SDoc + +data Severity + = SevInfo + | SevWarning + | SevError + | SevFatal + +mkLocMessage :: SrcSpan -> Message -> Message +mkLocMessage locn msg + | opt_ErrorSpans = hang (ppr locn <> colon) 4 msg + | otherwise = hang (ppr (srcSpanStart locn) <> colon) 4 msg + -- always print the location, even if it is unhelpful. Error messages + -- are supposed to be in a standard format, and one without a location + -- would look strange. Better to say explicitly "<no location info>". + +printError :: SrcSpan -> Message -> IO () +printError span msg = printErrs (mkLocMessage span msg $ defaultErrStyle) + + +-- ----------------------------------------------------------------------------- +-- Collecting up messages for later ordering and printing. 
+ +data ErrMsg = ErrMsg { + errMsgSpans :: [SrcSpan], + errMsgContext :: PrintUnqualified, + errMsgShortDoc :: Message, + errMsgExtraInfo :: Message + } + -- The SrcSpan is used for sorting errors into line-number order + -- NB Pretty.Doc not SDoc: we deal with the printing style (in ptic + -- whether to qualify an External Name) at the error occurrence + +-- So we can throw these things as exceptions +errMsgTc :: TyCon +errMsgTc = mkTyCon "ErrMsg" +{-# NOINLINE errMsgTc #-} +instance Typeable ErrMsg where +#if __GLASGOW_HASKELL__ < 603 + typeOf _ = mkAppTy errMsgTc [] +#else + typeOf _ = mkTyConApp errMsgTc [] +#endif + +type WarnMsg = ErrMsg + +-- A short (one-line) error message, with context to tell us whether +-- to qualify names in the message or not. +mkErrMsg :: SrcSpan -> PrintUnqualified -> Message -> ErrMsg +mkErrMsg locn print_unqual msg + = ErrMsg [locn] print_unqual msg empty + +-- Variant that doesn't care about qualified/unqualified names +mkPlainErrMsg :: SrcSpan -> Message -> ErrMsg +mkPlainErrMsg locn msg + = ErrMsg [locn] alwaysQualify msg empty + +-- A long (multi-line) error message, with context to tell us whether +-- to qualify names in the message or not. +mkLongErrMsg :: SrcSpan -> PrintUnqualified -> Message -> Message -> ErrMsg +mkLongErrMsg locn print_unqual msg extra + = ErrMsg [locn] print_unqual msg extra + +mkWarnMsg :: SrcSpan -> PrintUnqualified -> Message -> WarnMsg +mkWarnMsg = mkErrMsg + +type Messages = (Bag WarnMsg, Bag ErrMsg) + +emptyMessages :: Messages +emptyMessages = (emptyBag, emptyBag) + +errorsFound :: DynFlags -> Messages -> Bool +-- The dyn-flags are used to see if the user has specified +-- -Werorr, which says that warnings should be fatal +errorsFound dflags (warns, errs) + | dopt Opt_WarnIsError dflags = not (isEmptyBag errs) || not (isEmptyBag warns) + | otherwise = not (isEmptyBag errs) + +printErrorsAndWarnings :: DynFlags -> Messages -> IO () +printErrorsAndWarnings dflags (warns, errs) + | no_errs && no_warns = return () + | no_errs = printBagOfWarnings dflags warns + -- Don't print any warnings if there are errors + | otherwise = printBagOfErrors dflags errs + where + no_warns = isEmptyBag warns + no_errs = isEmptyBag errs + +printBagOfErrors :: DynFlags -> Bag ErrMsg -> IO () +printBagOfErrors dflags bag_of_errors + = sequence_ [ let style = mkErrStyle unqual + in log_action dflags SevError s style (d $$ e) + | ErrMsg { errMsgSpans = s:ss, + errMsgShortDoc = d, + errMsgExtraInfo = e, + errMsgContext = unqual } <- sorted_errs ] + where + bag_ls = bagToList bag_of_errors + sorted_errs = sortLe occ'ed_before bag_ls + + occ'ed_before err1 err2 = + case compare (head (errMsgSpans err1)) (head (errMsgSpans err2)) of + LT -> True + EQ -> True + GT -> False + +printBagOfWarnings :: DynFlags -> Bag ErrMsg -> IO () +printBagOfWarnings dflags bag_of_warns + = sequence_ [ let style = mkErrStyle unqual + in log_action dflags SevWarning s style (d $$ e) + | ErrMsg { errMsgSpans = s:ss, + errMsgShortDoc = d, + errMsgExtraInfo = e, + errMsgContext = unqual } <- sorted_errs ] + where + bag_ls = bagToList bag_of_warns + sorted_errs = sortLe occ'ed_before bag_ls + + occ'ed_before err1 err2 = + case compare (head (errMsgSpans err1)) (head (errMsgSpans err2)) of + LT -> True + EQ -> True + GT -> False +\end{code} + +\begin{code} +ghcExit :: DynFlags -> Int -> IO () +ghcExit dflags val + | val == 0 = exitWith ExitSuccess + | otherwise = do errorMsg dflags (text "\nCompilation had errors\n\n") + exitWith (ExitFailure val) +\end{code} + +\begin{code} 
+doIfSet :: Bool -> IO () -> IO () +doIfSet flag action | flag = action + | otherwise = return () + +doIfSet_dyn :: DynFlags -> DynFlag -> IO () -> IO() +doIfSet_dyn dflags flag action | dopt flag dflags = action + | otherwise = return () +\end{code} + +\begin{code} +dumpIfSet :: Bool -> String -> SDoc -> IO () +dumpIfSet flag hdr doc + | not flag = return () + | otherwise = printDump (mkDumpDoc hdr doc) + +dumpIfSet_core :: DynFlags -> DynFlag -> String -> SDoc -> IO () +dumpIfSet_core dflags flag hdr doc + | dopt flag dflags + || verbosity dflags >= 4 + || dopt Opt_D_verbose_core2core dflags = printDump (mkDumpDoc hdr doc) + | otherwise = return () + +dumpIfSet_dyn :: DynFlags -> DynFlag -> String -> SDoc -> IO () +dumpIfSet_dyn dflags flag hdr doc + | dopt flag dflags || verbosity dflags >= 4 + = printDump (mkDumpDoc hdr doc) + | otherwise + = return () + +dumpIfSet_dyn_or :: DynFlags -> [DynFlag] -> String -> SDoc -> IO () +dumpIfSet_dyn_or dflags flags hdr doc + | or [dopt flag dflags | flag <- flags] + || verbosity dflags >= 4 + = printDump (mkDumpDoc hdr doc) + | otherwise = return () + +mkDumpDoc hdr doc + = vcat [text "", + line <+> text hdr <+> line, + doc, + text ""] + where + line = text (replicate 20 '=') + +-- ----------------------------------------------------------------------------- +-- Outputting messages from the compiler + +-- We want all messages to go through one place, so that we can +-- redirect them if necessary. For example, when GHC is used as a +-- library we might want to catch all messages that GHC tries to +-- output and do something else with them. + +ifVerbose :: DynFlags -> Int -> IO () -> IO () +ifVerbose dflags val act + | verbosity dflags >= val = act + | otherwise = return () + +putMsg :: DynFlags -> Message -> IO () +putMsg dflags msg = log_action dflags SevInfo noSrcSpan defaultUserStyle msg + +errorMsg :: DynFlags -> Message -> IO () +errorMsg dflags msg = log_action dflags SevError noSrcSpan defaultErrStyle msg + +fatalErrorMsg :: DynFlags -> Message -> IO () +fatalErrorMsg dflags msg = log_action dflags SevFatal noSrcSpan defaultErrStyle msg + +compilationProgressMsg :: DynFlags -> String -> IO () +compilationProgressMsg dflags msg + = ifVerbose dflags 1 (log_action dflags SevInfo noSrcSpan defaultUserStyle (text msg)) + +showPass :: DynFlags -> String -> IO () +showPass dflags what + = ifVerbose dflags 2 (log_action dflags SevInfo noSrcSpan defaultUserStyle (text "***" <+> text what <> colon)) + +debugTraceMsg :: DynFlags -> Int -> Message -> IO () +debugTraceMsg dflags val msg + = ifVerbose dflags val (log_action dflags SevInfo noSrcSpan defaultDumpStyle msg) +\end{code} diff --git a/compiler/main/ErrUtils.lhs-boot b/compiler/main/ErrUtils.lhs-boot new file mode 100644 index 0000000000..77d6cfdb4a --- /dev/null +++ b/compiler/main/ErrUtils.lhs-boot @@ -0,0 +1,16 @@ +\begin{code} +module ErrUtils where + +import Outputable (SDoc) +import SrcLoc (SrcSpan) + +data Severity + = SevInfo + | SevWarning + | SevError + | SevFatal + +type Message = SDoc + +mkLocMessage :: SrcSpan -> Message -> Message +\end{code} diff --git a/compiler/main/Finder.lhs b/compiler/main/Finder.lhs new file mode 100644 index 0000000000..fbde40f6ea --- /dev/null +++ b/compiler/main/Finder.lhs @@ -0,0 +1,499 @@ +% +% (c) The University of Glasgow, 2000 +% +\section[Finder]{Module Finder} + +\begin{code} +module Finder ( + flushFinderCache, -- :: IO () + FindResult(..), + findModule, -- :: ModuleName -> Bool -> IO FindResult + findPackageModule, -- :: ModuleName -> Bool -> IO 
FindResult + mkHomeModLocation, -- :: ModuleName -> FilePath -> IO ModLocation + mkHomeModLocation2, -- :: ModuleName -> FilePath -> String -> IO ModLocation + addHomeModuleToFinder, -- :: HscEnv -> Module -> ModLocation -> IO () + uncacheModule, -- :: HscEnv -> Module -> IO () + mkStubPaths, + + findObjectLinkableMaybe, + findObjectLinkable, + + cantFindError, -- :: DynFlags -> Module -> FindResult -> SDoc + ) where + +#include "HsVersions.h" + +import Module +import UniqFM ( filterUFM, delFromUFM ) +import HscTypes +import Packages +import FastString +import Util +import DynFlags ( DynFlags(..), isOneShot, GhcMode(..) ) +import Outputable +import Maybes ( expectJust ) + +import DATA_IOREF ( IORef, writeIORef, readIORef ) + +import Data.List +import System.Directory +import System.IO +import Control.Monad +import Data.Maybe ( isNothing ) +import Time ( ClockTime ) + + +type FileExt = String -- Filename extension +type BaseName = String -- Basename of file + +-- ----------------------------------------------------------------------------- +-- The Finder + +-- The Finder provides a thin filesystem abstraction to the rest of +-- the compiler. For a given module, it can tell you where the +-- source, interface, and object files for that module live. + +-- It does *not* know which particular package a module lives in. Use +-- Packages.lookupModuleInAllPackages for that. + +-- ----------------------------------------------------------------------------- +-- The finder's cache + +-- remove all the home modules from the cache; package modules are +-- assumed to not move around during a session. +flushFinderCache :: IORef FinderCache -> IO () +flushFinderCache finder_cache = do + fm <- readIORef finder_cache + writeIORef finder_cache $! filterUFM (\(loc,m) -> isNothing m) fm + +addToFinderCache :: IORef FinderCache -> Module -> FinderCacheEntry -> IO () +addToFinderCache finder_cache mod_name entry = do + fm <- readIORef finder_cache + writeIORef finder_cache $! extendModuleEnv fm mod_name entry + +removeFromFinderCache :: IORef FinderCache -> Module -> IO () +removeFromFinderCache finder_cache mod_name = do + fm <- readIORef finder_cache + writeIORef finder_cache $! delFromUFM fm mod_name + +lookupFinderCache :: IORef FinderCache -> Module -> IO (Maybe FinderCacheEntry) +lookupFinderCache finder_cache mod_name = do + fm <- readIORef finder_cache + return $! lookupModuleEnv fm mod_name + +-- ----------------------------------------------------------------------------- +-- The two external entry points + +-- This is the main interface to the finder, which maps ModuleNames to +-- Modules and ModLocations. +-- +-- The Module contains one crucial bit of information about a module: +-- whether it lives in the current ("home") package or not (see Module +-- for more details). +-- +-- The ModLocation contains the names of all the files associated with +-- that module: its source file, .hi file, object file, etc. + +data FindResult + = Found ModLocation PackageIdH + -- the module was found + | FoundMultiple [PackageId] + -- *error*: both in multiple packages + | PackageHidden PackageId + -- for an explicit source import: the package containing the module is + -- not exposed. + | ModuleHidden PackageId + -- for an explicit source import: the package containing the module is + -- exposed, but the module itself is hidden. + | NotFound [FilePath] + -- the module was not found, the specified places were searched. 
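
Editor's note: the cache discipline described above (consult the IORef-held map first, fall back to the filesystem search only on a miss, then record the answer) is a common pattern. A minimal, self-contained sketch of it, using invented names and Data.Map rather than the Finder's own types, might look like this:

    import Data.IORef
    import qualified Data.Map as Map

    type Cache k v = IORef (Map.Map k v)

    -- Look the key up in the cache; on a miss, run the slow action
    -- (e.g. a search over the import paths) and remember its result.
    cachedLookup :: Ord k => Cache k v -> (k -> IO v) -> k -> IO v
    cachedLookup ref slow key = do
      m <- readIORef ref
      case Map.lookup key m of
        Just v  -> return v
        Nothing -> do
          v <- slow key
          modifyIORef' ref (Map.insert key v)
          return v

    main :: IO ()
    main = do
      cache <- newIORef Map.empty
      loc1 <- cachedLookup cache expensiveSearch "Data.Maybe"
      loc2 <- cachedLookup cache expensiveSearch "Data.Maybe"  -- served from the cache
      putStrLn (loc1 ++ " / " ++ loc2)
      where expensiveSearch m = do putStrLn ("searching for " ++ m)
                                   return (m ++ ".hi")

The "flush" operation then just filters the map, which is why home modules (which may move or be recompiled) can be dropped while package entries are kept, as the comments above explain.
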
+ +findModule :: HscEnv -> Module -> Bool -> IO FindResult +findModule = findModule' True + +findPackageModule :: HscEnv -> Module -> Bool -> IO FindResult +findPackageModule = findModule' False + + +data LocalFindResult + = Ok FinderCacheEntry + | CantFindAmongst [FilePath] + | MultiplePackages [PackageId] + +findModule' :: Bool -> HscEnv -> Module -> Bool -> IO FindResult +findModule' home_allowed hsc_env name explicit + = do -- First try the cache + mb_entry <- lookupFinderCache cache name + case mb_entry of + Just old_entry -> return $! found old_entry + Nothing -> not_cached + + where + cache = hsc_FC hsc_env + dflags = hsc_dflags hsc_env + + -- We've found the module, so the remaining question is + -- whether it's visible or not + found :: FinderCacheEntry -> FindResult + found (loc, Nothing) + | home_allowed = Found loc HomePackage + | otherwise = NotFound [] + found (loc, Just (pkg, exposed_mod)) + | explicit && not exposed_mod = ModuleHidden pkg_name + | explicit && not (exposed pkg) = PackageHidden pkg_name + | otherwise = + Found loc (ExtPackage (mkPackageId (package pkg))) + where + pkg_name = packageConfigId pkg + + found_new entry = do + addToFinderCache cache name entry + return $! found entry + + not_cached + | not home_allowed = do + j <- findPackageModule' dflags name + case j of + Ok entry -> found_new entry + MultiplePackages pkgs -> return (FoundMultiple pkgs) + CantFindAmongst paths -> return (NotFound paths) + + | otherwise = do + j <- findHomeModule' dflags name + case j of + Ok entry -> found_new entry + MultiplePackages pkgs -> return (FoundMultiple pkgs) + CantFindAmongst home_files -> do + r <- findPackageModule' dflags name + case r of + CantFindAmongst pkg_files -> + return (NotFound (home_files ++ pkg_files)) + MultiplePackages pkgs -> + return (FoundMultiple pkgs) + Ok entry -> + found_new entry + +addHomeModuleToFinder :: HscEnv -> Module -> ModLocation -> IO () +addHomeModuleToFinder hsc_env mod loc + = addToFinderCache (hsc_FC hsc_env) mod (loc, Nothing) + +uncacheModule :: HscEnv -> Module -> IO () +uncacheModule hsc_env mod = removeFromFinderCache (hsc_FC hsc_env) mod + +-- ----------------------------------------------------------------------------- +-- The internal workers + +findHomeModule' :: DynFlags -> Module -> IO LocalFindResult +findHomeModule' dflags mod = do + let home_path = importPaths dflags + hisuf = hiSuf dflags + + let + source_exts = + [ ("hs", mkHomeModLocationSearched dflags mod "hs") + , ("lhs", mkHomeModLocationSearched dflags mod "lhs") + ] + + hi_exts = [ (hisuf, mkHiOnlyModLocation dflags hisuf) + , (addBootSuffix hisuf, mkHiOnlyModLocation dflags hisuf) + ] + + -- In compilation manager modes, we look for source files in the home + -- package because we can compile these automatically. In one-shot + -- compilation mode we look for .hi and .hi-boot files only. + exts | isOneShot (ghcMode dflags) = hi_exts + | otherwise = source_exts + + searchPathExts home_path mod exts + +findPackageModule' :: DynFlags -> Module -> IO LocalFindResult +findPackageModule' dflags mod + = case lookupModuleInAllPackages dflags mod of + [] -> return (CantFindAmongst []) + [pkg_info] -> findPackageIface dflags mod pkg_info + many -> return (MultiplePackages (map (mkPackageId.package.fst) many)) + +findPackageIface :: DynFlags -> Module -> (PackageConfig,Bool) -> IO LocalFindResult +findPackageIface dflags mod pkg_info@(pkg_conf, _) = do + let + tag = buildTag dflags + + -- hi-suffix for packages depends on the build tag. 
+ package_hisuf | null tag = "hi" + | otherwise = tag ++ "_hi" + hi_exts = + [ (package_hisuf, + mkPackageModLocation dflags pkg_info package_hisuf) ] + + source_exts = + [ ("hs", mkPackageModLocation dflags pkg_info package_hisuf) + , ("lhs", mkPackageModLocation dflags pkg_info package_hisuf) + ] + + -- mkdependHS needs to look for source files in packages too, so + -- that we can make dependencies between package before they have + -- been built. + exts + | MkDepend <- ghcMode dflags = hi_exts ++ source_exts + | otherwise = hi_exts + -- we never look for a .hi-boot file in an external package; + -- .hi-boot files only make sense for the home package. + + searchPathExts (importDirs pkg_conf) mod exts + +-- ----------------------------------------------------------------------------- +-- General path searching + +searchPathExts + :: [FilePath] -- paths to search + -> Module -- module name + -> [ ( + FileExt, -- suffix + FilePath -> BaseName -> IO FinderCacheEntry -- action + ) + ] + -> IO LocalFindResult + +searchPathExts paths mod exts + = do result <- search to_search +{- + hPutStrLn stderr (showSDoc $ + vcat [text "Search" <+> ppr mod <+> sep (map (text. fst) exts) + , nest 2 (vcat (map text paths)) + , case result of + Succeeded (loc, p) -> text "Found" <+> ppr loc + Failed fs -> text "not found"]) +-} + return result + + where + basename = dots_to_slashes (moduleString mod) + + to_search :: [(FilePath, IO FinderCacheEntry)] + to_search = [ (file, fn path basename) + | path <- paths, + (ext,fn) <- exts, + let base | path == "." = basename + | otherwise = path `joinFileName` basename + file = base `joinFileExt` ext + ] + + search [] = return (CantFindAmongst (map fst to_search)) + search ((file, mk_result) : rest) = do + b <- doesFileExist file + if b + then do { res <- mk_result; return (Ok res) } + else search rest + +mkHomeModLocationSearched :: DynFlags -> Module -> FileExt + -> FilePath -> BaseName -> IO FinderCacheEntry +mkHomeModLocationSearched dflags mod suff path basename = do + loc <- mkHomeModLocation2 dflags mod (path `joinFileName` basename) suff + return (loc, Nothing) + +mkHiOnlyModLocation :: DynFlags -> FileExt -> FilePath -> BaseName + -> IO FinderCacheEntry +mkHiOnlyModLocation dflags hisuf path basename = do + loc <- hiOnlyModLocation dflags path basename hisuf + return (loc, Nothing) + +mkPackageModLocation :: DynFlags -> (PackageConfig, Bool) -> FileExt + -> FilePath -> BaseName -> IO FinderCacheEntry +mkPackageModLocation dflags pkg_info hisuf path basename = do + loc <- hiOnlyModLocation dflags path basename hisuf + return (loc, Just pkg_info) + +-- ----------------------------------------------------------------------------- +-- Constructing a home module location + +-- This is where we construct the ModLocation for a module in the home +-- package, for which we have a source file. It is called from three +-- places: +-- +-- (a) Here in the finder, when we are searching for a module to import, +-- using the search path (-i option). +-- +-- (b) The compilation manager, when constructing the ModLocation for +-- a "root" module (a source file named explicitly on the command line +-- or in a :load command in GHCi). +-- +-- (c) The driver in one-shot mode, when we need to construct a +-- ModLocation for a source file named on the command-line. +-- +-- Parameters are: +-- +-- mod +-- The name of the module +-- +-- path +-- (a): The search path component where the source file was found. +-- (b) and (c): "." 
+-- +-- src_basename +-- (a): dots_to_slashes (moduleNameUserString mod) +-- (b) and (c): The filename of the source file, minus its extension +-- +-- ext +-- The filename extension of the source file (usually "hs" or "lhs"). + +mkHomeModLocation :: DynFlags -> Module -> FilePath -> IO ModLocation +mkHomeModLocation dflags mod src_filename = do + let (basename,extension) = splitFilename src_filename + mkHomeModLocation2 dflags mod basename extension + +mkHomeModLocation2 :: DynFlags + -> Module + -> FilePath -- Of source module, without suffix + -> String -- Suffix + -> IO ModLocation +mkHomeModLocation2 dflags mod src_basename ext = do + let mod_basename = dots_to_slashes (moduleString mod) + + obj_fn <- mkObjPath dflags src_basename mod_basename + hi_fn <- mkHiPath dflags src_basename mod_basename + + return (ModLocation{ ml_hs_file = Just (src_basename `joinFileExt` ext), + ml_hi_file = hi_fn, + ml_obj_file = obj_fn }) + +hiOnlyModLocation :: DynFlags -> FilePath -> String -> Suffix -> IO ModLocation +hiOnlyModLocation dflags path basename hisuf + = do let full_basename = path `joinFileName` basename + obj_fn <- mkObjPath dflags full_basename basename + return ModLocation{ ml_hs_file = Nothing, + ml_hi_file = full_basename `joinFileExt` hisuf, + -- Remove the .hi-boot suffix from + -- hi_file, if it had one. We always + -- want the name of the real .hi file + -- in the ml_hi_file field. + ml_obj_file = obj_fn + } + +-- | Constructs the filename of a .o file for a given source file. +-- Does /not/ check whether the .o file exists +mkObjPath + :: DynFlags + -> FilePath -- the filename of the source file, minus the extension + -> String -- the module name with dots replaced by slashes + -> IO FilePath +mkObjPath dflags basename mod_basename + = do let + odir = objectDir dflags + osuf = objectSuf dflags + + obj_basename | Just dir <- odir = dir `joinFileName` mod_basename + | otherwise = basename + + return (obj_basename `joinFileExt` osuf) + +-- | Constructs the filename of a .hi file for a given source file. +-- Does /not/ check whether the .hi file exists +mkHiPath + :: DynFlags + -> FilePath -- the filename of the source file, minus the extension + -> String -- the module name with dots replaced by slashes + -> IO FilePath +mkHiPath dflags basename mod_basename + = do let + hidir = hiDir dflags + hisuf = hiSuf dflags + + hi_basename | Just dir <- hidir = dir `joinFileName` mod_basename + | otherwise = basename + + return (hi_basename `joinFileExt` hisuf) + + +-- ----------------------------------------------------------------------------- +-- Filenames of the stub files + +-- We don't have to store these in ModLocations, because they can be derived +-- from other available information, and they're only rarely needed. + +mkStubPaths + :: DynFlags + -> Module + -> ModLocation + -> (FilePath,FilePath) + +mkStubPaths dflags mod location + = let + stubdir = stubDir dflags + + mod_basename = dots_to_slashes (moduleString mod) + src_basename = basenameOf (expectJust "mkStubPaths" + (ml_hs_file location)) + + stub_basename0 + | Just dir <- stubdir = dir `joinFileName` mod_basename + | otherwise = src_basename + + stub_basename = stub_basename0 ++ "_stub" + in + (stub_basename `joinFileExt` "c", + stub_basename `joinFileExt` "h") + -- the _stub.o filename is derived from the ml_obj_file. 
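
Editor's note: mkObjPath and mkHiPath above are essentially string surgery: turn the dotted module name into a relative path, pick either the -odir/-hidir directory or the source basename, and attach the configured suffix. A rough stand-alone illustration (the helper names here are invented; GHC's own joinFileName/joinFileExt handle the platform details) could be:

    -- Hypothetical helpers, for illustration only.
    dotsToSlashes :: String -> String
    dotsToSlashes = map (\c -> if c == '.' then '/' else c)

    -- With an output directory set, the path is built from the module name;
    -- without one, the source file's basename is used instead, mirroring the
    -- fallback in mkObjPath/mkHiPath above.
    objPath :: Maybe FilePath -> String -> FilePath -> String -> FilePath
    objPath modir osuf src_basename modname =
      case modir of
        Just dir -> dir ++ "/" ++ dotsToSlashes modname ++ "." ++ osuf
        Nothing  -> src_basename ++ "." ++ osuf

    main :: IO ()
    main = putStrLn (objPath (Just "dist/build") "o" "src/Data/Maybe" "Data.Maybe")
    -- prints dist/build/Data/Maybe.o
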
+ +-- ----------------------------------------------------------------------------- +-- findLinkable isn't related to the other stuff in here, +-- but there's no other obvious place for it + +findObjectLinkableMaybe :: Module -> ModLocation -> IO (Maybe Linkable) +findObjectLinkableMaybe mod locn + = do let obj_fn = ml_obj_file locn + maybe_obj_time <- modificationTimeIfExists obj_fn + case maybe_obj_time of + Nothing -> return Nothing + Just obj_time -> liftM Just (findObjectLinkable mod obj_fn obj_time) + +-- Make an object linkable when we know the object file exists, and we know +-- its modification time. +findObjectLinkable :: Module -> FilePath -> ClockTime -> IO Linkable +findObjectLinkable mod obj_fn obj_time = do + let stub_fn = case splitFilename3 obj_fn of + (dir, base, ext) -> dir ++ "/" ++ base ++ "_stub.o" + stub_exist <- doesFileExist stub_fn + if stub_exist + then return (LM obj_time mod [DotO obj_fn, DotO stub_fn]) + else return (LM obj_time mod [DotO obj_fn]) + +-- ----------------------------------------------------------------------------- +-- Utils + +dots_to_slashes = map (\c -> if c == '.' then '/' else c) + + +-- ----------------------------------------------------------------------------- +-- Error messages + +cantFindError :: DynFlags -> Module -> FindResult -> SDoc +cantFindError dflags mod_name (FoundMultiple pkgs) + = hang (ptext SLIT("Cannot import") <+> quotes (ppr mod_name) <> colon) 2 ( + sep [ptext SLIT("it was found in multiple packages:"), + hsep (map (text.packageIdString) pkgs)] + ) +cantFindError dflags mod_name find_result + = hang (ptext SLIT("Could not find module") <+> quotes (ppr mod_name) <> colon) + 2 more_info + where + more_info + = case find_result of + PackageHidden pkg + -> ptext SLIT("it is a member of package") <+> ppr pkg <> comma + <+> ptext SLIT("which is hidden") + + ModuleHidden pkg + -> ptext SLIT("it is hidden") <+> parens (ptext SLIT("in package") + <+> ppr pkg) + + NotFound files + | null files + -> ptext SLIT("it is not a module in the current program, or in any known package.") + | verbosity dflags < 3 + -> ptext SLIT("use -v to see a list of the files searched for") + | otherwise + -> hang (ptext SLIT("locations searched:")) + 2 (vcat (map text files)) + + _ -> panic "cantFindErr" +\end{code} diff --git a/compiler/main/GHC.hs b/compiler/main/GHC.hs new file mode 100644 index 0000000000..3f91af6cc4 --- /dev/null +++ b/compiler/main/GHC.hs @@ -0,0 +1,2053 @@ +-- ----------------------------------------------------------------------------- +-- +-- (c) The University of Glasgow, 2005 +-- +-- The GHC API +-- +-- ----------------------------------------------------------------------------- + +module GHC ( + -- * Initialisation + Session, + defaultErrorHandler, + defaultCleanupHandler, + init, initFromArgs, + newSession, + + -- * Flags and settings + DynFlags(..), DynFlag(..), Severity(..), GhcMode(..), HscTarget(..), dopt, + parseDynamicFlags, + initPackages, + getSessionDynFlags, + setSessionDynFlags, + + -- * Targets + Target(..), TargetId(..), Phase, + setTargets, + getTargets, + addTarget, + removeTarget, + guessTarget, + + -- * Loading\/compiling the program + depanal, + load, LoadHowMuch(..), SuccessFlag(..), -- also does depanal + workingDirectoryChanged, + checkModule, CheckedModule(..), + TypecheckedSource, ParsedSource, RenamedSource, + + -- * Inspecting the module structure of the program + ModuleGraph, ModSummary(..), ModLocation(..), + getModuleGraph, + isLoaded, + topSortModuleGraph, + + -- * Inspecting modules + 
ModuleInfo, + getModuleInfo, + modInfoTyThings, + modInfoTopLevelScope, + modInfoPrintUnqualified, + modInfoExports, + modInfoInstances, + modInfoIsExportedName, + modInfoLookupName, + lookupGlobalName, + + -- * Printing + PrintUnqualified, alwaysQualify, + + -- * Interactive evaluation + getBindings, getPrintUnqual, +#ifdef GHCI + setContext, getContext, + getNamesInScope, + getRdrNamesInScope, + moduleIsInterpreted, + getInfo, + exprType, + typeKind, + parseName, + RunResult(..), + runStmt, + showModule, + compileExpr, HValue, + lookupName, +#endif + + -- * Abstract syntax elements + + -- ** Modules + Module, mkModule, pprModule, + + -- ** Names + Name, + nameModule, nameParent_maybe, pprParenSymName, nameSrcLoc, + NamedThing(..), + RdrName(Qual,Unqual), + + -- ** Identifiers + Id, idType, + isImplicitId, isDeadBinder, + isExportedId, isLocalId, isGlobalId, + isRecordSelector, + isPrimOpId, isFCallId, isClassOpId_maybe, + isDataConWorkId, idDataCon, + isBottomingId, isDictonaryId, + recordSelectorFieldLabel, + + -- ** Type constructors + TyCon, + tyConTyVars, tyConDataCons, tyConArity, + isClassTyCon, isSynTyCon, isNewTyCon, isPrimTyCon, isFunTyCon, + synTyConDefn, synTyConRhs, + + -- ** Type variables + TyVar, + alphaTyVars, + + -- ** Data constructors + DataCon, + dataConSig, dataConType, dataConTyCon, dataConFieldLabels, + dataConIsInfix, isVanillaDataCon, + dataConStrictMarks, + StrictnessMark(..), isMarkedStrict, + + -- ** Classes + Class, + classMethods, classSCTheta, classTvsFds, + pprFundeps, + + -- ** Instances + Instance, + instanceDFunId, pprInstance, pprInstanceHdr, + + -- ** Types and Kinds + Type, dropForAlls, splitForAllTys, funResultTy, pprParendType, + Kind, + PredType, + ThetaType, pprThetaArrow, + + -- ** Entities + TyThing(..), + + -- ** Syntax + module HsSyn, -- ToDo: remove extraneous bits + + -- ** Fixities + FixityDirection(..), + defaultFixity, maxPrecedence, + negateFixity, + compareFixity, + + -- ** Source locations + SrcLoc, pprDefnLoc, + + -- * Exceptions + GhcException(..), showGhcException, + + -- * Miscellaneous + sessionHscEnv, + cyclicModuleErr, + ) where + +{- + ToDo: + + * inline bits of HscMain here to simplify layering: hscTcExpr, hscStmt. + * we need to expose DynFlags, so should parseDynamicFlags really be + part of this interface? + * what StaticFlags should we expose, if any? 
+-} + +#include "HsVersions.h" + +#ifdef GHCI +import qualified Linker +import Linker ( HValue, extendLinkEnv ) +import TcRnDriver ( tcRnLookupRdrName, tcRnGetInfo, + tcRnLookupName, getModuleExports ) +import RdrName ( plusGlobalRdrEnv, Provenance(..), + ImportSpec(..), ImpDeclSpec(..), ImpItemSpec(..), + emptyGlobalRdrEnv, mkGlobalRdrEnv ) +import HscMain ( hscParseIdentifier, hscStmt, hscTcExpr, hscKcType ) +import Type ( tidyType ) +import VarEnv ( emptyTidyEnv ) +import GHC.Exts ( unsafeCoerce# ) +#endif + +import Packages ( initPackages ) +import NameSet ( NameSet, nameSetToList, elemNameSet ) +import RdrName ( GlobalRdrEnv, GlobalRdrElt(..), RdrName(..), + globalRdrEnvElts ) +import HsSyn +import Type ( Kind, Type, dropForAlls, PredType, ThetaType, + pprThetaArrow, pprParendType, splitForAllTys, + funResultTy ) +import Id ( Id, idType, isImplicitId, isDeadBinder, + isExportedId, isLocalId, isGlobalId, + isRecordSelector, recordSelectorFieldLabel, + isPrimOpId, isFCallId, isClassOpId_maybe, + isDataConWorkId, idDataCon, + isBottomingId ) +import Var ( TyVar ) +import TysPrim ( alphaTyVars ) +import TyCon ( TyCon, isClassTyCon, isSynTyCon, isNewTyCon, + isPrimTyCon, isFunTyCon, tyConArity, + tyConTyVars, tyConDataCons, synTyConDefn, synTyConRhs ) +import Class ( Class, classSCTheta, classTvsFds, classMethods ) +import FunDeps ( pprFundeps ) +import DataCon ( DataCon, dataConWrapId, dataConSig, dataConTyCon, + dataConFieldLabels, dataConStrictMarks, + dataConIsInfix, isVanillaDataCon ) +import Name ( Name, nameModule, NamedThing(..), nameParent_maybe, + nameSrcLoc, nameOccName ) +import OccName ( parenSymOcc ) +import NameEnv ( nameEnvElts ) +import InstEnv ( Instance, instanceDFunId, pprInstance, pprInstanceHdr ) +import SrcLoc +import DriverPipeline +import DriverPhases ( Phase(..), isHaskellSrcFilename, startPhase ) +import HeaderInfo ( getImports, getOptions ) +import Packages ( isHomePackage ) +import Finder +import HscMain ( newHscEnv, hscFileCheck, HscChecked(..) ) +import HscTypes +import DynFlags +import SysTools ( initSysTools, cleanTempFiles ) +import Module +import FiniteMap +import Panic +import Digraph +import Bag ( unitBag ) +import ErrUtils ( Severity(..), showPass, fatalErrorMsg, debugTraceMsg, + mkPlainErrMsg, printBagOfErrors, printErrorsAndWarnings ) +import qualified ErrUtils +import Util +import StringBuffer ( StringBuffer, hGetStringBuffer ) +import Outputable +import SysTools ( cleanTempFilesExcept ) +import BasicTypes +import TcType ( tcSplitSigmaTy, isDictTy ) +import Maybes ( expectJust, mapCatMaybes ) + +import Control.Concurrent +import System.Directory ( getModificationTime, doesFileExist ) +import Data.Maybe ( isJust, isNothing ) +import Data.List ( partition, nub ) +import qualified Data.List as List +import Control.Monad ( unless, when ) +import System.Exit ( exitWith, ExitCode(..) ) +import System.Time ( ClockTime ) +import Control.Exception as Exception hiding (handle) +import Data.IORef +import System.IO +import System.IO.Error ( isDoesNotExistError ) +import Prelude hiding (init) + +#if __GLASGOW_HASKELL__ < 600 +import System.IO as System.IO.Error ( try ) +#else +import System.IO.Error ( try ) +#endif + +-- ----------------------------------------------------------------------------- +-- Exception handlers + +-- | Install some default exception handlers and run the inner computation. +-- Unless you want to handle exceptions yourself, you should wrap this around +-- the top level of your program. 
The default handlers output the error +-- message(s) to stderr and exit cleanly. +defaultErrorHandler :: DynFlags -> IO a -> IO a +defaultErrorHandler dflags inner = + -- top-level exception handler: any unrecognised exception is a compiler bug. + handle (\exception -> do + hFlush stdout + case exception of + -- an IO exception probably isn't our fault, so don't panic + IOException _ -> + fatalErrorMsg dflags (text (show exception)) + AsyncException StackOverflow -> + fatalErrorMsg dflags (text "stack overflow: use +RTS -K<size> to increase it") + _other -> + fatalErrorMsg dflags (text (show (Panic (show exception)))) + exitWith (ExitFailure 1) + ) $ + + -- program errors: messages with locations attached. Sometimes it is + -- convenient to just throw these as exceptions. + handleDyn (\dyn -> do printBagOfErrors dflags (unitBag dyn) + exitWith (ExitFailure 1)) $ + + -- error messages propagated as exceptions + handleDyn (\dyn -> do + hFlush stdout + case dyn of + PhaseFailed _ code -> exitWith code + Interrupted -> exitWith (ExitFailure 1) + _ -> do fatalErrorMsg dflags (text (show (dyn :: GhcException))) + exitWith (ExitFailure 1) + ) $ + inner + +-- | Install a default cleanup handler to remove temporary files +-- deposited by a GHC run. This is seperate from +-- 'defaultErrorHandler', because you might want to override the error +-- handling, but still get the ordinary cleanup behaviour. +defaultCleanupHandler :: DynFlags -> IO a -> IO a +defaultCleanupHandler dflags inner = + -- make sure we clean up after ourselves + later (unless (dopt Opt_KeepTmpFiles dflags) $ + cleanTempFiles dflags) + -- exceptions will be blocked while we clean the temporary files, + -- so there shouldn't be any difficulty if we receive further + -- signals. + inner + + +-- | Initialises GHC. This must be done /once/ only. Takes the +-- TopDir path without the '-B' prefix. + +init :: Maybe String -> IO () +init mbMinusB = do + -- catch ^C + main_thread <- myThreadId + putMVar interruptTargetThread [main_thread] + installSignalHandlers + + dflags0 <- initSysTools mbMinusB defaultDynFlags + writeIORef v_initDynFlags dflags0 + +-- | Initialises GHC. This must be done /once/ only. Takes the +-- command-line arguments. All command-line arguments which aren't +-- understood by GHC will be returned. + +initFromArgs :: [String] -> IO [String] +initFromArgs args + = do init mbMinusB + return argv1 + where -- Grab the -B option if there is one + (minusB_args, argv1) = partition (prefixMatch "-B") args + mbMinusB | null minusB_args + = Nothing + | otherwise + = Just (drop 2 (last minusB_args)) + +GLOBAL_VAR(v_initDynFlags, error "initDynFlags", DynFlags) + -- stores the DynFlags between the call to init and subsequent + -- calls to newSession. + +-- | Starts a new session. A session consists of a set of loaded +-- modules, a set of options (DynFlags), and an interactive context. +-- ToDo: GhcMode should say "keep typechecked code" and\/or "keep renamed +-- code". +newSession :: GhcMode -> IO Session +newSession mode = do + dflags0 <- readIORef v_initDynFlags + dflags <- initDynFlags dflags0 + env <- newHscEnv dflags{ ghcMode=mode } + ref <- newIORef env + return (Session ref) + +-- tmp: this breaks the abstraction, but required because DriverMkDepend +-- needs to call the Finder. ToDo: untangle this. 
+sessionHscEnv :: Session -> IO HscEnv +sessionHscEnv (Session ref) = readIORef ref + +withSession :: Session -> (HscEnv -> IO a) -> IO a +withSession (Session ref) f = do h <- readIORef ref; f h + +modifySession :: Session -> (HscEnv -> HscEnv) -> IO () +modifySession (Session ref) f = do h <- readIORef ref; writeIORef ref $! f h + +-- ----------------------------------------------------------------------------- +-- Flags & settings + +-- | Grabs the DynFlags from the Session +getSessionDynFlags :: Session -> IO DynFlags +getSessionDynFlags s = withSession s (return . hsc_dflags) + +-- | Updates the DynFlags in a Session +setSessionDynFlags :: Session -> DynFlags -> IO () +setSessionDynFlags s dflags = modifySession s (\h -> h{ hsc_dflags = dflags }) + +-- | If there is no -o option, guess the name of target executable +-- by using top-level source file name as a base. +guessOutputFile :: Session -> IO () +guessOutputFile s = modifySession s $ \env -> + let dflags = hsc_dflags env + mod_graph = hsc_mod_graph env + mainModuleSrcPath, guessedName :: Maybe String + mainModuleSrcPath = do + let isMain = (== mainModIs dflags) . ms_mod + [ms] <- return (filter isMain mod_graph) + ml_hs_file (ms_location ms) + guessedName = fmap basenameOf mainModuleSrcPath + in + case outputFile dflags of + Just _ -> env + Nothing -> env { hsc_dflags = dflags { outputFile = guessedName } } + +-- ----------------------------------------------------------------------------- +-- Targets + +-- ToDo: think about relative vs. absolute file paths. And what +-- happens when the current directory changes. + +-- | Sets the targets for this session. Each target may be a module name +-- or a filename. The targets correspond to the set of root modules for +-- the program\/library. Unloading the current program is achieved by +-- setting the current set of targets to be empty, followed by load. +setTargets :: Session -> [Target] -> IO () +setTargets s targets = modifySession s (\h -> h{ hsc_targets = targets }) + +-- | returns the current set of targets +getTargets :: Session -> IO [Target] +getTargets s = withSession s (return . hsc_targets) + +-- | Add another target +addTarget :: Session -> Target -> IO () +addTarget s target + = modifySession s (\h -> h{ hsc_targets = target : hsc_targets h }) + +-- | Remove a target +removeTarget :: Session -> TargetId -> IO () +removeTarget s target_id + = modifySession s (\h -> h{ hsc_targets = filter (hsc_targets h) }) + where + filter targets = [ t | t@(Target id _) <- targets, id /= target_id ] + +-- Attempts to guess what Target a string refers to. 
This function implements +-- the --make/GHCi command-line syntax for filenames: +-- +-- - if the string looks like a Haskell source filename, then interpret +-- it as such +-- - if adding a .hs or .lhs suffix yields the name of an existing file, +-- then use that +-- - otherwise interpret the string as a module name +-- +guessTarget :: String -> Maybe Phase -> IO Target +guessTarget file (Just phase) + = return (Target (TargetFile file (Just phase)) Nothing) +guessTarget file Nothing + | isHaskellSrcFilename file + = return (Target (TargetFile file Nothing) Nothing) + | otherwise + = do exists <- doesFileExist hs_file + if exists + then return (Target (TargetFile hs_file Nothing) Nothing) + else do + exists <- doesFileExist lhs_file + if exists + then return (Target (TargetFile lhs_file Nothing) Nothing) + else do + return (Target (TargetModule (mkModule file)) Nothing) + where + hs_file = file `joinFileExt` "hs" + lhs_file = file `joinFileExt` "lhs" + +-- ----------------------------------------------------------------------------- +-- Loading the program + +-- Perform a dependency analysis starting from the current targets +-- and update the session with the new module graph. +depanal :: Session -> [Module] -> Bool -> IO (Maybe ModuleGraph) +depanal (Session ref) excluded_mods allow_dup_roots = do + hsc_env <- readIORef ref + let + dflags = hsc_dflags hsc_env + gmode = ghcMode (hsc_dflags hsc_env) + targets = hsc_targets hsc_env + old_graph = hsc_mod_graph hsc_env + + showPass dflags "Chasing dependencies" + when (gmode == BatchCompile) $ + debugTraceMsg dflags 1 (hcat [ + text "Chasing modules from: ", + hcat (punctuate comma (map pprTarget targets))]) + + r <- downsweep hsc_env old_graph excluded_mods allow_dup_roots + case r of + Just mod_graph -> writeIORef ref hsc_env{ hsc_mod_graph = mod_graph } + _ -> return () + return r + +{- +-- | The result of load. +data LoadResult + = LoadOk Errors -- ^ all specified targets were loaded successfully. + | LoadFailed Errors -- ^ not all modules were loaded. + +type Errors = [String] + +data ErrMsg = ErrMsg { + errMsgSeverity :: Severity, -- warning, error, etc. + errMsgSpans :: [SrcSpan], + errMsgShortDoc :: Doc, + errMsgExtraInfo :: Doc + } +-} + +data LoadHowMuch + = LoadAllTargets + | LoadUpTo Module + | LoadDependenciesOf Module + +-- | Try to load the program. If a Module is supplied, then just +-- attempt to load up to this target. If no Module is supplied, +-- then try to load all targets. +load :: Session -> LoadHowMuch -> IO SuccessFlag +load s@(Session ref) how_much + = do + -- Dependency analysis first. Note that this fixes the module graph: + -- even if we don't get a fully successful upsweep, the full module + -- graph is still retained in the Session. We can tell which modules + -- were successfully loaded by inspecting the Session's HPT. 
+ mb_graph <- depanal s [] False + case mb_graph of + Just mod_graph -> load2 s how_much mod_graph + Nothing -> return Failed + +load2 s@(Session ref) how_much mod_graph = do + guessOutputFile s + hsc_env <- readIORef ref + + let hpt1 = hsc_HPT hsc_env + let dflags = hsc_dflags hsc_env + let ghci_mode = ghcMode dflags -- this never changes + + -- The "bad" boot modules are the ones for which we have + -- B.hs-boot in the module graph, but no B.hs + -- The downsweep should have ensured this does not happen + -- (see msDeps) + let all_home_mods = [ms_mod s | s <- mod_graph, not (isBootSummary s)] +#ifdef DEBUG + bad_boot_mods = [s | s <- mod_graph, isBootSummary s, + not (ms_mod s `elem` all_home_mods)] +#endif + ASSERT( null bad_boot_mods ) return () + + -- mg2_with_srcimps drops the hi-boot nodes, returning a + -- graph with cycles. Among other things, it is used for + -- backing out partially complete cycles following a failed + -- upsweep, and for removing from hpt all the modules + -- not in strict downwards closure, during calls to compile. + let mg2_with_srcimps :: [SCC ModSummary] + mg2_with_srcimps = topSortModuleGraph True mod_graph Nothing + + -- check the stability property for each module. + stable_mods@(stable_obj,stable_bco) + | BatchCompile <- ghci_mode = ([],[]) + | otherwise = checkStability hpt1 mg2_with_srcimps all_home_mods + + -- prune bits of the HPT which are definitely redundant now, + -- to save space. + pruned_hpt = pruneHomePackageTable hpt1 + (flattenSCCs mg2_with_srcimps) + stable_mods + + evaluate pruned_hpt + + debugTraceMsg dflags 2 (text "Stable obj:" <+> ppr stable_obj $$ + text "Stable BCO:" <+> ppr stable_bco) + + -- Unload any modules which are going to be re-linked this time around. + let stable_linkables = [ linkable + | m <- stable_obj++stable_bco, + Just hmi <- [lookupModuleEnv pruned_hpt m], + Just linkable <- [hm_linkable hmi] ] + unload hsc_env stable_linkables + + -- We could at this point detect cycles which aren't broken by + -- a source-import, and complain immediately, but it seems better + -- to let upsweep_mods do this, so at least some useful work gets + -- done before the upsweep is abandoned. + --hPutStrLn stderr "after tsort:\n" + --hPutStrLn stderr (showSDoc (vcat (map ppr mg2))) + + -- Now do the upsweep, calling compile for each module in + -- turn. Final result is version 3 of everything. + + -- Topologically sort the module graph, this time including hi-boot + -- nodes, and possibly just including the portion of the graph + -- reachable from the module specified in the 2nd argument to load. + -- This graph should be cycle-free. + -- If we're restricting the upsweep to a portion of the graph, we + -- also want to retain everything that is still stable. + let full_mg :: [SCC ModSummary] + full_mg = topSortModuleGraph False mod_graph Nothing + + maybe_top_mod = case how_much of + LoadUpTo m -> Just m + LoadDependenciesOf m -> Just m + _ -> Nothing + + partial_mg0 :: [SCC ModSummary] + partial_mg0 = topSortModuleGraph False mod_graph maybe_top_mod + + -- LoadDependenciesOf m: we want the upsweep to stop just + -- short of the specified module (unless the specified module + -- is stable). 
+ partial_mg + | LoadDependenciesOf mod <- how_much + = ASSERT( case last partial_mg0 of + AcyclicSCC ms -> ms_mod ms == mod; _ -> False ) + List.init partial_mg0 + | otherwise + = partial_mg0 + + stable_mg = + [ AcyclicSCC ms + | AcyclicSCC ms <- full_mg, + ms_mod ms `elem` stable_obj++stable_bco, + ms_mod ms `notElem` [ ms_mod ms' | + AcyclicSCC ms' <- partial_mg ] ] + + mg = stable_mg ++ partial_mg + + -- clean up between compilations + let cleanup = cleanTempFilesExcept dflags + (ppFilesFromSummaries (flattenSCCs mg2_with_srcimps)) + + (upsweep_ok, hsc_env1, modsUpswept) + <- upsweep (hsc_env { hsc_HPT = emptyHomePackageTable }) + pruned_hpt stable_mods cleanup mg + + -- Make modsDone be the summaries for each home module now + -- available; this should equal the domain of hpt3. + -- Get in in a roughly top .. bottom order (hence reverse). + + let modsDone = reverse modsUpswept + + -- Try and do linking in some form, depending on whether the + -- upsweep was completely or only partially successful. + + if succeeded upsweep_ok + + then + -- Easy; just relink it all. + do debugTraceMsg dflags 2 (text "Upsweep completely successful.") + + -- Clean up after ourselves + cleanTempFilesExcept dflags (ppFilesFromSummaries modsDone) + + -- Issue a warning for the confusing case where the user + -- said '-o foo' but we're not going to do any linking. + -- We attempt linking if either (a) one of the modules is + -- called Main, or (b) the user said -no-hs-main, indicating + -- that main() is going to come from somewhere else. + -- + let ofile = outputFile dflags + let no_hs_main = dopt Opt_NoHsMain dflags + let + main_mod = mainModIs dflags + a_root_is_Main = any ((==main_mod).ms_mod) mod_graph + do_linking = a_root_is_Main || no_hs_main + + when (ghci_mode == BatchCompile && isJust ofile && not do_linking) $ + debugTraceMsg dflags 1 (text ("Warning: output was redirected with -o, " ++ + "but no output will be generated\n" ++ + "because there is no " ++ moduleString main_mod ++ " module.")) + + -- link everything together + linkresult <- link ghci_mode dflags do_linking (hsc_HPT hsc_env1) + + loadFinish Succeeded linkresult ref hsc_env1 + + else + -- Tricky. We need to back out the effects of compiling any + -- half-done cycles, both so as to clean up the top level envs + -- and to avoid telling the interactive linker to link them. + do debugTraceMsg dflags 2 (text "Upsweep partially successful.") + + let modsDone_names + = map ms_mod modsDone + let mods_to_zap_names + = findPartiallyCompletedCycles modsDone_names + mg2_with_srcimps + let mods_to_keep + = filter ((`notElem` mods_to_zap_names).ms_mod) + modsDone + + let hpt4 = retainInTopLevelEnvs (map ms_mod mods_to_keep) + (hsc_HPT hsc_env1) + + -- Clean up after ourselves + cleanTempFilesExcept dflags (ppFilesFromSummaries mods_to_keep) + + -- there should be no Nothings where linkables should be, now + ASSERT(all (isJust.hm_linkable) + (moduleEnvElts (hsc_HPT hsc_env))) do + + -- Link everything together + linkresult <- link ghci_mode dflags False hpt4 + + let hsc_env4 = hsc_env1{ hsc_HPT = hpt4 } + loadFinish Failed linkresult ref hsc_env4 + +-- Finish up after a load. + +-- If the link failed, unload everything and return. +loadFinish all_ok Failed ref hsc_env + = do unload hsc_env [] + writeIORef ref $! discardProg hsc_env + return Failed + +-- Empty the interactive context and set the module context to the topmost +-- newly loaded module, or the Prelude if none were loaded. +loadFinish all_ok Succeeded ref hsc_env + = do writeIORef ref $! 
hsc_env{ hsc_IC = emptyInteractiveContext } + return all_ok + + +-- Forget the current program, but retain the persistent info in HscEnv +discardProg :: HscEnv -> HscEnv +discardProg hsc_env + = hsc_env { hsc_mod_graph = emptyMG, + hsc_IC = emptyInteractiveContext, + hsc_HPT = emptyHomePackageTable } + +-- used to fish out the preprocess output files for the purposes of +-- cleaning up. The preprocessed file *might* be the same as the +-- source file, but that doesn't do any harm. +ppFilesFromSummaries summaries = map ms_hspp_file summaries + +-- ----------------------------------------------------------------------------- +-- Check module + +data CheckedModule = + CheckedModule { parsedSource :: ParsedSource, + renamedSource :: Maybe RenamedSource, + typecheckedSource :: Maybe TypecheckedSource, + checkedModuleInfo :: Maybe ModuleInfo + } + -- ToDo: improvements that could be made here: + -- if the module succeeded renaming but not typechecking, + -- we can still get back the GlobalRdrEnv and exports, so + -- perhaps the ModuleInfo should be split up into separate + -- fields within CheckedModule. + +type ParsedSource = Located (HsModule RdrName) +type RenamedSource = (HsGroup Name, [LImportDecl Name], Maybe [LIE Name]) +type TypecheckedSource = LHsBinds Id + +-- NOTE: +-- - things that aren't in the output of the typechecker right now: +-- - the export list +-- - the imports +-- - type signatures +-- - type/data/newtype declarations +-- - class declarations +-- - instances +-- - extra things in the typechecker's output: +-- - default methods are turned into top-level decls. +-- - dictionary bindings + + +-- | This is the way to get access to parsed and typechecked source code +-- for a module. 'checkModule' loads all the dependencies of the specified +-- module in the Session, and then attempts to typecheck the module. If +-- successful, it returns the abstract syntax for the module. 
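
Before the definition of 'checkModule' that follows, here is a hedged sketch of how a client (an IDE, say) might consume a CheckedModule. The Session is assumed to have its targets set already, and the import locations reflect where these names appear to live in this tree rather than any documented interface.

import qualified GHC
import GHC    ( CheckedModule(..) )
import Module ( mkModule )

reportModule :: GHC.Session -> String -> IO ()
reportModule ses name = do
  mb <- GHC.checkModule ses (mkModule name)
  case mb of
    Nothing -> putStrLn (name ++ ": dependency load or parse failed")
    Just cm ->
      case typecheckedSource cm of
        Just _  -> putStrLn (name ++ ": typechecked")
        Nothing -> putStrLn (name ++ ": parsed, but no typechecked bindings")
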
+checkModule :: Session -> Module -> IO (Maybe CheckedModule)
+checkModule session@(Session ref) mod = do
+   -- load up the dependencies first
+   r <- load session (LoadDependenciesOf mod)
+   if (failed r) then return Nothing else do
+
+   -- now parse & typecheck the module
+   hsc_env <- readIORef ref
+   let mg  = hsc_mod_graph hsc_env
+   case [ ms | ms <- mg, ms_mod ms == mod ] of
+	[] -> return Nothing
+	(ms:_) -> do
+	   mbChecked <- hscFileCheck hsc_env{hsc_dflags=ms_hspp_opts ms} ms
+	   case mbChecked of
+             Nothing -> return Nothing
+             Just (HscChecked parsed renamed Nothing) ->
+		   return (Just (CheckedModule {
+					parsedSource = parsed,
+					renamedSource = renamed,
+					typecheckedSource = Nothing,
+					checkedModuleInfo = Nothing }))
+             Just (HscChecked parsed renamed
+			   (Just (tc_binds, rdr_env, details))) -> do
+		   let minf = ModuleInfo {
+				minf_type_env  = md_types details,
+				minf_exports   = md_exports details,
+				minf_rdr_env   = Just rdr_env,
+				minf_instances = md_insts details
+			      }
+		   return (Just (CheckedModule {
+					parsedSource = parsed,
+					renamedSource = renamed,
+					typecheckedSource = Just tc_binds,
+					checkedModuleInfo = Just minf }))
+
+-- ---------------------------------------------------------------------------
+-- Unloading
+
+unload :: HscEnv -> [Linkable] -> IO ()
+unload hsc_env stable_linkables	-- Unload everything *except* 'stable_linkables'
+  = case ghcMode (hsc_dflags hsc_env) of
+	BatchCompile  -> return ()
+	JustTypecheck -> return ()
+#ifdef GHCI
+	Interactive -> Linker.unload (hsc_dflags hsc_env) stable_linkables
+#else
+	Interactive -> panic "unload: no interpreter"
+#endif
+	other -> panic "unload: strange mode"
+
+-- -----------------------------------------------------------------------------
+-- checkStability
+
+{-
+  Stability tells us which modules definitely do not need to be recompiled.
+  There are two main reasons for having stability:
+
+   - avoid doing a complete upsweep of the module graph in GHCi when
+     modules near the bottom of the tree have not changed.
+
+   - to tell GHCi when it can load object code: we can only load object code
+     for a module when we also load object code for all of the imports of the
+     module.  So we need to know that we will definitely not be recompiling
+     any of these modules, and we can use the object code.
+
+  NB. stability is of no importance to BatchCompile at all, only Interactive.
+  (ToDo: what about JustTypecheck?)
+
+  The stability check is as follows.  Both stableObject and
+  stableBCO are used during the upsweep phase later.
+
+  -------------------
+  stable m = stableObject m || stableBCO m
+
+  stableObject m =
+	all stableObject (imports m)
+	&& old linkable does not exist, or is == on-disk .o
+	&& date(on-disk .o) > date(.hs)
+
+  stableBCO m =
+	all stable (imports m)
+	&& date(BCO) > date(.hs)
+  -------------------
+
+  These properties embody the following ideas:
+
+    - if a module is stable:
+	- if it has been compiled in a previous pass (present in HPT)
+	  then it does not need to be compiled or re-linked.
+        - if it has not been compiled in a previous pass,
+	  then we only need to read its .hi file from disk and
+	  link it to produce a ModDetails.
+
+    - if a module is not stable, we will definitely be at least
+      re-linking, and possibly re-compiling it during the upsweep.
+      All non-stable modules can (and should) therefore be unlinked
+      before the upsweep.
+
+    - Note that objects are only considered stable if they only depend
+      on other objects.  We can't link object code against byte code.
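
To make the informal specification above concrete, here is a small self-contained model of the stableObject clause. It is a sketch only: the names are invented, the stableBCO clause and the "old linkable" check are omitted, and an acyclic import graph is assumed (the real code works SCC by SCC).

data Mod = Mod { modName :: String, modImports :: [String]
               , srcDate :: Integer, objDate :: Maybe Integer }

stableObject :: [Mod] -> Mod -> Bool
stableObject world m =
     all (stableObject world) (homeImports world m)   -- all imports stable
  && maybe False (>= srcDate m) (objDate m)           -- .o at least as new as .hs

homeImports :: [Mod] -> Mod -> [Mod]
homeImports world m = [ d | d <- world, modName d `elem` modImports m ]

-- A and B have up-to-date objects; C has been edited since C.o was built,
-- so only C needs recompiling/relinking in the upsweep.
stabilityDemo :: [Bool]                               -- [True, True, False]
stabilityDemo = map (stableObject world) world
  where world = [ Mod "A" []    10 (Just 20)
                , Mod "B" ["A"] 10 (Just 20)
                , Mod "C" ["B"] 30 (Just 20) ]
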
+-} + +checkStability + :: HomePackageTable -- HPT from last compilation + -> [SCC ModSummary] -- current module graph (cyclic) + -> [Module] -- all home modules + -> ([Module], -- stableObject + [Module]) -- stableBCO + +checkStability hpt sccs all_home_mods = foldl checkSCC ([],[]) sccs + where + checkSCC (stable_obj, stable_bco) scc0 + | stableObjects = (scc_mods ++ stable_obj, stable_bco) + | stableBCOs = (stable_obj, scc_mods ++ stable_bco) + | otherwise = (stable_obj, stable_bco) + where + scc = flattenSCC scc0 + scc_mods = map ms_mod scc + home_module m = m `elem` all_home_mods && m `notElem` scc_mods + + scc_allimps = nub (filter home_module (concatMap ms_allimps scc)) + -- all imports outside the current SCC, but in the home pkg + + stable_obj_imps = map (`elem` stable_obj) scc_allimps + stable_bco_imps = map (`elem` stable_bco) scc_allimps + + stableObjects = + and stable_obj_imps + && all object_ok scc + + stableBCOs = + and (zipWith (||) stable_obj_imps stable_bco_imps) + && all bco_ok scc + + object_ok ms + | Just t <- ms_obj_date ms = t >= ms_hs_date ms + && same_as_prev t + | otherwise = False + where + same_as_prev t = case lookupModuleEnv hpt (ms_mod ms) of + Just hmi | Just l <- hm_linkable hmi + -> isObjectLinkable l && t == linkableTime l + _other -> True + -- why '>=' rather than '>' above? If the filesystem stores + -- times to the nearset second, we may occasionally find that + -- the object & source have the same modification time, + -- especially if the source was automatically generated + -- and compiled. Using >= is slightly unsafe, but it matches + -- make's behaviour. + + bco_ok ms + = case lookupModuleEnv hpt (ms_mod ms) of + Just hmi | Just l <- hm_linkable hmi -> + not (isObjectLinkable l) && + linkableTime l >= ms_hs_date ms + _other -> False + +ms_allimps :: ModSummary -> [Module] +ms_allimps ms = map unLoc (ms_srcimps ms ++ ms_imps ms) + +-- ----------------------------------------------------------------------------- +-- Prune the HomePackageTable + +-- Before doing an upsweep, we can throw away: +-- +-- - For non-stable modules: +-- - all ModDetails, all linked code +-- - all unlinked code that is out of date with respect to +-- the source file +-- +-- This is VERY IMPORTANT otherwise we'll end up requiring 2x the +-- space at the end of the upsweep, because the topmost ModDetails of the +-- old HPT holds on to the entire type environment from the previous +-- compilation. + +pruneHomePackageTable + :: HomePackageTable + -> [ModSummary] + -> ([Module],[Module]) + -> HomePackageTable + +pruneHomePackageTable hpt summ (stable_obj, stable_bco) + = mapModuleEnv prune hpt + where prune hmi + | is_stable modl = hmi' + | otherwise = hmi'{ hm_details = emptyModDetails } + where + modl = mi_module (hm_iface hmi) + hmi' | Just l <- hm_linkable hmi, linkableTime l < ms_hs_date ms + = hmi{ hm_linkable = Nothing } + | otherwise + = hmi + where ms = expectJust "prune" (lookupModuleEnv ms_map modl) + + ms_map = mkModuleEnv [(ms_mod ms, ms) | ms <- summ] + + is_stable m = m `elem` stable_obj || m `elem` stable_bco + +-- ----------------------------------------------------------------------------- + +-- Return (names of) all those in modsDone who are part of a cycle +-- as defined by theGraph. +findPartiallyCompletedCycles :: [Module] -> [SCC ModSummary] -> [Module] +findPartiallyCompletedCycles modsDone theGraph + = chew theGraph + where + chew [] = [] + chew ((AcyclicSCC v):rest) = chew rest -- acyclic? not interesting. 
+ chew ((CyclicSCC vs):rest) + = let names_in_this_cycle = nub (map ms_mod vs) + mods_in_this_cycle + = nub ([done | done <- modsDone, + done `elem` names_in_this_cycle]) + chewed_rest = chew rest + in + if notNull mods_in_this_cycle + && length mods_in_this_cycle < length names_in_this_cycle + then mods_in_this_cycle ++ chewed_rest + else chewed_rest + +-- ----------------------------------------------------------------------------- +-- The upsweep + +-- This is where we compile each module in the module graph, in a pass +-- from the bottom to the top of the graph. + +-- There better had not be any cyclic groups here -- we check for them. + +upsweep + :: HscEnv -- Includes initially-empty HPT + -> HomePackageTable -- HPT from last time round (pruned) + -> ([Module],[Module]) -- stable modules (see checkStability) + -> IO () -- How to clean up unwanted tmp files + -> [SCC ModSummary] -- Mods to do (the worklist) + -> IO (SuccessFlag, + HscEnv, -- With an updated HPT + [ModSummary]) -- Mods which succeeded + +upsweep hsc_env old_hpt stable_mods cleanup mods + = upsweep' hsc_env old_hpt stable_mods cleanup mods 1 (length mods) + +upsweep' hsc_env old_hpt stable_mods cleanup + [] _ _ + = return (Succeeded, hsc_env, []) + +upsweep' hsc_env old_hpt stable_mods cleanup + (CyclicSCC ms:_) _ _ + = do fatalErrorMsg (hsc_dflags hsc_env) (cyclicModuleErr ms) + return (Failed, hsc_env, []) + +upsweep' hsc_env old_hpt stable_mods cleanup + (AcyclicSCC mod:mods) mod_index nmods + = do -- putStrLn ("UPSWEEP_MOD: hpt = " ++ + -- show (map (moduleUserString.moduleName.mi_module.hm_iface) + -- (moduleEnvElts (hsc_HPT hsc_env))) + + mb_mod_info <- upsweep_mod hsc_env old_hpt stable_mods mod + mod_index nmods + + cleanup -- Remove unwanted tmp files between compilations + + case mb_mod_info of + Nothing -> return (Failed, hsc_env, []) + Just mod_info -> do + { let this_mod = ms_mod mod + + -- Add new info to hsc_env + hpt1 = extendModuleEnv (hsc_HPT hsc_env) + this_mod mod_info + hsc_env1 = hsc_env { hsc_HPT = hpt1 } + + -- Space-saving: delete the old HPT entry + -- for mod BUT if mod is a hs-boot + -- node, don't delete it. For the + -- interface, the HPT entry is probaby for the + -- main Haskell source file. Deleting it + -- would force .. (what?? --SDM) + old_hpt1 | isBootSummary mod = old_hpt + | otherwise = delModuleEnv old_hpt this_mod + + ; (restOK, hsc_env2, modOKs) + <- upsweep' hsc_env1 old_hpt1 stable_mods cleanup + mods (mod_index+1) nmods + ; return (restOK, hsc_env2, mod:modOKs) + } + + +-- Compile a single module. Always produce a Linkable for it if +-- successful. If no compilation happened, return the old Linkable. +upsweep_mod :: HscEnv + -> HomePackageTable + -> ([Module],[Module]) + -> ModSummary + -> Int -- index of module + -> Int -- total number of modules + -> IO (Maybe HomeModInfo) -- Nothing => Failed + +upsweep_mod hsc_env old_hpt (stable_obj, stable_bco) summary mod_index nmods + = do + let + this_mod = ms_mod summary + mb_obj_date = ms_obj_date summary + obj_fn = ml_obj_file (ms_location summary) + hs_date = ms_hs_date summary + + compile_it :: Maybe Linkable -> IO (Maybe HomeModInfo) + compile_it = upsweep_compile hsc_env old_hpt this_mod + summary mod_index nmods + + case ghcMode (hsc_dflags hsc_env) of + BatchCompile -> + case () of + -- Batch-compilating is easy: just check whether we have + -- an up-to-date object file. If we do, then the compiler + -- needs to do a recompilation check. 
+ _ | Just obj_date <- mb_obj_date, obj_date >= hs_date -> do + linkable <- + findObjectLinkable this_mod obj_fn obj_date + compile_it (Just linkable) + + | otherwise -> + compile_it Nothing + + interactive -> + case () of + _ | is_stable_obj, isJust old_hmi -> + return old_hmi + -- object is stable, and we have an entry in the + -- old HPT: nothing to do + + | is_stable_obj, isNothing old_hmi -> do + linkable <- + findObjectLinkable this_mod obj_fn + (expectJust "upseep1" mb_obj_date) + compile_it (Just linkable) + -- object is stable, but we need to load the interface + -- off disk to make a HMI. + + | is_stable_bco -> + ASSERT(isJust old_hmi) -- must be in the old_hpt + return old_hmi + -- BCO is stable: nothing to do + + | Just hmi <- old_hmi, + Just l <- hm_linkable hmi, not (isObjectLinkable l), + linkableTime l >= ms_hs_date summary -> + compile_it (Just l) + -- we have an old BCO that is up to date with respect + -- to the source: do a recompilation check as normal. + + | otherwise -> + compile_it Nothing + -- no existing code at all: we must recompile. + where + is_stable_obj = this_mod `elem` stable_obj + is_stable_bco = this_mod `elem` stable_bco + + old_hmi = lookupModuleEnv old_hpt this_mod + +-- Run hsc to compile a module +upsweep_compile hsc_env old_hpt this_mod summary + mod_index nmods + mb_old_linkable = do + let + -- The old interface is ok if it's in the old HPT + -- a) we're compiling a source file, and the old HPT + -- entry is for a source file + -- b) we're compiling a hs-boot file + -- Case (b) allows an hs-boot file to get the interface of its + -- real source file on the second iteration of the compilation + -- manager, but that does no harm. Otherwise the hs-boot file + -- will always be recompiled + + mb_old_iface + = case lookupModuleEnv old_hpt this_mod of + Nothing -> Nothing + Just hm_info | isBootSummary summary -> Just iface + | not (mi_boot iface) -> Just iface + | otherwise -> Nothing + where + iface = hm_iface hm_info + + compresult <- compile hsc_env summary mb_old_linkable mb_old_iface + mod_index nmods + + case compresult of + -- Compilation failed. Compile may still have updated the PCS, tho. + CompErrs -> return Nothing + + -- Compilation "succeeded", and may or may not have returned a new + -- linkable (depending on whether compilation was actually performed + -- or not). + CompOK new_details new_iface new_linkable + -> do let new_info = HomeModInfo { hm_iface = new_iface, + hm_details = new_details, + hm_linkable = new_linkable } + return (Just new_info) + + +-- Filter modules in the HPT +retainInTopLevelEnvs :: [Module] -> HomePackageTable -> HomePackageTable +retainInTopLevelEnvs keep_these hpt + = mkModuleEnv [ (mod, expectJust "retain" mb_mod_info) + | mod <- keep_these + , let mb_mod_info = lookupModuleEnv hpt mod + , isJust mb_mod_info ] + +-- --------------------------------------------------------------------------- +-- Topological sort of the module graph + +topSortModuleGraph + :: Bool -- Drop hi-boot nodes? (see below) + -> [ModSummary] + -> Maybe Module + -> [SCC ModSummary] +-- Calculate SCCs of the module graph, possibly dropping the hi-boot nodes +-- The resulting list of strongly-connected-components is in topologically +-- sorted order, starting with the module(s) at the bottom of the +-- dependency graph (ie compile them first) and ending with the ones at +-- the top. +-- +-- Drop hi-boot nodes (first boolean arg)? 
+-- +-- False: treat the hi-boot summaries as nodes of the graph, +-- so the graph must be acyclic +-- +-- True: eliminate the hi-boot nodes, and instead pretend +-- the a source-import of Foo is an import of Foo +-- The resulting graph has no hi-boot nodes, but can by cyclic + +topSortModuleGraph drop_hs_boot_nodes summaries Nothing + = stronglyConnComp (fst (moduleGraphNodes drop_hs_boot_nodes summaries)) +topSortModuleGraph drop_hs_boot_nodes summaries (Just mod) + = stronglyConnComp (map vertex_fn (reachable graph root)) + where + -- restrict the graph to just those modules reachable from + -- the specified module. We do this by building a graph with + -- the full set of nodes, and determining the reachable set from + -- the specified node. + (nodes, lookup_key) = moduleGraphNodes drop_hs_boot_nodes summaries + (graph, vertex_fn, key_fn) = graphFromEdges' nodes + root + | Just key <- lookup_key HsSrcFile mod, Just v <- key_fn key = v + | otherwise = throwDyn (ProgramError "module does not exist") + +moduleGraphNodes :: Bool -> [ModSummary] + -> ([(ModSummary, Int, [Int])], HscSource -> Module -> Maybe Int) +moduleGraphNodes drop_hs_boot_nodes summaries = (nodes, lookup_key) + where + -- Drop hs-boot nodes by using HsSrcFile as the key + hs_boot_key | drop_hs_boot_nodes = HsSrcFile + | otherwise = HsBootFile + + -- We use integers as the keys for the SCC algorithm + nodes :: [(ModSummary, Int, [Int])] + nodes = [(s, expectJust "topSort" (lookup_key (ms_hsc_src s) (ms_mod s)), + out_edge_keys hs_boot_key (map unLoc (ms_srcimps s)) ++ + out_edge_keys HsSrcFile (map unLoc (ms_imps s)) ) + | s <- summaries + , not (isBootSummary s && drop_hs_boot_nodes) ] + -- Drop the hi-boot ones if told to do so + + key_map :: NodeMap Int + key_map = listToFM ([(ms_mod s, ms_hsc_src s) | s <- summaries] + `zip` [1..]) + + lookup_key :: HscSource -> Module -> Maybe Int + lookup_key hs_src mod = lookupFM key_map (mod, hs_src) + + out_edge_keys :: HscSource -> [Module] -> [Int] + out_edge_keys hi_boot ms = mapCatMaybes (lookup_key hi_boot) ms + -- If we want keep_hi_boot_nodes, then we do lookup_key with + -- the IsBootInterface parameter True; else False + + +type NodeKey = (Module, HscSource) -- The nodes of the graph are +type NodeMap a = FiniteMap NodeKey a -- keyed by (mod, src_file_type) pairs + +msKey :: ModSummary -> NodeKey +msKey (ModSummary { ms_mod = mod, ms_hsc_src = boot }) = (mod,boot) + +mkNodeMap :: [ModSummary] -> NodeMap ModSummary +mkNodeMap summaries = listToFM [ (msKey s, s) | s <- summaries] + +nodeMapElts :: NodeMap a -> [a] +nodeMapElts = eltsFM + +----------------------------------------------------------------------------- +-- Downsweep (dependency analysis) + +-- Chase downwards from the specified root set, returning summaries +-- for all home modules encountered. Only follow source-import +-- links. + +-- We pass in the previous collection of summaries, which is used as a +-- cache to avoid recalculating a module summary if the source is +-- unchanged. +-- +-- The returned list of [ModSummary] nodes has one node for each home-package +-- module, plus one for any hs-boot files. The imports of these nodes +-- are all there, including the imports of non-home-package modules. 
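
The (payload, key, out-edge keys) encoding used by moduleGraphNodes above is the standard interface to strongly-connected-component computation. A tiny standalone illustration, using Data.Graph.stronglyConnComp (which has the same shape as the Digraph module this file imports), precedes the downsweep code that follows:

import Data.Graph ( SCC(..), stronglyConnComp )

-- A and C import each other (roughly what a {-# SOURCE #-} loop looks like
-- once hs-boot nodes are dropped); B imports A.
modGraphDemo :: [SCC String]
modGraphDemo = stronglyConnComp
  [ ("A", 1, [3])      -- A imports C
  , ("B", 2, [1])      -- B imports A
  , ("C", 3, [1]) ]    -- C imports A

-- Dependencies come out first, so the {A,C} cycle precedes B:
--   [CyclicSCC ["C","A"], AcyclicSCC "B"]
-- (the order of modules inside the cyclic component is not significant)
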
+ +downsweep :: HscEnv + -> [ModSummary] -- Old summaries + -> [Module] -- Ignore dependencies on these; treat + -- them as if they were package modules + -> Bool -- True <=> allow multiple targets to have + -- the same module name; this is + -- very useful for ghc -M + -> IO (Maybe [ModSummary]) + -- The elts of [ModSummary] all have distinct + -- (Modules, IsBoot) identifiers, unless the Bool is true + -- in which case there can be repeats +downsweep hsc_env old_summaries excl_mods allow_dup_roots + = -- catch error messages and return them + handleDyn (\err_msg -> printBagOfErrors (hsc_dflags hsc_env) (unitBag err_msg) >> return Nothing) $ do + rootSummaries <- mapM getRootSummary roots + let root_map = mkRootMap rootSummaries + checkDuplicates root_map + summs <- loop (concatMap msDeps rootSummaries) root_map + return (Just summs) + where + roots = hsc_targets hsc_env + + old_summary_map :: NodeMap ModSummary + old_summary_map = mkNodeMap old_summaries + + getRootSummary :: Target -> IO ModSummary + getRootSummary (Target (TargetFile file mb_phase) maybe_buf) + = do exists <- doesFileExist file + if exists + then summariseFile hsc_env old_summaries file mb_phase maybe_buf + else throwDyn $ mkPlainErrMsg noSrcSpan $ + text "can't find file:" <+> text file + getRootSummary (Target (TargetModule modl) maybe_buf) + = do maybe_summary <- summariseModule hsc_env old_summary_map False + (L rootLoc modl) maybe_buf excl_mods + case maybe_summary of + Nothing -> packageModErr modl + Just s -> return s + + rootLoc = mkGeneralSrcSpan FSLIT("<command line>") + + -- In a root module, the filename is allowed to diverge from the module + -- name, so we have to check that there aren't multiple root files + -- defining the same module (otherwise the duplicates will be silently + -- ignored, leading to confusing behaviour). + checkDuplicates :: NodeMap [ModSummary] -> IO () + checkDuplicates root_map + | allow_dup_roots = return () + | null dup_roots = return () + | otherwise = multiRootsErr (head dup_roots) + where + dup_roots :: [[ModSummary]] -- Each at least of length 2 + dup_roots = filterOut isSingleton (nodeMapElts root_map) + + loop :: [(Located Module,IsBootInterface)] + -- Work list: process these modules + -> NodeMap [ModSummary] + -- Visited set; the range is a list because + -- the roots can have the same module names + -- if allow_dup_roots is True + -> IO [ModSummary] + -- The result includes the worklist, except + -- for those mentioned in the visited set + loop [] done = return (concat (nodeMapElts done)) + loop ((wanted_mod, is_boot) : ss) done + | Just summs <- lookupFM done key + = if isSingleton summs then + loop ss done + else + do { multiRootsErr summs; return [] } + | otherwise = do { mb_s <- summariseModule hsc_env old_summary_map + is_boot wanted_mod Nothing excl_mods + ; case mb_s of + Nothing -> loop ss done + Just s -> loop (msDeps s ++ ss) + (addToFM done key [s]) } + where + key = (unLoc wanted_mod, if is_boot then HsBootFile else HsSrcFile) + +mkRootMap :: [ModSummary] -> NodeMap [ModSummary] +mkRootMap summaries = addListToFM_C (++) emptyFM + [ (msKey s, [s]) | s <- summaries ] + +msDeps :: ModSummary -> [(Located Module, IsBootInterface)] +-- (msDeps s) returns the dependencies of the ModSummary s. +-- A wrinkle is that for a {-# SOURCE #-} import we return +-- *both* the hs-boot file +-- *and* the source file +-- as "dependencies". That ensures that the list of all relevant +-- modules always contains B.hs if it contains B.hs-boot. 
+-- Remember, this pass isn't doing the topological sort. It's +-- just gathering the list of all relevant ModSummaries +msDeps s = + concat [ [(m,True), (m,False)] | m <- ms_srcimps s ] + ++ [ (m,False) | m <- ms_imps s ] + +----------------------------------------------------------------------------- +-- Summarising modules + +-- We have two types of summarisation: +-- +-- * Summarise a file. This is used for the root module(s) passed to +-- cmLoadModules. The file is read, and used to determine the root +-- module name. The module name may differ from the filename. +-- +-- * Summarise a module. We are given a module name, and must provide +-- a summary. The finder is used to locate the file in which the module +-- resides. + +summariseFile + :: HscEnv + -> [ModSummary] -- old summaries + -> FilePath -- source file name + -> Maybe Phase -- start phase + -> Maybe (StringBuffer,ClockTime) + -> IO ModSummary + +summariseFile hsc_env old_summaries file mb_phase maybe_buf + -- we can use a cached summary if one is available and the + -- source file hasn't changed, But we have to look up the summary + -- by source file, rather than module name as we do in summarise. + | Just old_summary <- findSummaryBySourceFile old_summaries file + = do + let location = ms_location old_summary + + -- return the cached summary if the source didn't change + src_timestamp <- case maybe_buf of + Just (_,t) -> return t + Nothing -> getModificationTime file + -- The file exists; we checked in getRootSummary above. + -- If it gets removed subsequently, then this + -- getModificationTime may fail, but that's the right + -- behaviour. + + if ms_hs_date old_summary == src_timestamp + then do -- update the object-file timestamp + obj_timestamp <- getObjTimestamp location False + return old_summary{ ms_obj_date = obj_timestamp } + else + new_summary + + | otherwise + = new_summary + where + new_summary = do + let dflags = hsc_dflags hsc_env + + (dflags', hspp_fn, buf) + <- preprocessFile dflags file mb_phase maybe_buf + + (srcimps,the_imps, L _ mod) <- getImports dflags' buf hspp_fn + + -- Make a ModLocation for this file + location <- mkHomeModLocation dflags mod file + + -- Tell the Finder cache where it is, so that subsequent calls + -- to findModule will find it, even if it's not on any search path + addHomeModuleToFinder hsc_env mod location + + src_timestamp <- case maybe_buf of + Just (_,t) -> return t + Nothing -> getModificationTime file + -- getMofificationTime may fail + + obj_timestamp <- modificationTimeIfExists (ml_obj_file location) + + return (ModSummary { ms_mod = mod, ms_hsc_src = HsSrcFile, + ms_location = location, + ms_hspp_file = hspp_fn, + ms_hspp_opts = dflags', + ms_hspp_buf = Just buf, + ms_srcimps = srcimps, ms_imps = the_imps, + ms_hs_date = src_timestamp, + ms_obj_date = obj_timestamp }) + +findSummaryBySourceFile :: [ModSummary] -> FilePath -> Maybe ModSummary +findSummaryBySourceFile summaries file + = case [ ms | ms <- summaries, HsSrcFile <- [ms_hsc_src ms], + expectJust "findSummaryBySourceFile" (ml_hs_file (ms_location ms)) == file ] of + [] -> Nothing + (x:xs) -> Just x + +-- Summarise a module, and pick up source and timestamp. 
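
'summariseModule', which follows, is the other half of the summarisation story. From the outside, the summaries both functions produce are what 'getModuleGraph' (defined later in this file) hands back after a load. A hedged sketch of walking them, assuming the ModSummary accessors live in HscTypes and the pretty-printing helpers in Outputable, as this file's own imports suggest:

import qualified GHC
import HscTypes   ( ModSummary, ms_mod, ms_imps, ms_srcimps )
import SrcLoc     ( unLoc )
import Outputable ( showSDoc, ppr )

printDeps :: GHC.Session -> IO ()
printDeps ses = do
  graph <- GHC.getModuleGraph ses
  sequence_
    [ putStrLn (showSDoc (ppr (ms_mod ms)) ++ " imports " ++
                showSDoc (ppr (map unLoc (ms_srcimps ms ++ ms_imps ms))))
    | ms <- graph ]
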
+summariseModule + :: HscEnv + -> NodeMap ModSummary -- Map of old summaries + -> IsBootInterface -- True <=> a {-# SOURCE #-} import + -> Located Module -- Imported module to be summarised + -> Maybe (StringBuffer, ClockTime) + -> [Module] -- Modules to exclude + -> IO (Maybe ModSummary) -- Its new summary + +summariseModule hsc_env old_summary_map is_boot (L loc wanted_mod) maybe_buf excl_mods + | wanted_mod `elem` excl_mods + = return Nothing + + | Just old_summary <- lookupFM old_summary_map (wanted_mod, hsc_src) + = do -- Find its new timestamp; all the + -- ModSummaries in the old map have valid ml_hs_files + let location = ms_location old_summary + src_fn = expectJust "summariseModule" (ml_hs_file location) + + -- check the modification time on the source file, and + -- return the cached summary if it hasn't changed. If the + -- file has disappeared, we need to call the Finder again. + case maybe_buf of + Just (_,t) -> check_timestamp old_summary location src_fn t + Nothing -> do + m <- System.IO.Error.try (getModificationTime src_fn) + case m of + Right t -> check_timestamp old_summary location src_fn t + Left e | isDoesNotExistError e -> find_it + | otherwise -> ioError e + + | otherwise = find_it + where + dflags = hsc_dflags hsc_env + + hsc_src = if is_boot then HsBootFile else HsSrcFile + + check_timestamp old_summary location src_fn src_timestamp + | ms_hs_date old_summary == src_timestamp = do + -- update the object-file timestamp + obj_timestamp <- getObjTimestamp location is_boot + return (Just old_summary{ ms_obj_date = obj_timestamp }) + | otherwise = + -- source changed: find and re-summarise. We call the finder + -- again, because the user may have moved the source file. + new_summary location src_fn src_timestamp + + find_it = do + -- Don't use the Finder's cache this time. If the module was + -- previously a package module, it may have now appeared on the + -- search path, so we want to consider it to be a home module. If + -- the module was previously a home module, it may have moved. 
+ uncacheModule hsc_env wanted_mod + found <- findModule hsc_env wanted_mod True {-explicit-} + case found of + Found location pkg + | not (isHomePackage pkg) -> return Nothing + -- Drop external-pkg + | isJust (ml_hs_file location) -> just_found location + -- Home package + err -> noModError dflags loc wanted_mod err + -- Not found + + just_found location = do + -- Adjust location to point to the hs-boot source file, + -- hi file, object file, when is_boot says so + let location' | is_boot = addBootSuffixLocn location + | otherwise = location + src_fn = expectJust "summarise2" (ml_hs_file location') + + -- Check that it exists + -- It might have been deleted since the Finder last found it + maybe_t <- modificationTimeIfExists src_fn + case maybe_t of + Nothing -> noHsFileErr loc src_fn + Just t -> new_summary location' src_fn t + + + new_summary location src_fn src_timestamp + = do + -- Preprocess the source file and get its imports + -- The dflags' contains the OPTIONS pragmas + (dflags', hspp_fn, buf) <- preprocessFile dflags src_fn Nothing maybe_buf + (srcimps, the_imps, L mod_loc mod_name) <- getImports dflags' buf hspp_fn + + when (mod_name /= wanted_mod) $ + throwDyn $ mkPlainErrMsg mod_loc $ + text "file name does not match module name" + <+> quotes (ppr mod_name) + + -- Find the object timestamp, and return the summary + obj_timestamp <- getObjTimestamp location is_boot + + return (Just ( ModSummary { ms_mod = wanted_mod, + ms_hsc_src = hsc_src, + ms_location = location, + ms_hspp_file = hspp_fn, + ms_hspp_opts = dflags', + ms_hspp_buf = Just buf, + ms_srcimps = srcimps, + ms_imps = the_imps, + ms_hs_date = src_timestamp, + ms_obj_date = obj_timestamp })) + + +getObjTimestamp location is_boot + = if is_boot then return Nothing + else modificationTimeIfExists (ml_obj_file location) + + +preprocessFile :: DynFlags -> FilePath -> Maybe Phase -> Maybe (StringBuffer,ClockTime) + -> IO (DynFlags, FilePath, StringBuffer) +preprocessFile dflags src_fn mb_phase Nothing + = do + (dflags', hspp_fn) <- preprocess dflags (src_fn, mb_phase) + buf <- hGetStringBuffer hspp_fn + return (dflags', hspp_fn, buf) + +preprocessFile dflags src_fn mb_phase (Just (buf, time)) + = do + -- case we bypass the preprocessing stage? 
+ let + local_opts = getOptions buf src_fn + -- + (dflags', errs) <- parseDynamicFlags dflags (map unLoc local_opts) + + let + needs_preprocessing + | Just (Unlit _) <- mb_phase = True + | Nothing <- mb_phase, Unlit _ <- startPhase src_fn = True + -- note: local_opts is only required if there's no Unlit phase + | dopt Opt_Cpp dflags' = True + | dopt Opt_Pp dflags' = True + | otherwise = False + + when needs_preprocessing $ + ghcError (ProgramError "buffer needs preprocesing; interactive check disabled") + + return (dflags', src_fn, buf) + + +----------------------------------------------------------------------------- +-- Error messages +----------------------------------------------------------------------------- + +noModError :: DynFlags -> SrcSpan -> Module -> FindResult -> IO ab +-- ToDo: we don't have a proper line number for this error +noModError dflags loc wanted_mod err + = throwDyn $ mkPlainErrMsg loc $ cantFindError dflags wanted_mod err + +noHsFileErr loc path + = throwDyn $ mkPlainErrMsg loc $ text "Can't find" <+> text path + +packageModErr mod + = throwDyn $ mkPlainErrMsg noSrcSpan $ + text "module" <+> quotes (ppr mod) <+> text "is a package module" + +multiRootsErr :: [ModSummary] -> IO () +multiRootsErr summs@(summ1:_) + = throwDyn $ mkPlainErrMsg noSrcSpan $ + text "module" <+> quotes (ppr mod) <+> + text "is defined in multiple files:" <+> + sep (map text files) + where + mod = ms_mod summ1 + files = map (expectJust "checkDup" . ml_hs_file . ms_location) summs + +cyclicModuleErr :: [ModSummary] -> SDoc +cyclicModuleErr ms + = hang (ptext SLIT("Module imports form a cycle for modules:")) + 2 (vcat (map show_one ms)) + where + show_one ms = sep [ show_mod (ms_hsc_src ms) (ms_mod ms), + nest 2 $ ptext SLIT("imports:") <+> + (pp_imps HsBootFile (ms_srcimps ms) + $$ pp_imps HsSrcFile (ms_imps ms))] + show_mod hsc_src mod = ppr mod <> text (hscSourceString hsc_src) + pp_imps src mods = fsep (map (show_mod src) mods) + + +-- | Inform GHC that the working directory has changed. GHC will flush +-- its cache of module locations, since it may no longer be valid. +-- Note: if you change the working directory, you should also unload +-- the current program (set targets to empty, followed by load). +workingDirectoryChanged :: Session -> IO () +workingDirectoryChanged s = withSession s $ \hsc_env -> + flushFinderCache (hsc_FC hsc_env) + +-- ----------------------------------------------------------------------------- +-- inspecting the session + +-- | Get the module dependency graph. +getModuleGraph :: Session -> IO ModuleGraph -- ToDo: DiGraph ModSummary +getModuleGraph s = withSession s (return . hsc_mod_graph) + +isLoaded :: Session -> Module -> IO Bool +isLoaded s m = withSession s $ \hsc_env -> + return $! isJust (lookupModuleEnv (hsc_HPT hsc_env) m) + +getBindings :: Session -> IO [TyThing] +getBindings s = withSession s (return . nameEnvElts . ic_type_env . hsc_IC) + +getPrintUnqual :: Session -> IO PrintUnqualified +getPrintUnqual s = withSession s (return . icPrintUnqual . hsc_IC) + +-- | Container for information about a 'Module'. +data ModuleInfo = ModuleInfo { + minf_type_env :: TypeEnv, + minf_exports :: NameSet, + minf_rdr_env :: Maybe GlobalRdrEnv, -- Nothing for a compiled/package mod + minf_instances :: [Instance] + -- ToDo: this should really contain the ModIface too + } + -- We don't want HomeModInfo here, because a ModuleInfo applies + -- to package modules too. 
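
A sketch of how client code might use getModuleInfo and the modInfo* accessors defined just below; they are assumed (not guaranteed) to be re-exported by GHC, and mkModule and the printing helpers come from modules this file already imports.

import qualified GHC
import Module     ( mkModule )
import Outputable ( showSDoc, ppr )

listExports :: GHC.Session -> String -> IO ()
listExports ses name = do
  mb <- GHC.getModuleInfo ses (mkModule name)
  case mb of
    Nothing   -> putStrLn (name ++ ": no information available")
    Just minf -> mapM_ (putStrLn . showSDoc . ppr) (GHC.modInfoExports minf)
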
+ +-- | Request information about a loaded 'Module' +getModuleInfo :: Session -> Module -> IO (Maybe ModuleInfo) +getModuleInfo s mdl = withSession s $ \hsc_env -> do + let mg = hsc_mod_graph hsc_env + if mdl `elem` map ms_mod mg + then getHomeModuleInfo hsc_env mdl + else do + {- if isHomeModule (hsc_dflags hsc_env) mdl + then return Nothing + else -} getPackageModuleInfo hsc_env mdl + -- getPackageModuleInfo will attempt to find the interface, so + -- we don't want to call it for a home module, just in case there + -- was a problem loading the module and the interface doesn't + -- exist... hence the isHomeModule test here. (ToDo: reinstate) + +getPackageModuleInfo :: HscEnv -> Module -> IO (Maybe ModuleInfo) +getPackageModuleInfo hsc_env mdl = do +#ifdef GHCI + (_msgs, mb_names) <- getModuleExports hsc_env mdl + case mb_names of + Nothing -> return Nothing + Just names -> do + eps <- readIORef (hsc_EPS hsc_env) + let + pte = eps_PTE eps + n_list = nameSetToList names + tys = [ ty | name <- n_list, + Just ty <- [lookupTypeEnv pte name] ] + -- + return (Just (ModuleInfo { + minf_type_env = mkTypeEnv tys, + minf_exports = names, + minf_rdr_env = Just $! nameSetToGlobalRdrEnv names mdl, + minf_instances = error "getModuleInfo: instances for package module unimplemented" + })) +#else + -- bogusly different for non-GHCI (ToDo) + return Nothing +#endif + +getHomeModuleInfo hsc_env mdl = + case lookupModuleEnv (hsc_HPT hsc_env) mdl of + Nothing -> return Nothing + Just hmi -> do + let details = hm_details hmi + return (Just (ModuleInfo { + minf_type_env = md_types details, + minf_exports = md_exports details, + minf_rdr_env = mi_globals $! hm_iface hmi, + minf_instances = md_insts details + })) + +-- | The list of top-level entities defined in a module +modInfoTyThings :: ModuleInfo -> [TyThing] +modInfoTyThings minf = typeEnvElts (minf_type_env minf) + +modInfoTopLevelScope :: ModuleInfo -> Maybe [Name] +modInfoTopLevelScope minf + = fmap (map gre_name . globalRdrEnvElts) (minf_rdr_env minf) + +modInfoExports :: ModuleInfo -> [Name] +modInfoExports minf = nameSetToList $! minf_exports minf + +-- | Returns the instances defined by the specified module. +-- Warning: currently unimplemented for package modules. +modInfoInstances :: ModuleInfo -> [Instance] +modInfoInstances = minf_instances + +modInfoIsExportedName :: ModuleInfo -> Name -> Bool +modInfoIsExportedName minf name = elemNameSet name (minf_exports minf) + +modInfoPrintUnqualified :: ModuleInfo -> Maybe PrintUnqualified +modInfoPrintUnqualified minf = fmap unQualInScope (minf_rdr_env minf) + +modInfoLookupName :: Session -> ModuleInfo -> Name -> IO (Maybe TyThing) +modInfoLookupName s minf name = withSession s $ \hsc_env -> do + case lookupTypeEnv (minf_type_env minf) name of + Just tyThing -> return (Just tyThing) + Nothing -> do + eps <- readIORef (hsc_EPS hsc_env) + return $! lookupType (hsc_HPT hsc_env) (eps_PTE eps) name + +isDictonaryId :: Id -> Bool +isDictonaryId id + = case tcSplitSigmaTy (idType id) of { (tvs, theta, tau) -> isDictTy tau } + +-- | Looks up a global name: that is, any top-level name in any +-- visible module. Unlike 'lookupName', lookupGlobalName does not use +-- the interactive context, and therefore does not require a preceding +-- 'setContext'. +lookupGlobalName :: Session -> Name -> IO (Maybe TyThing) +lookupGlobalName s name = withSession s $ \hsc_env -> do + eps <- readIORef (hsc_EPS hsc_env) + return $! 
lookupType (hsc_HPT hsc_env) (eps_PTE eps) name + +-- ----------------------------------------------------------------------------- +-- Misc exported utils + +dataConType :: DataCon -> Type +dataConType dc = idType (dataConWrapId dc) + +-- | print a 'NamedThing', adding parentheses if the name is an operator. +pprParenSymName :: NamedThing a => a -> SDoc +pprParenSymName a = parenSymOcc (getOccName a) (ppr (getName a)) + +-- ---------------------------------------------------------------------------- + +#if 0 + +-- ToDo: +-- - Data and Typeable instances for HsSyn. + +-- ToDo: check for small transformations that happen to the syntax in +-- the typechecker (eg. -e ==> negate e, perhaps for fromIntegral) + +-- ToDo: maybe use TH syntax instead of IfaceSyn? There's already a way +-- to get from TyCons, Ids etc. to TH syntax (reify). + +-- :browse will use either lm_toplev or inspect lm_interface, depending +-- on whether the module is interpreted or not. + +-- This is for reconstructing refactored source code +-- Calls the lexer repeatedly. +-- ToDo: add comment tokens to token stream +getTokenStream :: Session -> Module -> IO [Located Token] +#endif + +-- ----------------------------------------------------------------------------- +-- Interactive evaluation + +#ifdef GHCI + +-- | Set the interactive evaluation context. +-- +-- Setting the context doesn't throw away any bindings; the bindings +-- we've built up in the InteractiveContext simply move to the new +-- module. They always shadow anything in scope in the current context. +setContext :: Session + -> [Module] -- entire top level scope of these modules + -> [Module] -- exports only of these modules + -> IO () +setContext (Session ref) toplevs exports = do + hsc_env <- readIORef ref + let old_ic = hsc_IC hsc_env + hpt = hsc_HPT hsc_env + + mapM_ (checkModuleExists hsc_env hpt) exports + export_env <- mkExportEnv hsc_env exports + toplev_envs <- mapM (mkTopLevEnv hpt) toplevs + let all_env = foldr plusGlobalRdrEnv export_env toplev_envs + writeIORef ref hsc_env{ hsc_IC = old_ic { ic_toplev_scope = toplevs, + ic_exports = exports, + ic_rn_gbl_env = all_env }} + + +-- Make a GlobalRdrEnv based on the exports of the modules only. +mkExportEnv :: HscEnv -> [Module] -> IO GlobalRdrEnv +mkExportEnv hsc_env mods = do + stuff <- mapM (getModuleExports hsc_env) mods + let + (_msgs, mb_name_sets) = unzip stuff + gres = [ nameSetToGlobalRdrEnv name_set mod + | (Just name_set, mod) <- zip mb_name_sets mods ] + -- + return $! 
foldr plusGlobalRdrEnv emptyGlobalRdrEnv gres + +nameSetToGlobalRdrEnv :: NameSet -> Module -> GlobalRdrEnv +nameSetToGlobalRdrEnv names mod = + mkGlobalRdrEnv [ GRE { gre_name = name, gre_prov = vanillaProv mod } + | name <- nameSetToList names ] + +vanillaProv :: Module -> Provenance +-- We're building a GlobalRdrEnv as if the user imported +-- all the specified modules into the global interactive module +vanillaProv mod = Imported [ImpSpec { is_decl = decl, is_item = ImpAll}] + where + decl = ImpDeclSpec { is_mod = mod, is_as = mod, + is_qual = False, + is_dloc = srcLocSpan interactiveSrcLoc } + +checkModuleExists :: HscEnv -> HomePackageTable -> Module -> IO () +checkModuleExists hsc_env hpt mod = + case lookupModuleEnv hpt mod of + Just mod_info -> return () + _not_a_home_module -> do + res <- findPackageModule hsc_env mod True + case res of + Found _ _ -> return () + err -> let msg = cantFindError (hsc_dflags hsc_env) mod err in + throwDyn (CmdLineError (showSDoc msg)) + +mkTopLevEnv :: HomePackageTable -> Module -> IO GlobalRdrEnv +mkTopLevEnv hpt modl + = case lookupModuleEnv hpt modl of + Nothing -> + throwDyn (ProgramError ("mkTopLevEnv: not a home module " + ++ showSDoc (pprModule modl))) + Just details -> + case mi_globals (hm_iface details) of + Nothing -> + throwDyn (ProgramError ("mkTopLevEnv: not interpreted " + ++ showSDoc (pprModule modl))) + Just env -> return env + +-- | Get the interactive evaluation context, consisting of a pair of the +-- set of modules from which we take the full top-level scope, and the set +-- of modules from which we take just the exports respectively. +getContext :: Session -> IO ([Module],[Module]) +getContext s = withSession s (\HscEnv{ hsc_IC=ic } -> + return (ic_toplev_scope ic, ic_exports ic)) + +-- | Returns 'True' if the specified module is interpreted, and hence has +-- its full top-level scope available. +moduleIsInterpreted :: Session -> Module -> IO Bool +moduleIsInterpreted s modl = withSession s $ \h -> + case lookupModuleEnv (hsc_HPT h) modl of + Just details -> return (isJust (mi_globals (hm_iface details))) + _not_a_home_module -> return False + +-- | Looks up an identifier in the current interactive context (for :info) +getInfo :: Session -> Name -> IO (Maybe (TyThing,Fixity,[Instance])) +getInfo s name = withSession s $ \hsc_env -> tcRnGetInfo hsc_env name + +-- | Returns all names in scope in the current interactive context +getNamesInScope :: Session -> IO [Name] +getNamesInScope s = withSession s $ \hsc_env -> do + return (map gre_name (globalRdrEnvElts (ic_rn_gbl_env (hsc_IC hsc_env)))) + +getRdrNamesInScope :: Session -> IO [RdrName] +getRdrNamesInScope s = withSession s $ \hsc_env -> do + let env = ic_rn_gbl_env (hsc_IC hsc_env) + return (concat (map greToRdrNames (globalRdrEnvElts env))) + +-- ToDo: move to RdrName +greToRdrNames :: GlobalRdrElt -> [RdrName] +greToRdrNames GRE{ gre_name = name, gre_prov = prov } + = case prov of + LocalDef -> [unqual] + Imported specs -> concat (map do_spec (map is_decl specs)) + where + occ = nameOccName name + unqual = Unqual occ + do_spec decl_spec + | is_qual decl_spec = [qual] + | otherwise = [unqual,qual] + where qual = Qual (is_as decl_spec) occ + +-- | Parses a string as an identifier, and returns the list of 'Name's that +-- the identifier can refer to in the current interactive context. 
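
'parseName', defined next, combines naturally with setContext and getNamesInScope above. A sketch of an interactive query follows; it is meaningful only in GHCI-enabled builds, and the module name "Main" and the assumed re-exports are illustrative.

import qualified GHC
import Module     ( mkModule )
import Outputable ( showSDoc, ppr )

whatIs :: GHC.Session -> String -> IO ()
whatIs ses str = do
  GHC.setContext ses [mkModule "Main"] []   -- full top-level scope of Main (must be interpreted)
  names <- GHC.parseName ses str
  case names of
    [] -> putStrLn (str ++ " is not in scope")
    ns -> mapM_ (putStrLn . showSDoc . ppr) ns
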
+parseName :: Session -> String -> IO [Name] +parseName s str = withSession s $ \hsc_env -> do + maybe_rdr_name <- hscParseIdentifier (hsc_dflags hsc_env) str + case maybe_rdr_name of + Nothing -> return [] + Just (L _ rdr_name) -> do + mb_names <- tcRnLookupRdrName hsc_env rdr_name + case mb_names of + Nothing -> return [] + Just ns -> return ns + -- ToDo: should return error messages + +-- | Returns the 'TyThing' for a 'Name'. The 'Name' may refer to any +-- entity known to GHC, including 'Name's defined using 'runStmt'. +lookupName :: Session -> Name -> IO (Maybe TyThing) +lookupName s name = withSession s $ \hsc_env -> tcRnLookupName hsc_env name + +-- ----------------------------------------------------------------------------- +-- Getting the type of an expression + +-- | Get the type of an expression +exprType :: Session -> String -> IO (Maybe Type) +exprType s expr = withSession s $ \hsc_env -> do + maybe_stuff <- hscTcExpr hsc_env expr + case maybe_stuff of + Nothing -> return Nothing + Just ty -> return (Just tidy_ty) + where + tidy_ty = tidyType emptyTidyEnv ty + +-- ----------------------------------------------------------------------------- +-- Getting the kind of a type + +-- | Get the kind of a type +typeKind :: Session -> String -> IO (Maybe Kind) +typeKind s str = withSession s $ \hsc_env -> do + maybe_stuff <- hscKcType hsc_env str + case maybe_stuff of + Nothing -> return Nothing + Just kind -> return (Just kind) + +----------------------------------------------------------------------------- +-- cmCompileExpr: compile an expression and deliver an HValue + +compileExpr :: Session -> String -> IO (Maybe HValue) +compileExpr s expr = withSession s $ \hsc_env -> do + maybe_stuff <- hscStmt hsc_env ("let __cmCompileExpr = "++expr) + case maybe_stuff of + Nothing -> return Nothing + Just (new_ic, names, hval) -> do + -- Run it! + hvals <- (unsafeCoerce# hval) :: IO [HValue] + + case (names,hvals) of + ([n],[hv]) -> return (Just hv) + _ -> panic "compileExpr" + +-- ----------------------------------------------------------------------------- +-- running a statement interactively + +data RunResult + = RunOk [Name] -- ^ names bound by this evaluation + | RunFailed -- ^ statement failed compilation + | RunException Exception -- ^ statement raised an exception + +-- | Run a statement in the current interactive context. Statemenet +-- may bind multple values. +runStmt :: Session -> String -> IO RunResult +runStmt (Session ref) expr + = do + hsc_env <- readIORef ref + + -- Turn off -fwarn-unused-bindings when running a statement, to hide + -- warnings about the implicit bindings we introduce. + let dflags' = dopt_unset (hsc_dflags hsc_env) Opt_WarnUnusedBinds + hsc_env' = hsc_env{ hsc_dflags = dflags' } + + maybe_stuff <- hscStmt hsc_env' expr + + case maybe_stuff of + Nothing -> return RunFailed + Just (new_hsc_env, names, hval) -> do + + let thing_to_run = unsafeCoerce# hval :: IO [HValue] + either_hvals <- sandboxIO thing_to_run + + case either_hvals of + Left e -> do + -- on error, keep the *old* interactive context, + -- so that 'it' is not bound to something + -- that doesn't exist. + return (RunException e) + + Right hvals -> do + -- Get the newly bound things, and bind them. + -- Don't need to delete any shadowed bindings; + -- the new ones override the old ones. + extendLinkEnv (zip names hvals) + + writeIORef ref new_hsc_env + return (RunOk names) + +-- When running a computation, we redirect ^C exceptions to the running +-- thread. 
ToDo: we might want a way to continue even if the target +-- thread doesn't die when it receives the exception... "this thread +-- is not responding". +sandboxIO :: IO a -> IO (Either Exception a) +sandboxIO thing = do + m <- newEmptyMVar + ts <- takeMVar interruptTargetThread + child <- forkIO (do res <- Exception.try thing; putMVar m res) + putMVar interruptTargetThread (child:ts) + takeMVar m `finally` modifyMVar_ interruptTargetThread (return.tail) + +{- +-- This version of sandboxIO runs the expression in a completely new +-- RTS main thread. It is disabled for now because ^C exceptions +-- won't be delivered to the new thread, instead they'll be delivered +-- to the (blocked) GHCi main thread. + +-- SLPJ: when re-enabling this, reflect a wrong-stat error as an exception + +sandboxIO :: IO a -> IO (Either Int (Either Exception a)) +sandboxIO thing = do + st_thing <- newStablePtr (Exception.try thing) + alloca $ \ p_st_result -> do + stat <- rts_evalStableIO st_thing p_st_result + freeStablePtr st_thing + if stat == 1 + then do st_result <- peek p_st_result + result <- deRefStablePtr st_result + freeStablePtr st_result + return (Right result) + else do + return (Left (fromIntegral stat)) + +foreign import "rts_evalStableIO" {- safe -} + rts_evalStableIO :: StablePtr (IO a) -> Ptr (StablePtr a) -> IO CInt + -- more informative than the C type! +-} + +----------------------------------------------------------------------------- +-- show a module and it's source/object filenames + +showModule :: Session -> ModSummary -> IO String +showModule s mod_summary = withSession s $ \hsc_env -> do + case lookupModuleEnv (hsc_HPT hsc_env) (ms_mod mod_summary) of + Nothing -> panic "missing linkable" + Just mod_info -> return (showModMsg (hscTarget (hsc_dflags hsc_env)) (not obj_linkable) mod_summary) + where + obj_linkable = isObjectLinkable (expectJust "showModule" (hm_linkable mod_info)) + +#endif /* GHCI */ diff --git a/compiler/main/HeaderInfo.hs b/compiler/main/HeaderInfo.hs new file mode 100644 index 0000000000..913ac33a33 --- /dev/null +++ b/compiler/main/HeaderInfo.hs @@ -0,0 +1,201 @@ +----------------------------------------------------------------------------- +-- +-- Parsing the top of a Haskell source file to get its module name, +-- imports and options. +-- +-- (c) Simon Marlow 2005 +-- (c) Lemmih 2006 +-- +----------------------------------------------------------------------------- + +module HeaderInfo ( getImportsFromFile, getImports + , getOptionsFromFile, getOptions + , optionsErrorMsgs ) where + +#include "HsVersions.h" + +import Parser ( parseHeader ) +import Lexer ( P(..), ParseResult(..), mkPState, pragState + , lexer, Token(..), PState(..) ) +import FastString +import HsSyn ( ImportDecl(..), HsModule(..) ) +import Module ( Module, mkModule ) +import PrelNames ( gHC_PRIM ) +import StringBuffer ( StringBuffer(..), hGetStringBuffer, hGetStringBufferBlock + , appendStringBuffers ) +import SrcLoc ( Located(..), mkSrcLoc, unLoc, noSrcSpan ) +import FastString ( mkFastString ) +import DynFlags ( DynFlags ) +import ErrUtils +import Util +import Outputable +import Pretty () +import Panic +import Bag ( unitBag, emptyBag, listToBag ) + +import Distribution.Compiler + +import TRACE + +import EXCEPTION ( throwDyn ) +import IO +import List + +#if __GLASGOW_HASKELL__ >= 601 +import System.IO ( openBinaryFile ) +#else +import IOExts ( openFileEx, IOModeEx(..) 
) +#endif + +#if __GLASGOW_HASKELL__ < 601 +openBinaryFile fp mode = openFileEx fp (BinaryMode mode) +#endif + +-- getImportsFromFile is careful to close the file afterwards, otherwise +-- we can end up with a large number of open handles before the garbage +-- collector gets around to closing them. +getImportsFromFile :: DynFlags -> FilePath + -> IO ([Located Module], [Located Module], Located Module) +getImportsFromFile dflags filename = do + buf <- hGetStringBuffer filename + getImports dflags buf filename + +getImports :: DynFlags -> StringBuffer -> FilePath + -> IO ([Located Module], [Located Module], Located Module) +getImports dflags buf filename = do + let loc = mkSrcLoc (mkFastString filename) 1 0 + case unP parseHeader (mkPState buf loc dflags) of + PFailed span err -> parseError span err + POk _ rdr_module -> + case rdr_module of + L _ (HsModule mod _ imps _ _) -> + let + mod_name | Just located_mod <- mod = located_mod + | otherwise = L noSrcSpan (mkModule "Main") + (src_idecls, ord_idecls) = partition isSourceIdecl (map unLoc imps) + source_imps = map getImpMod src_idecls + ordinary_imps = filter ((/= gHC_PRIM) . unLoc) + (map getImpMod ord_idecls) + -- GHC.Prim doesn't exist physically, so don't go looking for it. + in + return (source_imps, ordinary_imps, mod_name) + +parseError span err = throwDyn $ mkPlainErrMsg span err + +isSourceIdecl (ImportDecl _ s _ _ _) = s + +getImpMod (ImportDecl located_mod _ _ _ _) = located_mod + +-------------------------------------------------------------- +-- Get options +-------------------------------------------------------------- + + +getOptionsFromFile :: FilePath -- input file + -> IO [Located String] -- options, if any +getOptionsFromFile filename + = bracket (openBinaryFile filename ReadMode) + (hClose) + (\handle -> + do buf <- hGetStringBufferBlock handle blockSize + loop handle buf) + where blockSize = 1024 + loop handle buf + | len buf == 0 = return [] + | otherwise + = case getOptions' buf filename of + (Nothing, opts) -> return opts + (Just buf', opts) -> do nextBlock <- hGetStringBufferBlock handle blockSize + newBuf <- appendStringBuffers buf' nextBlock + if len newBuf == len buf + then return opts + else do opts' <- loop handle newBuf + return (opts++opts') + +getOptions :: StringBuffer -> FilePath -> [Located String] +getOptions buf filename + = case getOptions' buf filename of + (_,opts) -> opts + +-- The token parser is written manually because Happy can't +-- return a partial result when it encounters a lexer error. +-- We want to extract options before the buffer is passed through +-- CPP, so we can't use the same trick as 'getImports'. +getOptions' :: StringBuffer -- Input buffer + -> FilePath -- Source file. Used for msgs only. + -> ( Maybe StringBuffer -- Just => we can use more input + , [Located String] -- Options. 
+ ) +getOptions' buf filename + = parseToks (lexAll (pragState buf loc)) + where loc = mkSrcLoc (mkFastString filename) 1 0 + + getToken (buf,L _loc tok) = tok + getLoc (buf,L loc _tok) = loc + getBuf (buf,_tok) = buf + combine opts (flag, opts') = (flag, opts++opts') + add opt (flag, opts) = (flag, opt:opts) + + parseToks (open:close:xs) + | IToptions_prag str <- getToken open + , ITclose_prag <- getToken close + = map (L (getLoc open)) (words str) `combine` + parseToks xs + parseToks (open:close:xs) + | ITinclude_prag str <- getToken open + , ITclose_prag <- getToken close + = map (L (getLoc open)) ["-#include",removeSpaces str] `combine` + parseToks xs + parseToks (open:xs) + | ITlanguage_prag <- getToken open + = parseLanguage xs + -- The last token before EOF could have been truncated. + -- We ignore it to be on the safe side. + parseToks [tok,eof] + | ITeof <- getToken eof + = (Just (getBuf tok),[]) + parseToks (eof:_) + | ITeof <- getToken eof + = (Just (getBuf eof),[]) + parseToks _ = (Nothing,[]) + parseLanguage ((_buf,L loc (ITconid fs)):rest) + = checkExtension (L loc fs) `add` + case rest of + (_,L loc ITcomma):more -> parseLanguage more + (_,L loc ITclose_prag):more -> parseToks more + (_,L loc _):_ -> languagePragParseError loc + parseLanguage (tok:_) + = languagePragParseError (getLoc tok) + lexToken t = return t + lexAll state = case unP (lexer lexToken) state of + POk state' t@(L _ ITeof) -> [(buffer state,t)] + POk state' t -> (buffer state,t):lexAll state' + _ -> [(buffer state,L (last_loc state) ITeof)] + +checkExtension :: Located FastString -> Located String +checkExtension (L l ext) + = case reads (unpackFS ext) of + [] -> languagePragParseError l + (okExt,""):_ -> case extensionsToGHCFlag [okExt] of + ([],[opt]) -> L l opt + _ -> unsupportedExtnError l okExt + +languagePragParseError loc = + pgmError (showSDoc (mkLocMessage loc ( + text "cannot parse LANGUAGE pragma"))) + +unsupportedExtnError loc unsup = + pgmError (showSDoc (mkLocMessage loc ( + text "unsupported extension: " <> + (text.show) unsup))) + + +optionsErrorMsgs :: [String] -> [Located String] -> FilePath -> Messages +optionsErrorMsgs unhandled_flags flags_lines filename + = (emptyBag, listToBag (map mkMsg unhandled_flags_lines)) + where unhandled_flags_lines = [ L l f | f <- unhandled_flags, + L l f' <- flags_lines, f == f' ] + mkMsg (L flagSpan flag) = + ErrUtils.mkPlainErrMsg flagSpan $ + text "unknown flag in {-# OPTIONS #-} pragma:" <+> text flag + diff --git a/compiler/main/HscMain.lhs b/compiler/main/HscMain.lhs new file mode 100644 index 0000000000..e170f8fa31 --- /dev/null +++ b/compiler/main/HscMain.lhs @@ -0,0 +1,965 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-2000 +% + +\section[GHC_Main]{Main driver for Glasgow Haskell compiler} + +\begin{code} +module HscMain + ( newHscEnv, hscCmmFile + , hscFileCheck + , hscParseIdentifier +#ifdef GHCI + , hscStmt, hscTcExpr, hscKcType + , compileExpr +#endif + , hscCompileOneShot -- :: Compiler HscStatus + , hscCompileBatch -- :: Compiler (HscStatus, ModIface, ModDetails) + , hscCompileNothing -- :: Compiler (HscStatus, ModIface, ModDetails) + , hscCompileInteractive -- :: Compiler (InteractiveStatus, ModIface, ModDetails) + , HscStatus (..) + , InteractiveStatus (..) + , HscChecked (..) 
+ ) where + +#include "HsVersions.h" + +#ifdef GHCI +import HsSyn ( Stmt(..), LHsExpr, LStmt, LHsType ) +import Module ( Module ) +import CodeOutput ( outputForeignStubs ) +import ByteCodeGen ( byteCodeGen, coreExprToBCOs ) +import Linker ( HValue, linkExpr ) +import CoreTidy ( tidyExpr ) +import CorePrep ( corePrepExpr ) +import Flattening ( flattenExpr ) +import Desugar ( deSugarExpr ) +import SimplCore ( simplifyExpr ) +import TcRnDriver ( tcRnStmt, tcRnExpr, tcRnType ) +import Type ( Type ) +import PrelNames ( iNTERACTIVE ) +import Kind ( Kind ) +import CoreLint ( lintUnfolding ) +import DsMeta ( templateHaskellNames ) +import SrcLoc ( noSrcLoc ) +import VarEnv ( emptyTidyEnv ) +#endif + +import Var ( Id ) +import Module ( emptyModuleEnv, ModLocation(..) ) +import RdrName ( GlobalRdrEnv, RdrName ) +import HsSyn ( HsModule, LHsBinds, HsGroup, LIE, LImportDecl ) +import SrcLoc ( Located(..) ) +import StringBuffer ( hGetStringBuffer, stringToStringBuffer ) +import Parser +import Lexer ( P(..), ParseResult(..), mkPState ) +import SrcLoc ( mkSrcLoc ) +import TcRnDriver ( tcRnModule, tcRnExtCore ) +import TcIface ( typecheckIface ) +import TcRnMonad ( initIfaceCheck, TcGblEnv(..) ) +import IfaceEnv ( initNameCache ) +import LoadIface ( ifaceStats, initExternalPackageState ) +import PrelInfo ( wiredInThings, basicKnownKeyNames ) +import MkIface ( checkOldIface, mkIface, writeIfaceFile ) +import Desugar ( deSugar ) +import Flattening ( flatten ) +import SimplCore ( core2core ) +import TidyPgm ( tidyProgram, mkBootModDetails ) +import CorePrep ( corePrepPgm ) +import CoreToStg ( coreToStg ) +import TyCon ( isDataTyCon ) +import Packages ( mkHomeModules ) +import Name ( Name, NamedThing(..) ) +import SimplStg ( stg2stg ) +import CodeGen ( codeGen ) +import CmmParse ( parseCmmFile ) +import CodeOutput ( codeOutput ) + +import DynFlags +import ErrUtils +import UniqSupply ( mkSplitUniqSupply ) + +import Outputable +import HscStats ( ppSourceStats ) +import HscTypes +import MkExternalCore ( emitExternalCore ) +import ParserCore +import ParserCoreUtils +import FastString +import Maybes ( expectJust ) +import Bag ( unitBag ) +import Monad ( unless ) +import IO +import DATA_IOREF ( newIORef, readIORef ) +\end{code} + + +%************************************************************************ +%* * + Initialisation +%* * +%************************************************************************ + +\begin{code} +newHscEnv :: DynFlags -> IO HscEnv +newHscEnv dflags + = do { eps_var <- newIORef initExternalPackageState + ; us <- mkSplitUniqSupply 'r' + ; nc_var <- newIORef (initNameCache us knownKeyNames) + ; fc_var <- newIORef emptyModuleEnv + ; return (HscEnv { hsc_dflags = dflags, + hsc_targets = [], + hsc_mod_graph = [], + hsc_IC = emptyInteractiveContext, + hsc_HPT = emptyHomePackageTable, + hsc_EPS = eps_var, + hsc_NC = nc_var, + hsc_FC = fc_var } ) } + + +knownKeyNames :: [Name] -- Put here to avoid loops involving DsMeta, + -- where templateHaskellNames are defined +knownKeyNames = map getName wiredInThings + ++ basicKnownKeyNames +#ifdef GHCI + ++ templateHaskellNames +#endif +\end{code} + + +%************************************************************************ +%* * + The main compiler pipeline +%* * +%************************************************************************ + + -------------------------------- + The compilation proper + -------------------------------- + + +It's the task of the compilation proper to compile Haskell, hs-boot and +core files to either byte-code, hard-code (C, 
asm, Java, ect) or to +nothing at all (the module is still parsed and type-checked. This +feature is mostly used by IDE's and the likes). +Compilation can happen in either 'one-shot', 'batch', 'nothing', +or 'interactive' mode. 'One-shot' mode targets hard-code, 'batch' mode +targets hard-code, 'nothing' mode targets nothing and 'interactive' mode +targets byte-code. +The modes are kept separate because of their different types and meanings. +In 'one-shot' mode, we're only compiling a single file and can therefore +discard the new ModIface and ModDetails. This is also the reason it only +targets hard-code; compiling to byte-code or nothing doesn't make sense +when we discard the result. +'Batch' mode is like 'one-shot' except that we keep the resulting ModIface +and ModDetails. 'Batch' mode doesn't target byte-code since that require +us to return the newly compiled byte-code. +'Nothing' mode has exactly the same type as 'batch' mode but they're still +kept separate. This is because compiling to nothing is fairly special: We +don't output any interface files, we don't run the simplifier and we don't +generate any code. +'Interactive' mode is similar to 'batch' mode except that we return the +compiled byte-code together with the ModIface and ModDetails. + +Trying to compile a hs-boot file to byte-code will result in a run-time +error. This is the only thing that isn't caught by the type-system. + +\begin{code} + +data HscChecked + = HscChecked + -- parsed + (Located (HsModule RdrName)) + -- renamed + (Maybe (HsGroup Name,[LImportDecl Name],Maybe [LIE Name])) + -- typechecked + (Maybe (LHsBinds Id, GlobalRdrEnv, ModDetails)) + + +-- Status of a compilation to hard-code or nothing. +data HscStatus + = HscNoRecomp + | HscRecomp Bool -- Has stub files. + -- This is a hack. We can't compile C files here + -- since it's done in DriverPipeline. For now we + -- just return True if we want the caller to compile + -- it for us. + +-- Status of a compilation to byte-code. +data InteractiveStatus + = InteractiveNoRecomp + | InteractiveRecomp Bool -- Same as HscStatus + CompiledByteCode + + +-- I want Control.Monad.State! --Lemmih 03/07/2006 +newtype Comp a = Comp {runComp :: CompState -> IO (a, CompState)} + +instance Monad Comp where + g >>= fn = Comp $ \s -> runComp g s >>= \(a,s') -> runComp (fn a) s' + return a = Comp $ \s -> return (a,s) + fail = error + +evalComp :: Comp a -> CompState -> IO a +evalComp comp st = do (val,_st') <- runComp comp st + return val + +data CompState + = CompState + { compHscEnv :: HscEnv + , compModSummary :: ModSummary + , compOldIface :: Maybe ModIface + } + +get :: Comp CompState +get = Comp $ \s -> return (s,s) + +gets :: (CompState -> a) -> Comp a +gets getter = do st <- get + return (getter st) + +liftIO :: IO a -> Comp a +liftIO ioA = Comp $ \s -> do a <- ioA + return (a,s) + +type NoRecomp result = ModIface -> Comp result +type FrontEnd core = Comp (Maybe core) + +-- FIXME: The old interface and module index are only using in 'batch' and +-- 'interactive' mode. They should be removed from 'oneshot' mode. +type Compiler result = HscEnv + -> ModSummary + -> Bool -- True <=> source unchanged + -> Maybe ModIface -- Old interface, if available + -> Maybe (Int,Int) -- Just (i,n) <=> module i of n (for msgs) + -> IO (Maybe result) + + +-- This functions checks if recompilation is necessary and +-- then combines the FrontEnd and BackEnd to a working compiler. +hscMkCompiler :: NoRecomp result -- What to do when recompilation isn't required. 
+ -> (Maybe (Int,Int) -> Bool -> Comp ()) + -> FrontEnd core + -> (core -> Comp result) -- Backend. + -> Compiler result +hscMkCompiler norecomp messenger frontend backend + hsc_env mod_summary source_unchanged + mbOldIface mbModIndex + = flip evalComp (CompState hsc_env mod_summary mbOldIface) $ + do (recomp_reqd, mbCheckedIface) + <- {-# SCC "checkOldIface" #-} + liftIO $ checkOldIface hsc_env mod_summary + source_unchanged mbOldIface + case mbCheckedIface of + Just iface | not recomp_reqd + -> do messenger mbModIndex False + result <- norecomp iface + return (Just result) + _otherwise + -> do messenger mbModIndex True + mbCore <- frontend + case mbCore of + Nothing + -> return Nothing + Just core + -> do result <- backend core + return (Just result) + +-------------------------------------------------------------- +-- Compilers +-------------------------------------------------------------- + +-- 1 2 3 4 5 6 7 8 9 +-- Compile Haskell, boot and extCore in OneShot mode. +hscCompileOneShot :: Compiler HscStatus +hscCompileOneShot hsc_env mod_summary = + compiler hsc_env mod_summary + where mkComp = hscMkCompiler norecompOneShot oneShotMsg + -- How to compile nonBoot files. + nonBootComp inp = hscSimplify inp >>= hscNormalIface >>= + hscWriteIface >>= hscOneShot + -- How to compile boot files. + bootComp inp = hscSimpleIface inp >>= hscWriteIface >>= hscConst (HscRecomp False) + compiler + = case ms_hsc_src mod_summary of + ExtCoreFile + -> mkComp hscCoreFrontEnd nonBootComp + HsSrcFile + -> mkComp hscFileFrontEnd nonBootComp + HsBootFile + -> mkComp hscFileFrontEnd bootComp + +-- Compile Haskell, boot and extCore in batch mode. +hscCompileBatch :: Compiler (HscStatus, ModIface, ModDetails) +hscCompileBatch hsc_env mod_summary + = compiler hsc_env mod_summary + where mkComp = hscMkCompiler norecompBatch batchMsg + nonBootComp inp = hscSimplify inp >>= hscNormalIface >>= + hscWriteIface >>= hscBatch + bootComp inp = hscSimpleIface inp >>= hscWriteIface >>= hscNothing + compiler + = case ms_hsc_src mod_summary of + ExtCoreFile + -> mkComp hscCoreFrontEnd nonBootComp + HsSrcFile + -> mkComp hscFileFrontEnd nonBootComp + HsBootFile + -> mkComp hscFileFrontEnd bootComp + +-- Type-check Haskell, boot and extCore. +-- Does it make sense to compile extCore to nothing? +hscCompileNothing :: Compiler (HscStatus, ModIface, ModDetails) +hscCompileNothing hsc_env mod_summary + = compiler hsc_env mod_summary + where mkComp = hscMkCompiler norecompBatch batchMsg + pipeline inp = hscSimpleIface inp >>= hscIgnoreIface >>= hscNothing + compiler + = case ms_hsc_src mod_summary of + ExtCoreFile + -> mkComp hscCoreFrontEnd pipeline + HsSrcFile + -> mkComp hscFileFrontEnd pipeline + HsBootFile + -> mkComp hscFileFrontEnd pipeline + +-- Compile Haskell, extCore to bytecode. +hscCompileInteractive :: Compiler (InteractiveStatus, ModIface, ModDetails) +hscCompileInteractive hsc_env mod_summary = + hscMkCompiler norecompInteractive batchMsg + frontend backend + hsc_env mod_summary + where backend inp = hscSimplify inp >>= hscNormalIface >>= hscIgnoreIface >>= hscInteractive + frontend = case ms_hsc_src mod_summary of + ExtCoreFile -> hscCoreFrontEnd + HsSrcFile -> hscFileFrontEnd + HsBootFile -> panic bootErrorMsg + bootErrorMsg = "Compiling a HsBootFile to bytecode doesn't make sense. " ++ + "Use 'hscCompileBatch' instead." 
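-- [Editorial note, not part of this patch] The four entry points above are
-- all assembled the same way: pick a front end, chain the backend
-- combinators with (>>=) inside the state-over-IO 'Comp' monad, and hand
-- both to 'hscMkCompiler'.  As the comment further up hints ("I want
-- Control.Monad.State!"), 'Comp' is a hand-rolled StateT over IO.  The
-- block-commented, self-contained sketch below shows the same shape using
-- the mtl-style StateT; every name in it (PipeState, Pipe, frontEnd,
-- simplify, codeGen, demoPipeline) is hypothetical and purely illustrative.
{-
module CompSketch where

import Control.Monad.State (StateT, runStateT, gets, lift)

-- Plays the role of CompState: context threaded through the stages.
data PipeState = PipeState { psFile :: FilePath }

-- Plays the role of Comp: state threaded over IO.
type Pipe a = StateT PipeState IO a

-- A toy front end and two toy backend combinators, chained with (>>=)
-- in the same way as 'hscSimplify inp >>= hscNormalIface >>= hscWriteIface >>= hscBatch'.
frontEnd :: Pipe String
frontEnd = do file <- gets psFile
              lift (putStrLn ("front end: " ++ file))
              return "core"

simplify :: String -> Pipe String
simplify core = return (core ++ " (simplified)")

codeGen :: String -> Pipe Bool
codeGen core = do lift (putStrLn ("code gen: " ++ core))
                  return True

demoPipeline :: Pipe Bool
demoPipeline = frontEnd >>= simplify >>= codeGen

-- To run:  runStateT demoPipeline (PipeState "Demo.hs")
-}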
+ +-------------------------------------------------------------- +-- NoRecomp handlers +-------------------------------------------------------------- + +norecompOneShot :: NoRecomp HscStatus +norecompOneShot old_iface + = do hsc_env <- gets compHscEnv + liftIO $ do + dumpIfaceStats hsc_env + return HscNoRecomp + +norecompBatch :: NoRecomp (HscStatus, ModIface, ModDetails) +norecompBatch = norecompWorker HscNoRecomp False + +norecompInteractive :: NoRecomp (InteractiveStatus, ModIface, ModDetails) +norecompInteractive = norecompWorker InteractiveNoRecomp True + +norecompWorker :: a -> Bool -> NoRecomp (a, ModIface, ModDetails) +norecompWorker a isInterp old_iface + = do hsc_env <- gets compHscEnv + mod_summary <- gets compModSummary + liftIO $ do + new_details <- {-# SCC "tcRnIface" #-} + initIfaceCheck hsc_env $ + typecheckIface old_iface + dumpIfaceStats hsc_env + return (a, old_iface, new_details) + +-------------------------------------------------------------- +-- Progress displayers. +-------------------------------------------------------------- + +oneShotMsg :: Maybe (Int,Int) -> Bool -> Comp () +oneShotMsg _mb_mod_index recomp + = do hsc_env <- gets compHscEnv + liftIO $ do + if recomp + then return () + else compilationProgressMsg (hsc_dflags hsc_env) $ + "compilation IS NOT required" + +batchMsg :: Maybe (Int,Int) -> Bool -> Comp () +batchMsg mb_mod_index recomp + = do hsc_env <- gets compHscEnv + mod_summary <- gets compModSummary + let showMsg msg = compilationProgressMsg (hsc_dflags hsc_env) $ + (showModuleIndex mb_mod_index ++ + msg ++ showModMsg (hscTarget (hsc_dflags hsc_env)) recomp mod_summary) + liftIO $ do + if recomp + then showMsg "Compiling " + else showMsg "Skipping " + + + +-------------------------------------------------------------- +-- FrontEnds +-------------------------------------------------------------- + +hscCoreFrontEnd :: FrontEnd ModGuts +hscCoreFrontEnd = + do hsc_env <- gets compHscEnv + mod_summary <- gets compModSummary + liftIO $ do + ------------------- + -- PARSE + ------------------- + inp <- readFile (ms_hspp_file mod_summary) + case parseCore inp 1 of + FailP s + -> do errorMsg (hsc_dflags hsc_env) (text s{-ToDo: wrong-}) + return Nothing + OkP rdr_module + ------------------- + -- RENAME and TYPECHECK + ------------------- + -> do (tc_msgs, maybe_tc_result) <- {-# SCC "TypeCheck" #-} + tcRnExtCore hsc_env rdr_module + printErrorsAndWarnings (hsc_dflags hsc_env) tc_msgs + case maybe_tc_result of + Nothing -> return Nothing + Just mod_guts -> return (Just mod_guts) -- No desugaring to do! 
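-- [Editorial note, not part of this patch] Both front ends follow the same
-- convention: each phase prints its own diagnostics (errorMsg,
-- printErrorsAndWarnings, printBagOfErrors) and signals failure by
-- returning Nothing, so callers only pattern-match and stop.  A minimal,
-- self-contained sketch of that shape in plain IO follows; all names in it
-- (parsePhase, typecheckPhase, desugarPhase, frontEndSketch) are
-- hypothetical and purely illustrative.
{-
module FrontEndSketch where

-- Each phase reports its own errors and yields Maybe a result.
parsePhase :: String -> IO (Maybe [String])
parsePhase src
  | null src  = do putStrLn "parse error: empty input"; return Nothing
  | otherwise = return (Just (words src))

typecheckPhase :: [String] -> IO (Maybe Int)
typecheckPhase toks = return (Just (length toks))

desugarPhase :: Int -> IO (Maybe Int)
desugarPhase n = return (Just (n * 2))

-- The driver stops at the first Nothing, mirroring hscFileFrontEnd below.
frontEndSketch :: String -> IO (Maybe Int)
frontEndSketch src = do
  mb_toks <- parsePhase src
  case mb_toks of
    Nothing   -> return Nothing
    Just toks -> do
      mb_tc <- typecheckPhase toks
      case mb_tc of
        Nothing -> return Nothing
        Just tc -> desugarPhase tc
-}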
+ + +hscFileFrontEnd :: FrontEnd ModGuts +hscFileFrontEnd = + do hsc_env <- gets compHscEnv + mod_summary <- gets compModSummary + liftIO $ do + ------------------- + -- PARSE + ------------------- + let dflags = hsc_dflags hsc_env + hspp_file = ms_hspp_file mod_summary + hspp_buf = ms_hspp_buf mod_summary + maybe_parsed <- myParseModule dflags hspp_file hspp_buf + case maybe_parsed of + Left err + -> do printBagOfErrors dflags (unitBag err) + return Nothing + Right rdr_module + ------------------- + -- RENAME and TYPECHECK + ------------------- + -> do (tc_msgs, maybe_tc_result) + <- {-# SCC "Typecheck-Rename" #-} + tcRnModule hsc_env (ms_hsc_src mod_summary) False rdr_module + printErrorsAndWarnings dflags tc_msgs + case maybe_tc_result of + Nothing + -> return Nothing + Just tc_result + ------------------- + -- DESUGAR + ------------------- + -> do (warns, maybe_ds_result) <- {-# SCC "DeSugar" #-} + deSugar hsc_env tc_result + printBagOfWarnings dflags warns + return maybe_ds_result + +-------------------------------------------------------------- +-- Simplifiers +-------------------------------------------------------------- + +hscSimplify :: ModGuts -> Comp ModGuts +hscSimplify ds_result + = do hsc_env <- gets compHscEnv + liftIO $ do + flat_result <- {-# SCC "Flattening" #-} + flatten hsc_env ds_result + ------------------- + -- SIMPLIFY + ------------------- + simpl_result <- {-# SCC "Core2Core" #-} + core2core hsc_env flat_result + return simpl_result + +-------------------------------------------------------------- +-- Interface generators +-------------------------------------------------------------- + +-- HACK: we return ModGuts even though we know it's not gonna be used. +-- We do this because the type signature needs to be identical +-- in structure to the type of 'hscNormalIface'. +hscSimpleIface :: ModGuts -> Comp (ModIface, Bool, ModDetails, ModGuts) +hscSimpleIface ds_result + = do hsc_env <- gets compHscEnv + mod_summary <- gets compModSummary + maybe_old_iface <- gets compOldIface + liftIO $ do + details <- mkBootModDetails hsc_env ds_result + (new_iface, no_change) + <- {-# SCC "MkFinalIface" #-} + mkIface hsc_env maybe_old_iface ds_result details + -- And the answer is ... + dumpIfaceStats hsc_env + return (new_iface, no_change, details, ds_result) + +hscNormalIface :: ModGuts -> Comp (ModIface, Bool, ModDetails, CgGuts) +hscNormalIface simpl_result + = do hsc_env <- gets compHscEnv + mod_summary <- gets compModSummary + maybe_old_iface <- gets compOldIface + liftIO $ do + ------------------- + -- TIDY + ------------------- + (cg_guts, details) <- {-# SCC "CoreTidy" #-} + tidyProgram hsc_env simpl_result + + ------------------- + -- BUILD THE NEW ModIface and ModDetails + -- and emit external core if necessary + -- This has to happen *after* code gen so that the back-end + -- info has been set. Not yet clear if it matters waiting + -- until after code output + (new_iface, no_change) + <- {-# SCC "MkFinalIface" #-} + mkIface hsc_env maybe_old_iface simpl_result details + -- Emit external core + emitExternalCore (hsc_dflags hsc_env) cg_guts -- Move this? --Lemmih 03/07/2006 + dumpIfaceStats hsc_env + + ------------------- + -- Return the prepared code. 
+ return (new_iface, no_change, details, cg_guts) + +-------------------------------------------------------------- +-- BackEnd combinators +-------------------------------------------------------------- + +hscWriteIface :: (ModIface, Bool, ModDetails, a) -> Comp (ModIface, ModDetails, a) +hscWriteIface (iface, no_change, details, a) + = do mod_summary <- gets compModSummary + liftIO $ do + unless no_change + $ writeIfaceFile (ms_location mod_summary) iface + return (iface, details, a) + +hscIgnoreIface :: (ModIface, Bool, ModDetails, a) -> Comp (ModIface, ModDetails, a) +hscIgnoreIface (iface, no_change, details, a) + = return (iface, details, a) + +-- Don't output any code. +hscNothing :: (ModIface, ModDetails, a) -> Comp (HscStatus, ModIface, ModDetails) +hscNothing (iface, details, a) + = return (HscRecomp False, iface, details) + +-- Generate code and return both the new ModIface and the ModDetails. +hscBatch :: (ModIface, ModDetails, CgGuts) -> Comp (HscStatus, ModIface, ModDetails) +hscBatch (iface, details, cgguts) + = do hasStub <- hscCompile cgguts + return (HscRecomp hasStub, iface, details) + +-- Here we don't need the ModIface and ModDetails anymore. +hscOneShot :: (ModIface, ModDetails, CgGuts) -> Comp HscStatus +hscOneShot (_, _, cgguts) + = do hasStub <- hscCompile cgguts + return (HscRecomp hasStub) + +-- Compile to hard-code. +hscCompile :: CgGuts -> Comp Bool +hscCompile cgguts + = do hsc_env <- gets compHscEnv + mod_summary <- gets compModSummary + liftIO $ do + let CgGuts{ -- This is the last use of the ModGuts in a compilation. + -- From now on, we just use the bits we need. + cg_module = this_mod, + cg_binds = core_binds, + cg_tycons = tycons, + cg_dir_imps = dir_imps, + cg_foreign = foreign_stubs, + cg_home_mods = home_mods, + cg_dep_pkgs = dependencies } = cgguts + dflags = hsc_dflags hsc_env + location = ms_location mod_summary + data_tycons = filter isDataTyCon tycons + -- cg_tycons includes newtypes, for the benefit of External Core, + -- but we don't generate any code for newtypes + + ------------------- + -- PREPARE FOR CODE GENERATION + -- Do saturation and convert to A-normal form + prepd_binds <- {-# SCC "CorePrep" #-} + corePrepPgm dflags core_binds data_tycons ; + ----------------- Convert to STG ------------------ + (stg_binds, cost_centre_info) + <- {-# SCC "CoreToStg" #-} + myCoreToStg dflags home_mods this_mod prepd_binds + ------------------ Code generation ------------------ + abstractC <- {-# SCC "CodeGen" #-} + codeGen dflags home_mods this_mod data_tycons + foreign_stubs dir_imps cost_centre_info + stg_binds + ------------------ Code output ----------------------- + (stub_h_exists,stub_c_exists) + <- codeOutput dflags this_mod location foreign_stubs + dependencies abstractC + return stub_c_exists + +hscConst :: b -> a -> Comp b +hscConst b a = return b + +hscInteractive :: (ModIface, ModDetails, CgGuts) + -> Comp (InteractiveStatus, ModIface, ModDetails) +hscInteractive (iface, details, cgguts) +#ifdef GHCI + = do hsc_env <- gets compHscEnv + mod_summary <- gets compModSummary + liftIO $ do + let CgGuts{ -- This is the last use of the ModGuts in a compilation. + -- From now on, we just use the bits we need. 
+ cg_module = this_mod, + cg_binds = core_binds, + cg_tycons = tycons, + cg_foreign = foreign_stubs } = cgguts + dflags = hsc_dflags hsc_env + location = ms_location mod_summary + data_tycons = filter isDataTyCon tycons + -- cg_tycons includes newtypes, for the benefit of External Core, + -- but we don't generate any code for newtypes + + ------------------- + -- PREPARE FOR CODE GENERATION + -- Do saturation and convert to A-normal form + prepd_binds <- {-# SCC "CorePrep" #-} + corePrepPgm dflags core_binds data_tycons ; + ----------------- Generate byte code ------------------ + comp_bc <- byteCodeGen dflags prepd_binds data_tycons + ------------------ Create f-x-dynamic C-side stuff --- + (istub_h_exists, istub_c_exists) + <- outputForeignStubs dflags this_mod location foreign_stubs + return (InteractiveRecomp istub_c_exists comp_bc, iface, details) +#else + = panic "GHC not compiled with interpreter" +#endif + +------------------------------ + +hscFileCheck :: HscEnv -> ModSummary -> IO (Maybe HscChecked) +hscFileCheck hsc_env mod_summary = do { + ------------------- + -- PARSE + ------------------- + ; let dflags = hsc_dflags hsc_env + hspp_file = ms_hspp_file mod_summary + hspp_buf = ms_hspp_buf mod_summary + + ; maybe_parsed <- myParseModule dflags hspp_file hspp_buf + + ; case maybe_parsed of { + Left err -> do { printBagOfErrors dflags (unitBag err) + ; return Nothing } ; + Right rdr_module -> do { + + ------------------- + -- RENAME and TYPECHECK + ------------------- + (tc_msgs, maybe_tc_result) + <- _scc_ "Typecheck-Rename" + tcRnModule hsc_env (ms_hsc_src mod_summary) + True{-save renamed syntax-} + rdr_module + + ; printErrorsAndWarnings dflags tc_msgs + ; case maybe_tc_result of { + Nothing -> return (Just (HscChecked rdr_module Nothing Nothing)); + Just tc_result -> do + let md = ModDetails { + md_types = tcg_type_env tc_result, + md_exports = tcg_exports tc_result, + md_insts = tcg_insts tc_result, + md_rules = [panic "no rules"] } + -- Rules are CoreRules, not the + -- RuleDecls we get out of the typechecker + rnInfo = do decl <- tcg_rn_decls tc_result + imports <- tcg_rn_imports tc_result + let exports = tcg_rn_exports tc_result + return (decl,imports,exports) + return (Just (HscChecked rdr_module + rnInfo + (Just (tcg_binds tc_result, + tcg_rdr_env tc_result, + md)))) + }}}} + + +hscCmmFile :: DynFlags -> FilePath -> IO Bool +hscCmmFile dflags filename = do + maybe_cmm <- parseCmmFile dflags (mkHomeModules []) filename + case maybe_cmm of + Nothing -> return False + Just cmm -> do + codeOutput dflags no_mod no_loc NoStubs [] [cmm] + return True + where + no_mod = panic "hscCmmFile: no_mod" + no_loc = ModLocation{ ml_hs_file = Just filename, + ml_hi_file = panic "hscCmmFile: no hi file", + ml_obj_file = panic "hscCmmFile: no obj file" } + + +myParseModule dflags src_filename maybe_src_buf + = -------------------------- Parser ---------------- + showPass dflags "Parser" >> + {-# SCC "Parser" #-} do + + -- sometimes we already have the buffer in memory, perhaps + -- because we needed to parse the imports out of it, or get the + -- module name. 
+ buf <- case maybe_src_buf of + Just b -> return b + Nothing -> hGetStringBuffer src_filename + + let loc = mkSrcLoc (mkFastString src_filename) 1 0 + + case unP parseModule (mkPState buf loc dflags) of { + + PFailed span err -> return (Left (mkPlainErrMsg span err)); + + POk _ rdr_module -> do { + + dumpIfSet_dyn dflags Opt_D_dump_parsed "Parser" (ppr rdr_module) ; + + dumpIfSet_dyn dflags Opt_D_source_stats "Source Statistics" + (ppSourceStats False rdr_module) ; + + return (Right rdr_module) + -- ToDo: free the string buffer later. + }} + + +myCoreToStg dflags home_mods this_mod prepd_binds + = do + stg_binds <- {-# SCC "Core2Stg" #-} + coreToStg home_mods prepd_binds + + (stg_binds2, cost_centre_info) <- {-# SCC "Stg2Stg" #-} + stg2stg dflags home_mods this_mod stg_binds + + return (stg_binds2, cost_centre_info) +\end{code} + + +%************************************************************************ +%* * +\subsection{Compiling a do-statement} +%* * +%************************************************************************ + +When the UnlinkedBCOExpr is linked you get an HValue of type + IO [HValue] +When you run it you get a list of HValues that should be +the same length as the list of names; add them to the ClosureEnv. + +A naked expression returns a singleton Name [it]. + + What you type The IO [HValue] that hscStmt returns + ------------- ------------------------------------ + let pat = expr ==> let pat = expr in return [coerce HVal x, coerce HVal y, ...] + bindings: [x,y,...] + + pat <- expr ==> expr >>= \ pat -> return [coerce HVal x, coerce HVal y, ...] + bindings: [x,y,...] + + expr (of IO type) ==> expr >>= \ v -> return [v] + [NB: result not printed] bindings: [it] + + + expr (of non-IO type, + result showable) ==> let v = expr in print v >> return [v] + bindings: [it] + + expr (of non-IO type, + result not showable) ==> error + +\begin{code} +#ifdef GHCI +hscStmt -- Compile a stmt all the way to an HValue, but don't run it + :: HscEnv + -> String -- The statement + -> IO (Maybe (HscEnv, [Name], HValue)) + +hscStmt hsc_env stmt + = do { maybe_stmt <- hscParseStmt (hsc_dflags hsc_env) stmt + ; case maybe_stmt of { + Nothing -> return Nothing ; -- Parse error + Just Nothing -> return Nothing ; -- Empty line + Just (Just parsed_stmt) -> do { -- The real stuff + + -- Rename and typecheck it + let icontext = hsc_IC hsc_env + ; maybe_tc_result <- tcRnStmt hsc_env icontext parsed_stmt + + ; case maybe_tc_result of { + Nothing -> return Nothing ; + Just (new_ic, bound_names, tc_expr) -> do { + + -- Then desugar, code gen, and link it + ; hval <- compileExpr hsc_env iNTERACTIVE + (ic_rn_gbl_env new_ic) + (ic_type_env new_ic) + tc_expr + + ; return (Just (hsc_env{ hsc_IC=new_ic }, bound_names, hval)) + }}}}} + +hscTcExpr -- Typecheck an expression (but don't run it) + :: HscEnv + -> String -- The expression + -> IO (Maybe Type) + +hscTcExpr hsc_env expr + = do { maybe_stmt <- hscParseStmt (hsc_dflags hsc_env) expr + ; let icontext = hsc_IC hsc_env + ; case maybe_stmt of { + Nothing -> return Nothing ; -- Parse error + Just (Just (L _ (ExprStmt expr _ _))) + -> tcRnExpr hsc_env icontext expr ; + Just other -> do { errorMsg (hsc_dflags hsc_env) (text "not an expression:" <+> quotes (text expr)) ; + return Nothing } ; + } } + +hscKcType -- Find the kind of a type + :: HscEnv + -> String -- The type + -> IO (Maybe Kind) + +hscKcType hsc_env str + = do { maybe_type <- hscParseType (hsc_dflags hsc_env) str + ; let icontext = hsc_IC hsc_env + ; case maybe_type of { + Just ty -> tcRnType 
hsc_env icontext ty ; + Just other -> do { errorMsg (hsc_dflags hsc_env) (text "not an type:" <+> quotes (text str)) ; + return Nothing } ; + Nothing -> return Nothing } } +#endif +\end{code} + +\begin{code} +#ifdef GHCI +hscParseStmt :: DynFlags -> String -> IO (Maybe (Maybe (LStmt RdrName))) +hscParseStmt = hscParseThing parseStmt + +hscParseType :: DynFlags -> String -> IO (Maybe (LHsType RdrName)) +hscParseType = hscParseThing parseType +#endif + +hscParseIdentifier :: DynFlags -> String -> IO (Maybe (Located RdrName)) +hscParseIdentifier = hscParseThing parseIdentifier + +hscParseThing :: Outputable thing + => Lexer.P thing + -> DynFlags -> String + -> IO (Maybe thing) + -- Nothing => Parse error (message already printed) + -- Just x => success +hscParseThing parser dflags str + = showPass dflags "Parser" >> + {-# SCC "Parser" #-} do + + buf <- stringToStringBuffer str + + let loc = mkSrcLoc FSLIT("<interactive>") 1 0 + + case unP parser (mkPState buf loc dflags) of { + + PFailed span err -> do { printError span err; + return Nothing }; + + POk _ thing -> do { + + --ToDo: can't free the string buffer until we've finished this + -- compilation sweep and all the identifiers have gone away. + dumpIfSet_dyn dflags Opt_D_dump_parsed "Parser" (ppr thing); + return (Just thing) + }} +\end{code} + +%************************************************************************ +%* * + Desugar, simplify, convert to bytecode, and link an expression +%* * +%************************************************************************ + +\begin{code} +#ifdef GHCI +compileExpr :: HscEnv + -> Module -> GlobalRdrEnv -> TypeEnv + -> LHsExpr Id + -> IO HValue + +compileExpr hsc_env this_mod rdr_env type_env tc_expr + = do { let { dflags = hsc_dflags hsc_env ; + lint_on = dopt Opt_DoCoreLinting dflags } + + -- Desugar it + ; ds_expr <- deSugarExpr hsc_env this_mod rdr_env type_env tc_expr + + -- Flatten it + ; flat_expr <- flattenExpr hsc_env ds_expr + + -- Simplify it + ; simpl_expr <- simplifyExpr dflags flat_expr + + -- Tidy it (temporary, until coreSat does cloning) + ; let tidy_expr = tidyExpr emptyTidyEnv simpl_expr + + -- Prepare for codegen + ; prepd_expr <- corePrepExpr dflags tidy_expr + + -- Lint if necessary + -- ToDo: improve SrcLoc + ; if lint_on then + case lintUnfolding noSrcLoc [] prepd_expr of + Just err -> pprPanic "compileExpr" err + Nothing -> return () + else + return () + + -- Convert to BCOs + ; bcos <- coreExprToBCOs dflags prepd_expr + + -- link it + ; hval <- linkExpr hsc_env bcos + + ; return hval + } +#endif +\end{code} + + +%************************************************************************ +%* * + Statistics on reading interfaces +%* * +%************************************************************************ + +\begin{code} +dumpIfaceStats :: HscEnv -> IO () +dumpIfaceStats hsc_env + = do { eps <- readIORef (hsc_EPS hsc_env) + ; dumpIfSet (dump_if_trace || dump_rn_stats) + "Interface statistics" + (ifaceStats eps) } + where + dflags = hsc_dflags hsc_env + dump_rn_stats = dopt Opt_D_dump_rn_stats dflags + dump_if_trace = dopt Opt_D_dump_if_trace dflags +\end{code} + +%************************************************************************ +%* * + Progress Messages: Module i of n +%* * +%************************************************************************ + +\begin{code} +showModuleIndex Nothing = "" +showModuleIndex (Just (i,n)) = "[" ++ padded ++ " of " ++ n_str ++ "] " + where + n_str = show n + i_str = show i + padded = replicate (length n_str - length i_str) ' ' ++ 
i_str +\end{code} + diff --git a/compiler/main/HscStats.lhs b/compiler/main/HscStats.lhs new file mode 100644 index 0000000000..750744af44 --- /dev/null +++ b/compiler/main/HscStats.lhs @@ -0,0 +1,160 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-1998 +% +\section[GHC_Stats]{Statistics for per-module compilations} + +\begin{code} +module HscStats ( ppSourceStats ) where + +#include "HsVersions.h" + +import HsSyn +import Outputable +import SrcLoc ( unLoc, Located(..) ) +import Char ( isSpace ) +import Bag ( bagToList ) +import Util ( count ) +\end{code} + +%************************************************************************ +%* * +\subsection{Statistics} +%* * +%************************************************************************ + +\begin{code} +ppSourceStats short (L _ (HsModule _ exports imports ldecls _)) + = (if short then hcat else vcat) + (map pp_val + [("ExportAll ", export_all), -- 1 if no export list + ("ExportDecls ", export_ds), + ("ExportModules ", export_ms), + ("Imports ", import_no), + (" ImpQual ", import_qual), + (" ImpAs ", import_as), + (" ImpAll ", import_all), + (" ImpPartial ", import_partial), + (" ImpHiding ", import_hiding), + ("FixityDecls ", fixity_sigs), + ("DefaultDecls ", default_ds), + ("TypeDecls ", type_ds), + ("DataDecls ", data_ds), + ("NewTypeDecls ", newt_ds), + ("DataConstrs ", data_constrs), + ("DataDerivings ", data_derivs), + ("ClassDecls ", class_ds), + ("ClassMethods ", class_method_ds), + ("DefaultMethods ", default_method_ds), + ("InstDecls ", inst_ds), + ("InstMethods ", inst_method_ds), + ("TypeSigs ", bind_tys), + ("ValBinds ", val_bind_ds), + ("FunBinds ", fn_bind_ds), + ("InlineMeths ", method_inlines), + ("InlineBinds ", bind_inlines), +-- ("SpecialisedData ", data_specs), +-- ("SpecialisedInsts ", inst_specs), + ("SpecialisedMeths ", method_specs), + ("SpecialisedBinds ", bind_specs) + ]) + where + decls = map unLoc ldecls + + pp_val (str, 0) = empty + pp_val (str, n) + | not short = hcat [text str, int n] + | otherwise = hcat [text (trim str), equals, int n, semi] + + trim ls = takeWhile (not.isSpace) (dropWhile isSpace ls) + + (fixity_sigs, bind_tys, bind_specs, bind_inlines) + = count_sigs [d | SigD d <- decls] + -- NB: this omits fixity decls on local bindings and + -- in class decls. 
ToDo + + tycl_decls = [d | TyClD d <- decls] + (class_ds, type_ds, data_ds, newt_ds) = countTyClDecls tycl_decls + + inst_decls = [d | InstD d <- decls] + inst_ds = length inst_decls + default_ds = count (\ x -> case x of { DefD{} -> True; _ -> False}) decls + val_decls = [d | ValD d <- decls] + + real_exports = case exports of { Nothing -> []; Just es -> es } + n_exports = length real_exports + export_ms = count (\ e -> case unLoc e of { IEModuleContents{} -> True;_ -> False}) + real_exports + export_ds = n_exports - export_ms + export_all = case exports of { Nothing -> 1; other -> 0 } + + (val_bind_ds, fn_bind_ds) + = foldr add2 (0,0) (map count_bind val_decls) + + (import_no, import_qual, import_as, import_all, import_partial, import_hiding) + = foldr add6 (0,0,0,0,0,0) (map import_info imports) + (data_constrs, data_derivs) + = foldr add2 (0,0) (map data_info tycl_decls) + (class_method_ds, default_method_ds) + = foldr add2 (0,0) (map class_info tycl_decls) + (inst_method_ds, method_specs, method_inlines) + = foldr add3 (0,0,0) (map inst_info inst_decls) + + count_bind (PatBind { pat_lhs = L _ (VarPat n) }) = (1,0) + count_bind (PatBind {}) = (0,1) + count_bind (FunBind {}) = (0,1) + + count_sigs sigs = foldr add4 (0,0,0,0) (map sig_info sigs) + + sig_info (FixSig _) = (1,0,0,0) + sig_info (TypeSig _ _) = (0,1,0,0) + sig_info (SpecSig _ _ _) = (0,0,1,0) + sig_info (InlineSig _ _) = (0,0,0,1) + sig_info _ = (0,0,0,0) + + import_info (L _ (ImportDecl _ _ qual as spec)) + = add6 (1, qual_info qual, as_info as, 0,0,0) (spec_info spec) + qual_info False = 0 + qual_info True = 1 + as_info Nothing = 0 + as_info (Just _) = 1 + spec_info Nothing = (0,0,0,1,0,0) + spec_info (Just (False, _)) = (0,0,0,0,1,0) + spec_info (Just (True, _)) = (0,0,0,0,0,1) + + data_info (TyData {tcdCons = cs, tcdDerivs = derivs}) + = (length cs, case derivs of Nothing -> 0 + Just ds -> length ds) + data_info other = (0,0) + + class_info decl@(ClassDecl {}) + = case count_sigs (map unLoc (tcdSigs decl)) of + (_,classops,_,_) -> + (classops, addpr (foldr add2 (0,0) (map (count_bind.unLoc) (bagToList (tcdMeths decl))))) + class_info other = (0,0) + + inst_info (InstDecl _ inst_meths inst_sigs) + = case count_sigs (map unLoc inst_sigs) of + (_,_,ss,is) -> + (addpr (foldr add2 (0,0) (map (count_bind.unLoc) (bagToList inst_meths))), ss, is) + + addpr :: (Int,Int) -> Int + add2 :: (Int,Int) -> (Int,Int) -> (Int, Int) + add3 :: (Int,Int,Int) -> (Int,Int,Int) -> (Int, Int, Int) + add4 :: (Int,Int,Int,Int) -> (Int,Int,Int,Int) -> (Int, Int, Int, Int) + add6 :: (Int,Int,Int,Int,Int,Int) -> (Int,Int,Int,Int,Int,Int) -> (Int, Int, Int, Int, Int, Int) + + addpr (x,y) = x+y + add2 (x1,x2) (y1,y2) = (x1+y1,x2+y2) + add3 (x1,x2,x3) (y1,y2,y3) = (x1+y1,x2+y2,x3+y3) + add4 (x1,x2,x3,x4) (y1,y2,y3,y4) = (x1+y1,x2+y2,x3+y3,x4+y4) + add6 (x1,x2,x3,x4,x5,x6) (y1,y2,y3,y4,y5,y6) = (x1+y1,x2+y2,x3+y3,x4+y4,x5+y5,x6+y6) +\end{code} + + + + + + + + + diff --git a/compiler/main/HscTypes.lhs b/compiler/main/HscTypes.lhs new file mode 100644 index 0000000000..ee5438b319 --- /dev/null +++ b/compiler/main/HscTypes.lhs @@ -0,0 +1,1083 @@ + +% (c) The University of Glasgow, 2000 +% +\section[HscTypes]{Types for the per-module compiler} + +\begin{code} +module HscTypes ( + -- * Sessions and compilation state + Session(..), HscEnv(..), hscEPS, + FinderCache, FinderCacheEntry, + Target(..), TargetId(..), pprTarget, pprTargetId, + ModuleGraph, emptyMG, + + ModDetails(..), emptyModDetails, + ModGuts(..), CgGuts(..), ModImports(..), ForeignStubs(..), + + 
ModSummary(..), showModMsg, isBootSummary, + msHsFilePath, msHiFilePath, msObjFilePath, + + HscSource(..), isHsBoot, hscSourceString, -- Re-exported from DriverPhases + + HomePackageTable, HomeModInfo(..), emptyHomePackageTable, + hptInstances, hptRules, + + ExternalPackageState(..), EpsStats(..), addEpsInStats, + PackageTypeEnv, PackageIfaceTable, emptyPackageIfaceTable, + lookupIface, lookupIfaceByModule, emptyModIface, + + InteractiveContext(..), emptyInteractiveContext, + icPrintUnqual, unQualInScope, + + ModIface(..), mkIfaceDepCache, mkIfaceVerCache, mkIfaceFixCache, + emptyIfaceDepCache, + + Deprecs(..), IfaceDeprecs, + + FixityEnv, FixItem(..), lookupFixity, emptyFixityEnv, + + implicitTyThings, + + TyThing(..), tyThingClass, tyThingTyCon, tyThingDataCon, tyThingId, + TypeEnv, lookupType, mkTypeEnv, emptyTypeEnv, + extendTypeEnv, extendTypeEnvList, extendTypeEnvWithIds, lookupTypeEnv, + typeEnvElts, typeEnvClasses, typeEnvTyCons, typeEnvIds, + + WhetherHasOrphans, IsBootInterface, Usage(..), + Dependencies(..), noDependencies, + NameCache(..), OrigNameCache, OrigIParamCache, + Avails, availsToNameSet, availName, availNames, + GenAvailInfo(..), AvailInfo, RdrAvailInfo, + IfaceExport, + + Deprecations, DeprecTxt, lookupDeprec, plusDeprecs, + + PackageInstEnv, PackageRuleBase, + + -- Linker stuff + Linkable(..), isObjectLinkable, + Unlinked(..), CompiledByteCode, + isObject, nameOfObject, isInterpretable, byteCodeOfObject + ) where + +#include "HsVersions.h" + +#ifdef GHCI +import ByteCodeAsm ( CompiledByteCode ) +#endif + +import RdrName ( GlobalRdrEnv, emptyGlobalRdrEnv, + LocalRdrEnv, emptyLocalRdrEnv, + GlobalRdrElt(..), mkRdrUnqual, lookupGRE_RdrName ) +import Name ( Name, NamedThing, getName, nameOccName, nameModule ) +import NameEnv +import NameSet +import OccName ( OccName, OccEnv, lookupOccEnv, mkOccEnv, emptyOccEnv, + extendOccEnv ) +import Module +import InstEnv ( InstEnv, Instance ) +import Rules ( RuleBase ) +import CoreSyn ( CoreBind ) +import Id ( Id ) +import Type ( TyThing(..) ) + +import Class ( Class, classSelIds, classTyCon ) +import TyCon ( TyCon, tyConSelIds, tyConDataCons ) +import DataCon ( dataConImplicitIds ) +import PrelNames ( gHC_PRIM ) +import Packages ( PackageIdH, PackageId, PackageConfig, HomeModules ) +import DynFlags ( DynFlags(..), isOneShot, HscTarget (..) ) +import DriverPhases ( HscSource(..), isHsBoot, hscSourceString, Phase ) +import BasicTypes ( Version, initialVersion, IPName, + Fixity, defaultFixity, DeprecTxt ) + +import IfaceSyn ( IfaceInst, IfaceRule, IfaceDecl(ifName) ) + +import FiniteMap ( FiniteMap ) +import CoreSyn ( CoreRule ) +import Maybes ( orElse, expectJust, expectJust ) +import Outputable +import SrcLoc ( SrcSpan, Located ) +import UniqSupply ( UniqSupply ) +import FastString ( FastString ) + +import DATA_IOREF ( IORef, readIORef ) +import StringBuffer ( StringBuffer ) +import Time ( ClockTime ) +\end{code} + + +%************************************************************************ +%* * +\subsection{Compilation environment} +%* * +%************************************************************************ + + +\begin{code} +-- | The Session is a handle to the complete state of a compilation +-- session. A compilation session consists of a set of modules +-- constituting the current program or library, the context for +-- interactive evaluation, and various caches. +newtype Session = Session (IORef HscEnv) +\end{code} + +HscEnv is like Session, except that some of the fields are immutable. 
+An HscEnv is used to compile a single module from plain Haskell source +code (after preprocessing) to either C, assembly or C--. Things like +the module graph don't change during a single compilation. + +Historical note: "hsc" used to be the name of the compiler binary, +when there was a separate driver and compiler. To compile a single +module, the driver would invoke hsc on the source code... so nowadays +we think of hsc as the layer of the compiler that deals with compiling +a single module. + +\begin{code} +data HscEnv + = HscEnv { + hsc_dflags :: DynFlags, + -- The dynamic flag settings + + hsc_targets :: [Target], + -- The targets (or roots) of the current session + + hsc_mod_graph :: ModuleGraph, + -- The module graph of the current session + + hsc_IC :: InteractiveContext, + -- The context for evaluating interactive statements + + hsc_HPT :: HomePackageTable, + -- The home package table describes already-compiled + -- home-packge modules, *excluding* the module we + -- are compiling right now. + -- (In one-shot mode the current module is the only + -- home-package module, so hsc_HPT is empty. All other + -- modules count as "external-package" modules. + -- However, even in GHCi mode, hi-boot interfaces are + -- demand-loadeded into the external-package table.) + -- + -- hsc_HPT is not mutable because we only demand-load + -- external packages; the home package is eagerly + -- loaded, module by module, by the compilation manager. + -- + -- The HPT may contain modules compiled earlier by --make + -- but not actually below the current module in the dependency + -- graph. (This changes a previous invariant: changed Jan 05.) + + hsc_EPS :: {-# UNPACK #-} !(IORef ExternalPackageState), + hsc_NC :: {-# UNPACK #-} !(IORef NameCache), + -- These are side-effected by compiling to reflect + -- sucking in interface files. They cache the state of + -- external interface files, in effect. + + hsc_FC :: {-# UNPACK #-} !(IORef FinderCache) + -- The finder's cache. This caches the location of modules, + -- so we don't have to search the filesystem multiple times. + } + +hscEPS :: HscEnv -> IO ExternalPackageState +hscEPS hsc_env = readIORef (hsc_EPS hsc_env) + +-- | A compilation target. +-- +-- A target may be supplied with the actual text of the +-- module. If so, use this instead of the file contents (this +-- is for use in an IDE where the file hasn't been saved by +-- the user yet). +data Target = Target TargetId (Maybe (StringBuffer,ClockTime)) + +data TargetId + = TargetModule Module + -- ^ A module name: search for the file + | TargetFile FilePath (Maybe Phase) + -- ^ A filename: preprocess & parse it to find the module name. + -- If specified, the Phase indicates how to compile this file + -- (which phase to start from). Nothing indicates the starting phase + -- should be determined from the suffix of the filename. 
+ deriving Eq + +pprTarget :: Target -> SDoc +pprTarget (Target id _) = pprTargetId id + +pprTargetId (TargetModule m) = ppr m +pprTargetId (TargetFile f _) = text f + +type FinderCache = ModuleEnv FinderCacheEntry +type FinderCacheEntry = (ModLocation, Maybe (PackageConfig,Bool)) + -- The finder's cache (see module Finder) + +type HomePackageTable = ModuleEnv HomeModInfo + -- Domain = modules in the home package +type PackageIfaceTable = ModuleEnv ModIface + -- Domain = modules in the imported packages + +emptyHomePackageTable = emptyModuleEnv +emptyPackageIfaceTable = emptyModuleEnv + +data HomeModInfo + = HomeModInfo { hm_iface :: !ModIface, + hm_details :: !ModDetails, + hm_linkable :: !(Maybe Linkable) } + -- hm_linkable might be Nothing if: + -- a) this is an .hs-boot module + -- b) temporarily during compilation if we pruned away + -- the old linkable because it was out of date. + -- after a complete compilation (GHC.load), all hm_linkable + -- fields in the HPT will be Just. + -- + -- When re-linking a module (hscNoRecomp), we construct + -- the HomModInfo by building a new ModDetails from the + -- old ModIface (only). +\end{code} + +Simple lookups in the symbol table. + +\begin{code} +lookupIface :: HomePackageTable -> PackageIfaceTable -> Module -> Maybe ModIface +-- We often have two IfaceTables, and want to do a lookup +lookupIface hpt pit mod + = case lookupModuleEnv hpt mod of + Just mod_info -> Just (hm_iface mod_info) + Nothing -> lookupModuleEnv pit mod + +lookupIfaceByModule :: HomePackageTable -> PackageIfaceTable -> Module -> Maybe ModIface +-- We often have two IfaceTables, and want to do a lookup +lookupIfaceByModule hpt pit mod + = case lookupModuleEnv hpt mod of + Just mod_info -> Just (hm_iface mod_info) + Nothing -> lookupModuleEnv pit mod +\end{code} + + +\begin{code} +hptInstances :: HscEnv -> (Module -> Bool) -> [Instance] +-- Find all the instance declarations that are in modules imported +-- by this one, directly or indirectly, and are in the Home Package Table +-- This ensures that we don't see instances from modules --make compiled +-- before this one, but which are not below this one +hptInstances hsc_env want_this_module + = [ ispec + | mod_info <- moduleEnvElts (hsc_HPT hsc_env) + , want_this_module (mi_module (hm_iface mod_info)) + , ispec <- md_insts (hm_details mod_info) ] + +hptRules :: HscEnv -> [(Module, IsBootInterface)] -> [CoreRule] +-- Get rules from modules "below" this one (in the dependency sense) +-- C.f Inst.hptInstances +hptRules hsc_env deps + | isOneShot (ghcMode (hsc_dflags hsc_env)) = [] + | otherwise + = let + hpt = hsc_HPT hsc_env + in + [ rule + | -- Find each non-hi-boot module below me + (mod, False) <- deps + + -- unsavoury: when compiling the base package with --make, we + -- sometimes try to look up RULES for GHC.Prim. GHC.Prim won't + -- be in the HPT, because we never compile it; it's in the EPT + -- instead. ToDo: clean up, and remove this slightly bogus + -- filter: + , mod /= gHC_PRIM + + -- Look it up in the HPT + , let mod_info = case lookupModuleEnv hpt mod of + Nothing -> pprPanic "hptRules" (ppr mod <+> ppr deps) + Just x -> x + + -- And get its dfuns + , rule <- md_rules (hm_details mod_info) ] +\end{code} + + +%************************************************************************ +%* * +\subsection{Symbol tables and Module details} +%* * +%************************************************************************ + +A @ModIface@ plus a @ModDetails@ summarises everything we know +about a compiled module. 
The @ModIface@ is the stuff *before* linking, +and can be written out to an interface file. (The @ModDetails@ is after +linking; it is the "linked" form of the mi_decls field.) + +When we *read* an interface file, we also construct a @ModIface@ from it, +except that the mi_decls part is empty; when reading we consolidate +the declarations into a single indexed map in the @PersistentRenamerState@. + +\begin{code} +data ModIface + = ModIface { + mi_package :: !PackageIdH, -- Which package the module comes from + mi_module :: !Module, + mi_mod_vers :: !Version, -- Module version: changes when anything changes + + mi_orphan :: !WhetherHasOrphans, -- Whether this module has orphans + mi_boot :: !IsBootInterface, -- Read from an hi-boot file? + + mi_deps :: Dependencies, + -- This is consulted for directly-imported modules, + -- but not for anything else (hence lazy) + + -- Usages; kept sorted so that it's easy to decide + -- whether to write a new iface file (changing usages + -- doesn't affect the version of this module) + mi_usages :: [Usage], + -- NOT STRICT! we read this field lazily from the interface file + -- It is *only* consulted by the recompilation checker + + -- Exports + -- Kept sorted by (mod,occ), to make version comparisons easier + mi_exports :: ![IfaceExport], + mi_exp_vers :: !Version, -- Version number of export list + + -- Fixities + mi_fixities :: [(OccName,Fixity)], + -- NOT STRICT! we read this field lazily from the interface file + + -- Deprecations + mi_deprecs :: IfaceDeprecs, + -- NOT STRICT! we read this field lazily from the interface file + + -- Type, class and variable declarations + -- The version of an Id changes if its fixity or deprecations change + -- (as well as its type of course) + -- Ditto data constructors, class operations, except that + -- the version of the parent class/tycon changes + mi_decls :: [(Version,IfaceDecl)], -- Sorted + + mi_globals :: !(Maybe GlobalRdrEnv), + -- Binds all the things defined at the top level in + -- the *original source* code for this module. which + -- is NOT the same as mi_exports, nor mi_decls (which + -- may contains declarations for things not actually + -- defined by the user). Used for GHCi and for inspecting + -- the contents of modules via the GHC API only. + -- + -- (We need the source file to figure out the + -- top-level environment, if we didn't compile this module + -- from source then this field contains Nothing). + -- + -- Strictly speaking this field should live in the + -- HomeModInfo, but that leads to more plumbing. + + -- Instance declarations and rules + mi_insts :: [IfaceInst], -- Sorted + mi_rules :: [IfaceRule], -- Sorted + mi_rule_vers :: !Version, -- Version number for rules and instances combined + + -- Cached environments for easy lookup + -- These are computed (lazily) from other fields + -- and are not put into the interface file + mi_dep_fn :: Name -> Maybe DeprecTxt, -- Cached lookup for mi_deprecs + mi_fix_fn :: OccName -> Fixity, -- Cached lookup for mi_fixities + mi_ver_fn :: OccName -> Maybe Version -- Cached lookup for mi_decls + -- The Nothing in mi_ver_fn means that the thing + -- isn't in decls. 
It's useful to know that when + -- seeing if we are up to date wrt the old interface + } + +-- Should be able to construct ModDetails from mi_decls in ModIface +data ModDetails + = ModDetails { + -- The next three fields are created by the typechecker + md_exports :: NameSet, + md_types :: !TypeEnv, + md_insts :: ![Instance], -- Dfun-ids for the instances in this module + md_rules :: ![CoreRule] -- Domain may include Ids from other modules + } + +emptyModDetails = ModDetails { md_types = emptyTypeEnv, + md_exports = emptyNameSet, + md_insts = [], + md_rules = [] } + +-- A ModGuts is carried through the compiler, accumulating stuff as it goes +-- There is only one ModGuts at any time, the one for the module +-- being compiled right now. Once it is compiled, a ModIface and +-- ModDetails are extracted and the ModGuts is dicarded. + +data ModGuts + = ModGuts { + mg_module :: !Module, + mg_boot :: IsBootInterface, -- Whether it's an hs-boot module + mg_exports :: !NameSet, -- What it exports + mg_deps :: !Dependencies, -- What is below it, directly or otherwise + mg_home_mods :: !HomeModules, -- For calling isHomeModule etc. + mg_dir_imps :: ![Module], -- Directly-imported modules; used to + -- generate initialisation code + mg_usages :: ![Usage], -- Version info for what it needed + + mg_rdr_env :: !GlobalRdrEnv, -- Top-level lexical environment + mg_fix_env :: !FixityEnv, -- Fixity env, for things declared in this module + mg_deprecs :: !Deprecations, -- Deprecations declared in the module + + mg_types :: !TypeEnv, + mg_insts :: ![Instance], -- Instances + mg_rules :: ![CoreRule], -- Rules from this module + mg_binds :: ![CoreBind], -- Bindings for this module + mg_foreign :: !ForeignStubs + } + +-- The ModGuts takes on several slightly different forms: +-- +-- After simplification, the following fields change slightly: +-- mg_rules Orphan rules only (local ones now attached to binds) +-- mg_binds With rules attached + + +--------------------------------------------------------- +-- The Tidy pass forks the information about this module: +-- * one lot goes to interface file generation (ModIface) +-- and later compilations (ModDetails) +-- * the other lot goes to code generation (CgGuts) +data CgGuts + = CgGuts { + cg_module :: !Module, + + cg_tycons :: [TyCon], + -- Algebraic data types (including ones that started + -- life as classes); generate constructors and info + -- tables Includes newtypes, just for the benefit of + -- External Core + + cg_binds :: [CoreBind], + -- The tidied main bindings, including + -- previously-implicit bindings for record and class + -- selectors, and data construtor wrappers. But *not* + -- data constructor workers; reason: we we regard them + -- as part of the code-gen of tycons + + cg_dir_imps :: ![Module], + -- Directly-imported modules; used to generate + -- initialisation code + + cg_foreign :: !ForeignStubs, + cg_home_mods :: !HomeModules, -- for calling isHomeModule etc. 
+ cg_dep_pkgs :: ![PackageId] -- Used to generate #includes for C code gen + } + +----------------------------------- +data ModImports + = ModImports { + imp_direct :: ![(Module,Bool)], -- Explicitly-imported modules + -- Boolean is true if we imported the whole + -- module (apart, perhaps, from hiding some) + imp_pkg_mods :: !ModuleSet, -- Non-home-package modules on which we depend, + -- directly or indirectly + imp_home_names :: !NameSet -- Home package things on which we depend, + -- directly or indirectly + } + +----------------------------------- +data ForeignStubs = NoStubs + | ForeignStubs + SDoc -- Header file prototypes for + -- "foreign exported" functions + SDoc -- C stubs to use when calling + -- "foreign exported" functions + [FastString] -- Headers that need to be included + -- into C code generated for this module + [Id] -- Foreign-exported binders + -- we have to generate code to register these + +\end{code} + +\begin{code} +emptyModIface :: PackageIdH -> Module -> ModIface +emptyModIface pkg mod + = ModIface { mi_package = pkg, + mi_module = mod, + mi_mod_vers = initialVersion, + mi_orphan = False, + mi_boot = False, + mi_deps = noDependencies, + mi_usages = [], + mi_exports = [], + mi_exp_vers = initialVersion, + mi_fixities = [], + mi_deprecs = NoDeprecs, + mi_insts = [], + mi_rules = [], + mi_decls = [], + mi_globals = Nothing, + mi_rule_vers = initialVersion, + mi_dep_fn = emptyIfaceDepCache, + mi_fix_fn = emptyIfaceFixCache, + mi_ver_fn = emptyIfaceVerCache + } +\end{code} + + +%************************************************************************ +%* * +\subsection{The interactive context} +%* * +%************************************************************************ + +\begin{code} +data InteractiveContext + = InteractiveContext { + ic_toplev_scope :: [Module], -- Include the "top-level" scope of + -- these modules + + ic_exports :: [Module], -- Include just the exports of these + -- modules + + ic_rn_gbl_env :: GlobalRdrEnv, -- The cached GlobalRdrEnv, built from + -- ic_toplev_scope and ic_exports + + ic_rn_local_env :: LocalRdrEnv, -- Lexical context for variables bound + -- during interaction + + ic_type_env :: TypeEnv -- Ditto for types + } + +emptyInteractiveContext + = InteractiveContext { ic_toplev_scope = [], + ic_exports = [], + ic_rn_gbl_env = emptyGlobalRdrEnv, + ic_rn_local_env = emptyLocalRdrEnv, + ic_type_env = emptyTypeEnv } + +icPrintUnqual :: InteractiveContext -> PrintUnqualified +icPrintUnqual ictxt = unQualInScope (ic_rn_gbl_env ictxt) +\end{code} + +@unQualInScope@ returns a function that takes a @Name@ and tells whether +its unqualified name is in scope. This is put as a boolean flag in +the @Name@'s provenance to guide whether or not to print the name qualified +in error messages. + +\begin{code} +unQualInScope :: GlobalRdrEnv -> PrintUnqualified +-- True if 'f' is in scope, and has only one binding, +-- and the thing it is bound to is the name we are looking for +-- (i.e. 
false if A.f and B.f are both in scope as unqualified 'f') +-- +-- [Out of date] Also checks for built-in syntax, which is always 'in scope' +unQualInScope env mod occ + = case lookupGRE_RdrName (mkRdrUnqual occ) env of + [gre] -> nameModule (gre_name gre) == mod + other -> False +\end{code} + + +%************************************************************************ +%* * + TyThing +%* * +%************************************************************************ + +\begin{code} +implicitTyThings :: TyThing -> [TyThing] +implicitTyThings (AnId id) = [] + + -- For type constructors, add the data cons (and their extras), + -- and the selectors and generic-programming Ids too + -- + -- Newtypes don't have a worker Id, so don't generate that? +implicitTyThings (ATyCon tc) = map AnId (tyConSelIds tc) ++ + concatMap (extras_plus . ADataCon) (tyConDataCons tc) + + -- For classes, add the class TyCon too (and its extras) + -- and the class selector Ids +implicitTyThings (AClass cl) = map AnId (classSelIds cl) ++ + extras_plus (ATyCon (classTyCon cl)) + + + -- For data cons add the worker and wrapper (if any) +implicitTyThings (ADataCon dc) = map AnId (dataConImplicitIds dc) + +extras_plus thing = thing : implicitTyThings thing + +extendTypeEnvWithIds :: TypeEnv -> [Id] -> TypeEnv +extendTypeEnvWithIds env ids + = extendNameEnvList env [(getName id, AnId id) | id <- ids] +\end{code} + +%************************************************************************ +%* * + TypeEnv +%* * +%************************************************************************ + +\begin{code} +type TypeEnv = NameEnv TyThing + +emptyTypeEnv :: TypeEnv +typeEnvElts :: TypeEnv -> [TyThing] +typeEnvClasses :: TypeEnv -> [Class] +typeEnvTyCons :: TypeEnv -> [TyCon] +typeEnvIds :: TypeEnv -> [Id] +lookupTypeEnv :: TypeEnv -> Name -> Maybe TyThing + +emptyTypeEnv = emptyNameEnv +typeEnvElts env = nameEnvElts env +typeEnvClasses env = [cl | AClass cl <- typeEnvElts env] +typeEnvTyCons env = [tc | ATyCon tc <- typeEnvElts env] +typeEnvIds env = [id | AnId id <- typeEnvElts env] + +mkTypeEnv :: [TyThing] -> TypeEnv +mkTypeEnv things = extendTypeEnvList emptyTypeEnv things + +lookupTypeEnv = lookupNameEnv + +-- Extend the type environment +extendTypeEnv :: TypeEnv -> TyThing -> TypeEnv +extendTypeEnv env thing = extendNameEnv env (getName thing) thing + +extendTypeEnvList :: TypeEnv -> [TyThing] -> TypeEnv +extendTypeEnvList env things = foldl extendTypeEnv env things +\end{code} + +\begin{code} +lookupType :: HomePackageTable -> PackageTypeEnv -> Name -> Maybe TyThing +lookupType hpt pte name + = case lookupModuleEnv hpt (nameModule name) of + Just details -> lookupNameEnv (md_types (hm_details details)) name + Nothing -> lookupNameEnv pte name +\end{code} + + +\begin{code} +tyThingTyCon (ATyCon tc) = tc +tyThingTyCon other = pprPanic "tyThingTyCon" (ppr other) + +tyThingClass (AClass cls) = cls +tyThingClass other = pprPanic "tyThingClass" (ppr other) + +tyThingDataCon (ADataCon dc) = dc +tyThingDataCon other = pprPanic "tyThingDataCon" (ppr other) + +tyThingId (AnId id) = id +tyThingId other = pprPanic "tyThingId" (ppr other) +\end{code} + +%************************************************************************ +%* * +\subsection{Auxiliary types} +%* * +%************************************************************************ + +These types are defined here because they are mentioned in ModDetails, +but they are mostly elaborated elsewhere + +\begin{code} +mkIfaceVerCache :: [(Version,IfaceDecl)] -> OccName -> Maybe Version 
+mkIfaceVerCache pairs + = \occ -> lookupOccEnv env occ + where + env = foldl add emptyOccEnv pairs + add env (v,d) = extendOccEnv env (ifName d) v + +emptyIfaceVerCache :: OccName -> Maybe Version +emptyIfaceVerCache occ = Nothing + +------------------ Deprecations ------------------------- +data Deprecs a + = NoDeprecs + | DeprecAll DeprecTxt -- Whole module deprecated + | DeprecSome a -- Some specific things deprecated + deriving( Eq ) + +type IfaceDeprecs = Deprecs [(OccName,DeprecTxt)] +type Deprecations = Deprecs (NameEnv (OccName,DeprecTxt)) + -- Keep the OccName so we can flatten the NameEnv to + -- get an IfaceDeprecs from a Deprecations + -- Only an OccName is needed, because a deprecation always + -- applies to things defined in the module in which the + -- deprecation appears. + +mkIfaceDepCache:: IfaceDeprecs -> Name -> Maybe DeprecTxt +mkIfaceDepCache NoDeprecs = \n -> Nothing +mkIfaceDepCache (DeprecAll t) = \n -> Just t +mkIfaceDepCache (DeprecSome pairs) = lookupOccEnv (mkOccEnv pairs) . nameOccName + +emptyIfaceDepCache :: Name -> Maybe DeprecTxt +emptyIfaceDepCache n = Nothing + +lookupDeprec :: Deprecations -> Name -> Maybe DeprecTxt +lookupDeprec NoDeprecs name = Nothing +lookupDeprec (DeprecAll txt) name = Just txt +lookupDeprec (DeprecSome env) name = case lookupNameEnv env name of + Just (_, txt) -> Just txt + Nothing -> Nothing + +plusDeprecs :: Deprecations -> Deprecations -> Deprecations +plusDeprecs d NoDeprecs = d +plusDeprecs NoDeprecs d = d +plusDeprecs d (DeprecAll t) = DeprecAll t +plusDeprecs (DeprecAll t) d = DeprecAll t +plusDeprecs (DeprecSome v1) (DeprecSome v2) = DeprecSome (v1 `plusNameEnv` v2) +\end{code} + + +\begin{code} +type Avails = [AvailInfo] +type AvailInfo = GenAvailInfo Name +type RdrAvailInfo = GenAvailInfo OccName + +data GenAvailInfo name = Avail name -- An ordinary identifier + | AvailTC name -- The name of the type or class + [name] -- The available pieces of type/class. + -- NB: If the type or class is itself + -- to be in scope, it must be in this list. 
+ -- Thus, typically: AvailTC Eq [Eq, ==, /=] + deriving( Eq ) + -- Equality used when deciding if the interface has changed + +type IfaceExport = (Module, [GenAvailInfo OccName]) + +availsToNameSet :: [AvailInfo] -> NameSet +availsToNameSet avails = foldl add emptyNameSet avails + where + add set avail = addListToNameSet set (availNames avail) + +availName :: GenAvailInfo name -> name +availName (Avail n) = n +availName (AvailTC n _) = n + +availNames :: GenAvailInfo name -> [name] +availNames (Avail n) = [n] +availNames (AvailTC n ns) = ns + +instance Outputable n => Outputable (GenAvailInfo n) where + ppr = pprAvail + +pprAvail :: Outputable n => GenAvailInfo n -> SDoc +pprAvail (AvailTC n ns) = ppr n <> case {- filter (/= n) -} ns of + [] -> empty + ns' -> braces (hsep (punctuate comma (map ppr ns'))) + +pprAvail (Avail n) = ppr n +\end{code} + +\begin{code} +mkIfaceFixCache :: [(OccName, Fixity)] -> OccName -> Fixity +mkIfaceFixCache pairs + = \n -> lookupOccEnv env n `orElse` defaultFixity + where + env = mkOccEnv pairs + +emptyIfaceFixCache :: OccName -> Fixity +emptyIfaceFixCache n = defaultFixity + +-- This fixity environment is for source code only +type FixityEnv = NameEnv FixItem + +-- We keep the OccName in the range so that we can generate an interface from it +data FixItem = FixItem OccName Fixity SrcSpan + +instance Outputable FixItem where + ppr (FixItem occ fix loc) = ppr fix <+> ppr occ <+> parens (ppr loc) + +emptyFixityEnv :: FixityEnv +emptyFixityEnv = emptyNameEnv + +lookupFixity :: FixityEnv -> Name -> Fixity +lookupFixity env n = case lookupNameEnv env n of + Just (FixItem _ fix _) -> fix + Nothing -> defaultFixity +\end{code} + + +%************************************************************************ +%* * +\subsection{WhatsImported} +%* * +%************************************************************************ + +\begin{code} +type WhetherHasOrphans = Bool + -- An "orphan" is + -- * an instance decl in a module other than the defn module for + -- one of the tycons or classes in the instance head + -- * a transformation rule in a module other than the one defining + -- the function in the head of the rule. + +type IsBootInterface = Bool + +-- Dependency info about modules and packages below this one +-- in the import hierarchy. See TcRnTypes.ImportAvails for details. +-- +-- Invariant: the dependencies of a module M never includes M +-- Invariant: the lists are unordered, with no duplicates +data Dependencies + = Deps { dep_mods :: [(Module,IsBootInterface)], -- Home-package module dependencies + dep_pkgs :: [PackageId], -- External package dependencies + dep_orphs :: [Module] } -- Orphan modules (whether home or external pkg) + deriving( Eq ) + -- Equality used only for old/new comparison in MkIface.addVersionInfo + +noDependencies :: Dependencies +noDependencies = Deps [] [] [] + +data Usage + = Usage { usg_name :: Module, -- Name of the module + usg_mod :: Version, -- Module version + usg_entities :: [(OccName,Version)], -- Sorted by occurrence name + usg_exports :: Maybe Version, -- Export-list version, if we depend on it + usg_rules :: Version -- Orphan-rules version (for non-orphan + -- modules this will always be initialVersion) + } deriving( Eq ) + -- This type doesn't let you say "I imported f but none of the rules in + -- the module". If you use anything in the module you get its rule version + -- So if the rules change, you'll recompile, even if you don't use them. 
+ -- This is easy to implement, and it's safer: you might not have used the rules last + -- time round, but if someone has added a new rule you might need it this time + + -- The export list field is (Just v) if we depend on the export list: + -- i.e. we imported the module directly, whether or not we + -- enumerated the things we imported, or just imported everything + -- We need to recompile if M's exports change, because + -- if the import was import M, we might now have a name clash in the + -- importing module. + -- if the import was import M(x) M might no longer export x + -- The only way we don't depend on the export list is if we have + -- import M() + -- And of course, for modules that aren't imported directly we don't + -- depend on their export lists +\end{code} + + +%************************************************************************ +%* * + The External Package State +%* * +%************************************************************************ + +\begin{code} +type PackageTypeEnv = TypeEnv +type PackageRuleBase = RuleBase +type PackageInstEnv = InstEnv + +data ExternalPackageState + = EPS { + eps_is_boot :: !(ModuleEnv (Module, IsBootInterface)), + -- In OneShot mode (only), home-package modules accumulate in the + -- external package state, and are sucked in lazily. + -- For these home-pkg modules (only) we need to record which are + -- boot modules. We set this field after loading all the + -- explicitly-imported interfaces, but before doing anything else + -- + -- The Module part is not necessary, but it's useful for + -- debug prints, and it's convenient because this field comes + -- direct from TcRnTypes.ImportAvails.imp_dep_mods + + eps_PIT :: !PackageIfaceTable, + -- The ModuleIFaces for modules in external packages + -- whose interfaces we have opened + -- The declarations in these interface files are held in + -- eps_decls, eps_inst_env, eps_rules (below), not in the + -- mi_decls fields of the iPIT. + -- What _is_ in the iPIT is: + -- * The Module + -- * Version info + -- * Its exports + -- * Fixities + -- * Deprecations + + eps_PTE :: !PackageTypeEnv, -- Domain = external-package modules + + eps_inst_env :: !PackageInstEnv, -- The total InstEnv accumulated from + -- all the external-package modules + eps_rule_base :: !PackageRuleBase, -- Ditto RuleEnv + + eps_stats :: !EpsStats + } + +-- "In" means read from iface files +-- "Out" means actually sucked in and type-checked +data EpsStats = EpsStats { n_ifaces_in + , n_decls_in, n_decls_out + , n_rules_in, n_rules_out + , n_insts_in, n_insts_out :: !Int } + +addEpsInStats :: EpsStats -> Int -> Int -> Int -> EpsStats +-- Add stats for one newly-read interface +addEpsInStats stats n_decls n_insts n_rules + = stats { n_ifaces_in = n_ifaces_in stats + 1 + , n_decls_in = n_decls_in stats + n_decls + , n_insts_in = n_insts_in stats + n_insts + , n_rules_in = n_rules_in stats + n_rules } +\end{code} + +The NameCache makes sure that there is just one Unique assigned for +each original name; i.e. (module-name, occ-name) pair. The Name is +always stored as a Global, and has the SrcLoc of its binding location. +Actually that's not quite right. When we first encounter the original +name, we might not be at its binding site (e.g. we are reading an +interface file); so we give it 'noSrcLoc' then. Later, when we find +its binding site, we fix it up. 
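
As a rough illustration (not part of this module; the helper name is made up
for exposition), consulting the cache for a previously-seen original name
might look like this, using only the ModuleEnv and OccEnv lookups already
used elsewhere in this file:

\begin{code}
-- Hypothetical helper: look a (module, occurrence-name) pair up in the
-- two-level OrigNameCache defined below, returning the cached Name if we
-- have already allocated one for it.
lookupOrigNameCache_maybe :: OrigNameCache -> Module -> OccName -> Maybe Name
lookupOrigNameCache_maybe cache mod occ
  = case lookupModuleEnv cache mod of
      Nothing      -> Nothing                   -- module never seen before
      Just occ_env -> lookupOccEnv occ_env occ  -- cached Name, if any
\end{code}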
+ +\begin{code} +data NameCache + = NameCache { nsUniqs :: UniqSupply, + -- Supply of uniques + nsNames :: OrigNameCache, + -- Ensures that one original name gets one unique + nsIPs :: OrigIParamCache + -- Ensures that one implicit parameter name gets one unique + } + +type OrigNameCache = ModuleEnv (OccEnv Name) +type OrigIParamCache = FiniteMap (IPName OccName) (IPName Name) +\end{code} + + + +%************************************************************************ +%* * + The module graph and ModSummary type + A ModSummary is a node in the compilation manager's + dependency graph, and it's also passed to hscMain +%* * +%************************************************************************ + +A ModuleGraph contains all the nodes from the home package (only). +There will be a node for each source module, plus a node for each hi-boot +module. + +\begin{code} +type ModuleGraph = [ModSummary] -- The module graph, + -- NOT NECESSARILY IN TOPOLOGICAL ORDER + +emptyMG :: ModuleGraph +emptyMG = [] + +-- The nodes of the module graph are +-- EITHER a regular Haskell source module +-- OR a hi-boot source module + +data ModSummary + = ModSummary { + ms_mod :: Module, -- Name of the module + ms_hsc_src :: HscSource, -- Source is Haskell, hs-boot, external core + ms_location :: ModLocation, -- Location + ms_hs_date :: ClockTime, -- Timestamp of source file + ms_obj_date :: Maybe ClockTime, -- Timestamp of object, maybe + ms_srcimps :: [Located Module], -- Source imports + ms_imps :: [Located Module], -- Non-source imports + ms_hspp_file :: FilePath, -- Filename of preprocessed source. + ms_hspp_opts :: DynFlags, -- Cached flags from OPTIONS, INCLUDE + -- and LANGUAGE pragmas. + ms_hspp_buf :: Maybe StringBuffer -- The actual preprocessed source, maybe. + } + +-- The ModLocation contains both the original source filename and the +-- filename of the cleaned-up source file after all preprocessing has been +-- done. The point is that the summariser will have to cpp/unlit/whatever +-- all files anyway, and there's no point in doing this twice -- just +-- park the result in a temp file, put the name of it in the location, +-- and let @compile@ read from that file on the way back up. 
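-- As a rough illustration (not part of this module; the helper name is
-- made up for exposition), the two timestamps recorded in a ModSummary
-- above might be compared like this to decide whether the object file
-- looks out of date:
objectOutOfDate :: ModSummary -> Bool
objectOutOfDate ms
  = case ms_obj_date ms of
      Nothing       -> True                      -- no object file at all
      Just obj_date -> obj_date < ms_hs_date ms  -- object older than source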
+ +-- The ModLocation is stable over successive up-sweeps in GHCi, wheres +-- the ms_hs_date and imports can, of course, change + +msHsFilePath, msHiFilePath, msObjFilePath :: ModSummary -> FilePath +msHsFilePath ms = expectJust "msHsFilePath" (ml_hs_file (ms_location ms)) +msHiFilePath ms = ml_hi_file (ms_location ms) +msObjFilePath ms = ml_obj_file (ms_location ms) + +isBootSummary :: ModSummary -> Bool +isBootSummary ms = isHsBoot (ms_hsc_src ms) + +instance Outputable ModSummary where + ppr ms + = sep [text "ModSummary {", + nest 3 (sep [text "ms_hs_date = " <> text (show (ms_hs_date ms)), + text "ms_mod =" <+> ppr (ms_mod ms) + <> text (hscSourceString (ms_hsc_src ms)) <> comma, + text "ms_imps =" <+> ppr (ms_imps ms), + text "ms_srcimps =" <+> ppr (ms_srcimps ms)]), + char '}' + ] + +showModMsg :: HscTarget -> Bool -> ModSummary -> String +showModMsg target recomp mod_summary + = showSDoc (hsep [text (mod_str ++ replicate (max 0 (16 - length mod_str)) ' '), + char '(', text (msHsFilePath mod_summary) <> comma, + case target of + HscInterpreted | recomp + -> text "interpreted" + HscNothing -> text "nothing" + _other -> text (msObjFilePath mod_summary), + char ')']) + where + mod = ms_mod mod_summary + mod_str = moduleString mod ++ hscSourceString (ms_hsc_src mod_summary) +\end{code} + + +%************************************************************************ +%* * +\subsection{Linkable stuff} +%* * +%************************************************************************ + +This stuff is in here, rather than (say) in Linker.lhs, because the Linker.lhs +stuff is the *dynamic* linker, and isn't present in a stage-1 compiler + +\begin{code} +data Linkable = LM { + linkableTime :: ClockTime, -- Time at which this linkable was built + -- (i.e. when the bytecodes were produced, + -- or the mod date on the files) + linkableModule :: Module, -- Should be Module, but see below + linkableUnlinked :: [Unlinked] + } + +isObjectLinkable :: Linkable -> Bool +isObjectLinkable l = not (null unlinked) && all isObject unlinked + where unlinked = linkableUnlinked l + -- A linkable with no Unlinked's is treated as a BCO. We can + -- generate a linkable with no Unlinked's as a result of + -- compiling a module in HscNothing mode, and this choice + -- happens to work well with checkStability in module GHC. + +instance Outputable Linkable where + ppr (LM when_made mod unlinkeds) + = (text "LinkableM" <+> parens (text (show when_made)) <+> ppr mod) + $$ nest 3 (ppr unlinkeds) + +------------------------------------------- +data Unlinked + = DotO FilePath + | DotA FilePath + | DotDLL FilePath + | BCOs CompiledByteCode + +#ifndef GHCI +data CompiledByteCode = NoByteCode +#endif + +instance Outputable Unlinked where + ppr (DotO path) = text "DotO" <+> text path + ppr (DotA path) = text "DotA" <+> text path + ppr (DotDLL path) = text "DotDLL" <+> text path +#ifdef GHCI + ppr (BCOs bcos) = text "BCOs" <+> ppr bcos +#else + ppr (BCOs bcos) = text "No byte code" +#endif + +isObject (DotO _) = True +isObject (DotA _) = True +isObject (DotDLL _) = True +isObject _ = False + +isInterpretable = not . 
isObject + +nameOfObject (DotO fn) = fn +nameOfObject (DotA fn) = fn +nameOfObject (DotDLL fn) = fn + +byteCodeOfObject (BCOs bc) = bc +\end{code} + + + diff --git a/compiler/main/Main.hs b/compiler/main/Main.hs new file mode 100644 index 0000000000..ec5a116894 --- /dev/null +++ b/compiler/main/Main.hs @@ -0,0 +1,476 @@ +{-# OPTIONS -fno-warn-incomplete-patterns -optc-DNON_POSIX_SOURCE #-} +----------------------------------------------------------------------------- +-- +-- GHC Driver program +-- +-- (c) The University of Glasgow 2005 +-- +----------------------------------------------------------------------------- + +module Main (main) where + +#include "HsVersions.h" + +-- The official GHC API +import qualified GHC +import GHC ( Session, DynFlags(..), GhcMode(..), HscTarget(..), + LoadHowMuch(..), dopt, DynFlag(..) ) +import CmdLineParser + +-- Implementations of the various modes (--show-iface, mkdependHS. etc.) +import MkIface ( showIface ) +import DriverPipeline ( oneShot, compileFile ) +import DriverMkDepend ( doMkDependHS ) +import SysTools ( getTopDir, getUsageMsgPaths ) +#ifdef GHCI +import InteractiveUI ( ghciWelcomeMsg, interactiveUI ) +#endif + +-- Various other random stuff that we need +import Config ( cProjectVersion, cBooterVersion, cProjectName ) +import Packages ( dumpPackages, initPackages ) +import DriverPhases ( Phase(..), isSourceFilename, anyHsc, + startPhase, isHaskellSrcFilename ) +import StaticFlags ( staticFlags, v_Ld_inputs, parseStaticFlags ) +import DynFlags ( defaultDynFlags ) +import BasicTypes ( failed ) +import ErrUtils ( Message, debugTraceMsg, putMsg ) +import FastString ( getFastStringTable, isZEncoded, hasZEncoding ) +import Outputable +import Util +import Panic + +-- Standard Haskell libraries +import EXCEPTION ( throwDyn ) +import IO +import Directory ( doesDirectoryExist ) +import System ( getArgs, exitWith, ExitCode(..) ) +import Monad +import List +import Maybe + +----------------------------------------------------------------------------- +-- ToDo: + +-- time commands when run with -v +-- user ways +-- Win32 support: proper signal handling +-- reading the package configuration file is too slow +-- -K<size> + +----------------------------------------------------------------------------- +-- GHC's command-line interface + +main = + GHC.defaultErrorHandler defaultDynFlags $ do + + argv0 <- getArgs + argv1 <- parseStaticFlags =<< GHC.initFromArgs argv0 + + -- 2. Parse the "mode" flags (--make, --interactive etc.) + (cli_mode, argv2) <- parseModeFlags argv1 + + let mode = case cli_mode of + DoInteractive -> Interactive + DoEval _ -> Interactive + DoMake -> BatchCompile + DoMkDependHS -> MkDepend + _ -> OneShot + + -- start our GHC session + session <- GHC.newSession mode + + dflags0 <- GHC.getSessionDynFlags session + + -- set the default HscTarget. The HscTarget can be further + -- adjusted on a module by module basis, using only the -fvia-C and + -- -fasm flags. If the default HscTarget is not HscC or HscAsm, + -- -fvia-C and -fasm have no effect. 
+ let lang = case cli_mode of + DoInteractive -> HscInterpreted + DoEval _ -> HscInterpreted + _other -> hscTarget dflags0 + + let dflags1 = dflags0{ ghcMode = mode, + hscTarget = lang, + -- leave out hscOutName for now + hscOutName = panic "Main.main:hscOutName not set", + verbosity = case cli_mode of + DoEval _ -> 0 + _other -> 1 + } + + -- The rest of the arguments are "dynamic" + -- Leftover ones are presumably files + (dflags2, fileish_args) <- GHC.parseDynamicFlags dflags1 argv2 + + -- make sure we clean up after ourselves + GHC.defaultCleanupHandler dflags2 $ do + + -- Display banner + showBanner cli_mode dflags2 + + -- Read the package config(s), and process the package-related + -- command-line flags + dflags <- initPackages dflags2 + + -- we've finished manipulating the DynFlags, update the session + GHC.setSessionDynFlags session dflags + + let + -- To simplify the handling of filepaths, we normalise all filepaths right + -- away - e.g., for win32 platforms, backslashes are converted + -- into forward slashes. + normal_fileish_paths = map normalisePath fileish_args + (srcs, objs) = partition_args normal_fileish_paths [] [] + + -- Note: have v_Ld_inputs maintain the order in which 'objs' occurred on + -- the command-line. + mapM_ (consIORef v_Ld_inputs) (reverse objs) + + ---------------- Display configuration ----------- + when (verbosity dflags >= 4) $ + dumpPackages dflags + + when (verbosity dflags >= 3) $ do + hPutStrLn stderr ("Hsc static flags: " ++ unwords staticFlags) + + ---------------- Final sanity checking ----------- + checkOptions cli_mode dflags srcs objs + + ---------------- Do the business ----------- + case cli_mode of + ShowUsage -> showGhcUsage cli_mode + PrintLibdir -> do d <- getTopDir; putStrLn d + ShowVersion -> showVersion + ShowNumVersion -> putStrLn cProjectVersion + ShowInterface f -> showIface f + DoMake -> doMake session srcs + DoMkDependHS -> doMkDependHS session (map fst srcs) + StopBefore p -> oneShot dflags p srcs + DoInteractive -> interactiveUI session srcs Nothing + DoEval expr -> interactiveUI session srcs (Just expr) + + dumpFinalStats dflags + exitWith ExitSuccess + +#ifndef GHCI +interactiveUI _ _ _ = + throwDyn (CmdLineError "not built for interactive use") +#endif + +-- ----------------------------------------------------------------------------- +-- Splitting arguments into source files and object files. This is where we +-- interpret the -x <suffix> option, and attach a (Maybe Phase) to each source +-- file indicating the phase specified by the -x option in force, if any. + +partition_args [] srcs objs = (reverse srcs, reverse objs) +partition_args ("-x":suff:args) srcs objs + | "none" <- suff = partition_args args srcs objs + | StopLn <- phase = partition_args args srcs (slurp ++ objs) + | otherwise = partition_args rest (these_srcs ++ srcs) objs + where phase = startPhase suff + (slurp,rest) = break (== "-x") args + these_srcs = zip slurp (repeat (Just phase)) +partition_args (arg:args) srcs objs + | looks_like_an_input arg = partition_args args ((arg,Nothing):srcs) objs + | otherwise = partition_args args srcs (arg:objs) + + {- + We split out the object files (.o, .dll) and add them + to v_Ld_inputs for use by the linker. + + The following things should be considered compilation manager inputs: + + - haskell source files (strings ending in .hs, .lhs or other + haskellish extension), + + - module names (not forgetting hierarchical module names), + + - and finally we consider everything not containing a '.' 
to be + a comp manager input, as shorthand for a .hs or .lhs filename. + + Everything else is considered to be a linker object, and passed + straight through to the linker. + -} +looks_like_an_input m = isSourceFilename m + || looksLikeModuleName m + || '.' `notElem` m + +-- ----------------------------------------------------------------------------- +-- Option sanity checks + +checkOptions :: CmdLineMode -> DynFlags -> [(String,Maybe Phase)] -> [String] -> IO () + -- Final sanity checking before kicking off a compilation (pipeline). +checkOptions cli_mode dflags srcs objs = do + -- Complain about any unknown flags + let unknown_opts = [ f | (f@('-':_), _) <- srcs ] + when (notNull unknown_opts) (unknownFlagsErr unknown_opts) + + -- -prof and --interactive are not a good combination + when (notNull (wayNames dflags) && isInterpretiveMode cli_mode) $ + do throwDyn (UsageError + "--interactive can't be used with -prof, -ticky, -unreg or -smp.") + -- -ohi sanity check + if (isJust (outputHi dflags) && + (isCompManagerMode cli_mode || srcs `lengthExceeds` 1)) + then throwDyn (UsageError "-ohi can only be used when compiling a single source file") + else do + + -- -o sanity checking + if (srcs `lengthExceeds` 1 && isJust (outputFile dflags) + && not (isLinkMode cli_mode)) + then throwDyn (UsageError "can't apply -o to multiple source files") + else do + + -- Check that there are some input files + -- (except in the interactive case) + if null srcs && null objs && needsInputsMode cli_mode + then throwDyn (UsageError "no input files") + else do + + -- Verify that output files point somewhere sensible. + verifyOutputFiles dflags + + +-- Compiler output options + +-- called to verify that the output files & directories +-- point somewhere valid. +-- +-- The assumption is that the directory portion of these output +-- options will have to exist by the time 'verifyOutputFiles' +-- is invoked. +-- +verifyOutputFiles :: DynFlags -> IO () +verifyOutputFiles dflags = do + let odir = objectDir dflags + when (isJust odir) $ do + let dir = fromJust odir + flg <- doesDirectoryExist dir + when (not flg) (nonExistentDir "-odir" dir) + let ofile = outputFile dflags + when (isJust ofile) $ do + let fn = fromJust ofile + flg <- doesDirNameExist fn + when (not flg) (nonExistentDir "-o" fn) + let ohi = outputHi dflags + when (isJust ohi) $ do + let hi = fromJust ohi + flg <- doesDirNameExist hi + when (not flg) (nonExistentDir "-ohi" hi) + where + nonExistentDir flg dir = + throwDyn (CmdLineError ("error: directory portion of " ++ + show dir ++ " does not exist (used with " ++ + show flg ++ " option.)")) + +----------------------------------------------------------------------------- +-- GHC modes of operation + +data CmdLineMode + = ShowUsage -- ghc -? 
+ | PrintLibdir -- ghc --print-libdir + | ShowVersion -- ghc -V/--version + | ShowNumVersion -- ghc --numeric-version + | ShowInterface String -- ghc --show-iface + | DoMkDependHS -- ghc -M + | StopBefore Phase -- ghc -E | -C | -S + -- StopBefore StopLn is the default + | DoMake -- ghc --make + | DoInteractive -- ghc --interactive + | DoEval String -- ghc -e + deriving (Show) + +isInteractiveMode, isInterpretiveMode :: CmdLineMode -> Bool +isLinkMode, isCompManagerMode :: CmdLineMode -> Bool + +isInteractiveMode DoInteractive = True +isInteractiveMode _ = False + +-- isInterpretiveMode: byte-code compiler involved +isInterpretiveMode DoInteractive = True +isInterpretiveMode (DoEval _) = True +isInterpretiveMode _ = False + +needsInputsMode DoMkDependHS = True +needsInputsMode (StopBefore _) = True +needsInputsMode DoMake = True +needsInputsMode _ = False + +-- True if we are going to attempt to link in this mode. +-- (we might not actually link, depending on the GhcLink flag) +isLinkMode (StopBefore StopLn) = True +isLinkMode DoMake = True +isLinkMode _ = False + +isCompManagerMode DoMake = True +isCompManagerMode DoInteractive = True +isCompManagerMode (DoEval _) = True +isCompManagerMode _ = False + + +-- ----------------------------------------------------------------------------- +-- Parsing the mode flag + +parseModeFlags :: [String] -> IO (CmdLineMode, [String]) +parseModeFlags args = do + let ((leftover, errs), (mode, _, flags)) = + runCmdLine (processArgs mode_flags args) (StopBefore StopLn, "", []) + when (not (null errs)) $ do + throwDyn (UsageError (unlines errs)) + return (mode, flags ++ leftover) + +type ModeM a = CmdLineP (CmdLineMode, String, [String]) a + -- mode flags sometimes give rise to new DynFlags (eg. -C, see below) + -- so we collect the new ones and return them. + +mode_flags :: [(String, OptKind (CmdLineP (CmdLineMode, String, [String])))] +mode_flags = + [ ------- help / version ---------------------------------------------- + ( "?" , PassFlag (setMode ShowUsage)) + , ( "-help" , PassFlag (setMode ShowUsage)) + , ( "-print-libdir" , PassFlag (setMode PrintLibdir)) + , ( "V" , PassFlag (setMode ShowVersion)) + , ( "-version" , PassFlag (setMode ShowVersion)) + , ( "-numeric-version", PassFlag (setMode ShowNumVersion)) + + ------- interfaces ---------------------------------------------------- + , ( "-show-iface" , HasArg (\f -> setMode (ShowInterface f) + "--show-iface")) + + ------- primary modes ------------------------------------------------ + , ( "M" , PassFlag (setMode DoMkDependHS)) + , ( "E" , PassFlag (setMode (StopBefore anyHsc))) + , ( "C" , PassFlag (\f -> do setMode (StopBefore HCc) f + addFlag "-fvia-C")) + , ( "S" , PassFlag (setMode (StopBefore As))) + , ( "-make" , PassFlag (setMode DoMake)) + , ( "-interactive" , PassFlag (setMode DoInteractive)) + , ( "e" , HasArg (\s -> setMode (DoEval s) "-e")) + + -- -fno-code says to stop after Hsc but don't generate any code. 
+ , ( "fno-code" , PassFlag (\f -> do setMode (StopBefore HCc) f + addFlag "-fno-code" + addFlag "-no-recomp")) + ] + +setMode :: CmdLineMode -> String -> ModeM () +setMode m flag = do + (old_mode, old_flag, flags) <- getCmdLineState + when (notNull old_flag && flag /= old_flag) $ + throwDyn (UsageError + ("cannot use `" ++ old_flag ++ "' with `" ++ flag ++ "'")) + putCmdLineState (m, flag, flags) + +addFlag :: String -> ModeM () +addFlag s = do + (m, f, flags) <- getCmdLineState + putCmdLineState (m, f, s:flags) + + +-- ---------------------------------------------------------------------------- +-- Run --make mode + +doMake :: Session -> [(String,Maybe Phase)] -> IO () +doMake sess [] = throwDyn (UsageError "no input files") +doMake sess srcs = do + let (hs_srcs, non_hs_srcs) = partition haskellish srcs + + haskellish (f,Nothing) = + looksLikeModuleName f || isHaskellSrcFilename f || '.' `notElem` f + haskellish (f,Just phase) = + phase `notElem` [As, Cc, CmmCpp, Cmm, StopLn] + + dflags <- GHC.getSessionDynFlags sess + o_files <- mapM (compileFile dflags StopLn) non_hs_srcs + mapM_ (consIORef v_Ld_inputs) (reverse o_files) + + targets <- mapM (uncurry GHC.guessTarget) hs_srcs + GHC.setTargets sess targets + ok_flag <- GHC.load sess LoadAllTargets + when (failed ok_flag) (exitWith (ExitFailure 1)) + return () + +-- --------------------------------------------------------------------------- +-- Various banners and verbosity output. + +showBanner :: CmdLineMode -> DynFlags -> IO () +showBanner cli_mode dflags = do + let verb = verbosity dflags + -- Show the GHCi banner +# ifdef GHCI + when (isInteractiveMode cli_mode && verb >= 1) $ + hPutStrLn stdout ghciWelcomeMsg +# endif + + -- Display details of the configuration in verbose mode + when (not (isInteractiveMode cli_mode) && verb >= 2) $ + do hPutStr stderr "Glasgow Haskell Compiler, Version " + hPutStr stderr cProjectVersion + hPutStr stderr ", for Haskell 98, compiled by GHC version " +#ifdef GHCI + -- GHCI is only set when we are bootstrapping... + hPutStrLn stderr cProjectVersion +#else + hPutStrLn stderr cBooterVersion +#endif + +showVersion :: IO () +showVersion = do + putStrLn (cProjectName ++ ", version " ++ cProjectVersion) + exitWith ExitSuccess + +showGhcUsage cli_mode = do + (ghc_usage_path,ghci_usage_path) <- getUsageMsgPaths + let usage_path + | DoInteractive <- cli_mode = ghci_usage_path + | otherwise = ghc_usage_path + usage <- readFile usage_path + dump usage + exitWith ExitSuccess + where + dump "" = return () + dump ('$':'$':s) = putStr progName >> dump s + dump (c:s) = putChar c >> dump s + +dumpFinalStats :: DynFlags -> IO () +dumpFinalStats dflags = + when (dopt Opt_D_faststring_stats dflags) $ dumpFastStringStats dflags + +dumpFastStringStats :: DynFlags -> IO () +dumpFastStringStats dflags = do + buckets <- getFastStringTable + let (entries, longest, is_z, has_z) = countFS 0 0 0 0 buckets + msg = text "FastString stats:" $$ + nest 4 (vcat [text "size: " <+> int (length buckets), + text "entries: " <+> int entries, + text "longest chain: " <+> int longest, + text "z-encoded: " <+> (is_z `pcntOf` entries), + text "has z-encoding: " <+> (has_z `pcntOf` entries) + ]) + -- we usually get more "has z-encoding" than "z-encoded", because + -- when we z-encode a string it might hash to the exact same string, + -- which will is not counted as "z-encoded". Only strings whose + -- Z-encoding is different from the original string are counted in + -- the "z-encoded" total. 
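        -- As a worked example of the pcntOf formula in the where-clause
        -- below: 3 `pcntOf` 12 is (3 * 100) `quot` 12 = 25, rendered "25%".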
+ putMsg dflags msg + where + x `pcntOf` y = int ((x * 100) `quot` y) <> char '%' + +countFS entries longest is_z has_z [] = (entries, longest, is_z, has_z) +countFS entries longest is_z has_z (b:bs) = + let + len = length b + longest' = max len longest + entries' = entries + len + is_zs = length (filter isZEncoded b) + has_zs = length (filter hasZEncoding b) + in + countFS entries' longest' (is_z + is_zs) (has_z + has_zs) bs + +-- ----------------------------------------------------------------------------- +-- Util + +unknownFlagsErr :: [String] -> a +unknownFlagsErr fs = throwDyn (UsageError ("unrecognised flags: " ++ unwords fs)) diff --git a/compiler/main/PackageConfig.hs b/compiler/main/PackageConfig.hs new file mode 100644 index 0000000000..e19a10dbc5 --- /dev/null +++ b/compiler/main/PackageConfig.hs @@ -0,0 +1,69 @@ +-- +-- (c) The University of Glasgow, 2004 +-- + +module PackageConfig ( + -- * PackageId + PackageId, + mkPackageId, stringToPackageId, packageIdString, packageConfigId, + packageIdFS, fsToPackageId, + + -- * The PackageConfig type: information about a package + PackageConfig, + InstalledPackageInfo(..), showPackageId, + Version(..), + PackageIdentifier(..), + defaultPackageConfig + ) where + +#include "HsVersions.h" + +import Distribution.InstalledPackageInfo +import Distribution.Package +import Distribution.Version +import FastString + +-- ----------------------------------------------------------------------------- +-- Our PackageConfig type is just InstalledPackageInfo from Cabal. Later we +-- might need to extend it with some GHC-specific stuff, but for now it's fine. + +type PackageConfig = InstalledPackageInfo +defaultPackageConfig = emptyInstalledPackageInfo + +-- ----------------------------------------------------------------------------- +-- PackageId (package names with versions) + +-- Mostly the compiler deals in terms of PackageNames, which don't +-- have the version suffix. This is so that we don't need to know the +-- version for the -package-name flag, or know the versions of +-- wired-in packages like base & rts. Versions are confined to the +-- package sub-system. +-- +-- This means that in theory you could have multiple base packages installed +-- (for example), and switch between them using -package/-hide-package. +-- +-- A PackageId is a string of the form <pkg>-<version>. + +newtype PackageId = PId FastString deriving( Eq, Ord ) -- includes the version + -- easier not to use a newtype here, because we need instances of + -- Binary & Outputable, and we're too early to define them + +fsToPackageId :: FastString -> PackageId +fsToPackageId = PId + +packageIdFS :: PackageId -> FastString +packageIdFS (PId fs) = fs + +stringToPackageId :: String -> PackageId +stringToPackageId = fsToPackageId . mkFastString + +packageIdString :: PackageId -> String +packageIdString = unpackFS . packageIdFS + +mkPackageId :: PackageIdentifier -> PackageId +mkPackageId = stringToPackageId . showPackageId + +packageConfigId :: PackageConfig -> PackageId +packageConfigId = mkPackageId . 
package + + diff --git a/compiler/main/Packages.hi-boot-5 b/compiler/main/Packages.hi-boot-5 new file mode 100644 index 0000000000..62f020cddb --- /dev/null +++ b/compiler/main/Packages.hi-boot-5 @@ -0,0 +1,3 @@ +__interface Packages 1 0 where +__export Packages PackageState ; +1 data PackageState ; diff --git a/compiler/main/Packages.hi-boot-6 b/compiler/main/Packages.hi-boot-6 new file mode 100644 index 0000000000..6b12f1496e --- /dev/null +++ b/compiler/main/Packages.hi-boot-6 @@ -0,0 +1,2 @@ +module Packages where +data PackageState diff --git a/compiler/main/Packages.lhs b/compiler/main/Packages.lhs new file mode 100644 index 0000000000..ae6b18863e --- /dev/null +++ b/compiler/main/Packages.lhs @@ -0,0 +1,705 @@ +% +% (c) The University of Glasgow, 2000 +% +\section{Package manipulation} + +\begin{code} +module Packages ( + module PackageConfig, + + -- * The PackageConfigMap + PackageConfigMap, emptyPackageConfigMap, lookupPackage, + extendPackageConfigMap, dumpPackages, + + -- * Reading the package config, and processing cmdline args + PackageIdH(..), isHomePackage, + PackageState(..), + mkPackageState, + initPackages, + getPackageDetails, + checkForPackageConflicts, + lookupModuleInAllPackages, + + HomeModules, mkHomeModules, isHomeModule, + + -- * Inspecting the set of packages in scope + getPackageIncludePath, + getPackageCIncludes, + getPackageLibraryPath, + getPackageLinkOpts, + getPackageExtraCcOpts, + getPackageFrameworkPath, + getPackageFrameworks, + getExplicitPackagesAnd, + + -- * Utils + isDllName + ) +where + +#include "HsVersions.h" + +import PackageConfig +import SysTools ( getTopDir, getPackageConfigPath ) +import ParsePkgConf ( loadPackageConfig ) +import DynFlags ( dopt, DynFlag(..), DynFlags(..), PackageFlag(..) ) +import StaticFlags ( opt_Static ) +import Config ( cProjectVersion ) +import Name ( Name, nameModule_maybe ) +import UniqFM +import Module +import FiniteMap +import UniqSet +import Util +import Maybes ( expectJust, MaybeErr(..) ) +import Panic +import Outputable + +#if __GLASGOW_HASKELL__ >= 603 +import System.Directory ( getAppUserDataDirectory ) +#else +import Compat.Directory ( getAppUserDataDirectory ) +#endif + +import System.Environment ( getEnv ) +import Distribution.InstalledPackageInfo +import Distribution.Package +import Distribution.Version +import System.Directory ( doesFileExist, doesDirectoryExist, + getDirectoryContents ) +import Control.Monad ( foldM ) +import Data.List ( nub, partition, sortBy, isSuffixOf ) +import FastString +import EXCEPTION ( throwDyn ) +import ErrUtils ( debugTraceMsg, putMsg, Message ) + +-- --------------------------------------------------------------------------- +-- The Package state + +-- Package state is all stored in DynFlags, including the details of +-- all packages, which packages are exposed, and which modules they +-- provide. + +-- The package state is computed by initPackages, and kept in DynFlags. +-- +-- * -package <pkg> causes <pkg> to become exposed, and all other packages +-- with the same name to become hidden. +-- +-- * -hide-package <pkg> causes <pkg> to become hidden. +-- +-- * Let exposedPackages be the set of packages thus exposed. +-- Let depExposedPackages be the transitive closure from exposedPackages of +-- their dependencies. +-- +-- * It is an error for any two packages in depExposedPackages to provide the +-- same module. +-- +-- * When searching for a module from an explicit import declaration, +-- only the exposed modules in exposedPackages are valid. 
+-- +-- * When searching for a module from an implicit import, all modules +-- from depExposedPackages are valid. +-- +-- * When linking in a comp manager mode, we link in packages the +-- program depends on (the compiler knows this list by the +-- time it gets to the link step). Also, we link in all packages +-- which were mentioned with explicit -package flags on the command-line, +-- or are a transitive dependency of same, or are "base"/"rts". +-- The reason for (b) is that we might need packages which don't +-- contain any Haskell modules, and therefore won't be discovered +-- by the normal mechanism of dependency tracking. + + +-- One important thing that the package state provides is a way to +-- tell, for a given module, whether it is part of the current package +-- or not. We need to know this for two reasons: +-- +-- * generating cross-DLL calls is different from intra-DLL calls +-- (see below). +-- * we don't record version information in interface files for entities +-- in a different package. +-- +-- Notes on DLLs +-- ~~~~~~~~~~~~~ +-- When compiling module A, which imports module B, we need to +-- know whether B will be in the same DLL as A. +-- If it's in the same DLL, we refer to B_f_closure +-- If it isn't, we refer to _imp__B_f_closure +-- When compiling A, we record in B's Module value whether it's +-- in a different DLL, by setting the DLL flag. + +data PackageState = PackageState { + + explicitPackages :: [PackageId], + -- The packages we're going to link in eagerly. This list + -- should be in reverse dependency order; that is, a package + -- is always mentioned before the packages it depends on. + + origPkgIdMap :: PackageConfigMap, -- PackageId -> PackageConfig + -- the full package database + + pkgIdMap :: PackageConfigMap, -- PackageId -> PackageConfig + -- Derived from origPkgIdMap. + -- The exposed flags are adjusted according to -package and + -- -hide-package flags, and -ignore-package removes packages. + + moduleToPkgConfAll :: ModuleEnv [(PackageConfig,Bool)], + -- Derived from pkgIdMap. + -- Maps Module to (pkgconf,exposed), where pkgconf is the + -- PackageConfig for the package containing the module, and + -- exposed is True if the package exposes that module. 
+ + -- The PackageIds of some known packages + basePackageId :: PackageIdH, + rtsPackageId :: PackageIdH, + haskell98PackageId :: PackageIdH, + thPackageId :: PackageIdH + } + +data PackageIdH + = HomePackage -- The "home" package is the package curently + -- being compiled + | ExtPackage PackageId -- An "external" package is any other package + + +isHomePackage :: PackageIdH -> Bool +isHomePackage HomePackage = True +isHomePackage (ExtPackage _) = False + +-- A PackageConfigMap maps a PackageId to a PackageConfig +type PackageConfigMap = UniqFM PackageConfig + +emptyPackageConfigMap :: PackageConfigMap +emptyPackageConfigMap = emptyUFM + +lookupPackage :: PackageConfigMap -> PackageId -> Maybe PackageConfig +lookupPackage = lookupUFM + +extendPackageConfigMap + :: PackageConfigMap -> [PackageConfig] -> PackageConfigMap +extendPackageConfigMap pkg_map new_pkgs + = foldl add pkg_map new_pkgs + where add pkg_map p = addToUFM pkg_map (packageConfigId p) p + +getPackageDetails :: PackageState -> PackageId -> PackageConfig +getPackageDetails dflags ps = expectJust "getPackageDetails" (lookupPackage (pkgIdMap dflags) ps) + +-- ---------------------------------------------------------------------------- +-- Loading the package config files and building up the package state + +-- | Call this after parsing the DynFlags. It reads the package +-- configuration files, and sets up various internal tables of package +-- information, according to the package-related flags on the +-- command-line (@-package@, @-hide-package@ etc.) +initPackages :: DynFlags -> IO DynFlags +initPackages dflags = do + pkg_map <- readPackageConfigs dflags; + state <- mkPackageState dflags pkg_map + return dflags{ pkgState = state } + +-- ----------------------------------------------------------------------------- +-- Reading the package database(s) + +readPackageConfigs :: DynFlags -> IO PackageConfigMap +readPackageConfigs dflags = do + e_pkg_path <- try (getEnv "GHC_PACKAGE_PATH") + system_pkgconfs <- getSystemPackageConfigs dflags + + let pkgconfs = case e_pkg_path of + Left _ -> system_pkgconfs + Right path + | last cs == "" -> init cs ++ system_pkgconfs + | otherwise -> cs + where cs = parseSearchPath path + -- if the path ends in a separator (eg. "/foo/bar:") + -- the we tack on the system paths. + + -- Read all the ones mentioned in -package-conf flags + pkg_map <- foldM (readPackageConfig dflags) emptyPackageConfigMap + (reverse pkgconfs ++ extraPkgConfs dflags) + + return pkg_map + + +getSystemPackageConfigs :: DynFlags -> IO [FilePath] +getSystemPackageConfigs dflags = do + -- System one always comes first + system_pkgconf <- getPackageConfigPath + + -- allow package.conf.d to contain a bunch of .conf files + -- containing package specifications. This is an easier way + -- to maintain the package database on systems with a package + -- management system, or systems that don't want to run ghc-pkg + -- to register or unregister packages. Undocumented feature for now. + let system_pkgconf_dir = system_pkgconf ++ ".d" + system_pkgconf_dir_exists <- doesDirectoryExist system_pkgconf_dir + system_pkgconfs <- + if system_pkgconf_dir_exists + then do files <- getDirectoryContents system_pkgconf_dir + return [ system_pkgconf_dir ++ '/' : file + | file <- files + , isSuffixOf ".conf" file] + else return [] + + -- Read user's package conf (eg. ~/.ghc/i386-linux-6.3/package.conf) + -- unless the -no-user-package-conf flag was given. + -- We only do this when getAppUserDataDirectory is available + -- (GHC >= 6.3). 
+ user_pkgconf <- handle (\_ -> return []) $ do + appdir <- getAppUserDataDirectory "ghc" + let + pkgconf = appdir + `joinFileName` (TARGET_ARCH ++ '-':TARGET_OS ++ '-':cProjectVersion) + `joinFileName` "package.conf" + flg <- doesFileExist pkgconf + if (flg && dopt Opt_ReadUserPackageConf dflags) + then return [pkgconf] + else return [] + + return (user_pkgconf ++ system_pkgconfs ++ [system_pkgconf]) + + +readPackageConfig + :: DynFlags -> PackageConfigMap -> FilePath -> IO PackageConfigMap +readPackageConfig dflags pkg_map conf_file = do + debugTraceMsg dflags 2 (text "Using package config file:" <+> text conf_file) + proto_pkg_configs <- loadPackageConfig conf_file + top_dir <- getTopDir + let pkg_configs1 = mungePackagePaths top_dir proto_pkg_configs + pkg_configs2 = maybeHidePackages dflags pkg_configs1 + return (extendPackageConfigMap pkg_map pkg_configs2) + +maybeHidePackages :: DynFlags -> [PackageConfig] -> [PackageConfig] +maybeHidePackages dflags pkgs + | dopt Opt_HideAllPackages dflags = map hide pkgs + | otherwise = pkgs + where + hide pkg = pkg{ exposed = False } + +mungePackagePaths :: String -> [PackageConfig] -> [PackageConfig] +-- Replace the string "$topdir" at the beginning of a path +-- with the current topdir (obtained from the -B option). +mungePackagePaths top_dir ps = map munge_pkg ps + where + munge_pkg p = p{ importDirs = munge_paths (importDirs p), + includeDirs = munge_paths (includeDirs p), + libraryDirs = munge_paths (libraryDirs p), + frameworkDirs = munge_paths (frameworkDirs p) } + + munge_paths = map munge_path + + munge_path p + | Just p' <- maybePrefixMatch "$topdir" p = top_dir ++ p' + | otherwise = p + + +-- ----------------------------------------------------------------------------- +-- When all the command-line options are in, we can process our package +-- settings and populate the package state. + +mkPackageState :: DynFlags -> PackageConfigMap -> IO PackageState +mkPackageState dflags orig_pkg_db = do + -- + -- Modify the package database according to the command-line flags + -- (-package, -hide-package, -ignore-package, -hide-all-packages). + -- + -- Also, here we build up a set of the packages mentioned in -package + -- flags on the command line; these are called the "explicit" packages. + -- we link these packages in eagerly. The explicit set should contain + -- at least rts & base, which is why we pretend that the command line + -- contains -package rts & -package base. + -- + let + flags = reverse (packageFlags dflags) + + procflags pkgs expl [] = return (pkgs,expl) + procflags pkgs expl (ExposePackage str : flags) = do + case pick str pkgs of + Nothing -> missingPackageErr str + Just (p,ps) -> procflags (p':ps') expl' flags + where pkgid = packageConfigId p + p' = p {exposed=True} + ps' = hideAll (pkgName (package p)) ps + expl' = addOneToUniqSet expl pkgid + procflags pkgs expl (HidePackage str : flags) = do + case partition (matches str) pkgs of + ([],_) -> missingPackageErr str + (ps,qs) -> procflags (map hide ps ++ qs) expl flags + where hide p = p {exposed=False} + procflags pkgs expl (IgnorePackage str : flags) = do + case partition (matches str) pkgs of + (ps,qs) -> procflags qs expl flags + -- missing package is not an error for -ignore-package, + -- because a common usage is to -ignore-package P as + -- a preventative measure just in case P exists. 
+ + pick str pkgs + = case partition (matches str) pkgs of + ([],_) -> Nothing + (ps,rest) -> + case sortBy (flip (comparing (pkgVersion.package))) ps of + (p:ps) -> Just (p, ps ++ rest) + _ -> panic "Packages.pick" + + comparing f a b = f a `compare` f b + + -- A package named on the command line can either include the + -- version, or just the name if it is unambiguous. + matches str p + = str == showPackageId (package p) + || str == pkgName (package p) + + -- When a package is requested to be exposed, we hide all other + -- packages with the same name. + hideAll name ps = map maybe_hide ps + where maybe_hide p | pkgName (package p) == name = p {exposed=False} + | otherwise = p + -- + (pkgs1,explicit) <- procflags (eltsUFM orig_pkg_db) emptyUniqSet flags + -- + -- hide all packages for which there is also a later version + -- that is already exposed. This just makes it non-fatal to have two + -- versions of a package exposed, which can happen if you install a + -- later version of a package in the user database, for example. + -- + let maybe_hide p + | not (exposed p) = return p + | (p' : _) <- later_versions = do + debugTraceMsg dflags 2 $ + (ptext SLIT("hiding package") <+> text (showPackageId (package p)) <+> + ptext SLIT("to avoid conflict with later version") <+> + text (showPackageId (package p'))) + return (p {exposed=False}) + | otherwise = return p + where myname = pkgName (package p) + myversion = pkgVersion (package p) + later_versions = [ p | p <- pkgs1, exposed p, + let pkg = package p, + pkgName pkg == myname, + pkgVersion pkg > myversion ] + a_later_version_is_exposed + = not (null later_versions) + + pkgs2 <- mapM maybe_hide pkgs1 + -- + -- Eliminate any packages which have dangling dependencies (perhaps + -- because the package was removed by -ignore-package). + -- + let + elimDanglingDeps pkgs = + case partition (not.null.snd) (map (getDanglingDeps pkgs) pkgs) of + ([],ps) -> return (map fst ps) + (ps,qs) -> do + mapM_ reportElim ps + elimDanglingDeps (map fst qs) + + reportElim (p, deps) = + debugTraceMsg dflags 2 $ + (ptext SLIT("package") <+> pprPkg p <+> + ptext SLIT("will be ignored due to missing dependencies:") $$ + nest 2 (hsep (map (text.showPackageId) deps))) + + getDanglingDeps pkgs p = (p, filter dangling (depends p)) + where dangling pid = pid `notElem` all_pids + all_pids = map package pkgs + -- + pkgs <- elimDanglingDeps pkgs2 + let pkg_db = extendPackageConfigMap emptyPackageConfigMap pkgs + -- + -- Find the transitive closure of dependencies of exposed + -- + let exposed_pkgids = [ packageConfigId p | p <- pkgs, exposed p ] + dep_exposed <- closeDeps pkg_db exposed_pkgids + -- + -- Look up some known PackageIds + -- + let + lookupPackageByName :: FastString -> PackageIdH + lookupPackageByName nm = + case [ conf | p <- dep_exposed, + Just conf <- [lookupPackage pkg_db p], + nm == mkFastString (pkgName (package conf)) ] of + [] -> HomePackage + (p:ps) -> ExtPackage (mkPackageId (package p)) + + -- Get the PackageIds for some known packages (we know the names, + -- but we don't know the versions). Some of these packages might + -- not exist in the database, so they are Maybes. 
+ basePackageId = lookupPackageByName basePackageName + rtsPackageId = lookupPackageByName rtsPackageName + haskell98PackageId = lookupPackageByName haskell98PackageName + thPackageId = lookupPackageByName thPackageName + + -- add base & rts to the explicit packages + basicLinkedPackages = [basePackageId,rtsPackageId] + explicit' = addListToUniqSet explicit + [ p | ExtPackage p <- basicLinkedPackages ] + -- + -- Close the explicit packages with their dependencies + -- + dep_explicit <- closeDeps pkg_db (uniqSetToList explicit') + -- + -- Build up a mapping from Module -> PackageConfig for all modules. + -- Discover any conflicts at the same time, and factor in the new exposed + -- status of each package. + -- + let mod_map = mkModuleMap pkg_db dep_exposed + + return PackageState{ explicitPackages = dep_explicit, + origPkgIdMap = orig_pkg_db, + pkgIdMap = pkg_db, + moduleToPkgConfAll = mod_map, + basePackageId = basePackageId, + rtsPackageId = rtsPackageId, + haskell98PackageId = haskell98PackageId, + thPackageId = thPackageId + } + -- done! + +basePackageName = FSLIT("base") +rtsPackageName = FSLIT("rts") +haskell98PackageName = FSLIT("haskell98") +thPackageName = FSLIT("template-haskell") + -- Template Haskell libraries in here + +mkModuleMap + :: PackageConfigMap + -> [PackageId] + -> ModuleEnv [(PackageConfig, Bool)] +mkModuleMap pkg_db pkgs = foldr extend_modmap emptyUFM pkgs + where + extend_modmap pkgname modmap = + addListToUFM_C (++) modmap + [(m, [(pkg, m `elem` exposed_mods)]) | m <- all_mods] + where + pkg = expectJust "mkModuleMap" (lookupPackage pkg_db pkgname) + exposed_mods = map mkModule (exposedModules pkg) + hidden_mods = map mkModule (hiddenModules pkg) + all_mods = exposed_mods ++ hidden_mods + +-- ----------------------------------------------------------------------------- +-- Check for conflicts in the program. + +-- | A conflict arises if the program contains two modules with the same +-- name, which can arise if the program depends on multiple packages that +-- expose the same module, or if the program depends on a package that +-- contains a module also present in the program (the "home package"). 
+-- +checkForPackageConflicts + :: DynFlags + -> [Module] -- modules in the home package + -> [PackageId] -- packages on which the program depends + -> MaybeErr Message () + +checkForPackageConflicts dflags mods pkgs = do + let + state = pkgState dflags + pkg_db = pkgIdMap state + -- + dep_pkgs <- closeDepsErr pkg_db pkgs + + let + extend_modmap pkgname modmap = + addListToFM_C (++) modmap + [(m, [(pkg, m `elem` exposed_mods)]) | m <- all_mods] + where + pkg = expectJust "checkForPackageConflicts" + (lookupPackage pkg_db pkgname) + exposed_mods = map mkModule (exposedModules pkg) + hidden_mods = map mkModule (hiddenModules pkg) + all_mods = exposed_mods ++ hidden_mods + + mod_map = foldr extend_modmap emptyFM pkgs + mod_map_list :: [(Module,[(PackageConfig,Bool)])] + mod_map_list = fmToList mod_map + + overlaps = [ (m, map fst ps) | (m,ps@(_:_:_)) <- mod_map_list ] + -- + if not (null overlaps) + then Failed (pkgOverlapError overlaps) + else do + + let + overlap_mods = [ (mod,pkg) + | mod <- mods, + Just ((pkg,_):_) <- [lookupFM mod_map mod] ] + -- will be only one package here + if not (null overlap_mods) + then Failed (modOverlapError overlap_mods) + else do + + return () + +pkgOverlapError overlaps = vcat (map msg overlaps) + where + msg (mod,pkgs) = + text "conflict: module" <+> quotes (ppr mod) + <+> ptext SLIT("is present in multiple packages:") + <+> hsep (punctuate comma (map pprPkg pkgs)) + +modOverlapError overlaps = vcat (map msg overlaps) + where + msg (mod,pkg) = fsep [ + text "conflict: module", + quotes (ppr mod), + ptext SLIT("belongs to the current program/library"), + ptext SLIT("and also to package"), + pprPkg pkg ] + +pprPkg :: PackageConfig -> SDoc +pprPkg p = text (showPackageId (package p)) + +-- ----------------------------------------------------------------------------- +-- Extracting information from the packages in scope + +-- Many of these functions take a list of packages: in those cases, +-- the list is expected to contain the "dependent packages", +-- i.e. those packages that were found to be depended on by the +-- current module/program. These can be auto or non-auto packages, it +-- doesn't really matter. The list is always combined with the list +-- of explicit (command-line) packages to determine which packages to +-- use. + +getPackageIncludePath :: DynFlags -> [PackageId] -> IO [String] +getPackageIncludePath dflags pkgs = do + ps <- getExplicitPackagesAnd dflags pkgs + return (nub (filter notNull (concatMap includeDirs ps))) + + -- includes are in reverse dependency order (i.e. rts first) +getPackageCIncludes :: [PackageConfig] -> IO [String] +getPackageCIncludes pkg_configs = do + return (reverse (nub (filter notNull (concatMap includes pkg_configs)))) + +getPackageLibraryPath :: DynFlags -> [PackageId] -> IO [String] +getPackageLibraryPath dflags pkgs = do + ps <- getExplicitPackagesAnd dflags pkgs + return (nub (filter notNull (concatMap libraryDirs ps))) + +getPackageLinkOpts :: DynFlags -> [PackageId] -> IO [String] +getPackageLinkOpts dflags pkgs = do + ps <- getExplicitPackagesAnd dflags pkgs + let tag = buildTag dflags + rts_tag = rtsBuildTag dflags + let + imp = if opt_Static then "" else "_dyn" + libs p = map ((++imp) . 
addSuffix) (hsLibraries p) + ++ hACK_dyn (extraLibraries p) + all_opts p = map ("-l" ++) (libs p) ++ ldOptions p + + suffix = if null tag then "" else '_':tag + rts_suffix = if null rts_tag then "" else '_':rts_tag + + addSuffix rts@"HSrts" = rts ++ rts_suffix + addSuffix other_lib = other_lib ++ suffix + + -- This is a hack that's even more horrible (and hopefully more temporary) + -- than the one below [referring to previous splittage of HSbase into chunks + -- to work around GNU ld bug]. HSbase_cbits and friends require the _dyn suffix + -- for dynamic linking, but not _p or other 'way' suffix. So we just add + -- _dyn to extraLibraries if they already have a _cbits suffix. + + hACK_dyn = map hack + where hack lib | not opt_Static && "_cbits" `isSuffixOf` lib = lib ++ "_dyn" + | otherwise = lib + + return (concat (map all_opts ps)) + +getPackageExtraCcOpts :: DynFlags -> [PackageId] -> IO [String] +getPackageExtraCcOpts dflags pkgs = do + ps <- getExplicitPackagesAnd dflags pkgs + return (concatMap ccOptions ps) + +getPackageFrameworkPath :: DynFlags -> [PackageId] -> IO [String] +getPackageFrameworkPath dflags pkgs = do + ps <- getExplicitPackagesAnd dflags pkgs + return (nub (filter notNull (concatMap frameworkDirs ps))) + +getPackageFrameworks :: DynFlags -> [PackageId] -> IO [String] +getPackageFrameworks dflags pkgs = do + ps <- getExplicitPackagesAnd dflags pkgs + return (concatMap frameworks ps) + +-- ----------------------------------------------------------------------------- +-- Package Utils + +-- | Takes a Module, and if the module is in a package returns +-- @(pkgconf,exposed)@ where pkgconf is the PackageConfig for that package, +-- and exposed is True if the package exposes the module. +lookupModuleInAllPackages :: DynFlags -> Module -> [(PackageConfig,Bool)] +lookupModuleInAllPackages dflags m = + case lookupModuleEnv (moduleToPkgConfAll (pkgState dflags)) m of + Nothing -> [] + Just ps -> ps + +getExplicitPackagesAnd :: DynFlags -> [PackageId] -> IO [PackageConfig] +getExplicitPackagesAnd dflags pkgids = + let + state = pkgState dflags + pkg_map = pkgIdMap state + expl = explicitPackages state + in do + all_pkgs <- throwErr (foldM (add_package pkg_map) expl pkgids) + return (map (getPackageDetails state) all_pkgs) + +-- Takes a list of packages, and returns the list with dependencies included, +-- in reverse dependency order (a package appears before those it depends on). 
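To make that ordering concrete (package names invented): if a depends on b and b depends on c, then

    closeDeps pkg_map [a]    -- yields [a, b, c]

because add_package, defined below, conses each package onto the closure of its own dependencies. Every package therefore precedes the packages it depends on, and a package with no dependencies of its own, such as rts, comes out last, which is the order the link step wants.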
+closeDeps :: PackageConfigMap -> [PackageId] -> IO [PackageId] +closeDeps pkg_map ps = throwErr (closeDepsErr pkg_map ps) + +throwErr :: MaybeErr Message a -> IO a +throwErr m = case m of + Failed e -> throwDyn (CmdLineError (showSDoc e)) + Succeeded r -> return r + +closeDepsErr :: PackageConfigMap -> [PackageId] + -> MaybeErr Message [PackageId] +closeDepsErr pkg_map ps = foldM (add_package pkg_map) [] ps + +-- internal helper +add_package :: PackageConfigMap -> [PackageId] -> PackageId + -> MaybeErr Message [PackageId] +add_package pkg_db ps p + | p `elem` ps = return ps -- Check if we've already added this package + | otherwise = + case lookupPackage pkg_db p of + Nothing -> Failed (missingPackageMsg (packageIdString p)) + Just pkg -> do + -- Add the package's dependents also + let deps = map mkPackageId (depends pkg) + ps' <- foldM (add_package pkg_db) ps deps + return (p : ps') + +missingPackageErr p = throwDyn (CmdLineError (showSDoc (missingPackageMsg p))) +missingPackageMsg p = ptext SLIT("unknown package:") <+> text p + +-- ----------------------------------------------------------------------------- +-- The home module set + +newtype HomeModules = HomeModules ModuleSet + +mkHomeModules :: [Module] -> HomeModules +mkHomeModules = HomeModules . mkModuleSet + +isHomeModule :: HomeModules -> Module -> Bool +isHomeModule (HomeModules set) mod = elemModuleSet mod set + +-- Determining whether a Name refers to something in another package or not. +-- Cross-package references need to be handled differently when dynamically- +-- linked libraries are involved. + +isDllName :: HomeModules -> Name -> Bool +isDllName pdeps name + | opt_Static = False + | Just mod <- nameModule_maybe name = not (isHomeModule pdeps mod) + | otherwise = False -- no, it is not even an external name + +-- ----------------------------------------------------------------------------- +-- Displaying packages + +dumpPackages :: DynFlags -> IO () +-- Show package info on console, if verbosity is >= 3 +dumpPackages dflags + = do let pkg_map = pkgIdMap (pkgState dflags) + putMsg dflags $ + vcat (map (text.showInstalledPackageInfo) (eltsUFM pkg_map)) +\end{code} diff --git a/compiler/main/Packages.lhs-boot b/compiler/main/Packages.lhs-boot new file mode 100644 index 0000000000..3a1712e2da --- /dev/null +++ b/compiler/main/Packages.lhs-boot @@ -0,0 +1,4 @@ +\begin{code} +module Packages where +data PackageState +\end{code} diff --git a/compiler/main/ParsePkgConf.y b/compiler/main/ParsePkgConf.y new file mode 100644 index 0000000000..901a5bc943 --- /dev/null +++ b/compiler/main/ParsePkgConf.y @@ -0,0 +1,153 @@ +{ +module ParsePkgConf( loadPackageConfig ) where + +#include "HsVersions.h" + +import PackageConfig +import Lexer +import DynFlags +import FastString +import StringBuffer +import ErrUtils ( mkLocMessage ) +import SrcLoc +import Outputable +import Panic ( GhcException(..) 
) +import EXCEPTION ( throwDyn ) + +} + +%token + '{' { L _ ITocurly } + '}' { L _ ITccurly } + '[' { L _ ITobrack } + ']' { L _ ITcbrack } + ',' { L _ ITcomma } + '=' { L _ ITequal } + VARID { L _ (ITvarid $$) } + CONID { L _ (ITconid $$) } + STRING { L _ (ITstring $$) } + INT { L _ (ITinteger $$) } + +%monad { P } { >>= } { return } +%lexer { lexer } { L _ ITeof } +%name parse +%tokentype { Located Token } +%% + +pkgconf :: { [ PackageConfig ] } + : '[' ']' { [] } + | '[' pkgs ']' { reverse $2 } + +pkgs :: { [ PackageConfig ] } + : pkg { [ $1 ] } + | pkgs ',' pkg { $3 : $1 } + +pkg :: { PackageConfig } + : CONID '{' fields '}' { $3 defaultPackageConfig } + +fields :: { PackageConfig -> PackageConfig } + : field { \p -> $1 p } + | fields ',' field { \p -> $1 ($3 p) } + +field :: { PackageConfig -> PackageConfig } + : VARID '=' pkgid + {% case unpackFS $1 of + "package" -> return (\p -> p{package = $3}) + _other -> happyError + } + + | VARID '=' STRING { id } + -- we aren't interested in the string fields, they're all + -- boring (copyright, maintainer etc.) + + | VARID '=' CONID + {% case unpackFS $1 of { + "exposed" -> + case unpackFS $3 of { + "True" -> return (\p -> p{exposed=True}); + "False" -> return (\p -> p{exposed=False}); + _ -> happyError }; + "license" -> return id; -- not interested + _ -> happyError } + } + + | VARID '=' CONID STRING { id } + -- another case of license + + | VARID '=' strlist + {\p -> case unpackFS $1 of + "exposedModules" -> p{exposedModules = $3} + "hiddenModules" -> p{hiddenModules = $3} + "importDirs" -> p{importDirs = $3} + "libraryDirs" -> p{libraryDirs = $3} + "hsLibraries" -> p{hsLibraries = $3} + "extraLibraries" -> p{extraLibraries = $3} + "extraGHCiLibraries"-> p{extraGHCiLibraries= $3} + "includeDirs" -> p{includeDirs = $3} + "includes" -> p{includes = $3} + "hugsOptions" -> p{hugsOptions = $3} + "ccOptions" -> p{ccOptions = $3} + "ldOptions" -> p{ldOptions = $3} + "frameworkDirs" -> p{frameworkDirs = $3} + "frameworks" -> p{frameworks = $3} + "haddockInterfaces" -> p{haddockInterfaces = $3} + "haddockHTMLs" -> p{haddockHTMLs = $3} + "depends" -> p{depends = []} + -- empty list only, non-empty handled below + other -> p + } + + | VARID '=' pkgidlist + {% case unpackFS $1 of + "depends" -> return (\p -> p{depends = $3}) + _other -> happyError + } + +pkgid :: { PackageIdentifier } + : CONID '{' VARID '=' STRING ',' VARID '=' version '}' + { PackageIdentifier{ pkgName = unpackFS $5, + pkgVersion = $9 } } + +version :: { Version } + : CONID '{' VARID '=' intlist ',' VARID '=' strlist '}' + { Version{ versionBranch=$5, versionTags=$9 } } + +pkgidlist :: { [PackageIdentifier] } + : '[' pkgids ']' { $2 } + -- empty list case is covered by strlist, to avoid conflicts + +pkgids :: { [PackageIdentifier] } + : pkgid { [ $1 ] } + | pkgid ',' pkgids { $1 : $3 } + +intlist :: { [Int] } + : '[' ']' { [] } + | '[' ints ']' { $2 } + +ints :: { [Int] } + : INT { [ fromIntegral $1 ] } + | INT ',' ints { fromIntegral $1 : $3 } + +strlist :: { [String] } + : '[' ']' { [] } + | '[' strs ']' { $2 } + +strs :: { [String] } + : STRING { [ unpackFS $1 ] } + | STRING ',' strs { unpackFS $1 : $3 } + +{ +happyError :: P a +happyError = srcParseFail + +loadPackageConfig :: FilePath -> IO [PackageConfig] +loadPackageConfig conf_filename = do + buf <- hGetStringBuffer conf_filename + let loc = mkSrcLoc (mkFastString conf_filename) 1 0 + case unP parse (mkPState buf loc defaultDynFlags) of + PFailed span err -> + throwDyn (InstallationError (showSDoc (mkLocMessage span err))) + 
+ POk _ pkg_details -> do + return pkg_details +} diff --git a/compiler/main/PprTyThing.hs b/compiler/main/PprTyThing.hs new file mode 100644 index 0000000000..2763b052fd --- /dev/null +++ b/compiler/main/PprTyThing.hs @@ -0,0 +1,223 @@ +----------------------------------------------------------------------------- +-- +-- Pretty-printing TyThings +-- +-- (c) The GHC Team 2005 +-- +----------------------------------------------------------------------------- + +module PprTyThing ( + pprTyThing, + pprTyThingInContext, + pprTyThingLoc, + pprTyThingInContextLoc, + pprTyThingHdr + ) where + +#include "HsVersions.h" + +import qualified GHC +import GHC ( TyThing(..), SrcLoc ) +import Outputable + +-- ----------------------------------------------------------------------------- +-- Pretty-printing entities that we get from the GHC API + +-- This should be a good source of sample code for using the GHC API to +-- inspect source code entities. + +-- | Pretty-prints a 'TyThing' with its defining location. +pprTyThingLoc :: Bool -> TyThing -> SDoc +pprTyThingLoc exts tyThing + = showWithLoc loc (pprTyThing exts tyThing) + where loc = GHC.nameSrcLoc (GHC.getName tyThing) + +-- | Pretty-prints a 'TyThing'. +pprTyThing :: Bool -> TyThing -> SDoc +pprTyThing exts (AnId id) = pprId exts id +pprTyThing exts (ADataCon dataCon) = pprDataConSig exts dataCon +pprTyThing exts (ATyCon tyCon) = pprTyCon exts tyCon +pprTyThing exts (AClass cls) = pprClass exts cls + +-- | Like 'pprTyThingInContext', but adds the defining location. +pprTyThingInContextLoc :: Bool -> TyThing -> SDoc +pprTyThingInContextLoc exts tyThing + = showWithLoc loc (pprTyThingInContext exts tyThing) + where loc = GHC.nameSrcLoc (GHC.getName tyThing) + +-- | Pretty-prints a 'TyThing' in context: that is, if the entity +-- is a data constructor, record selector, or class method, then +-- the entity's parent declaration is pretty-printed with irrelevant +-- parts omitted. +pprTyThingInContext :: Bool -> TyThing -> SDoc +pprTyThingInContext exts (AnId id) = pprIdInContext exts id +pprTyThingInContext exts (ADataCon dataCon) = pprDataCon exts dataCon +pprTyThingInContext exts (ATyCon tyCon) = pprTyCon exts tyCon +pprTyThingInContext exts (AClass cls) = pprClass exts cls + +-- | Pretty-prints the 'TyThing' header. For functions and data constructors +-- the function is equivalent to 'pprTyThing' but for type constructors +-- and classes it prints only the header part of the declaration. 
+pprTyThingHdr :: Bool -> TyThing -> SDoc +pprTyThingHdr exts (AnId id) = pprId exts id +pprTyThingHdr exts (ADataCon dataCon) = pprDataConSig exts dataCon +pprTyThingHdr exts (ATyCon tyCon) = pprTyConHdr exts tyCon +pprTyThingHdr exts (AClass cls) = pprClassHdr exts cls + +pprTyConHdr exts tyCon = + ptext keyword <+> ppr_bndr tyCon <+> hsep (map ppr vars) + where + vars | GHC.isPrimTyCon tyCon || + GHC.isFunTyCon tyCon = take (GHC.tyConArity tyCon) GHC.alphaTyVars + | otherwise = GHC.tyConTyVars tyCon + + keyword | GHC.isSynTyCon tyCon = SLIT("type") + | GHC.isNewTyCon tyCon = SLIT("newtype") + | otherwise = SLIT("data") + +pprDataConSig exts dataCon = + ppr_bndr dataCon <+> dcolon <+> pprType exts (GHC.dataConType dataCon) + +pprClassHdr exts cls = + let (tyVars, funDeps) = GHC.classTvsFds cls + in ptext SLIT("class") <+> + GHC.pprThetaArrow (GHC.classSCTheta cls) <+> + ppr_bndr cls <+> + hsep (map ppr tyVars) <+> + GHC.pprFundeps funDeps + +pprIdInContext exts id + | GHC.isRecordSelector id = pprRecordSelector exts id + | Just cls <- GHC.isClassOpId_maybe id = pprClassOneMethod exts cls id + | otherwise = pprId exts id + +pprRecordSelector exts id + = pprAlgTyCon exts tyCon show_con show_label + where + (tyCon,label) = GHC.recordSelectorFieldLabel id + show_con dataCon = label `elem` GHC.dataConFieldLabels dataCon + show_label label' = label == label' + +pprId exts id + = hang (ppr_bndr id <+> dcolon) 2 + (pprType exts (GHC.idType id)) + +pprType True ty = ppr ty +pprType False ty = ppr (GHC.dropForAlls ty) + +pprTyCon exts tyCon + | GHC.isSynTyCon tyCon + = let rhs_type = GHC.synTyConRhs tyCon + in hang (pprTyConHdr exts tyCon <+> equals) 2 (pprType exts rhs_type) + | otherwise + = pprAlgTyCon exts tyCon (const True) (const True) + +pprAlgTyCon exts tyCon ok_con ok_label + | gadt = pprTyConHdr exts tyCon <+> ptext SLIT("where") $$ + nest 2 (vcat (ppr_trim show_con datacons)) + | otherwise = hang (pprTyConHdr exts tyCon) + 2 (add_bars (ppr_trim show_con datacons)) + where + datacons = GHC.tyConDataCons tyCon + gadt = any (not . GHC.isVanillaDataCon) datacons + + show_con dataCon + | ok_con dataCon = Just (pprDataConDecl exts gadt ok_label dataCon) + | otherwise = Nothing + +pprDataCon exts dataCon = pprAlgTyCon exts tyCon (== dataCon) (const True) + where tyCon = GHC.dataConTyCon dataCon + +pprDataConDecl exts gadt_style show_label dataCon + | not gadt_style = ppr_fields tys_w_strs + | otherwise = ppr_bndr dataCon <+> dcolon <+> + sep [ ppr_tvs, GHC.pprThetaArrow theta, pp_tau ] + where + (tyvars, theta, argTypes, tyCon, res_tys) = GHC.dataConSig dataCon + labels = GHC.dataConFieldLabels dataCon + qualVars = filter (flip notElem (GHC.tyConTyVars tyCon)) tyvars + stricts = GHC.dataConStrictMarks dataCon + tys_w_strs = zip stricts argTypes + + ppr_tvs + | null qualVars = empty + | otherwise = ptext SLIT("forall") <+> + hsep (map ppr qualVars) <> dot + + -- printing out the dataCon as a type signature, in GADT style + pp_tau = foldr add pp_res_ty tys_w_strs + pp_res_ty = ppr_bndr tyCon <+> hsep (map GHC.pprParendType res_tys) + add (str,ty) pp_ty = pprBangTy str ty <+> arrow <+> pp_ty + + pprParendBangTy (strict,ty) + | GHC.isMarkedStrict strict = char '!' <> GHC.pprParendType ty + | otherwise = GHC.pprParendType ty + + pprBangTy strict ty + | GHC.isMarkedStrict strict = char '!' 
<> ppr ty + | otherwise = ppr ty + + maybe_show_label (lbl,(strict,tp)) + | show_label lbl = Just (ppr lbl <+> dcolon <+> pprBangTy strict tp) + | otherwise = Nothing + + ppr_fields [ty1, ty2] + | GHC.dataConIsInfix dataCon && null labels + = sep [pprParendBangTy ty1, ppr dataCon, pprParendBangTy ty2] + ppr_fields fields + | null labels + = ppr_bndr dataCon <+> sep (map pprParendBangTy fields) + | otherwise + = ppr_bndr dataCon <+> + braces (sep (punctuate comma (ppr_trim maybe_show_label + (zip labels fields)))) + +pprClass exts cls + | null methods = + pprClassHdr exts cls + | otherwise = + hang (pprClassHdr exts cls <+> ptext SLIT("where")) + 2 (vcat (map (pprClassMethod exts) methods)) + where + methods = GHC.classMethods cls + +pprClassOneMethod exts cls this_one = + hang (pprClassHdr exts cls <+> ptext SLIT("where")) + 2 (vcat (ppr_trim show_meth methods)) + where + methods = GHC.classMethods cls + show_meth id | id == this_one = Just (pprClassMethod exts id) + | otherwise = Nothing + +pprClassMethod exts id = + hang (ppr_bndr id <+> dcolon) 2 (pprType exts (classOpType id)) + where + -- Here's the magic incantation to strip off the dictionary + -- from the class op type. Stolen from IfaceSyn.tyThingToIfaceDecl. + classOpType id = GHC.funResultTy rho_ty + where (_sel_tyvars, rho_ty) = GHC.splitForAllTys (GHC.idType id) + +ppr_trim :: (a -> Maybe SDoc) -> [a] -> [SDoc] +ppr_trim show xs + = snd (foldr go (False, []) xs) + where + go x (eliding, so_far) + | Just doc <- show x = (False, doc : so_far) + | otherwise = if eliding then (True, so_far) + else (True, ptext SLIT("...") : so_far) + +add_bars [] = empty +add_bars [c] = equals <+> c +add_bars (c:cs) = sep ((equals <+> c) : map (char '|' <+>) cs) + +-- Wrap operators in () +ppr_bndr :: GHC.NamedThing a => a -> SDoc +ppr_bndr a = GHC.pprParenSymName a + +showWithLoc :: SrcLoc -> SDoc -> SDoc +showWithLoc loc doc + = hang doc 2 (char '\t' <> comment <+> GHC.pprDefnLoc loc) + -- The tab tries to make them line up a bit + where + comment = ptext SLIT("--") + diff --git a/compiler/main/StaticFlags.hs b/compiler/main/StaticFlags.hs new file mode 100644 index 0000000000..3067063f7b --- /dev/null +++ b/compiler/main/StaticFlags.hs @@ -0,0 +1,584 @@ +----------------------------------------------------------------------------- +-- +-- Static flags +-- +-- Static flags can only be set once, on the command-line. Inside GHC, +-- each static flag corresponds to a top-level value, usually of type Bool. 
+-- +-- (c) The University of Glasgow 2005 +-- +----------------------------------------------------------------------------- + +module StaticFlags ( + parseStaticFlags, + staticFlags, + + -- Ways + WayName(..), v_Ways, v_Build_tag, v_RTS_Build_tag, + + -- Output style options + opt_PprUserLength, + opt_PprStyle_Debug, + + -- profiling opts + opt_AutoSccsOnAllToplevs, + opt_AutoSccsOnExportedToplevs, + opt_AutoSccsOnIndividualCafs, + opt_SccProfilingOn, + opt_DoTickyProfiling, + + -- language opts + opt_DictsStrict, + opt_MaxContextReductionDepth, + opt_IrrefutableTuples, + opt_Parallel, + opt_RuntimeTypes, + opt_Flatten, + + -- optimisation opts + opt_NoMethodSharing, + opt_NoStateHack, + opt_LiberateCaseThreshold, + opt_CprOff, + opt_RulesOff, + opt_SimplNoPreInlining, + opt_SimplExcessPrecision, + opt_MaxWorkerArgs, + + -- Unfolding control + opt_UF_CreationThreshold, + opt_UF_UseThreshold, + opt_UF_FunAppDiscount, + opt_UF_KeenessFactor, + opt_UF_UpdateInPlace, + opt_UF_DearOp, + + -- misc opts + opt_IgnoreDotGhci, + opt_ErrorSpans, + opt_EmitCExternDecls, + opt_GranMacros, + opt_HiVersion, + opt_HistorySize, + opt_OmitBlackHoling, + opt_Static, + opt_Unregisterised, + opt_EmitExternalCore, + opt_PIC, + v_Ld_inputs, + ) where + +#include "HsVersions.h" + +import Util ( consIORef ) +import CmdLineParser +import Config ( cProjectVersionInt, cProjectPatchLevel, + cGhcUnregisterised ) +import FastString ( FastString, mkFastString ) +import Util +import Maybes ( firstJust ) +import Panic ( GhcException(..), ghcError ) +import Constants ( mAX_CONTEXT_REDUCTION_DEPTH ) + +import EXCEPTION ( throwDyn ) +import DATA_IOREF +import UNSAFE_IO ( unsafePerformIO ) +import Monad ( when ) +import Char ( isDigit ) +import List ( sort, intersperse ) + +----------------------------------------------------------------------------- +-- Static flags + +parseStaticFlags :: [String] -> IO [String] +parseStaticFlags args = do + (leftover, errs) <- processArgs static_flags args + when (not (null errs)) $ throwDyn (UsageError (unlines errs)) + + -- deal with the way flags: the way (eg. prof) gives rise to + -- futher flags, some of which might be static. + way_flags <- findBuildTag + + -- if we're unregisterised, add some more flags + let unreg_flags | cGhcUnregisterised == "YES" = unregFlags + | otherwise = [] + + (more_leftover, errs) <- processArgs static_flags (unreg_flags ++ way_flags) + when (not (null errs)) $ ghcError (UsageError (unlines errs)) + return (more_leftover++leftover) + + +-- note that ordering is important in the following list: any flag which +-- is a prefix flag (i.e. HasArg, Prefix, OptPrefix, AnySuffix) will override +-- flags further down the list with the same prefix. + +static_flags :: [(String, OptKind IO)] +static_flags = [ + ------- GHCi ------------------------------------------------------- + ( "ignore-dot-ghci", PassFlag addOpt ) + , ( "read-dot-ghci" , NoArg (removeOpt "-ignore-dot-ghci") ) + + ------- ways -------------------------------------------------------- + , ( "prof" , NoArg (addWay WayProf) ) + , ( "unreg" , NoArg (addWay WayUnreg) ) + , ( "ticky" , NoArg (addWay WayTicky) ) + , ( "parallel" , NoArg (addWay WayPar) ) + , ( "gransim" , NoArg (addWay WayGran) ) + , ( "smp" , NoArg (addWay WayThreaded) ) -- backwards compat. 
+ , ( "debug" , NoArg (addWay WayDebug) ) + , ( "ndp" , NoArg (addWay WayNDP) ) + , ( "threaded" , NoArg (addWay WayThreaded) ) + -- ToDo: user ways + + ------ Debugging ---------------------------------------------------- + , ( "dppr-noprags", PassFlag addOpt ) + , ( "dppr-debug", PassFlag addOpt ) + , ( "dppr-user-length", AnySuffix addOpt ) + -- rest of the debugging flags are dynamic + + --------- Profiling -------------------------------------------------- + , ( "auto-all" , NoArg (addOpt "-fauto-sccs-on-all-toplevs") ) + , ( "auto" , NoArg (addOpt "-fauto-sccs-on-exported-toplevs") ) + , ( "caf-all" , NoArg (addOpt "-fauto-sccs-on-individual-cafs") ) + -- "ignore-sccs" doesn't work (ToDo) + + , ( "no-auto-all" , NoArg (removeOpt "-fauto-sccs-on-all-toplevs") ) + , ( "no-auto" , NoArg (removeOpt "-fauto-sccs-on-exported-toplevs") ) + , ( "no-caf-all" , NoArg (removeOpt "-fauto-sccs-on-individual-cafs") ) + + ------- Miscellaneous ----------------------------------------------- + , ( "no-link-chk" , NoArg (return ()) ) -- ignored for backwards compat + + ----- Linker -------------------------------------------------------- + , ( "static" , PassFlag addOpt ) + , ( "dynamic" , NoArg (removeOpt "-static") ) + , ( "rdynamic" , NoArg (return ()) ) -- ignored for compat w/ gcc + + ----- RTS opts ------------------------------------------------------ + , ( "H" , HasArg (setHeapSize . fromIntegral . decodeSize) ) + , ( "Rghc-timing" , NoArg (enableTimingStats) ) + + ------ Compiler flags ----------------------------------------------- + -- All other "-fno-<blah>" options cancel out "-f<blah>" on the hsc cmdline + , ( "fno-", PrefixPred (\s -> isStaticFlag ("f"++s)) + (\s -> removeOpt ("-f"++s)) ) + + -- Pass all remaining "-f<blah>" options to hsc + , ( "f", AnySuffixPred (isStaticFlag) addOpt ) + ] + +addOpt = consIORef v_opt_C + +addWay = consIORef v_Ways + +removeOpt f = do + fs <- readIORef v_opt_C + writeIORef v_opt_C $! filter (/= f) fs + +lookUp :: FastString -> Bool +lookup_def_int :: String -> Int -> Int +lookup_def_float :: String -> Float -> Float +lookup_str :: String -> Maybe String + +-- holds the static opts while they're being collected, before +-- being unsafely read by unpacked_static_opts below. +GLOBAL_VAR(v_opt_C, defaultStaticOpts, [String]) +staticFlags = unsafePerformIO (readIORef v_opt_C) + +-- -static is the default +defaultStaticOpts = ["-static"] + +packed_static_opts = map mkFastString staticFlags + +lookUp sw = sw `elem` packed_static_opts + +-- (lookup_str "foo") looks for the flag -foo=X or -fooX, +-- and returns the string X +lookup_str sw + = case firstJust (map (startsWith sw) staticFlags) of + Just ('=' : str) -> Just str + Just str -> Just str + Nothing -> Nothing + +lookup_def_int sw def = case (lookup_str sw) of + Nothing -> def -- Use default + Just xx -> try_read sw xx + +lookup_def_float sw def = case (lookup_str sw) of + Nothing -> def -- Use default + Just xx -> try_read sw xx + + +try_read :: Read a => String -> String -> a +-- (try_read sw str) tries to read s; if it fails, it +-- bleats about flag sw +try_read sw str + = case reads str of + ((x,_):_) -> x -- Be forgiving: ignore trailing goop, and alternative parses + [] -> ghcError (UsageError ("Malformed argument " ++ str ++ " for flag " ++ sw)) + -- ToDo: hack alert. We should really parse the arugments + -- and announce errors in a more civilised way. 
+ + +{- + Putting the compiler options into temporary at-files + may turn out to be necessary later on if we turn hsc into + a pure Win32 application where I think there's a command-line + length limit of 255. unpacked_opts understands the @ option. + +unpacked_opts :: [String] +unpacked_opts = + concat $ + map (expandAts) $ + map unpackFS argv -- NOT ARGV any more: v_Static_hsc_opts + where + expandAts ('@':fname) = words (unsafePerformIO (readFile fname)) + expandAts l = [l] +-} + + +opt_IgnoreDotGhci = lookUp FSLIT("-ignore-dot-ghci") + +-- debugging opts +opt_PprStyle_Debug = lookUp FSLIT("-dppr-debug") +opt_PprUserLength = lookup_def_int "-dppr-user-length" 5 --ToDo: give this a name + +-- profiling opts +opt_AutoSccsOnAllToplevs = lookUp FSLIT("-fauto-sccs-on-all-toplevs") +opt_AutoSccsOnExportedToplevs = lookUp FSLIT("-fauto-sccs-on-exported-toplevs") +opt_AutoSccsOnIndividualCafs = lookUp FSLIT("-fauto-sccs-on-individual-cafs") +opt_SccProfilingOn = lookUp FSLIT("-fscc-profiling") +opt_DoTickyProfiling = lookUp FSLIT("-fticky-ticky") + +-- language opts +opt_DictsStrict = lookUp FSLIT("-fdicts-strict") +opt_IrrefutableTuples = lookUp FSLIT("-firrefutable-tuples") +opt_MaxContextReductionDepth = lookup_def_int "-fcontext-stack" mAX_CONTEXT_REDUCTION_DEPTH +opt_Parallel = lookUp FSLIT("-fparallel") +opt_Flatten = lookUp FSLIT("-fflatten") + +-- optimisation opts +opt_NoStateHack = lookUp FSLIT("-fno-state-hack") +opt_NoMethodSharing = lookUp FSLIT("-fno-method-sharing") +opt_CprOff = lookUp FSLIT("-fcpr-off") +opt_RulesOff = lookUp FSLIT("-frules-off") + -- Switch off CPR analysis in the new demand analyser +opt_LiberateCaseThreshold = lookup_def_int "-fliberate-case-threshold" (10::Int) +opt_MaxWorkerArgs = lookup_def_int "-fmax-worker-args" (10::Int) + +opt_EmitCExternDecls = lookUp FSLIT("-femit-extern-decls") +opt_GranMacros = lookUp FSLIT("-fgransim") +opt_HiVersion = read (cProjectVersionInt ++ cProjectPatchLevel) :: Int +opt_HistorySize = lookup_def_int "-fhistory-size" 20 +opt_OmitBlackHoling = lookUp FSLIT("-dno-black-holing") +opt_RuntimeTypes = lookUp FSLIT("-fruntime-types") + +-- Simplifier switches +opt_SimplNoPreInlining = lookUp FSLIT("-fno-pre-inlining") + -- NoPreInlining is there just to see how bad things + -- get if you don't do it! +opt_SimplExcessPrecision = lookUp FSLIT("-fexcess-precision") + +-- Unfolding control +opt_UF_CreationThreshold = lookup_def_int "-funfolding-creation-threshold" (45::Int) +opt_UF_UseThreshold = lookup_def_int "-funfolding-use-threshold" (8::Int) -- Discounts can be big +opt_UF_FunAppDiscount = lookup_def_int "-funfolding-fun-discount" (6::Int) -- It's great to inline a fn +opt_UF_KeenessFactor = lookup_def_float "-funfolding-keeness-factor" (1.5::Float) +opt_UF_UpdateInPlace = lookUp FSLIT("-funfolding-update-in-place") + +opt_UF_DearOp = ( 4 :: Int) + +opt_Static = lookUp FSLIT("-static") +opt_Unregisterised = lookUp FSLIT("-funregisterised") +opt_EmitExternalCore = lookUp FSLIT("-fext-core") + +-- Include full span info in error messages, instead of just the start position. +opt_ErrorSpans = lookUp FSLIT("-ferror-spans") + +opt_PIC = lookUp FSLIT("-fPIC") + +-- object files and libraries to be linked in are collected here. +-- ToDo: perhaps this could be done without a global, it wasn't obvious +-- how to do it though --SDM. 
+GLOBAL_VAR(v_Ld_inputs, [], [String]) + +isStaticFlag f = + f `elem` [ + "fauto-sccs-on-all-toplevs", + "fauto-sccs-on-exported-toplevs", + "fauto-sccs-on-individual-cafs", + "fscc-profiling", + "fticky-ticky", + "fall-strict", + "fdicts-strict", + "firrefutable-tuples", + "fparallel", + "fflatten", + "fsemi-tagging", + "flet-no-escape", + "femit-extern-decls", + "fglobalise-toplev-names", + "fgransim", + "fno-hi-version-check", + "dno-black-holing", + "fno-method-sharing", + "fno-state-hack", + "fruntime-types", + "fno-pre-inlining", + "fexcess-precision", + "funfolding-update-in-place", + "static", + "funregisterised", + "fext-core", + "frule-check", + "frules-off", + "fcpr-off", + "ferror-spans", + "fPIC" + ] + || any (flip prefixMatch f) [ + "fcontext-stack", + "fliberate-case-threshold", + "fmax-worker-args", + "fhistory-size", + "funfolding-creation-threshold", + "funfolding-use-threshold", + "funfolding-fun-discount", + "funfolding-keeness-factor" + ] + + + +-- Misc functions for command-line options + +startsWith :: String -> String -> Maybe String +-- startsWith pfx (pfx++rest) = Just rest + +startsWith [] str = Just str +startsWith (c:cs) (s:ss) + = if c /= s then Nothing else startsWith cs ss +startsWith _ [] = Nothing + + +----------------------------------------------------------------------------- +-- convert sizes like "3.5M" into integers + +decodeSize :: String -> Integer +decodeSize str + | c == "" = truncate n + | c == "K" || c == "k" = truncate (n * 1000) + | c == "M" || c == "m" = truncate (n * 1000 * 1000) + | c == "G" || c == "g" = truncate (n * 1000 * 1000 * 1000) + | otherwise = throwDyn (CmdLineError ("can't decode size: " ++ str)) + where (m, c) = span pred str + n = read m :: Double + pred c = isDigit c || c == '.' + + +----------------------------------------------------------------------------- +-- RTS Hooks + +#if __GLASGOW_HASKELL__ >= 504 +foreign import ccall unsafe "setHeapSize" setHeapSize :: Int -> IO () +foreign import ccall unsafe "enableTimingStats" enableTimingStats :: IO () +#else +foreign import "setHeapSize" unsafe setHeapSize :: Int -> IO () +foreign import "enableTimingStats" unsafe enableTimingStats :: IO () +#endif + +----------------------------------------------------------------------------- +-- Ways + +-- The central concept of a "way" is that all objects in a given +-- program must be compiled in the same "way". Certain options change +-- parameters of the virtual machine, eg. profiling adds an extra word +-- to the object header, so profiling objects cannot be linked with +-- non-profiling objects. + +-- After parsing the command-line options, we determine which "way" we +-- are building - this might be a combination way, eg. profiling+ticky-ticky. + +-- We then find the "build-tag" associated with this way, and this +-- becomes the suffix used to find .hi files and libraries used in +-- this compilation. + +GLOBAL_VAR(v_Build_tag, "", String) + +-- The RTS has its own build tag, because there are some ways that +-- affect the RTS only. 
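As a worked example of the two tags, anticipating findBuildTag and way_details below (writing way_thr and way_p for the corresponding Way records): compiling with -threaded -prof selects the ways [WayThreaded, WayProf], and WayThreaded is marked RTS-only, so

    build tag      =  mkBuildTag [way_p]           =  "p"       (HSbase_p, ...)
    RTS build tag  =  mkBuildTag [way_thr, way_p]  =  "thr_p"   (HSrts_thr_p)

which is the split that getPackageLinkOpts in Packages.lhs relies on when it appends the RTS suffix to HSrts and the plain suffix to every other Haskell library.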
+GLOBAL_VAR(v_RTS_Build_tag, "", String) + +data WayName + = WayThreaded + | WayDebug + | WayProf + | WayUnreg + | WayTicky + | WayPar + | WayGran + | WayNDP + | WayUser_a + | WayUser_b + | WayUser_c + | WayUser_d + | WayUser_e + | WayUser_f + | WayUser_g + | WayUser_h + | WayUser_i + | WayUser_j + | WayUser_k + | WayUser_l + | WayUser_m + | WayUser_n + | WayUser_o + | WayUser_A + | WayUser_B + deriving (Eq,Ord) + +GLOBAL_VAR(v_Ways, [] ,[WayName]) + +allowed_combination way = and [ x `allowedWith` y + | x <- way, y <- way, x < y ] + where + -- Note ordering in these tests: the left argument is + -- <= the right argument, according to the Ord instance + -- on Way above. + + -- debug is allowed with everything + _ `allowedWith` WayDebug = True + WayDebug `allowedWith` _ = True + + WayThreaded `allowedWith` WayProf = True + WayProf `allowedWith` WayUnreg = True + WayProf `allowedWith` WayNDP = True + _ `allowedWith` _ = False + + +findBuildTag :: IO [String] -- new options +findBuildTag = do + way_names <- readIORef v_Ways + let ws = sort way_names + if not (allowed_combination ws) + then throwDyn (CmdLineError $ + "combination not supported: " ++ + foldr1 (\a b -> a ++ '/':b) + (map (wayName . lkupWay) ws)) + else let ways = map lkupWay ws + tag = mkBuildTag (filter (not.wayRTSOnly) ways) + rts_tag = mkBuildTag ways + flags = map wayOpts ways + in do + writeIORef v_Build_tag tag + writeIORef v_RTS_Build_tag rts_tag + return (concat flags) + +mkBuildTag :: [Way] -> String +mkBuildTag ways = concat (intersperse "_" (map wayTag ways)) + +lkupWay w = + case lookup w way_details of + Nothing -> error "findBuildTag" + Just details -> details + +data Way = Way { + wayTag :: String, + wayRTSOnly :: Bool, + wayName :: String, + wayOpts :: [String] + } + +way_details :: [ (WayName, Way) ] +way_details = + [ (WayThreaded, Way "thr" True "Threaded" [ +#if defined(freebsd_TARGET_OS) + "-optc-pthread" + , "-optl-pthread" +#endif + ] ), + + (WayDebug, Way "debug" True "Debug" [] ), + + (WayProf, Way "p" False "Profiling" + [ "-fscc-profiling" + , "-DPROFILING" + , "-optc-DPROFILING" ]), + + (WayTicky, Way "t" False "Ticky-ticky Profiling" + [ "-fticky-ticky" + , "-DTICKY_TICKY" + , "-optc-DTICKY_TICKY" ]), + + (WayUnreg, Way "u" False "Unregisterised" + unregFlags ), + + -- optl's below to tell linker where to find the PVM library -- HWL + (WayPar, Way "mp" False "Parallel" + [ "-fparallel" + , "-D__PARALLEL_HASKELL__" + , "-optc-DPAR" + , "-package concurrent" + , "-optc-w" + , "-optl-L${PVM_ROOT}/lib/${PVM_ARCH}" + , "-optl-lpvm3" + , "-optl-lgpvm3" ]), + + -- at the moment we only change the RTS and could share compiler and libs! 
+ (WayPar, Way "mt" False "Parallel ticky profiling" + [ "-fparallel" + , "-D__PARALLEL_HASKELL__" + , "-optc-DPAR" + , "-optc-DPAR_TICKY" + , "-package concurrent" + , "-optc-w" + , "-optl-L${PVM_ROOT}/lib/${PVM_ARCH}" + , "-optl-lpvm3" + , "-optl-lgpvm3" ]), + + (WayPar, Way "md" False "Distributed" + [ "-fparallel" + , "-D__PARALLEL_HASKELL__" + , "-D__DISTRIBUTED_HASKELL__" + , "-optc-DPAR" + , "-optc-DDIST" + , "-package concurrent" + , "-optc-w" + , "-optl-L${PVM_ROOT}/lib/${PVM_ARCH}" + , "-optl-lpvm3" + , "-optl-lgpvm3" ]), + + (WayGran, Way "mg" False "GranSim" + [ "-fgransim" + , "-D__GRANSIM__" + , "-optc-DGRAN" + , "-package concurrent" ]), + + (WayNDP, Way "ndp" False "Nested data parallelism" + [ "-fparr" + , "-fflatten"]), + + (WayUser_a, Way "a" False "User way 'a'" ["$WAY_a_REAL_OPTS"]), + (WayUser_b, Way "b" False "User way 'b'" ["$WAY_b_REAL_OPTS"]), + (WayUser_c, Way "c" False "User way 'c'" ["$WAY_c_REAL_OPTS"]), + (WayUser_d, Way "d" False "User way 'd'" ["$WAY_d_REAL_OPTS"]), + (WayUser_e, Way "e" False "User way 'e'" ["$WAY_e_REAL_OPTS"]), + (WayUser_f, Way "f" False "User way 'f'" ["$WAY_f_REAL_OPTS"]), + (WayUser_g, Way "g" False "User way 'g'" ["$WAY_g_REAL_OPTS"]), + (WayUser_h, Way "h" False "User way 'h'" ["$WAY_h_REAL_OPTS"]), + (WayUser_i, Way "i" False "User way 'i'" ["$WAY_i_REAL_OPTS"]), + (WayUser_j, Way "j" False "User way 'j'" ["$WAY_j_REAL_OPTS"]), + (WayUser_k, Way "k" False "User way 'k'" ["$WAY_k_REAL_OPTS"]), + (WayUser_l, Way "l" False "User way 'l'" ["$WAY_l_REAL_OPTS"]), + (WayUser_m, Way "m" False "User way 'm'" ["$WAY_m_REAL_OPTS"]), + (WayUser_n, Way "n" False "User way 'n'" ["$WAY_n_REAL_OPTS"]), + (WayUser_o, Way "o" False "User way 'o'" ["$WAY_o_REAL_OPTS"]), + (WayUser_A, Way "A" False "User way 'A'" ["$WAY_A_REAL_OPTS"]), + (WayUser_B, Way "B" False "User way 'B'" ["$WAY_B_REAL_OPTS"]) + ] + +unregFlags = + [ "-optc-DNO_REGS" + , "-optc-DUSE_MINIINTERPRETER" + , "-fno-asm-mangling" + , "-funregisterised" + , "-fvia-C" ] diff --git a/compiler/main/SysTools.lhs b/compiler/main/SysTools.lhs new file mode 100644 index 0000000000..eee3e1a383 --- /dev/null +++ b/compiler/main/SysTools.lhs @@ -0,0 +1,817 @@ +----------------------------------------------------------------------------- +-- +-- (c) The University of Glasgow 2001-2003 +-- +-- Access to system tools: gcc, cp, rm etc +-- +----------------------------------------------------------------------------- + +\begin{code} +module SysTools ( + -- Initialisation + initSysTools, + + getTopDir, -- IO String -- The value of $topdir + getPackageConfigPath, -- IO String -- Where package.conf is + getUsageMsgPaths, -- IO (String,String) + + -- Interface to system tools + runUnlit, runCpp, runCc, -- [Option] -> IO () + runPp, -- [Option] -> IO () + runMangle, runSplit, -- [Option] -> IO () + runAs, runLink, -- [Option] -> IO () + runMkDLL, + + touch, -- String -> String -> IO () + copy, -- String -> String -> String -> IO () + normalisePath, -- FilePath -> FilePath + + -- Temporary-file management + setTmpDir, + newTempName, + cleanTempFiles, cleanTempFilesExcept, + addFilesToClean, + + -- System interface + system, -- String -> IO ExitCode + + -- Misc + getSysMan, -- IO String Parallel system only + + Option(..) + + ) where + +#include "HsVersions.h" + +import DriverPhases ( isHaskellUserSrcFilename ) +import Config +import Outputable +import ErrUtils ( putMsg, debugTraceMsg, showPass, Severity(..), Messages ) +import Panic ( GhcException(..) 
) +import Util ( Suffix, global, notNull, consIORef, joinFileName, + normalisePath, pgmPath, platformPath, joinFileExt ) +import DynFlags ( DynFlags(..), DynFlag(..), dopt, Option(..), + setTmpDir, defaultDynFlags ) + +import EXCEPTION ( throwDyn, finally ) +import DATA_IOREF ( IORef, readIORef, writeIORef ) +import DATA_INT + +import Monad ( when, unless ) +import System ( ExitCode(..), getEnv, system ) +import IO ( try, catch, hGetContents, + openFile, hPutStr, hClose, hFlush, IOMode(..), + stderr, ioError, isDoesNotExistError ) +import Directory ( doesFileExist, removeFile ) +import Maybe ( isJust ) +import List ( partition ) + +-- GHC <= 4.08 didn't have rawSystem, and runs into problems with long command +-- lines on mingw32, so we disallow it now. +#if __GLASGOW_HASKELL__ < 500 +#error GHC >= 5.00 is required for bootstrapping GHC +#endif + +#ifndef mingw32_HOST_OS +#if __GLASGOW_HASKELL__ > 504 +import qualified System.Posix.Internals +#else +import qualified Posix +#endif +#else /* Must be Win32 */ +import List ( isPrefixOf ) +import Util ( dropList ) +import Foreign +import CString ( CString, peekCString ) +#endif + +import Text.Regex + +#if __GLASGOW_HASKELL__ < 603 +-- rawSystem comes from libghccompat.a in stage1 +import Compat.RawSystem ( rawSystem ) +import GHC.IOBase ( IOErrorType(..) ) +import System.IO.Error ( ioeGetErrorType ) +#else +import System.Process ( runInteractiveProcess, getProcessExitCode ) +import System.IO ( hSetBuffering, hGetLine, BufferMode(..) ) +import Control.Concurrent( forkIO, newChan, readChan, writeChan ) +import Data.Char ( isSpace ) +import FastString ( mkFastString ) +import SrcLoc ( SrcLoc, mkSrcLoc, noSrcSpan, mkSrcSpan ) +#endif +\end{code} + + + The configuration story + ~~~~~~~~~~~~~~~~~~~~~~~ + +GHC needs various support files (library packages, RTS etc), plus +various auxiliary programs (cp, gcc, etc). It finds these in one +of two places: + +* When running as an *installed program*, GHC finds most of this support + stuff in the installed library tree. The path to this tree is passed + to GHC via the -B flag, and given to initSysTools . + +* When running *in-place* in a build tree, GHC finds most of this support + stuff in the build tree. The path to the build tree is, again passed + to GHC via -B. + +GHC tells which of the two is the case by seeing whether package.conf +is in TopDir [installed] or in TopDir/ghc/driver [inplace] (what a hack). + + +SysTools.initSysProgs figures out exactly where all the auxiliary programs +are, and initialises mutable variables to make it easy to call them. +To to this, it makes use of definitions in Config.hs, which is a Haskell +file containing variables whose value is figured out by the build system. + +Config.hs contains two sorts of things + + cGCC, The *names* of the programs + cCPP e.g. cGCC = gcc + cUNLIT cCPP = gcc -E + etc They do *not* include paths + + + cUNLIT_DIR_REL The *path* to the directory containing unlit, split etc + cSPLIT_DIR_REL *relative* to the root of the build tree, + for use when running *in-place* in a build tree (only) + + + +--------------------------------------------- +NOTES for an ALTERNATIVE scheme (i.e *not* what is currently implemented): + +Another hair-brained scheme for simplifying the current tool location +nightmare in GHC: Simon originally suggested using another +configuration file along the lines of GCC's specs file - which is fine +except that it means adding code to read yet another configuration +file. 
What I didn't notice is that the current package.conf is +general enough to do this: + +Package + {name = "tools", import_dirs = [], source_dirs = [], + library_dirs = [], hs_libraries = [], extra_libraries = [], + include_dirs = [], c_includes = [], package_deps = [], + extra_ghc_opts = ["-pgmc/usr/bin/gcc","-pgml${topdir}/bin/unlit", ... etc.], + extra_cc_opts = [], extra_ld_opts = []} + +Which would have the advantage that we get to collect together in one +place the path-specific package stuff with the path-specific tool +stuff. + End of NOTES +--------------------------------------------- + + +%************************************************************************ +%* * +\subsection{Global variables to contain system programs} +%* * +%************************************************************************ + +All these pathnames are maintained IN THE NATIVE FORMAT OF THE HOST MACHINE. +(See remarks under pathnames below) + +\begin{code} +GLOBAL_VAR(v_Pgm_T, error "pgm_T", String) -- touch +GLOBAL_VAR(v_Pgm_CP, error "pgm_CP", String) -- cp + +GLOBAL_VAR(v_Path_package_config, error "path_package_config", String) +GLOBAL_VAR(v_Path_usages, error "ghc_usage.txt", (String,String)) + +GLOBAL_VAR(v_TopDir, error "TopDir", String) -- -B<dir> + +-- Parallel system only +GLOBAL_VAR(v_Pgm_sysman, error "pgm_sysman", String) -- system manager + +-- ways to get at some of these variables from outside this module +getPackageConfigPath = readIORef v_Path_package_config +getTopDir = readIORef v_TopDir +\end{code} + + +%************************************************************************ +%* * +\subsection{Initialisation} +%* * +%************************************************************************ + +\begin{code} +initSysTools :: Maybe String -- Maybe TopDir path (without the '-B' prefix) + + -> DynFlags + -> IO DynFlags -- Set all the mutable variables above, holding + -- (a) the system programs + -- (b) the package-config file + -- (c) the GHC usage message + + +initSysTools mbMinusB dflags + = do { (am_installed, top_dir) <- findTopDir mbMinusB + ; writeIORef v_TopDir top_dir + -- top_dir + -- for "installed" this is the root of GHC's support files + -- for "in-place" it is the root of the build tree + -- NB: top_dir is assumed to be in standard Unix format '/' separated + + ; let installed, installed_bin :: FilePath -> FilePath + installed_bin pgm = pgmPath top_dir pgm + installed file = pgmPath top_dir file + inplace dir pgm = pgmPath (top_dir `joinFileName` + cPROJECT_DIR `joinFileName` dir) pgm + + ; let pkgconfig_path + | am_installed = installed "package.conf" + | otherwise = inplace cGHC_DRIVER_DIR_REL "package.conf.inplace" + + ghc_usage_msg_path + | am_installed = installed "ghc-usage.txt" + | otherwise = inplace cGHC_DRIVER_DIR_REL "ghc-usage.txt" + + ghci_usage_msg_path + | am_installed = installed "ghci-usage.txt" + | otherwise = inplace cGHC_DRIVER_DIR_REL "ghci-usage.txt" + + -- For all systems, unlit, split, mangle are GHC utilities + -- architecture-specific stuff is done when building Config.hs + unlit_path + | am_installed = installed_bin cGHC_UNLIT_PGM + | otherwise = inplace cGHC_UNLIT_DIR_REL cGHC_UNLIT_PGM + + -- split and mangle are Perl scripts + split_script + | am_installed = installed_bin cGHC_SPLIT_PGM + | otherwise = inplace cGHC_SPLIT_DIR_REL cGHC_SPLIT_PGM + + mangle_script + | am_installed = installed_bin cGHC_MANGLER_PGM + | otherwise = inplace cGHC_MANGLER_DIR_REL cGHC_MANGLER_PGM + + ; let dflags0 = defaultDynFlags +#ifndef mingw32_HOST_OS + -- check 
whether TMPDIR is set in the environment + ; e_tmpdir <- IO.try (getEnv "TMPDIR") -- fails if not set +#else + -- On Win32, consult GetTempPath() for a temp dir. + -- => it first tries TMP, TEMP, then finally the + -- Windows directory(!). The directory is in short-path + -- form. + ; e_tmpdir <- + IO.try (do + let len = (2048::Int) + buf <- mallocArray len + ret <- getTempPath len buf + if ret == 0 then do + -- failed, consult TMPDIR. + free buf + getEnv "TMPDIR" + else do + s <- peekCString buf + free buf + return s) +#endif + ; let dflags1 = case e_tmpdir of + Left _ -> dflags0 + Right d -> setTmpDir d dflags0 + + -- Check that the package config exists + ; config_exists <- doesFileExist pkgconfig_path + ; when (not config_exists) $ + throwDyn (InstallationError + ("Can't find package.conf as " ++ pkgconfig_path)) + +#if defined(mingw32_HOST_OS) + -- WINDOWS-SPECIFIC STUFF + -- On Windows, gcc and friends are distributed with GHC, + -- so when "installed" we look in TopDir/bin + -- When "in-place" we look wherever the build-time configure + -- script found them + -- When "install" we tell gcc where its specs file + exes are (-B) + -- and also some places to pick up include files. We need + -- to be careful to put all necessary exes in the -B place + -- (as, ld, cc1, etc) since if they don't get found there, gcc + -- then tries to run unadorned "as", "ld", etc, and will + -- pick up whatever happens to be lying around in the path, + -- possibly including those from a cygwin install on the target, + -- which is exactly what we're trying to avoid. + ; let gcc_b_arg = Option ("-B" ++ installed "gcc-lib/") + (gcc_prog,gcc_args) + | am_installed = (installed_bin "gcc", [gcc_b_arg]) + | otherwise = (cGCC, []) + -- The trailing "/" is absolutely essential; gcc seems + -- to construct file names simply by concatenating to + -- this -B path with no extra slash We use "/" rather + -- than "\\" because otherwise "\\\" is mangled + -- later on; although gcc_args are in NATIVE format, + -- gcc can cope + -- (see comments with declarations of global variables) + -- + -- The quotes round the -B argument are in case TopDir + -- has spaces in it + + perl_path | am_installed = installed_bin cGHC_PERL + | otherwise = cGHC_PERL + + -- 'touch' is a GHC util for Windows, and similarly unlit, mangle + ; let touch_path | am_installed = installed_bin cGHC_TOUCHY_PGM + | otherwise = inplace cGHC_TOUCHY_DIR_REL cGHC_TOUCHY_PGM + + -- On Win32 we don't want to rely on #!/bin/perl, so we prepend + -- a call to Perl to get the invocation of split and mangle + ; let (split_prog, split_args) = (perl_path, [Option split_script]) + (mangle_prog, mangle_args) = (perl_path, [Option mangle_script]) + + ; let (mkdll_prog, mkdll_args) + | am_installed = + (pgmPath (installed "gcc-lib/") cMKDLL, + [ Option "--dlltool-name", + Option (pgmPath (installed "gcc-lib/") "dlltool"), + Option "--driver-name", + Option gcc_prog, gcc_b_arg ]) + | otherwise = (cMKDLL, []) +#else + -- UNIX-SPECIFIC STUFF + -- On Unix, the "standard" tools are assumed to be + -- in the same place whether we are running "in-place" or "installed" + -- That place is wherever the build-time configure script found them. + ; let gcc_prog = cGCC + gcc_args = [] + touch_path = "touch" + mkdll_prog = panic "Can't build DLLs on a non-Win32 system" + mkdll_args = [] + + -- On Unix, scripts are invoked using the '#!' method. 
Binary + -- installations of GHC on Unix place the correct line on the front + -- of the script at installation time, so we don't want to wire-in + -- our knowledge of $(PERL) on the host system here. + ; let (split_prog, split_args) = (split_script, []) + (mangle_prog, mangle_args) = (mangle_script, []) +#endif + + -- cpp is derived from gcc on all platforms + -- HACK, see setPgmP below. We keep 'words' here to remember to fix + -- Config.hs one day. + ; let cpp_path = (gcc_prog, gcc_args ++ + (Option "-E"):(map Option (words cRAWCPP_FLAGS))) + + -- For all systems, copy and remove are provided by the host + -- system; architecture-specific stuff is done when building Config.hs + ; let cp_path = cGHC_CP + + -- Other things being equal, as and ld are simply gcc + ; let (as_prog,as_args) = (gcc_prog,gcc_args) + (ld_prog,ld_args) = (gcc_prog,gcc_args) + + -- Initialise the global vars + ; writeIORef v_Path_package_config pkgconfig_path + ; writeIORef v_Path_usages (ghc_usage_msg_path, + ghci_usage_msg_path) + + ; writeIORef v_Pgm_sysman (top_dir ++ "/ghc/rts/parallel/SysMan") + -- Hans: this isn't right in general, but you can + -- elaborate it in the same way as the others + + ; writeIORef v_Pgm_T touch_path + ; writeIORef v_Pgm_CP cp_path + + ; return dflags1{ + pgm_L = unlit_path, + pgm_P = cpp_path, + pgm_F = "", + pgm_c = (gcc_prog,gcc_args), + pgm_m = (mangle_prog,mangle_args), + pgm_s = (split_prog,split_args), + pgm_a = (as_prog,as_args), + pgm_l = (ld_prog,ld_args), + pgm_dll = (mkdll_prog,mkdll_args) } + } + +#if defined(mingw32_HOST_OS) +foreign import stdcall unsafe "GetTempPathA" getTempPath :: Int -> CString -> IO Int32 +#endif +\end{code} + +\begin{code} +-- Find TopDir +-- for "installed" this is the root of GHC's support files +-- for "in-place" it is the root of the build tree +-- +-- Plan of action: +-- 1. Set proto_top_dir +-- if there is no given TopDir path, get the directory +-- where GHC is running (only on Windows) +-- +-- 2. If package.conf exists in proto_top_dir, we are running +-- installed; and TopDir = proto_top_dir +-- +-- 3. Otherwise we are running in-place, so +-- proto_top_dir will be /...stuff.../ghc/compiler +-- Set TopDir to /...stuff..., which is the root of the build tree +-- +-- This is very gruesome indeed + +findTopDir :: Maybe String -- Maybe TopDir path (without the '-B' prefix). + -> IO (Bool, -- True <=> am installed, False <=> in-place + String) -- TopDir (in Unix format '/' separated) + +findTopDir mbMinusB + = do { top_dir <- get_proto + -- Discover whether we're running in a build tree or in an installation, + -- by looking for the package configuration file. 
+ ; am_installed <- doesFileExist (top_dir `joinFileName` "package.conf") + + ; return (am_installed, top_dir) + } + where + -- get_proto returns a Unix-format path (relying on getBaseDir to do so too) + get_proto = case mbMinusB of + Just minusb -> return (normalisePath minusb) + Nothing + -> do maybe_exec_dir <- getBaseDir -- Get directory of executable + case maybe_exec_dir of -- (only works on Windows; + -- returns Nothing on Unix) + Nothing -> throwDyn (InstallationError "missing -B<dir> option") + Just dir -> return dir +\end{code} + + +%************************************************************************ +%* * +\subsection{Running an external program} +%* * +%************************************************************************ + + +\begin{code} +runUnlit :: DynFlags -> [Option] -> IO () +runUnlit dflags args = do + let p = pgm_L dflags + runSomething dflags "Literate pre-processor" p args + +runCpp :: DynFlags -> [Option] -> IO () +runCpp dflags args = do + let (p,args0) = pgm_P dflags + runSomething dflags "C pre-processor" p (args0 ++ args) + +runPp :: DynFlags -> [Option] -> IO () +runPp dflags args = do + let p = pgm_F dflags + runSomething dflags "Haskell pre-processor" p args + +runCc :: DynFlags -> [Option] -> IO () +runCc dflags args = do + let (p,args0) = pgm_c dflags + runSomethingFiltered dflags cc_filter "C Compiler" p (args0++args) + where + -- discard some harmless warnings from gcc that we can't turn off + cc_filter str = unlines (do_filter (lines str)) + + do_filter [] = [] + do_filter ls@(l:ls') + | (w:rest) <- dropWhile (isJust .matchRegex r_from) ls, + isJust (matchRegex r_warn w) + = do_filter rest + | otherwise + = l : do_filter ls' + + r_from = mkRegex "from.*:[0-9]+" + r_warn = mkRegex "warning: call-clobbered register used" + +runMangle :: DynFlags -> [Option] -> IO () +runMangle dflags args = do + let (p,args0) = pgm_m dflags + runSomething dflags "Mangler" p (args0++args) + +runSplit :: DynFlags -> [Option] -> IO () +runSplit dflags args = do + let (p,args0) = pgm_s dflags + runSomething dflags "Splitter" p (args0++args) + +runAs :: DynFlags -> [Option] -> IO () +runAs dflags args = do + let (p,args0) = pgm_a dflags + runSomething dflags "Assembler" p (args0++args) + +runLink :: DynFlags -> [Option] -> IO () +runLink dflags args = do + let (p,args0) = pgm_l dflags + runSomething dflags "Linker" p (args0++args) + +runMkDLL :: DynFlags -> [Option] -> IO () +runMkDLL dflags args = do + let (p,args0) = pgm_dll dflags + runSomething dflags "Make DLL" p (args0++args) + +touch :: DynFlags -> String -> String -> IO () +touch dflags purpose arg = do + p <- readIORef v_Pgm_T + runSomething dflags purpose p [FileOption "" arg] + +copy :: DynFlags -> String -> String -> String -> IO () +copy dflags purpose from to = do + showPass dflags purpose + + h <- openFile to WriteMode + ls <- readFile from -- inefficient, but it'll do for now. + -- ToDo: speed up via slurping. 
+       hPutStr h ls
+       hClose h
+
+\end{code}
+
+\begin{code}
+getSysMan :: IO String  -- How to invoke the system manager
+                        -- (parallel system only)
+getSysMan = readIORef v_Pgm_sysman
+\end{code}
+
+\begin{code}
+getUsageMsgPaths :: IO (FilePath,FilePath)
+          -- the filenames of the usage messages (ghc, ghci)
+getUsageMsgPaths = readIORef v_Path_usages
+\end{code}
+
+
+%************************************************************************
+%*                                                                      *
+\subsection{Managing temporary files}
+%*                                                                      *
+%************************************************************************
+
+\begin{code}
+GLOBAL_VAR(v_FilesToClean, [], [String] )
+\end{code}
+
+\begin{code}
+cleanTempFiles :: DynFlags -> IO ()
+cleanTempFiles dflags
+   = do fs <- readIORef v_FilesToClean
+        removeTmpFiles dflags fs
+        writeIORef v_FilesToClean []
+
+cleanTempFilesExcept :: DynFlags -> [FilePath] -> IO ()
+cleanTempFilesExcept dflags dont_delete
+   = do files <- readIORef v_FilesToClean
+        let (to_keep, to_delete) = partition (`elem` dont_delete) files
+        removeTmpFiles dflags to_delete
+        writeIORef v_FilesToClean to_keep
+
+
+-- find a temporary name that doesn't already exist.
+newTempName :: DynFlags -> Suffix -> IO FilePath
+newTempName DynFlags{tmpDir=tmp_dir} extn
+  = do x <- getProcessID
+       findTempName (tmp_dir ++ "/ghc" ++ show x ++ "_") 0
+  where
+    findTempName prefix x
+      = do let filename = (prefix ++ show x) `joinFileExt` extn
+           b <- doesFileExist filename
+           if b then findTempName prefix (x+1)
+                else do consIORef v_FilesToClean filename -- clean it up later
+                        return filename
+
+addFilesToClean :: [FilePath] -> IO ()
+-- May include wildcards [used by DriverPipeline.run_phase SplitMangle]
+addFilesToClean files = mapM_ (consIORef v_FilesToClean) files
+
+removeTmpFiles :: DynFlags -> [FilePath] -> IO ()
+removeTmpFiles dflags fs
+  = warnNon $
+    traceCmd dflags "Deleting temp files"
+             ("Deleting: " ++ unwords deletees)
+             (mapM_ rm deletees)
+  where
+     -- Flat out refuse to delete files that are likely to be source input
+     -- files (is there a worse bug than having a compiler delete your source
+     -- files?)
+     --
+     -- Deleting source files is a sign of a bug elsewhere, so prominently flag
+     -- the condition.
+    warnNon act
+     | null non_deletees = act
+     | otherwise         = do
+        putMsg dflags (text "WARNING - NOT deleting source files:" <+> hsep (map text non_deletees))
+        act
+
+    (non_deletees, deletees) = partition isHaskellUserSrcFilename fs
+
+    rm f = removeFile f `IO.catch`
+              (\_ignored ->
+                  debugTraceMsg dflags 2 (ptext SLIT("Warning: deleting non-existent") <+> text f)
+              )
+
+
+-----------------------------------------------------------------------------
+-- Running an external program
+
+runSomething :: DynFlags
+             -> String          -- For -v message
+             -> String          -- Command name (possibly a full path)
+                                --      assumed already dos-ified
+             -> [Option]        -- Arguments
+                                --      runSomething will dos-ify them
+             -> IO ()
+
+runSomething dflags phase_name pgm args =
+  runSomethingFiltered dflags id phase_name pgm args
+
+runSomethingFiltered
+  :: DynFlags -> (String->String) -> String -> String -> [Option] -> IO ()
+
+runSomethingFiltered dflags filter_fn phase_name pgm args = do
+  let real_args = filter notNull (map showOpt args)
+  traceCmd dflags phase_name (unwords (pgm:real_args)) $ do
+  (exit_code, doesn'tExist) <-
+     IO.catch (do
+         rc <- builderMainLoop dflags filter_fn pgm real_args
+         case rc of
+           ExitSuccess{} -> return (rc, False)
+           ExitFailure n
+             -- rawSystem returns (ExitFailure 127) if the exec failed for any
+             -- reason (eg. the program doesn't exist).
This is the only clue + -- we have, but we need to report something to the user because in + -- the case of a missing program there will otherwise be no output + -- at all. + | n == 127 -> return (rc, True) + | otherwise -> return (rc, False)) + -- Should 'rawSystem' generate an IO exception indicating that + -- 'pgm' couldn't be run rather than a funky return code, catch + -- this here (the win32 version does this, but it doesn't hurt + -- to test for this in general.) + (\ err -> + if IO.isDoesNotExistError err +#if defined(mingw32_HOST_OS) && __GLASGOW_HASKELL__ < 604 + -- the 'compat' version of rawSystem under mingw32 always + -- maps 'errno' to EINVAL to failure. + || case (ioeGetErrorType err ) of { InvalidArgument{} -> True ; _ -> False} +#endif + then return (ExitFailure 1, True) + else IO.ioError err) + case (doesn'tExist, exit_code) of + (True, _) -> throwDyn (InstallationError ("could not execute: " ++ pgm)) + (_, ExitSuccess) -> return () + _ -> throwDyn (PhaseFailed phase_name exit_code) + + + +#if __GLASGOW_HASKELL__ < 603 +builderMainLoop dflags filter_fn pgm real_args = do + rawSystem pgm real_args +#else +builderMainLoop dflags filter_fn pgm real_args = do + chan <- newChan + (hStdIn, hStdOut, hStdErr, hProcess) <- runInteractiveProcess pgm real_args Nothing Nothing + + -- and run a loop piping the output from the compiler to the log_action in DynFlags + hSetBuffering hStdOut LineBuffering + hSetBuffering hStdErr LineBuffering + forkIO (readerProc chan hStdOut filter_fn) + forkIO (readerProc chan hStdErr filter_fn) + rc <- loop chan hProcess 2 1 ExitSuccess + hClose hStdIn + hClose hStdOut + hClose hStdErr + return rc + where + -- status starts at zero, and increments each time either + -- a reader process gets EOF, or the build proc exits. We wait + -- for all of these to happen (status==3). + -- ToDo: we should really have a contingency plan in case any of + -- the threads dies, such as a timeout. + loop chan hProcess 0 0 exitcode = return exitcode + loop chan hProcess t p exitcode = do + mb_code <- if p > 0 + then getProcessExitCode hProcess + else return Nothing + case mb_code of + Just code -> loop chan hProcess t (p-1) code + Nothing + | t > 0 -> do + msg <- readChan chan + case msg of + BuildMsg msg -> do + log_action dflags SevInfo noSrcSpan defaultUserStyle msg + loop chan hProcess t p exitcode + BuildError loc msg -> do + log_action dflags SevError (mkSrcSpan loc loc) defaultUserStyle msg + loop chan hProcess t p exitcode + EOF -> + loop chan hProcess (t-1) p exitcode + | otherwise -> loop chan hProcess t p exitcode + +readerProc chan hdl filter_fn = + (do str <- hGetContents hdl + loop (lines (filter_fn str)) Nothing) + `finally` + writeChan chan EOF + -- ToDo: check errors more carefully + -- ToDo: in the future, the filter should be implemented as + -- a stream transformer. 
+ where + loop [] Nothing = return () + loop [] (Just err) = writeChan chan err + loop (l:ls) in_err = + case in_err of + Just err@(BuildError srcLoc msg) + | leading_whitespace l -> do + loop ls (Just (BuildError srcLoc (msg $$ text l))) + | otherwise -> do + writeChan chan err + checkError l ls + Nothing -> do + checkError l ls + + checkError l ls + = case matchRegex errRegex l of + Nothing -> do + writeChan chan (BuildMsg (text l)) + loop ls Nothing + Just (file':lineno':colno':msg:_) -> do + let file = mkFastString file' + lineno = read lineno'::Int + colno = case colno' of + "" -> 0 + _ -> read (init colno') :: Int + srcLoc = mkSrcLoc file lineno colno + loop ls (Just (BuildError srcLoc (text msg))) + + leading_whitespace [] = False + leading_whitespace (x:_) = isSpace x + +errRegex = mkRegex "^([^:]*):([0-9]+):([0-9]+:)?(.*)" + +data BuildMessage + = BuildMsg !SDoc + | BuildError !SrcLoc !SDoc + | EOF +#endif + +showOpt (FileOption pre f) = pre ++ platformPath f +showOpt (Option "") = "" +showOpt (Option s) = s + +traceCmd :: DynFlags -> String -> String -> IO () -> IO () +-- a) trace the command (at two levels of verbosity) +-- b) don't do it at all if dry-run is set +traceCmd dflags phase_name cmd_line action + = do { let verb = verbosity dflags + ; showPass dflags phase_name + ; debugTraceMsg dflags 3 (text cmd_line) + ; hFlush stderr + + -- Test for -n flag + ; unless (dopt Opt_DryRun dflags) $ do { + + -- And run it! + ; action `IO.catch` handle_exn verb + }} + where + handle_exn verb exn = do { debugTraceMsg dflags 2 (char '\n') + ; debugTraceMsg dflags 2 (ptext SLIT("Failed:") <+> text cmd_line <+> text (show exn)) + ; throwDyn (PhaseFailed phase_name (ExitFailure 1)) } +\end{code} + +%************************************************************************ +%* * +\subsection{Support code} +%* * +%************************************************************************ + +\begin{code} +----------------------------------------------------------------------------- +-- Define getBaseDir :: IO (Maybe String) + +getBaseDir :: IO (Maybe String) +#if defined(mingw32_HOST_OS) +-- Assuming we are running ghc, accessed by path $()/bin/ghc.exe, +-- return the path $(stuff). Note that we drop the "bin/" directory too. +getBaseDir = do let len = (2048::Int) -- plenty, PATH_MAX is 512 under Win32. + buf <- mallocArray len + ret <- getModuleFileName nullPtr buf len + if ret == 0 then free buf >> return Nothing + else do s <- peekCString buf + free buf + return (Just (rootDir s)) + where + rootDir s = reverse (dropList "/bin/ghc.exe" (reverse (normalisePath s))) + +foreign import stdcall unsafe "GetModuleFileNameA" + getModuleFileName :: Ptr () -> CString -> Int -> IO Int32 +#else +getBaseDir = return Nothing +#endif + +#ifdef mingw32_HOST_OS +foreign import ccall unsafe "_getpid" getProcessID :: IO Int -- relies on Int == Int32 on Windows +#elif __GLASGOW_HASKELL__ > 504 +getProcessID :: IO Int +getProcessID = System.Posix.Internals.c_getpid >>= return . 
fromIntegral
+#else
+getProcessID :: IO Int
+getProcessID = Posix.getProcessID
+#endif
+
+\end{code}
diff --git a/compiler/main/TidyPgm.lhs b/compiler/main/TidyPgm.lhs
new file mode 100644
index 0000000000..86e55f9e06
--- /dev/null
+++ b/compiler/main/TidyPgm.lhs
@@ -0,0 +1,816 @@
+%
+% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998
+%
+\section{Tidying up Core}
+
+\begin{code}
+module TidyPgm( mkBootModDetails, tidyProgram ) where
+
+#include "HsVersions.h"
+
+import DynFlags   ( DynFlag(..), dopt )
+import Packages   ( HomeModules )
+import CoreSyn
+import CoreUnfold ( noUnfolding, mkTopUnfolding )
+import CoreFVs    ( ruleLhsFreeIds, exprSomeFreeVars )
+import CoreTidy   ( tidyExpr, tidyVarOcc, tidyRules )
+import PprCore    ( pprRules )
+import CoreLint   ( showPass, endPass )
+import CoreUtils  ( exprArity, rhsIsStatic )
+import VarEnv
+import VarSet
+import Var        ( Id, Var )
+import Id         ( idType, idInfo, idName, idCoreRules, isGlobalId,
+                    isExportedId, mkVanillaGlobal, isLocalId, isNaughtyRecordSelector,
+                    idArity, idCafInfo, idUnfolding, isImplicitId, setIdInfo
+                  )
+import IdInfo     {- loads of stuff -}
+import InstEnv    ( Instance, DFunId, instanceDFunId, setInstanceDFunId )
+import NewDemand  ( isBottomingSig, topSig )
+import BasicTypes ( Arity, isNeverActive )
+import Name       ( Name, getOccName, nameOccName, mkInternalName,
+                    localiseName, isExternalName, nameSrcLoc, nameParent_maybe,
+                    isWiredInName, getName
+                  )
+import NameSet    ( NameSet, elemNameSet )
+import IfaceEnv   ( allocateGlobalBinder )
+import NameEnv    ( filterNameEnv, mapNameEnv )
+import OccName    ( TidyOccEnv, initTidyOccEnv, tidyOccName )
+import Type       ( tidyTopType )
+import TcType     ( isFFITy )
+import DataCon    ( dataConName, dataConFieldLabels, dataConWrapId_maybe )
+import TyCon      ( TyCon, makeTyConAbstract, tyConDataCons, isNewTyCon,
+                    newTyConRep, tyConSelIds, isAlgTyCon, isEnumerationTyCon )
+import Class      ( classSelIds )
+import Module     ( Module )
+import HscTypes   ( HscEnv(..), NameCache( nsUniqs ), CgGuts(..),
+                    TypeEnv, typeEnvIds, typeEnvElts, typeEnvTyCons,
+                    extendTypeEnvWithIds, lookupTypeEnv,
+                    ModGuts(..), TyThing(..), ModDetails(..), Dependencies(..)
+                  )
+import Maybes     ( orElse, mapCatMaybes )
+import ErrUtils   ( showPass, dumpIfSet_core )
+import UniqSupply ( splitUniqSupply, uniqFromSupply )
+import List       ( partition )
+import Maybe      ( isJust )
+import Outputable
+import DATA_IOREF ( IORef, readIORef, writeIORef )
+import FastTypes  hiding ( fastOr )
+\end{code}
+
+
+Constructing the TypeEnv, Instances, Rules from which the ModIface is
+constructed, and which goes on to subsequent modules in --make mode.
+
+Most of the interface file is obtained simply by serialising the
+TypeEnv. One important consequence is that the *interface file* has
+pragma info if and only if the final TypeEnv does. This is not so
+important for *this* module, but it's essential for ghc --make:
+subsequent compilations must not see (e.g.) the arity if the interface
+file does not contain arity information. If they do, they'll exploit
+the arity; then the arity might change, but the iface file doesn't
+change => recompilation does not happen => disaster.
+
+For data types, the final TypeEnv will have a TyThing for the TyCon,
+plus one for each DataCon; the interface file will contain just one
+data type declaration, but it is de-serialised back into a collection
+of TyThings.
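+
+For instance, given a (hypothetical) source declaration
+
+       data T = A | B
+
+the interface file contains just that single declaration, but
+de-serialising it populates the TypeEnv with a TyThing for the TyCon T
+plus one for each of the DataCons A and B.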
+ +%************************************************************************ +%* * + Plan A: simpleTidyPgm +%* * +%************************************************************************ + + +Plan A: mkBootModDetails: omit pragmas, make interfaces small +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +* Ignore the bindings + +* Drop all WiredIn things from the TypeEnv + (we never want them in interface files) + +* Retain all TyCons and Classes in the TypeEnv, to avoid + having to find which ones are mentioned in the + types of exported Ids + +* Trim off the constructors of non-exported TyCons, both + from the TyCon and from the TypeEnv + +* Drop non-exported Ids from the TypeEnv + +* Tidy the types of the DFunIds of Instances, + make them into GlobalIds, (they already have External Names) + and add them to the TypeEnv + +* Tidy the types of the (exported) Ids in the TypeEnv, + make them into GlobalIds (they already have External Names) + +* Drop rules altogether + +* Tidy the bindings, to ensure that the Caf and Arity + information is correct for each top-level binder; the + code generator needs it. And to ensure that local names have + distinct OccNames in case of object-file splitting + +\begin{code} +mkBootModDetails :: HscEnv -> ModGuts -> IO ModDetails +-- This is Plan A: make a small type env when typechecking only, +-- or when compiling a hs-boot file, or simply when not using -O +-- +-- We don't look at the bindings at all -- there aren't any +-- for hs-boot files + +mkBootModDetails hsc_env (ModGuts { mg_module = mod, + mg_exports = exports, + mg_types = type_env, + mg_insts = ispecs }) + = do { let dflags = hsc_dflags hsc_env + ; showPass dflags "Tidy [hoot] type env" + + ; let { ispecs' = tidyInstances tidyExternalId ispecs + ; type_env1 = filterNameEnv (not . isWiredInThing) type_env + ; type_env2 = mapNameEnv tidyBootThing type_env1 + ; type_env' = extendTypeEnvWithIds type_env2 + (map instanceDFunId ispecs') + } + ; return (ModDetails { md_types = type_env', + md_insts = ispecs', + md_rules = [], + md_exports = exports }) + } + where + +isWiredInThing :: TyThing -> Bool +isWiredInThing thing = isWiredInName (getName thing) + +tidyBootThing :: TyThing -> TyThing +-- Just externalise the Ids; keep everything +tidyBootThing (AnId id) | isLocalId id = AnId (tidyExternalId id) +tidyBootThing thing = thing + +tidyExternalId :: Id -> Id +-- Takes an LocalId with an External Name, +-- makes it into a GlobalId with VanillaIdInfo, and tidies its type +-- (NB: vanillaIdInfo makes a conservative assumption about Caf-hood.) +tidyExternalId id + = ASSERT2( isLocalId id && isExternalName (idName id), ppr id ) + mkVanillaGlobal (idName id) (tidyTopType (idType id)) vanillaIdInfo +\end{code} + + +%************************************************************************ +%* * + Plan B: tidy bindings, make TypeEnv full of IdInfo +%* * +%************************************************************************ + +Plan B: include pragmas, make interfaces +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +* Figure out which Ids are externally visible + +* Tidy the bindings, externalising appropriate Ids + +* Drop all Ids from the TypeEnv, and add all the External Ids from + the bindings. (This adds their IdInfo to the TypeEnv; and adds + floated-out Ids that weren't even in the TypeEnv before.) + +Step 1: Figure out external Ids +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +First we figure out which Ids are "external" Ids. An +"external" Id is one that is visible from outside the compilation +unit. 
These are + a) the user exported ones + b) ones mentioned in the unfoldings, workers, + or rules of externally-visible ones +This exercise takes a sweep of the bindings bottom to top. Actually, +in Step 2 we're also going to need to know which Ids should be +exported with their unfoldings, so we produce not an IdSet but an +IdEnv Bool + + +Step 2: Tidy the program +~~~~~~~~~~~~~~~~~~~~~~~~ +Next we traverse the bindings top to bottom. For each *top-level* +binder + + 1. Make it into a GlobalId; its IdDetails becomes VanillaGlobal, + reflecting the fact that from now on we regard it as a global, + not local, Id + + 2. Give it a system-wide Unique. + [Even non-exported things need system-wide Uniques because the + byte-code generator builds a single Name->BCO symbol table.] + + We use the NameCache kept in the HscEnv as the + source of such system-wide uniques. + + For external Ids, use the original-name cache in the NameCache + to ensure that the unique assigned is the same as the Id had + in any previous compilation run. + + 3. If it's an external Id, make it have a External Name, otherwise + make it have an Internal Name. + This is used by the code generator to decide whether + to make the label externally visible + + 4. Give external Ids a "tidy" OccName. This means + we can print them in interface files without confusing + "x" (unique 5) with "x" (unique 10). + + 5. Give it its UTTERLY FINAL IdInfo; in ptic, + * its unfolding, if it should have one + + * its arity, computed from the number of visible lambdas + + * its CAF info, computed from what is free in its RHS + + +Finally, substitute these new top-level binders consistently +throughout, including in unfoldings. We also tidy binders in +RHSs, so that they print nicely in interfaces. + +\begin{code} +tidyProgram :: HscEnv -> ModGuts -> IO (CgGuts, ModDetails) +tidyProgram hsc_env + mod_impl@(ModGuts { mg_module = mod, mg_exports = exports, + mg_types = type_env, mg_insts = insts_tc, + mg_binds = binds, + mg_rules = imp_rules, + mg_dir_imps = dir_imps, mg_deps = deps, + mg_home_mods = home_mods, + mg_foreign = foreign_stubs }) + + = do { let dflags = hsc_dflags hsc_env + ; showPass dflags "Tidy Core" + + ; let { omit_prags = dopt Opt_OmitInterfacePragmas dflags + ; ext_ids = findExternalIds omit_prags binds + ; ext_rules + | omit_prags = [] + | otherwise = findExternalRules binds imp_rules ext_ids + -- findExternalRules filters imp_rules to avoid binders that + -- aren't externally visible; but the externally-visible binders + -- are computed (by findExternalIds) assuming that all orphan + -- rules are exported (they get their Exported flag set in the desugarer) + -- So in fact we may export more than we need. + -- (It's a sort of mutual recursion.) 
+ } + + ; (tidy_env, tidy_binds) <- tidyTopBinds hsc_env home_mods mod type_env ext_ids binds + + ; let { tidy_type_env = tidyTypeEnv omit_prags exports type_env tidy_binds + ; tidy_ispecs = tidyInstances (lookup_dfun tidy_type_env) insts_tc + -- A DFunId will have a binding in tidy_binds, and so + -- will now be in final_env, replete with IdInfo + -- Its name will be unchanged since it was born, but + -- we want Global, IdInfo-rich (or not) DFunId in the tidy_ispecs + + ; tidy_rules = tidyRules tidy_env ext_rules + -- You might worry that the tidy_env contains IdInfo-rich stuff + -- and indeed it does, but if omit_prags is on, ext_rules is empty + + ; implicit_binds = getImplicitBinds type_env + ; alg_tycons = filter isAlgTyCon (typeEnvTyCons type_env) + } + + ; endPass dflags "Tidy Core" Opt_D_dump_simpl tidy_binds + ; dumpIfSet_core dflags Opt_D_dump_simpl + "Tidy Core Rules" + (pprRules tidy_rules) + + ; return (CgGuts { cg_module = mod, + cg_tycons = alg_tycons, + cg_binds = implicit_binds ++ tidy_binds, + cg_dir_imps = dir_imps, + cg_foreign = foreign_stubs, + cg_home_mods = home_mods, + cg_dep_pkgs = dep_pkgs deps }, + + ModDetails { md_types = tidy_type_env, + md_rules = tidy_rules, + md_insts = tidy_ispecs, + md_exports = exports }) + } + +lookup_dfun type_env dfun_id + = case lookupTypeEnv type_env (idName dfun_id) of + Just (AnId dfun_id') -> dfun_id' + other -> pprPanic "lookup_dfun" (ppr dfun_id) + +tidyTypeEnv :: Bool -> NameSet -> TypeEnv -> [CoreBind] -> TypeEnv + +-- The competed type environment is gotten from +-- Dropping any wired-in things, and then +-- a) keeping the types and classes +-- b) removing all Ids, +-- c) adding Ids with correct IdInfo, including unfoldings, +-- gotten from the bindings +-- From (c) we keep only those Ids with External names; +-- the CoreTidy pass makes sure these are all and only +-- the externally-accessible ones +-- This truncates the type environment to include only the +-- exported Ids and things needed from them, which saves space + +tidyTypeEnv omit_prags exports type_env tidy_binds + = let type_env1 = filterNameEnv keep_it type_env + type_env2 = extendTypeEnvWithIds type_env1 final_ids + type_env3 | omit_prags = mapNameEnv trim_thing type_env2 + | otherwise = type_env2 + in + type_env3 + where + final_ids = [ id | id <- bindersOfBinds tidy_binds, + isExternalName (idName id)] + + -- We keep GlobalIds, because they won't appear + -- in the bindings from which final_ids are derived! + -- (The bindings bind LocalIds.) + keep_it thing | isWiredInThing thing = False + keep_it (AnId id) = isGlobalId id -- Keep GlobalIds (e.g. class ops) + keep_it other = True -- Keep all TyCons, DataCons, and Classes + + trim_thing thing + = case thing of + ATyCon tc | mustExposeTyCon exports tc -> thing + | otherwise -> ATyCon (makeTyConAbstract tc) + + AnId id | isImplicitId id -> thing + | otherwise -> AnId (id `setIdInfo` vanillaIdInfo) + + other -> thing + +mustExposeTyCon :: NameSet -- Exports + -> TyCon -- The tycon + -> Bool -- Can its rep be hidden? +-- We are compiling without -O, and thus trying to write as little as +-- possible into the interface file. But we must expose the details of +-- any data types whose constructors or fields are exported +mustExposeTyCon exports tc + | not (isAlgTyCon tc) -- Synonyms + = True + | isEnumerationTyCon tc -- For an enumeration, exposing the constructors + = True -- won't lead to the need for further exposure + -- (This includes data types with no constructors.) 
+ | otherwise -- Newtype, datatype + = any exported_con (tyConDataCons tc) + -- Expose rep if any datacon or field is exported + + || (isNewTyCon tc && isFFITy (snd (newTyConRep tc))) + -- Expose the rep for newtypes if the rep is an FFI type. + -- For a very annoying reason. 'Foreign import' is meant to + -- be able to look through newtypes transparently, but it + -- can only do that if it can "see" the newtype representation + where + exported_con con = any (`elemNameSet` exports) + (dataConName con : dataConFieldLabels con) + +tidyInstances :: (DFunId -> DFunId) -> [Instance] -> [Instance] +tidyInstances tidy_dfun ispecs + = map tidy ispecs + where + tidy ispec = setInstanceDFunId ispec $ + tidy_dfun (instanceDFunId ispec) + +getImplicitBinds :: TypeEnv -> [CoreBind] +getImplicitBinds type_env + = map get_defn (concatMap implicit_con_ids (typeEnvTyCons type_env) + ++ concatMap other_implicit_ids (typeEnvElts type_env)) + -- Put the constructor wrappers first, because + -- other implicit bindings (notably the fromT functions arising + -- from generics) use the constructor wrappers. At least that's + -- what External Core likes + where + implicit_con_ids tc = mapCatMaybes dataConWrapId_maybe (tyConDataCons tc) + + other_implicit_ids (ATyCon tc) = filter (not . isNaughtyRecordSelector) (tyConSelIds tc) + -- The "naughty" ones are not real functions at all + -- They are there just so we can get decent error messages + -- See Note [Naughty record selectors] in MkId.lhs + other_implicit_ids (AClass cl) = classSelIds cl + other_implicit_ids other = [] + + get_defn :: Id -> CoreBind + get_defn id = NonRec id (tidyExpr emptyTidyEnv rhs) + where + rhs = unfoldingTemplate (idUnfolding id) + -- Don't forget to tidy the body ! Otherwise you get silly things like + -- \ tpl -> case tpl of tpl -> (tpl,tpl) -> tpl +\end{code} + + +%************************************************************************ +%* * +\subsection{Step 1: finding externals} +%* * +%************************************************************************ + +\begin{code} +findExternalIds :: Bool + -> [CoreBind] + -> IdEnv Bool -- In domain => external + -- Range = True <=> show unfolding + -- Step 1 from the notes above +findExternalIds omit_prags binds + | omit_prags + = mkVarEnv [ (id,False) | id <- bindersOfBinds binds, isExportedId id ] + + | otherwise + = foldr find emptyVarEnv binds + where + find (NonRec id rhs) needed + | need_id needed id = addExternal (id,rhs) needed + | otherwise = needed + find (Rec prs) needed = find_prs prs needed + + -- For a recursive group we have to look for a fixed point + find_prs prs needed + | null needed_prs = needed + | otherwise = find_prs other_prs new_needed + where + (needed_prs, other_prs) = partition (need_pr needed) prs + new_needed = foldr addExternal needed needed_prs + + -- The 'needed' set contains the Ids that are needed by earlier + -- interface file emissions. 
If the Id isn't in this set, and isn't + -- exported, there's no need to emit anything + need_id needed_set id = id `elemVarEnv` needed_set || isExportedId id + need_pr needed_set (id,rhs) = need_id needed_set id + +addExternal :: (Id,CoreExpr) -> IdEnv Bool -> IdEnv Bool +-- The Id is needed; extend the needed set +-- with it and its dependents (free vars etc) +addExternal (id,rhs) needed + = extendVarEnv (foldVarSet add_occ needed new_needed_ids) + id show_unfold + where + add_occ id needed = extendVarEnv needed id False + -- "False" because we don't know we need the Id's unfolding + -- We'll override it later when we find the binding site + + new_needed_ids = worker_ids `unionVarSet` + unfold_ids `unionVarSet` + spec_ids + + idinfo = idInfo id + dont_inline = isNeverActive (inlinePragInfo idinfo) + loop_breaker = isLoopBreaker (occInfo idinfo) + bottoming_fn = isBottomingSig (newStrictnessInfo idinfo `orElse` topSig) + spec_ids = specInfoFreeVars (specInfo idinfo) + worker_info = workerInfo idinfo + + -- Stuff to do with the Id's unfolding + -- The simplifier has put an up-to-date unfolding + -- in the IdInfo, but the RHS will do just as well + unfolding = unfoldingInfo idinfo + rhs_is_small = not (neverUnfold unfolding) + + -- We leave the unfolding there even if there is a worker + -- In GHCI the unfolding is used by importers + -- When writing an interface file, we omit the unfolding + -- if there is a worker + show_unfold = not bottoming_fn && -- Not necessary + not dont_inline && + not loop_breaker && + rhs_is_small -- Small enough + + unfold_ids | show_unfold = exprSomeFreeVars isLocalId rhs + | otherwise = emptyVarSet + + worker_ids = case worker_info of + HasWorker work_id _ -> unitVarSet work_id + otherwise -> emptyVarSet +\end{code} + + +\begin{code} +findExternalRules :: [CoreBind] + -> [CoreRule] -- Non-local rules (i.e. ones for imported fns) + -> IdEnv a -- Ids that are exported, so we need their rules + -> [CoreRule] + -- The complete rules are gotten by combining + -- a) the non-local rules + -- b) rules embedded in the top-level Ids +findExternalRules binds non_local_rules ext_ids + = filter (not . internal_rule) (non_local_rules ++ local_rules) + where + local_rules = [ rule + | id <- bindersOfBinds binds, + id `elemVarEnv` ext_ids, + rule <- idCoreRules id + ] + + internal_rule rule + = any internal_id (varSetElems (ruleLhsFreeIds rule)) + -- Don't export a rule whose LHS mentions a locally-defined + -- Id that is completely internal (i.e. not visible to an + -- importing module) + + internal_id id = not (id `elemVarEnv` ext_ids) +\end{code} + + + +%************************************************************************ +%* * +\subsection{Step 2: top-level tidying} +%* * +%************************************************************************ + + +\begin{code} +-- TopTidyEnv: when tidying we need to know +-- * nc_var: The NameCache, containing a unique supply and any pre-ordained Names. +-- These may have arisen because the +-- renamer read in an interface file mentioning M.$wf, say, +-- and assigned it unique r77. 
If, on this compilation, we've +-- invented an Id whose name is $wf (but with a different unique) +-- we want to rename it to have unique r77, so that we can do easy +-- comparisons with stuff from the interface file +-- +-- * occ_env: The TidyOccEnv, which tells us which local occurrences +-- are 'used' +-- +-- * subst_env: A Var->Var mapping that substitutes the new Var for the old + +tidyTopBinds :: HscEnv + -> HomeModules + -> Module + -> TypeEnv + -> IdEnv Bool -- Domain = Ids that should be external + -- True <=> their unfolding is external too + -> [CoreBind] + -> IO (TidyEnv, [CoreBind]) + +tidyTopBinds hsc_env hmods mod type_env ext_ids binds + = tidy init_env binds + where + nc_var = hsc_NC hsc_env + + -- We also make sure to avoid any exported binders. Consider + -- f{-u1-} = 1 -- Local decl + -- ... + -- f{-u2-} = 2 -- Exported decl + -- + -- The second exported decl must 'get' the name 'f', so we + -- have to put 'f' in the avoids list before we get to the first + -- decl. tidyTopId then does a no-op on exported binders. + init_env = (initTidyOccEnv avoids, emptyVarEnv) + avoids = [getOccName name | bndr <- typeEnvIds type_env, + let name = idName bndr, + isExternalName name] + -- In computing our "avoids" list, we must include + -- all implicit Ids + -- all things with global names (assigned once and for + -- all by the renamer) + -- since their names are "taken". + -- The type environment is a convenient source of such things. + + tidy env [] = return (env, []) + tidy env (b:bs) = do { (env1, b') <- tidyTopBind hmods mod nc_var ext_ids env b + ; (env2, bs') <- tidy env1 bs + ; return (env2, b':bs') } + +------------------------ +tidyTopBind :: HomeModules + -> Module + -> IORef NameCache -- For allocating new unique names + -> IdEnv Bool -- Domain = Ids that should be external + -- True <=> their unfolding is external too + -> TidyEnv -> CoreBind + -> IO (TidyEnv, CoreBind) + +tidyTopBind hmods mod nc_var ext_ids tidy_env1@(occ_env1,subst1) (NonRec bndr rhs) + = do { (occ_env2, name') <- tidyTopName mod nc_var ext_ids occ_env1 bndr + ; let { (bndr', rhs') = tidyTopPair ext_ids tidy_env2 caf_info name' (bndr, rhs) + ; subst2 = extendVarEnv subst1 bndr bndr' + ; tidy_env2 = (occ_env2, subst2) } + ; return (tidy_env2, NonRec bndr' rhs') } + where + caf_info = hasCafRefs hmods subst1 (idArity bndr) rhs + +tidyTopBind hmods mod nc_var ext_ids tidy_env1@(occ_env1,subst1) (Rec prs) + = do { (occ_env2, names') <- tidyTopNames mod nc_var ext_ids occ_env1 bndrs + ; let { prs' = zipWith (tidyTopPair ext_ids tidy_env2 caf_info) + names' prs + ; subst2 = extendVarEnvList subst1 (bndrs `zip` map fst prs') + ; tidy_env2 = (occ_env2, subst2) } + ; return (tidy_env2, Rec prs') } + where + bndrs = map fst prs + + -- the CafInfo for a recursive group says whether *any* rhs in + -- the group may refer indirectly to a CAF (because then, they all do). + caf_info + | or [ mayHaveCafRefs (hasCafRefs hmods subst1 (idArity bndr) rhs) + | (bndr,rhs) <- prs ] = MayHaveCafRefs + | otherwise = NoCafRefs + +-------------------------------------------------------------------- +-- tidyTopName +-- This is where we set names to local/global based on whether they really are +-- externally visible (see comment at the top of this module). If the name +-- was previously local, we have to give it a unique occurrence name if +-- we intend to externalise it. 
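+--
+-- In summary, the four cases handled below are (global = the Id already has
+-- an External Name; external = the Id is listed in ext_ids):
+--
+--                      internal                    external
+--      global          localise the Name           keep the Name unchanged
+--      local           fresh Internal Name         allocateGlobalBinder
+--                      (new system-wide Unique)    (unique from the NameCache)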
+tidyTopNames mod nc_var ext_ids occ_env [] = return (occ_env, [])
+tidyTopNames mod nc_var ext_ids occ_env (id:ids)
+  = do { (occ_env1, name)  <- tidyTopName  mod nc_var ext_ids occ_env id
+       ; (occ_env2, names) <- tidyTopNames mod nc_var ext_ids occ_env1 ids
+       ; return (occ_env2, name:names) }
+
+tidyTopName :: Module -> IORef NameCache -> VarEnv Bool -> TidyOccEnv
+            -> Id -> IO (TidyOccEnv, Name)
+tidyTopName mod nc_var ext_ids occ_env id
+  | global && internal = return (occ_env, localiseName name)
+
+  | global && external = return (occ_env, name)
+       -- Global names are assumed to have been allocated by the renamer,
+       -- so they already have the "right" unique
+       -- And it's a system-wide unique too
+
+  -- Now we get to the real reason that all this is in the IO Monad:
+  -- we have to update the name cache in a nice atomic fashion
+
+  | local  && internal = do { nc <- readIORef nc_var
+                            ; let (nc', new_local_name) = mk_new_local nc
+                            ; writeIORef nc_var nc'
+                            ; return (occ_env', new_local_name) }
+       -- Even local, internal names must get a unique occurrence, because
+       -- if we do -split-objs we externalise the name later, in the code generator
+       --
+       -- Similarly, we must make sure it has a system-wide Unique, because
+       -- the byte-code generator builds a system-wide Name->BCO symbol table
+
+  | local  && external = do { nc <- readIORef nc_var
+                            ; let (nc', new_external_name) = mk_new_external nc
+                            ; writeIORef nc_var nc'
+                            ; return (occ_env', new_external_name) }
+  where
+    name      = idName id
+    external  = id `elemVarEnv` ext_ids
+    global    = isExternalName name
+    local     = not global
+    internal  = not external
+    mb_parent = nameParent_maybe name
+    loc       = nameSrcLoc name
+
+    (occ_env', occ') = tidyOccName occ_env (nameOccName name)
+
+    mk_new_local nc = (nc { nsUniqs = us2 }, mkInternalName uniq occ' loc)
+                    where
+                      (us1, us2) = splitUniqSupply (nsUniqs nc)
+                      uniq       = uniqFromSupply us1
+
+    mk_new_external nc = allocateGlobalBinder nc mod occ' mb_parent loc
+       -- If we want to externalise a currently-local name, check
+       -- whether we have already assigned a unique for it.
+       -- If so, use it; if not, extend the table.
+       -- All this is done by allocateGlobalBinder.
+       -- This is needed when *re*-compiling a module in GHCi; we must
+       -- use the same name for externally-visible things as we did before.
+
+
+-----------------------------------------------------------
+tidyTopPair :: VarEnv Bool
+            -> TidyEnv  -- The TidyEnv is used to tidy the IdInfo
+                        -- It is knot-tied: don't look at it!
+ -> CafInfo + -> Name -- New name + -> (Id, CoreExpr) -- Binder and RHS before tidying + -> (Id, CoreExpr) + -- This function is the heart of Step 2 + -- The rec_tidy_env is the one to use for the IdInfo + -- It's necessary because when we are dealing with a recursive + -- group, a variable late in the group might be mentioned + -- in the IdInfo of one early in the group + +tidyTopPair ext_ids rhs_tidy_env caf_info name' (bndr, rhs) + | isGlobalId bndr -- Injected binding for record selector, etc + = (bndr, tidyExpr rhs_tidy_env rhs) + | otherwise + = (bndr', rhs') + where + bndr' = mkVanillaGlobal name' ty' idinfo' + ty' = tidyTopType (idType bndr) + rhs' = tidyExpr rhs_tidy_env rhs + idinfo' = tidyTopIdInfo rhs_tidy_env (isJust maybe_external) + (idInfo bndr) unfold_info arity + caf_info + + -- Expose an unfolding if ext_ids tells us to + -- Remember that ext_ids maps an Id to a Bool: + -- True to show the unfolding, False to hide it + maybe_external = lookupVarEnv ext_ids bndr + show_unfold = maybe_external `orElse` False + unfold_info | show_unfold = mkTopUnfolding rhs' + | otherwise = noUnfolding + + -- Usually the Id will have an accurate arity on it, because + -- the simplifier has just run, but not always. + -- One case I found was when the last thing the simplifier + -- did was to let-bind a non-atomic argument and then float + -- it to the top level. So it seems more robust just to + -- fix it here. + arity = exprArity rhs + + +-- tidyTopIdInfo creates the final IdInfo for top-level +-- binders. There are two delicate pieces: +-- +-- * Arity. After CoreTidy, this arity must not change any more. +-- Indeed, CorePrep must eta expand where necessary to make +-- the manifest arity equal to the claimed arity. +-- +-- * CAF info. This must also remain valid through to code generation. +-- We add the info here so that it propagates to all +-- occurrences of the binders in RHSs, and hence to occurrences in +-- unfoldings, which are inside Ids imported by GHCi. Ditto RULES. +-- CoreToStg makes use of this when constructing SRTs. + +tidyTopIdInfo tidy_env is_external idinfo unfold_info arity caf_info + | not is_external -- For internal Ids (not externally visible) + = vanillaIdInfo -- we only need enough info for code generation + -- Arity and strictness info are enough; + -- c.f. CoreTidy.tidyLetBndr + `setCafInfo` caf_info + `setArityInfo` arity + `setAllStrictnessInfo` newStrictnessInfo idinfo + + | otherwise -- Externally-visible Ids get the whole lot + = vanillaIdInfo + `setCafInfo` caf_info + `setArityInfo` arity + `setAllStrictnessInfo` newStrictnessInfo idinfo + `setInlinePragInfo` inlinePragInfo idinfo + `setUnfoldingInfo` unfold_info + `setWorkerInfo` tidyWorker tidy_env (workerInfo idinfo) + -- NB: we throw away the Rules + -- They have already been extracted by findExternalRules + + + +------------ Worker -------------- +tidyWorker tidy_env (HasWorker work_id wrap_arity) + = HasWorker (tidyVarOcc tidy_env work_id) wrap_arity +tidyWorker tidy_env other + = NoWorker +\end{code} + +%************************************************************************ +%* * +\subsection{Figuring out CafInfo for an expression} +%* * +%************************************************************************ + +hasCafRefs decides whether a top-level closure can point into the dynamic heap. +We mark such things as `MayHaveCafRefs' because this information is +used to decide whether a particular closure needs to be referenced +in an SRT or not. 
+ +There are two reasons for setting MayHaveCafRefs: + a) The RHS is a CAF: a top-level updatable thunk. + b) The RHS refers to something that MayHaveCafRefs + +Possible improvement: In an effort to keep the number of CAFs (and +hence the size of the SRTs) down, we could also look at the expression and +decide whether it requires a small bounded amount of heap, so we can ignore +it as a CAF. In these cases however, we would need to use an additional +CAF list to keep track of non-collectable CAFs. + +\begin{code} +hasCafRefs :: HomeModules -> VarEnv Var -> Arity -> CoreExpr -> CafInfo +hasCafRefs hmods p arity expr + | is_caf || mentions_cafs = MayHaveCafRefs + | otherwise = NoCafRefs + where + mentions_cafs = isFastTrue (cafRefs p expr) + is_caf = not (arity > 0 || rhsIsStatic hmods expr) + -- NB. we pass in the arity of the expression, which is expected + -- to be calculated by exprArity. This is because exprArity + -- knows how much eta expansion is going to be done by + -- CorePrep later on, and we don't want to duplicate that + -- knowledge in rhsIsStatic below. + +cafRefs p (Var id) + -- imported Ids first: + | not (isLocalId id) = fastBool (mayHaveCafRefs (idCafInfo id)) + -- now Ids local to this module: + | otherwise = + case lookupVarEnv p id of + Just id' -> fastBool (mayHaveCafRefs (idCafInfo id')) + Nothing -> fastBool False + +cafRefs p (Lit l) = fastBool False +cafRefs p (App f a) = fastOr (cafRefs p f) (cafRefs p) a +cafRefs p (Lam x e) = cafRefs p e +cafRefs p (Let b e) = fastOr (cafRefss p (rhssOfBind b)) (cafRefs p) e +cafRefs p (Case e bndr _ alts) = fastOr (cafRefs p e) (cafRefss p) (rhssOfAlts alts) +cafRefs p (Note n e) = cafRefs p e +cafRefs p (Type t) = fastBool False + +cafRefss p [] = fastBool False +cafRefss p (e:es) = fastOr (cafRefs p e) (cafRefss p) es + +-- hack for lazy-or over FastBool. +fastOr a f x = fastBool (isFastTrue a || isFastTrue (f x)) +\end{code} diff --git a/compiler/nativeGen/AsmCodeGen.lhs b/compiler/nativeGen/AsmCodeGen.lhs new file mode 100644 index 0000000000..1576162167 --- /dev/null +++ b/compiler/nativeGen/AsmCodeGen.lhs @@ -0,0 +1,545 @@ +-- ----------------------------------------------------------------------------- +-- +-- (c) The University of Glasgow 1993-2004 +-- +-- This is the top-level module in the native code generator. 
+-- +-- ----------------------------------------------------------------------------- + +\begin{code} +module AsmCodeGen ( nativeCodeGen ) where + +#include "HsVersions.h" +#include "NCG.h" + +import MachInstrs +import MachRegs +import MachCodeGen +import PprMach +import RegisterAlloc +import RegAllocInfo ( jumpDests ) +import NCGMonad +import PositionIndependentCode + +import Cmm +import CmmOpt ( cmmMiniInline, cmmMachOpFold ) +import PprCmm ( pprStmt, pprCmms ) +import MachOp +import CLabel ( CLabel, mkSplitMarkerLabel, mkAsmTempLabel ) +#if powerpc_TARGET_ARCH +import CLabel ( mkRtsCodeLabel ) +#endif + +import UniqFM +import Unique ( Unique, getUnique ) +import UniqSupply +import FastTypes +import List ( groupBy, sortBy ) +import CLabel ( pprCLabel ) +import ErrUtils ( dumpIfSet_dyn ) +import DynFlags ( DynFlags, DynFlag(..), dopt ) +import StaticFlags ( opt_Static, opt_PIC ) + +import Digraph +import qualified Pretty +import Outputable +import FastString + +-- DEBUGGING ONLY +--import OrdList + +#ifdef NCG_DEBUG +import List ( intersperse ) +#endif + +import DATA_INT +import DATA_WORD +import DATA_BITS +import GLAEXTS + +{- +The native-code generator has machine-independent and +machine-dependent modules. + +This module ("AsmCodeGen") is the top-level machine-independent +module. Before entering machine-dependent land, we do some +machine-independent optimisations (defined below) on the +'CmmStmts's. + +We convert to the machine-specific 'Instr' datatype with +'cmmCodeGen', assuming an infinite supply of registers. We then use +a machine-independent register allocator ('regAlloc') to rejoin +reality. Obviously, 'regAlloc' has machine-specific helper +functions (see about "RegAllocInfo" below). + +Finally, we order the basic blocks of the function so as to minimise +the number of jumps between blocks, by utilising fallthrough wherever +possible. + +The machine-dependent bits break down as follows: + + * ["MachRegs"] Everything about the target platform's machine + registers (and immediate operands, and addresses, which tend to + intermingle/interact with registers). + + * ["MachInstrs"] Includes the 'Instr' datatype (possibly should + have a module of its own), plus a miscellany of other things + (e.g., 'targetDoubleSize', 'smStablePtrTable', ...) + + * ["MachCodeGen"] is where 'Cmm' stuff turns into + machine instructions. + + * ["PprMach"] 'pprInstr' turns an 'Instr' into text (well, really + a 'Doc'). + + * ["RegAllocInfo"] In the register allocator, we manipulate + 'MRegsState's, which are 'BitSet's, one bit per machine register. + When we want to say something about a specific machine register + (e.g., ``it gets clobbered by this instruction''), we set/unset + its bit. Obviously, we do this 'BitSet' thing for efficiency + reasons. + + The 'RegAllocInfo' module collects together the machine-specific + info needed to do register allocation. + + * ["RegisterAlloc"] The (machine-independent) register allocator. +-} + +-- ----------------------------------------------------------------------------- +-- Top-level of the native codegen + +-- NB. We *lazilly* compile each block of code for space reasons. 
+ +nativeCodeGen :: DynFlags -> [Cmm] -> UniqSupply -> IO Pretty.Doc +nativeCodeGen dflags cmms us + = let (res, _) = initUs us $ + cgCmm (concat (map add_split cmms)) + + cgCmm :: [CmmTop] -> UniqSM (Cmm, Pretty.Doc, [CLabel]) + cgCmm tops = + lazyMapUs (cmmNativeGen dflags) tops `thenUs` \ results -> + case unzip3 results of { (cmms,docs,imps) -> + returnUs (Cmm cmms, my_vcat docs, concat imps) + } + in + case res of { (ppr_cmms, insn_sdoc, imports) -> do + dumpIfSet_dyn dflags Opt_D_dump_opt_cmm "Optimised Cmm" (pprCmms [ppr_cmms]) + return (insn_sdoc Pretty.$$ dyld_stubs imports +#if HAVE_SUBSECTIONS_VIA_SYMBOLS + -- On recent versions of Darwin, the linker supports + -- dead-stripping of code and data on a per-symbol basis. + -- There's a hack to make this work in PprMach.pprNatCmmTop. + Pretty.$$ Pretty.text ".subsections_via_symbols" +#endif + ) + } + + where + + add_split (Cmm tops) + | dopt Opt_SplitObjs dflags = split_marker : tops + | otherwise = tops + + split_marker = CmmProc [] mkSplitMarkerLabel [] [] + + -- Generate "symbol stubs" for all external symbols that might + -- come from a dynamic library. +{- dyld_stubs imps = Pretty.vcat $ map pprDyldSymbolStub $ + map head $ group $ sort imps-} + + -- (Hack) sometimes two Labels pretty-print the same, but have + -- different uniques; so we compare their text versions... + dyld_stubs imps + | needImportedSymbols + = Pretty.vcat $ + (pprGotDeclaration :) $ + map (pprImportedSymbol . fst . head) $ + groupBy (\(_,a) (_,b) -> a == b) $ + sortBy (\(_,a) (_,b) -> compare a b) $ + map doPpr $ + imps + | otherwise + = Pretty.empty + + where doPpr lbl = (lbl, Pretty.render $ pprCLabel lbl astyle) + astyle = mkCodeStyle AsmStyle + +#ifndef NCG_DEBUG + my_vcat sds = Pretty.vcat sds +#else + my_vcat sds = Pretty.vcat ( + intersperse ( + Pretty.char ' ' + Pretty.$$ Pretty.ptext SLIT("# ___ncg_debug_marker") + Pretty.$$ Pretty.char ' ' + ) + sds + ) +#endif + + +-- Complete native code generation phase for a single top-level chunk +-- of Cmm. + +cmmNativeGen :: DynFlags -> CmmTop -> UniqSM (CmmTop, Pretty.Doc, [CLabel]) +cmmNativeGen dflags cmm + = {-# SCC "fixAssigns" #-} + fixAssignsTop cmm `thenUs` \ fixed_cmm -> + {-# SCC "genericOpt" #-} + cmmToCmm fixed_cmm `bind` \ (cmm, imports) -> + (if dopt Opt_D_dump_opt_cmm dflags -- space leak avoidance + then cmm + else CmmData Text []) `bind` \ ppr_cmm -> + {-# SCC "genMachCode" #-} + genMachCode cmm `thenUs` \ (pre_regalloc, lastMinuteImports) -> + {-# SCC "regAlloc" #-} + mapUs regAlloc pre_regalloc `thenUs` \ with_regs -> + {-# SCC "sequenceBlocks" #-} + map sequenceTop with_regs `bind` \ sequenced -> + {-# SCC "x86fp_kludge" #-} + map x86fp_kludge sequenced `bind` \ final_mach_code -> + {-# SCC "vcat" #-} + Pretty.vcat (map pprNatCmmTop final_mach_code) `bind` \ final_sdoc -> + + returnUs (ppr_cmm, final_sdoc Pretty.$$ Pretty.text "", lastMinuteImports ++ imports) + where + x86fp_kludge :: NatCmmTop -> NatCmmTop + x86fp_kludge top@(CmmData _ _) = top +#if i386_TARGET_ARCH + x86fp_kludge top@(CmmProc info lbl params code) = + CmmProc info lbl params (map bb_i386_insert_ffrees code) + where + bb_i386_insert_ffrees (BasicBlock id instrs) = + BasicBlock id (i386_insert_ffrees instrs) +#else + x86fp_kludge top = top +#endif + +-- ----------------------------------------------------------------------------- +-- Sequencing the basic blocks + +-- Cmm BasicBlocks are self-contained entities: they always end in a +-- jump, either non-local or to another basic block in the same proc. 
+-- In this phase, we attempt to place the basic blocks in a sequence +-- such that as many of the local jumps as possible turn into +-- fallthroughs. + +sequenceTop :: NatCmmTop -> NatCmmTop +sequenceTop top@(CmmData _ _) = top +sequenceTop (CmmProc info lbl params blocks) = + CmmProc info lbl params (sequenceBlocks blocks) + +-- The algorithm is very simple (and stupid): we make a graph out of +-- the blocks where there is an edge from one block to another iff the +-- first block ends by jumping to the second. Then we topologically +-- sort this graph. Then traverse the list: for each block, we first +-- output the block, then if it has an out edge, we move the +-- destination of the out edge to the front of the list, and continue. + +sequenceBlocks :: [NatBasicBlock] -> [NatBasicBlock] +sequenceBlocks [] = [] +sequenceBlocks (entry:blocks) = + seqBlocks (mkNode entry : reverse (flattenSCCs (sccBlocks blocks))) + -- the first block is the entry point ==> it must remain at the start. + +sccBlocks :: [NatBasicBlock] -> [SCC (NatBasicBlock,Unique,[Unique])] +sccBlocks blocks = stronglyConnCompR (map mkNode blocks) + +getOutEdges :: [Instr] -> [Unique] +getOutEdges instrs = case jumpDests (last instrs) [] of + [one] -> [getUnique one] + _many -> [] + -- we're only interested in the last instruction of + -- the block, and only if it has a single destination. + +mkNode block@(BasicBlock id instrs) = (block, getUnique id, getOutEdges instrs) + +seqBlocks [] = [] +seqBlocks ((block,_,[]) : rest) + = block : seqBlocks rest +seqBlocks ((block@(BasicBlock id instrs),_,[next]) : rest) + | can_fallthrough = BasicBlock id (init instrs) : seqBlocks rest' + | otherwise = block : seqBlocks rest' + where + (can_fallthrough, rest') = reorder next [] rest + -- TODO: we should do a better job for cycles; try to maximise the + -- fallthroughs within a loop. +seqBlocks _ = panic "AsmCodegen:seqBlocks" + +reorder id accum [] = (False, reverse accum) +reorder id accum (b@(block,id',out) : rest) + | id == id' = (True, (block,id,out) : reverse accum ++ rest) + | otherwise = reorder id (b:accum) rest + +-- ----------------------------------------------------------------------------- +-- Instruction selection + +-- Native code instruction selection for a chunk of stix code. For +-- this part of the computation, we switch from the UniqSM monad to +-- the NatM monad. The latter carries not only a Unique, but also an +-- Int denoting the current C stack pointer offset in the generated +-- code; this is needed for creating correct spill offsets on +-- architectures which don't offer, or for which it would be +-- prohibitively expensive to employ, a frame pointer register. Viz, +-- x86. + +-- The offset is measured in bytes, and indicates the difference +-- between the current (simulated) C stack-ptr and the value it was at +-- the beginning of the block. For stacks which grow down, this value +-- should be either zero or negative. + +-- Switching between the two monads whilst carrying along the same +-- Unique supply breaks abstraction. Is that bad? 
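+
+-- A rough illustration of the offset (not tied to any particular instruction
+-- sequence): if the selected code for a statement pushes one word onto the C
+-- stack on x86, the simulated offset moves from 0 to -4, and any spill-slot
+-- address computed relative to the stack pointer while it stays there has to
+-- be corrected by that amount.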
+ +genMachCode :: CmmTop -> UniqSM ([NatCmmTop], [CLabel]) + +genMachCode cmm_top initial_us + = let initial_st = mkNatM_State initial_us 0 + (new_tops, final_st) = initNat initial_st (cmmTopCodeGen cmm_top) + final_us = natm_us final_st + final_delta = natm_delta final_st + final_imports = natm_imports final_st + in + if final_delta == 0 + then ((new_tops, final_imports), final_us) + else pprPanic "genMachCode: nonzero final delta" + (int final_delta) + +-- ----------------------------------------------------------------------------- +-- Fixup assignments to global registers so that they assign to +-- locations within the RegTable, if appropriate. + +-- Note that we currently don't fixup reads here: they're done by +-- the generic optimiser below, to avoid having two separate passes +-- over the Cmm. + +fixAssignsTop :: CmmTop -> UniqSM CmmTop +fixAssignsTop top@(CmmData _ _) = returnUs top +fixAssignsTop (CmmProc info lbl params blocks) = + mapUs fixAssignsBlock blocks `thenUs` \ blocks' -> + returnUs (CmmProc info lbl params blocks') + +fixAssignsBlock :: CmmBasicBlock -> UniqSM CmmBasicBlock +fixAssignsBlock (BasicBlock id stmts) = + fixAssigns stmts `thenUs` \ stmts' -> + returnUs (BasicBlock id stmts') + +fixAssigns :: [CmmStmt] -> UniqSM [CmmStmt] +fixAssigns stmts = + mapUs fixAssign stmts `thenUs` \ stmtss -> + returnUs (concat stmtss) + +fixAssign :: CmmStmt -> UniqSM [CmmStmt] +fixAssign (CmmAssign (CmmGlobal BaseReg) src) + = panic "cmmStmtConFold: assignment to BaseReg"; + +fixAssign (CmmAssign (CmmGlobal reg) src) + | Left realreg <- reg_or_addr + = returnUs [CmmAssign (CmmGlobal reg) src] + | Right baseRegAddr <- reg_or_addr + = returnUs [CmmStore baseRegAddr src] + -- Replace register leaves with appropriate StixTrees for + -- the given target. GlobalRegs which map to a reg on this + -- arch are left unchanged. Assigning to BaseReg is always + -- illegal, so we check for that. + where + reg_or_addr = get_GlobalReg_reg_or_addr reg + +fixAssign (CmmCall target results args vols) + = mapAndUnzipUs fixResult results `thenUs` \ (results',stores) -> + returnUs (caller_save ++ + CmmCall target results' args vols : + caller_restore ++ + concat stores) + where + -- we also save/restore any caller-saves STG registers here + (caller_save, caller_restore) = callerSaveVolatileRegs vols + + fixResult g@(CmmGlobal reg,hint) = + case get_GlobalReg_reg_or_addr reg of + Left realreg -> returnUs (g, []) + Right baseRegAddr -> + getUniqueUs `thenUs` \ uq -> + let local = CmmLocal (LocalReg uq (globalRegRep reg)) in + returnUs ((local,hint), + [CmmStore baseRegAddr (CmmReg local)]) + fixResult other = + returnUs (other,[]) + +fixAssign other_stmt = returnUs [other_stmt] + +-- ----------------------------------------------------------------------------- +-- Generic Cmm optimiser + +{- +Here we do: + + (a) Constant folding + (b) Simple inlining: a temporary which is assigned to and then + used, once, can be shorted. + (c) Replacement of references to GlobalRegs which do not have + machine registers by the appropriate memory load (eg. + Hp ==> *(BaseReg + 34) ). 
+ (d) Position independent code and dynamic linking + (i) introduce the appropriate indirections + and position independent refs + (ii) compile a list of imported symbols + +Ideas for other things we could do (ToDo): + + - shortcut jumps-to-jumps + - eliminate dead code blocks + - simple CSE: if an expr is assigned to a temp, then replace later occs of + that expr with the temp, until the expr is no longer valid (can push through + temp assignments, and certain assigns to mem...) +-} + +cmmToCmm :: CmmTop -> (CmmTop, [CLabel]) +cmmToCmm top@(CmmData _ _) = (top, []) +cmmToCmm (CmmProc info lbl params blocks) = runCmmOpt $ do + blocks' <- mapM cmmBlockConFold (cmmMiniInline blocks) + return $ CmmProc info lbl params blocks' + +newtype CmmOptM a = CmmOptM ([CLabel] -> (# a, [CLabel] #)) + +instance Monad CmmOptM where + return x = CmmOptM $ \imports -> (# x,imports #) + (CmmOptM f) >>= g = + CmmOptM $ \imports -> + case f imports of + (# x, imports' #) -> + case g x of + CmmOptM g' -> g' imports' + +addImportCmmOpt :: CLabel -> CmmOptM () +addImportCmmOpt lbl = CmmOptM $ \imports -> (# (), lbl:imports #) + +runCmmOpt :: CmmOptM a -> (a, [CLabel]) +runCmmOpt (CmmOptM f) = case f [] of + (# result, imports #) -> (result, imports) + +cmmBlockConFold :: CmmBasicBlock -> CmmOptM CmmBasicBlock +cmmBlockConFold (BasicBlock id stmts) = do + stmts' <- mapM cmmStmtConFold stmts + return $ BasicBlock id stmts' + +cmmStmtConFold stmt + = case stmt of + CmmAssign reg src + -> do src' <- cmmExprConFold False src + return $ case src' of + CmmReg reg' | reg == reg' -> CmmNop + new_src -> CmmAssign reg new_src + + CmmStore addr src + -> do addr' <- cmmExprConFold False addr + src' <- cmmExprConFold False src + return $ CmmStore addr' src' + + CmmJump addr regs + -> do addr' <- cmmExprConFold True addr + return $ CmmJump addr' regs + + CmmCall target regs args vols + -> do target' <- case target of + CmmForeignCall e conv -> do + e' <- cmmExprConFold True e + return $ CmmForeignCall e' conv + other -> return other + args' <- mapM (\(arg, hint) -> do + arg' <- cmmExprConFold False arg + return (arg', hint)) args + return $ CmmCall target' regs args' vols + + CmmCondBranch test dest + -> do test' <- cmmExprConFold False test + return $ case test' of + CmmLit (CmmInt 0 _) -> + CmmComment (mkFastString ("deleted: " ++ + showSDoc (pprStmt stmt))) + + CmmLit (CmmInt n _) -> CmmBranch dest + other -> CmmCondBranch test' dest + + CmmSwitch expr ids + -> do expr' <- cmmExprConFold False expr + return $ CmmSwitch expr' ids + + other + -> return other + + +cmmExprConFold isJumpTarget expr + = case expr of + CmmLoad addr rep + -> do addr' <- cmmExprConFold False addr + return $ CmmLoad addr' rep + + CmmMachOp mop args + -- For MachOps, we first optimize the children, and then we try + -- our hand at some constant-folding. 
+ -> do args' <- mapM (cmmExprConFold False) args + return $ cmmMachOpFold mop args' + + CmmLit (CmmLabel lbl) + -> cmmMakeDynamicReference addImportCmmOpt isJumpTarget lbl + CmmLit (CmmLabelOff lbl off) + -> do dynRef <- cmmMakeDynamicReference addImportCmmOpt isJumpTarget lbl + return $ cmmMachOpFold (MO_Add wordRep) [ + dynRef, + (CmmLit $ CmmInt (fromIntegral off) wordRep) + ] + +#if powerpc_TARGET_ARCH + -- On powerpc (non-PIC), it's easier to jump directly to a label than + -- to use the register table, so we replace these registers + -- with the corresponding labels: + CmmReg (CmmGlobal GCEnter1) + | not opt_PIC + -> cmmExprConFold isJumpTarget $ + CmmLit (CmmLabel (mkRtsCodeLabel SLIT( "__stg_gc_enter_1"))) + CmmReg (CmmGlobal GCFun) + | not opt_PIC + -> cmmExprConFold isJumpTarget $ + CmmLit (CmmLabel (mkRtsCodeLabel SLIT( "__stg_gc_fun"))) +#endif + + CmmReg (CmmGlobal mid) + -- Replace register leaves with appropriate StixTrees for + -- the given target. MagicIds which map to a reg on this + -- arch are left unchanged. For the rest, BaseReg is taken + -- to mean the address of the reg table in MainCapability, + -- and for all others we generate an indirection to its + -- location in the register table. + -> case get_GlobalReg_reg_or_addr mid of + Left realreg -> return expr + Right baseRegAddr + -> case mid of + BaseReg -> cmmExprConFold False baseRegAddr + other -> cmmExprConFold False (CmmLoad baseRegAddr + (globalRegRep mid)) + -- eliminate zero offsets + CmmRegOff reg 0 + -> cmmExprConFold False (CmmReg reg) + + CmmRegOff (CmmGlobal mid) offset + -- RegOf leaves are just a shorthand form. If the reg maps + -- to a real reg, we keep the shorthand, otherwise, we just + -- expand it and defer to the above code. + -> case get_GlobalReg_reg_or_addr mid of + Left realreg -> return expr + Right baseRegAddr + -> cmmExprConFold False (CmmMachOp (MO_Add wordRep) [ + CmmReg (CmmGlobal mid), + CmmLit (CmmInt (fromIntegral offset) + wordRep)]) + other + -> return other + +-- ----------------------------------------------------------------------------- +-- Utils + +bind f x = x $! f + +\end{code} + diff --git a/compiler/nativeGen/MachCodeGen.hs b/compiler/nativeGen/MachCodeGen.hs new file mode 100644 index 0000000000..90ce6b5bf8 --- /dev/null +++ b/compiler/nativeGen/MachCodeGen.hs @@ -0,0 +1,4654 @@ +----------------------------------------------------------------------------- +-- +-- Generating machine code (instruction selection) +-- +-- (c) The University of Glasgow 1996-2004 +-- +----------------------------------------------------------------------------- + +-- This is a big module, but, if you pay attention to +-- (a) the sectioning, (b) the type signatures, and +-- (c) the #if blah_TARGET_ARCH} things, the +-- structure should not be too overwhelming. + +module MachCodeGen ( cmmTopCodeGen, InstrBlock ) where + +#include "HsVersions.h" +#include "nativeGen/NCG.h" +#include "MachDeps.h" + +-- NCG stuff: +import MachInstrs +import MachRegs +import NCGMonad +import PositionIndependentCode ( cmmMakeDynamicReference, initializePicBase ) +import RegAllocInfo ( mkBranchInstr ) + +-- Our intermediate code: +import PprCmm ( pprExpr ) +import Cmm +import MachOp +import CLabel + +-- The rest: +import StaticFlags ( opt_PIC ) +import ForeignCall ( CCallConv(..) 
) +import OrdList +import Pretty +import Outputable +import FastString +import FastTypes ( isFastTrue ) +import Constants ( wORD_SIZE ) + +#ifdef DEBUG +import Outputable ( assertPanic ) +import TRACE ( trace ) +#endif + +import Control.Monad ( mapAndUnzipM ) +import Maybe ( fromJust ) +import DATA_BITS +import DATA_WORD + +-- ----------------------------------------------------------------------------- +-- Top-level of the instruction selector + +-- | 'InstrBlock's are the insn sequences generated by the insn selectors. +-- They are really trees of insns to facilitate fast appending, where a +-- left-to-right traversal (pre-order?) yields the insns in the correct +-- order. + +type InstrBlock = OrdList Instr + +cmmTopCodeGen :: CmmTop -> NatM [NatCmmTop] +cmmTopCodeGen (CmmProc info lab params blocks) = do + (nat_blocks,statics) <- mapAndUnzipM basicBlockCodeGen blocks + picBaseMb <- getPicBaseMaybeNat + let proc = CmmProc info lab params (concat nat_blocks) + tops = proc : concat statics + case picBaseMb of + Just picBase -> initializePicBase picBase tops + Nothing -> return tops + +cmmTopCodeGen (CmmData sec dat) = do + return [CmmData sec dat] -- no translation, we just use CmmStatic + +basicBlockCodeGen :: CmmBasicBlock -> NatM ([NatBasicBlock],[NatCmmTop]) +basicBlockCodeGen (BasicBlock id stmts) = do + instrs <- stmtsToInstrs stmts + -- code generation may introduce new basic block boundaries, which + -- are indicated by the NEWBLOCK instruction. We must split up the + -- instruction stream into basic blocks again. Also, we extract + -- LDATAs here too. + let + (top,other_blocks,statics) = foldrOL mkBlocks ([],[],[]) instrs + + mkBlocks (NEWBLOCK id) (instrs,blocks,statics) + = ([], BasicBlock id instrs : blocks, statics) + mkBlocks (LDATA sec dat) (instrs,blocks,statics) + = (instrs, blocks, CmmData sec dat:statics) + mkBlocks instr (instrs,blocks,statics) + = (instr:instrs, blocks, statics) + -- in + return (BasicBlock id top : other_blocks, statics) + +stmtsToInstrs :: [CmmStmt] -> NatM InstrBlock +stmtsToInstrs stmts + = do instrss <- mapM stmtToInstrs stmts + return (concatOL instrss) + +stmtToInstrs :: CmmStmt -> NatM InstrBlock +stmtToInstrs stmt = case stmt of + CmmNop -> return nilOL + CmmComment s -> return (unitOL (COMMENT s)) + + CmmAssign reg src + | isFloatingRep kind -> assignReg_FltCode kind reg src +#if WORD_SIZE_IN_BITS==32 + | kind == I64 -> assignReg_I64Code reg src +#endif + | otherwise -> assignReg_IntCode kind reg src + where kind = cmmRegRep reg + + CmmStore addr src + | isFloatingRep kind -> assignMem_FltCode kind addr src +#if WORD_SIZE_IN_BITS==32 + | kind == I64 -> assignMem_I64Code addr src +#endif + | otherwise -> assignMem_IntCode kind addr src + where kind = cmmExprRep src + + CmmCall target result_regs args vols + -> genCCall target result_regs args vols + + CmmBranch id -> genBranch id + CmmCondBranch arg id -> genCondJump id arg + CmmSwitch arg ids -> genSwitch arg ids + CmmJump arg params -> genJump arg + +-- ----------------------------------------------------------------------------- +-- General things for putting together code sequences + +-- Expand CmmRegOff. ToDo: should we do it this way around, or convert +-- CmmExprs into CmmRegOff? 
+mangleIndexTree :: CmmExpr -> CmmExpr +mangleIndexTree (CmmRegOff reg off) + = CmmMachOp (MO_Add rep) [CmmReg reg, CmmLit (CmmInt (fromIntegral off) rep)] + where rep = cmmRegRep reg + +-- ----------------------------------------------------------------------------- +-- Code gen for 64-bit arithmetic on 32-bit platforms + +{- +Simple support for generating 64-bit code (ie, 64 bit values and 64 +bit assignments) on 32-bit platforms. Unlike the main code generator +we merely shoot for generating working code as simply as possible, and +pay little attention to code quality. Specifically, there is no +attempt to deal cleverly with the fixed-vs-floating register +distinction; all values are generated into (pairs of) floating +registers, even if this would mean some redundant reg-reg moves as a +result. Only one of the VRegUniques is returned, since it will be +of the VRegUniqueLo form, and the upper-half VReg can be determined +by applying getHiVRegFromLo to it. +-} + +data ChildCode64 -- a.k.a "Register64" + = ChildCode64 + InstrBlock -- code + Reg -- the lower 32-bit temporary which contains the + -- result; use getHiVRegFromLo to find the other + -- VRegUnique. Rules of this simplified insn + -- selection game are therefore that the returned + -- Reg may be modified + +#if WORD_SIZE_IN_BITS==32 +assignMem_I64Code :: CmmExpr -> CmmExpr -> NatM InstrBlock +assignReg_I64Code :: CmmReg -> CmmExpr -> NatM InstrBlock +#endif + +#ifndef x86_64_TARGET_ARCH +iselExpr64 :: CmmExpr -> NatM ChildCode64 +#endif + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if i386_TARGET_ARCH + +assignMem_I64Code addrTree valueTree = do + Amode addr addr_code <- getAmode addrTree + ChildCode64 vcode rlo <- iselExpr64 valueTree + let + rhi = getHiVRegFromLo rlo + + -- Little-endian store + mov_lo = MOV I32 (OpReg rlo) (OpAddr addr) + mov_hi = MOV I32 (OpReg rhi) (OpAddr (fromJust (addrOffset addr 4))) + -- in + return (vcode `appOL` addr_code `snocOL` mov_lo `snocOL` mov_hi) + + +assignReg_I64Code (CmmLocal (LocalReg u_dst pk)) valueTree = do + ChildCode64 vcode r_src_lo <- iselExpr64 valueTree + let + r_dst_lo = mkVReg u_dst I32 + r_dst_hi = getHiVRegFromLo r_dst_lo + r_src_hi = getHiVRegFromLo r_src_lo + mov_lo = MOV I32 (OpReg r_src_lo) (OpReg r_dst_lo) + mov_hi = MOV I32 (OpReg r_src_hi) (OpReg r_dst_hi) + -- in + return ( + vcode `snocOL` mov_lo `snocOL` mov_hi + ) + +assignReg_I64Code lvalue valueTree + = panic "assignReg_I64Code(i386): invalid lvalue" + +------------ + +iselExpr64 (CmmLit (CmmInt i _)) = do + (rlo,rhi) <- getNewRegPairNat I32 + let + r = fromIntegral (fromIntegral i :: Word32) + q = fromIntegral ((fromIntegral i `shiftR` 32) :: Word32) + code = toOL [ + MOV I32 (OpImm (ImmInteger r)) (OpReg rlo), + MOV I32 (OpImm (ImmInteger q)) (OpReg rhi) + ] + -- in + return (ChildCode64 code rlo) + +iselExpr64 (CmmLoad addrTree I64) = do + Amode addr addr_code <- getAmode addrTree + (rlo,rhi) <- getNewRegPairNat I32 + let + mov_lo = MOV I32 (OpAddr addr) (OpReg rlo) + mov_hi = MOV I32 (OpAddr (fromJust (addrOffset addr 4))) (OpReg rhi) + -- in + return ( + ChildCode64 (addr_code `snocOL` mov_lo `snocOL` mov_hi) + rlo + ) + +iselExpr64 (CmmReg (CmmLocal (LocalReg vu I64))) + = return (ChildCode64 nilOL (mkVReg vu I32)) + +-- we handle addition, but rather badly +iselExpr64 (CmmMachOp (MO_Add _) [e1, CmmLit (CmmInt i _)]) = do + ChildCode64 code1 r1lo <- iselExpr64 e1 + (rlo,rhi) <- getNewRegPairNat I32 + let + r = fromIntegral (fromIntegral i :: Word32) + q = fromIntegral 
((fromIntegral i `shiftR` 32) :: Word32) + r1hi = getHiVRegFromLo r1lo + code = code1 `appOL` + toOL [ MOV I32 (OpReg r1lo) (OpReg rlo), + ADD I32 (OpImm (ImmInteger r)) (OpReg rlo), + MOV I32 (OpReg r1hi) (OpReg rhi), + ADC I32 (OpImm (ImmInteger q)) (OpReg rhi) ] + -- in + return (ChildCode64 code rlo) + +iselExpr64 (CmmMachOp (MO_Add _) [e1,e2]) = do + ChildCode64 code1 r1lo <- iselExpr64 e1 + ChildCode64 code2 r2lo <- iselExpr64 e2 + (rlo,rhi) <- getNewRegPairNat I32 + let + r1hi = getHiVRegFromLo r1lo + r2hi = getHiVRegFromLo r2lo + code = code1 `appOL` + code2 `appOL` + toOL [ MOV I32 (OpReg r1lo) (OpReg rlo), + ADD I32 (OpReg r2lo) (OpReg rlo), + MOV I32 (OpReg r1hi) (OpReg rhi), + ADC I32 (OpReg r2hi) (OpReg rhi) ] + -- in + return (ChildCode64 code rlo) + +iselExpr64 expr + = pprPanic "iselExpr64(i386)" (ppr expr) + +#endif /* i386_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if sparc_TARGET_ARCH + +assignMem_I64Code addrTree valueTree = do + Amode addr addr_code <- getAmode addrTree + ChildCode64 vcode rlo <- iselExpr64 valueTree + (src, code) <- getSomeReg addrTree + let + rhi = getHiVRegFromLo rlo + -- Big-endian store + mov_hi = ST I32 rhi (AddrRegImm src (ImmInt 0)) + mov_lo = ST I32 rlo (AddrRegImm src (ImmInt 4)) + return (vcode `appOL` code `snocOL` mov_hi `snocOL` mov_lo) + +assignReg_I64Code (CmmLocal (LocalReg u_dst pk)) valueTree = do + ChildCode64 vcode r_src_lo <- iselExpr64 valueTree + let + r_dst_lo = mkVReg u_dst pk + r_dst_hi = getHiVRegFromLo r_dst_lo + r_src_hi = getHiVRegFromLo r_src_lo + mov_lo = mkMOV r_src_lo r_dst_lo + mov_hi = mkMOV r_src_hi r_dst_hi + mkMOV sreg dreg = OR False g0 (RIReg sreg) dreg + return (vcode `snocOL` mov_hi `snocOL` mov_lo) +assignReg_I64Code lvalue valueTree + = panic "assignReg_I64Code(sparc): invalid lvalue" + + +-- Don't delete this -- it's very handy for debugging. 
+--iselExpr64 expr +-- | trace ("iselExpr64: " ++ showSDoc (ppr expr)) False +-- = panic "iselExpr64(???)" + +iselExpr64 (CmmLoad addrTree I64) = do + Amode (AddrRegReg r1 r2) addr_code <- getAmode addrTree + rlo <- getNewRegNat I32 + let rhi = getHiVRegFromLo rlo + mov_hi = LD I32 (AddrRegImm r1 (ImmInt 0)) rhi + mov_lo = LD I32 (AddrRegImm r1 (ImmInt 4)) rlo + return ( + ChildCode64 (addr_code `snocOL` mov_hi `snocOL` mov_lo) + rlo + ) + +iselExpr64 (CmmReg (CmmLocal (LocalReg uq I64))) = do + r_dst_lo <- getNewRegNat I32 + let r_dst_hi = getHiVRegFromLo r_dst_lo + r_src_lo = mkVReg uq I32 + r_src_hi = getHiVRegFromLo r_src_lo + mov_lo = mkMOV r_src_lo r_dst_lo + mov_hi = mkMOV r_src_hi r_dst_hi + mkMOV sreg dreg = OR False g0 (RIReg sreg) dreg + return ( + ChildCode64 (toOL [mov_hi, mov_lo]) r_dst_lo + ) + +iselExpr64 expr + = pprPanic "iselExpr64(sparc)" (ppr expr) + +#endif /* sparc_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if powerpc_TARGET_ARCH + +getI64Amodes :: CmmExpr -> NatM (AddrMode, AddrMode, InstrBlock) +getI64Amodes addrTree = do + Amode hi_addr addr_code <- getAmode addrTree + case addrOffset hi_addr 4 of + Just lo_addr -> return (hi_addr, lo_addr, addr_code) + Nothing -> do (hi_ptr, code) <- getSomeReg addrTree + return (AddrRegImm hi_ptr (ImmInt 0), + AddrRegImm hi_ptr (ImmInt 4), + code) + +assignMem_I64Code addrTree valueTree = do + (hi_addr, lo_addr, addr_code) <- getI64Amodes addrTree + ChildCode64 vcode rlo <- iselExpr64 valueTree + let + rhi = getHiVRegFromLo rlo + + -- Big-endian store + mov_hi = ST I32 rhi hi_addr + mov_lo = ST I32 rlo lo_addr + -- in + return (vcode `appOL` addr_code `snocOL` mov_lo `snocOL` mov_hi) + +assignReg_I64Code (CmmLocal (LocalReg u_dst pk)) valueTree = do + ChildCode64 vcode r_src_lo <- iselExpr64 valueTree + let + r_dst_lo = mkVReg u_dst I32 + r_dst_hi = getHiVRegFromLo r_dst_lo + r_src_hi = getHiVRegFromLo r_src_lo + mov_lo = MR r_dst_lo r_src_lo + mov_hi = MR r_dst_hi r_src_hi + -- in + return ( + vcode `snocOL` mov_lo `snocOL` mov_hi + ) + +assignReg_I64Code lvalue valueTree + = panic "assignReg_I64Code(powerpc): invalid lvalue" + + +-- Don't delete this -- it's very handy for debugging. 
+--iselExpr64 expr +-- | trace ("iselExpr64: " ++ showSDoc (pprCmmExpr expr)) False +-- = panic "iselExpr64(???)" + +iselExpr64 (CmmLoad addrTree I64) = do + (hi_addr, lo_addr, addr_code) <- getI64Amodes addrTree + (rlo, rhi) <- getNewRegPairNat I32 + let mov_hi = LD I32 rhi hi_addr + mov_lo = LD I32 rlo lo_addr + return $ ChildCode64 (addr_code `snocOL` mov_lo `snocOL` mov_hi) + rlo + +iselExpr64 (CmmReg (CmmLocal (LocalReg vu I64))) + = return (ChildCode64 nilOL (mkVReg vu I32)) + +iselExpr64 (CmmLit (CmmInt i _)) = do + (rlo,rhi) <- getNewRegPairNat I32 + let + half0 = fromIntegral (fromIntegral i :: Word16) + half1 = fromIntegral ((fromIntegral i `shiftR` 16) :: Word16) + half2 = fromIntegral ((fromIntegral i `shiftR` 32) :: Word16) + half3 = fromIntegral ((fromIntegral i `shiftR` 48) :: Word16) + + code = toOL [ + LIS rlo (ImmInt half1), + OR rlo rlo (RIImm $ ImmInt half0), + LIS rhi (ImmInt half3), + OR rlo rlo (RIImm $ ImmInt half2) + ] + -- in + return (ChildCode64 code rlo) + +iselExpr64 (CmmMachOp (MO_Add _) [e1,e2]) = do + ChildCode64 code1 r1lo <- iselExpr64 e1 + ChildCode64 code2 r2lo <- iselExpr64 e2 + (rlo,rhi) <- getNewRegPairNat I32 + let + r1hi = getHiVRegFromLo r1lo + r2hi = getHiVRegFromLo r2lo + code = code1 `appOL` + code2 `appOL` + toOL [ ADDC rlo r1lo r2lo, + ADDE rhi r1hi r2hi ] + -- in + return (ChildCode64 code rlo) + +iselExpr64 expr + = pprPanic "iselExpr64(powerpc)" (ppr expr) + +#endif /* powerpc_TARGET_ARCH */ + + +-- ----------------------------------------------------------------------------- +-- The 'Register' type + +-- 'Register's passed up the tree. If the stix code forces the register +-- to live in a pre-decided machine register, it comes out as @Fixed@; +-- otherwise, it comes out as @Any@, and the parent can decide which +-- register to put it in. + +data Register + = Fixed MachRep Reg InstrBlock + | Any MachRep (Reg -> InstrBlock) + +swizzleRegisterRep :: Register -> MachRep -> Register +swizzleRegisterRep (Fixed _ reg code) rep = Fixed rep reg code +swizzleRegisterRep (Any _ codefn) rep = Any rep codefn + + +-- ----------------------------------------------------------------------------- +-- Utils based on getRegister, below + +-- The dual to getAnyReg: compute an expression into a register, but +-- we don't mind which one it is. +getSomeReg :: CmmExpr -> NatM (Reg, InstrBlock) +getSomeReg expr = do + r <- getRegister expr + case r of + Any rep code -> do + tmp <- getNewRegNat rep + return (tmp, code tmp) + Fixed _ reg code -> + return (reg, code) + +-- ----------------------------------------------------------------------------- +-- Grab the Reg for a CmmReg + +getRegisterReg :: CmmReg -> Reg + +getRegisterReg (CmmLocal (LocalReg u pk)) + = mkVReg u pk + +getRegisterReg (CmmGlobal mid) + = case get_GlobalReg_reg_or_addr mid of + Left (RealReg rrno) -> RealReg rrno + _other -> pprPanic "getRegisterReg-memory" (ppr $ CmmGlobal mid) + -- By this stage, the only MagicIds remaining should be the + -- ones which map to a real machine register on this + -- platform. Hence ... + + +-- ----------------------------------------------------------------------------- +-- Generate code to get a subtree into a Register + +-- Don't delete this -- it's very handy for debugging. 
+--getRegister expr +-- | trace ("getRegister: " ++ showSDoc (pprCmmExpr expr)) False +-- = panic "getRegister(???)" + +getRegister :: CmmExpr -> NatM Register + +getRegister (CmmReg (CmmGlobal PicBaseReg)) + = do + reg <- getPicBaseNat wordRep + return (Fixed wordRep reg nilOL) + +getRegister (CmmReg reg) + = return (Fixed (cmmRegRep reg) (getRegisterReg reg) nilOL) + +getRegister tree@(CmmRegOff _ _) + = getRegister (mangleIndexTree tree) + +-- end of machine-"independent" bit; here we go on the rest... + +#if alpha_TARGET_ARCH + +getRegister (StDouble d) + = getBlockIdNat `thenNat` \ lbl -> + getNewRegNat PtrRep `thenNat` \ tmp -> + let code dst = mkSeqInstrs [ + LDATA RoDataSegment lbl [ + DATA TF [ImmLab (rational d)] + ], + LDA tmp (AddrImm (ImmCLbl lbl)), + LD TF dst (AddrReg tmp)] + in + return (Any F64 code) + +getRegister (StPrim primop [x]) -- unary PrimOps + = case primop of + IntNegOp -> trivialUCode (NEG Q False) x + + NotOp -> trivialUCode NOT x + + FloatNegOp -> trivialUFCode FloatRep (FNEG TF) x + DoubleNegOp -> trivialUFCode F64 (FNEG TF) x + + OrdOp -> coerceIntCode IntRep x + ChrOp -> chrCode x + + Float2IntOp -> coerceFP2Int x + Int2FloatOp -> coerceInt2FP pr x + Double2IntOp -> coerceFP2Int x + Int2DoubleOp -> coerceInt2FP pr x + + Double2FloatOp -> coerceFltCode x + Float2DoubleOp -> coerceFltCode x + + other_op -> getRegister (StCall fn CCallConv F64 [x]) + where + fn = case other_op of + FloatExpOp -> FSLIT("exp") + FloatLogOp -> FSLIT("log") + FloatSqrtOp -> FSLIT("sqrt") + FloatSinOp -> FSLIT("sin") + FloatCosOp -> FSLIT("cos") + FloatTanOp -> FSLIT("tan") + FloatAsinOp -> FSLIT("asin") + FloatAcosOp -> FSLIT("acos") + FloatAtanOp -> FSLIT("atan") + FloatSinhOp -> FSLIT("sinh") + FloatCoshOp -> FSLIT("cosh") + FloatTanhOp -> FSLIT("tanh") + DoubleExpOp -> FSLIT("exp") + DoubleLogOp -> FSLIT("log") + DoubleSqrtOp -> FSLIT("sqrt") + DoubleSinOp -> FSLIT("sin") + DoubleCosOp -> FSLIT("cos") + DoubleTanOp -> FSLIT("tan") + DoubleAsinOp -> FSLIT("asin") + DoubleAcosOp -> FSLIT("acos") + DoubleAtanOp -> FSLIT("atan") + DoubleSinhOp -> FSLIT("sinh") + DoubleCoshOp -> FSLIT("cosh") + DoubleTanhOp -> FSLIT("tanh") + where + pr = panic "MachCode.getRegister: no primrep needed for Alpha" + +getRegister (StPrim primop [x, y]) -- dyadic PrimOps + = case primop of + CharGtOp -> trivialCode (CMP LTT) y x + CharGeOp -> trivialCode (CMP LE) y x + CharEqOp -> trivialCode (CMP EQQ) x y + CharNeOp -> int_NE_code x y + CharLtOp -> trivialCode (CMP LTT) x y + CharLeOp -> trivialCode (CMP LE) x y + + IntGtOp -> trivialCode (CMP LTT) y x + IntGeOp -> trivialCode (CMP LE) y x + IntEqOp -> trivialCode (CMP EQQ) x y + IntNeOp -> int_NE_code x y + IntLtOp -> trivialCode (CMP LTT) x y + IntLeOp -> trivialCode (CMP LE) x y + + WordGtOp -> trivialCode (CMP ULT) y x + WordGeOp -> trivialCode (CMP ULE) x y + WordEqOp -> trivialCode (CMP EQQ) x y + WordNeOp -> int_NE_code x y + WordLtOp -> trivialCode (CMP ULT) x y + WordLeOp -> trivialCode (CMP ULE) x y + + AddrGtOp -> trivialCode (CMP ULT) y x + AddrGeOp -> trivialCode (CMP ULE) y x + AddrEqOp -> trivialCode (CMP EQQ) x y + AddrNeOp -> int_NE_code x y + AddrLtOp -> trivialCode (CMP ULT) x y + AddrLeOp -> trivialCode (CMP ULE) x y + + FloatGtOp -> cmpF_code (FCMP TF LE) EQQ x y + FloatGeOp -> cmpF_code (FCMP TF LTT) EQQ x y + FloatEqOp -> cmpF_code (FCMP TF EQQ) NE x y + FloatNeOp -> cmpF_code (FCMP TF EQQ) EQQ x y + FloatLtOp -> cmpF_code (FCMP TF LTT) NE x y + FloatLeOp -> cmpF_code (FCMP TF LE) NE x y + + DoubleGtOp -> cmpF_code (FCMP TF LE) 
EQQ x y + DoubleGeOp -> cmpF_code (FCMP TF LTT) EQQ x y + DoubleEqOp -> cmpF_code (FCMP TF EQQ) NE x y + DoubleNeOp -> cmpF_code (FCMP TF EQQ) EQQ x y + DoubleLtOp -> cmpF_code (FCMP TF LTT) NE x y + DoubleLeOp -> cmpF_code (FCMP TF LE) NE x y + + IntAddOp -> trivialCode (ADD Q False) x y + IntSubOp -> trivialCode (SUB Q False) x y + IntMulOp -> trivialCode (MUL Q False) x y + IntQuotOp -> trivialCode (DIV Q False) x y + IntRemOp -> trivialCode (REM Q False) x y + + WordAddOp -> trivialCode (ADD Q False) x y + WordSubOp -> trivialCode (SUB Q False) x y + WordMulOp -> trivialCode (MUL Q False) x y + WordQuotOp -> trivialCode (DIV Q True) x y + WordRemOp -> trivialCode (REM Q True) x y + + FloatAddOp -> trivialFCode FloatRep (FADD TF) x y + FloatSubOp -> trivialFCode FloatRep (FSUB TF) x y + FloatMulOp -> trivialFCode FloatRep (FMUL TF) x y + FloatDivOp -> trivialFCode FloatRep (FDIV TF) x y + + DoubleAddOp -> trivialFCode F64 (FADD TF) x y + DoubleSubOp -> trivialFCode F64 (FSUB TF) x y + DoubleMulOp -> trivialFCode F64 (FMUL TF) x y + DoubleDivOp -> trivialFCode F64 (FDIV TF) x y + + AddrAddOp -> trivialCode (ADD Q False) x y + AddrSubOp -> trivialCode (SUB Q False) x y + AddrRemOp -> trivialCode (REM Q True) x y + + AndOp -> trivialCode AND x y + OrOp -> trivialCode OR x y + XorOp -> trivialCode XOR x y + SllOp -> trivialCode SLL x y + SrlOp -> trivialCode SRL x y + + ISllOp -> trivialCode SLL x y -- was: panic "AlphaGen:isll" + ISraOp -> trivialCode SRA x y -- was: panic "AlphaGen:isra" + ISrlOp -> trivialCode SRL x y -- was: panic "AlphaGen:isrl" + + FloatPowerOp -> getRegister (StCall FSLIT("pow") CCallConv F64 [x,y]) + DoublePowerOp -> getRegister (StCall FSLIT("pow") CCallConv F64 [x,y]) + where + {- ------------------------------------------------------------ + Some bizarre special code for getting condition codes into + registers. Integer non-equality is a test for equality + followed by an XOR with 1. (Integer comparisons always set + the result register to 0 or 1.) Floating point comparisons of + any kind leave the result in a floating point register, so we + need to wrangle an integer register out of things. + -} + int_NE_code :: StixTree -> StixTree -> NatM Register + + int_NE_code x y + = trivialCode (CMP EQQ) x y `thenNat` \ register -> + getNewRegNat IntRep `thenNat` \ tmp -> + let + code = registerCode register tmp + src = registerName register tmp + code__2 dst = code . mkSeqInstr (XOR src (RIImm (ImmInt 1)) dst) + in + return (Any IntRep code__2) + + {- ------------------------------------------------------------ + Comments for int_NE_code also apply to cmpF_code + -} + cmpF_code + :: (Reg -> Reg -> Reg -> Instr) + -> Cond + -> StixTree -> StixTree + -> NatM Register + + cmpF_code instr cond x y + = trivialFCode pr instr x y `thenNat` \ register -> + getNewRegNat F64 `thenNat` \ tmp -> + getBlockIdNat `thenNat` \ lbl -> + let + code = registerCode register tmp + result = registerName register tmp + + code__2 dst = code . mkSeqInstrs [ + OR zeroh (RIImm (ImmInt 1)) dst, + BF cond result (ImmCLbl lbl), + OR zeroh (RIReg zeroh) dst, + NEWBLOCK lbl] + in + return (Any IntRep code__2) + where + pr = panic "trivialU?FCode: does not use PrimRep on Alpha" + ------------------------------------------------------------ + +getRegister (CmmLoad pk mem) + = getAmode mem `thenNat` \ amode -> + let + code = amodeCode amode + src = amodeAddr amode + size = primRepToSize pk + code__2 dst = code . 
mkSeqInstr (LD size dst src) + in + return (Any pk code__2) + +getRegister (StInt i) + | fits8Bits i + = let + code dst = mkSeqInstr (OR zeroh (RIImm src) dst) + in + return (Any IntRep code) + | otherwise + = let + code dst = mkSeqInstr (LDI Q dst src) + in + return (Any IntRep code) + where + src = ImmInt (fromInteger i) + +getRegister leaf + | isJust imm + = let + code dst = mkSeqInstr (LDA dst (AddrImm imm__2)) + in + return (Any PtrRep code) + where + imm = maybeImm leaf + imm__2 = case imm of Just x -> x + +#endif /* alpha_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if i386_TARGET_ARCH + +getRegister (CmmLit (CmmFloat f F32)) = do + lbl <- getNewLabelNat + dynRef <- cmmMakeDynamicReference addImportNat False lbl + Amode addr addr_code <- getAmode dynRef + let code dst = + LDATA ReadOnlyData + [CmmDataLabel lbl, + CmmStaticLit (CmmFloat f F32)] + `consOL` (addr_code `snocOL` + GLD F32 addr dst) + -- in + return (Any F32 code) + + +getRegister (CmmLit (CmmFloat d F64)) + | d == 0.0 + = let code dst = unitOL (GLDZ dst) + in return (Any F64 code) + + | d == 1.0 + = let code dst = unitOL (GLD1 dst) + in return (Any F64 code) + + | otherwise = do + lbl <- getNewLabelNat + dynRef <- cmmMakeDynamicReference addImportNat False lbl + Amode addr addr_code <- getAmode dynRef + let code dst = + LDATA ReadOnlyData + [CmmDataLabel lbl, + CmmStaticLit (CmmFloat d F64)] + `consOL` (addr_code `snocOL` + GLD F64 addr dst) + -- in + return (Any F64 code) + +#endif /* i386_TARGET_ARCH */ + +#if x86_64_TARGET_ARCH + +getRegister (CmmLit (CmmFloat 0.0 rep)) = do + let code dst = unitOL (XOR rep (OpReg dst) (OpReg dst)) + -- I don't know why there are xorpd, xorps, and pxor instructions. + -- They all appear to do the same thing --SDM + return (Any rep code) + +getRegister (CmmLit (CmmFloat f rep)) = do + lbl <- getNewLabelNat + let code dst = toOL [ + LDATA ReadOnlyData + [CmmDataLabel lbl, + CmmStaticLit (CmmFloat f rep)], + MOV rep (OpAddr (ripRel (ImmCLbl lbl))) (OpReg dst) + ] + -- in + return (Any rep code) + +#endif /* x86_64_TARGET_ARCH */ + +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + +-- catch simple cases of zero- or sign-extended load +getRegister (CmmMachOp (MO_U_Conv I8 I32) [CmmLoad addr _]) = do + code <- intLoadCode (MOVZxL I8) addr + return (Any I32 code) + +getRegister (CmmMachOp (MO_S_Conv I8 I32) [CmmLoad addr _]) = do + code <- intLoadCode (MOVSxL I8) addr + return (Any I32 code) + +getRegister (CmmMachOp (MO_U_Conv I16 I32) [CmmLoad addr _]) = do + code <- intLoadCode (MOVZxL I16) addr + return (Any I32 code) + +getRegister (CmmMachOp (MO_S_Conv I16 I32) [CmmLoad addr _]) = do + code <- intLoadCode (MOVSxL I16) addr + return (Any I32 code) + +#endif + +#if x86_64_TARGET_ARCH + +-- catch simple cases of zero- or sign-extended load +getRegister (CmmMachOp (MO_U_Conv I8 I64) [CmmLoad addr _]) = do + code <- intLoadCode (MOVZxL I8) addr + return (Any I64 code) + +getRegister (CmmMachOp (MO_S_Conv I8 I64) [CmmLoad addr _]) = do + code <- intLoadCode (MOVSxL I8) addr + return (Any I64 code) + +getRegister (CmmMachOp (MO_U_Conv I16 I64) [CmmLoad addr _]) = do + code <- intLoadCode (MOVZxL I16) addr + return (Any I64 code) + +getRegister (CmmMachOp (MO_S_Conv I16 I64) [CmmLoad addr _]) = do + code <- intLoadCode (MOVSxL I16) addr + return (Any I64 code) + +getRegister (CmmMachOp (MO_U_Conv I32 I64) [CmmLoad addr _]) = do + code <- intLoadCode (MOV I32) addr -- 32-bit loads zero-extend + return (Any I64 code) + +getRegister (CmmMachOp (MO_S_Conv I32 
I64) [CmmLoad addr _]) = do + code <- intLoadCode (MOVSxL I32) addr + return (Any I64 code) + +#endif + +#if x86_64_TARGET_ARCH +getRegister (CmmMachOp (MO_S_Neg F32) [x]) = do + x_code <- getAnyReg x + lbl <- getNewLabelNat + let + code dst = x_code dst `appOL` toOL [ + -- This is how gcc does it, so it can't be that bad: + LDATA ReadOnlyData16 [ + CmmAlign 16, + CmmDataLabel lbl, + CmmStaticLit (CmmInt 0x80000000 I32), + CmmStaticLit (CmmInt 0 I32), + CmmStaticLit (CmmInt 0 I32), + CmmStaticLit (CmmInt 0 I32) + ], + XOR F32 (OpAddr (ripRel (ImmCLbl lbl))) (OpReg dst) + -- xorps, so we need the 128-bit constant + -- ToDo: rip-relative + ] + -- + return (Any F32 code) + +getRegister (CmmMachOp (MO_S_Neg F64) [x]) = do + x_code <- getAnyReg x + lbl <- getNewLabelNat + let + -- This is how gcc does it, so it can't be that bad: + code dst = x_code dst `appOL` toOL [ + LDATA ReadOnlyData16 [ + CmmAlign 16, + CmmDataLabel lbl, + CmmStaticLit (CmmInt 0x8000000000000000 I64), + CmmStaticLit (CmmInt 0 I64) + ], + -- gcc puts an unpck here. Wonder if we need it. + XOR F64 (OpAddr (ripRel (ImmCLbl lbl))) (OpReg dst) + -- xorpd, so we need the 128-bit constant + ] + -- + return (Any F64 code) +#endif + +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + +getRegister (CmmMachOp mop [x]) -- unary MachOps + = case mop of +#if i386_TARGET_ARCH + MO_S_Neg F32 -> trivialUFCode F32 (GNEG F32) x + MO_S_Neg F64 -> trivialUFCode F64 (GNEG F64) x +#endif + + MO_S_Neg rep -> trivialUCode rep (NEGI rep) x + MO_Not rep -> trivialUCode rep (NOT rep) x + + -- Nop conversions + -- TODO: these are only nops if the arg is not a fixed register that + -- can't be byte-addressed. + MO_U_Conv I32 I8 -> conversionNop I32 x + MO_S_Conv I32 I8 -> conversionNop I32 x + MO_U_Conv I16 I8 -> conversionNop I16 x + MO_S_Conv I16 I8 -> conversionNop I16 x + MO_U_Conv I32 I16 -> conversionNop I32 x + MO_S_Conv I32 I16 -> conversionNop I32 x +#if x86_64_TARGET_ARCH + MO_U_Conv I64 I32 -> conversionNop I64 x + MO_S_Conv I64 I32 -> conversionNop I64 x + MO_U_Conv I64 I16 -> conversionNop I64 x + MO_S_Conv I64 I16 -> conversionNop I64 x + MO_U_Conv I64 I8 -> conversionNop I64 x + MO_S_Conv I64 I8 -> conversionNop I64 x +#endif + + MO_U_Conv rep1 rep2 | rep1 == rep2 -> conversionNop rep1 x + MO_S_Conv rep1 rep2 | rep1 == rep2 -> conversionNop rep1 x + + -- widenings + MO_U_Conv I8 I32 -> integerExtend I8 I32 MOVZxL x + MO_U_Conv I16 I32 -> integerExtend I16 I32 MOVZxL x + MO_U_Conv I8 I16 -> integerExtend I8 I16 MOVZxL x + + MO_S_Conv I8 I32 -> integerExtend I8 I32 MOVSxL x + MO_S_Conv I16 I32 -> integerExtend I16 I32 MOVSxL x + MO_S_Conv I8 I16 -> integerExtend I8 I16 MOVSxL x + +#if x86_64_TARGET_ARCH + MO_U_Conv I8 I64 -> integerExtend I8 I64 MOVZxL x + MO_U_Conv I16 I64 -> integerExtend I16 I64 MOVZxL x + MO_U_Conv I32 I64 -> integerExtend I32 I64 MOVZxL x + MO_S_Conv I8 I64 -> integerExtend I8 I64 MOVSxL x + MO_S_Conv I16 I64 -> integerExtend I16 I64 MOVSxL x + MO_S_Conv I32 I64 -> integerExtend I32 I64 MOVSxL x + -- for 32-to-64 bit zero extension, amd64 uses an ordinary movl. + -- However, we don't want the register allocator to throw it + -- away as an unnecessary reg-to-reg move, so we keep it in + -- the form of a movzl and print it as a movl later. 
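+	-- An illustrative sketch of the case above: a source expression
+	--     CmmMachOp (MO_U_Conv I32 I64) [x]
+	-- goes through 'integerExtend I32 I64 MOVZxL x' (see the where
+	-- clause below), i.e. a reg-to-reg 'MOVZxL I32', which is printed
+	-- as a plain movl later; the hardware zero-extends the upper
+	-- 32 bits of the destination for free.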
+#endif + +#if i386_TARGET_ARCH + MO_S_Conv F32 F64 -> conversionNop F64 x + MO_S_Conv F64 F32 -> conversionNop F32 x +#else + MO_S_Conv F32 F64 -> coerceFP2FP F64 x + MO_S_Conv F64 F32 -> coerceFP2FP F32 x +#endif + + MO_S_Conv from to + | isFloatingRep from -> coerceFP2Int from to x + | isFloatingRep to -> coerceInt2FP from to x + + other -> pprPanic "getRegister" (pprMachOp mop) + where + -- signed or unsigned extension. + integerExtend from to instr expr = do + (reg,e_code) <- if from == I8 then getByteReg expr + else getSomeReg expr + let + code dst = + e_code `snocOL` + instr from (OpReg reg) (OpReg dst) + return (Any to code) + + conversionNop new_rep expr + = do e_code <- getRegister expr + return (swizzleRegisterRep e_code new_rep) + + +getRegister e@(CmmMachOp mop [x, y]) -- dyadic MachOps + = ASSERT2(cmmExprRep x /= I8, pprExpr e) + case mop of + MO_Eq F32 -> condFltReg EQQ x y + MO_Ne F32 -> condFltReg NE x y + MO_S_Gt F32 -> condFltReg GTT x y + MO_S_Ge F32 -> condFltReg GE x y + MO_S_Lt F32 -> condFltReg LTT x y + MO_S_Le F32 -> condFltReg LE x y + + MO_Eq F64 -> condFltReg EQQ x y + MO_Ne F64 -> condFltReg NE x y + MO_S_Gt F64 -> condFltReg GTT x y + MO_S_Ge F64 -> condFltReg GE x y + MO_S_Lt F64 -> condFltReg LTT x y + MO_S_Le F64 -> condFltReg LE x y + + MO_Eq rep -> condIntReg EQQ x y + MO_Ne rep -> condIntReg NE x y + + MO_S_Gt rep -> condIntReg GTT x y + MO_S_Ge rep -> condIntReg GE x y + MO_S_Lt rep -> condIntReg LTT x y + MO_S_Le rep -> condIntReg LE x y + + MO_U_Gt rep -> condIntReg GU x y + MO_U_Ge rep -> condIntReg GEU x y + MO_U_Lt rep -> condIntReg LU x y + MO_U_Le rep -> condIntReg LEU x y + +#if i386_TARGET_ARCH + MO_Add F32 -> trivialFCode F32 GADD x y + MO_Sub F32 -> trivialFCode F32 GSUB x y + + MO_Add F64 -> trivialFCode F64 GADD x y + MO_Sub F64 -> trivialFCode F64 GSUB x y + + MO_S_Quot F32 -> trivialFCode F32 GDIV x y + MO_S_Quot F64 -> trivialFCode F64 GDIV x y +#endif + +#if x86_64_TARGET_ARCH + MO_Add F32 -> trivialFCode F32 ADD x y + MO_Sub F32 -> trivialFCode F32 SUB x y + + MO_Add F64 -> trivialFCode F64 ADD x y + MO_Sub F64 -> trivialFCode F64 SUB x y + + MO_S_Quot F32 -> trivialFCode F32 FDIV x y + MO_S_Quot F64 -> trivialFCode F64 FDIV x y +#endif + + MO_Add rep -> add_code rep x y + MO_Sub rep -> sub_code rep x y + + MO_S_Quot rep -> div_code rep True True x y + MO_S_Rem rep -> div_code rep True False x y + MO_U_Quot rep -> div_code rep False True x y + MO_U_Rem rep -> div_code rep False False x y + +#if i386_TARGET_ARCH + MO_Mul F32 -> trivialFCode F32 GMUL x y + MO_Mul F64 -> trivialFCode F64 GMUL x y +#endif + +#if x86_64_TARGET_ARCH + MO_Mul F32 -> trivialFCode F32 MUL x y + MO_Mul F64 -> trivialFCode F64 MUL x y +#endif + + MO_Mul rep -> let op = IMUL rep in + trivialCode rep op (Just op) x y + + MO_S_MulMayOflo rep -> imulMayOflo rep x y + + MO_And rep -> let op = AND rep in + trivialCode rep op (Just op) x y + MO_Or rep -> let op = OR rep in + trivialCode rep op (Just op) x y + MO_Xor rep -> let op = XOR rep in + trivialCode rep op (Just op) x y + + {- Shift ops on x86s have constraints on their source, it + either has to be Imm, CL or 1 + => trivialCode is not restrictive enough (sigh.) 
+ -} + MO_Shl rep -> shift_code rep (SHL rep) x y {-False-} + MO_U_Shr rep -> shift_code rep (SHR rep) x y {-False-} + MO_S_Shr rep -> shift_code rep (SAR rep) x y {-False-} + + other -> pprPanic "getRegister(x86) - binary CmmMachOp (1)" (pprMachOp mop) + where + -------------------- + imulMayOflo :: MachRep -> CmmExpr -> CmmExpr -> NatM Register + imulMayOflo rep a b = do + (a_reg, a_code) <- getNonClobberedReg a + b_code <- getAnyReg b + let + shift_amt = case rep of + I32 -> 31 + I64 -> 63 + _ -> panic "shift_amt" + + code = a_code `appOL` b_code eax `appOL` + toOL [ + IMUL2 rep (OpReg a_reg), -- result in %edx:%eax + SAR rep (OpImm (ImmInt shift_amt)) (OpReg eax), + -- sign extend lower part + SUB rep (OpReg edx) (OpReg eax) + -- compare against upper + -- eax==0 if high part == sign extended low part + ] + -- in + return (Fixed rep eax code) + + -------------------- + shift_code :: MachRep + -> (Operand -> Operand -> Instr) + -> CmmExpr + -> CmmExpr + -> NatM Register + + {- Case1: shift length as immediate -} + shift_code rep instr x y@(CmmLit lit) = do + x_code <- getAnyReg x + let + code dst + = x_code dst `snocOL` + instr (OpImm (litToImm lit)) (OpReg dst) + -- in + return (Any rep code) + + {- Case2: shift length is complex (non-immediate) -} + shift_code rep instr x y{-amount-} = do + (x_reg, x_code) <- getNonClobberedReg x + y_code <- getAnyReg y + let + code = x_code `appOL` + y_code ecx `snocOL` + instr (OpReg ecx) (OpReg x_reg) + -- in + return (Fixed rep x_reg code) + + -------------------- + add_code :: MachRep -> CmmExpr -> CmmExpr -> NatM Register + add_code rep x (CmmLit (CmmInt y _)) + | not (is64BitInteger y) = add_int rep x y + add_code rep x y = trivialCode rep (ADD rep) (Just (ADD rep)) x y + + -------------------- + sub_code :: MachRep -> CmmExpr -> CmmExpr -> NatM Register + sub_code rep x (CmmLit (CmmInt y _)) + | not (is64BitInteger (-y)) = add_int rep x (-y) + sub_code rep x y = trivialCode rep (SUB rep) Nothing x y + + -- our three-operand add instruction: + add_int rep x y = do + (x_reg, x_code) <- getSomeReg x + let + imm = ImmInt (fromInteger y) + code dst + = x_code `snocOL` + LEA rep + (OpAddr (AddrBaseIndex (EABaseReg x_reg) EAIndexNone imm)) + (OpReg dst) + -- + return (Any rep code) + + ---------------------- + div_code rep signed quotient x y = do + (y_op, y_code) <- getRegOrMem y -- cannot be clobbered + x_code <- getAnyReg x + let + widen | signed = CLTD rep + | otherwise = XOR rep (OpReg edx) (OpReg edx) + + instr | signed = IDIV + | otherwise = DIV + + code = y_code `appOL` + x_code eax `appOL` + toOL [widen, instr rep y_op] + + result | quotient = eax + | otherwise = edx + + -- in + return (Fixed rep result code) + + +getRegister (CmmLoad mem pk) + | isFloatingRep pk + = do + Amode src mem_code <- getAmode mem + let + code dst = mem_code `snocOL` + IF_ARCH_i386(GLD pk src dst, + MOV pk (OpAddr src) (OpReg dst)) + -- + return (Any pk code) + +#if i386_TARGET_ARCH +getRegister (CmmLoad mem pk) + | pk /= I64 + = do + code <- intLoadCode (instr pk) mem + return (Any pk code) + where + instr I8 = MOVZxL pk + instr I16 = MOV I16 + instr I32 = MOV I32 + -- we always zero-extend 8-bit loads, if we + -- can't think of anything better. This is because + -- we can't guarantee access to an 8-bit variant of every register + -- (esi and edi don't have 8-bit variants), so to make things + -- simpler we do our 8-bit arithmetic with full 32-bit registers. 
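+	-- An illustrative sketch of the table above: a load such as
+	--     CmmLoad addr I8
+	-- is selected as 'MOVZxL I8' (movzbl), leaving the byte
+	-- zero-extended in a full 32-bit register, while I16 and I32
+	-- loads use a plain MOV of the corresponding width.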
+#endif + +#if x86_64_TARGET_ARCH +-- Simpler memory load code on x86_64 +getRegister (CmmLoad mem pk) + = do + code <- intLoadCode (MOV pk) mem + return (Any pk code) +#endif + +getRegister (CmmLit (CmmInt 0 rep)) + = let + -- x86_64: 32-bit xor is one byte shorter, and zero-extends to 64 bits + adj_rep = case rep of I64 -> I32; _ -> rep + rep1 = IF_ARCH_i386( rep, adj_rep ) + code dst + = unitOL (XOR rep1 (OpReg dst) (OpReg dst)) + in + return (Any rep code) + +#if x86_64_TARGET_ARCH + -- optimisation for loading small literals on x86_64: take advantage + -- of the automatic zero-extension from 32 to 64 bits, because the 32-bit + -- instruction forms are shorter. +getRegister (CmmLit lit) + | I64 <- cmmLitRep lit, not (isBigLit lit) + = let + imm = litToImm lit + code dst = unitOL (MOV I32 (OpImm imm) (OpReg dst)) + in + return (Any I64 code) + where + isBigLit (CmmInt i I64) = i < 0 || i > 0xffffffff + isBigLit _ = False + -- note1: not the same as is64BitLit, because that checks for + -- signed literals that fit in 32 bits, but we want unsigned + -- literals here. + -- note2: all labels are small, because we're assuming the + -- small memory model (see gcc docs, -mcmodel=small). +#endif + +getRegister (CmmLit lit) + = let + rep = cmmLitRep lit + imm = litToImm lit + code dst = unitOL (MOV rep (OpImm imm) (OpReg dst)) + in + return (Any rep code) + +getRegister other = pprPanic "getRegister(x86)" (ppr other) + + +intLoadCode :: (Operand -> Operand -> Instr) -> CmmExpr + -> NatM (Reg -> InstrBlock) +intLoadCode instr mem = do + Amode src mem_code <- getAmode mem + return (\dst -> mem_code `snocOL` instr (OpAddr src) (OpReg dst)) + +-- Compute an expression into *any* register, adding the appropriate +-- move instruction if necessary. +getAnyReg :: CmmExpr -> NatM (Reg -> InstrBlock) +getAnyReg expr = do + r <- getRegister expr + anyReg r + +anyReg :: Register -> NatM (Reg -> InstrBlock) +anyReg (Any _ code) = return code +anyReg (Fixed rep reg fcode) = return (\dst -> fcode `snocOL` reg2reg rep reg dst) + +-- A bit like getSomeReg, but we want a reg that can be byte-addressed. +-- Fixed registers might not be byte-addressable, so we make sure we've +-- got a temporary, inserting an extra reg copy if necessary. +getByteReg :: CmmExpr -> NatM (Reg, InstrBlock) +#if x86_64_TARGET_ARCH +getByteReg = getSomeReg -- all regs are byte-addressable on x86_64 +#else +getByteReg expr = do + r <- getRegister expr + case r of + Any rep code -> do + tmp <- getNewRegNat rep + return (tmp, code tmp) + Fixed rep reg code + | isVirtualReg reg -> return (reg,code) + | otherwise -> do + tmp <- getNewRegNat rep + return (tmp, code `snocOL` reg2reg rep reg tmp) + -- ToDo: could optimise slightly by checking for byte-addressable + -- real registers, but that will happen very rarely if at all. +#endif + +-- Another variant: this time we want the result in a register that cannot +-- be modified by code to evaluate an arbitrary expression. 
+getNonClobberedReg :: CmmExpr -> NatM (Reg, InstrBlock) +getNonClobberedReg expr = do + r <- getRegister expr + case r of + Any rep code -> do + tmp <- getNewRegNat rep + return (tmp, code tmp) + Fixed rep reg code + -- only free regs can be clobbered + | RealReg rr <- reg, isFastTrue (freeReg rr) -> do + tmp <- getNewRegNat rep + return (tmp, code `snocOL` reg2reg rep reg tmp) + | otherwise -> + return (reg, code) + +reg2reg :: MachRep -> Reg -> Reg -> Instr +reg2reg rep src dst +#if i386_TARGET_ARCH + | isFloatingRep rep = GMOV src dst +#endif + | otherwise = MOV rep (OpReg src) (OpReg dst) + +#endif /* i386_TARGET_ARCH || x86_64_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if sparc_TARGET_ARCH + +getRegister (CmmLit (CmmFloat f F32)) = do + lbl <- getNewLabelNat + let code dst = toOL [ + LDATA ReadOnlyData + [CmmDataLabel lbl, + CmmStaticLit (CmmFloat f F32)], + SETHI (HI (ImmCLbl lbl)) dst, + LD F32 (AddrRegImm dst (LO (ImmCLbl lbl))) dst] + return (Any F32 code) + +getRegister (CmmLit (CmmFloat d F64)) = do + lbl <- getNewLabelNat + let code dst = toOL [ + LDATA ReadOnlyData + [CmmDataLabel lbl, + CmmStaticLit (CmmFloat d F64)], + SETHI (HI (ImmCLbl lbl)) dst, + LD F64 (AddrRegImm dst (LO (ImmCLbl lbl))) dst] + return (Any F64 code) + +getRegister (CmmMachOp mop [x]) -- unary MachOps + = case mop of + MO_S_Neg F32 -> trivialUFCode F32 (FNEG F32) x + MO_S_Neg F64 -> trivialUFCode F64 (FNEG F64) x + + MO_S_Neg rep -> trivialUCode rep (SUB False False g0) x + MO_Not rep -> trivialUCode rep (XNOR False g0) x + + MO_U_Conv I32 I8 -> trivialCode I8 (AND False) x (CmmLit (CmmInt 255 I8)) + + MO_U_Conv F64 F32-> coerceDbl2Flt x + MO_U_Conv F32 F64-> coerceFlt2Dbl x + + MO_S_Conv F32 I32-> coerceFP2Int F32 I32 x + MO_S_Conv I32 F32-> coerceInt2FP I32 F32 x + MO_S_Conv F64 I32-> coerceFP2Int F64 I32 x + MO_S_Conv I32 F64-> coerceInt2FP I32 F64 x + + -- Conversions which are a nop on sparc + MO_U_Conv from to + | from == to -> conversionNop to x + MO_U_Conv I32 to -> conversionNop to x + MO_S_Conv I32 to -> conversionNop to x + + -- widenings + MO_U_Conv I8 I32 -> integerExtend False I8 I32 x + MO_U_Conv I16 I32 -> integerExtend False I16 I32 x + MO_U_Conv I8 I16 -> integerExtend False I8 I16 x + MO_S_Conv I16 I32 -> integerExtend True I16 I32 x + + other_op -> panic "Unknown unary mach op" + where + -- XXX SLL/SRL? 
+ integerExtend signed from to expr = do + (reg, e_code) <- getSomeReg expr + let + code dst = + e_code `snocOL` + ((if signed then SRA else SRL) + reg (RIImm (ImmInt 0)) dst) + return (Any to code) + conversionNop new_rep expr + = do e_code <- getRegister expr + return (swizzleRegisterRep e_code new_rep) + +getRegister (CmmMachOp mop [x, y]) -- dyadic PrimOps + = case mop of + MO_Eq F32 -> condFltReg EQQ x y + MO_Ne F32 -> condFltReg NE x y + + MO_S_Gt F32 -> condFltReg GTT x y + MO_S_Ge F32 -> condFltReg GE x y + MO_S_Lt F32 -> condFltReg LTT x y + MO_S_Le F32 -> condFltReg LE x y + + MO_Eq F64 -> condFltReg EQQ x y + MO_Ne F64 -> condFltReg NE x y + + MO_S_Gt F64 -> condFltReg GTT x y + MO_S_Ge F64 -> condFltReg GE x y + MO_S_Lt F64 -> condFltReg LTT x y + MO_S_Le F64 -> condFltReg LE x y + + MO_Eq rep -> condIntReg EQQ x y + MO_Ne rep -> condIntReg NE x y + + MO_S_Gt rep -> condIntReg GTT x y + MO_S_Ge rep -> condIntReg GE x y + MO_S_Lt rep -> condIntReg LTT x y + MO_S_Le rep -> condIntReg LE x y + + MO_U_Gt I32 -> condIntReg GTT x y + MO_U_Ge I32 -> condIntReg GE x y + MO_U_Lt I32 -> condIntReg LTT x y + MO_U_Le I32 -> condIntReg LE x y + + MO_U_Gt I16 -> condIntReg GU x y + MO_U_Ge I16 -> condIntReg GEU x y + MO_U_Lt I16 -> condIntReg LU x y + MO_U_Le I16 -> condIntReg LEU x y + + MO_Add I32 -> trivialCode I32 (ADD False False) x y + MO_Sub I32 -> trivialCode I32 (SUB False False) x y + + MO_S_MulMayOflo rep -> imulMayOflo rep x y +{- + -- ToDo: teach about V8+ SPARC div instructions + MO_S_Quot I32 -> idiv FSLIT(".div") x y + MO_S_Rem I32 -> idiv FSLIT(".rem") x y + MO_U_Quot I32 -> idiv FSLIT(".udiv") x y + MO_U_Rem I32 -> idiv FSLIT(".urem") x y +-} + MO_Add F32 -> trivialFCode F32 FADD x y + MO_Sub F32 -> trivialFCode F32 FSUB x y + MO_Mul F32 -> trivialFCode F32 FMUL x y + MO_S_Quot F32 -> trivialFCode F32 FDIV x y + + MO_Add F64 -> trivialFCode F64 FADD x y + MO_Sub F64 -> trivialFCode F64 FSUB x y + MO_Mul F64 -> trivialFCode F64 FMUL x y + MO_S_Quot F64 -> trivialFCode F64 FDIV x y + + MO_And rep -> trivialCode rep (AND False) x y + MO_Or rep -> trivialCode rep (OR False) x y + MO_Xor rep -> trivialCode rep (XOR False) x y + + MO_Mul rep -> trivialCode rep (SMUL False) x y + + MO_Shl rep -> trivialCode rep SLL x y + MO_U_Shr rep -> trivialCode rep SRL x y + MO_S_Shr rep -> trivialCode rep SRA x y + +{- + MO_F32_Pwr -> getRegister (StCall (Left FSLIT("pow")) CCallConv F64 + [promote x, promote y]) + where promote x = CmmMachOp MO_F32_to_Dbl [x] + MO_F64_Pwr -> getRegister (StCall (Left FSLIT("pow")) CCallConv F64 + [x, y]) +-} + other -> pprPanic "getRegister(sparc) - binary CmmMachOp (1)" (pprMachOp mop) + where + --idiv fn x y = getRegister (StCall (Left fn) CCallConv I32 [x, y]) + + -------------------- + imulMayOflo :: MachRep -> CmmExpr -> CmmExpr -> NatM Register + imulMayOflo rep a b = do + (a_reg, a_code) <- getSomeReg a + (b_reg, b_code) <- getSomeReg b + res_lo <- getNewRegNat I32 + res_hi <- getNewRegNat I32 + let + shift_amt = case rep of + I32 -> 31 + I64 -> 63 + _ -> panic "shift_amt" + code dst = a_code `appOL` b_code `appOL` + toOL [ + SMUL False a_reg (RIReg b_reg) res_lo, + RDY res_hi, + SRA res_lo (RIImm (ImmInt shift_amt)) res_lo, + SUB False False res_lo (RIReg res_hi) dst + ] + return (Any I32 code) + +getRegister (CmmLoad mem pk) = do + Amode src code <- getAmode mem + let + code__2 dst = code `snocOL` LD pk src dst + return (Any pk code__2) + +getRegister (CmmLit (CmmInt i _)) + | fits13Bits i + = let + src = ImmInt (fromInteger i) + code dst = unitOL 
(OR False g0 (RIImm src) dst) + in + return (Any I32 code) + +getRegister (CmmLit lit) + = let rep = cmmLitRep lit + imm = litToImm lit + code dst = toOL [ + SETHI (HI imm) dst, + OR False dst (RIImm (LO imm)) dst] + in return (Any I32 code) + +#endif /* sparc_TARGET_ARCH */ + +#if powerpc_TARGET_ARCH +getRegister (CmmLoad mem pk) + | pk /= I64 + = do + Amode addr addr_code <- getAmode mem + let code dst = ASSERT((regClass dst == RcDouble) == isFloatingRep pk) + addr_code `snocOL` LD pk dst addr + return (Any pk code) + +-- catch simple cases of zero- or sign-extended load +getRegister (CmmMachOp (MO_U_Conv I8 I32) [CmmLoad mem _]) = do + Amode addr addr_code <- getAmode mem + return (Any I32 (\dst -> addr_code `snocOL` LD I8 dst addr)) + +-- Note: there is no Load Byte Arithmetic instruction, so no signed case here + +getRegister (CmmMachOp (MO_U_Conv I16 I32) [CmmLoad mem _]) = do + Amode addr addr_code <- getAmode mem + return (Any I32 (\dst -> addr_code `snocOL` LD I16 dst addr)) + +getRegister (CmmMachOp (MO_S_Conv I16 I32) [CmmLoad mem _]) = do + Amode addr addr_code <- getAmode mem + return (Any I32 (\dst -> addr_code `snocOL` LA I16 dst addr)) + +getRegister (CmmMachOp mop [x]) -- unary MachOps + = case mop of + MO_Not rep -> trivialUCode rep NOT x + + MO_S_Conv F64 F32 -> trivialUCode F32 FRSP x + MO_S_Conv F32 F64 -> conversionNop F64 x + + MO_S_Conv from to + | from == to -> conversionNop to x + | isFloatingRep from -> coerceFP2Int from to x + | isFloatingRep to -> coerceInt2FP from to x + + -- narrowing is a nop: we treat the high bits as undefined + MO_S_Conv I32 to -> conversionNop to x + MO_S_Conv I16 I8 -> conversionNop I8 x + MO_S_Conv I8 to -> trivialUCode to (EXTS I8) x + MO_S_Conv I16 to -> trivialUCode to (EXTS I16) x + + MO_U_Conv from to + | from == to -> conversionNop to x + -- narrowing is a nop: we treat the high bits as undefined + MO_U_Conv I32 to -> conversionNop to x + MO_U_Conv I16 I8 -> conversionNop I8 x + MO_U_Conv I8 to -> trivialCode to False AND x (CmmLit (CmmInt 255 I32)) + MO_U_Conv I16 to -> trivialCode to False AND x (CmmLit (CmmInt 65535 I32)) + + MO_S_Neg F32 -> trivialUCode F32 FNEG x + MO_S_Neg F64 -> trivialUCode F64 FNEG x + MO_S_Neg rep -> trivialUCode rep NEG x + + where + conversionNop new_rep expr + = do e_code <- getRegister expr + return (swizzleRegisterRep e_code new_rep) + +getRegister (CmmMachOp mop [x, y]) -- dyadic PrimOps + = case mop of + MO_Eq F32 -> condFltReg EQQ x y + MO_Ne F32 -> condFltReg NE x y + + MO_S_Gt F32 -> condFltReg GTT x y + MO_S_Ge F32 -> condFltReg GE x y + MO_S_Lt F32 -> condFltReg LTT x y + MO_S_Le F32 -> condFltReg LE x y + + MO_Eq F64 -> condFltReg EQQ x y + MO_Ne F64 -> condFltReg NE x y + + MO_S_Gt F64 -> condFltReg GTT x y + MO_S_Ge F64 -> condFltReg GE x y + MO_S_Lt F64 -> condFltReg LTT x y + MO_S_Le F64 -> condFltReg LE x y + + MO_Eq rep -> condIntReg EQQ (extendUExpr rep x) (extendUExpr rep y) + MO_Ne rep -> condIntReg NE (extendUExpr rep x) (extendUExpr rep y) + + MO_S_Gt rep -> condIntReg GTT (extendSExpr rep x) (extendSExpr rep y) + MO_S_Ge rep -> condIntReg GE (extendSExpr rep x) (extendSExpr rep y) + MO_S_Lt rep -> condIntReg LTT (extendSExpr rep x) (extendSExpr rep y) + MO_S_Le rep -> condIntReg LE (extendSExpr rep x) (extendSExpr rep y) + + MO_U_Gt rep -> condIntReg GU (extendUExpr rep x) (extendUExpr rep y) + MO_U_Ge rep -> condIntReg GEU (extendUExpr rep x) (extendUExpr rep y) + MO_U_Lt rep -> condIntReg LU (extendUExpr rep x) (extendUExpr rep y) + MO_U_Le rep -> condIntReg LEU (extendUExpr 
rep x) (extendUExpr rep y) + + MO_Add F32 -> trivialCodeNoImm F32 (FADD F32) x y + MO_Sub F32 -> trivialCodeNoImm F32 (FSUB F32) x y + MO_Mul F32 -> trivialCodeNoImm F32 (FMUL F32) x y + MO_S_Quot F32 -> trivialCodeNoImm F32 (FDIV F32) x y + + MO_Add F64 -> trivialCodeNoImm F64 (FADD F64) x y + MO_Sub F64 -> trivialCodeNoImm F64 (FSUB F64) x y + MO_Mul F64 -> trivialCodeNoImm F64 (FMUL F64) x y + MO_S_Quot F64 -> trivialCodeNoImm F64 (FDIV F64) x y + + -- optimize addition with 32-bit immediate + -- (needed for PIC) + MO_Add I32 -> + case y of + CmmLit (CmmInt imm immrep) | Just _ <- makeImmediate I32 True (-imm) + -> trivialCode I32 True ADD x (CmmLit $ CmmInt imm immrep) + CmmLit lit + -> do + (src, srcCode) <- getSomeReg x + let imm = litToImm lit + code dst = srcCode `appOL` toOL [ + ADDIS dst src (HA imm), + ADD dst dst (RIImm (LO imm)) + ] + return (Any I32 code) + _ -> trivialCode I32 True ADD x y + + MO_Add rep -> trivialCode rep True ADD x y + MO_Sub rep -> + case y of -- subfi ('substract from' with immediate) doesn't exist + CmmLit (CmmInt imm immrep) | Just _ <- makeImmediate rep True (-imm) + -> trivialCode rep True ADD x (CmmLit $ CmmInt (-imm) immrep) + _ -> trivialCodeNoImm rep SUBF y x + + MO_Mul rep -> trivialCode rep True MULLW x y + + MO_S_MulMayOflo I32 -> trivialCodeNoImm I32 MULLW_MayOflo x y + + MO_S_MulMayOflo rep -> panic "S_MulMayOflo (rep /= I32): not implemented" + MO_U_MulMayOflo rep -> panic "U_MulMayOflo: not implemented" + + MO_S_Quot rep -> trivialCodeNoImm rep DIVW (extendSExpr rep x) (extendSExpr rep y) + MO_U_Quot rep -> trivialCodeNoImm rep DIVWU (extendUExpr rep x) (extendUExpr rep y) + + MO_S_Rem rep -> remainderCode rep DIVW (extendSExpr rep x) (extendSExpr rep y) + MO_U_Rem rep -> remainderCode rep DIVWU (extendUExpr rep x) (extendUExpr rep y) + + MO_And rep -> trivialCode rep False AND x y + MO_Or rep -> trivialCode rep False OR x y + MO_Xor rep -> trivialCode rep False XOR x y + + MO_Shl rep -> trivialCode rep False SLW x y + MO_S_Shr rep -> trivialCode rep False SRAW (extendSExpr rep x) y + MO_U_Shr rep -> trivialCode rep False SRW (extendUExpr rep x) y + +getRegister (CmmLit (CmmInt i rep)) + | Just imm <- makeImmediate rep True i + = let + code dst = unitOL (LI dst imm) + in + return (Any rep code) + +getRegister (CmmLit (CmmFloat f frep)) = do + lbl <- getNewLabelNat + dynRef <- cmmMakeDynamicReference addImportNat False lbl + Amode addr addr_code <- getAmode dynRef + let code dst = + LDATA ReadOnlyData [CmmDataLabel lbl, + CmmStaticLit (CmmFloat f frep)] + `consOL` (addr_code `snocOL` LD frep dst addr) + return (Any frep code) + +getRegister (CmmLit lit) + = let rep = cmmLitRep lit + imm = litToImm lit + code dst = toOL [ + LIS dst (HI imm), + OR dst dst (RIImm (LO imm)) + ] + in return (Any rep code) + +getRegister other = pprPanic "getRegister(ppc)" (pprExpr other) + + -- extend?Rep: wrap integer expression of type rep + -- in a conversion to I32 +extendSExpr I32 x = x +extendSExpr rep x = CmmMachOp (MO_S_Conv rep I32) [x] +extendUExpr I32 x = x +extendUExpr rep x = CmmMachOp (MO_U_Conv rep I32) [x] + +#endif /* powerpc_TARGET_ARCH */ + + +-- ----------------------------------------------------------------------------- +-- The 'Amode' type: Memory addressing modes passed up the tree. + +data Amode = Amode AddrMode InstrBlock + +{- +Now, given a tree (the argument to an CmmLoad) that references memory, +produce a suitable addressing mode. 
+ +A Rule of the Game (tm) for Amodes: use of the addr bit must +immediately follow use of the code part, since the code part puts +values in registers which the addr then refers to. So you can't put +anything in between, lest it overwrite some of those registers. If +you need to do some other computation between the code part and use of +the addr bit, first store the effective address from the amode in a +temporary, then do the other computation, and then use the temporary: + + code + LEA amode, tmp + ... other computation ... + ... (tmp) ... +-} + +getAmode :: CmmExpr -> NatM Amode +getAmode tree@(CmmRegOff _ _) = getAmode (mangleIndexTree tree) + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if alpha_TARGET_ARCH + +getAmode (StPrim IntSubOp [x, StInt i]) + = getNewRegNat PtrRep `thenNat` \ tmp -> + getRegister x `thenNat` \ register -> + let + code = registerCode register tmp + reg = registerName register tmp + off = ImmInt (-(fromInteger i)) + in + return (Amode (AddrRegImm reg off) code) + +getAmode (StPrim IntAddOp [x, StInt i]) + = getNewRegNat PtrRep `thenNat` \ tmp -> + getRegister x `thenNat` \ register -> + let + code = registerCode register tmp + reg = registerName register tmp + off = ImmInt (fromInteger i) + in + return (Amode (AddrRegImm reg off) code) + +getAmode leaf + | isJust imm + = return (Amode (AddrImm imm__2) id) + where + imm = maybeImm leaf + imm__2 = case imm of Just x -> x + +getAmode other + = getNewRegNat PtrRep `thenNat` \ tmp -> + getRegister other `thenNat` \ register -> + let + code = registerCode register tmp + reg = registerName register tmp + in + return (Amode (AddrReg reg) code) + +#endif /* alpha_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + +-- This is all just ridiculous, since it carefully undoes +-- what mangleIndexTree has just done. +getAmode (CmmMachOp (MO_Sub rep) [x, CmmLit lit@(CmmInt i _)]) + | not (is64BitLit lit) + -- ASSERT(rep == I32)??? + = do (x_reg, x_code) <- getSomeReg x + let off = ImmInt (-(fromInteger i)) + return (Amode (AddrBaseIndex (EABaseReg x_reg) EAIndexNone off) x_code) + +getAmode (CmmMachOp (MO_Add rep) [x, CmmLit lit@(CmmInt i _)]) + | not (is64BitLit lit) + -- ASSERT(rep == I32)??? + = do (x_reg, x_code) <- getSomeReg x + let off = ImmInt (fromInteger i) + return (Amode (AddrBaseIndex (EABaseReg x_reg) EAIndexNone off) x_code) + +-- Turn (lit1 << n + lit2) into (lit2 + lit1 << n) so it will be +-- recognised by the next rule. +getAmode (CmmMachOp (MO_Add rep) [a@(CmmMachOp (MO_Shl _) _), + b@(CmmLit _)]) + = getAmode (CmmMachOp (MO_Add rep) [b,a]) + +getAmode (CmmMachOp (MO_Add rep) [x, CmmMachOp (MO_Shl _) + [y, CmmLit (CmmInt shift _)]]) + | shift == 0 || shift == 1 || shift == 2 || shift == 3 + = do (x_reg, x_code) <- getNonClobberedReg x + -- x must be in a temp, because it has to stay live over y_code + -- we could compre x_reg and y_reg and do something better here... 
+ (y_reg, y_code) <- getSomeReg y + let + code = x_code `appOL` y_code + base = case shift of 0 -> 1; 1 -> 2; 2 -> 4; 3 -> 8 + return (Amode (AddrBaseIndex (EABaseReg x_reg) (EAIndex y_reg base) (ImmInt 0)) + code) + +getAmode (CmmLit lit) | not (is64BitLit lit) + = return (Amode (ImmAddr (litToImm lit) 0) nilOL) + +getAmode expr = do + (reg,code) <- getSomeReg expr + return (Amode (AddrBaseIndex (EABaseReg reg) EAIndexNone (ImmInt 0)) code) + +#endif /* i386_TARGET_ARCH || x86_64_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if sparc_TARGET_ARCH + +getAmode (CmmMachOp (MO_Sub rep) [x, CmmLit (CmmInt i _)]) + | fits13Bits (-i) + = do + (reg, code) <- getSomeReg x + let + off = ImmInt (-(fromInteger i)) + return (Amode (AddrRegImm reg off) code) + + +getAmode (CmmMachOp (MO_Add rep) [x, CmmLit (CmmInt i _)]) + | fits13Bits i + = do + (reg, code) <- getSomeReg x + let + off = ImmInt (fromInteger i) + return (Amode (AddrRegImm reg off) code) + +getAmode (CmmMachOp (MO_Add rep) [x, y]) + = do + (regX, codeX) <- getSomeReg x + (regY, codeY) <- getSomeReg y + let + code = codeX `appOL` codeY + return (Amode (AddrRegReg regX regY) code) + +-- XXX Is this same as "leaf" in Stix? +getAmode (CmmLit lit) + = do + tmp <- getNewRegNat I32 + let + code = unitOL (SETHI (HI imm__2) tmp) + return (Amode (AddrRegImm tmp (LO imm__2)) code) + where + imm__2 = litToImm lit + +getAmode other + = do + (reg, code) <- getSomeReg other + let + off = ImmInt 0 + return (Amode (AddrRegImm reg off) code) + +#endif /* sparc_TARGET_ARCH */ + +#ifdef powerpc_TARGET_ARCH +getAmode (CmmMachOp (MO_Sub I32) [x, CmmLit (CmmInt i _)]) + | Just off <- makeImmediate I32 True (-i) + = do + (reg, code) <- getSomeReg x + return (Amode (AddrRegImm reg off) code) + + +getAmode (CmmMachOp (MO_Add I32) [x, CmmLit (CmmInt i _)]) + | Just off <- makeImmediate I32 True i + = do + (reg, code) <- getSomeReg x + return (Amode (AddrRegImm reg off) code) + + -- optimize addition with 32-bit immediate + -- (needed for PIC) +getAmode (CmmMachOp (MO_Add I32) [x, CmmLit lit]) + = do + tmp <- getNewRegNat I32 + (src, srcCode) <- getSomeReg x + let imm = litToImm lit + code = srcCode `snocOL` ADDIS tmp src (HA imm) + return (Amode (AddrRegImm tmp (LO imm)) code) + +getAmode (CmmLit lit) + = do + tmp <- getNewRegNat I32 + let imm = litToImm lit + code = unitOL (LIS tmp (HA imm)) + return (Amode (AddrRegImm tmp (LO imm)) code) + +getAmode (CmmMachOp (MO_Add I32) [x, y]) + = do + (regX, codeX) <- getSomeReg x + (regY, codeY) <- getSomeReg y + return (Amode (AddrRegReg regX regY) (codeX `appOL` codeY)) + +getAmode other + = do + (reg, code) <- getSomeReg other + let + off = ImmInt 0 + return (Amode (AddrRegImm reg off) code) +#endif /* powerpc_TARGET_ARCH */ + +-- ----------------------------------------------------------------------------- +-- getOperand: sometimes any operand will do. + +-- getNonClobberedOperand: the value of the operand will remain valid across +-- the computation of an arbitrary expression, unless the expression +-- is computed directly into a register which the operand refers to +-- (see trivialCode where this function is used for an example). 
+ +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + +getNonClobberedOperand :: CmmExpr -> NatM (Operand, InstrBlock) +#if x86_64_TARGET_ARCH +getNonClobberedOperand (CmmLit lit) + | isSuitableFloatingPointLit lit = do + lbl <- getNewLabelNat + let code = unitOL (LDATA ReadOnlyData [CmmDataLabel lbl, + CmmStaticLit lit]) + return (OpAddr (ripRel (ImmCLbl lbl)), code) +#endif +getNonClobberedOperand (CmmLit lit) + | not (is64BitLit lit) && not (isFloatingRep (cmmLitRep lit)) = + return (OpImm (litToImm lit), nilOL) +getNonClobberedOperand (CmmLoad mem pk) + | IF_ARCH_i386(not (isFloatingRep pk) && pk /= I64, True) = do + Amode src mem_code <- getAmode mem + (src',save_code) <- + if (amodeCouldBeClobbered src) + then do + tmp <- getNewRegNat wordRep + return (AddrBaseIndex (EABaseReg tmp) EAIndexNone (ImmInt 0), + unitOL (LEA I32 (OpAddr src) (OpReg tmp))) + else + return (src, nilOL) + return (OpAddr src', save_code `appOL` mem_code) +getNonClobberedOperand e = do + (reg, code) <- getNonClobberedReg e + return (OpReg reg, code) + +amodeCouldBeClobbered :: AddrMode -> Bool +amodeCouldBeClobbered amode = any regClobbered (addrModeRegs amode) + +regClobbered (RealReg rr) = isFastTrue (freeReg rr) +regClobbered _ = False + +-- getOperand: the operand is not required to remain valid across the +-- computation of an arbitrary expression. +getOperand :: CmmExpr -> NatM (Operand, InstrBlock) +#if x86_64_TARGET_ARCH +getOperand (CmmLit lit) + | isSuitableFloatingPointLit lit = do + lbl <- getNewLabelNat + let code = unitOL (LDATA ReadOnlyData [CmmDataLabel lbl, + CmmStaticLit lit]) + return (OpAddr (ripRel (ImmCLbl lbl)), code) +#endif +getOperand (CmmLit lit) + | not (is64BitLit lit) && not (isFloatingRep (cmmLitRep lit)) = do + return (OpImm (litToImm lit), nilOL) +getOperand (CmmLoad mem pk) + | IF_ARCH_i386(not (isFloatingRep pk) && pk /= I64, True) = do + Amode src mem_code <- getAmode mem + return (OpAddr src, mem_code) +getOperand e = do + (reg, code) <- getSomeReg e + return (OpReg reg, code) + +isOperand :: CmmExpr -> Bool +isOperand (CmmLoad _ _) = True +isOperand (CmmLit lit) = not (is64BitLit lit) + || isSuitableFloatingPointLit lit +isOperand _ = False + +-- if we want a floating-point literal as an operand, we can +-- use it directly from memory. However, if the literal is +-- zero, we're better off generating it into a register using +-- xor. +isSuitableFloatingPointLit (CmmFloat f _) = f /= 0.0 +isSuitableFloatingPointLit _ = False + +getRegOrMem :: CmmExpr -> NatM (Operand, InstrBlock) +getRegOrMem (CmmLoad mem pk) + | IF_ARCH_i386(not (isFloatingRep pk) && pk /= I64, True) = do + Amode src mem_code <- getAmode mem + return (OpAddr src, mem_code) +getRegOrMem e = do + (reg, code) <- getNonClobberedReg e + return (OpReg reg, code) + +#if x86_64_TARGET_ARCH +is64BitLit (CmmInt i I64) = is64BitInteger i + -- assume that labels are in the range 0-2^31-1: this assumes the + -- small memory model (see gcc docs, -mcmodel=small). +#endif +is64BitLit x = False +#endif + +is64BitInteger :: Integer -> Bool +is64BitInteger i = i > 0x7fffffff || i < -0x80000000 + +-- ----------------------------------------------------------------------------- +-- The 'CondCode' type: Condition codes passed up the tree. + +data CondCode = CondCode Bool Cond InstrBlock + +-- Set up a condition code for a conditional branch. 
+ +getCondCode :: CmmExpr -> NatM CondCode + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if alpha_TARGET_ARCH +getCondCode = panic "MachCode.getCondCode: not on Alphas" +#endif /* alpha_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH || sparc_TARGET_ARCH +-- yes, they really do seem to want exactly the same! + +getCondCode (CmmMachOp mop [x, y]) + = ASSERT (cmmExprRep x /= I8) -- tmp, not set up to handle 8-bit comparisons + case mop of + MO_Eq F32 -> condFltCode EQQ x y + MO_Ne F32 -> condFltCode NE x y + + MO_S_Gt F32 -> condFltCode GTT x y + MO_S_Ge F32 -> condFltCode GE x y + MO_S_Lt F32 -> condFltCode LTT x y + MO_S_Le F32 -> condFltCode LE x y + + MO_Eq F64 -> condFltCode EQQ x y + MO_Ne F64 -> condFltCode NE x y + + MO_S_Gt F64 -> condFltCode GTT x y + MO_S_Ge F64 -> condFltCode GE x y + MO_S_Lt F64 -> condFltCode LTT x y + MO_S_Le F64 -> condFltCode LE x y + + MO_Eq rep -> condIntCode EQQ x y + MO_Ne rep -> condIntCode NE x y + + MO_S_Gt rep -> condIntCode GTT x y + MO_S_Ge rep -> condIntCode GE x y + MO_S_Lt rep -> condIntCode LTT x y + MO_S_Le rep -> condIntCode LE x y + + MO_U_Gt rep -> condIntCode GU x y + MO_U_Ge rep -> condIntCode GEU x y + MO_U_Lt rep -> condIntCode LU x y + MO_U_Le rep -> condIntCode LEU x y + + other -> pprPanic "getCondCode(x86,sparc)" (pprMachOp mop) + +getCondCode other = pprPanic "getCondCode(2)(x86,sparc)" (ppr other) + +#elif powerpc_TARGET_ARCH + +-- almost the same as everywhere else - but we need to +-- extend small integers to 32 bit first + +getCondCode (CmmMachOp mop [x, y]) + = case mop of + MO_Eq F32 -> condFltCode EQQ x y + MO_Ne F32 -> condFltCode NE x y + + MO_S_Gt F32 -> condFltCode GTT x y + MO_S_Ge F32 -> condFltCode GE x y + MO_S_Lt F32 -> condFltCode LTT x y + MO_S_Le F32 -> condFltCode LE x y + + MO_Eq F64 -> condFltCode EQQ x y + MO_Ne F64 -> condFltCode NE x y + + MO_S_Gt F64 -> condFltCode GTT x y + MO_S_Ge F64 -> condFltCode GE x y + MO_S_Lt F64 -> condFltCode LTT x y + MO_S_Le F64 -> condFltCode LE x y + + MO_Eq rep -> condIntCode EQQ (extendUExpr rep x) (extendUExpr rep y) + MO_Ne rep -> condIntCode NE (extendUExpr rep x) (extendUExpr rep y) + + MO_S_Gt rep -> condIntCode GTT (extendSExpr rep x) (extendSExpr rep y) + MO_S_Ge rep -> condIntCode GE (extendSExpr rep x) (extendSExpr rep y) + MO_S_Lt rep -> condIntCode LTT (extendSExpr rep x) (extendSExpr rep y) + MO_S_Le rep -> condIntCode LE (extendSExpr rep x) (extendSExpr rep y) + + MO_U_Gt rep -> condIntCode GU (extendUExpr rep x) (extendUExpr rep y) + MO_U_Ge rep -> condIntCode GEU (extendUExpr rep x) (extendUExpr rep y) + MO_U_Lt rep -> condIntCode LU (extendUExpr rep x) (extendUExpr rep y) + MO_U_Le rep -> condIntCode LEU (extendUExpr rep x) (extendUExpr rep y) + + other -> pprPanic "getCondCode(powerpc)" (pprMachOp mop) + +getCondCode other = panic "getCondCode(2)(powerpc)" + + +#endif + + +-- @cond(Int|Flt)Code@: Turn a boolean expression into a condition, to be +-- passed back up the tree. 
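+--
+-- Two of the x86 cases below are worth spelling out (illustrative only):
+-- comparing anything against a literal zero uses TEST rather than CMP,
+--
+--      cond x 0                  ~~>   TEST pk (OpReg x_reg) (OpReg x_reg)
+--
+-- and a memory operand can be compared against an immediate without first
+-- loading it into a register,
+--
+--      cond (CmmLoad x pk) lit   ~~>   CMP pk (OpImm imm) (OpAddr x_addr)
+--
+-- The remaining cases fall back to the general register/operand forms.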
+ +condIntCode, condFltCode :: Cond -> CmmExpr -> CmmExpr -> NatM CondCode + +#if alpha_TARGET_ARCH +condIntCode = panic "MachCode.condIntCode: not on Alphas" +condFltCode = panic "MachCode.condFltCode: not on Alphas" +#endif /* alpha_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + +-- memory vs immediate +condIntCode cond (CmmLoad x pk) (CmmLit lit) | not (is64BitLit lit) = do + Amode x_addr x_code <- getAmode x + let + imm = litToImm lit + code = x_code `snocOL` + CMP pk (OpImm imm) (OpAddr x_addr) + -- + return (CondCode False cond code) + +-- anything vs zero +condIntCode cond x (CmmLit (CmmInt 0 pk)) = do + (x_reg, x_code) <- getSomeReg x + let + code = x_code `snocOL` + TEST pk (OpReg x_reg) (OpReg x_reg) + -- + return (CondCode False cond code) + +-- anything vs operand +condIntCode cond x y | isOperand y = do + (x_reg, x_code) <- getNonClobberedReg x + (y_op, y_code) <- getOperand y + let + code = x_code `appOL` y_code `snocOL` + CMP (cmmExprRep x) y_op (OpReg x_reg) + -- in + return (CondCode False cond code) + +-- anything vs anything +condIntCode cond x y = do + (y_reg, y_code) <- getNonClobberedReg y + (x_op, x_code) <- getRegOrMem x + let + code = y_code `appOL` + x_code `snocOL` + CMP (cmmExprRep x) (OpReg y_reg) x_op + -- in + return (CondCode False cond code) +#endif + +#if i386_TARGET_ARCH +condFltCode cond x y + = ASSERT(cond `elem` ([EQQ, NE, LE, LTT, GE, GTT])) do + (x_reg, x_code) <- getNonClobberedReg x + (y_reg, y_code) <- getSomeReg y + let + code = x_code `appOL` y_code `snocOL` + GCMP cond x_reg y_reg + -- The GCMP insn does the test and sets the zero flag if comparable + -- and true. Hence we always supply EQQ as the condition to test. + return (CondCode True EQQ code) +#endif /* i386_TARGET_ARCH */ + +#if x86_64_TARGET_ARCH +-- in the SSE2 comparison ops (ucomiss, ucomisd) the left arg may be +-- an operand, but the right must be a reg. We can probably do better +-- than this general case... +condFltCode cond x y = do + (x_reg, x_code) <- getNonClobberedReg x + (y_op, y_code) <- getOperand y + let + code = x_code `appOL` + y_code `snocOL` + CMP (cmmExprRep x) y_op (OpReg x_reg) + -- NB(1): we need to use the unsigned comparison operators on the + -- result of this comparison. 
+ -- in + return (CondCode True (condToUnsigned cond) code) +#endif + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if sparc_TARGET_ARCH + +condIntCode cond x (CmmLit (CmmInt y rep)) + | fits13Bits y + = do + (src1, code) <- getSomeReg x + let + src2 = ImmInt (fromInteger y) + code' = code `snocOL` SUB False True src1 (RIImm src2) g0 + return (CondCode False cond code') + +condIntCode cond x y = do + (src1, code1) <- getSomeReg x + (src2, code2) <- getSomeReg y + let + code__2 = code1 `appOL` code2 `snocOL` + SUB False True src1 (RIReg src2) g0 + return (CondCode False cond code__2) + +----------- +condFltCode cond x y = do + (src1, code1) <- getSomeReg x + (src2, code2) <- getSomeReg y + tmp <- getNewRegNat F64 + let + promote x = FxTOy F32 F64 x tmp + + pk1 = cmmExprRep x + pk2 = cmmExprRep y + + code__2 = + if pk1 == pk2 then + code1 `appOL` code2 `snocOL` + FCMP True pk1 src1 src2 + else if pk1 == F32 then + code1 `snocOL` promote src1 `appOL` code2 `snocOL` + FCMP True F64 tmp src2 + else + code1 `appOL` code2 `snocOL` promote src2 `snocOL` + FCMP True F64 src1 tmp + return (CondCode True cond code__2) + +#endif /* sparc_TARGET_ARCH */ + +#if powerpc_TARGET_ARCH +-- ###FIXME: I16 and I8! +condIntCode cond x (CmmLit (CmmInt y rep)) + | Just src2 <- makeImmediate rep (not $ condUnsigned cond) y + = do + (src1, code) <- getSomeReg x + let + code' = code `snocOL` + (if condUnsigned cond then CMPL else CMP) I32 src1 (RIImm src2) + return (CondCode False cond code') + +condIntCode cond x y = do + (src1, code1) <- getSomeReg x + (src2, code2) <- getSomeReg y + let + code' = code1 `appOL` code2 `snocOL` + (if condUnsigned cond then CMPL else CMP) I32 src1 (RIReg src2) + return (CondCode False cond code') + +condFltCode cond x y = do + (src1, code1) <- getSomeReg x + (src2, code2) <- getSomeReg y + let + code' = code1 `appOL` code2 `snocOL` FCMP src1 src2 + code'' = case cond of -- twiddle CR to handle unordered case + GE -> code' `snocOL` CRNOR ltbit eqbit gtbit + LE -> code' `snocOL` CRNOR gtbit eqbit ltbit + _ -> code' + where + ltbit = 0 ; eqbit = 2 ; gtbit = 1 + return (CondCode True cond code'') + +#endif /* powerpc_TARGET_ARCH */ + +-- ----------------------------------------------------------------------------- +-- Generating assignments + +-- Assignments are really at the heart of the whole code generation +-- business. Almost all top-level nodes of any real importance are +-- assignments, which correspond to loads, stores, or register +-- transfers. If we're really lucky, some of the register transfers +-- will go away, because we can use the destination register to +-- complete the code generation for the right hand side. This only +-- fails when the right hand side is forced into a fixed register +-- (e.g. the result of a call). + +assignMem_IntCode :: MachRep -> CmmExpr -> CmmExpr -> NatM InstrBlock +assignReg_IntCode :: MachRep -> CmmReg -> CmmExpr -> NatM InstrBlock + +assignMem_FltCode :: MachRep -> CmmExpr -> CmmExpr -> NatM InstrBlock +assignReg_FltCode :: MachRep -> CmmReg -> CmmExpr -> NatM InstrBlock + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if alpha_TARGET_ARCH + +assignIntCode pk (CmmLoad dst _) src + = getNewRegNat IntRep `thenNat` \ tmp -> + getAmode dst `thenNat` \ amode -> + getRegister src `thenNat` \ register -> + let + code1 = amodeCode amode [] + dst__2 = amodeAddr amode + code2 = registerCode register tmp [] + src__2 = registerName register tmp + sz = primRepToSize pk + code__2 = asmSeqThen [code1, code2] . 
mkSeqInstr (ST sz src__2 dst__2) + in + return code__2 + +assignIntCode pk dst src + = getRegister dst `thenNat` \ register1 -> + getRegister src `thenNat` \ register2 -> + let + dst__2 = registerName register1 zeroh + code = registerCode register2 dst__2 + src__2 = registerName register2 dst__2 + code__2 = if isFixed register2 + then code . mkSeqInstr (OR src__2 (RIReg src__2) dst__2) + else code + in + return code__2 + +#endif /* alpha_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + +-- integer assignment to memory +assignMem_IntCode pk addr src = do + Amode addr code_addr <- getAmode addr + (code_src, op_src) <- get_op_RI src + let + code = code_src `appOL` + code_addr `snocOL` + MOV pk op_src (OpAddr addr) + -- NOTE: op_src is stable, so it will still be valid + -- after code_addr. This may involve the introduction + -- of an extra MOV to a temporary register, but we hope + -- the register allocator will get rid of it. + -- + return code + where + get_op_RI :: CmmExpr -> NatM (InstrBlock,Operand) -- code, operator + get_op_RI (CmmLit lit) | not (is64BitLit lit) + = return (nilOL, OpImm (litToImm lit)) + get_op_RI op + = do (reg,code) <- getNonClobberedReg op + return (code, OpReg reg) + + +-- Assign; dst is a reg, rhs is mem +assignReg_IntCode pk reg (CmmLoad src _) = do + load_code <- intLoadCode (MOV pk) src + return (load_code (getRegisterReg reg)) + +-- dst is a reg, but src could be anything +assignReg_IntCode pk reg src = do + code <- getAnyReg src + return (code (getRegisterReg reg)) + +#endif /* i386_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if sparc_TARGET_ARCH + +assignMem_IntCode pk addr src = do + (srcReg, code) <- getSomeReg src + Amode dstAddr addr_code <- getAmode addr + return $ code `appOL` addr_code `snocOL` ST pk srcReg dstAddr + +assignReg_IntCode pk reg src = do + r <- getRegister src + return $ case r of + Any _ code -> code dst + Fixed _ freg fcode -> fcode `snocOL` OR False g0 (RIReg dst) freg + where + dst = getRegisterReg reg + + +#endif /* sparc_TARGET_ARCH */ + +#if powerpc_TARGET_ARCH + +assignMem_IntCode pk addr src = do + (srcReg, code) <- getSomeReg src + Amode dstAddr addr_code <- getAmode addr + return $ code `appOL` addr_code `snocOL` ST pk srcReg dstAddr + +-- dst is a reg, but src could be anything +assignReg_IntCode pk reg src + = do + r <- getRegister src + return $ case r of + Any _ code -> code dst + Fixed _ freg fcode -> fcode `snocOL` MR dst freg + where + dst = getRegisterReg reg + +#endif /* powerpc_TARGET_ARCH */ + + +-- ----------------------------------------------------------------------------- +-- Floating-point assignments + +#if alpha_TARGET_ARCH + +assignFltCode pk (CmmLoad dst _) src + = getNewRegNat pk `thenNat` \ tmp -> + getAmode dst `thenNat` \ amode -> + getRegister src `thenNat` \ register -> + let + code1 = amodeCode amode [] + dst__2 = amodeAddr amode + code2 = registerCode register tmp [] + src__2 = registerName register tmp + sz = primRepToSize pk + code__2 = asmSeqThen [code1, code2] . mkSeqInstr (ST sz src__2 dst__2) + in + return code__2 + +assignFltCode pk dst src + = getRegister dst `thenNat` \ register1 -> + getRegister src `thenNat` \ register2 -> + let + dst__2 = registerName register1 zeroh + code = registerCode register2 dst__2 + src__2 = registerName register2 dst__2 + code__2 = if isFixed register2 + then code . 
mkSeqInstr (FMOV src__2 dst__2) + else code + in + return code__2 + +#endif /* alpha_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + +-- Floating point assignment to memory +assignMem_FltCode pk addr src = do + (src_reg, src_code) <- getNonClobberedReg src + Amode addr addr_code <- getAmode addr + let + code = src_code `appOL` + addr_code `snocOL` + IF_ARCH_i386(GST pk src_reg addr, + MOV pk (OpReg src_reg) (OpAddr addr)) + return code + +-- Floating point assignment to a register/temporary +assignReg_FltCode pk reg src = do + src_code <- getAnyReg src + return (src_code (getRegisterReg reg)) + +#endif /* i386_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if sparc_TARGET_ARCH + +-- Floating point assignment to memory +assignMem_FltCode pk addr src = do + Amode dst__2 code1 <- getAmode addr + (src__2, code2) <- getSomeReg src + tmp1 <- getNewRegNat pk + let + pk__2 = cmmExprRep src + code__2 = code1 `appOL` code2 `appOL` + if pk == pk__2 + then unitOL (ST pk src__2 dst__2) + else toOL [FxTOy pk__2 pk src__2 tmp1, ST pk tmp1 dst__2] + return code__2 + +-- Floating point assignment to a register/temporary +-- ToDo: Verify correctness +assignReg_FltCode pk reg src = do + r <- getRegister src + v1 <- getNewRegNat pk + return $ case r of + Any _ code -> code dst + Fixed _ freg fcode -> fcode `snocOL` FMOV pk freg v1 + where + dst = getRegisterReg reg + +#endif /* sparc_TARGET_ARCH */ + +#if powerpc_TARGET_ARCH + +-- Easy, isn't it? +assignMem_FltCode = assignMem_IntCode +assignReg_FltCode = assignReg_IntCode + +#endif /* powerpc_TARGET_ARCH */ + + +-- ----------------------------------------------------------------------------- +-- Generating an non-local jump + +-- (If applicable) Do not fill the delay slots here; you will confuse the +-- register allocator. + +genJump :: CmmExpr{-the branch target-} -> NatM InstrBlock + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if alpha_TARGET_ARCH + +genJump (CmmLabel lbl) + | isAsmTemp lbl = returnInstr (BR target) + | otherwise = returnInstrs [LDA pv (AddrImm target), JMP zeroh (AddrReg pv) 0] + where + target = ImmCLbl lbl + +genJump tree + = getRegister tree `thenNat` \ register -> + getNewRegNat PtrRep `thenNat` \ tmp -> + let + dst = registerName register pv + code = registerCode register pv + target = registerName register pv + in + if isFixed register then + returnSeq code [OR dst (RIReg dst) pv, JMP zeroh (AddrReg pv) 0] + else + return (code . 
mkSeqInstr (JMP zeroh (AddrReg pv) 0)) + +#endif /* alpha_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + +genJump (CmmLoad mem pk) = do + Amode target code <- getAmode mem + return (code `snocOL` JMP (OpAddr target)) + +genJump (CmmLit lit) = do + return (unitOL (JMP (OpImm (litToImm lit)))) + +genJump expr = do + (reg,code) <- getSomeReg expr + return (code `snocOL` JMP (OpReg reg)) + +#endif /* i386_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if sparc_TARGET_ARCH + +genJump (CmmLit (CmmLabel lbl)) + = return (toOL [CALL (Left target) 0 True, NOP]) + where + target = ImmCLbl lbl + +genJump tree + = do + (target, code) <- getSomeReg tree + return (code `snocOL` JMP (AddrRegReg target g0) `snocOL` NOP) + +#endif /* sparc_TARGET_ARCH */ + +#if powerpc_TARGET_ARCH +genJump (CmmLit (CmmLabel lbl)) + = return (unitOL $ JMP lbl) + +genJump tree + = do + (target,code) <- getSomeReg tree + return (code `snocOL` MTCTR target `snocOL` BCTR []) +#endif /* powerpc_TARGET_ARCH */ + + +-- ----------------------------------------------------------------------------- +-- Unconditional branches + +genBranch :: BlockId -> NatM InstrBlock + +genBranch = return . toOL . mkBranchInstr + +-- ----------------------------------------------------------------------------- +-- Conditional jumps + +{- +Conditional jumps are always to local labels, so we can use branch +instructions. We peek at the arguments to decide what kind of +comparison to do. + +ALPHA: For comparisons with 0, we're laughing, because we can just do +the desired conditional branch. + +I386: First, we have to ensure that the condition +codes are set according to the supplied comparison operation. + +SPARC: First, we have to ensure that the condition codes are set +according to the supplied comparison operation. We generate slightly +different code for floating point comparisons, because a floating +point operation cannot directly precede a @BF@. We assume the worst +and fill that slot with a @NOP@. + +SPARC: Do not fill the delay slots here; you will confuse the register +allocator. 
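+
+X86_64: floating-point comparisons must also allow for unordered operands
+(NaNs), which set the parity flag.  A branch on NE has to be taken when PF is
+set as well, and most other conditions need an extra guard that the operands
+were ordered; roughly (with "skip" and "target" naming the fall-through and
+branch-target blocks):
+
+   [ JXX PARITY skip, JXX cond target, JXX ALWAYS skip, NEWBLOCK skip ]
+
+GU and GEU need no such guard, because they test for a zero carry flag, which
+can only occur for ordered operands (see condFltReg below).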
+-} + + +genCondJump + :: BlockId -- the branch target + -> CmmExpr -- the condition on which to branch + -> NatM InstrBlock + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if alpha_TARGET_ARCH + +genCondJump id (StPrim op [x, StInt 0]) + = getRegister x `thenNat` \ register -> + getNewRegNat (registerRep register) + `thenNat` \ tmp -> + let + code = registerCode register tmp + value = registerName register tmp + pk = registerRep register + target = ImmCLbl lbl + in + returnSeq code [BI (cmpOp op) value target] + where + cmpOp CharGtOp = GTT + cmpOp CharGeOp = GE + cmpOp CharEqOp = EQQ + cmpOp CharNeOp = NE + cmpOp CharLtOp = LTT + cmpOp CharLeOp = LE + cmpOp IntGtOp = GTT + cmpOp IntGeOp = GE + cmpOp IntEqOp = EQQ + cmpOp IntNeOp = NE + cmpOp IntLtOp = LTT + cmpOp IntLeOp = LE + cmpOp WordGtOp = NE + cmpOp WordGeOp = ALWAYS + cmpOp WordEqOp = EQQ + cmpOp WordNeOp = NE + cmpOp WordLtOp = NEVER + cmpOp WordLeOp = EQQ + cmpOp AddrGtOp = NE + cmpOp AddrGeOp = ALWAYS + cmpOp AddrEqOp = EQQ + cmpOp AddrNeOp = NE + cmpOp AddrLtOp = NEVER + cmpOp AddrLeOp = EQQ + +genCondJump lbl (StPrim op [x, StDouble 0.0]) + = getRegister x `thenNat` \ register -> + getNewRegNat (registerRep register) + `thenNat` \ tmp -> + let + code = registerCode register tmp + value = registerName register tmp + pk = registerRep register + target = ImmCLbl lbl + in + return (code . mkSeqInstr (BF (cmpOp op) value target)) + where + cmpOp FloatGtOp = GTT + cmpOp FloatGeOp = GE + cmpOp FloatEqOp = EQQ + cmpOp FloatNeOp = NE + cmpOp FloatLtOp = LTT + cmpOp FloatLeOp = LE + cmpOp DoubleGtOp = GTT + cmpOp DoubleGeOp = GE + cmpOp DoubleEqOp = EQQ + cmpOp DoubleNeOp = NE + cmpOp DoubleLtOp = LTT + cmpOp DoubleLeOp = LE + +genCondJump lbl (StPrim op [x, y]) + | fltCmpOp op + = trivialFCode pr instr x y `thenNat` \ register -> + getNewRegNat F64 `thenNat` \ tmp -> + let + code = registerCode register tmp + result = registerName register tmp + target = ImmCLbl lbl + in + return (code . mkSeqInstr (BF cond result target)) + where + pr = panic "trivialU?FCode: does not use PrimRep on Alpha" + + fltCmpOp op = case op of + FloatGtOp -> True + FloatGeOp -> True + FloatEqOp -> True + FloatNeOp -> True + FloatLtOp -> True + FloatLeOp -> True + DoubleGtOp -> True + DoubleGeOp -> True + DoubleEqOp -> True + DoubleNeOp -> True + DoubleLtOp -> True + DoubleLeOp -> True + _ -> False + (instr, cond) = case op of + FloatGtOp -> (FCMP TF LE, EQQ) + FloatGeOp -> (FCMP TF LTT, EQQ) + FloatEqOp -> (FCMP TF EQQ, NE) + FloatNeOp -> (FCMP TF EQQ, EQQ) + FloatLtOp -> (FCMP TF LTT, NE) + FloatLeOp -> (FCMP TF LE, NE) + DoubleGtOp -> (FCMP TF LE, EQQ) + DoubleGeOp -> (FCMP TF LTT, EQQ) + DoubleEqOp -> (FCMP TF EQQ, NE) + DoubleNeOp -> (FCMP TF EQQ, EQQ) + DoubleLtOp -> (FCMP TF LTT, NE) + DoubleLeOp -> (FCMP TF LE, NE) + +genCondJump lbl (StPrim op [x, y]) + = trivialCode instr x y `thenNat` \ register -> + getNewRegNat IntRep `thenNat` \ tmp -> + let + code = registerCode register tmp + result = registerName register tmp + target = ImmCLbl lbl + in + return (code . 
mkSeqInstr (BI cond result target)) + where + (instr, cond) = case op of + CharGtOp -> (CMP LE, EQQ) + CharGeOp -> (CMP LTT, EQQ) + CharEqOp -> (CMP EQQ, NE) + CharNeOp -> (CMP EQQ, EQQ) + CharLtOp -> (CMP LTT, NE) + CharLeOp -> (CMP LE, NE) + IntGtOp -> (CMP LE, EQQ) + IntGeOp -> (CMP LTT, EQQ) + IntEqOp -> (CMP EQQ, NE) + IntNeOp -> (CMP EQQ, EQQ) + IntLtOp -> (CMP LTT, NE) + IntLeOp -> (CMP LE, NE) + WordGtOp -> (CMP ULE, EQQ) + WordGeOp -> (CMP ULT, EQQ) + WordEqOp -> (CMP EQQ, NE) + WordNeOp -> (CMP EQQ, EQQ) + WordLtOp -> (CMP ULT, NE) + WordLeOp -> (CMP ULE, NE) + AddrGtOp -> (CMP ULE, EQQ) + AddrGeOp -> (CMP ULT, EQQ) + AddrEqOp -> (CMP EQQ, NE) + AddrNeOp -> (CMP EQQ, EQQ) + AddrLtOp -> (CMP ULT, NE) + AddrLeOp -> (CMP ULE, NE) + +#endif /* alpha_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if i386_TARGET_ARCH + +genCondJump id bool = do + CondCode _ cond code <- getCondCode bool + return (code `snocOL` JXX cond id) + +#endif + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if x86_64_TARGET_ARCH + +genCondJump id bool = do + CondCode is_float cond cond_code <- getCondCode bool + if not is_float + then + return (cond_code `snocOL` JXX cond id) + else do + lbl <- getBlockIdNat + + -- see comment with condFltReg + let code = case cond of + NE -> or_unordered + GU -> plain_test + GEU -> plain_test + _ -> and_ordered + + plain_test = unitOL ( + JXX cond id + ) + or_unordered = toOL [ + JXX cond id, + JXX PARITY id + ] + and_ordered = toOL [ + JXX PARITY lbl, + JXX cond id, + JXX ALWAYS lbl, + NEWBLOCK lbl + ] + return (cond_code `appOL` code) + +#endif + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if sparc_TARGET_ARCH + +genCondJump (BlockId id) bool = do + CondCode is_float cond code <- getCondCode bool + return ( + code `appOL` + toOL ( + if is_float + then [NOP, BF cond False (ImmCLbl (mkAsmTempLabel id)), NOP] + else [BI cond False (ImmCLbl (mkAsmTempLabel id)), NOP] + ) + ) + +#endif /* sparc_TARGET_ARCH */ + + +#if powerpc_TARGET_ARCH + +genCondJump id bool = do + CondCode is_float cond code <- getCondCode bool + return (code `snocOL` BCC cond id) + +#endif /* powerpc_TARGET_ARCH */ + + +-- ----------------------------------------------------------------------------- +-- Generating C calls + +-- Now the biggest nightmare---calls. Most of the nastiness is buried in +-- @get_arg@, which moves the arguments to the correct registers/stack +-- locations. Apart from that, the code is easy. +-- +-- (If applicable) Do not fill the delay slots here; you will confuse the +-- register allocator. + +genCCall + :: CmmCallTarget -- function to call + -> [(CmmReg,MachHint)] -- where to put the result + -> [(CmmExpr,MachHint)] -- arguments (of mixed type) + -> Maybe [GlobalReg] -- volatile regs to save + -> NatM InstrBlock + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if alpha_TARGET_ARCH + +ccallResultRegs = + +genCCall fn cconv result_regs args + = mapAccumLNat get_arg (allArgRegs, eXTRA_STK_ARGS_HERE) args + `thenNat` \ ((unused,_), argCode) -> + let + nRegs = length allArgRegs - length unused + code = asmSeqThen (map ($ []) argCode) + in + returnSeq code [ + LDA pv (AddrImm (ImmLab (ptext fn))), + JSR ra (AddrReg pv) nRegs, + LDGP gp (AddrReg ra)] + where + ------------------------ + {- Try to get a value into a specific register (or registers) for + a call. 
The first 6 arguments go into the appropriate + argument register (separate registers for integer and floating + point arguments, but used in lock-step), and the remaining + arguments are dumped to the stack, beginning at 0(sp). Our + first argument is a pair of the list of remaining argument + registers to be assigned for this call and the next stack + offset to use for overflowing arguments. This way, + @get_Arg@ can be applied to all of a call's arguments using + @mapAccumLNat@. + -} + get_arg + :: ([(Reg,Reg)], Int) -- Argument registers and stack offset (accumulator) + -> StixTree -- Current argument + -> NatM (([(Reg,Reg)],Int), InstrBlock) -- Updated accumulator and code + + -- We have to use up all of our argument registers first... + + get_arg ((iDst,fDst):dsts, offset) arg + = getRegister arg `thenNat` \ register -> + let + reg = if isFloatingRep pk then fDst else iDst + code = registerCode register reg + src = registerName register reg + pk = registerRep register + in + return ( + if isFloatingRep pk then + ((dsts, offset), if isFixed register then + code . mkSeqInstr (FMOV src fDst) + else code) + else + ((dsts, offset), if isFixed register then + code . mkSeqInstr (OR src (RIReg src) iDst) + else code)) + + -- Once we have run out of argument registers, we move to the + -- stack... + + get_arg ([], offset) arg + = getRegister arg `thenNat` \ register -> + getNewRegNat (registerRep register) + `thenNat` \ tmp -> + let + code = registerCode register tmp + src = registerName register tmp + pk = registerRep register + sz = primRepToSize pk + in + return (([], offset + 1), code . mkSeqInstr (ST sz src (spRel offset))) + +#endif /* alpha_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if i386_TARGET_ARCH + +-- we only cope with a single result for foreign calls +genCCall (CmmPrim op) [(r,_)] args vols = do + case op of + MO_F32_Sqrt -> actuallyInlineFloatOp F32 (GSQRT F32) args + MO_F64_Sqrt -> actuallyInlineFloatOp F64 (GSQRT F64) args + + MO_F32_Sin -> actuallyInlineFloatOp F32 (GSIN F32) args + MO_F64_Sin -> actuallyInlineFloatOp F64 (GSIN F64) args + + MO_F32_Cos -> actuallyInlineFloatOp F32 (GCOS F32) args + MO_F64_Cos -> actuallyInlineFloatOp F64 (GCOS F64) args + + MO_F32_Tan -> actuallyInlineFloatOp F32 (GTAN F32) args + MO_F64_Tan -> actuallyInlineFloatOp F64 (GTAN F64) args + + other_op -> outOfLineFloatOp op r args vols + where + actuallyInlineFloatOp rep instr [(x,_)] + = do res <- trivialUFCode rep instr x + any <- anyReg res + return (any (getRegisterReg r)) + +genCCall target dest_regs args vols = do + let + sizes = map (arg_size . cmmExprRep . fst) (reverse args) +#if !darwin_TARGET_OS + tot_arg_size = sum sizes +#else + raw_arg_size = sum sizes + tot_arg_size = roundTo 16 raw_arg_size + arg_pad_size = tot_arg_size - raw_arg_size + delta0 <- getDeltaNat + setDeltaNat (delta0 - arg_pad_size) +#endif + + push_codes <- mapM push_arg (reverse args) + delta <- getDeltaNat + + -- in + -- deal with static vs dynamic call targets + (callinsns,cconv) <- + case target of + -- CmmPrim -> ... 
+ CmmForeignCall (CmmLit (CmmLabel lbl)) conv + -> -- ToDo: stdcall arg sizes + return (unitOL (CALL (Left fn_imm) []), conv) + where fn_imm = ImmCLbl lbl + CmmForeignCall expr conv + -> do (dyn_c, dyn_r, dyn_rep) <- get_op expr + ASSERT(dyn_rep == I32) + return (dyn_c `snocOL` CALL (Right dyn_r) [], conv) + + let push_code +#if darwin_TARGET_OS + | arg_pad_size /= 0 + = toOL [SUB I32 (OpImm (ImmInt arg_pad_size)) (OpReg esp), + DELTA (delta0 - arg_pad_size)] + `appOL` concatOL push_codes + | otherwise +#endif + = concatOL push_codes + call = callinsns `appOL` + toOL ( + -- Deallocate parameters after call for ccall; + -- but not for stdcall (callee does it) + (if cconv == StdCallConv || tot_arg_size==0 then [] else + [ADD I32 (OpImm (ImmInt tot_arg_size)) (OpReg esp)]) + ++ + [DELTA (delta + tot_arg_size)] + ) + -- in + setDeltaNat (delta + tot_arg_size) + + let + -- assign the results, if necessary + assign_code [] = nilOL + assign_code [(dest,_hint)] = + case rep of + I64 -> toOL [MOV I32 (OpReg eax) (OpReg r_dest), + MOV I32 (OpReg edx) (OpReg r_dest_hi)] + F32 -> unitOL (GMOV fake0 r_dest) + F64 -> unitOL (GMOV fake0 r_dest) + rep -> unitOL (MOV rep (OpReg eax) (OpReg r_dest)) + where + r_dest_hi = getHiVRegFromLo r_dest + rep = cmmRegRep dest + r_dest = getRegisterReg dest + assign_code many = panic "genCCall.assign_code many" + + return (push_code `appOL` + call `appOL` + assign_code dest_regs) + + where + arg_size F64 = 8 + arg_size F32 = 4 + arg_size I64 = 8 + arg_size _ = 4 + + roundTo a x | x `mod` a == 0 = x + | otherwise = x + a - (x `mod` a) + + + push_arg :: (CmmExpr,MachHint){-current argument-} + -> NatM InstrBlock -- code + + push_arg (arg,_hint) -- we don't need the hints on x86 + | arg_rep == I64 = do + ChildCode64 code r_lo <- iselExpr64 arg + delta <- getDeltaNat + setDeltaNat (delta - 8) + let + r_hi = getHiVRegFromLo r_lo + -- in + return ( code `appOL` + toOL [PUSH I32 (OpReg r_hi), DELTA (delta - 4), + PUSH I32 (OpReg r_lo), DELTA (delta - 8), + DELTA (delta-8)] + ) + + | otherwise = do + (code, reg, sz) <- get_op arg + delta <- getDeltaNat + let size = arg_size sz + setDeltaNat (delta-size) + if (case sz of F64 -> True; F32 -> True; _ -> False) + then return (code `appOL` + toOL [SUB I32 (OpImm (ImmInt size)) (OpReg esp), + DELTA (delta-size), + GST sz reg (AddrBaseIndex (EABaseReg esp) + EAIndexNone + (ImmInt 0))] + ) + else return (code `snocOL` + PUSH I32 (OpReg reg) `snocOL` + DELTA (delta-size) + ) + where + arg_rep = cmmExprRep arg + + ------------ + get_op :: CmmExpr -> NatM (InstrBlock, Reg, MachRep) -- code, reg, size + get_op op = do + (reg,code) <- getSomeReg op + return (code, reg, cmmExprRep op) + +#endif /* i386_TARGET_ARCH */ + +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + +outOfLineFloatOp :: CallishMachOp -> CmmReg -> [(CmmExpr,MachHint)] + -> Maybe [GlobalReg] -> NatM InstrBlock +outOfLineFloatOp mop res args vols + = do + targetExpr <- cmmMakeDynamicReference addImportNat True lbl + let target = CmmForeignCall targetExpr CCallConv + + if cmmRegRep res == F64 + then + stmtToInstrs (CmmCall target [(res,FloatHint)] args vols) + else do + uq <- getUniqueNat + let + tmp = CmmLocal (LocalReg uq F64) + -- in + code1 <- stmtToInstrs (CmmCall target [(tmp,FloatHint)] args vols) + code2 <- stmtToInstrs (CmmAssign res (CmmReg tmp)) + return (code1 `appOL` code2) + where + lbl = mkForeignLabel fn Nothing True + + fn = case mop of + MO_F32_Sqrt -> FSLIT("sqrtf") + MO_F32_Sin -> FSLIT("sinf") + MO_F32_Cos -> FSLIT("cosf") + MO_F32_Tan -> FSLIT("tanf") + 
MO_F32_Exp -> FSLIT("expf") + MO_F32_Log -> FSLIT("logf") + + MO_F32_Asin -> FSLIT("asinf") + MO_F32_Acos -> FSLIT("acosf") + MO_F32_Atan -> FSLIT("atanf") + + MO_F32_Sinh -> FSLIT("sinhf") + MO_F32_Cosh -> FSLIT("coshf") + MO_F32_Tanh -> FSLIT("tanhf") + MO_F32_Pwr -> FSLIT("powf") + + MO_F64_Sqrt -> FSLIT("sqrt") + MO_F64_Sin -> FSLIT("sin") + MO_F64_Cos -> FSLIT("cos") + MO_F64_Tan -> FSLIT("tan") + MO_F64_Exp -> FSLIT("exp") + MO_F64_Log -> FSLIT("log") + + MO_F64_Asin -> FSLIT("asin") + MO_F64_Acos -> FSLIT("acos") + MO_F64_Atan -> FSLIT("atan") + + MO_F64_Sinh -> FSLIT("sinh") + MO_F64_Cosh -> FSLIT("cosh") + MO_F64_Tanh -> FSLIT("tanh") + MO_F64_Pwr -> FSLIT("pow") + +#endif /* i386_TARGET_ARCH || x86_64_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if x86_64_TARGET_ARCH + +genCCall (CmmPrim op) [(r,_)] args vols = + outOfLineFloatOp op r args vols + +genCCall target dest_regs args vols = do + + -- load up the register arguments + (stack_args, aregs, fregs, load_args_code) + <- load_args args allArgRegs allFPArgRegs nilOL + + let + fp_regs_used = reverse (drop (length fregs) (reverse allFPArgRegs)) + int_regs_used = reverse (drop (length aregs) (reverse allArgRegs)) + arg_regs = int_regs_used ++ fp_regs_used + -- for annotating the call instruction with + + sse_regs = length fp_regs_used + + tot_arg_size = arg_size * length stack_args + + -- On entry to the called function, %rsp should be aligned + -- on a 16-byte boundary +8 (i.e. the first stack arg after + -- the return address is 16-byte aligned). In STG land + -- %rsp is kept 16-byte aligned (see StgCRun.c), so we just + -- need to make sure we push a multiple of 16-bytes of args, + -- plus the return address, to get the correct alignment. + -- Urg, this is hard. We need to feed the delta back into + -- the arg pushing code. + (real_size, adjust_rsp) <- + if tot_arg_size `rem` 16 == 0 + then return (tot_arg_size, nilOL) + else do -- we need to adjust... + delta <- getDeltaNat + setDeltaNat (delta-8) + return (tot_arg_size+8, toOL [ + SUB I64 (OpImm (ImmInt 8)) (OpReg rsp), + DELTA (delta-8) + ]) + + -- push the stack args, right to left + push_code <- push_args (reverse stack_args) nilOL + delta <- getDeltaNat + + -- deal with static vs dynamic call targets + (callinsns,cconv) <- + case target of + -- CmmPrim -> ... + CmmForeignCall (CmmLit (CmmLabel lbl)) conv + -> -- ToDo: stdcall arg sizes + return (unitOL (CALL (Left fn_imm) arg_regs), conv) + where fn_imm = ImmCLbl lbl + CmmForeignCall expr conv + -> do (dyn_r, dyn_c) <- getSomeReg expr + return (dyn_c `snocOL` CALL (Right dyn_r) arg_regs, conv) + + let + -- The x86_64 ABI requires us to set %al to the number of SSE + -- registers that contain arguments, if the called routine + -- is a varargs function. We don't know whether it's a + -- varargs function or not, so we have to assume it is. + -- + -- It's not safe to omit this assignment, even if the number + -- of SSE regs in use is zero. If %al is larger than 8 + -- on entry to a varargs function, seg faults ensue. 
+ assign_eax n = unitOL (MOV I32 (OpImm (ImmInt n)) (OpReg eax)) + + let call = callinsns `appOL` + toOL ( + -- Deallocate parameters after call for ccall; + -- but not for stdcall (callee does it) + (if cconv == StdCallConv || real_size==0 then [] else + [ADD wordRep (OpImm (ImmInt real_size)) (OpReg esp)]) + ++ + [DELTA (delta + real_size)] + ) + -- in + setDeltaNat (delta + real_size) + + let + -- assign the results, if necessary + assign_code [] = nilOL + assign_code [(dest,_hint)] = + case rep of + F32 -> unitOL (MOV rep (OpReg xmm0) (OpReg r_dest)) + F64 -> unitOL (MOV rep (OpReg xmm0) (OpReg r_dest)) + rep -> unitOL (MOV rep (OpReg rax) (OpReg r_dest)) + where + rep = cmmRegRep dest + r_dest = getRegisterReg dest + assign_code many = panic "genCCall.assign_code many" + + return (load_args_code `appOL` + adjust_rsp `appOL` + push_code `appOL` + assign_eax sse_regs `appOL` + call `appOL` + assign_code dest_regs) + + where + arg_size = 8 -- always, at the mo + + load_args :: [(CmmExpr,MachHint)] + -> [Reg] -- int regs avail for args + -> [Reg] -- FP regs avail for args + -> InstrBlock + -> NatM ([(CmmExpr,MachHint)],[Reg],[Reg],InstrBlock) + load_args args [] [] code = return (args, [], [], code) + -- no more regs to use + load_args [] aregs fregs code = return ([], aregs, fregs, code) + -- no more args to push + load_args ((arg,hint) : rest) aregs fregs code + | isFloatingRep arg_rep = + case fregs of + [] -> push_this_arg + (r:rs) -> do + arg_code <- getAnyReg arg + load_args rest aregs rs (code `appOL` arg_code r) + | otherwise = + case aregs of + [] -> push_this_arg + (r:rs) -> do + arg_code <- getAnyReg arg + load_args rest rs fregs (code `appOL` arg_code r) + where + arg_rep = cmmExprRep arg + + push_this_arg = do + (args',ars,frs,code') <- load_args rest aregs fregs code + return ((arg,hint):args', ars, frs, code') + + push_args [] code = return code + push_args ((arg,hint):rest) code + | isFloatingRep arg_rep = do + (arg_reg, arg_code) <- getSomeReg arg + delta <- getDeltaNat + setDeltaNat (delta-arg_size) + let code' = code `appOL` toOL [ + MOV arg_rep (OpReg arg_reg) (OpAddr (spRel 0)), + SUB wordRep (OpImm (ImmInt arg_size)) (OpReg rsp) , + DELTA (delta-arg_size)] + push_args rest code' + + | otherwise = do + -- we only ever generate word-sized function arguments. Promotion + -- has already happened: our Int8# type is kept sign-extended + -- in an Int#, for example. + ASSERT(arg_rep == I64) return () + (arg_op, arg_code) <- getOperand arg + delta <- getDeltaNat + setDeltaNat (delta-arg_size) + let code' = code `appOL` toOL [PUSH I64 arg_op, + DELTA (delta-arg_size)] + push_args rest code' + where + arg_rep = cmmExprRep arg +#endif + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if sparc_TARGET_ARCH +{- + The SPARC calling convention is an absolute + nightmare. The first 6x32 bits of arguments are mapped into + %o0 through %o5, and the remaining arguments are dumped to the + stack, beginning at [%sp+92]. (Note that %o6 == %sp.) + + If we have to put args on the stack, move %o6==%sp down by + the number of words to go on the stack, to ensure there's enough space. + + According to Fraser and Hanson's lcc book, page 478, fig 17.2, + 16 words above the stack pointer is a word for the address of + a structure return value. I use this as a temporary location + for moving values from float to int regs. 
Certainly it isn't + safe to put anything in the 16 words starting at %sp, since + this area can get trashed at any time due to window overflows + caused by signal handlers. + + A final complication (if the above isn't enough) is that + we can't blithely calculate the arguments one by one into + %o0 .. %o5. Consider the following nested calls: + + fff a (fff b c) + + Naive code moves a into %o0, and (fff b c) into %o1. Unfortunately + the inner call will itself use %o0, which trashes the value put there + in preparation for the outer call. Upshot: we need to calculate the + args into temporary regs, and move those to arg regs or onto the + stack only immediately prior to the call proper. Sigh. +-} + +genCCall target dest_regs argsAndHints vols = do + let + args = map fst argsAndHints + argcode_and_vregs <- mapM arg_to_int_vregs args + let + (argcodes, vregss) = unzip argcode_and_vregs + n_argRegs = length allArgRegs + n_argRegs_used = min (length vregs) n_argRegs + vregs = concat vregss + -- deal with static vs dynamic call targets + callinsns <- (case target of + CmmForeignCall (CmmLit (CmmLabel lbl)) conv -> do + return (unitOL (CALL (Left (litToImm (CmmLabel lbl))) n_argRegs_used False)) + CmmForeignCall expr conv -> do + (dyn_c, [dyn_r]) <- arg_to_int_vregs expr + return (dyn_c `snocOL` CALL (Right dyn_r) n_argRegs_used False) + CmmPrim mop -> do + (res, reduce) <- outOfLineFloatOp mop + lblOrMopExpr <- case res of + Left lbl -> do + return (unitOL (CALL (Left (litToImm (CmmLabel lbl))) n_argRegs_used False)) + Right mopExpr -> do + (dyn_c, [dyn_r]) <- arg_to_int_vregs mopExpr + return (dyn_c `snocOL` CALL (Right dyn_r) n_argRegs_used False) + if reduce then panic "genCCall(sparc): can not reduce" else return lblOrMopExpr + + ) + let + argcode = concatOL argcodes + (move_sp_down, move_sp_up) + = let diff = length vregs - n_argRegs + nn = if odd diff then diff + 1 else diff -- keep 8-byte alignment + in if nn <= 0 + then (nilOL, nilOL) + else (unitOL (moveSp (-1*nn)), unitOL (moveSp (1*nn))) + transfer_code + = toOL (move_final vregs allArgRegs eXTRA_STK_ARGS_HERE) + return (argcode `appOL` + move_sp_down `appOL` + transfer_code `appOL` + callinsns `appOL` + unitOL NOP `appOL` + move_sp_up) + where + -- move args from the integer vregs into which they have been + -- marshalled, into %o0 .. %o5, and the rest onto the stack. + move_final :: [Reg] -> [Reg] -> Int -> [Instr] + + move_final [] _ offset -- all args done + = [] + + move_final (v:vs) [] offset -- out of aregs; move to stack + = ST I32 v (spRel offset) + : move_final vs [] (offset+1) + + move_final (v:vs) (a:az) offset -- move into an arg (%o[0..5]) reg + = OR False g0 (RIReg v) a + : move_final vs az offset + + -- generate code to calculate an argument, and move it into one + -- or two integer vregs. 
+ arg_to_int_vregs :: CmmExpr -> NatM (OrdList Instr, [Reg]) + arg_to_int_vregs arg + | (cmmExprRep arg) == I64 + = do + (ChildCode64 code r_lo) <- iselExpr64 arg + let + r_hi = getHiVRegFromLo r_lo + return (code, [r_hi, r_lo]) + | otherwise + = do + (src, code) <- getSomeReg arg + tmp <- getNewRegNat (cmmExprRep arg) + let + pk = cmmExprRep arg + case pk of + F64 -> do + v1 <- getNewRegNat I32 + v2 <- getNewRegNat I32 + return ( + code `snocOL` + FMOV F64 src f0 `snocOL` + ST F32 f0 (spRel 16) `snocOL` + LD I32 (spRel 16) v1 `snocOL` + ST F32 (fPair f0) (spRel 16) `snocOL` + LD I32 (spRel 16) v2 + , + [v1,v2] + ) + F32 -> do + v1 <- getNewRegNat I32 + return ( + code `snocOL` + ST F32 src (spRel 16) `snocOL` + LD I32 (spRel 16) v1 + , + [v1] + ) + other -> do + v1 <- getNewRegNat I32 + return ( + code `snocOL` OR False g0 (RIReg src) v1 + , + [v1] + ) +outOfLineFloatOp mop = + do + mopExpr <- cmmMakeDynamicReference addImportNat True $ + mkForeignLabel functionName Nothing True + let mopLabelOrExpr = case mopExpr of + CmmLit (CmmLabel lbl) -> Left lbl + _ -> Right mopExpr + return (mopLabelOrExpr, reduce) + where + (reduce, functionName) = case mop of + MO_F32_Exp -> (True, FSLIT("exp")) + MO_F32_Log -> (True, FSLIT("log")) + MO_F32_Sqrt -> (True, FSLIT("sqrt")) + + MO_F32_Sin -> (True, FSLIT("sin")) + MO_F32_Cos -> (True, FSLIT("cos")) + MO_F32_Tan -> (True, FSLIT("tan")) + + MO_F32_Asin -> (True, FSLIT("asin")) + MO_F32_Acos -> (True, FSLIT("acos")) + MO_F32_Atan -> (True, FSLIT("atan")) + + MO_F32_Sinh -> (True, FSLIT("sinh")) + MO_F32_Cosh -> (True, FSLIT("cosh")) + MO_F32_Tanh -> (True, FSLIT("tanh")) + + MO_F64_Exp -> (False, FSLIT("exp")) + MO_F64_Log -> (False, FSLIT("log")) + MO_F64_Sqrt -> (False, FSLIT("sqrt")) + + MO_F64_Sin -> (False, FSLIT("sin")) + MO_F64_Cos -> (False, FSLIT("cos")) + MO_F64_Tan -> (False, FSLIT("tan")) + + MO_F64_Asin -> (False, FSLIT("asin")) + MO_F64_Acos -> (False, FSLIT("acos")) + MO_F64_Atan -> (False, FSLIT("atan")) + + MO_F64_Sinh -> (False, FSLIT("sinh")) + MO_F64_Cosh -> (False, FSLIT("cosh")) + MO_F64_Tanh -> (False, FSLIT("tanh")) + + other -> pprPanic "outOfLineFloatOp(sparc) " + (pprCallishMachOp mop) + +#endif /* sparc_TARGET_ARCH */ + +#if powerpc_TARGET_ARCH + +#if darwin_TARGET_OS || linux_TARGET_OS +{- + The PowerPC calling convention for Darwin/Mac OS X + is described in Apple's document + "Inside Mac OS X - Mach-O Runtime Architecture". + + PowerPC Linux uses the System V Release 4 Calling Convention + for PowerPC. It is described in the + "System V Application Binary Interface PowerPC Processor Supplement". + + Both conventions are similar: + Parameters may be passed in general-purpose registers starting at r3, in + floating point registers starting at f1, or on the stack. + + But there are substantial differences: + * The number of registers used for parameter passing and the exact set of + nonvolatile registers differs (see MachRegs.lhs). + * On Darwin, stack space is always reserved for parameters, even if they are + passed in registers. The called routine may choose to save parameters from + registers to the corresponding space on the stack. + * On Darwin, a corresponding amount of GPRs is skipped when a floating point + parameter is passed in an FPR. + * SysV insists on either passing I64 arguments on the stack, or in two GPRs, + starting with an odd-numbered GPR. It may skip a GPR to achieve this. + Darwin just treats an I64 like two separate I32s (high word first). 
+ * I64 and F64 arguments are 8-byte aligned on the stack for SysV, but only + 4-byte aligned like everything else on Darwin. + * The SysV spec claims that F32 is represented as F64 on the stack. GCC on + PowerPC Linux does not agree, so neither do we. + + According to both conventions, The parameter area should be part of the + caller's stack frame, allocated in the caller's prologue code (large enough + to hold the parameter lists for all called routines). The NCG already + uses the stack for register spilling, leaving 64 bytes free at the top. + If we need a larger parameter area than that, we just allocate a new stack + frame just before ccalling. +-} + +genCCall target dest_regs argsAndHints vols + = ASSERT (not $ any (`elem` [I8,I16]) argReps) + -- we rely on argument promotion in the codeGen + do + (finalStack,passArgumentsCode,usedRegs) <- passArguments + (zip args argReps) + allArgRegs allFPArgRegs + initialStackOffset + (toOL []) [] + + (labelOrExpr, reduceToF32) <- case target of + CmmForeignCall (CmmLit (CmmLabel lbl)) conv -> return (Left lbl, False) + CmmForeignCall expr conv -> return (Right expr, False) + CmmPrim mop -> outOfLineFloatOp mop + + let codeBefore = move_sp_down finalStack `appOL` passArgumentsCode + codeAfter = move_sp_up finalStack `appOL` moveResult reduceToF32 + + case labelOrExpr of + Left lbl -> do + return ( codeBefore + `snocOL` BL lbl usedRegs + `appOL` codeAfter) + Right dyn -> do + (dynReg, dynCode) <- getSomeReg dyn + return ( dynCode + `snocOL` MTCTR dynReg + `appOL` codeBefore + `snocOL` BCTRL usedRegs + `appOL` codeAfter) + where +#if darwin_TARGET_OS + initialStackOffset = 24 + -- size of linkage area + size of arguments, in bytes + stackDelta _finalStack = roundTo 16 $ (24 +) $ max 32 $ sum $ + map machRepByteWidth argReps +#elif linux_TARGET_OS + initialStackOffset = 8 + stackDelta finalStack = roundTo 16 finalStack +#endif + args = map fst argsAndHints + argReps = map cmmExprRep args + + roundTo a x | x `mod` a == 0 = x + | otherwise = x + a - (x `mod` a) + + move_sp_down finalStack + | delta > 64 = + toOL [STU I32 sp (AddrRegImm sp (ImmInt (-delta))), + DELTA (-delta)] + | otherwise = nilOL + where delta = stackDelta finalStack + move_sp_up finalStack + | delta > 64 = + toOL [ADD sp sp (RIImm (ImmInt delta)), + DELTA 0] + | otherwise = nilOL + where delta = stackDelta finalStack + + + passArguments [] _ _ stackOffset accumCode accumUsed = return (stackOffset, accumCode, accumUsed) + passArguments ((arg,I64):args) gprs fprs stackOffset + accumCode accumUsed = + do + ChildCode64 code vr_lo <- iselExpr64 arg + let vr_hi = getHiVRegFromLo vr_lo + +#if darwin_TARGET_OS + passArguments args + (drop 2 gprs) + fprs + (stackOffset+8) + (accumCode `appOL` code + `snocOL` storeWord vr_hi gprs stackOffset + `snocOL` storeWord vr_lo (drop 1 gprs) (stackOffset+4)) + ((take 2 gprs) ++ accumUsed) + where + storeWord vr (gpr:_) offset = MR gpr vr + storeWord vr [] offset = ST I32 vr (AddrRegImm sp (ImmInt offset)) + +#elif linux_TARGET_OS + let stackOffset' = roundTo 8 stackOffset + stackCode = accumCode `appOL` code + `snocOL` ST I32 vr_hi (AddrRegImm sp (ImmInt stackOffset')) + `snocOL` ST I32 vr_lo (AddrRegImm sp (ImmInt (stackOffset'+4))) + regCode hireg loreg = + accumCode `appOL` code + `snocOL` MR hireg vr_hi + `snocOL` MR loreg vr_lo + + case gprs of + hireg : loreg : regs | even (length gprs) -> + passArguments args regs fprs stackOffset + (regCode hireg loreg) (hireg : loreg : accumUsed) + _skipped : hireg : loreg : regs -> + passArguments args regs 
fprs stackOffset + (regCode hireg loreg) (hireg : loreg : accumUsed) + _ -> -- only one or no regs left + passArguments args [] fprs (stackOffset'+8) + stackCode accumUsed +#endif + + passArguments ((arg,rep):args) gprs fprs stackOffset accumCode accumUsed + | reg : _ <- regs = do + register <- getRegister arg + let code = case register of + Fixed _ freg fcode -> fcode `snocOL` MR reg freg + Any _ acode -> acode reg + passArguments args + (drop nGprs gprs) + (drop nFprs fprs) +#if darwin_TARGET_OS + -- The Darwin ABI requires that we reserve stack slots for register parameters + (stackOffset + stackBytes) +#elif linux_TARGET_OS + -- ... the SysV ABI doesn't. + stackOffset +#endif + (accumCode `appOL` code) + (reg : accumUsed) + | otherwise = do + (vr, code) <- getSomeReg arg + passArguments args + (drop nGprs gprs) + (drop nFprs fprs) + (stackOffset' + stackBytes) + (accumCode `appOL` code `snocOL` ST rep vr stackSlot) + accumUsed + where +#if darwin_TARGET_OS + -- stackOffset is at least 4-byte aligned + -- The Darwin ABI is happy with that. + stackOffset' = stackOffset +#else + -- ... the SysV ABI requires 8-byte alignment for doubles. + stackOffset' | rep == F64 = roundTo 8 stackOffset + | otherwise = stackOffset +#endif + stackSlot = AddrRegImm sp (ImmInt stackOffset') + (nGprs, nFprs, stackBytes, regs) = case rep of + I32 -> (1, 0, 4, gprs) +#if darwin_TARGET_OS + -- The Darwin ABI requires that we skip a corresponding number of GPRs when + -- we use the FPRs. + F32 -> (1, 1, 4, fprs) + F64 -> (2, 1, 8, fprs) +#elif linux_TARGET_OS + -- ... the SysV ABI doesn't. + F32 -> (0, 1, 4, fprs) + F64 -> (0, 1, 8, fprs) +#endif + + moveResult reduceToF32 = + case dest_regs of + [] -> nilOL + [(dest, _hint)] + | reduceToF32 && rep == F32 -> unitOL (FRSP r_dest f1) + | rep == F32 || rep == F64 -> unitOL (MR r_dest f1) + | rep == I64 -> toOL [MR (getHiVRegFromLo r_dest) r3, + MR r_dest r4] + | otherwise -> unitOL (MR r_dest r3) + where rep = cmmRegRep dest + r_dest = getRegisterReg dest + + outOfLineFloatOp mop = + do + mopExpr <- cmmMakeDynamicReference addImportNat True $ + mkForeignLabel functionName Nothing True + let mopLabelOrExpr = case mopExpr of + CmmLit (CmmLabel lbl) -> Left lbl + _ -> Right mopExpr + return (mopLabelOrExpr, reduce) + where + (functionName, reduce) = case mop of + MO_F32_Exp -> (FSLIT("exp"), True) + MO_F32_Log -> (FSLIT("log"), True) + MO_F32_Sqrt -> (FSLIT("sqrt"), True) + + MO_F32_Sin -> (FSLIT("sin"), True) + MO_F32_Cos -> (FSLIT("cos"), True) + MO_F32_Tan -> (FSLIT("tan"), True) + + MO_F32_Asin -> (FSLIT("asin"), True) + MO_F32_Acos -> (FSLIT("acos"), True) + MO_F32_Atan -> (FSLIT("atan"), True) + + MO_F32_Sinh -> (FSLIT("sinh"), True) + MO_F32_Cosh -> (FSLIT("cosh"), True) + MO_F32_Tanh -> (FSLIT("tanh"), True) + MO_F32_Pwr -> (FSLIT("pow"), True) + + MO_F64_Exp -> (FSLIT("exp"), False) + MO_F64_Log -> (FSLIT("log"), False) + MO_F64_Sqrt -> (FSLIT("sqrt"), False) + + MO_F64_Sin -> (FSLIT("sin"), False) + MO_F64_Cos -> (FSLIT("cos"), False) + MO_F64_Tan -> (FSLIT("tan"), False) + + MO_F64_Asin -> (FSLIT("asin"), False) + MO_F64_Acos -> (FSLIT("acos"), False) + MO_F64_Atan -> (FSLIT("atan"), False) + + MO_F64_Sinh -> (FSLIT("sinh"), False) + MO_F64_Cosh -> (FSLIT("cosh"), False) + MO_F64_Tanh -> (FSLIT("tanh"), False) + MO_F64_Pwr -> (FSLIT("pow"), False) + other -> pprPanic "genCCall(ppc): unknown callish op" + (pprCallishMachOp other) + +#endif /* darwin_TARGET_OS || linux_TARGET_OS */ + +#endif /* powerpc_TARGET_ARCH */ + + +-- 
----------------------------------------------------------------------------- +-- Generating a table-branch + +genSwitch :: CmmExpr -> [Maybe BlockId] -> NatM InstrBlock + +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH +genSwitch expr ids + | opt_PIC + = do + (reg,e_code) <- getSomeReg expr + lbl <- getNewLabelNat + dynRef <- cmmMakeDynamicReference addImportNat False lbl + (tableReg,t_code) <- getSomeReg $ dynRef + let + jumpTable = map jumpTableEntryRel ids + + jumpTableEntryRel Nothing + = CmmStaticLit (CmmInt 0 wordRep) + jumpTableEntryRel (Just (BlockId id)) + = CmmStaticLit (CmmLabelDiffOff blockLabel lbl 0) + where blockLabel = mkAsmTempLabel id + + op = OpAddr (AddrBaseIndex (EABaseReg tableReg) + (EAIndex reg wORD_SIZE) (ImmInt 0)) + + code = e_code `appOL` t_code `appOL` toOL [ + LDATA ReadOnlyData (CmmDataLabel lbl : jumpTable), + ADD wordRep op (OpReg tableReg), + JMP_TBL (OpReg tableReg) [ id | Just id <- ids ] + ] + return code + | otherwise + = do + (reg,e_code) <- getSomeReg expr + lbl <- getNewLabelNat + let + jumpTable = map jumpTableEntry ids + op = OpAddr (AddrBaseIndex EABaseNone (EAIndex reg wORD_SIZE) (ImmCLbl lbl)) + code = e_code `appOL` toOL [ + LDATA ReadOnlyData (CmmDataLabel lbl : jumpTable), + JMP_TBL op [ id | Just id <- ids ] + ] + -- in + return code +#elif powerpc_TARGET_ARCH +genSwitch expr ids + | opt_PIC + = do + (reg,e_code) <- getSomeReg expr + tmp <- getNewRegNat I32 + lbl <- getNewLabelNat + dynRef <- cmmMakeDynamicReference addImportNat False lbl + (tableReg,t_code) <- getSomeReg $ dynRef + let + jumpTable = map jumpTableEntryRel ids + + jumpTableEntryRel Nothing + = CmmStaticLit (CmmInt 0 wordRep) + jumpTableEntryRel (Just (BlockId id)) + = CmmStaticLit (CmmLabelDiffOff blockLabel lbl 0) + where blockLabel = mkAsmTempLabel id + + code = e_code `appOL` t_code `appOL` toOL [ + LDATA ReadOnlyData (CmmDataLabel lbl : jumpTable), + SLW tmp reg (RIImm (ImmInt 2)), + LD I32 tmp (AddrRegReg tableReg tmp), + ADD tmp tmp (RIReg tableReg), + MTCTR tmp, + BCTR [ id | Just id <- ids ] + ] + return code + | otherwise + = do + (reg,e_code) <- getSomeReg expr + tmp <- getNewRegNat I32 + lbl <- getNewLabelNat + let + jumpTable = map jumpTableEntry ids + + code = e_code `appOL` toOL [ + LDATA ReadOnlyData (CmmDataLabel lbl : jumpTable), + SLW tmp reg (RIImm (ImmInt 2)), + ADDIS tmp tmp (HA (ImmCLbl lbl)), + LD I32 tmp (AddrRegImm tmp (LO (ImmCLbl lbl))), + MTCTR tmp, + BCTR [ id | Just id <- ids ] + ] + return code +#else +genSwitch expr ids = panic "ToDo: genSwitch" +#endif + +jumpTableEntry Nothing = CmmStaticLit (CmmInt 0 wordRep) +jumpTableEntry (Just (BlockId id)) = CmmStaticLit (CmmLabel blockLabel) + where blockLabel = mkAsmTempLabel id + +-- ----------------------------------------------------------------------------- +-- Support bits +-- ----------------------------------------------------------------------------- + + +-- ----------------------------------------------------------------------------- +-- 'condIntReg' and 'condFltReg': condition codes into registers + +-- Turn those condition codes into integers now (when they appear on +-- the right hand side of an assignment). +-- +-- (If applicable) Do not fill the delay slots here; you will confuse the +-- register allocator. 
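+--
+-- Roughly, on x86 the boolean for (x == y) is materialised by setting an
+-- 8-bit register from the flags and zero-extending it to a full word:
+--
+--      CMP ... ; SETCC EQQ (OpReg tmp) ; MOVZxL I8 (OpReg tmp) (OpReg dst)
+--
+-- SPARC has no such instruction and builds the 0/1 either with carry-flag
+-- tricks (subcc, subx, addx) or with a small diamond of branches, and
+-- PowerPC copies the condition register with MFCR and isolates the relevant
+-- bit with RLWINM; see the per-architecture code below.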
+ +condIntReg, condFltReg :: Cond -> CmmExpr -> CmmExpr -> NatM Register + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if alpha_TARGET_ARCH +condIntReg = panic "MachCode.condIntReg (not on Alpha)" +condFltReg = panic "MachCode.condFltReg (not on Alpha)" +#endif /* alpha_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + +condIntReg cond x y = do + CondCode _ cond cond_code <- condIntCode cond x y + tmp <- getNewRegNat I8 + let + code dst = cond_code `appOL` toOL [ + SETCC cond (OpReg tmp), + MOVZxL I8 (OpReg tmp) (OpReg dst) + ] + -- in + return (Any I32 code) + +#endif + +#if i386_TARGET_ARCH + +condFltReg cond x y = do + CondCode _ cond cond_code <- condFltCode cond x y + tmp <- getNewRegNat I8 + let + code dst = cond_code `appOL` toOL [ + SETCC cond (OpReg tmp), + MOVZxL I8 (OpReg tmp) (OpReg dst) + ] + -- in + return (Any I32 code) + +#endif + +#if x86_64_TARGET_ARCH + +condFltReg cond x y = do + CondCode _ cond cond_code <- condFltCode cond x y + tmp1 <- getNewRegNat wordRep + tmp2 <- getNewRegNat wordRep + let + -- We have to worry about unordered operands (eg. comparisons + -- against NaN). If the operands are unordered, the comparison + -- sets the parity flag, carry flag and zero flag. + -- All comparisons are supposed to return false for unordered + -- operands except for !=, which returns true. + -- + -- Optimisation: we don't have to test the parity flag if we + -- know the test has already excluded the unordered case: eg > + -- and >= test for a zero carry flag, which can only occur for + -- ordered operands. + -- + -- ToDo: by reversing comparisons we could avoid testing the + -- parity flag in more cases. + + code dst = + cond_code `appOL` + (case cond of + NE -> or_unordered dst + GU -> plain_test dst + GEU -> plain_test dst + _ -> and_ordered dst) + + plain_test dst = toOL [ + SETCC cond (OpReg tmp1), + MOVZxL I8 (OpReg tmp1) (OpReg dst) + ] + or_unordered dst = toOL [ + SETCC cond (OpReg tmp1), + SETCC PARITY (OpReg tmp2), + OR I8 (OpReg tmp1) (OpReg tmp2), + MOVZxL I8 (OpReg tmp2) (OpReg dst) + ] + and_ordered dst = toOL [ + SETCC cond (OpReg tmp1), + SETCC NOTPARITY (OpReg tmp2), + AND I8 (OpReg tmp1) (OpReg tmp2), + MOVZxL I8 (OpReg tmp2) (OpReg dst) + ] + -- in + return (Any I32 code) + +#endif + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if sparc_TARGET_ARCH + +condIntReg EQQ x (CmmLit (CmmInt 0 d)) = do + (src, code) <- getSomeReg x + tmp <- getNewRegNat I32 + let + code__2 dst = code `appOL` toOL [ + SUB False True g0 (RIReg src) g0, + SUB True False g0 (RIImm (ImmInt (-1))) dst] + return (Any I32 code__2) + +condIntReg EQQ x y = do + (src1, code1) <- getSomeReg x + (src2, code2) <- getSomeReg y + tmp1 <- getNewRegNat I32 + tmp2 <- getNewRegNat I32 + let + code__2 dst = code1 `appOL` code2 `appOL` toOL [ + XOR False src1 (RIReg src2) dst, + SUB False True g0 (RIReg dst) g0, + SUB True False g0 (RIImm (ImmInt (-1))) dst] + return (Any I32 code__2) + +condIntReg NE x (CmmLit (CmmInt 0 d)) = do + (src, code) <- getSomeReg x + tmp <- getNewRegNat I32 + let + code__2 dst = code `appOL` toOL [ + SUB False True g0 (RIReg src) g0, + ADD True False g0 (RIImm (ImmInt 0)) dst] + return (Any I32 code__2) + +condIntReg NE x y = do + (src1, code1) <- getSomeReg x + (src2, code2) <- getSomeReg y + tmp1 <- getNewRegNat I32 + tmp2 <- getNewRegNat I32 + let + code__2 dst = code1 `appOL` code2 `appOL` toOL [ + XOR False src1 (RIReg src2) dst, + SUB 
False True g0 (RIReg dst) g0, + ADD True False g0 (RIImm (ImmInt 0)) dst] + return (Any I32 code__2) + +condIntReg cond x y = do + BlockId lbl1 <- getBlockIdNat + BlockId lbl2 <- getBlockIdNat + CondCode _ cond cond_code <- condIntCode cond x y + let + code__2 dst = cond_code `appOL` toOL [ + BI cond False (ImmCLbl (mkAsmTempLabel lbl1)), NOP, + OR False g0 (RIImm (ImmInt 0)) dst, + BI ALWAYS False (ImmCLbl (mkAsmTempLabel lbl2)), NOP, + NEWBLOCK (BlockId lbl1), + OR False g0 (RIImm (ImmInt 1)) dst, + NEWBLOCK (BlockId lbl2)] + return (Any I32 code__2) + +condFltReg cond x y = do + BlockId lbl1 <- getBlockIdNat + BlockId lbl2 <- getBlockIdNat + CondCode _ cond cond_code <- condFltCode cond x y + let + code__2 dst = cond_code `appOL` toOL [ + NOP, + BF cond False (ImmCLbl (mkAsmTempLabel lbl1)), NOP, + OR False g0 (RIImm (ImmInt 0)) dst, + BI ALWAYS False (ImmCLbl (mkAsmTempLabel lbl2)), NOP, + NEWBLOCK (BlockId lbl1), + OR False g0 (RIImm (ImmInt 1)) dst, + NEWBLOCK (BlockId lbl2)] + return (Any I32 code__2) + +#endif /* sparc_TARGET_ARCH */ + +#if powerpc_TARGET_ARCH +condReg getCond = do + lbl1 <- getBlockIdNat + lbl2 <- getBlockIdNat + CondCode _ cond cond_code <- getCond + let +{- code dst = cond_code `appOL` toOL [ + BCC cond lbl1, + LI dst (ImmInt 0), + BCC ALWAYS lbl2, + NEWBLOCK lbl1, + LI dst (ImmInt 1), + BCC ALWAYS lbl2, + NEWBLOCK lbl2 + ]-} + code dst = cond_code + `appOL` negate_code + `appOL` toOL [ + MFCR dst, + RLWINM dst dst (bit + 1) 31 31 + ] + + negate_code | do_negate = unitOL (CRNOR bit bit bit) + | otherwise = nilOL + + (bit, do_negate) = case cond of + LTT -> (0, False) + LE -> (1, True) + EQQ -> (2, False) + GE -> (0, True) + GTT -> (1, False) + + NE -> (2, True) + + LU -> (0, False) + LEU -> (1, True) + GEU -> (0, True) + GU -> (1, False) + + return (Any I32 code) + +condIntReg cond x y = condReg (condIntCode cond x y) +condFltReg cond x y = condReg (condFltCode cond x y) +#endif /* powerpc_TARGET_ARCH */ + + +-- ----------------------------------------------------------------------------- +-- 'trivial*Code': deal with trivial instructions + +-- Trivial (dyadic: 'trivialCode', floating-point: 'trivialFCode', +-- unary: 'trivialUCode', unary fl-pt:'trivialUFCode') instructions. +-- Only look for constants on the right hand side, because that's +-- where the generic optimizer will have put them. + +-- Similarly, for unary instructions, we don't have to worry about +-- matching an StInt as the argument, because genericOpt will already +-- have handled the constant-folding. 
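The "constants on the right hand side" convention described above can be made concrete with a small, self-contained sketch; this is a toy illustration with made-up types (Expr, Instr'), not the real Cmm or Instr machinery:

data Expr   = Lit Int | Var String                      -- toy expression type
data Instr' = AddRI String Int | AddRR String String    -- toy target instructions
  deriving Show

-- Only the second operand is inspected for a literal; a literal on the left is
-- assumed to have been moved right already by the generic optimiser.
selectAdd :: Expr -> Expr -> Maybe Instr'
selectAdd (Var r)  (Lit n)  = Just (AddRI r n)
selectAdd (Var r1) (Var r2) = Just (AddRR r1 r2)
selectAdd _        _        = Nothing

main :: IO ()
main = do
  print (selectAdd (Var "x") (Lit 3))   -- Just (AddRI "x" 3)
  print (selectAdd (Lit 3) (Var "x"))   -- Nothing: relies on prior canonicalisation
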
+ +trivialCode + :: MachRep + -> IF_ARCH_alpha((Reg -> RI -> Reg -> Instr) + ,IF_ARCH_i386 ((Operand -> Operand -> Instr) + -> Maybe (Operand -> Operand -> Instr) + ,IF_ARCH_x86_64 ((Operand -> Operand -> Instr) + -> Maybe (Operand -> Operand -> Instr) + ,IF_ARCH_sparc((Reg -> RI -> Reg -> Instr) + ,IF_ARCH_powerpc(Bool -> (Reg -> Reg -> RI -> Instr) + ,))))) + -> CmmExpr -> CmmExpr -- the two arguments + -> NatM Register + +#ifndef powerpc_TARGET_ARCH +trivialFCode + :: MachRep + -> IF_ARCH_alpha((Reg -> Reg -> Reg -> Instr) + ,IF_ARCH_sparc((MachRep -> Reg -> Reg -> Reg -> Instr) + ,IF_ARCH_i386 ((MachRep -> Reg -> Reg -> Reg -> Instr) + ,IF_ARCH_x86_64 ((MachRep -> Operand -> Operand -> Instr) + ,)))) + -> CmmExpr -> CmmExpr -- the two arguments + -> NatM Register +#endif + +trivialUCode + :: MachRep + -> IF_ARCH_alpha((RI -> Reg -> Instr) + ,IF_ARCH_i386 ((Operand -> Instr) + ,IF_ARCH_x86_64 ((Operand -> Instr) + ,IF_ARCH_sparc((RI -> Reg -> Instr) + ,IF_ARCH_powerpc((Reg -> Reg -> Instr) + ,))))) + -> CmmExpr -- the one argument + -> NatM Register + +#ifndef powerpc_TARGET_ARCH +trivialUFCode + :: MachRep + -> IF_ARCH_alpha((Reg -> Reg -> Instr) + ,IF_ARCH_i386 ((Reg -> Reg -> Instr) + ,IF_ARCH_x86_64 ((Reg -> Reg -> Instr) + ,IF_ARCH_sparc((Reg -> Reg -> Instr) + ,)))) + -> CmmExpr -- the one argument + -> NatM Register +#endif + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if alpha_TARGET_ARCH + +trivialCode instr x (StInt y) + | fits8Bits y + = getRegister x `thenNat` \ register -> + getNewRegNat IntRep `thenNat` \ tmp -> + let + code = registerCode register tmp + src1 = registerName register tmp + src2 = ImmInt (fromInteger y) + code__2 dst = code . mkSeqInstr (instr src1 (RIImm src2) dst) + in + return (Any IntRep code__2) + +trivialCode instr x y + = getRegister x `thenNat` \ register1 -> + getRegister y `thenNat` \ register2 -> + getNewRegNat IntRep `thenNat` \ tmp1 -> + getNewRegNat IntRep `thenNat` \ tmp2 -> + let + code1 = registerCode register1 tmp1 [] + src1 = registerName register1 tmp1 + code2 = registerCode register2 tmp2 [] + src2 = registerName register2 tmp2 + code__2 dst = asmSeqThen [code1, code2] . + mkSeqInstr (instr src1 (RIReg src2) dst) + in + return (Any IntRep code__2) + +------------ +trivialUCode instr x + = getRegister x `thenNat` \ register -> + getNewRegNat IntRep `thenNat` \ tmp -> + let + code = registerCode register tmp + src = registerName register tmp + code__2 dst = code . mkSeqInstr (instr (RIReg src) dst) + in + return (Any IntRep code__2) + +------------ +trivialFCode _ instr x y + = getRegister x `thenNat` \ register1 -> + getRegister y `thenNat` \ register2 -> + getNewRegNat F64 `thenNat` \ tmp1 -> + getNewRegNat F64 `thenNat` \ tmp2 -> + let + code1 = registerCode register1 tmp1 + src1 = registerName register1 tmp1 + + code2 = registerCode register2 tmp2 + src2 = registerName register2 tmp2 + + code__2 dst = asmSeqThen [code1 [], code2 []] . + mkSeqInstr (instr src1 src2 dst) + in + return (Any F64 code__2) + +trivialUFCode _ instr x + = getRegister x `thenNat` \ register -> + getNewRegNat F64 `thenNat` \ tmp -> + let + code = registerCode register tmp + src = registerName register tmp + code__2 dst = code . 
mkSeqInstr (instr src dst) + in + return (Any F64 code__2) + +#endif /* alpha_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + +{- +The Rules of the Game are: + +* You cannot assume anything about the destination register dst; + it may be anything, including a fixed reg. + +* You may compute an operand into a fixed reg, but you may not + subsequently change the contents of that fixed reg. If you + want to do so, first copy the value either to a temporary + or into dst. You are free to modify dst even if it happens + to be a fixed reg -- that's not your problem. + +* You cannot assume that a fixed reg will stay live over an + arbitrary computation. The same applies to the dst reg. + +* Temporary regs obtained from getNewRegNat are distinct from + each other and from all other regs, and stay live over + arbitrary computations. + +-------------------- + +SDM's version of The Rules: + +* If getRegister returns Any, that means it can generate correct + code which places the result in any register, period. Even if that + register happens to be read during the computation. + + Corollary #1: this means that if you are generating code for an + operation with two arbitrary operands, you cannot assign the result + of the first operand into the destination register before computing + the second operand. The second operand might require the old value + of the destination register. + + Corollary #2: A function might be able to generate more efficient + code if it knows the destination register is a new temporary (and + therefore not read by any of the sub-computations). + +* If getRegister returns Any, then the code it generates may modify only: + (a) fresh temporaries + (b) the destination register + (c) known registers (eg. %ecx is used by shifts) + In particular, it may *not* modify global registers, unless the global + register happens to be the destination register. +-} + +trivialCode rep instr (Just revinstr) (CmmLit lit_a) b + | not (is64BitLit lit_a) = do + b_code <- getAnyReg b + let + code dst + = b_code dst `snocOL` + revinstr (OpImm (litToImm lit_a)) (OpReg dst) + -- in + return (Any rep code) + +trivialCode rep instr maybe_revinstr a b = genTrivialCode rep instr a b + +-- This is re-used for floating pt instructions too. +genTrivialCode rep instr a b = do + (b_op, b_code) <- getNonClobberedOperand b + a_code <- getAnyReg a + tmp <- getNewRegNat rep + let + -- We want the value of b to stay alive across the computation of a. + -- But, we want to calculate a straight into the destination register, + -- because the instruction only has two operands (dst := dst `op` src). + -- The troublesome case is when the result of b is in the same register + -- as the destination reg. In this case, we have to save b in a + -- new temporary across the computation of a. 
+ code dst + | dst `regClashesWithOp` b_op = + b_code `appOL` + unitOL (MOV rep b_op (OpReg tmp)) `appOL` + a_code dst `snocOL` + instr (OpReg tmp) (OpReg dst) + | otherwise = + b_code `appOL` + a_code dst `snocOL` + instr b_op (OpReg dst) + -- in + return (Any rep code) + +reg `regClashesWithOp` OpReg reg2 = reg == reg2 +reg `regClashesWithOp` OpAddr amode = any (==reg) (addrModeRegs amode) +reg `regClashesWithOp` _ = False + +----------- + +trivialUCode rep instr x = do + x_code <- getAnyReg x + let + code dst = + x_code dst `snocOL` + instr (OpReg dst) + -- in + return (Any rep code) + +----------- + +#if i386_TARGET_ARCH + +trivialFCode pk instr x y = do + (x_reg, x_code) <- getNonClobberedReg x -- these work for float regs too + (y_reg, y_code) <- getSomeReg y + let + code dst = + x_code `appOL` + y_code `snocOL` + instr pk x_reg y_reg dst + -- in + return (Any pk code) + +#endif + +#if x86_64_TARGET_ARCH + +trivialFCode pk instr x y = genTrivialCode pk (instr pk) x y + +#endif + +------------- + +trivialUFCode rep instr x = do + (x_reg, x_code) <- getSomeReg x + let + code dst = + x_code `snocOL` + instr x_reg dst + -- in + return (Any rep code) + +#endif /* i386_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if sparc_TARGET_ARCH + +trivialCode pk instr x (CmmLit (CmmInt y d)) + | fits13Bits y + = do + (src1, code) <- getSomeReg x + tmp <- getNewRegNat I32 + let + src2 = ImmInt (fromInteger y) + code__2 dst = code `snocOL` instr src1 (RIImm src2) dst + return (Any I32 code__2) + +trivialCode pk instr x y = do + (src1, code1) <- getSomeReg x + (src2, code2) <- getSomeReg y + tmp1 <- getNewRegNat I32 + tmp2 <- getNewRegNat I32 + let + code__2 dst = code1 `appOL` code2 `snocOL` + instr src1 (RIReg src2) dst + return (Any I32 code__2) + +------------ +trivialFCode pk instr x y = do + (src1, code1) <- getSomeReg x + (src2, code2) <- getSomeReg y + tmp1 <- getNewRegNat (cmmExprRep x) + tmp2 <- getNewRegNat (cmmExprRep y) + tmp <- getNewRegNat F64 + let + promote x = FxTOy F32 F64 x tmp + + pk1 = cmmExprRep x + pk2 = cmmExprRep y + + code__2 dst = + if pk1 == pk2 then + code1 `appOL` code2 `snocOL` + instr pk src1 src2 dst + else if pk1 == F32 then + code1 `snocOL` promote src1 `appOL` code2 `snocOL` + instr F64 tmp src2 dst + else + code1 `appOL` code2 `snocOL` promote src2 `snocOL` + instr F64 src1 tmp dst + return (Any (if pk1 == pk2 then pk1 else F64) code__2) + +------------ +trivialUCode pk instr x = do + (src, code) <- getSomeReg x + tmp <- getNewRegNat pk + let + code__2 dst = code `snocOL` instr (RIReg src) dst + return (Any pk code__2) + +------------- +trivialUFCode pk instr x = do + (src, code) <- getSomeReg x + tmp <- getNewRegNat pk + let + code__2 dst = code `snocOL` instr src dst + return (Any pk code__2) + +#endif /* sparc_TARGET_ARCH */ + +#if powerpc_TARGET_ARCH + +{- +Wolfgang's PowerPC version of The Rules: + +A slightly modified version of The Rules to take advantage of the fact +that PowerPC instructions work on all registers and don't implicitly +clobber any fixed registers. + +* The only expression for which getRegister returns Fixed is (CmmReg reg). + +* If getRegister returns Any, then the code it generates may modify only: + (a) fresh temporaries + (b) the destination register + It may *not* modify global registers, unless the global + register happens to be the destination register. + It may not clobber any other registers. In fact, only ccalls clobber any + fixed registers. 
+ Also, it may not modify the counter register (used by genCCall). + + Corollary: If a getRegister for a subexpression returns Fixed, you need + not move it to a fresh temporary before evaluating the next subexpression. + The Fixed register won't be modified. + Therefore, we don't need a counterpart for the x86's getStableReg on PPC. + +* SDM's First Rule is valid for PowerPC, too: subexpressions can depend on + the value of the destination register. +-} + +trivialCode rep signed instr x (CmmLit (CmmInt y _)) + | Just imm <- makeImmediate rep signed y + = do + (src1, code1) <- getSomeReg x + let code dst = code1 `snocOL` instr dst src1 (RIImm imm) + return (Any rep code) + +trivialCode rep signed instr x y = do + (src1, code1) <- getSomeReg x + (src2, code2) <- getSomeReg y + let code dst = code1 `appOL` code2 `snocOL` instr dst src1 (RIReg src2) + return (Any rep code) + +trivialCodeNoImm :: MachRep -> (Reg -> Reg -> Reg -> Instr) + -> CmmExpr -> CmmExpr -> NatM Register +trivialCodeNoImm rep instr x y = do + (src1, code1) <- getSomeReg x + (src2, code2) <- getSomeReg y + let code dst = code1 `appOL` code2 `snocOL` instr dst src1 src2 + return (Any rep code) + +trivialUCode rep instr x = do + (src, code) <- getSomeReg x + let code' dst = code `snocOL` instr dst src + return (Any rep code') + +-- There is no "remainder" instruction on the PPC, so we have to do +-- it the hard way. +-- The "div" parameter is the division instruction to use (DIVW or DIVWU) + +remainderCode :: MachRep -> (Reg -> Reg -> Reg -> Instr) + -> CmmExpr -> CmmExpr -> NatM Register +remainderCode rep div x y = do + (src1, code1) <- getSomeReg x + (src2, code2) <- getSomeReg y + let code dst = code1 `appOL` code2 `appOL` toOL [ + div dst src1 src2, + MULLW dst dst (RIReg src2), + SUBF dst dst src1 + ] + return (Any rep code) + +#endif /* powerpc_TARGET_ARCH */ + + +-- ----------------------------------------------------------------------------- +-- Coercing to/from integer/floating-point... + +-- @coerce(Int2FP|FP2Int)@ are more complicated integer/float +-- conversions. We have to store temporaries in memory to move +-- between the integer and the floating point register sets. + +-- @coerceDbl2Flt@ and @coerceFlt2Dbl@ are done this way because we +-- pretend, on sparc at least, that double and float regs are seperate +-- kinds, so the value has to be computed into one kind before being +-- explicitly "converted" to live in the other kind. + +coerceInt2FP :: MachRep -> MachRep -> CmmExpr -> NatM Register +coerceFP2Int :: MachRep -> MachRep -> CmmExpr -> NatM Register + +#if sparc_TARGET_ARCH +coerceDbl2Flt :: CmmExpr -> NatM Register +coerceFlt2Dbl :: CmmExpr -> NatM Register +#endif + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if alpha_TARGET_ARCH + +coerceInt2FP _ x + = getRegister x `thenNat` \ register -> + getNewRegNat IntRep `thenNat` \ reg -> + let + code = registerCode register reg + src = registerName register reg + + code__2 dst = code . mkSeqInstrs [ + ST Q src (spRel 0), + LD TF dst (spRel 0), + CVTxy Q TF dst dst] + in + return (Any F64 code__2) + +------------- +coerceFP2Int x + = getRegister x `thenNat` \ register -> + getNewRegNat F64 `thenNat` \ tmp -> + let + code = registerCode register tmp + src = registerName register tmp + + code__2 dst = code . 
mkSeqInstrs [ + CVTxy TF Q src tmp, + ST TF tmp (spRel 0), + LD Q dst (spRel 0)] + in + return (Any IntRep code__2) + +#endif /* alpha_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if i386_TARGET_ARCH + +coerceInt2FP from to x = do + (x_reg, x_code) <- getSomeReg x + let + opc = case to of F32 -> GITOF; F64 -> GITOD + code dst = x_code `snocOL` opc x_reg dst + -- ToDo: works for non-I32 reps? + -- in + return (Any to code) + +------------ + +coerceFP2Int from to x = do + (x_reg, x_code) <- getSomeReg x + let + opc = case from of F32 -> GFTOI; F64 -> GDTOI + code dst = x_code `snocOL` opc x_reg dst + -- ToDo: works for non-I32 reps? + -- in + return (Any to code) + +#endif /* i386_TARGET_ARCH */ + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if x86_64_TARGET_ARCH + +coerceFP2Int from to x = do + (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand + let + opc = case from of F32 -> CVTSS2SI; F64 -> CVTSD2SI + code dst = x_code `snocOL` opc x_op dst + -- in + return (Any to code) -- works even if the destination rep is <I32 + +coerceInt2FP from to x = do + (x_op, x_code) <- getOperand x -- ToDo: could be a safe operand + let + opc = case to of F32 -> CVTSI2SS; F64 -> CVTSI2SD + code dst = x_code `snocOL` opc x_op dst + -- in + return (Any to code) -- works even if the destination rep is <I32 + +coerceFP2FP :: MachRep -> CmmExpr -> NatM Register +coerceFP2FP to x = do + (x_reg, x_code) <- getSomeReg x + let + opc = case to of F32 -> CVTSD2SS; F64 -> CVTSS2SD + code dst = x_code `snocOL` opc x_reg dst + -- in + return (Any to code) + +#endif + +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - + +#if sparc_TARGET_ARCH + +coerceInt2FP pk1 pk2 x = do + (src, code) <- getSomeReg x + let + code__2 dst = code `appOL` toOL [ + ST pk1 src (spRel (-2)), + LD pk1 (spRel (-2)) dst, + FxTOy pk1 pk2 dst dst] + return (Any pk2 code__2) + +------------ +coerceFP2Int pk fprep x = do + (src, code) <- getSomeReg x + reg <- getNewRegNat fprep + tmp <- getNewRegNat pk + let + code__2 dst = ASSERT(fprep == F64 || fprep == F32) + code `appOL` toOL [ + FxTOy fprep pk src tmp, + ST pk tmp (spRel (-2)), + LD pk (spRel (-2)) dst] + return (Any pk code__2) + +------------ +coerceDbl2Flt x = do + (src, code) <- getSomeReg x + return (Any F32 (\dst -> code `snocOL` FxTOy F64 F32 src dst)) + +------------ +coerceFlt2Dbl x = do + (src, code) <- getSomeReg x + return (Any F64 (\dst -> code `snocOL` FxTOy F32 F64 src dst)) + +#endif /* sparc_TARGET_ARCH */ + +#if powerpc_TARGET_ARCH +coerceInt2FP fromRep toRep x = do + (src, code) <- getSomeReg x + lbl <- getNewLabelNat + itmp <- getNewRegNat I32 + ftmp <- getNewRegNat F64 + dynRef <- cmmMakeDynamicReference addImportNat False lbl + Amode addr addr_code <- getAmode dynRef + let + code' dst = code `appOL` maybe_exts `appOL` toOL [ + LDATA ReadOnlyData + [CmmDataLabel lbl, + CmmStaticLit (CmmInt 0x43300000 I32), + CmmStaticLit (CmmInt 0x80000000 I32)], + XORIS itmp src (ImmInt 0x8000), + ST I32 itmp (spRel 3), + LIS itmp (ImmInt 0x4330), + ST I32 itmp (spRel 2), + LD F64 ftmp (spRel 2) + ] `appOL` addr_code `appOL` toOL [ + LD F64 dst addr, + FSUB F64 dst ftmp dst + ] `appOL` maybe_frsp dst + + maybe_exts = case fromRep of + I8 -> unitOL $ EXTS I8 src src + I16 -> unitOL $ EXTS I16 src src + I32 -> nilOL + maybe_frsp dst = case toRep of + F32 -> unitOL $ FRSP dst dst + F64 -> nilOL + return (Any toRep code') + +coerceFP2Int fromRep toRep x = do + -- the reps don't really matter: F*->F64 
and I32->I* are no-ops + (src, code) <- getSomeReg x + tmp <- getNewRegNat F64 + let + code' dst = code `appOL` toOL [ + -- convert to int in FP reg + FCTIWZ tmp src, + -- store value (64bit) from FP to stack + ST F64 tmp (spRel 2), + -- read low word of value (high word is undefined) + LD I32 dst (spRel 3)] + return (Any toRep code') +#endif /* powerpc_TARGET_ARCH */ + + +-- ----------------------------------------------------------------------------- +-- eXTRA_STK_ARGS_HERE + +-- We (allegedly) put the first six C-call arguments in registers; +-- where do we start putting the rest of them? + +-- Moved from MachInstrs (SDM): + +#if alpha_TARGET_ARCH || sparc_TARGET_ARCH +eXTRA_STK_ARGS_HERE :: Int +eXTRA_STK_ARGS_HERE + = IF_ARCH_alpha(0, IF_ARCH_sparc(23, ???)) +#endif + diff --git a/compiler/nativeGen/MachInstrs.hs b/compiler/nativeGen/MachInstrs.hs new file mode 100644 index 0000000000..0f718d3cea --- /dev/null +++ b/compiler/nativeGen/MachInstrs.hs @@ -0,0 +1,722 @@ +----------------------------------------------------------------------------- +-- +-- Machine-dependent assembly language +-- +-- (c) The University of Glasgow 1993-2004 +-- +----------------------------------------------------------------------------- + +#include "nativeGen/NCG.h" + +module MachInstrs ( + -- * Cmm instantiations + NatCmm, NatCmmTop, NatBasicBlock, + + -- * Machine instructions + Instr(..), + Cond(..), condUnsigned, condToSigned, condToUnsigned, + +#if !powerpc_TARGET_ARCH && !i386_TARGET_ARCH && !x86_64_TARGET_ARCH + Size(..), machRepSize, +#endif + RI(..), + +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + Operand(..), +#endif +#if i386_TARGET_ARCH + i386_insert_ffrees, +#endif +#if sparc_TARGET_ARCH + riZero, fpRelEA, moveSp, fPair, +#endif + ) where + +#include "HsVersions.h" + +import MachRegs +import Cmm +import MachOp ( MachRep(..) ) +import CLabel ( CLabel, pprCLabel ) +import Panic ( panic ) +import Outputable +import FastString +import Constants ( wORD_SIZE ) + +import GLAEXTS + + +-- ----------------------------------------------------------------------------- +-- Our flavours of the Cmm types + +-- Type synonyms for Cmm populated with native code +type NatCmm = GenCmm CmmStatic Instr +type NatCmmTop = GenCmmTop CmmStatic Instr +type NatBasicBlock = GenBasicBlock Instr + +-- ----------------------------------------------------------------------------- +-- Conditions on this architecture + +data Cond +#if alpha_TARGET_ARCH + = ALWAYS -- For BI (same as BR) + | EQQ -- For CMP and BI (NB: "EQ" is a 1.3 Prelude name) + | GE -- For BI only + | GTT -- For BI only (NB: "GT" is a 1.3 Prelude name) + | LE -- For CMP and BI + | LTT -- For CMP and BI (NB: "LT" is a 1.3 Prelude name) + | NE -- For BI only + | NEVER -- For BI (null instruction) + | ULE -- For CMP only + | ULT -- For CMP only +#endif +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + = ALWAYS -- What's really used? ToDo + | EQQ + | GE + | GEU + | GTT + | GU + | LE + | LEU + | LTT + | LU + | NE + | NEG + | POS + | CARRY + | OFLO + | PARITY + | NOTPARITY +#endif +#if sparc_TARGET_ARCH + = ALWAYS -- What's really used? 
ToDo + | EQQ + | GE + | GEU + | GTT + | GU + | LE + | LEU + | LTT + | LU + | NE + | NEG + | NEVER + | POS + | VC + | VS +#endif +#if powerpc_TARGET_ARCH + = ALWAYS + | EQQ + | GE + | GEU + | GTT + | GU + | LE + | LEU + | LTT + | LU + | NE +#endif + deriving Eq -- to make an assertion work + +condUnsigned GU = True +condUnsigned LU = True +condUnsigned GEU = True +condUnsigned LEU = True +condUnsigned _ = False + +condToSigned GU = GTT +condToSigned LU = LTT +condToSigned GEU = GE +condToSigned LEU = LE +condToSigned x = x + +condToUnsigned GTT = GU +condToUnsigned LTT = LU +condToUnsigned GE = GEU +condToUnsigned LE = LEU +condToUnsigned x = x + +-- ----------------------------------------------------------------------------- +-- Sizes on this architecture + +-- ToDo: it's not clear to me that we need separate signed-vs-unsigned sizes +-- here. I've removed them from the x86 version, we'll see what happens --SDM + +#if !powerpc_TARGET_ARCH && !i386_TARGET_ARCH && !x86_64_TARGET_ARCH +data Size +#if alpha_TARGET_ARCH + = B -- byte + | Bu +-- | W -- word (2 bytes): UNUSED +-- | Wu -- : UNUSED + | L -- longword (4 bytes) + | Q -- quadword (8 bytes) +-- | FF -- VAX F-style floating pt: UNUSED +-- | GF -- VAX G-style floating pt: UNUSED +-- | DF -- VAX D-style floating pt: UNUSED +-- | SF -- IEEE single-precision floating pt: UNUSED + | TF -- IEEE double-precision floating pt +#endif +#if sparc_TARGET_ARCH || powerpc_TARGET_ARCH + = B -- byte (signed) + | Bu -- byte (unsigned) + | H -- halfword (signed, 2 bytes) + | Hu -- halfword (unsigned, 2 bytes) + | W -- word (4 bytes) + | F -- IEEE single-precision floating pt + | DF -- IEEE single-precision floating pt +#endif + deriving Eq + +machRepSize :: MachRep -> Size +machRepSize I8 = IF_ARCH_alpha(Bu, IF_ARCH_sparc(Bu, )) +machRepSize I16 = IF_ARCH_alpha(err,IF_ARCH_sparc(Hu, )) +machRepSize I32 = IF_ARCH_alpha(L, IF_ARCH_sparc(W, )) +machRepSize I64 = panic "machRepSize: I64" +machRepSize I128 = panic "machRepSize: I128" +machRepSize F32 = IF_ARCH_alpha(TF, IF_ARCH_sparc(F, )) +machRepSize F64 = IF_ARCH_alpha(TF, IF_ARCH_sparc(DF,)) +#endif + +-- ----------------------------------------------------------------------------- +-- Register or immediate (a handy type on some platforms) + +data RI = RIReg Reg + | RIImm Imm + + +-- ----------------------------------------------------------------------------- +-- Machine's assembly language + +-- We have a few common "instructions" (nearly all the pseudo-ops) but +-- mostly all of 'Instr' is machine-specific. + +data Instr + = COMMENT FastString -- comment pseudo-op + + | LDATA Section [CmmStatic] -- some static data spat out during code + -- generation. Will be extracted before + -- pretty-printing. + + | NEWBLOCK BlockId -- start a new basic block. Useful during + -- codegen, removed later. Preceding + -- instruction should be a jump, as per the + -- invariants for a BasicBlock (see Cmm). + + | DELTA Int -- specify current stack offset for + -- benefit of subsequent passes + +-- ----------------------------------------------------------------------------- +-- Alpha instructions + +#if alpha_TARGET_ARCH + +-- data Instr continues... + +-- Loads and stores. + | LD Size Reg AddrMode -- size, dst, src + | LDA Reg AddrMode -- dst, src + | LDAH Reg AddrMode -- dst, src + | LDGP Reg AddrMode -- dst, src + | LDI Size Reg Imm -- size, dst, src + | ST Size Reg AddrMode -- size, src, dst + +-- Int Arithmetic. 
+ | CLR Reg -- dst + | ABS Size RI Reg -- size, src, dst + | NEG Size Bool RI Reg -- size, overflow, src, dst + | ADD Size Bool Reg RI Reg -- size, overflow, src, src, dst + | SADD Size Size Reg RI Reg -- size, scale, src, src, dst + | SUB Size Bool Reg RI Reg -- size, overflow, src, src, dst + | SSUB Size Size Reg RI Reg -- size, scale, src, src, dst + | MUL Size Bool Reg RI Reg -- size, overflow, src, src, dst + | DIV Size Bool Reg RI Reg -- size, unsigned, src, src, dst + | REM Size Bool Reg RI Reg -- size, unsigned, src, src, dst + +-- Simple bit-twiddling. + | NOT RI Reg + | AND Reg RI Reg + | ANDNOT Reg RI Reg + | OR Reg RI Reg + | ORNOT Reg RI Reg + | XOR Reg RI Reg + | XORNOT Reg RI Reg + | SLL Reg RI Reg + | SRL Reg RI Reg + | SRA Reg RI Reg + + | ZAP Reg RI Reg + | ZAPNOT Reg RI Reg + + | NOP + +-- Comparison + | CMP Cond Reg RI Reg + +-- Float Arithmetic. + | FCLR Reg + | FABS Reg Reg + | FNEG Size Reg Reg + | FADD Size Reg Reg Reg + | FDIV Size Reg Reg Reg + | FMUL Size Reg Reg Reg + | FSUB Size Reg Reg Reg + | CVTxy Size Size Reg Reg + | FCMP Size Cond Reg Reg Reg + | FMOV Reg Reg + +-- Jumping around. + | BI Cond Reg Imm + | BF Cond Reg Imm + | BR Imm + | JMP Reg AddrMode Int + | BSR Imm Int + | JSR Reg AddrMode Int + +-- Alpha-specific pseudo-ops. + | FUNBEGIN CLabel + | FUNEND CLabel + +data RI + = RIReg Reg + | RIImm Imm + +#endif /* alpha_TARGET_ARCH */ + + +-- ----------------------------------------------------------------------------- +-- Intel x86 instructions + +{- +Intel, in their infinite wisdom, selected a stack model for floating +point registers on x86. That might have made sense back in 1979 -- +nowadays we can see it for the nonsense it really is. A stack model +fits poorly with the existing nativeGen infrastructure, which assumes +flat integer and FP register sets. Prior to this commit, nativeGen +could not generate correct x86 FP code -- to do so would have meant +somehow working the register-stack paradigm into the register +allocator and spiller, which sounds very difficult. + +We have decided to cheat, and go for a simple fix which requires no +infrastructure modifications, at the expense of generating ropey but +correct FP code. All notions of the x86 FP stack and its insns have +been removed. Instead, we pretend (to the instruction selector and +register allocator) that x86 has six floating point registers, %fake0 +.. %fake5, which can be used in the usual flat manner. We further +claim that x86 has floating point instructions very similar to SPARC +and Alpha, that is, a simple 3-operand register-register arrangement. +Code generation and register allocation proceed on this basis. + +When we come to print out the final assembly, our convenient fiction +is converted to dismal reality. Each fake instruction is +independently converted to a series of real x86 instructions. +%fake0 .. %fake5 are mapped to %st(0) .. %st(5). To do reg-reg +arithmetic operations, the two operands are pushed onto the top of the +FP stack, the operation done, and the result copied back into the +relevant register. There are only six %fake registers because 2 are +needed for the translation, and x86 has 8 in total. + +The translation is inefficient but is simple and it works. A cleverer +translation would handle a sequence of insns, simulating the FP stack +contents, would not impose a fixed mapping from %fake to %st regs, and +hopefully could avoid most of the redundant reg-reg moves of the +current translation. 
+ +We might as well make use of whatever unique FP facilities Intel have +chosen to bless us with (let's not be churlish, after all). +Hence GLDZ and GLD1. Bwahahahahahahaha! +-} + +{- +MORE FLOATING POINT MUSINGS... + +Intel's internal floating point registers are by default 80 bit +extended precision. This means that all operations done on values in +registers are done at 80 bits, and unless the intermediate values are +truncated to the appropriate size (32 or 64 bits) by storing in +memory, calculations in registers will give different results from +calculations which pass intermediate values in memory (eg. via +function calls). + +One solution is to set the FPU into 64 bit precision mode. Some OSs +do this (eg. FreeBSD) and some don't (eg. Linux). The problem here is +that this will only affect 64-bit precision arithmetic; 32-bit +calculations will still be done at 64-bit precision in registers. So +it doesn't solve the whole problem. + +There's also the issue of what the C library is expecting in terms of +precision. It seems to be the case that glibc on Linux expects the +FPU to be set to 80 bit precision, so setting it to 64 bit could have +unexpected effects. Changing the default could have undesirable +effects on other 3rd-party library code too, so the right thing would +be to save/restore the FPU control word across Haskell code if we were +to do this. + +gcc's -ffloat-store gives consistent results by always storing the +results of floating-point calculations in memory, which works for both +32 and 64-bit precision. However, it only affects the values of +user-declared floating point variables in C, not intermediate results. +GHC in -fvia-C mode uses -ffloat-store (see the -fexcess-precision +flag). + +Another problem is how to spill floating point registers in the +register allocator. Should we spill the whole 80 bits, or just 64? +On an OS which is set to 64 bit precision, spilling 64 is fine. On +Linux, spilling 64 bits will round the results of some operations. +This is what gcc does. Spilling at 80 bits requires taking up a full +128 bit slot (so we get alignment). We spill at 80-bits and ignore +the alignment problems. + +In the future, we'll use the SSE registers for floating point. This +requires a CPU that supports SSE2 (ordinary SSE only supports 32 bit +precision float ops), which means P4 or Xeon and above. Using SSE +will solve all these problems, because the SSE registers use fixed 32 +bit or 64 bit precision. + +--SDM 1/2003 +-} + +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + +-- data Instr continues... + +-- Moves. + | MOV MachRep Operand Operand + | MOVZxL MachRep Operand Operand -- size is the size of operand 1 + | MOVSxL MachRep Operand Operand -- size is the size of operand 1 + -- x86_64 note: plain mov into a 32-bit register always zero-extends + -- into the 64-bit reg, in contrast to the 8 and 16-bit movs which + -- don't affect the high bits of the register. + +-- Load effective address (also a very useful three-operand add instruction :-) + | LEA MachRep Operand Operand + +-- Int Arithmetic. + | ADD MachRep Operand Operand + | ADC MachRep Operand Operand + | SUB MachRep Operand Operand + + | MUL MachRep Operand Operand + | IMUL MachRep Operand Operand -- signed int mul + | IMUL2 MachRep Operand -- %edx:%eax = operand * %eax + + | DIV MachRep Operand -- eax := eax:edx/op, edx := eax:edx%op + | IDIV MachRep Operand -- ditto, but signed + +-- Simple bit-twiddling. 
+ | AND MachRep Operand Operand + | OR MachRep Operand Operand + | XOR MachRep Operand Operand + | NOT MachRep Operand + | NEGI MachRep Operand -- NEG instruction (name clash with Cond) + +-- Shifts (amount may be immediate or %cl only) + | SHL MachRep Operand{-amount-} Operand + | SAR MachRep Operand{-amount-} Operand + | SHR MachRep Operand{-amount-} Operand + + | BT MachRep Imm Operand + | NOP + +#if i386_TARGET_ARCH +-- Float Arithmetic. + +-- Note that we cheat by treating G{ABS,MOV,NEG} of doubles +-- as single instructions right up until we spit them out. + -- all the 3-operand fake fp insns are src1 src2 dst + -- and furthermore are constrained to be fp regs only. + -- IMPORTANT: keep is_G_insn up to date with any changes here + | GMOV Reg Reg -- src(fpreg), dst(fpreg) + | GLD MachRep AddrMode Reg -- src, dst(fpreg) + | GST MachRep Reg AddrMode -- src(fpreg), dst + + | GLDZ Reg -- dst(fpreg) + | GLD1 Reg -- dst(fpreg) + + | GFTOI Reg Reg -- src(fpreg), dst(intreg) + | GDTOI Reg Reg -- src(fpreg), dst(intreg) + + | GITOF Reg Reg -- src(intreg), dst(fpreg) + | GITOD Reg Reg -- src(intreg), dst(fpreg) + + | GADD MachRep Reg Reg Reg -- src1, src2, dst + | GDIV MachRep Reg Reg Reg -- src1, src2, dst + | GSUB MachRep Reg Reg Reg -- src1, src2, dst + | GMUL MachRep Reg Reg Reg -- src1, src2, dst + + -- FP compare. Cond must be `elem` [EQQ, NE, LE, LTT, GE, GTT] + -- Compare src1 with src2; set the Zero flag iff the numbers are + -- comparable and the comparison is True. Subsequent code must + -- test the %eflags zero flag regardless of the supplied Cond. + | GCMP Cond Reg Reg -- src1, src2 + + | GABS MachRep Reg Reg -- src, dst + | GNEG MachRep Reg Reg -- src, dst + | GSQRT MachRep Reg Reg -- src, dst + | GSIN MachRep Reg Reg -- src, dst + | GCOS MachRep Reg Reg -- src, dst + | GTAN MachRep Reg Reg -- src, dst + + | GFREE -- do ffree on all x86 regs; an ugly hack +#endif + +#if x86_64_TARGET_ARCH +-- SSE2 floating point: we use a restricted set of the available SSE2 +-- instructions for floating-point. + + -- use MOV for moving (either movss or movsd (movlpd better?)) + + | CVTSS2SD Reg Reg -- F32 to F64 + | CVTSD2SS Reg Reg -- F64 to F32 + | CVTSS2SI Operand Reg -- F32 to I32/I64 (with rounding) + | CVTSD2SI Operand Reg -- F64 to I32/I64 (with rounding) + | CVTSI2SS Operand Reg -- I32/I64 to F32 + | CVTSI2SD Operand Reg -- I32/I64 to F64 + + -- use ADD & SUB for arithmetic. In both cases, operands + -- are Operand Reg. + + -- SSE2 floating-point division: + | FDIV MachRep Operand Operand -- divisor, dividend(dst) + + -- use CMP for comparisons. ucomiss and ucomisd instructions + -- compare single/double prec floating point respectively. + + | SQRT MachRep Operand Reg -- src, dst +#endif + +-- Comparison + | TEST MachRep Operand Operand + | CMP MachRep Operand Operand + | SETCC Cond Operand + +-- Stack Operations. + | PUSH MachRep Operand + | POP MachRep Operand + -- both unused (SDM): + -- | PUSHA + -- | POPA + +-- Jumping around. + | JMP Operand + | JXX Cond BlockId -- includes unconditional branches + | JMP_TBL Operand [BlockId] -- table jump + | CALL (Either Imm Reg) [Reg] + +-- Other things. 
+ | CLTD MachRep -- sign extend %eax into %edx:%eax + + | FETCHGOT Reg -- pseudo-insn for ELF position-independent code + -- pretty-prints as + -- call 1f + -- 1: popl %reg + -- addl __GLOBAL_OFFSET_TABLE__+.-1b, %reg + | FETCHPC Reg -- pseudo-insn for Darwin position-independent code + -- pretty-prints as + -- call 1f + -- 1: popl %reg + + +data Operand + = OpReg Reg -- register + | OpImm Imm -- immediate value + | OpAddr AddrMode -- memory reference + +#endif /* i386 or x86_64 */ + +#if i386_TARGET_ARCH +i386_insert_ffrees :: [Instr] -> [Instr] +i386_insert_ffrees insns + | any is_G_instr insns + = concatMap ffree_before_nonlocal_transfers insns + | otherwise + = insns + +ffree_before_nonlocal_transfers insn + = case insn of + CALL _ _ -> [GFREE, insn] + JMP _ -> [GFREE, insn] + other -> [insn] + + +-- if you ever add a new FP insn to the fake x86 FP insn set, +-- you must update this too +is_G_instr :: Instr -> Bool +is_G_instr instr + = case instr of + GMOV _ _ -> True; GLD _ _ _ -> True; GST _ _ _ -> True + GLDZ _ -> True; GLD1 _ -> True + GFTOI _ _ -> True; GDTOI _ _ -> True + GITOF _ _ -> True; GITOD _ _ -> True + GADD _ _ _ _ -> True; GDIV _ _ _ _ -> True + GSUB _ _ _ _ -> True; GMUL _ _ _ _ -> True + GCMP _ _ _ -> True; GABS _ _ _ -> True + GNEG _ _ _ -> True; GSQRT _ _ _ -> True + GSIN _ _ _ -> True; GCOS _ _ _ -> True; GTAN _ _ _ -> True + GFREE -> panic "is_G_instr: GFREE (!)" + other -> False +#endif /* i386_TARGET_ARCH */ + + +-- ----------------------------------------------------------------------------- +-- Sparc instructions + +#if sparc_TARGET_ARCH + +-- data Instr continues... + +-- Loads and stores. + | LD MachRep AddrMode Reg -- size, src, dst + | ST MachRep Reg AddrMode -- size, src, dst + +-- Int Arithmetic. + | ADD Bool Bool Reg RI Reg -- x?, cc?, src1, src2, dst + | SUB Bool Bool Reg RI Reg -- x?, cc?, src1, src2, dst + | UMUL Bool Reg RI Reg -- cc?, src1, src2, dst + | SMUL Bool Reg RI Reg -- cc?, src1, src2, dst + | RDY Reg -- move contents of Y register to reg + +-- Simple bit-twiddling. + | AND Bool Reg RI Reg -- cc?, src1, src2, dst + | ANDN Bool Reg RI Reg -- cc?, src1, src2, dst + | OR Bool Reg RI Reg -- cc?, src1, src2, dst + | ORN Bool Reg RI Reg -- cc?, src1, src2, dst + | XOR Bool Reg RI Reg -- cc?, src1, src2, dst + | XNOR Bool Reg RI Reg -- cc?, src1, src2, dst + | SLL Reg RI Reg -- src1, src2, dst + | SRL Reg RI Reg -- src1, src2, dst + | SRA Reg RI Reg -- src1, src2, dst + | SETHI Imm Reg -- src, dst + | NOP -- Really SETHI 0, %g0, but worth an alias + +-- Float Arithmetic. + +-- Note that we cheat by treating F{ABS,MOV,NEG} of doubles as single +-- instructions right up until we spit them out. + | FABS MachRep Reg Reg -- src dst + | FADD MachRep Reg Reg Reg -- src1, src2, dst + | FCMP Bool MachRep Reg Reg -- exception?, src1, src2, dst + | FDIV MachRep Reg Reg Reg -- src1, src2, dst + | FMOV MachRep Reg Reg -- src, dst + | FMUL MachRep Reg Reg Reg -- src1, src2, dst + | FNEG MachRep Reg Reg -- src, dst + | FSQRT MachRep Reg Reg -- src, dst + | FSUB MachRep Reg Reg Reg -- src1, src2, dst + | FxTOy MachRep MachRep Reg Reg -- src, dst + +-- Jumping around. 
+ | BI Cond Bool Imm -- cond, annul?, target + | BF Cond Bool Imm -- cond, annul?, target + + | JMP AddrMode -- target + | CALL (Either Imm Reg) Int Bool -- target, args, terminal + +riZero :: RI -> Bool + +riZero (RIImm (ImmInt 0)) = True +riZero (RIImm (ImmInteger 0)) = True +riZero (RIReg (RealReg 0)) = True +riZero _ = False + +-- Calculate the effective address which would be used by the +-- corresponding fpRel sequence. fpRel is in MachRegs.lhs, +-- alas -- can't have fpRelEA here because of module dependencies. +fpRelEA :: Int -> Reg -> Instr +fpRelEA n dst + = ADD False False fp (RIImm (ImmInt (n * wORD_SIZE))) dst + +-- Code to shift the stack pointer by n words. +moveSp :: Int -> Instr +moveSp n + = ADD False False sp (RIImm (ImmInt (n * wORD_SIZE))) sp + +-- Produce the second-half-of-a-double register given the first half. +fPair :: Reg -> Reg +fPair (RealReg n) | n >= 32 && n `mod` 2 == 0 = RealReg (n+1) +fPair other = pprPanic "fPair(sparc NCG)" (ppr other) +#endif /* sparc_TARGET_ARCH */ + + +-- ----------------------------------------------------------------------------- +-- PowerPC instructions + +#ifdef powerpc_TARGET_ARCH +-- data Instr continues... + +-- Loads and stores. + | LD MachRep Reg AddrMode -- Load size, dst, src + | LA MachRep Reg AddrMode -- Load arithmetic size, dst, src + | ST MachRep Reg AddrMode -- Store size, src, dst + | STU MachRep Reg AddrMode -- Store with Update size, src, dst + | LIS Reg Imm -- Load Immediate Shifted dst, src + | LI Reg Imm -- Load Immediate dst, src + | MR Reg Reg -- Move Register dst, src -- also for fmr + + | CMP MachRep Reg RI --- size, src1, src2 + | CMPL MachRep Reg RI --- size, src1, src2 + + | BCC Cond BlockId + | JMP CLabel -- same as branch, + -- but with CLabel instead of block ID + | MTCTR Reg + | BCTR [BlockId] -- with list of local destinations + | BL CLabel [Reg] -- with list of argument regs + | BCTRL [Reg] + + | ADD Reg Reg RI -- dst, src1, src2 + | ADDC Reg Reg Reg -- (carrying) dst, src1, src2 + | ADDE Reg Reg Reg -- (extend) dst, src1, src2 + | ADDIS Reg Reg Imm -- Add Immediate Shifted dst, src1, src2 + | SUBF Reg Reg Reg -- dst, src1, src2 ; dst = src2 - src1 + | MULLW Reg Reg RI + | DIVW Reg Reg Reg + | DIVWU Reg Reg Reg + + | MULLW_MayOflo Reg Reg Reg + -- dst = 1 if src1 * src2 overflows + -- pseudo-instruction; pretty-printed as: + -- mullwo. dst, src1, src2 + -- mfxer dst + -- rlwinm dst, dst, 2, 31,31 + + | AND Reg Reg RI -- dst, src1, src2 + | OR Reg Reg RI -- dst, src1, src2 + | XOR Reg Reg RI -- dst, src1, src2 + | XORIS Reg Reg Imm -- XOR Immediate Shifted dst, src1, src2 + + | EXTS MachRep Reg Reg + + | NEG Reg Reg + | NOT Reg Reg + + | SLW Reg Reg RI -- shift left word + | SRW Reg Reg RI -- shift right word + | SRAW Reg Reg RI -- shift right arithmetic word + + -- Rotate Left Word Immediate then AND with Mask + | RLWINM Reg Reg Int Int Int + + | FADD MachRep Reg Reg Reg + | FSUB MachRep Reg Reg Reg + | FMUL MachRep Reg Reg Reg + | FDIV MachRep Reg Reg Reg + | FNEG Reg Reg -- negate is the same for single and double prec. 
+ + | FCMP Reg Reg + + | FCTIWZ Reg Reg -- convert to integer word + | FRSP Reg Reg -- reduce to single precision + -- (but destination is a FP register) + + | CRNOR Int Int Int -- condition register nor + | MFCR Reg -- move from condition register + + | MFLR Reg -- move from link register + | FETCHPC Reg -- pseudo-instruction: + -- bcl to next insn, mflr reg + +#endif /* powerpc_TARGET_ARCH */ diff --git a/compiler/nativeGen/MachRegs.lhs b/compiler/nativeGen/MachRegs.lhs new file mode 100644 index 0000000000..bffb723d1b --- /dev/null +++ b/compiler/nativeGen/MachRegs.lhs @@ -0,0 +1,1437 @@ +-- ----------------------------------------------------------------------------- +-- +-- (c) The University of Glasgow 1994-2004 +-- +-- Machine-specific info about registers. +-- +-- Also includes stuff about immediate operands, which are +-- often/usually quite entangled with registers. +-- +-- (Immediates could be untangled from registers at some cost in tangled +-- modules --- the pleasure has been foregone.) +-- +-- ----------------------------------------------------------------------------- + +\begin{code} +#include "nativeGen/NCG.h" + +module MachRegs ( + + -- * Immediate values + Imm(..), strImmLit, litToImm, + + -- * Addressing modes + AddrMode(..), + addrOffset, + + -- * The 'Reg' type + RegNo, + Reg(..), isRealReg, isVirtualReg, + RegClass(..), regClass, + getHiVRegFromLo, + mkVReg, + + -- * Global registers + get_GlobalReg_reg_or_addr, + callerSaves, callerSaveVolatileRegs, + + -- * Machine-dependent register-related stuff + allocatableRegs, argRegs, allArgRegs, callClobberedRegs, + freeReg, + spRel, + +#if alpha_TARGET_ARCH + fits8Bits, + fReg, + gp, pv, ra, sp, t9, t10, t11, t12, v0, f0, zeroh, +#endif +#if i386_TARGET_ARCH + EABase(..), EAIndex(..), + eax, ebx, ecx, edx, esi, edi, ebp, esp, + fake0, fake1, fake2, fake3, fake4, fake5, + addrModeRegs, +#endif +#if x86_64_TARGET_ARCH + EABase(..), EAIndex(..), ripRel, + rax, rbx, rcx, rdx, rsi, rdi, rbp, rsp, + eax, ebx, ecx, edx, esi, edi, ebp, esp, + r8, r9, r10, r11, r12, r13, r14, r15, + xmm0, xmm1, xmm2, xmm3, xmm4, xmm5, xmm6, xmm7, + xmm8, xmm9, xmm10, xmm11, xmm12, xmm13, xmm14, xmm15, + xmm, + addrModeRegs, allFPArgRegs, +#endif +#if sparc_TARGET_ARCH + fits13Bits, + fpRel, gReg, iReg, lReg, oReg, largeOffsetError, + fp, sp, g0, g1, g2, o0, o1, f0, f6, f8, f26, f27, +#endif +#if powerpc_TARGET_ARCH + allFPArgRegs, + makeImmediate, + sp, + r3, r4, r27, r28, + f1, f20, f21, +#endif + ) where + +#include "HsVersions.h" + +#if i386_TARGET_ARCH +# define STOLEN_X86_REGS 4 +-- HACK: go for the max +#endif + +#include "../includes/MachRegs.h" + +import Cmm +import MachOp ( MachRep(..) ) + +import CLabel ( CLabel, mkMainCapabilityLabel ) +import Pretty +import Outputable ( Outputable(..), pprPanic, panic ) +import qualified Outputable +import Unique +import Constants +import FastTypes + +#if powerpc_TARGET_ARCH +#if __GLASGOW_HASKELL__ >= 504 +import Data.Word ( Word8, Word16, Word32 ) +import Data.Int ( Int8, Int16, Int32 ) +#else +import Word ( Word8, Word16, Word32 ) +import Int ( Int8, Int16, Int32 ) +#endif +#endif + +-- ----------------------------------------------------------------------------- +-- Immediates + +data Imm + = ImmInt Int + | ImmInteger Integer -- Sigh. 
+ | ImmCLbl CLabel -- AbstractC Label (with baggage) + | ImmLit Doc -- Simple string + | ImmIndex CLabel Int + | ImmFloat Rational + | ImmDouble Rational + | ImmConstantSum Imm Imm + | ImmConstantDiff Imm Imm +#if sparc_TARGET_ARCH + | LO Imm {- Possible restrictions... -} + | HI Imm +#endif +#if powerpc_TARGET_ARCH + | LO Imm + | HI Imm + | HA Imm {- high halfword adjusted -} +#endif +strImmLit s = ImmLit (text s) + +litToImm :: CmmLit -> Imm +litToImm (CmmInt i _) = ImmInteger i +litToImm (CmmFloat f F32) = ImmFloat f +litToImm (CmmFloat f F64) = ImmDouble f +litToImm (CmmLabel l) = ImmCLbl l +litToImm (CmmLabelOff l off) = ImmIndex l off +litToImm (CmmLabelDiffOff l1 l2 off) + = ImmConstantSum + (ImmConstantDiff (ImmCLbl l1) (ImmCLbl l2)) + (ImmInt off) + +-- ----------------------------------------------------------------------------- +-- Addressing modes + +data AddrMode +#if alpha_TARGET_ARCH + = AddrImm Imm + | AddrReg Reg + | AddrRegImm Reg Imm +#endif + +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + = AddrBaseIndex EABase EAIndex Displacement + | ImmAddr Imm Int + +data EABase = EABaseNone | EABaseReg Reg | EABaseRip +data EAIndex = EAIndexNone | EAIndex Reg Int +type Displacement = Imm +#endif + +#if sparc_TARGET_ARCH + = AddrRegReg Reg Reg + | AddrRegImm Reg Imm +#endif + +#if powerpc_TARGET_ARCH + = AddrRegReg Reg Reg + | AddrRegImm Reg Imm +#endif + +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH +addrModeRegs :: AddrMode -> [Reg] +addrModeRegs (AddrBaseIndex b i _) = b_regs ++ i_regs + where + b_regs = case b of { EABaseReg r -> [r]; _ -> [] } + i_regs = case i of { EAIndex r _ -> [r]; _ -> [] } +addrModeRegs _ = [] +#endif + + +addrOffset :: AddrMode -> Int -> Maybe AddrMode + +addrOffset addr off + = case addr of +#if alpha_TARGET_ARCH + _ -> panic "MachMisc.addrOffset not defined for Alpha" +#endif +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + ImmAddr i off0 -> Just (ImmAddr i (off0 + off)) + + AddrBaseIndex r i (ImmInt n) -> Just (AddrBaseIndex r i (ImmInt (n + off))) + AddrBaseIndex r i (ImmInteger n) + -> Just (AddrBaseIndex r i (ImmInt (fromInteger (n + toInteger off)))) + + AddrBaseIndex r i (ImmCLbl lbl) + -> Just (AddrBaseIndex r i (ImmIndex lbl off)) + + AddrBaseIndex r i (ImmIndex lbl ix) + -> Just (AddrBaseIndex r i (ImmIndex lbl (ix+off))) + + _ -> Nothing -- in theory, shouldn't happen +#endif +#if sparc_TARGET_ARCH + AddrRegImm r (ImmInt n) + | fits13Bits n2 -> Just (AddrRegImm r (ImmInt n2)) + | otherwise -> Nothing + where n2 = n + off + + AddrRegImm r (ImmInteger n) + | fits13Bits n2 -> Just (AddrRegImm r (ImmInt (fromInteger n2))) + | otherwise -> Nothing + where n2 = n + toInteger off + + AddrRegReg r (RealReg 0) + | fits13Bits off -> Just (AddrRegImm r (ImmInt off)) + | otherwise -> Nothing + + _ -> Nothing +#endif /* sparc */ +#if powerpc_TARGET_ARCH + AddrRegImm r (ImmInt n) + | fits16Bits n2 -> Just (AddrRegImm r (ImmInt n2)) + | otherwise -> Nothing + where n2 = n + off + + AddrRegImm r (ImmInteger n) + | fits16Bits n2 -> Just (AddrRegImm r (ImmInt (fromInteger n2))) + | otherwise -> Nothing + where n2 = n + toInteger off + + _ -> Nothing +#endif /* powerpc */ + +----------------- +#if alpha_TARGET_ARCH + +fits8Bits :: Integer -> Bool +fits8Bits i = i >= -256 && i < 256 + +#endif + +#if sparc_TARGET_ARCH + +{-# SPECIALIZE fits13Bits :: Int -> Bool, Integer -> Bool #-} +fits13Bits :: Integral a => a -> Bool +fits13Bits x = x >= -4096 && x < 4096 + +----------------- +largeOffsetError i + = error ("ERROR: SPARC native-code generator cannot handle large 
offset (" + ++show i++");\nprobably because of large constant data structures;" ++ + "\nworkaround: use -fvia-C on this module.\n") + +#endif /* sparc */ + +#if powerpc_TARGET_ARCH +fits16Bits :: Integral a => a -> Bool +fits16Bits x = x >= -32768 && x < 32768 + +makeImmediate :: Integral a => MachRep -> Bool -> a -> Maybe Imm + +makeImmediate rep signed x = fmap ImmInt (toI16 rep signed) + where + narrow I32 False = fromIntegral (fromIntegral x :: Word32) + narrow I16 False = fromIntegral (fromIntegral x :: Word16) + narrow I8 False = fromIntegral (fromIntegral x :: Word8) + narrow I32 True = fromIntegral (fromIntegral x :: Int32) + narrow I16 True = fromIntegral (fromIntegral x :: Int16) + narrow I8 True = fromIntegral (fromIntegral x :: Int8) + + narrowed = narrow rep signed + + toI16 I32 True + | narrowed >= -32768 && narrowed < 32768 = Just narrowed + | otherwise = Nothing + toI16 I32 False + | narrowed >= 0 && narrowed < 65536 = Just narrowed + | otherwise = Nothing + toI16 _ _ = Just narrowed +#endif + + +-- @spRel@ gives us a stack relative addressing mode for volatile +-- temporaries and for excess call arguments. @fpRel@, where +-- applicable, is the same but for the frame pointer. + +spRel :: Int -- desired stack offset in words, positive or negative + -> AddrMode + +spRel n +#if defined(i386_TARGET_ARCH) + = AddrBaseIndex (EABaseReg esp) EAIndexNone (ImmInt (n * wORD_SIZE)) +#elif defined(x86_64_TARGET_ARCH) + = AddrBaseIndex (EABaseReg rsp) EAIndexNone (ImmInt (n * wORD_SIZE)) +#else + = AddrRegImm sp (ImmInt (n * wORD_SIZE)) +#endif + +#if sparc_TARGET_ARCH +fpRel :: Int -> AddrMode + -- Duznae work for offsets greater than 13 bits; we just hope for + -- the best +fpRel n + = AddrRegImm fp (ImmInt (n * wORD_SIZE)) +#endif + +#if x86_64_TARGET_ARCH +ripRel imm = AddrBaseIndex EABaseRip EAIndexNone imm +#endif + +-- ----------------------------------------------------------------------------- +-- Global registers + +-- We map STG registers onto appropriate CmmExprs. Either they map +-- to real machine registers or stored as offsets from BaseReg. Given +-- a GlobalReg, get_GlobalReg_reg_or_addr produces either the real +-- register it is in, on this platform, or a StixExpr denoting the +-- address in the register table holding it. get_MagicId_addr always +-- produces the register table address for it. + +get_GlobalReg_reg_or_addr :: GlobalReg -> Either Reg CmmExpr +get_GlobalReg_addr :: GlobalReg -> CmmExpr +get_Regtable_addr_from_offset :: MachRep -> Int -> CmmExpr + +get_GlobalReg_reg_or_addr mid + = case globalRegMaybe mid of + Just rr -> Left rr + Nothing -> Right (get_GlobalReg_addr mid) + +get_GlobalReg_addr BaseReg = regTableOffset 0 +get_GlobalReg_addr mid = get_Regtable_addr_from_offset + (globalRegRep mid) (baseRegOffset mid) + +-- Calculate a literal representing an offset into the register table. +-- Used when we don't have an actual BaseReg to offset from. +regTableOffset n = + CmmLit (CmmLabelOff mkMainCapabilityLabel (oFFSET_Capability_r + n)) + +get_Regtable_addr_from_offset rep offset + = case globalRegMaybe BaseReg of + Nothing -> regTableOffset offset + Just _ -> CmmRegOff (CmmGlobal BaseReg) offset + +-- ----------------------------------------------------------------------------- +-- caller-save registers + +-- Here we generate the sequence of saves/restores required around a +-- foreign call instruction. 
+ +callerSaveVolatileRegs :: Maybe [GlobalReg] -> ([CmmStmt], [CmmStmt]) +callerSaveVolatileRegs vols = (caller_save, caller_load) + where + caller_save = foldr ($!) [] (map callerSaveGlobalReg regs_to_save) + caller_load = foldr ($!) [] (map callerRestoreGlobalReg regs_to_save) + + system_regs = [Sp,SpLim,Hp,HpLim,CurrentTSO,CurrentNursery, + {-SparkHd,SparkTl,SparkBase,SparkLim,-}BaseReg ] + + regs_to_save = system_regs ++ vol_list + + vol_list = case vols of Nothing -> all_of_em; Just regs -> regs + + all_of_em = [ VanillaReg n | n <- [0..mAX_Vanilla_REG] ] + ++ [ FloatReg n | n <- [0..mAX_Float_REG] ] + ++ [ DoubleReg n | n <- [0..mAX_Double_REG] ] + ++ [ LongReg n | n <- [0..mAX_Long_REG] ] + + callerSaveGlobalReg reg next + | callerSaves reg = + CmmStore (get_GlobalReg_addr reg) + (CmmReg (CmmGlobal reg)) : next + | otherwise = next + + callerRestoreGlobalReg reg next + | callerSaves reg = + CmmAssign (CmmGlobal reg) + (CmmLoad (get_GlobalReg_addr reg) (globalRegRep reg)) + : next + | otherwise = next + + +-- --------------------------------------------------------------------------- +-- Registers + +-- RealRegs are machine regs which are available for allocation, in +-- the usual way. We know what class they are, because that's part of +-- the processor's architecture. + +-- VirtualRegs are virtual registers. The register allocator will +-- eventually have to map them into RealRegs, or into spill slots. +-- VirtualRegs are allocated on the fly, usually to represent a single +-- value in the abstract assembly code (i.e. dynamic registers are +-- usually single assignment). With the new register allocator, the +-- single assignment restriction isn't necessary to get correct code, +-- although a better register allocation will result if single +-- assignment is used -- because the allocator maps a VirtualReg into +-- a single RealReg, even if the VirtualReg has multiple live ranges. + +-- Virtual regs can be of either class, so that info is attached. + +-- Determine the upper-half vreg for a 64-bit quantity on a 32-bit platform +-- when supplied with the vreg for the lower-half of the quantity. +-- (NB. Not reversible). +getHiVRegFromLo (VirtualRegI u) + = VirtualRegHi (newTagUnique u 'H') -- makes a pseudo-unique with tag 'H' +getHiVRegFromLo other + = pprPanic "getHiVRegFromLo" (ppr other) + +data RegClass + = RcInteger + | RcFloat + | RcDouble + deriving Eq + +type RegNo = Int + +data Reg + = RealReg {-# UNPACK #-} !RegNo + | VirtualRegI {-# UNPACK #-} !Unique + | VirtualRegHi {-# UNPACK #-} !Unique -- High part of 2-word register + | VirtualRegF {-# UNPACK #-} !Unique + | VirtualRegD {-# UNPACK #-} !Unique + deriving (Eq,Ord) + +-- We like to have Uniques for Reg so that we can make UniqFM and UniqSets +-- in the register allocator. +instance Uniquable Reg where + getUnique (RealReg i) = mkUnique 'C' i + getUnique (VirtualRegI u) = u + getUnique (VirtualRegHi u) = u + getUnique (VirtualRegF u) = u + getUnique (VirtualRegD u) = u + +unRealReg (RealReg i) = i +unRealReg vreg = pprPanic "unRealReg on VirtualReg" (ppr vreg) + +mkVReg :: Unique -> MachRep -> Reg +mkVReg u rep + = case rep of +#if sparc_TARGET_ARCH + F32 -> VirtualRegF u +#else + F32 -> VirtualRegD u +#endif + F64 -> VirtualRegD u + other -> VirtualRegI u + +isVirtualReg :: Reg -> Bool +isVirtualReg (RealReg _) = False +isVirtualReg (VirtualRegI _) = True +isVirtualReg (VirtualRegHi _) = True +isVirtualReg (VirtualRegF _) = True +isVirtualReg (VirtualRegD _) = True + +isRealReg :: Reg -> Bool +isRealReg = not . 
isVirtualReg + +instance Show Reg where + show (RealReg i) = showReg i + show (VirtualRegI u) = "%vI_" ++ show u + show (VirtualRegHi u) = "%vHi_" ++ show u + show (VirtualRegF u) = "%vF_" ++ show u + show (VirtualRegD u) = "%vD_" ++ show u + +instance Outputable Reg where + ppr r = Outputable.text (show r) + + +-- ----------------------------------------------------------------------------- +-- Machine-specific register stuff + +-- The Alpha has 64 registers of interest; 32 integer registers and 32 floating +-- point registers. The mapping of STG registers to alpha machine registers +-- is defined in StgRegs.h. We are, of course, prepared for any eventuality. + +#if alpha_TARGET_ARCH +fReg :: Int -> RegNo +fReg x = (32 + x) + +v0, f0, ra, pv, gp, sp, zeroh :: Reg +v0 = realReg 0 +f0 = realReg (fReg 0) +ra = FixedReg ILIT(26) +pv = t12 +gp = FixedReg ILIT(29) +sp = FixedReg ILIT(30) +zeroh = FixedReg ILIT(31) -- "zero" is used in 1.3 (MonadZero method) + +t9, t10, t11, t12 :: Reg +t9 = realReg 23 +t10 = realReg 24 +t11 = realReg 25 +t12 = realReg 27 +#endif + +{- +Intel x86 architecture: +- All registers except 7 (esp) are available for use. +- Only ebx, esi, edi and esp are available across a C call (they are callee-saves). +- Registers 0-7 have 16-bit counterparts (ax, bx etc.) +- Registers 0-3 have 8 bit counterparts (ah, bh etc.) +- Registers 8-13 are fakes; we pretend x86 has 6 conventionally-addressable + fp registers, and 3-operand insns for them, and we translate this into + real stack-based x86 fp code after register allocation. + +The fp registers are all Double registers; we don't have any RcFloat class +regs. @regClass@ barfs if you give it a VirtualRegF, and mkVReg above should +never generate them. +-} + +#if i386_TARGET_ARCH + +fake0, fake1, fake2, fake3, fake4, fake5, + eax, ebx, ecx, edx, esp, ebp, esi, edi :: Reg +eax = RealReg 0 +ebx = RealReg 1 +ecx = RealReg 2 +edx = RealReg 3 +esi = RealReg 4 +edi = RealReg 5 +ebp = RealReg 6 +esp = RealReg 7 +fake0 = RealReg 8 +fake1 = RealReg 9 +fake2 = RealReg 10 +fake3 = RealReg 11 +fake4 = RealReg 12 +fake5 = RealReg 13 + +-- On x86, we might want to have an 8-bit RegClass, which would +-- contain just regs 1-4 (the others don't have 8-bit versions). +-- However, we can get away without this at the moment because the +-- only allocatable integer regs are also 8-bit compatible (1, 3, 4). +regClass (RealReg i) = if i < 8 then RcInteger else RcDouble +regClass (VirtualRegI u) = RcInteger +regClass (VirtualRegHi u) = RcInteger +regClass (VirtualRegD u) = RcDouble +regClass (VirtualRegF u) = pprPanic "regClass(x86):VirtualRegF" + (ppr (VirtualRegF u)) + +regNames + = ["%eax", "%ebx", "%ecx", "%edx", "%esi", "%edi", "%ebp", "%esp", + "%fake0", "%fake1", "%fake2", "%fake3", "%fake4", "%fake5", "%fake6"] + +showReg :: RegNo -> String +showReg n + = if n >= 0 && n < 14 + then regNames !! n + else "%unknown_x86_real_reg_" ++ show n + +#endif + +{- +AMD x86_64 architecture: +- Registers 0-16 have 32-bit counterparts (eax, ebx etc.) +- Registers 0-7 have 16-bit counterparts (ax, bx etc.) +- Registers 0-3 have 8 bit counterparts (ah, bh etc.) 
+ +-} + +#if x86_64_TARGET_ARCH + +rax, rbx, rcx, rdx, rsp, rbp, rsi, rdi, + r8, r9, r10, r11, r12, r13, r14, r15, + xmm0, xmm1, xmm2, xmm3, xmm4, xmm5, xmm6, xmm7, + xmm8, xmm9, xmm10, xmm11, xmm12, xmm13, xmm14, xmm15 :: Reg + +rax = RealReg 0 +rbx = RealReg 1 +rcx = RealReg 2 +rdx = RealReg 3 +rsi = RealReg 4 +rdi = RealReg 5 +rbp = RealReg 6 +rsp = RealReg 7 +r8 = RealReg 8 +r9 = RealReg 9 +r10 = RealReg 10 +r11 = RealReg 11 +r12 = RealReg 12 +r13 = RealReg 13 +r14 = RealReg 14 +r15 = RealReg 15 +xmm0 = RealReg 16 +xmm1 = RealReg 17 +xmm2 = RealReg 18 +xmm3 = RealReg 19 +xmm4 = RealReg 20 +xmm5 = RealReg 21 +xmm6 = RealReg 22 +xmm7 = RealReg 23 +xmm8 = RealReg 24 +xmm9 = RealReg 25 +xmm10 = RealReg 26 +xmm11 = RealReg 27 +xmm12 = RealReg 28 +xmm13 = RealReg 29 +xmm14 = RealReg 30 +xmm15 = RealReg 31 + + -- so we can re-use some x86 code: +eax = rax +ebx = rbx +ecx = rcx +edx = rdx +esi = rsi +edi = rdi +ebp = rbp +esp = rsp + +xmm n = RealReg (16+n) + +-- On x86, we might want to have an 8-bit RegClass, which would +-- contain just regs 1-4 (the others don't have 8-bit versions). +-- However, we can get away without this at the moment because the +-- only allocatable integer regs are also 8-bit compatible (1, 3, 4). +regClass (RealReg i) = if i < 16 then RcInteger else RcDouble +regClass (VirtualRegI u) = RcInteger +regClass (VirtualRegHi u) = RcInteger +regClass (VirtualRegD u) = RcDouble +regClass (VirtualRegF u) = pprPanic "regClass(x86_64):VirtualRegF" + (ppr (VirtualRegF u)) + +regNames + = ["%rax", "%rbx", "%rcx", "%rdx", "%rsi", "%rdi", "%rbp", "%rsp" ] + +showReg :: RegNo -> String +showReg n + | n >= 16 = "%xmm" ++ show (n-16) + | n >= 8 = "%r" ++ show n + | otherwise = regNames !! n + +#endif + +{- +The SPARC has 64 registers of interest; 32 integer registers and 32 +floating point registers. The mapping of STG registers to SPARC +machine registers is defined in StgRegs.h. We are, of course, +prepared for any eventuality. + +The whole fp-register pairing thing on sparcs is a huge nuisance. See +fptools/ghc/includes/MachRegs.h for a description of what's going on +here. +-} + +#if sparc_TARGET_ARCH + +gReg,lReg,iReg,oReg,fReg :: Int -> RegNo +gReg x = x +oReg x = (8 + x) +lReg x = (16 + x) +iReg x = (24 + x) +fReg x = (32 + x) + +nCG_FirstFloatReg :: RegNo +nCG_FirstFloatReg = unRealReg NCG_FirstFloatReg + +regClass (VirtualRegI u) = RcInteger +regClass (VirtualRegF u) = RcFloat +regClass (VirtualRegD u) = RcDouble +regClass (RealReg i) | i < 32 = RcInteger + | i < nCG_FirstFloatReg = RcDouble + | otherwise = RcFloat + +showReg :: RegNo -> String +showReg n + | n >= 0 && n < 8 = "%g" ++ show n + | n >= 8 && n < 16 = "%o" ++ show (n-8) + | n >= 16 && n < 24 = "%l" ++ show (n-16) + | n >= 24 && n < 32 = "%i" ++ show (n-24) + | n >= 32 && n < 64 = "%f" ++ show (n-32) + | otherwise = "%unknown_sparc_real_reg_" ++ show n + +g0, g1, g2, fp, sp, o0, o1, f0, f1, f6, f8, f22, f26, f27 :: Reg + +f6 = RealReg (fReg 6) +f8 = RealReg (fReg 8) +f22 = RealReg (fReg 22) +f26 = RealReg (fReg 26) +f27 = RealReg (fReg 27) + + +-- g0 is useful for codegen; is always zero, and writes to it vanish. +g0 = RealReg (gReg 0) +g1 = RealReg (gReg 1) +g2 = RealReg (gReg 2) + +-- FP, SP, int and float return (from C) regs. +fp = RealReg (iReg 6) +sp = RealReg (oReg 6) +o0 = RealReg (oReg 0) +o1 = RealReg (oReg 1) +f0 = RealReg (fReg 0) +f1 = RealReg (fReg 1) + +#endif + +{- +The PowerPC has 64 registers of interest; 32 integer registers and 32 floating +point registers. 
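Integer registers occupy RegNos 0-31 and the floating point registers 32-63;
fReg below maps a floating point register index to its RegNo by adding 32.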
+-} + +#if powerpc_TARGET_ARCH +fReg :: Int -> RegNo +fReg x = (32 + x) + +regClass (VirtualRegI u) = RcInteger +regClass (VirtualRegHi u) = RcInteger +regClass (VirtualRegF u) = pprPanic "regClass(ppc):VirtualRegF" + (ppr (VirtualRegF u)) +regClass (VirtualRegD u) = RcDouble +regClass (RealReg i) | i < 32 = RcInteger + | otherwise = RcDouble + +showReg :: RegNo -> String +showReg n + | n >= 0 && n <= 31 = "%r" ++ show n + | n >= 32 && n <= 63 = "%f" ++ show (n - 32) + | otherwise = "%unknown_powerpc_real_reg_" ++ show n + +sp = RealReg 1 +r3 = RealReg 3 +r4 = RealReg 4 +r27 = RealReg 27 +r28 = RealReg 28 +f1 = RealReg $ fReg 1 +f20 = RealReg $ fReg 20 +f21 = RealReg $ fReg 21 +#endif + +{- +Redefine the literals used for machine-registers with non-numeric +names in the header files. Gag me with a spoon, eh? +-} + +#if alpha_TARGET_ARCH +#define f0 32 +#define f1 33 +#define f2 34 +#define f3 35 +#define f4 36 +#define f5 37 +#define f6 38 +#define f7 39 +#define f8 40 +#define f9 41 +#define f10 42 +#define f11 43 +#define f12 44 +#define f13 45 +#define f14 46 +#define f15 47 +#define f16 48 +#define f17 49 +#define f18 50 +#define f19 51 +#define f20 52 +#define f21 53 +#define f22 54 +#define f23 55 +#define f24 56 +#define f25 57 +#define f26 58 +#define f27 59 +#define f28 60 +#define f29 61 +#define f30 62 +#define f31 63 +#endif +#if i386_TARGET_ARCH +#define eax 0 +#define ebx 1 +#define ecx 2 +#define edx 3 +#define esi 4 +#define edi 5 +#define ebp 6 +#define esp 7 +#define fake0 8 +#define fake1 9 +#define fake2 10 +#define fake3 11 +#define fake4 12 +#define fake5 13 +#endif + +#if x86_64_TARGET_ARCH +#define rax 0 +#define rbx 1 +#define rcx 2 +#define rdx 3 +#define rsi 4 +#define rdi 5 +#define rbp 6 +#define rsp 7 +#define r8 8 +#define r9 9 +#define r10 10 +#define r11 11 +#define r12 12 +#define r13 13 +#define r14 14 +#define r15 15 +#define xmm0 16 +#define xmm1 17 +#define xmm2 18 +#define xmm3 19 +#define xmm4 20 +#define xmm5 21 +#define xmm6 22 +#define xmm7 23 +#define xmm8 24 +#define xmm9 25 +#define xmm10 26 +#define xmm11 27 +#define xmm12 28 +#define xmm13 29 +#define xmm14 30 +#define xmm15 31 +#endif + +#if sparc_TARGET_ARCH +#define g0 0 +#define g1 1 +#define g2 2 +#define g3 3 +#define g4 4 +#define g5 5 +#define g6 6 +#define g7 7 +#define o0 8 +#define o1 9 +#define o2 10 +#define o3 11 +#define o4 12 +#define o5 13 +#define o6 14 +#define o7 15 +#define l0 16 +#define l1 17 +#define l2 18 +#define l3 19 +#define l4 20 +#define l5 21 +#define l6 22 +#define l7 23 +#define i0 24 +#define i1 25 +#define i2 26 +#define i3 27 +#define i4 28 +#define i5 29 +#define i6 30 +#define i7 31 + +#define f0 32 +#define f1 33 +#define f2 34 +#define f3 35 +#define f4 36 +#define f5 37 +#define f6 38 +#define f7 39 +#define f8 40 +#define f9 41 +#define f10 42 +#define f11 43 +#define f12 44 +#define f13 45 +#define f14 46 +#define f15 47 +#define f16 48 +#define f17 49 +#define f18 50 +#define f19 51 +#define f20 52 +#define f21 53 +#define f22 54 +#define f23 55 +#define f24 56 +#define f25 57 +#define f26 58 +#define f27 59 +#define f28 60 +#define f29 61 +#define f30 62 +#define f31 63 +#endif + +#if powerpc_TARGET_ARCH +#define r0 0 +#define r1 1 +#define r2 2 +#define r3 3 +#define r4 4 +#define r5 5 +#define r6 6 +#define r7 7 +#define r8 8 +#define r9 9 +#define r10 10 +#define r11 11 +#define r12 12 +#define r13 13 +#define r14 14 +#define r15 15 +#define r16 16 +#define r17 17 +#define r18 18 +#define r19 19 +#define r20 20 +#define r21 21 +#define r22 22 
+#define r23 23 +#define r24 24 +#define r25 25 +#define r26 26 +#define r27 27 +#define r28 28 +#define r29 29 +#define r30 30 +#define r31 31 + +#ifdef darwin_TARGET_OS +#define f0 32 +#define f1 33 +#define f2 34 +#define f3 35 +#define f4 36 +#define f5 37 +#define f6 38 +#define f7 39 +#define f8 40 +#define f9 41 +#define f10 42 +#define f11 43 +#define f12 44 +#define f13 45 +#define f14 46 +#define f15 47 +#define f16 48 +#define f17 49 +#define f18 50 +#define f19 51 +#define f20 52 +#define f21 53 +#define f22 54 +#define f23 55 +#define f24 56 +#define f25 57 +#define f26 58 +#define f27 59 +#define f28 60 +#define f29 61 +#define f30 62 +#define f31 63 +#else +#define fr0 32 +#define fr1 33 +#define fr2 34 +#define fr3 35 +#define fr4 36 +#define fr5 37 +#define fr6 38 +#define fr7 39 +#define fr8 40 +#define fr9 41 +#define fr10 42 +#define fr11 43 +#define fr12 44 +#define fr13 45 +#define fr14 46 +#define fr15 47 +#define fr16 48 +#define fr17 49 +#define fr18 50 +#define fr19 51 +#define fr20 52 +#define fr21 53 +#define fr22 54 +#define fr23 55 +#define fr24 56 +#define fr25 57 +#define fr26 58 +#define fr27 59 +#define fr28 60 +#define fr29 61 +#define fr30 62 +#define fr31 63 +#endif +#endif + + +-- allMachRegs is the complete set of machine regs. +allMachRegNos :: [RegNo] +allMachRegNos + = IF_ARCH_alpha( [0..63], + IF_ARCH_i386( [0..13], + IF_ARCH_x86_64( [0..31], + IF_ARCH_sparc( ([0..31] + ++ [f0,f2 .. nCG_FirstFloatReg-1] + ++ [nCG_FirstFloatReg .. f31]), + IF_ARCH_powerpc([0..63], + ))))) + +-- allocatableRegs is allMachRegNos with the fixed-use regs removed. +-- i.e., these are the regs for which we are prepared to allow the +-- register allocator to attempt to map VRegs to. +allocatableRegs :: [RegNo] +allocatableRegs + = let isFree i = isFastTrue (freeReg i) + in filter isFree allMachRegNos + +-- these are the regs which we cannot assume stay alive over a +-- C call. +callClobberedRegs :: [Reg] +callClobberedRegs + = +#if alpha_TARGET_ARCH + [0, 1, 2, 3, 4, 5, 6, 7, 8, + 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, + fReg 0, fReg 1, fReg 10, fReg 11, fReg 12, fReg 13, fReg 14, fReg 15, + fReg 16, fReg 17, fReg 18, fReg 19, fReg 20, fReg 21, fReg 22, fReg 23, + fReg 24, fReg 25, fReg 26, fReg 27, fReg 28, fReg 29, fReg 30] +#endif /* alpha_TARGET_ARCH */ +#if i386_TARGET_ARCH + -- caller-saves registers + map RealReg [eax,ecx,edx,fake0,fake1,fake2,fake3,fake4,fake5] +#endif /* i386_TARGET_ARCH */ +#if x86_64_TARGET_ARCH + -- caller-saves registers + map RealReg ([rax,rcx,rdx,rsi,rdi,r8,r9,r10,r11] ++ [16..31]) + -- all xmm regs are caller-saves +#endif /* x86_64_TARGET_ARCH */ +#if sparc_TARGET_ARCH + map RealReg + ( oReg 7 : + [oReg i | i <- [0..5]] ++ + [gReg i | i <- [1..7]] ++ + [fReg i | i <- [0..31]] ) +#endif /* sparc_TARGET_ARCH */ +#if powerpc_TARGET_ARCH +#if darwin_TARGET_OS + map RealReg (0:[2..12] ++ map fReg [0..13]) +#elif linux_TARGET_OS + map RealReg (0:[2..13] ++ map fReg [0..13]) +#endif +#endif /* powerpc_TARGET_ARCH */ + + +-- argRegs is the set of regs which are read for an n-argument call to C. +-- For archs which pass all args on the stack (x86), is empty. +-- Sparc passes up to the first 6 args in regs. +-- Dunno about Alpha. +argRegs :: RegNo -> [Reg] + +#if i386_TARGET_ARCH +argRegs _ = panic "MachRegs.argRegs(x86): should not be used!" +#endif + +#if x86_64_TARGET_ARCH +argRegs _ = panic "MachRegs.argRegs(x86_64): should not be used!" 
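-- An illustrative example (for exposition only): on SPARC a 3-argument C
-- call reads its arguments from %o0..%o2, i.e. the equations below give
--   argRegs 3 == map (RealReg . oReg) [0,1,2]
-- whereas on x86 and x86_64 argRegs is not used at all -- the equations
-- above simply panic if it is ever consulted.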
+#endif + +#if alpha_TARGET_ARCH +argRegs 0 = [] +argRegs 1 = freeMappedRegs [16, fReg 16] +argRegs 2 = freeMappedRegs [16, 17, fReg 16, fReg 17] +argRegs 3 = freeMappedRegs [16, 17, 18, fReg 16, fReg 17, fReg 18] +argRegs 4 = freeMappedRegs [16, 17, 18, 19, fReg 16, fReg 17, fReg 18, fReg 19] +argRegs 5 = freeMappedRegs [16, 17, 18, 19, 20, fReg 16, fReg 17, fReg 18, fReg 19, fReg 20] +argRegs 6 = freeMappedRegs [16, 17, 18, 19, 20, 21, fReg 16, fReg 17, fReg 18, fReg 19, fReg 20, fReg 21] +argRegs _ = panic "MachRegs.argRegs(alpha): don't know about >6 arguments!" +#endif /* alpha_TARGET_ARCH */ + +#if sparc_TARGET_ARCH +argRegs 0 = [] +argRegs 1 = map (RealReg . oReg) [0] +argRegs 2 = map (RealReg . oReg) [0,1] +argRegs 3 = map (RealReg . oReg) [0,1,2] +argRegs 4 = map (RealReg . oReg) [0,1,2,3] +argRegs 5 = map (RealReg . oReg) [0,1,2,3,4] +argRegs 6 = map (RealReg . oReg) [0,1,2,3,4,5] +argRegs _ = panic "MachRegs.argRegs(sparc): don't know about >6 arguments!" +#endif /* sparc_TARGET_ARCH */ + +#if powerpc_TARGET_ARCH +argRegs 0 = [] +argRegs 1 = map RealReg [3] +argRegs 2 = map RealReg [3,4] +argRegs 3 = map RealReg [3..5] +argRegs 4 = map RealReg [3..6] +argRegs 5 = map RealReg [3..7] +argRegs 6 = map RealReg [3..8] +argRegs 7 = map RealReg [3..9] +argRegs 8 = map RealReg [3..10] +argRegs _ = panic "MachRegs.argRegs(powerpc): don't know about >8 arguments!" +#endif /* powerpc_TARGET_ARCH */ + + +-- all of the arg regs ?? +#if alpha_TARGET_ARCH +allArgRegs :: [(Reg, Reg)] +allArgRegs = [(realReg i, realReg (fReg i)) | i <- [16..21]] +#endif /* alpha_TARGET_ARCH */ + +#if sparc_TARGET_ARCH +allArgRegs :: [Reg] +allArgRegs = map RealReg [oReg i | i <- [0..5]] +#endif /* sparc_TARGET_ARCH */ + +#if i386_TARGET_ARCH +allArgRegs :: [Reg] +allArgRegs = panic "MachRegs.allArgRegs(x86): should not be used!" +#endif + +#if x86_64_TARGET_ARCH +allArgRegs :: [Reg] +allArgRegs = map RealReg [rdi,rsi,rdx,rcx,r8,r9] +allFPArgRegs :: [Reg] +allFPArgRegs = map RealReg [xmm0 .. xmm7] +#endif + +#if powerpc_TARGET_ARCH +allArgRegs :: [Reg] +allArgRegs = map RealReg [3..10] +allFPArgRegs :: [Reg] +#if darwin_TARGET_OS +allFPArgRegs = map (RealReg . fReg) [1..13] +#elif linux_TARGET_OS +allFPArgRegs = map (RealReg . fReg) [1..8] +#endif +#endif /* powerpc_TARGET_ARCH */ +\end{code} + +\begin{code} +freeReg :: RegNo -> FastBool + +#if alpha_TARGET_ARCH +freeReg 26 = fastBool False -- return address (ra) +freeReg 28 = fastBool False -- reserved for the assembler (at) +freeReg 29 = fastBool False -- global pointer (gp) +freeReg 30 = fastBool False -- stack pointer (sp) +freeReg 31 = fastBool False -- always zero (zeroh) +freeReg 63 = fastBool False -- always zero (f31) +#endif + +#if i386_TARGET_ARCH +freeReg esp = fastBool False -- %esp is the C stack pointer +#endif + +#if x86_64_TARGET_ARCH +freeReg rsp = fastBool False -- %rsp is the C stack pointer +#endif + +#if sparc_TARGET_ARCH +freeReg g0 = fastBool False -- %g0 is always 0. +freeReg g5 = fastBool False -- %g5 is reserved (ABI). +freeReg g6 = fastBool False -- %g6 is reserved (ABI). +freeReg g7 = fastBool False -- %g7 is reserved (ABI). +freeReg i6 = fastBool False -- %i6 is our frame pointer. +freeReg i7 = fastBool False -- %i7 tends to have ret-addr-ish things +freeReg o6 = fastBool False -- %o6 is our stack pointer. +freeReg o7 = fastBool False -- %o7 holds ret addrs (???) +freeReg f0 = fastBool False -- %f0/%f1 are the C fp return registers. 
+freeReg f1 = fastBool False +#endif + +#if powerpc_TARGET_ARCH +freeReg 0 = fastBool False -- Hack: r0 can't be used in all insns, but it's actually free +freeReg 1 = fastBool False -- The Stack Pointer +#if !darwin_TARGET_OS + -- most non-darwin powerpc OSes use r2 as a TOC pointer or something like that +freeReg 2 = fastBool False +#endif +#endif + +#ifdef REG_Base +freeReg REG_Base = fastBool False +#endif +#ifdef REG_R1 +freeReg REG_R1 = fastBool False +#endif +#ifdef REG_R2 +freeReg REG_R2 = fastBool False +#endif +#ifdef REG_R3 +freeReg REG_R3 = fastBool False +#endif +#ifdef REG_R4 +freeReg REG_R4 = fastBool False +#endif +#ifdef REG_R5 +freeReg REG_R5 = fastBool False +#endif +#ifdef REG_R6 +freeReg REG_R6 = fastBool False +#endif +#ifdef REG_R7 +freeReg REG_R7 = fastBool False +#endif +#ifdef REG_R8 +freeReg REG_R8 = fastBool False +#endif +#ifdef REG_F1 +freeReg REG_F1 = fastBool False +#endif +#ifdef REG_F2 +freeReg REG_F2 = fastBool False +#endif +#ifdef REG_F3 +freeReg REG_F3 = fastBool False +#endif +#ifdef REG_F4 +freeReg REG_F4 = fastBool False +#endif +#ifdef REG_D1 +freeReg REG_D1 = fastBool False +#endif +#ifdef REG_D2 +freeReg REG_D2 = fastBool False +#endif +#ifdef REG_Sp +freeReg REG_Sp = fastBool False +#endif +#ifdef REG_Su +freeReg REG_Su = fastBool False +#endif +#ifdef REG_SpLim +freeReg REG_SpLim = fastBool False +#endif +#ifdef REG_Hp +freeReg REG_Hp = fastBool False +#endif +#ifdef REG_HpLim +freeReg REG_HpLim = fastBool False +#endif +freeReg n = fastBool True + + +-- ----------------------------------------------------------------------------- +-- Information about global registers + +baseRegOffset :: GlobalReg -> Int + +baseRegOffset (VanillaReg 1) = oFFSET_StgRegTable_rR1 +baseRegOffset (VanillaReg 2) = oFFSET_StgRegTable_rR2 +baseRegOffset (VanillaReg 3) = oFFSET_StgRegTable_rR3 +baseRegOffset (VanillaReg 4) = oFFSET_StgRegTable_rR4 +baseRegOffset (VanillaReg 5) = oFFSET_StgRegTable_rR5 +baseRegOffset (VanillaReg 6) = oFFSET_StgRegTable_rR6 +baseRegOffset (VanillaReg 7) = oFFSET_StgRegTable_rR7 +baseRegOffset (VanillaReg 8) = oFFSET_StgRegTable_rR8 +baseRegOffset (VanillaReg 9) = oFFSET_StgRegTable_rR9 +baseRegOffset (VanillaReg 10) = oFFSET_StgRegTable_rR10 +baseRegOffset (FloatReg 1) = oFFSET_StgRegTable_rF1 +baseRegOffset (FloatReg 2) = oFFSET_StgRegTable_rF2 +baseRegOffset (FloatReg 3) = oFFSET_StgRegTable_rF3 +baseRegOffset (FloatReg 4) = oFFSET_StgRegTable_rF4 +baseRegOffset (DoubleReg 1) = oFFSET_StgRegTable_rD1 +baseRegOffset (DoubleReg 2) = oFFSET_StgRegTable_rD2 +baseRegOffset Sp = oFFSET_StgRegTable_rSp +baseRegOffset SpLim = oFFSET_StgRegTable_rSpLim +baseRegOffset (LongReg 1) = oFFSET_StgRegTable_rL1 +baseRegOffset Hp = oFFSET_StgRegTable_rHp +baseRegOffset HpLim = oFFSET_StgRegTable_rHpLim +baseRegOffset CurrentTSO = oFFSET_StgRegTable_rCurrentTSO +baseRegOffset CurrentNursery = oFFSET_StgRegTable_rCurrentNursery +baseRegOffset HpAlloc = oFFSET_StgRegTable_rHpAlloc +baseRegOffset GCEnter1 = oFFSET_stgGCEnter1 +baseRegOffset GCFun = oFFSET_stgGCFun +#ifdef DEBUG +baseRegOffset BaseReg = panic "baseRegOffset:BaseReg" +baseRegOffset _ = panic "baseRegOffset:other" +#endif + + +-- | Returns 'True' if this global register is stored in a caller-saves +-- machine register. 
+ +callerSaves :: GlobalReg -> Bool + +#ifdef CALLER_SAVES_Base +callerSaves BaseReg = True +#endif +#ifdef CALLER_SAVES_R1 +callerSaves (VanillaReg 1) = True +#endif +#ifdef CALLER_SAVES_R2 +callerSaves (VanillaReg 2) = True +#endif +#ifdef CALLER_SAVES_R3 +callerSaves (VanillaReg 3) = True +#endif +#ifdef CALLER_SAVES_R4 +callerSaves (VanillaReg 4) = True +#endif +#ifdef CALLER_SAVES_R5 +callerSaves (VanillaReg 5) = True +#endif +#ifdef CALLER_SAVES_R6 +callerSaves (VanillaReg 6) = True +#endif +#ifdef CALLER_SAVES_R7 +callerSaves (VanillaReg 7) = True +#endif +#ifdef CALLER_SAVES_R8 +callerSaves (VanillaReg 8) = True +#endif +#ifdef CALLER_SAVES_F1 +callerSaves (FloatReg 1) = True +#endif +#ifdef CALLER_SAVES_F2 +callerSaves (FloatReg 2) = True +#endif +#ifdef CALLER_SAVES_F3 +callerSaves (FloatReg 3) = True +#endif +#ifdef CALLER_SAVES_F4 +callerSaves (FloatReg 4) = True +#endif +#ifdef CALLER_SAVES_D1 +callerSaves (DoubleReg 1) = True +#endif +#ifdef CALLER_SAVES_D2 +callerSaves (DoubleReg 2) = True +#endif +#ifdef CALLER_SAVES_L1 +callerSaves (LongReg 1) = True +#endif +#ifdef CALLER_SAVES_Sp +callerSaves Sp = True +#endif +#ifdef CALLER_SAVES_SpLim +callerSaves SpLim = True +#endif +#ifdef CALLER_SAVES_Hp +callerSaves Hp = True +#endif +#ifdef CALLER_SAVES_HpLim +callerSaves HpLim = True +#endif +#ifdef CALLER_SAVES_CurrentTSO +callerSaves CurrentTSO = True +#endif +#ifdef CALLER_SAVES_CurrentNursery +callerSaves CurrentNursery = True +#endif +callerSaves _ = False + + +-- | Returns 'Nothing' if this global register is not stored +-- in a real machine register, otherwise returns @'Just' reg@, where +-- reg is the machine register it is stored in. + +globalRegMaybe :: GlobalReg -> Maybe Reg + +#ifdef REG_Base +globalRegMaybe BaseReg = Just (RealReg REG_Base) +#endif +#ifdef REG_R1 +globalRegMaybe (VanillaReg 1) = Just (RealReg REG_R1) +#endif +#ifdef REG_R2 +globalRegMaybe (VanillaReg 2) = Just (RealReg REG_R2) +#endif +#ifdef REG_R3 +globalRegMaybe (VanillaReg 3) = Just (RealReg REG_R3) +#endif +#ifdef REG_R4 +globalRegMaybe (VanillaReg 4) = Just (RealReg REG_R4) +#endif +#ifdef REG_R5 +globalRegMaybe (VanillaReg 5) = Just (RealReg REG_R5) +#endif +#ifdef REG_R6 +globalRegMaybe (VanillaReg 6) = Just (RealReg REG_R6) +#endif +#ifdef REG_R7 +globalRegMaybe (VanillaReg 7) = Just (RealReg REG_R7) +#endif +#ifdef REG_R8 +globalRegMaybe (VanillaReg 8) = Just (RealReg REG_R8) +#endif +#ifdef REG_R9 +globalRegMaybe (VanillaReg 9) = Just (RealReg REG_R9) +#endif +#ifdef REG_R10 +globalRegMaybe (VanillaReg 10) = Just (RealReg REG_R10) +#endif +#ifdef REG_F1 +globalRegMaybe (FloatReg 1) = Just (RealReg REG_F1) +#endif +#ifdef REG_F2 +globalRegMaybe (FloatReg 2) = Just (RealReg REG_F2) +#endif +#ifdef REG_F3 +globalRegMaybe (FloatReg 3) = Just (RealReg REG_F3) +#endif +#ifdef REG_F4 +globalRegMaybe (FloatReg 4) = Just (RealReg REG_F4) +#endif +#ifdef REG_D1 +globalRegMaybe (DoubleReg 1) = Just (RealReg REG_D1) +#endif +#ifdef REG_D2 +globalRegMaybe (DoubleReg 2) = Just (RealReg REG_D2) +#endif +#ifdef REG_Sp +globalRegMaybe Sp = Just (RealReg REG_Sp) +#endif +#ifdef REG_Lng1 +globalRegMaybe (LongReg 1) = Just (RealReg REG_Lng1) +#endif +#ifdef REG_Lng2 +globalRegMaybe (LongReg 2) = Just (RealReg REG_Lng2) +#endif +#ifdef REG_SpLim +globalRegMaybe SpLim = Just (RealReg REG_SpLim) +#endif +#ifdef REG_Hp +globalRegMaybe Hp = Just (RealReg REG_Hp) +#endif +#ifdef REG_HpLim +globalRegMaybe HpLim = Just (RealReg REG_HpLim) +#endif +#ifdef REG_CurrentTSO +globalRegMaybe CurrentTSO = Just (RealReg 
REG_CurrentTSO) +#endif +#ifdef REG_CurrentNursery +globalRegMaybe CurrentNursery = Just (RealReg REG_CurrentNursery) +#endif +globalRegMaybe _ = Nothing + + +\end{code} diff --git a/compiler/nativeGen/NCG.h b/compiler/nativeGen/NCG.h new file mode 100644 index 0000000000..b17f682e71 --- /dev/null +++ b/compiler/nativeGen/NCG.h @@ -0,0 +1,108 @@ +/* ----------------------------------------------------------------------------- + + (c) The University of Glasgow, 1994-2004 + + Native-code generator header file - just useful macros for now. + + -------------------------------------------------------------------------- */ + +#ifndef NCG_H +#define NCG_H + +#include "ghc_boot_platform.h" + +#define COMMA , + +-- - - - - - - - - - - - - - - - - - - - - - +#if alpha_TARGET_ARCH +# define IF_ARCH_alpha(x,y) x +#else +# define IF_ARCH_alpha(x,y) y +#endif +-- - - - - - - - - - - - - - - - - - - - - - +#if i386_TARGET_ARCH +# define IF_ARCH_i386(x,y) x +#else +# define IF_ARCH_i386(x,y) y +#endif +-- - - - - - - - - - - - - - - - - - - - - - +#if x86_64_TARGET_ARCH +# define IF_ARCH_x86_64(x,y) x +#else +# define IF_ARCH_x86_64(x,y) y +#endif +-- - - - - - - - - - - - - - - - - - - - - - +#if freebsd_TARGET_OS +# define IF_OS_freebsd(x,y) x +#else +# define IF_OS_freebsd(x,y) y +#endif +-- - - - - - - - - - - - - - - - - - - - - - +#if netbsd_TARGET_OS +# define IF_OS_netbsd(x,y) x +#else +# define IF_OS_netbsd(x,y) y +#endif +-- - - - - - - - - - - - - - - - - - - - - - +#if openbsd_TARGET_OS +# define IF_OS_openbsd(x,y) x +#else +# define IF_OS_openbsd(x,y) y +#endif +-- - - - - - - - - - - - - - - - - - - - - - +#if linux_TARGET_OS +# define IF_OS_linux(x,y) x +#else +# define IF_OS_linux(x,y) y +#endif +-- - - - - - - - - - - - - - - - - - - - - - +#if linuxaout_TARGET_OS +# define IF_OS_linuxaout(x,y) x +#else +# define IF_OS_linuxaout(x,y) y +#endif +-- - - - - - - - - - - - - - - - - - - - - - +#if bsdi_TARGET_OS +# define IF_OS_bsdi(x,y) x +#else +# define IF_OS_bsdi(x,y) y +#endif +-- - - - - - - - - - - - - - - - - - - - - - +#if cygwin32_TARGET_OS +# define IF_OS_cygwin32(x,y) x +#else +# define IF_OS_cygwin32(x,y) y +#endif +-- - - - - - - - - - - - - - - - - - - - - - +#if sparc_TARGET_ARCH +# define IF_ARCH_sparc(x,y) x +#else +# define IF_ARCH_sparc(x,y) y +#endif +-- - - - - - - - - - - - - - - - - - - - - - +#if sunos4_TARGET_OS +# define IF_OS_sunos4(x,y) x +#else +# define IF_OS_sunos4(x,y) y +#endif +-- - - - - - - - - - - - - - - - - - - - - - +-- NB: this will catch i386-*-solaris2, too +#if solaris2_TARGET_OS +# define IF_OS_solaris2(x,y) x +#else +# define IF_OS_solaris2(x,y) y +#endif +-- - - - - - - - - - - - - - - - - - - - - - +#if powerpc_TARGET_ARCH +# define IF_ARCH_powerpc(x,y) x +#else +# define IF_ARCH_powerpc(x,y) y +#endif +-- - - - - - - - - - - - - - - - - - - - - - +#if darwin_TARGET_OS +# define IF_OS_darwin(x,y) x +#else +# define IF_OS_darwin(x,y) y +#endif +--------------------------------------------- +#endif diff --git a/compiler/nativeGen/NCGMonad.hs b/compiler/nativeGen/NCGMonad.hs new file mode 100644 index 0000000000..8fdcd44024 --- /dev/null +++ b/compiler/nativeGen/NCGMonad.hs @@ -0,0 +1,111 @@ +-- ----------------------------------------------------------------------------- +-- +-- (c) The University of Glasgow 1993-2004 +-- +-- The native code generator's monad. 
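-- The monad threads a NatM_State (unique supply, current stack delta, the
-- imported labels seen so far, and an optional PIC base register) through
-- code generation.  A minimal usage sketch (illustrative only):
--
--   do tmp <- getNewRegNat rep     -- fresh virtual register of width rep
--      lbl <- getNewLabelNat       -- fresh local assembler label
--      ...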
+-- +-- ----------------------------------------------------------------------------- + +module NCGMonad ( + NatM_State(..), mkNatM_State, + + NatM, -- instance Monad + initNat, addImportNat, getUniqueNat, + mapAccumLNat, setDeltaNat, getDeltaNat, + getBlockIdNat, getNewLabelNat, getNewRegNat, getNewRegPairNat, + getPicBaseMaybeNat, getPicBaseNat + ) where + +#include "HsVersions.h" + +import Cmm ( BlockId(..) ) +import CLabel ( CLabel, mkAsmTempLabel ) +import MachRegs +import MachOp ( MachRep ) +import UniqSupply +import Unique ( Unique ) + + +data NatM_State = NatM_State { + natm_us :: UniqSupply, + natm_delta :: Int, + natm_imports :: [(CLabel)], + natm_pic :: Maybe Reg + } + +newtype NatM result = NatM (NatM_State -> (result, NatM_State)) + +unNat (NatM a) = a + +mkNatM_State :: UniqSupply -> Int -> NatM_State +mkNatM_State us delta = NatM_State us delta [] Nothing + +initNat :: NatM_State -> NatM a -> (a, NatM_State) +initNat init_st m = case unNat m init_st of { (r,st) -> (r,st) } + +instance Monad NatM where + (>>=) = thenNat + return = returnNat + +thenNat :: NatM a -> (a -> NatM b) -> NatM b +thenNat expr cont + = NatM $ \st -> case unNat expr st of + (result, st') -> unNat (cont result) st' + +returnNat :: a -> NatM a +returnNat result = NatM $ \st -> (result, st) + +mapAccumLNat :: (acc -> x -> NatM (acc, y)) + -> acc + -> [x] + -> NatM (acc, [y]) + +mapAccumLNat f b [] + = return (b, []) +mapAccumLNat f b (x:xs) + = do (b__2, x__2) <- f b x + (b__3, xs__2) <- mapAccumLNat f b__2 xs + return (b__3, x__2:xs__2) + +getUniqueNat :: NatM Unique +getUniqueNat = NatM $ \ (NatM_State us delta imports pic) -> + case splitUniqSupply us of + (us1,us2) -> (uniqFromSupply us1, (NatM_State us2 delta imports pic)) + +getDeltaNat :: NatM Int +getDeltaNat = NatM $ \ st -> (natm_delta st, st) + +setDeltaNat :: Int -> NatM () +setDeltaNat delta = NatM $ \ (NatM_State us _ imports pic) -> + ((), NatM_State us delta imports pic) + +addImportNat :: CLabel -> NatM () +addImportNat imp = NatM $ \ (NatM_State us delta imports pic) -> + ((), NatM_State us delta (imp:imports) pic) + +getBlockIdNat :: NatM BlockId +getBlockIdNat = do u <- getUniqueNat; return (BlockId u) + +getNewLabelNat :: NatM CLabel +getNewLabelNat = do u <- getUniqueNat; return (mkAsmTempLabel u) + +getNewRegNat :: MachRep -> NatM Reg +getNewRegNat rep = do u <- getUniqueNat; return (mkVReg u rep) + +getNewRegPairNat :: MachRep -> NatM (Reg,Reg) +getNewRegPairNat rep = do + u <- getUniqueNat + let lo = mkVReg u rep; hi = getHiVRegFromLo lo + return (lo,hi) + +getPicBaseMaybeNat :: NatM (Maybe Reg) +getPicBaseMaybeNat = NatM (\state -> (natm_pic state, state)) + +getPicBaseNat :: MachRep -> NatM Reg +getPicBaseNat rep = do + mbPicBase <- getPicBaseMaybeNat + case mbPicBase of + Just picBase -> return picBase + Nothing -> do + reg <- getNewRegNat rep + NatM (\state -> (reg, state { natm_pic = Just reg })) diff --git a/compiler/nativeGen/NOTES b/compiler/nativeGen/NOTES new file mode 100644 index 0000000000..9068a7fc2c --- /dev/null +++ b/compiler/nativeGen/NOTES @@ -0,0 +1,41 @@ +TODO in new NCG +~~~~~~~~~~~~~~~ + +- Are we being careful enough about narrowing those out-of-range CmmInts? + +- Register allocator: + - fixup code + - keep track of free stack slots + + Optimisations: + + - picking the assignment on entry to a block: better to defer this + until we know all the assignments. In a loop, we should pick + the assignment from the looping jump (fixpointing?), so that any + fixup code ends up *outside* the loop. 
Otherwise, we should + pick the assignment that results in the least fixup code. + +- splitting? + +-- ----------------------------------------------------------------------------- +-- x86 ToDos + +- x86 genCCall needs to tack on the @size for stdcalls (might not be in the + foreignlabel). + +- x86: should really clean up that IMUL64 stuff, and tell the code gen about + Intel imul instructions. + +- x86: we're not careful enough about making sure that we only use + byte-addressable registers in byte instructions. Should we do it this + way, or stick to using 32-bit registers everywhere? + +- Use SSE for floating point, optionally. + +------------------------------------------------------------------------------ +-- Further optimisations: + +- We might be able to extend the scope of the inlining phase so it can + skip over more statements that don't affect the value of the inlined + expr. + diff --git a/compiler/nativeGen/PositionIndependentCode.hs b/compiler/nativeGen/PositionIndependentCode.hs new file mode 100644 index 0000000000..0daccb6530 --- /dev/null +++ b/compiler/nativeGen/PositionIndependentCode.hs @@ -0,0 +1,605 @@ +module PositionIndependentCode ( + cmmMakeDynamicReference, + needImportedSymbols, + pprImportedSymbol, + pprGotDeclaration, + initializePicBase + ) where + +{- + This module handles generation of position independent code and + dynamic-linking related issues for the native code generator. + + Things outside this module which are related to this: + + + module CLabel + - PIC base label (pretty printed as local label 1) + - DynamicLinkerLabels - several kinds: + CodeStub, SymbolPtr, GotSymbolPtr, GotSymbolOffset + - labelDynamic predicate + + module Cmm + - The GlobalReg datatype has a PicBaseReg constructor + - The CmmLit datatype has a CmmLabelDiffOff constructor + + codeGen & RTS + - When tablesNextToCode, no absolute addresses are stored in info tables + any more. Instead, offsets from the info label are used. + - For Win32 only, SRTs might contain addresses of __imp_ symbol pointers + because Win32 doesn't support external references in data sections. + TODO: make sure this still works, it might be bitrotted + + NCG + - The cmmToCmm pass in AsmCodeGen calls cmmMakeDynamicReference for all + labels. + - nativeCodeGen calls pprImportedSymbol and pprGotDeclaration to output + all the necessary stuff for imported symbols. + - The NCG monad keeps track of a list of imported symbols. + - MachCodeGen invokes initializePicBase to generate code to initialize + the PIC base register when needed. + - MachCodeGen calls cmmMakeDynamicReference whenever it uses a CLabel + that wasn't in the original Cmm code (e.g. floating point literals). + + The Mangler + - The mangler converts absolure refs to relative refs in info tables + - Symbol pointers, stub code and PIC calculations that are generated + by GCC are left intact by the mangler (so far only on ppc-darwin + and ppc-linux). 
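  A worked example of the flow (illustrative): when the cmmToCmm pass meets
  a data reference to a label that lives in a dynamic library on Darwin,
  cmmMakeDynamicReference turns it into a CmmLoad through a SymbolPtr
  dynamic-linker label, records that label with the supplied addImport
  function, and pprImportedSymbol later emits the corresponding
  .non_lazy_symbol_pointer entry for it.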
+-} + +#include "HsVersions.h" +#include "nativeGen/NCG.h" + +import Cmm +import MachOp ( MachOp(MO_Add), wordRep ) +import CLabel ( CLabel, pprCLabel, + mkDynamicLinkerLabel, DynamicLinkerLabelInfo(..), + dynamicLinkerLabelInfo, mkPicBaseLabel, + labelDynamic, externallyVisibleCLabel ) + +#if linux_TARGET_OS +import CLabel ( mkForeignLabel ) +#endif + +import MachRegs +import MachInstrs +import NCGMonad ( NatM, getNewRegNat, getNewLabelNat ) + +import StaticFlags ( opt_PIC, opt_Static ) + +import Pretty +import qualified Outputable + +import Panic ( panic ) + + +-- The most important function here is cmmMakeDynamicReference. + +-- It gets called by the cmmToCmm pass for every CmmLabel in the Cmm +-- code. It does The Right Thing(tm) to convert the CmmLabel into a +-- position-independent, dynamic-linking-aware reference to the thing +-- in question. +-- Note that this also has to be called from MachCodeGen in order to +-- access static data like floating point literals (labels that were +-- created after the cmmToCmm pass). +-- The function must run in a monad that can keep track of imported symbols +-- A function for recording an imported symbol must be passed in: +-- - addImportCmmOpt for the CmmOptM monad +-- - addImportNat for the NatM monad. + +cmmMakeDynamicReference + :: Monad m => (CLabel -> m ()) -- a monad & a function + -- used for recording imported symbols + -> Bool -- whether this is the target of a jump + -> CLabel -- the label + -> m CmmExpr + +cmmMakeDynamicReference addImport isJumpTarget lbl + | Just _ <- dynamicLinkerLabelInfo lbl + = return $ CmmLit $ CmmLabel lbl -- already processed it, pass through + | otherwise = case howToAccessLabel isJumpTarget lbl of + AccessViaStub -> do + let stub = mkDynamicLinkerLabel CodeStub lbl + addImport stub + return $ CmmLit $ CmmLabel stub + AccessViaSymbolPtr -> do + let symbolPtr = mkDynamicLinkerLabel SymbolPtr lbl + addImport symbolPtr + return $ CmmLoad (cmmMakePicReference symbolPtr) wordRep + AccessDirectly + -- all currently supported processors support + -- a PC-relative branch instruction, so just jump there + | isJumpTarget -> return $ CmmLit $ CmmLabel lbl + -- for data, we might have to make some calculations: + | otherwise -> return $ cmmMakePicReference lbl + +-- ------------------------------------------------------------------- + +-- Create a position independent reference to a label. +-- (but do not bother with dynamic linking). +-- We calculate the label's address by adding some (platform-dependent) +-- offset to our base register; this offset is calculated by +-- the function picRelative in the platform-dependent part below. + +cmmMakePicReference :: CLabel -> CmmExpr + +#if !mingw32_TARGET_OS + -- Windows doesn't need PIC, + -- everything gets relocated at runtime + +cmmMakePicReference lbl + | opt_PIC && absoluteLabel lbl = CmmMachOp (MO_Add wordRep) [ + CmmReg (CmmGlobal PicBaseReg), + CmmLit $ picRelative lbl + ] + where + absoluteLabel lbl = case dynamicLinkerLabelInfo lbl of + Just (GotSymbolPtr, _) -> False + Just (GotSymbolOffset, _) -> False + _ -> True + +#endif +cmmMakePicReference lbl = CmmLit $ CmmLabel lbl + +-- =================================================================== +-- Platform dependent stuff +-- =================================================================== + +-- Knowledge about how special dynamic linker labels like symbol +-- pointers, code stubs and GOT offsets look like is located in the +-- module CLabel. 
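-- A minimal usage sketch (illustrative only; 'genJumpRef' is a hypothetical
-- caller, and it assumes addImportNat from NCGMonad is in scope):
--
--   genJumpRef :: CLabel -> NatM CmmExpr
--   genJumpRef lbl = cmmMakeDynamicReference addImportNat True lbl
--
-- On platforms that use symbol stubs (e.g. Darwin), a dynamic 'lbl' comes
-- back as a reference to its CodeStub label and the stub is recorded via
-- addImportNat, so that pprImportedSymbol can emit it later; an ordinary
-- local label just comes back as CmmLit (CmmLabel lbl).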
+ +-- ------------------------------------------------------------------- + +-- We have to decide which labels need to be accessed +-- indirectly or via a piece of stub code. + +data LabelAccessStyle = AccessViaStub + | AccessViaSymbolPtr + | AccessDirectly + +howToAccessLabel :: Bool -> CLabel -> LabelAccessStyle + +#if mingw32_TARGET_OS +-- Windows +-- +-- We need to use access *exactly* those things that +-- are imported from a DLL via an __imp_* label. +-- There are no stubs for imported code. + +howToAccessLabel _ lbl | labelDynamic lbl = AccessViaSymbolPtr + | otherwise = AccessDirectly + +#elif darwin_TARGET_OS +-- Mach-O (Darwin, Mac OS X) +-- +-- Indirect access is required in the following cases: +-- * things imported from a dynamic library +-- * things from a different module, if we're generating PIC code +-- It is always possible to access something indirectly, +-- even when it's not necessary. + +howToAccessLabel True lbl + -- jumps to a dynamic library go via a symbol stub + | labelDynamic lbl = AccessViaStub + -- when generating PIC code, all cross-module references must + -- must go via a symbol pointer, too. + -- Unfortunately, we don't know whether it's cross-module, + -- so we do it for all externally visible labels. + -- This is a slight waste of time and space, but otherwise + -- we'd need to pass the current Module all the way in to + -- this function. + | opt_PIC && externallyVisibleCLabel lbl = AccessViaStub +howToAccessLabel False lbl + -- data access to a dynamic library goes via a symbol pointer + | labelDynamic lbl = AccessViaSymbolPtr + -- cross-module PIC references: same as above + | opt_PIC && externallyVisibleCLabel lbl = AccessViaSymbolPtr +howToAccessLabel _ _ = AccessDirectly + +#elif linux_TARGET_OS && powerpc64_TARGET_ARCH +-- ELF PPC64 (powerpc64-linux), AIX, MacOS 9, BeOS/PPC + +howToAccessLabel True lbl = AccessDirectly -- actually, .label instead of label +howToAccessLabel _ lbl = AccessViaSymbolPtr + +#elif linux_TARGET_OS +-- ELF (Linux) +-- +-- ELF tries to pretend to the main application code that dynamic linking does +-- not exist. While this may sound convenient, it tends to mess things up in +-- very bad ways, so we have to be careful when we generate code for the main +-- program (-dynamic but no -fPIC). +-- +-- Indirect access is required for references to imported symbols +-- from position independent code. It is also required from the main program +-- when dynamic libraries containing Haskell code are used. + +howToAccessLabel isJump lbl + -- no PIC -> the dynamic linker does everything for us; + -- if we don't dynamically link to Haskell code, + -- it actually manages to do so without messing thins up. + | not opt_PIC && opt_Static = AccessDirectly + +#if !i386_TARGET_ARCH +-- for Intel, we temporarily disable the use of the +-- Procedure Linkage Table, because PLTs on intel require the +-- address of the GOT to be loaded into register %ebx before +-- a jump through the PLT is made. +-- TODO: make the i386 NCG ensure this before jumping to a +-- CodeStub label, so we can remove this special case. + + -- As long as we're in a shared library ourselves, + -- we can use the plt. + -- NOTE: We might want to disable this, because this + -- prevents -fPIC code from being linked statically. 
+ | isJump && labelDynamic lbl && opt_PIC = AccessViaStub + + -- TODO: it would be OK to access non-Haskell code via a stub +-- | isJump && labelDynamic lbl && not isHaskellCode lbl = AccessViaStub + + -- Using code stubs for jumps from the main program to an entry + -- label in a dynamic library is deadly; this will cause the dynamic + -- linker to replace all references (even data references) to that + -- label by references to the stub, so we won't find our info tables + -- any more. +#endif + + -- A dynamic label needs to be accessed via a symbol pointer. + -- NOTE: It would be OK to jump to foreign code via a PLT stub. + | labelDynamic lbl = AccessViaSymbolPtr + +#if powerpc_TARGET_ARCH + -- For PowerPC32 -fPIC, we have to access even static data + -- via a symbol pointer (see below for an explanation why + -- PowerPC32 Linux is especially broken). + | opt_PIC && not isJump = AccessViaSymbolPtr +#endif + + | otherwise = AccessDirectly + +#else +-- +-- all other platforms +-- +howToAccessLabel _ _ + | not opt_PIC = AccessDirectly + | otherwise = panic "howToAccessLabel: PIC not defined for this platform" +#endif + +-- ------------------------------------------------------------------- + +-- What do we have to add to our 'PIC base register' in order to +-- get the address of a label? + +picRelative :: CLabel -> CmmLit +#if darwin_TARGET_OS +-- Darwin: +-- The PIC base register points to the PIC base label at the beginning +-- of the current CmmTop. We just have to use a label difference to +-- get the offset. +-- We have already made sure that all labels that are not from the current +-- module are accessed indirectly ('as' can't calculate differences between +-- undefined labels). + +picRelative lbl + = CmmLabelDiffOff lbl mkPicBaseLabel 0 + +#elif powerpc_TARGET_ARCH && linux_TARGET_OS +-- PowerPC Linux: +-- The PIC base register points to our fake GOT. Use a label difference +-- to get the offset. +-- We have made sure that *everything* is accessed indirectly, so this +-- is only used for offsets from the GOT to symbol pointers inside the +-- GOT. +picRelative lbl + = CmmLabelDiffOff lbl gotLabel 0 + +#elif linux_TARGET_OS +-- Other Linux versions: +-- The PIC base register points to the GOT. Use foo@got for symbol +-- pointers, and foo@gotoff for everything else. + +picRelative lbl + | Just (SymbolPtr, lbl') <- dynamicLinkerLabelInfo lbl + = CmmLabel $ mkDynamicLinkerLabel GotSymbolPtr lbl' + | otherwise + = CmmLabel $ mkDynamicLinkerLabel GotSymbolOffset lbl + +#else +picRelative lbl = panic "PositionIndependentCode.picRelative" +#endif + +-- ------------------------------------------------------------------- + +-- What do we have to add to every assembly file we generate? + +-- utility function for pretty-printing asm-labels, +-- copied from PprMach +asmSDoc d = Outputable.withPprStyleDoc ( + Outputable.mkCodeStyle Outputable.AsmStyle) d +pprCLabel_asm l = asmSDoc (pprCLabel l) + + +#if darwin_TARGET_OS + +needImportedSymbols = True + +-- We don't need to declare any offset tables. +-- However, for PIC on x86, we need a small helper function. 
+#if i386_TARGET_ARCH +pprGotDeclaration + | opt_PIC + = vcat [ + ptext SLIT(".section __TEXT,__textcoal_nt,coalesced,no_toc"), + ptext SLIT(".weak_definition ___i686.get_pc_thunk.ax"), + ptext SLIT(".private_extern ___i686.get_pc_thunk.ax"), + ptext SLIT("___i686.get_pc_thunk.ax:"), + ptext SLIT("\tmovl (%esp), %eax"), + ptext SLIT("\tret") + ] + | otherwise = Pretty.empty +#else +pprGotDeclaration = Pretty.empty +#endif + +-- On Darwin, we have to generate our own stub code for lazy binding.. +-- For each processor architecture, there are two versions, one for PIC +-- and one for non-PIC. +pprImportedSymbol importedLbl +#if powerpc_TARGET_ARCH + | Just (CodeStub, lbl) <- dynamicLinkerLabelInfo importedLbl + = case opt_PIC of + False -> + vcat [ + ptext SLIT(".symbol_stub"), + ptext SLIT("L") <> pprCLabel_asm lbl <> ptext SLIT("$stub:"), + ptext SLIT("\t.indirect_symbol") <+> pprCLabel_asm lbl, + ptext SLIT("\tlis r11,ha16(L") <> pprCLabel_asm lbl + <> ptext SLIT("$lazy_ptr)"), + ptext SLIT("\tlwz r12,lo16(L") <> pprCLabel_asm lbl + <> ptext SLIT("$lazy_ptr)(r11)"), + ptext SLIT("\tmtctr r12"), + ptext SLIT("\taddi r11,r11,lo16(L") <> pprCLabel_asm lbl + <> ptext SLIT("$lazy_ptr)"), + ptext SLIT("\tbctr") + ] + True -> + vcat [ + ptext SLIT(".section __TEXT,__picsymbolstub1,") + <> ptext SLIT("symbol_stubs,pure_instructions,32"), + ptext SLIT("\t.align 2"), + ptext SLIT("L") <> pprCLabel_asm lbl <> ptext SLIT("$stub:"), + ptext SLIT("\t.indirect_symbol") <+> pprCLabel_asm lbl, + ptext SLIT("\tmflr r0"), + ptext SLIT("\tbcl 20,31,L0$") <> pprCLabel_asm lbl, + ptext SLIT("L0$") <> pprCLabel_asm lbl <> char ':', + ptext SLIT("\tmflr r11"), + ptext SLIT("\taddis r11,r11,ha16(L") <> pprCLabel_asm lbl + <> ptext SLIT("$lazy_ptr-L0$") <> pprCLabel_asm lbl <> char ')', + ptext SLIT("\tmtlr r0"), + ptext SLIT("\tlwzu r12,lo16(L") <> pprCLabel_asm lbl + <> ptext SLIT("$lazy_ptr-L0$") <> pprCLabel_asm lbl + <> ptext SLIT(")(r11)"), + ptext SLIT("\tmtctr r12"), + ptext SLIT("\tbctr") + ] + $+$ vcat [ + ptext SLIT(".lazy_symbol_pointer"), + ptext SLIT("L") <> pprCLabel_asm lbl <> ptext SLIT("$lazy_ptr:"), + ptext SLIT("\t.indirect_symbol") <+> pprCLabel_asm lbl, + ptext SLIT("\t.long dyld_stub_binding_helper") + ] +#elif i386_TARGET_ARCH + | Just (CodeStub, lbl) <- dynamicLinkerLabelInfo importedLbl + = case opt_PIC of + False -> + vcat [ + ptext SLIT(".symbol_stub"), + ptext SLIT("L") <> pprCLabel_asm lbl <> ptext SLIT("$stub:"), + ptext SLIT("\t.indirect_symbol") <+> pprCLabel_asm lbl, + ptext SLIT("\tjmp *L") <> pprCLabel_asm lbl + <> ptext SLIT("$lazy_ptr"), + ptext SLIT("L") <> pprCLabel_asm lbl + <> ptext SLIT("$stub_binder:"), + ptext SLIT("\tpushl $L") <> pprCLabel_asm lbl + <> ptext SLIT("$lazy_ptr"), + ptext SLIT("\tjmp dyld_stub_binding_helper") + ] + True -> + vcat [ + ptext SLIT(".section __TEXT,__picsymbolstub2,") + <> ptext SLIT("symbol_stubs,pure_instructions,25"), + ptext SLIT("L") <> pprCLabel_asm lbl <> ptext SLIT("$stub:"), + ptext SLIT("\t.indirect_symbol") <+> pprCLabel_asm lbl, + ptext SLIT("\tcall ___i686.get_pc_thunk.ax"), + ptext SLIT("1:"), + ptext SLIT("\tmovl L") <> pprCLabel_asm lbl + <> ptext SLIT("$lazy_ptr-1b(%eax),%edx"), + ptext SLIT("\tjmp %edx"), + ptext SLIT("L") <> pprCLabel_asm lbl + <> ptext SLIT("$stub_binder:"), + ptext SLIT("\tlea L") <> pprCLabel_asm lbl + <> ptext SLIT("$lazy_ptr-1b(%eax),%eax"), + ptext SLIT("\tpushl %eax"), + ptext SLIT("\tjmp dyld_stub_binding_helper") + ] + $+$ vcat [ ptext SLIT(".section __DATA, __la_sym_ptr") + <> (if opt_PIC then 
int 2 else int 3) + <> ptext SLIT(",lazy_symbol_pointers"), + ptext SLIT("L") <> pprCLabel_asm lbl <> ptext SLIT("$lazy_ptr:"), + ptext SLIT("\t.indirect_symbol") <+> pprCLabel_asm lbl, + ptext SLIT("\t.long L") <> pprCLabel_asm lbl + <> ptext SLIT("$stub_binder") + ] +#endif +-- We also have to declare our symbol pointers ourselves: + | Just (SymbolPtr, lbl) <- dynamicLinkerLabelInfo importedLbl + = vcat [ + ptext SLIT(".non_lazy_symbol_pointer"), + char 'L' <> pprCLabel_asm lbl <> ptext SLIT("$non_lazy_ptr:"), + ptext SLIT("\t.indirect_symbol") <+> pprCLabel_asm lbl, + ptext SLIT("\t.long\t0") + ] + + | otherwise = empty + +#elif linux_TARGET_OS && !powerpc64_TARGET_ARCH + +-- ELF / Linux +-- +-- In theory, we don't need to generate any stubs or symbol pointers +-- by hand for Linux. +-- +-- Reality differs from this in two areas. +-- +-- 1) If we just use a dynamically imported symbol directly in a read-only +-- section of the main executable (as GCC does), ld generates R_*_COPY +-- relocations, which are fundamentally incompatible with reversed info +-- tables. Therefore, we need a table of imported addresses in a writable +-- section. +-- The "official" GOT mechanism (label@got) isn't intended to be used +-- in position dependent code, so we have to create our own "fake GOT" +-- when not opt_PCI && not opt_Static. +-- +-- 2) PowerPC Linux is just plain broken. +-- While it's theoretically possible to use GOT offsets larger +-- than 16 bit, the standard crt*.o files don't, which leads to +-- linker errors as soon as the GOT size exceeds 16 bit. +-- Also, the assembler doesn't support @gotoff labels. +-- In order to be able to use a larger GOT, we have to circumvent the +-- entire GOT mechanism and do it ourselves (this is also what GCC does). + + +-- When needImportedSymbols is defined, +-- the NCG will keep track of all DynamicLinkerLabels it uses +-- and output each of them using pprImportedSymbol. +#if powerpc_TARGET_ARCH + -- PowerPC Linux: -fPIC or -dynamic +needImportedSymbols = opt_PIC || not opt_Static +#else + -- i386 (and others?): -dynamic but not -fPIC +needImportedSymbols = not opt_Static && not opt_PIC +#endif + +-- gotLabel +-- The label used to refer to our "fake GOT" from +-- position-independent code. +gotLabel = mkForeignLabel -- HACK: it's not really foreign + FSLIT(".LCTOC1") Nothing False + +-- pprGotDeclaration +-- Output whatever needs to be output once per .s file. +-- The .LCTOC1 label is defined to point 32768 bytes into the table, +-- to make the most of the PPC's 16-bit displacements. +-- Only needed for PIC. + +pprGotDeclaration + | not opt_PIC = Pretty.empty + | otherwise = vcat [ + ptext SLIT(".section \".got2\",\"aw\""), + ptext SLIT(".LCTOC1 = .+32768") + ] + +-- We generate one .long literal for every symbol we import; +-- the dynamic linker will relocate those addresses. + +pprImportedSymbol importedLbl + | Just (SymbolPtr, lbl) <- dynamicLinkerLabelInfo importedLbl + = vcat [ + ptext SLIT(".section \".got2\", \"aw\""), + ptext SLIT(".LC_") <> pprCLabel_asm lbl <> char ':', + ptext SLIT("\t.long") <+> pprCLabel_asm lbl + ] + +-- PLT code stubs are generated automatically be the dynamic linker. + | otherwise = empty + +#else + +-- For all other currently supported platforms, we don't need to do +-- anything at all. 
+ +needImportedSymbols = False +pprGotDeclaration = Pretty.empty +pprImportedSymbol _ = empty +#endif + +-- ------------------------------------------------------------------- + +-- Generate code to calculate the address that should be put in the +-- PIC base register. +-- This is called by MachCodeGen for every CmmProc that accessed the +-- PIC base register. It adds the appropriate instructions to the +-- top of the CmmProc. + +-- It is assumed that the first NatCmmTop in the input list is a Proc +-- and the rest are CmmDatas. + +initializePicBase :: Reg -> [NatCmmTop] -> NatM [NatCmmTop] + +#if darwin_TARGET_OS + +-- Darwin is simple: just fetch the address of a local label. +-- The FETCHPC pseudo-instruction is expanded to multiple instructions +-- during pretty-printing so that we don't have to deal with the +-- local label: + +-- PowerPC version: +-- bcl 20,31,1f. +-- 1: mflr picReg + +-- i386 version: +-- call 1f +-- 1: popl %picReg + +initializePicBase picReg (CmmProc info lab params blocks : statics) + = return (CmmProc info lab params (b':tail blocks) : statics) + where BasicBlock bID insns = head blocks + b' = BasicBlock bID (FETCHPC picReg : insns) + +#elif powerpc_TARGET_ARCH && linux_TARGET_OS + +-- Get a pointer to our own fake GOT, which is defined on a per-module basis. +-- This is exactly how GCC does it, and it's quite horrible: +-- We first fetch the address of a local label (mkPicBaseLabel). +-- Then we add a 16-bit offset to that to get the address of a .long that we +-- define in .text space right next to the proc. This .long literal contains +-- the (32-bit) offset from our local label to our global offset table +-- (.LCTOC1 aka gotOffLabel). +initializePicBase picReg + (CmmProc info lab params blocks : statics) + = do + gotOffLabel <- getNewLabelNat + tmp <- getNewRegNat wordRep + let + gotOffset = CmmData Text [ + CmmDataLabel gotOffLabel, + CmmStaticLit (CmmLabelDiffOff gotLabel + mkPicBaseLabel + 0) + ] + offsetToOffset = ImmConstantDiff (ImmCLbl gotOffLabel) + (ImmCLbl mkPicBaseLabel) + BasicBlock bID insns = head blocks + b' = BasicBlock bID (FETCHPC picReg + : LD wordRep tmp + (AddrRegImm picReg offsetToOffset) + : ADD picReg picReg (RIReg tmp) + : insns) + return (CmmProc info lab params (b' : tail blocks) : gotOffset : statics) +#elif i386_TARGET_ARCH && linux_TARGET_OS + +-- We cheat a bit here by defining a pseudo-instruction named FETCHGOT +-- which pretty-prints as: +-- call 1f +-- 1: popl %picReg +-- addl __GLOBAL_OFFSET_TABLE__+.-1b, %picReg +-- (See PprMach.lhs) + +initializePicBase picReg (CmmProc info lab params blocks : statics) + = return (CmmProc info lab params (b':tail blocks) : statics) + where BasicBlock bID insns = head blocks + b' = BasicBlock bID (FETCHGOT picReg : insns) + +#else +initializePicBase picReg proc = panic "initializePicBase" + +-- mingw32_TARGET_OS: not needed, won't be called +#endif diff --git a/compiler/nativeGen/PprMach.hs b/compiler/nativeGen/PprMach.hs new file mode 100644 index 0000000000..afa5bcd872 --- /dev/null +++ b/compiler/nativeGen/PprMach.hs @@ -0,0 +1,2454 @@ +----------------------------------------------------------------------------- +-- +-- Pretty-printing assembly language +-- +-- (c) The University of Glasgow 1993-2005 +-- +----------------------------------------------------------------------------- + +-- We start with the @pprXXX@s with some cross-platform commonality +-- (e.g., 'pprReg'); we conclude with the no-commonality monster, +-- 'pprInstr'. 
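-- In outline (a sketch of the structure below, not new functionality):
-- pprNatCmmTop emits the section header and any info table, then the basic
-- blocks; pprBasicBlock prints each block as its temporary label followed by
--   vcat (map pprInstr instrs)
-- so pprInstr is where all the per-instruction, per-architecture work lives.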
+ +#include "nativeGen/NCG.h" + +module PprMach ( + pprNatCmmTop, pprBasicBlock, + pprInstr, pprSize, pprUserReg, + ) where + + +#include "HsVersions.h" + +import Cmm +import MachOp ( MachRep(..), wordRep, isFloatingRep ) +import MachRegs -- may differ per-platform +import MachInstrs + +import CLabel ( CLabel, pprCLabel, externallyVisibleCLabel, + labelDynamic, mkAsmTempLabel, entryLblToInfoLbl ) +#if HAVE_SUBSECTIONS_VIA_SYMBOLS +import CLabel ( mkDeadStripPreventer ) +#endif + +import Panic ( panic ) +import Unique ( pprUnique ) +import Pretty +import FastString +import qualified Outputable + +import StaticFlags ( opt_PIC, opt_Static ) + +#if __GLASGOW_HASKELL__ >= 504 +import Data.Array.ST +import Data.Word ( Word8 ) +#else +import MutableArray +#endif + +import MONAD_ST +import Char ( chr, ord ) +import Maybe ( isJust ) + +#if powerpc_TARGET_ARCH || darwin_TARGET_OS +import DATA_WORD(Word32) +import DATA_BITS +#endif + +-- ----------------------------------------------------------------------------- +-- Printing this stuff out + +asmSDoc d = Outputable.withPprStyleDoc ( + Outputable.mkCodeStyle Outputable.AsmStyle) d +pprCLabel_asm l = asmSDoc (pprCLabel l) + +pprNatCmmTop :: NatCmmTop -> Doc +pprNatCmmTop (CmmData section dats) = + pprSectionHeader section $$ vcat (map pprData dats) + + -- special case for split markers: +pprNatCmmTop (CmmProc [] lbl _ []) = pprLabel lbl + +pprNatCmmTop (CmmProc info lbl params blocks) = + pprSectionHeader Text $$ + (if not (null info) + then +#if HAVE_SUBSECTIONS_VIA_SYMBOLS + pprCLabel_asm (mkDeadStripPreventer $ entryLblToInfoLbl lbl) + <> char ':' $$ +#endif + vcat (map pprData info) $$ + pprLabel (entryLblToInfoLbl lbl) + else empty) $$ + (case blocks of + [] -> empty + (BasicBlock _ instrs : rest) -> + (if null info then pprLabel lbl else empty) $$ + -- the first block doesn't get a label: + vcat (map pprInstr instrs) $$ + vcat (map pprBasicBlock rest) + ) +#if HAVE_SUBSECTIONS_VIA_SYMBOLS + -- If we are using the .subsections_via_symbols directive + -- (available on recent versions of Darwin), + -- we have to make sure that there is some kind of reference + -- from the entry code to a label on the _top_ of of the info table, + -- so that the linker will not think it is unreferenced and dead-strip + -- it. That's why the label is called a DeadStripPreventer (_dsp). + $$ if not (null info) + then text "\t.long " + <+> pprCLabel_asm (entryLblToInfoLbl lbl) + <+> char '-' + <+> pprCLabel_asm (mkDeadStripPreventer $ entryLblToInfoLbl lbl) + else empty +#endif + + +pprBasicBlock :: NatBasicBlock -> Doc +pprBasicBlock (BasicBlock (BlockId id) instrs) = + pprLabel (mkAsmTempLabel id) $$ + vcat (map pprInstr instrs) + +-- ----------------------------------------------------------------------------- +-- pprReg: print a 'Reg' + +-- For x86, the way we print a register name depends +-- on which bit of it we care about. Yurgh. 
+ +pprUserReg :: Reg -> Doc +pprUserReg = pprReg IF_ARCH_i386(I32,) IF_ARCH_x86_64(I64,) + +pprReg :: IF_ARCH_i386(MachRep ->,) IF_ARCH_x86_64(MachRep ->,) Reg -> Doc + +pprReg IF_ARCH_i386(s,) IF_ARCH_x86_64(s,) r + = case r of + RealReg i -> ppr_reg_no IF_ARCH_i386(s,) IF_ARCH_x86_64(s,) i + VirtualRegI u -> text "%vI_" <> asmSDoc (pprUnique u) + VirtualRegHi u -> text "%vHi_" <> asmSDoc (pprUnique u) + VirtualRegF u -> text "%vF_" <> asmSDoc (pprUnique u) + VirtualRegD u -> text "%vD_" <> asmSDoc (pprUnique u) + where +#if alpha_TARGET_ARCH + ppr_reg_no :: Int -> Doc + ppr_reg_no i = ptext + (case i of { + 0 -> SLIT("$0"); 1 -> SLIT("$1"); + 2 -> SLIT("$2"); 3 -> SLIT("$3"); + 4 -> SLIT("$4"); 5 -> SLIT("$5"); + 6 -> SLIT("$6"); 7 -> SLIT("$7"); + 8 -> SLIT("$8"); 9 -> SLIT("$9"); + 10 -> SLIT("$10"); 11 -> SLIT("$11"); + 12 -> SLIT("$12"); 13 -> SLIT("$13"); + 14 -> SLIT("$14"); 15 -> SLIT("$15"); + 16 -> SLIT("$16"); 17 -> SLIT("$17"); + 18 -> SLIT("$18"); 19 -> SLIT("$19"); + 20 -> SLIT("$20"); 21 -> SLIT("$21"); + 22 -> SLIT("$22"); 23 -> SLIT("$23"); + 24 -> SLIT("$24"); 25 -> SLIT("$25"); + 26 -> SLIT("$26"); 27 -> SLIT("$27"); + 28 -> SLIT("$28"); 29 -> SLIT("$29"); + 30 -> SLIT("$30"); 31 -> SLIT("$31"); + 32 -> SLIT("$f0"); 33 -> SLIT("$f1"); + 34 -> SLIT("$f2"); 35 -> SLIT("$f3"); + 36 -> SLIT("$f4"); 37 -> SLIT("$f5"); + 38 -> SLIT("$f6"); 39 -> SLIT("$f7"); + 40 -> SLIT("$f8"); 41 -> SLIT("$f9"); + 42 -> SLIT("$f10"); 43 -> SLIT("$f11"); + 44 -> SLIT("$f12"); 45 -> SLIT("$f13"); + 46 -> SLIT("$f14"); 47 -> SLIT("$f15"); + 48 -> SLIT("$f16"); 49 -> SLIT("$f17"); + 50 -> SLIT("$f18"); 51 -> SLIT("$f19"); + 52 -> SLIT("$f20"); 53 -> SLIT("$f21"); + 54 -> SLIT("$f22"); 55 -> SLIT("$f23"); + 56 -> SLIT("$f24"); 57 -> SLIT("$f25"); + 58 -> SLIT("$f26"); 59 -> SLIT("$f27"); + 60 -> SLIT("$f28"); 61 -> SLIT("$f29"); + 62 -> SLIT("$f30"); 63 -> SLIT("$f31"); + _ -> SLIT("very naughty alpha register") + }) +#endif +#if i386_TARGET_ARCH + ppr_reg_no :: MachRep -> Int -> Doc + ppr_reg_no I8 = ppr_reg_byte + ppr_reg_no I16 = ppr_reg_word + ppr_reg_no _ = ppr_reg_long + + ppr_reg_byte i = ptext + (case i of { + 0 -> SLIT("%al"); 1 -> SLIT("%bl"); + 2 -> SLIT("%cl"); 3 -> SLIT("%dl"); + _ -> SLIT("very naughty I386 byte register") + }) + + ppr_reg_word i = ptext + (case i of { + 0 -> SLIT("%ax"); 1 -> SLIT("%bx"); + 2 -> SLIT("%cx"); 3 -> SLIT("%dx"); + 4 -> SLIT("%si"); 5 -> SLIT("%di"); + 6 -> SLIT("%bp"); 7 -> SLIT("%sp"); + _ -> SLIT("very naughty I386 word register") + }) + + ppr_reg_long i = ptext + (case i of { + 0 -> SLIT("%eax"); 1 -> SLIT("%ebx"); + 2 -> SLIT("%ecx"); 3 -> SLIT("%edx"); + 4 -> SLIT("%esi"); 5 -> SLIT("%edi"); + 6 -> SLIT("%ebp"); 7 -> SLIT("%esp"); + 8 -> SLIT("%fake0"); 9 -> SLIT("%fake1"); + 10 -> SLIT("%fake2"); 11 -> SLIT("%fake3"); + 12 -> SLIT("%fake4"); 13 -> SLIT("%fake5"); + _ -> SLIT("very naughty I386 register") + }) +#endif + +#if x86_64_TARGET_ARCH + ppr_reg_no :: MachRep -> Int -> Doc + ppr_reg_no I8 = ppr_reg_byte + ppr_reg_no I16 = ppr_reg_word + ppr_reg_no I32 = ppr_reg_long + ppr_reg_no _ = ppr_reg_quad + + ppr_reg_byte i = ptext + (case i of { + 0 -> SLIT("%al"); 1 -> SLIT("%bl"); + 2 -> SLIT("%cl"); 3 -> SLIT("%dl"); + 4 -> SLIT("%sil"); 5 -> SLIT("%dil"); -- new 8-bit regs! 
+ 6 -> SLIT("%bpl"); 7 -> SLIT("%spl"); + 8 -> SLIT("%r8b"); 9 -> SLIT("%r9b"); + 10 -> SLIT("%r10b"); 11 -> SLIT("%r11b"); + 12 -> SLIT("%r12b"); 13 -> SLIT("%r13b"); + 14 -> SLIT("%r14b"); 15 -> SLIT("%r15b"); + _ -> SLIT("very naughty x86_64 byte register") + }) + + ppr_reg_word i = ptext + (case i of { + 0 -> SLIT("%ax"); 1 -> SLIT("%bx"); + 2 -> SLIT("%cx"); 3 -> SLIT("%dx"); + 4 -> SLIT("%si"); 5 -> SLIT("%di"); + 6 -> SLIT("%bp"); 7 -> SLIT("%sp"); + 8 -> SLIT("%r8w"); 9 -> SLIT("%r9w"); + 10 -> SLIT("%r10w"); 11 -> SLIT("%r11w"); + 12 -> SLIT("%r12w"); 13 -> SLIT("%r13w"); + 14 -> SLIT("%r14w"); 15 -> SLIT("%r15w"); + _ -> SLIT("very naughty x86_64 word register") + }) + + ppr_reg_long i = ptext + (case i of { + 0 -> SLIT("%eax"); 1 -> SLIT("%ebx"); + 2 -> SLIT("%ecx"); 3 -> SLIT("%edx"); + 4 -> SLIT("%esi"); 5 -> SLIT("%edi"); + 6 -> SLIT("%ebp"); 7 -> SLIT("%esp"); + 8 -> SLIT("%r8d"); 9 -> SLIT("%r9d"); + 10 -> SLIT("%r10d"); 11 -> SLIT("%r11d"); + 12 -> SLIT("%r12d"); 13 -> SLIT("%r13d"); + 14 -> SLIT("%r14d"); 15 -> SLIT("%r15d"); + _ -> SLIT("very naughty x86_64 register") + }) + + ppr_reg_quad i = ptext + (case i of { + 0 -> SLIT("%rax"); 1 -> SLIT("%rbx"); + 2 -> SLIT("%rcx"); 3 -> SLIT("%rdx"); + 4 -> SLIT("%rsi"); 5 -> SLIT("%rdi"); + 6 -> SLIT("%rbp"); 7 -> SLIT("%rsp"); + 8 -> SLIT("%r8"); 9 -> SLIT("%r9"); + 10 -> SLIT("%r10"); 11 -> SLIT("%r11"); + 12 -> SLIT("%r12"); 13 -> SLIT("%r13"); + 14 -> SLIT("%r14"); 15 -> SLIT("%r15"); + 16 -> SLIT("%xmm0"); 17 -> SLIT("%xmm1"); + 18 -> SLIT("%xmm2"); 19 -> SLIT("%xmm3"); + 20 -> SLIT("%xmm4"); 21 -> SLIT("%xmm5"); + 22 -> SLIT("%xmm6"); 23 -> SLIT("%xmm7"); + 24 -> SLIT("%xmm8"); 25 -> SLIT("%xmm9"); + 26 -> SLIT("%xmm10"); 27 -> SLIT("%xmm11"); + 28 -> SLIT("%xmm12"); 29 -> SLIT("%xmm13"); + 30 -> SLIT("%xmm14"); 31 -> SLIT("%xmm15"); + _ -> SLIT("very naughty x86_64 register") + }) +#endif + +#if sparc_TARGET_ARCH + ppr_reg_no :: Int -> Doc + ppr_reg_no i = ptext + (case i of { + 0 -> SLIT("%g0"); 1 -> SLIT("%g1"); + 2 -> SLIT("%g2"); 3 -> SLIT("%g3"); + 4 -> SLIT("%g4"); 5 -> SLIT("%g5"); + 6 -> SLIT("%g6"); 7 -> SLIT("%g7"); + 8 -> SLIT("%o0"); 9 -> SLIT("%o1"); + 10 -> SLIT("%o2"); 11 -> SLIT("%o3"); + 12 -> SLIT("%o4"); 13 -> SLIT("%o5"); + 14 -> SLIT("%o6"); 15 -> SLIT("%o7"); + 16 -> SLIT("%l0"); 17 -> SLIT("%l1"); + 18 -> SLIT("%l2"); 19 -> SLIT("%l3"); + 20 -> SLIT("%l4"); 21 -> SLIT("%l5"); + 22 -> SLIT("%l6"); 23 -> SLIT("%l7"); + 24 -> SLIT("%i0"); 25 -> SLIT("%i1"); + 26 -> SLIT("%i2"); 27 -> SLIT("%i3"); + 28 -> SLIT("%i4"); 29 -> SLIT("%i5"); + 30 -> SLIT("%i6"); 31 -> SLIT("%i7"); + 32 -> SLIT("%f0"); 33 -> SLIT("%f1"); + 34 -> SLIT("%f2"); 35 -> SLIT("%f3"); + 36 -> SLIT("%f4"); 37 -> SLIT("%f5"); + 38 -> SLIT("%f6"); 39 -> SLIT("%f7"); + 40 -> SLIT("%f8"); 41 -> SLIT("%f9"); + 42 -> SLIT("%f10"); 43 -> SLIT("%f11"); + 44 -> SLIT("%f12"); 45 -> SLIT("%f13"); + 46 -> SLIT("%f14"); 47 -> SLIT("%f15"); + 48 -> SLIT("%f16"); 49 -> SLIT("%f17"); + 50 -> SLIT("%f18"); 51 -> SLIT("%f19"); + 52 -> SLIT("%f20"); 53 -> SLIT("%f21"); + 54 -> SLIT("%f22"); 55 -> SLIT("%f23"); + 56 -> SLIT("%f24"); 57 -> SLIT("%f25"); + 58 -> SLIT("%f26"); 59 -> SLIT("%f27"); + 60 -> SLIT("%f28"); 61 -> SLIT("%f29"); + 62 -> SLIT("%f30"); 63 -> SLIT("%f31"); + _ -> SLIT("very naughty sparc register") + }) +#endif +#if powerpc_TARGET_ARCH +#if darwin_TARGET_OS + ppr_reg_no :: Int -> Doc + ppr_reg_no i = ptext + (case i of { + 0 -> SLIT("r0"); 1 -> SLIT("r1"); + 2 -> SLIT("r2"); 3 -> SLIT("r3"); + 4 -> SLIT("r4"); 5 -> SLIT("r5"); + 6 -> 
SLIT("r6"); 7 -> SLIT("r7"); + 8 -> SLIT("r8"); 9 -> SLIT("r9"); + 10 -> SLIT("r10"); 11 -> SLIT("r11"); + 12 -> SLIT("r12"); 13 -> SLIT("r13"); + 14 -> SLIT("r14"); 15 -> SLIT("r15"); + 16 -> SLIT("r16"); 17 -> SLIT("r17"); + 18 -> SLIT("r18"); 19 -> SLIT("r19"); + 20 -> SLIT("r20"); 21 -> SLIT("r21"); + 22 -> SLIT("r22"); 23 -> SLIT("r23"); + 24 -> SLIT("r24"); 25 -> SLIT("r25"); + 26 -> SLIT("r26"); 27 -> SLIT("r27"); + 28 -> SLIT("r28"); 29 -> SLIT("r29"); + 30 -> SLIT("r30"); 31 -> SLIT("r31"); + 32 -> SLIT("f0"); 33 -> SLIT("f1"); + 34 -> SLIT("f2"); 35 -> SLIT("f3"); + 36 -> SLIT("f4"); 37 -> SLIT("f5"); + 38 -> SLIT("f6"); 39 -> SLIT("f7"); + 40 -> SLIT("f8"); 41 -> SLIT("f9"); + 42 -> SLIT("f10"); 43 -> SLIT("f11"); + 44 -> SLIT("f12"); 45 -> SLIT("f13"); + 46 -> SLIT("f14"); 47 -> SLIT("f15"); + 48 -> SLIT("f16"); 49 -> SLIT("f17"); + 50 -> SLIT("f18"); 51 -> SLIT("f19"); + 52 -> SLIT("f20"); 53 -> SLIT("f21"); + 54 -> SLIT("f22"); 55 -> SLIT("f23"); + 56 -> SLIT("f24"); 57 -> SLIT("f25"); + 58 -> SLIT("f26"); 59 -> SLIT("f27"); + 60 -> SLIT("f28"); 61 -> SLIT("f29"); + 62 -> SLIT("f30"); 63 -> SLIT("f31"); + _ -> SLIT("very naughty powerpc register") + }) +#else + ppr_reg_no :: Int -> Doc + ppr_reg_no i | i <= 31 = int i -- GPRs + | i <= 63 = int (i-32) -- FPRs + | otherwise = ptext SLIT("very naughty powerpc register") +#endif +#endif + + +-- ----------------------------------------------------------------------------- +-- pprSize: print a 'Size' + +#if powerpc_TARGET_ARCH || i386_TARGET_ARCH || x86_64_TARGET_ARCH || sparc_TARGET_ARCH +pprSize :: MachRep -> Doc +#else +pprSize :: Size -> Doc +#endif + +pprSize x = ptext (case x of +#if alpha_TARGET_ARCH + B -> SLIT("b") + Bu -> SLIT("bu") +-- W -> SLIT("w") UNUSED +-- Wu -> SLIT("wu") UNUSED + L -> SLIT("l") + Q -> SLIT("q") +-- FF -> SLIT("f") UNUSED +-- DF -> SLIT("d") UNUSED +-- GF -> SLIT("g") UNUSED +-- SF -> SLIT("s") UNUSED + TF -> SLIT("t") +#endif +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + I8 -> SLIT("b") + I16 -> SLIT("w") + I32 -> SLIT("l") + I64 -> SLIT("q") +#endif +#if i386_TARGET_ARCH + F32 -> SLIT("s") + F64 -> SLIT("l") + F80 -> SLIT("t") +#endif +#if x86_64_TARGET_ARCH + F32 -> SLIT("ss") -- "scalar single-precision float" (SSE2) + F64 -> SLIT("sd") -- "scalar double-precision float" (SSE2) +#endif +#if sparc_TARGET_ARCH + I8 -> SLIT("sb") + I16 -> SLIT("sh") + I32 -> SLIT("") + F32 -> SLIT("") + F64 -> SLIT("d") + ) +pprStSize :: MachRep -> Doc +pprStSize x = ptext (case x of + I8 -> SLIT("b") + I16 -> SLIT("h") + I32 -> SLIT("") + F32 -> SLIT("") + F64 -> SLIT("d") +#endif +#if powerpc_TARGET_ARCH + I8 -> SLIT("b") + I16 -> SLIT("h") + I32 -> SLIT("w") + F32 -> SLIT("fs") + F64 -> SLIT("fd") +#endif + ) + +-- ----------------------------------------------------------------------------- +-- pprCond: print a 'Cond' + +pprCond :: Cond -> Doc + +pprCond c = ptext (case c of { +#if alpha_TARGET_ARCH + EQQ -> SLIT("eq"); + LTT -> SLIT("lt"); + LE -> SLIT("le"); + ULT -> SLIT("ult"); + ULE -> SLIT("ule"); + NE -> SLIT("ne"); + GTT -> SLIT("gt"); + GE -> SLIT("ge") +#endif +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + GEU -> SLIT("ae"); LU -> SLIT("b"); + EQQ -> SLIT("e"); GTT -> SLIT("g"); + GE -> SLIT("ge"); GU -> SLIT("a"); + LTT -> SLIT("l"); LE -> SLIT("le"); + LEU -> SLIT("be"); NE -> SLIT("ne"); + NEG -> SLIT("s"); POS -> SLIT("ns"); + CARRY -> SLIT("c"); OFLO -> SLIT("o"); + PARITY -> SLIT("p"); NOTPARITY -> SLIT("np"); + ALWAYS -> SLIT("mp") -- hack +#endif +#if sparc_TARGET_ARCH + ALWAYS -> SLIT(""); 
NEVER -> SLIT("n"); + GEU -> SLIT("geu"); LU -> SLIT("lu"); + EQQ -> SLIT("e"); GTT -> SLIT("g"); + GE -> SLIT("ge"); GU -> SLIT("gu"); + LTT -> SLIT("l"); LE -> SLIT("le"); + LEU -> SLIT("leu"); NE -> SLIT("ne"); + NEG -> SLIT("neg"); POS -> SLIT("pos"); + VC -> SLIT("vc"); VS -> SLIT("vs") +#endif +#if powerpc_TARGET_ARCH + ALWAYS -> SLIT(""); + EQQ -> SLIT("eq"); NE -> SLIT("ne"); + LTT -> SLIT("lt"); GE -> SLIT("ge"); + GTT -> SLIT("gt"); LE -> SLIT("le"); + LU -> SLIT("lt"); GEU -> SLIT("ge"); + GU -> SLIT("gt"); LEU -> SLIT("le"); +#endif + }) + + +-- ----------------------------------------------------------------------------- +-- pprImm: print an 'Imm' + +pprImm :: Imm -> Doc + +pprImm (ImmInt i) = int i +pprImm (ImmInteger i) = integer i +pprImm (ImmCLbl l) = pprCLabel_asm l +pprImm (ImmIndex l i) = pprCLabel_asm l <> char '+' <> int i +pprImm (ImmLit s) = s + +pprImm (ImmFloat _) = ptext SLIT("naughty float immediate") +pprImm (ImmDouble _) = ptext SLIT("naughty double immediate") + +pprImm (ImmConstantSum a b) = pprImm a <> char '+' <> pprImm b +#if sparc_TARGET_ARCH +-- ToDo: This should really be fixed in the PIC support, but only +-- print a for now. +pprImm (ImmConstantDiff a b) = pprImm a +#else +pprImm (ImmConstantDiff a b) = pprImm a <> char '-' + <> lparen <> pprImm b <> rparen +#endif + +#if sparc_TARGET_ARCH +pprImm (LO i) + = hcat [ pp_lo, pprImm i, rparen ] + where + pp_lo = text "%lo(" + +pprImm (HI i) + = hcat [ pp_hi, pprImm i, rparen ] + where + pp_hi = text "%hi(" +#endif +#if powerpc_TARGET_ARCH +#if darwin_TARGET_OS +pprImm (LO i) + = hcat [ pp_lo, pprImm i, rparen ] + where + pp_lo = text "lo16(" + +pprImm (HI i) + = hcat [ pp_hi, pprImm i, rparen ] + where + pp_hi = text "hi16(" + +pprImm (HA i) + = hcat [ pp_ha, pprImm i, rparen ] + where + pp_ha = text "ha16(" + +#else +pprImm (LO i) + = pprImm i <> text "@l" + +pprImm (HI i) + = pprImm i <> text "@h" + +pprImm (HA i) + = pprImm i <> text "@ha" +#endif +#endif + + +-- ----------------------------------------------------------------------------- +-- @pprAddr: print an 'AddrMode' + +pprAddr :: AddrMode -> Doc + +#if alpha_TARGET_ARCH +pprAddr (AddrReg r) = parens (pprReg r) +pprAddr (AddrImm i) = pprImm i +pprAddr (AddrRegImm r1 i) + = (<>) (pprImm i) (parens (pprReg r1)) +#endif + +------------------- + +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH +pprAddr (ImmAddr imm off) + = let pp_imm = pprImm imm + in + if (off == 0) then + pp_imm + else if (off < 0) then + pp_imm <> int off + else + pp_imm <> char '+' <> int off + +pprAddr (AddrBaseIndex base index displacement) + = let + pp_disp = ppr_disp displacement + pp_off p = pp_disp <> char '(' <> p <> char ')' + pp_reg r = pprReg wordRep r + in + case (base,index) of + (EABaseNone, EAIndexNone) -> pp_disp + (EABaseReg b, EAIndexNone) -> pp_off (pp_reg b) + (EABaseRip, EAIndexNone) -> pp_off (ptext SLIT("%rip")) + (EABaseNone, EAIndex r i) -> pp_off (comma <> pp_reg r <> comma <> int i) + (EABaseReg b, EAIndex r i) -> pp_off (pp_reg b <> comma <> pp_reg r + <> comma <> int i) + where + ppr_disp (ImmInt 0) = empty + ppr_disp imm = pprImm imm +#endif + +------------------- + +#if sparc_TARGET_ARCH +pprAddr (AddrRegReg r1 (RealReg 0)) = pprReg r1 + +pprAddr (AddrRegReg r1 r2) + = hcat [ pprReg r1, char '+', pprReg r2 ] + +pprAddr (AddrRegImm r1 (ImmInt i)) + | i == 0 = pprReg r1 + | not (fits13Bits i) = largeOffsetError i + | otherwise = hcat [ pprReg r1, pp_sign, int i ] + where + pp_sign = if i > 0 then char '+' else empty + +pprAddr (AddrRegImm r1 (ImmInteger 
i)) + | i == 0 = pprReg r1 + | not (fits13Bits i) = largeOffsetError i + | otherwise = hcat [ pprReg r1, pp_sign, integer i ] + where + pp_sign = if i > 0 then char '+' else empty + +pprAddr (AddrRegImm r1 imm) + = hcat [ pprReg r1, char '+', pprImm imm ] +#endif + +------------------- + +#if powerpc_TARGET_ARCH +pprAddr (AddrRegReg r1 r2) + = pprReg r1 <+> ptext SLIT(", ") <+> pprReg r2 + +pprAddr (AddrRegImm r1 (ImmInt i)) = hcat [ int i, char '(', pprReg r1, char ')' ] +pprAddr (AddrRegImm r1 (ImmInteger i)) = hcat [ integer i, char '(', pprReg r1, char ')' ] +pprAddr (AddrRegImm r1 imm) = hcat [ pprImm imm, char '(', pprReg r1, char ')' ] +#endif + + +-- ----------------------------------------------------------------------------- +-- pprData: print a 'CmmStatic' + +pprSectionHeader Text + = ptext + IF_ARCH_alpha(SLIT("\t.text\n\t.align 3") {-word boundary-} + ,IF_ARCH_sparc(SLIT(".text\n\t.align 4") {-word boundary-} + ,IF_ARCH_i386(IF_OS_darwin(SLIT(".text\n\t.align 2"), + SLIT(".text\n\t.align 4,0x90")) + {-needs per-OS variation!-} + ,IF_ARCH_x86_64(SLIT(".text\n\t.align 8") {-needs per-OS variation!-} + ,IF_ARCH_powerpc(SLIT(".text\n.align 2") + ,))))) +pprSectionHeader Data + = ptext + IF_ARCH_alpha(SLIT("\t.data\n\t.align 3") + ,IF_ARCH_sparc(SLIT(".data\n\t.align 8") {-<8 will break double constants -} + ,IF_ARCH_i386(IF_OS_darwin(SLIT(".data\n\t.align 2"), + SLIT(".data\n\t.align 4")) + ,IF_ARCH_x86_64(SLIT(".data\n\t.align 8") + ,IF_ARCH_powerpc(SLIT(".data\n.align 2") + ,))))) +pprSectionHeader ReadOnlyData + = ptext + IF_ARCH_alpha(SLIT("\t.data\n\t.align 3") + ,IF_ARCH_sparc(SLIT(".data\n\t.align 8") {-<8 will break double constants -} + ,IF_ARCH_i386(IF_OS_darwin(SLIT(".const\n.align 2"), + SLIT(".section .rodata\n\t.align 4")) + ,IF_ARCH_x86_64(SLIT(".section .rodata\n\t.align 8") + ,IF_ARCH_powerpc(IF_OS_darwin(SLIT(".const\n.align 2"), + SLIT(".section .rodata\n\t.align 2")) + ,))))) +pprSectionHeader RelocatableReadOnlyData + = ptext + IF_ARCH_alpha(SLIT("\t.data\n\t.align 3") + ,IF_ARCH_sparc(SLIT(".data\n\t.align 8") {-<8 will break double constants -} + ,IF_ARCH_i386(IF_OS_darwin(SLIT(".const_data\n.align 2"), + SLIT(".section .rodata\n\t.align 4")) + ,IF_ARCH_x86_64(SLIT(".section .rodata\n\t.align 8") + ,IF_ARCH_powerpc(IF_OS_darwin(SLIT(".const_data\n.align 2"), + SLIT(".data\n\t.align 2")) + ,))))) +pprSectionHeader UninitialisedData + = ptext + IF_ARCH_alpha(SLIT("\t.bss\n\t.align 3") + ,IF_ARCH_sparc(SLIT(".bss\n\t.align 8") {-<8 will break double constants -} + ,IF_ARCH_i386(IF_OS_darwin(SLIT(".const_data\n\t.align 2"), + SLIT(".section .bss\n\t.align 4")) + ,IF_ARCH_x86_64(SLIT(".section .bss\n\t.align 8") + ,IF_ARCH_powerpc(IF_OS_darwin(SLIT(".const_data\n.align 2"), + SLIT(".section .bss\n\t.align 2")) + ,))))) +pprSectionHeader ReadOnlyData16 + = ptext + IF_ARCH_alpha(SLIT("\t.data\n\t.align 4") + ,IF_ARCH_sparc(SLIT(".data\n\t.align 16") + ,IF_ARCH_i386(IF_OS_darwin(SLIT(".const\n.align 4"), + SLIT(".section .rodata\n\t.align 16")) + ,IF_ARCH_x86_64(SLIT(".section .rodata.cst16\n\t.align 16") + ,IF_ARCH_powerpc(IF_OS_darwin(SLIT(".const\n.align 4"), + SLIT(".section .rodata\n\t.align 4")) + ,))))) + +pprSectionHeader (OtherSection sec) + = panic "PprMach.pprSectionHeader: unknown section" + +pprData :: CmmStatic -> Doc +pprData (CmmAlign bytes) = pprAlign bytes +pprData (CmmDataLabel lbl) = pprLabel lbl +pprData (CmmString str) = pprASCII str +pprData (CmmUninitialised bytes) = ptext SLIT(".space ") <> int bytes +pprData (CmmStaticLit lit) = pprDataItem 
lit + +pprGloblDecl :: CLabel -> Doc +pprGloblDecl lbl + | not (externallyVisibleCLabel lbl) = empty + | otherwise = ptext IF_ARCH_sparc(SLIT(".global "), + SLIT(".globl ")) <> + pprCLabel_asm lbl + +pprLabel :: CLabel -> Doc +pprLabel lbl = pprGloblDecl lbl $$ (pprCLabel_asm lbl <> char ':') + + +pprASCII str + = vcat (map do1 str) $$ do1 0 + where + do1 :: Word8 -> Doc + do1 w = ptext SLIT("\t.byte\t") <> int (fromIntegral w) + +pprAlign bytes = + IF_ARCH_alpha(ptextSLIT(".align ") <> int pow2, + IF_ARCH_i386(ptext SLIT(".align ") <> int IF_OS_darwin(pow2,bytes), + IF_ARCH_x86_64(ptext SLIT(".align ") <> int bytes, + IF_ARCH_sparc(ptext SLIT(".align ") <> int bytes, + IF_ARCH_powerpc(ptext SLIT(".align ") <> int pow2,))))) + where + pow2 = log2 bytes + + log2 :: Int -> Int -- cache the common ones + log2 1 = 0 + log2 2 = 1 + log2 4 = 2 + log2 8 = 3 + log2 n = 1 + log2 (n `quot` 2) + + +pprDataItem :: CmmLit -> Doc +pprDataItem lit + = vcat (ppr_item (cmmLitRep lit) lit) + where + imm = litToImm lit + + -- These seem to be common: + ppr_item I8 x = [ptext SLIT("\t.byte\t") <> pprImm imm] + ppr_item I32 x = [ptext SLIT("\t.long\t") <> pprImm imm] + ppr_item F32 (CmmFloat r _) + = let bs = floatToBytes (fromRational r) + in map (\b -> ptext SLIT("\t.byte\t") <> pprImm (ImmInt b)) bs + ppr_item F64 (CmmFloat r _) + = let bs = doubleToBytes (fromRational r) + in map (\b -> ptext SLIT("\t.byte\t") <> pprImm (ImmInt b)) bs + +#if sparc_TARGET_ARCH + -- copy n paste of x86 version + ppr_item I16 x = [ptext SLIT("\t.short\t") <> pprImm imm] + ppr_item I64 x = [ptext SLIT("\t.quad\t") <> pprImm imm] +#endif +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + ppr_item I16 x = [ptext SLIT("\t.word\t") <> pprImm imm] +#endif +#if i386_TARGET_ARCH && darwin_TARGET_OS + ppr_item I64 (CmmInt x _) = + [ptext SLIT("\t.long\t") + <> int (fromIntegral (fromIntegral x :: Word32)), + ptext SLIT("\t.long\t") + <> int (fromIntegral + (fromIntegral (x `shiftR` 32) :: Word32))] +#endif +#if i386_TARGET_ARCH + ppr_item I64 x = [ptext SLIT("\t.quad\t") <> pprImm imm] +#endif +#if x86_64_TARGET_ARCH + -- x86_64: binutils can't handle the R_X86_64_PC64 relocation + -- type, which means we can't do pc-relative 64-bit addresses. + -- Fortunately we're assuming the small memory model, in which + -- all such offsets will fit into 32 bits, so we have to stick + -- to 32-bit offset fields and modify the RTS appropriately + -- (see InfoTables.h). + -- + ppr_item I64 x + | isRelativeReloc x = + [ptext SLIT("\t.long\t") <> pprImm imm, + ptext SLIT("\t.long\t0")] + | otherwise = + [ptext SLIT("\t.quad\t") <> pprImm imm] + where + isRelativeReloc (CmmLabelOff _ _) = True + isRelativeReloc (CmmLabelDiffOff _ _ _) = True + isRelativeReloc _ = False +#endif +#if powerpc_TARGET_ARCH + ppr_item I16 x = [ptext SLIT("\t.short\t") <> pprImm imm] + ppr_item I64 (CmmInt x _) = + [ptext SLIT("\t.long\t") + <> int (fromIntegral + (fromIntegral (x `shiftR` 32) :: Word32)), + ptext SLIT("\t.long\t") + <> int (fromIntegral (fromIntegral x :: Word32))] +#endif + +-- fall through to rest of (machine-specific) pprInstr... + +-- ----------------------------------------------------------------------------- +-- pprInstr: print an 'Instr' + +pprInstr :: Instr -> Doc + +--pprInstr (COMMENT s) = empty -- nuke 'em +pprInstr (COMMENT s) + = IF_ARCH_alpha( ((<>) (ptext SLIT("\t# ")) (ftext s)) + ,IF_ARCH_sparc( ((<>) (ptext SLIT("! 
")) (ftext s)) + ,IF_ARCH_i386( ((<>) (ptext SLIT("# ")) (ftext s)) + ,IF_ARCH_x86_64( ((<>) (ptext SLIT("# ")) (ftext s)) + ,IF_ARCH_powerpc( IF_OS_linux( + ((<>) (ptext SLIT("# ")) (ftext s)), + ((<>) (ptext SLIT("; ")) (ftext s))) + ,))))) + +pprInstr (DELTA d) + = pprInstr (COMMENT (mkFastString ("\tdelta = " ++ show d))) + +pprInstr (NEWBLOCK _) + = panic "PprMach.pprInstr: NEWBLOCK" + +pprInstr (LDATA _ _) + = panic "PprMach.pprInstr: LDATA" + +-- ----------------------------------------------------------------------------- +-- pprInstr for an Alpha + +#if alpha_TARGET_ARCH + +pprInstr (LD size reg addr) + = hcat [ + ptext SLIT("\tld"), + pprSize size, + char '\t', + pprReg reg, + comma, + pprAddr addr + ] + +pprInstr (LDA reg addr) + = hcat [ + ptext SLIT("\tlda\t"), + pprReg reg, + comma, + pprAddr addr + ] + +pprInstr (LDAH reg addr) + = hcat [ + ptext SLIT("\tldah\t"), + pprReg reg, + comma, + pprAddr addr + ] + +pprInstr (LDGP reg addr) + = hcat [ + ptext SLIT("\tldgp\t"), + pprReg reg, + comma, + pprAddr addr + ] + +pprInstr (LDI size reg imm) + = hcat [ + ptext SLIT("\tldi"), + pprSize size, + char '\t', + pprReg reg, + comma, + pprImm imm + ] + +pprInstr (ST size reg addr) + = hcat [ + ptext SLIT("\tst"), + pprSize size, + char '\t', + pprReg reg, + comma, + pprAddr addr + ] + +pprInstr (CLR reg) + = hcat [ + ptext SLIT("\tclr\t"), + pprReg reg + ] + +pprInstr (ABS size ri reg) + = hcat [ + ptext SLIT("\tabs"), + pprSize size, + char '\t', + pprRI ri, + comma, + pprReg reg + ] + +pprInstr (NEG size ov ri reg) + = hcat [ + ptext SLIT("\tneg"), + pprSize size, + if ov then ptext SLIT("v\t") else char '\t', + pprRI ri, + comma, + pprReg reg + ] + +pprInstr (ADD size ov reg1 ri reg2) + = hcat [ + ptext SLIT("\tadd"), + pprSize size, + if ov then ptext SLIT("v\t") else char '\t', + pprReg reg1, + comma, + pprRI ri, + comma, + pprReg reg2 + ] + +pprInstr (SADD size scale reg1 ri reg2) + = hcat [ + ptext (case scale of {{-UNUSED:L -> SLIT("\ts4");-} Q -> SLIT("\ts8")}), + ptext SLIT("add"), + pprSize size, + char '\t', + pprReg reg1, + comma, + pprRI ri, + comma, + pprReg reg2 + ] + +pprInstr (SUB size ov reg1 ri reg2) + = hcat [ + ptext SLIT("\tsub"), + pprSize size, + if ov then ptext SLIT("v\t") else char '\t', + pprReg reg1, + comma, + pprRI ri, + comma, + pprReg reg2 + ] + +pprInstr (SSUB size scale reg1 ri reg2) + = hcat [ + ptext (case scale of {{-UNUSED:L -> SLIT("\ts4");-} Q -> SLIT("\ts8")}), + ptext SLIT("sub"), + pprSize size, + char '\t', + pprReg reg1, + comma, + pprRI ri, + comma, + pprReg reg2 + ] + +pprInstr (MUL size ov reg1 ri reg2) + = hcat [ + ptext SLIT("\tmul"), + pprSize size, + if ov then ptext SLIT("v\t") else char '\t', + pprReg reg1, + comma, + pprRI ri, + comma, + pprReg reg2 + ] + +pprInstr (DIV size uns reg1 ri reg2) + = hcat [ + ptext SLIT("\tdiv"), + pprSize size, + if uns then ptext SLIT("u\t") else char '\t', + pprReg reg1, + comma, + pprRI ri, + comma, + pprReg reg2 + ] + +pprInstr (REM size uns reg1 ri reg2) + = hcat [ + ptext SLIT("\trem"), + pprSize size, + if uns then ptext SLIT("u\t") else char '\t', + pprReg reg1, + comma, + pprRI ri, + comma, + pprReg reg2 + ] + +pprInstr (NOT ri reg) + = hcat [ + ptext SLIT("\tnot"), + char '\t', + pprRI ri, + comma, + pprReg reg + ] + +pprInstr (AND reg1 ri reg2) = pprRegRIReg SLIT("and") reg1 ri reg2 +pprInstr (ANDNOT reg1 ri reg2) = pprRegRIReg SLIT("andnot") reg1 ri reg2 +pprInstr (OR reg1 ri reg2) = pprRegRIReg SLIT("or") reg1 ri reg2 +pprInstr (ORNOT reg1 ri reg2) = pprRegRIReg SLIT("ornot") reg1 ri 
reg2 +pprInstr (XOR reg1 ri reg2) = pprRegRIReg SLIT("xor") reg1 ri reg2 +pprInstr (XORNOT reg1 ri reg2) = pprRegRIReg SLIT("xornot") reg1 ri reg2 + +pprInstr (SLL reg1 ri reg2) = pprRegRIReg SLIT("sll") reg1 ri reg2 +pprInstr (SRL reg1 ri reg2) = pprRegRIReg SLIT("srl") reg1 ri reg2 +pprInstr (SRA reg1 ri reg2) = pprRegRIReg SLIT("sra") reg1 ri reg2 + +pprInstr (ZAP reg1 ri reg2) = pprRegRIReg SLIT("zap") reg1 ri reg2 +pprInstr (ZAPNOT reg1 ri reg2) = pprRegRIReg SLIT("zapnot") reg1 ri reg2 + +pprInstr (NOP) = ptext SLIT("\tnop") + +pprInstr (CMP cond reg1 ri reg2) + = hcat [ + ptext SLIT("\tcmp"), + pprCond cond, + char '\t', + pprReg reg1, + comma, + pprRI ri, + comma, + pprReg reg2 + ] + +pprInstr (FCLR reg) + = hcat [ + ptext SLIT("\tfclr\t"), + pprReg reg + ] + +pprInstr (FABS reg1 reg2) + = hcat [ + ptext SLIT("\tfabs\t"), + pprReg reg1, + comma, + pprReg reg2 + ] + +pprInstr (FNEG size reg1 reg2) + = hcat [ + ptext SLIT("\tneg"), + pprSize size, + char '\t', + pprReg reg1, + comma, + pprReg reg2 + ] + +pprInstr (FADD size reg1 reg2 reg3) = pprSizeRegRegReg SLIT("add") size reg1 reg2 reg3 +pprInstr (FDIV size reg1 reg2 reg3) = pprSizeRegRegReg SLIT("div") size reg1 reg2 reg3 +pprInstr (FMUL size reg1 reg2 reg3) = pprSizeRegRegReg SLIT("mul") size reg1 reg2 reg3 +pprInstr (FSUB size reg1 reg2 reg3) = pprSizeRegRegReg SLIT("sub") size reg1 reg2 reg3 + +pprInstr (CVTxy size1 size2 reg1 reg2) + = hcat [ + ptext SLIT("\tcvt"), + pprSize size1, + case size2 of {Q -> ptext SLIT("qc"); _ -> pprSize size2}, + char '\t', + pprReg reg1, + comma, + pprReg reg2 + ] + +pprInstr (FCMP size cond reg1 reg2 reg3) + = hcat [ + ptext SLIT("\tcmp"), + pprSize size, + pprCond cond, + char '\t', + pprReg reg1, + comma, + pprReg reg2, + comma, + pprReg reg3 + ] + +pprInstr (FMOV reg1 reg2) + = hcat [ + ptext SLIT("\tfmov\t"), + pprReg reg1, + comma, + pprReg reg2 + ] + +pprInstr (BI ALWAYS reg lab) = pprInstr (BR lab) + +pprInstr (BI NEVER reg lab) = empty + +pprInstr (BI cond reg lab) + = hcat [ + ptext SLIT("\tb"), + pprCond cond, + char '\t', + pprReg reg, + comma, + pprImm lab + ] + +pprInstr (BF cond reg lab) + = hcat [ + ptext SLIT("\tfb"), + pprCond cond, + char '\t', + pprReg reg, + comma, + pprImm lab + ] + +pprInstr (BR lab) + = (<>) (ptext SLIT("\tbr\t")) (pprImm lab) + +pprInstr (JMP reg addr hint) + = hcat [ + ptext SLIT("\tjmp\t"), + pprReg reg, + comma, + pprAddr addr, + comma, + int hint + ] + +pprInstr (BSR imm n) + = (<>) (ptext SLIT("\tbsr\t")) (pprImm imm) + +pprInstr (JSR reg addr n) + = hcat [ + ptext SLIT("\tjsr\t"), + pprReg reg, + comma, + pprAddr addr + ] + +pprInstr (FUNBEGIN clab) + = hcat [ + if (externallyVisibleCLabel clab) then + hcat [ptext SLIT("\t.globl\t"), pp_lab, char '\n'] + else + empty, + ptext SLIT("\t.ent "), + pp_lab, + char '\n', + pp_lab, + pp_ldgp, + pp_lab, + pp_frame + ] + where + pp_lab = pprCLabel_asm clab + + -- NEVER use commas within those string literals, cpp will ruin your day + pp_ldgp = hcat [ ptext SLIT(":\n\tldgp $29"), char ',', ptext SLIT("0($27)\n") ] + pp_frame = hcat [ ptext SLIT("..ng:\n\t.frame $30"), char ',', + ptext SLIT("4240"), char ',', + ptext SLIT("$26"), char ',', + ptext SLIT("0\n\t.prologue 1") ] + +pprInstr (FUNEND clab) + = (<>) (ptext SLIT("\t.align 4\n\t.end ")) (pprCLabel_asm clab) +\end{code} + +Continue with Alpha-only printing bits and bobs: +\begin{code} +pprRI :: RI -> Doc + +pprRI (RIReg r) = pprReg r +pprRI (RIImm r) = pprImm r + +pprRegRIReg :: LitString -> Reg -> RI -> Reg -> Doc +pprRegRIReg name reg1 ri reg2 + = 
hcat [ + char '\t', + ptext name, + char '\t', + pprReg reg1, + comma, + pprRI ri, + comma, + pprReg reg2 + ] + +pprSizeRegRegReg :: LitString -> Size -> Reg -> Reg -> Reg -> Doc +pprSizeRegRegReg name size reg1 reg2 reg3 + = hcat [ + char '\t', + ptext name, + pprSize size, + char '\t', + pprReg reg1, + comma, + pprReg reg2, + comma, + pprReg reg3 + ] + +#endif /* alpha_TARGET_ARCH */ + + +-- ----------------------------------------------------------------------------- +-- pprInstr for an x86 + +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + +pprInstr v@(MOV size s@(OpReg src) d@(OpReg dst)) -- hack + | src == dst + = +#if 0 /* #ifdef DEBUG */ + (<>) (ptext SLIT("# warning: ")) (pprSizeOpOp SLIT("mov") size s d) +#else + empty +#endif + +pprInstr (MOV size src dst) + = pprSizeOpOp SLIT("mov") size src dst + +pprInstr (MOVZxL I32 src dst) = pprSizeOpOp SLIT("mov") I32 src dst + -- 32-to-64 bit zero extension on x86_64 is accomplished by a simple + -- movl. But we represent it as a MOVZxL instruction, because + -- the reg alloc would tend to throw away a plain reg-to-reg + -- move, and we still want it to do that. + +pprInstr (MOVZxL sizes src dst) = pprSizeOpOpCoerce SLIT("movz") sizes I32 src dst + -- zero-extension only needs to extend to 32 bits: on x86_64, + -- the remaining zero-extension to 64 bits is automatic, and the 32-bit + -- instruction is shorter. + +pprInstr (MOVSxL sizes src dst) = pprSizeOpOpCoerce SLIT("movs") sizes wordRep src dst + +-- here we do some patching, since the physical registers are only set late +-- in the code generation. +pprInstr (LEA size (OpAddr (AddrBaseIndex src1@(EABaseReg reg1) (EAIndex reg2 1) (ImmInt 0))) dst@(OpReg reg3)) + | reg1 == reg3 + = pprSizeOpOp SLIT("add") size (OpReg reg2) dst +pprInstr (LEA size (OpAddr (AddrBaseIndex src1@(EABaseReg reg1) (EAIndex reg2 1) (ImmInt 0))) dst@(OpReg reg3)) + | reg2 == reg3 + = pprSizeOpOp SLIT("add") size (OpReg reg1) dst +pprInstr (LEA size (OpAddr (AddrBaseIndex src1@(EABaseReg reg1) EAIndexNone displ)) dst@(OpReg reg3)) + | reg1 == reg3 + = pprInstr (ADD size (OpImm displ) dst) +pprInstr (LEA size src dst) = pprSizeOpOp SLIT("lea") size src dst + +pprInstr (ADD size (OpImm (ImmInt (-1))) dst) + = pprSizeOp SLIT("dec") size dst +pprInstr (ADD size (OpImm (ImmInt 1)) dst) + = pprSizeOp SLIT("inc") size dst +pprInstr (ADD size src dst) + = pprSizeOpOp SLIT("add") size src dst +pprInstr (ADC size src dst) + = pprSizeOpOp SLIT("adc") size src dst +pprInstr (SUB size src dst) = pprSizeOpOp SLIT("sub") size src dst +pprInstr (IMUL size op1 op2) = pprSizeOpOp SLIT("imul") size op1 op2 + +{- A hack. The Intel documentation says that "The two and three + operand forms [of IMUL] may also be used with unsigned operands + because the lower half of the product is the same regardless if + (sic) the operands are signed or unsigned. The CF and OF flags, + however, cannot be used to determine if the upper half of the + result is non-zero." So there. 
+-} +pprInstr (AND size src dst) = pprSizeOpOp SLIT("and") size src dst +pprInstr (OR size src dst) = pprSizeOpOp SLIT("or") size src dst + +pprInstr (XOR F32 src dst) = pprOpOp SLIT("xorps") F32 src dst +pprInstr (XOR F64 src dst) = pprOpOp SLIT("xorpd") F64 src dst +pprInstr (XOR size src dst) = pprSizeOpOp SLIT("xor") size src dst + +pprInstr (NOT size op) = pprSizeOp SLIT("not") size op +pprInstr (NEGI size op) = pprSizeOp SLIT("neg") size op + +pprInstr (SHL size src dst) = pprShift SLIT("shl") size src dst +pprInstr (SAR size src dst) = pprShift SLIT("sar") size src dst +pprInstr (SHR size src dst) = pprShift SLIT("shr") size src dst + +pprInstr (BT size imm src) = pprSizeImmOp SLIT("bt") size imm src + +pprInstr (CMP size src dst) + | isFloatingRep size = pprSizeOpOp SLIT("ucomi") size src dst -- SSE2 + | otherwise = pprSizeOpOp SLIT("cmp") size src dst + +pprInstr (TEST size src dst) = pprSizeOpOp SLIT("test") size src dst +pprInstr (PUSH size op) = pprSizeOp SLIT("push") size op +pprInstr (POP size op) = pprSizeOp SLIT("pop") size op + +-- both unused (SDM): +-- pprInstr PUSHA = ptext SLIT("\tpushal") +-- pprInstr POPA = ptext SLIT("\tpopal") + +pprInstr NOP = ptext SLIT("\tnop") +pprInstr (CLTD I32) = ptext SLIT("\tcltd") +pprInstr (CLTD I64) = ptext SLIT("\tcqto") + +pprInstr (SETCC cond op) = pprCondInstr SLIT("set") cond (pprOperand I8 op) + +pprInstr (JXX cond (BlockId id)) + = pprCondInstr SLIT("j") cond (pprCLabel_asm lab) + where lab = mkAsmTempLabel id + +pprInstr (JMP (OpImm imm)) = (<>) (ptext SLIT("\tjmp ")) (pprImm imm) +pprInstr (JMP op) = (<>) (ptext SLIT("\tjmp *")) (pprOperand wordRep op) +pprInstr (JMP_TBL op ids) = pprInstr (JMP op) +pprInstr (CALL (Left imm) _) = (<>) (ptext SLIT("\tcall ")) (pprImm imm) +pprInstr (CALL (Right reg) _) = (<>) (ptext SLIT("\tcall *")) (pprReg wordRep reg) + +pprInstr (IDIV sz op) = pprSizeOp SLIT("idiv") sz op +pprInstr (DIV sz op) = pprSizeOp SLIT("div") sz op +pprInstr (IMUL2 sz op) = pprSizeOp SLIT("imul") sz op + +#if x86_64_TARGET_ARCH +pprInstr (MUL size op1 op2) = pprSizeOpOp SLIT("mul") size op1 op2 + +pprInstr (FDIV size op1 op2) = pprSizeOpOp SLIT("div") size op1 op2 + +pprInstr (CVTSS2SD from to) = pprRegReg SLIT("cvtss2sd") from to +pprInstr (CVTSD2SS from to) = pprRegReg SLIT("cvtsd2ss") from to +pprInstr (CVTSS2SI from to) = pprOpReg SLIT("cvtss2si") from to +pprInstr (CVTSD2SI from to) = pprOpReg SLIT("cvtsd2si") from to +pprInstr (CVTSI2SS from to) = pprOpReg SLIT("cvtsi2ss") from to +pprInstr (CVTSI2SD from to) = pprOpReg SLIT("cvtsi2sd") from to +#endif + + -- FETCHGOT for PIC on ELF platforms +pprInstr (FETCHGOT reg) + = vcat [ ptext SLIT("\tcall 1f"), + hcat [ ptext SLIT("1:\tpopl\t"), pprReg I32 reg ], + hcat [ ptext SLIT("\taddl\t$_GLOBAL_OFFSET_TABLE_+(.-1b), "), + pprReg I32 reg ] + ] + + -- FETCHPC for PIC on Darwin/x86 + -- get the instruction pointer into a register + -- (Terminology note: the IP is called Program Counter on PPC, + -- and it's a good thing to use the same name on both platforms) +pprInstr (FETCHPC reg) + = vcat [ ptext SLIT("\tcall 1f"), + hcat [ ptext SLIT("1:\tpopl\t"), pprReg I32 reg ] + ] + + + +#endif + +-- ----------------------------------------------------------------------------- +-- i386 floating-point + +#if i386_TARGET_ARCH +-- Simulating a flat register set on the x86 FP stack is tricky. +-- you have to free %st(7) before pushing anything on the FP reg stack +-- so as to preclude the possibility of a FP stack overflow exception. 
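+
+-- As a worked illustration of the scheme (hedged: it assumes the six
+-- virtual FP registers %fake0..%fake5 are RealReg 8..13, as printed by
+-- ppr_reg_long above), a register-to-register move such as
+--
+--      GMOV (RealReg 8) (RealReg 9)
+--
+-- comes out roughly as
+--
+--      ffree %st(7) ; fld %st(0) ; fstp %st(2)
+--
+-- i.e. push the source onto the x87 stack, then pop it into the slot
+-- the destination occupies now that the stack is one entry deeper.
+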
+pprInstr g@(GMOV src dst) + | src == dst + = empty + | otherwise + = pprG g (hcat [gtab, gpush src 0, gsemi, gpop dst 1]) + +-- GLD sz addr dst ==> FFREE %st(7) ; FLDsz addr ; FSTP (dst+1) +pprInstr g@(GLD sz addr dst) + = pprG g (hcat [gtab, text "ffree %st(7) ; fld", pprSize sz, gsp, + pprAddr addr, gsemi, gpop dst 1]) + +-- GST sz src addr ==> FFREE %st(7) ; FLD dst ; FSTPsz addr +pprInstr g@(GST sz src addr) + = pprG g (hcat [gtab, gpush src 0, gsemi, + text "fstp", pprSize sz, gsp, pprAddr addr]) + +pprInstr g@(GLDZ dst) + = pprG g (hcat [gtab, text "ffree %st(7) ; fldz ; ", gpop dst 1]) +pprInstr g@(GLD1 dst) + = pprG g (hcat [gtab, text "ffree %st(7) ; fld1 ; ", gpop dst 1]) + +pprInstr g@(GFTOI src dst) + = pprInstr (GDTOI src dst) +pprInstr g@(GDTOI src dst) + = pprG g (hcat [gtab, text "subl $4, %esp ; ", + gpush src 0, gsemi, text "fistpl 0(%esp) ; popl ", + pprReg I32 dst]) + +pprInstr g@(GITOF src dst) + = pprInstr (GITOD src dst) +pprInstr g@(GITOD src dst) + = pprG g (hcat [gtab, text "pushl ", pprReg I32 src, + text " ; ffree %st(7); fildl (%esp) ; ", + gpop dst 1, text " ; addl $4,%esp"]) + +{- Gruesome swamp follows. If you're unfortunate enough to have ventured + this far into the jungle AND you give a Rat's Ass (tm) what's going + on, here's the deal. Generate code to do a floating point comparison + of src1 and src2, of kind cond, and set the Zero flag if true. + + The complications are to do with handling NaNs correctly. We want the + property that if either argument is NaN, then the result of the + comparison is False ... except if we're comparing for inequality, + in which case the answer is True. + + Here's how the general (non-inequality) case works. As an + example, consider generating the an equality test: + + pushl %eax -- we need to mess with this + <get src1 to top of FPU stack> + fcomp <src2 location in FPU stack> and pop pushed src1 + -- Result of comparison is in FPU Status Register bits + -- C3 C2 and C0 + fstsw %ax -- Move FPU Status Reg to %ax + sahf -- move C3 C2 C0 from %ax to integer flag reg + -- now the serious magic begins + setpo %ah -- %ah = if comparable(neither arg was NaN) then 1 else 0 + sete %al -- %al = if arg1 == arg2 then 1 else 0 + andb %ah,%al -- %al &= %ah + -- so %al == 1 iff (comparable && same); else it holds 0 + decb %al -- %al == 0, ZeroFlag=1 iff (comparable && same); + else %al == 0xFF, ZeroFlag=0 + -- the zero flag is now set as we desire. + popl %eax + + The special case of inequality differs thusly: + + setpe %ah -- %ah = if incomparable(either arg was NaN) then 1 else 0 + setne %al -- %al = if arg1 /= arg2 then 1 else 0 + orb %ah,%al -- %al = if (incomparable || different) then 1 else 0 + decb %al -- if (incomparable || different) then (%al == 0, ZF=1) + else (%al == 0xFF, ZF=0) +-} +pprInstr g@(GCMP cond src1 src2) + | case cond of { NE -> True; other -> False } + = pprG g (vcat [ + hcat [gtab, text "pushl %eax ; ",gpush src1 0], + hcat [gtab, text "fcomp ", greg src2 1, + text "; fstsw %ax ; sahf ; setpe %ah"], + hcat [gtab, text "setne %al ; ", + text "orb %ah,%al ; decb %al ; popl %eax"] + ]) + | otherwise + = pprG g (vcat [ + hcat [gtab, text "pushl %eax ; ",gpush src1 0], + hcat [gtab, text "fcomp ", greg src2 1, + text "; fstsw %ax ; sahf ; setpo %ah"], + hcat [gtab, text "set", pprCond (fix_FP_cond cond), text " %al ; ", + text "andb %ah,%al ; decb %al ; popl %eax"] + ]) + where + {- On the 486, the flags set by FP compare are the unsigned ones! + (This looks like a HACK to me. 
WDP 96/03) + -} + fix_FP_cond :: Cond -> Cond + fix_FP_cond GE = GEU + fix_FP_cond GTT = GU + fix_FP_cond LTT = LU + fix_FP_cond LE = LEU + fix_FP_cond EQQ = EQQ + fix_FP_cond NE = NE + -- there should be no others + + +pprInstr g@(GABS sz src dst) + = pprG g (hcat [gtab, gpush src 0, text " ; fabs ; ", gpop dst 1]) +pprInstr g@(GNEG sz src dst) + = pprG g (hcat [gtab, gpush src 0, text " ; fchs ; ", gpop dst 1]) + +pprInstr g@(GSQRT sz src dst) + = pprG g (hcat [gtab, gpush src 0, text " ; fsqrt"] $$ + hcat [gtab, gcoerceto sz, gpop dst 1]) +pprInstr g@(GSIN sz src dst) + = pprG g (hcat [gtab, gpush src 0, text " ; fsin"] $$ + hcat [gtab, gcoerceto sz, gpop dst 1]) +pprInstr g@(GCOS sz src dst) + = pprG g (hcat [gtab, gpush src 0, text " ; fcos"] $$ + hcat [gtab, gcoerceto sz, gpop dst 1]) +pprInstr g@(GTAN sz src dst) + = pprG g (hcat [gtab, text "ffree %st(6) ; ", + gpush src 0, text " ; fptan ; ", + text " fstp %st(0)"] $$ + hcat [gtab, gcoerceto sz, gpop dst 1]) + +-- In the translations for GADD, GMUL, GSUB and GDIV, +-- the first two cases are mere optimisations. The otherwise clause +-- generates correct code under all circumstances. + +pprInstr g@(GADD sz src1 src2 dst) + | src1 == dst + = pprG g (text "\t#GADD-xxxcase1" $$ + hcat [gtab, gpush src2 0, + text " ; faddp %st(0),", greg src1 1]) + | src2 == dst + = pprG g (text "\t#GADD-xxxcase2" $$ + hcat [gtab, gpush src1 0, + text " ; faddp %st(0),", greg src2 1]) + | otherwise + = pprG g (hcat [gtab, gpush src1 0, + text " ; fadd ", greg src2 1, text ",%st(0)", + gsemi, gpop dst 1]) + + +pprInstr g@(GMUL sz src1 src2 dst) + | src1 == dst + = pprG g (text "\t#GMUL-xxxcase1" $$ + hcat [gtab, gpush src2 0, + text " ; fmulp %st(0),", greg src1 1]) + | src2 == dst + = pprG g (text "\t#GMUL-xxxcase2" $$ + hcat [gtab, gpush src1 0, + text " ; fmulp %st(0),", greg src2 1]) + | otherwise + = pprG g (hcat [gtab, gpush src1 0, + text " ; fmul ", greg src2 1, text ",%st(0)", + gsemi, gpop dst 1]) + + +pprInstr g@(GSUB sz src1 src2 dst) + | src1 == dst + = pprG g (text "\t#GSUB-xxxcase1" $$ + hcat [gtab, gpush src2 0, + text " ; fsubrp %st(0),", greg src1 1]) + | src2 == dst + = pprG g (text "\t#GSUB-xxxcase2" $$ + hcat [gtab, gpush src1 0, + text " ; fsubp %st(0),", greg src2 1]) + | otherwise + = pprG g (hcat [gtab, gpush src1 0, + text " ; fsub ", greg src2 1, text ",%st(0)", + gsemi, gpop dst 1]) + + +pprInstr g@(GDIV sz src1 src2 dst) + | src1 == dst + = pprG g (text "\t#GDIV-xxxcase1" $$ + hcat [gtab, gpush src2 0, + text " ; fdivrp %st(0),", greg src1 1]) + | src2 == dst + = pprG g (text "\t#GDIV-xxxcase2" $$ + hcat [gtab, gpush src1 0, + text " ; fdivp %st(0),", greg src2 1]) + | otherwise + = pprG g (hcat [gtab, gpush src1 0, + text " ; fdiv ", greg src2 1, text ",%st(0)", + gsemi, gpop dst 1]) + + +pprInstr GFREE + = vcat [ ptext SLIT("\tffree %st(0) ;ffree %st(1) ;ffree %st(2) ;ffree %st(3)"), + ptext SLIT("\tffree %st(4) ;ffree %st(5) ;ffree %st(6) ;ffree %st(7)") + ] + +-------------------------- + +-- coerce %st(0) to the specified size +gcoerceto F64 = empty +gcoerceto F32 = empty --text "subl $4,%esp ; fstps (%esp) ; flds (%esp) ; addl $4,%esp ; " + +gpush reg offset + = hcat [text "ffree %st(7) ; fld ", greg reg offset] +gpop reg offset + = hcat [text "fstp ", greg reg offset] + +greg reg offset = text "%st(" <> int (gregno reg - 8+offset) <> char ')' +gsemi = text " ; " +gtab = char '\t' +gsp = char ' ' + +gregno (RealReg i) = i +gregno other = --pprPanic "gregno" (ppr other) + 999 -- bogus; only needed for debug printing + 
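+-- Illustration-only helper (hypothetical, referenced nowhere else):
+-- the slot arithmetic in 'greg'.  With the fake FP registers starting
+-- at RealReg 8, the same virtual register maps to %st(1) normally and
+-- to %st(2) once an extra value has been pushed on top of the stack.
+gregSlotExample :: Doc
+gregSlotExample = greg (RealReg 9) 0 <> gsemi <> greg (RealReg 9) 1
+  -- renders as: %st(1) ; %st(2)
+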
+pprG :: Instr -> Doc -> Doc +pprG fake actual + = (char '#' <> pprGInstr fake) $$ actual + +pprGInstr (GMOV src dst) = pprSizeRegReg SLIT("gmov") F64 src dst +pprGInstr (GLD sz src dst) = pprSizeAddrReg SLIT("gld") sz src dst +pprGInstr (GST sz src dst) = pprSizeRegAddr SLIT("gst") sz src dst + +pprGInstr (GLDZ dst) = pprSizeReg SLIT("gldz") F64 dst +pprGInstr (GLD1 dst) = pprSizeReg SLIT("gld1") F64 dst + +pprGInstr (GFTOI src dst) = pprSizeSizeRegReg SLIT("gftoi") F32 I32 src dst +pprGInstr (GDTOI src dst) = pprSizeSizeRegReg SLIT("gdtoi") F64 I32 src dst + +pprGInstr (GITOF src dst) = pprSizeSizeRegReg SLIT("gitof") I32 F32 src dst +pprGInstr (GITOD src dst) = pprSizeSizeRegReg SLIT("gitod") I32 F64 src dst + +pprGInstr (GCMP co src dst) = pprCondRegReg SLIT("gcmp_") F64 co src dst +pprGInstr (GABS sz src dst) = pprSizeRegReg SLIT("gabs") sz src dst +pprGInstr (GNEG sz src dst) = pprSizeRegReg SLIT("gneg") sz src dst +pprGInstr (GSQRT sz src dst) = pprSizeRegReg SLIT("gsqrt") sz src dst +pprGInstr (GSIN sz src dst) = pprSizeRegReg SLIT("gsin") sz src dst +pprGInstr (GCOS sz src dst) = pprSizeRegReg SLIT("gcos") sz src dst +pprGInstr (GTAN sz src dst) = pprSizeRegReg SLIT("gtan") sz src dst + +pprGInstr (GADD sz src1 src2 dst) = pprSizeRegRegReg SLIT("gadd") sz src1 src2 dst +pprGInstr (GSUB sz src1 src2 dst) = pprSizeRegRegReg SLIT("gsub") sz src1 src2 dst +pprGInstr (GMUL sz src1 src2 dst) = pprSizeRegRegReg SLIT("gmul") sz src1 src2 dst +pprGInstr (GDIV sz src1 src2 dst) = pprSizeRegRegReg SLIT("gdiv") sz src1 src2 dst +#endif + +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + +-- Continue with I386-only printing bits and bobs: + +pprDollImm :: Imm -> Doc + +pprDollImm i = ptext SLIT("$") <> pprImm i + +pprOperand :: MachRep -> Operand -> Doc +pprOperand s (OpReg r) = pprReg s r +pprOperand s (OpImm i) = pprDollImm i +pprOperand s (OpAddr ea) = pprAddr ea + +pprMnemonic_ :: LitString -> Doc +pprMnemonic_ name = + char '\t' <> ptext name <> space + +pprMnemonic :: LitString -> MachRep -> Doc +pprMnemonic name size = + char '\t' <> ptext name <> pprSize size <> space + +pprSizeImmOp :: LitString -> MachRep -> Imm -> Operand -> Doc +pprSizeImmOp name size imm op1 + = hcat [ + pprMnemonic name size, + char '$', + pprImm imm, + comma, + pprOperand size op1 + ] + +pprSizeOp :: LitString -> MachRep -> Operand -> Doc +pprSizeOp name size op1 + = hcat [ + pprMnemonic name size, + pprOperand size op1 + ] + +pprSizeOpOp :: LitString -> MachRep -> Operand -> Operand -> Doc +pprSizeOpOp name size op1 op2 + = hcat [ + pprMnemonic name size, + pprOperand size op1, + comma, + pprOperand size op2 + ] + +pprOpOp :: LitString -> MachRep -> Operand -> Operand -> Doc +pprOpOp name size op1 op2 + = hcat [ + pprMnemonic_ name, + pprOperand size op1, + comma, + pprOperand size op2 + ] + +pprSizeReg :: LitString -> MachRep -> Reg -> Doc +pprSizeReg name size reg1 + = hcat [ + pprMnemonic name size, + pprReg size reg1 + ] + +pprSizeRegReg :: LitString -> MachRep -> Reg -> Reg -> Doc +pprSizeRegReg name size reg1 reg2 + = hcat [ + pprMnemonic name size, + pprReg size reg1, + comma, + pprReg size reg2 + ] + +pprRegReg :: LitString -> Reg -> Reg -> Doc +pprRegReg name reg1 reg2 + = hcat [ + pprMnemonic_ name, + pprReg wordRep reg1, + comma, + pprReg wordRep reg2 + ] + +pprOpReg :: LitString -> Operand -> Reg -> Doc +pprOpReg name op1 reg2 + = hcat [ + pprMnemonic_ name, + pprOperand wordRep op1, + comma, + pprReg wordRep reg2 + ] + +pprCondRegReg :: LitString -> MachRep -> Cond -> Reg -> Reg -> Doc +pprCondRegReg 
name size cond reg1 reg2 + = hcat [ + char '\t', + ptext name, + pprCond cond, + space, + pprReg size reg1, + comma, + pprReg size reg2 + ] + +pprSizeSizeRegReg :: LitString -> MachRep -> MachRep -> Reg -> Reg -> Doc +pprSizeSizeRegReg name size1 size2 reg1 reg2 + = hcat [ + char '\t', + ptext name, + pprSize size1, + pprSize size2, + space, + pprReg size1 reg1, + + comma, + pprReg size2 reg2 + ] + +pprSizeRegRegReg :: LitString -> MachRep -> Reg -> Reg -> Reg -> Doc +pprSizeRegRegReg name size reg1 reg2 reg3 + = hcat [ + pprMnemonic name size, + pprReg size reg1, + comma, + pprReg size reg2, + comma, + pprReg size reg3 + ] + +pprSizeAddrReg :: LitString -> MachRep -> AddrMode -> Reg -> Doc +pprSizeAddrReg name size op dst + = hcat [ + pprMnemonic name size, + pprAddr op, + comma, + pprReg size dst + ] + +pprSizeRegAddr :: LitString -> MachRep -> Reg -> AddrMode -> Doc +pprSizeRegAddr name size src op + = hcat [ + pprMnemonic name size, + pprReg size src, + comma, + pprAddr op + ] + +pprShift :: LitString -> MachRep -> Operand -> Operand -> Doc +pprShift name size src dest + = hcat [ + pprMnemonic name size, + pprOperand I8 src, -- src is 8-bit sized + comma, + pprOperand size dest + ] + +pprSizeOpOpCoerce :: LitString -> MachRep -> MachRep -> Operand -> Operand -> Doc +pprSizeOpOpCoerce name size1 size2 op1 op2 + = hcat [ char '\t', ptext name, pprSize size1, pprSize size2, space, + pprOperand size1 op1, + comma, + pprOperand size2 op2 + ] + +pprCondInstr :: LitString -> Cond -> Doc -> Doc +pprCondInstr name cond arg + = hcat [ char '\t', ptext name, pprCond cond, space, arg] + +#endif /* i386_TARGET_ARCH */ + + +-- ------------------------------------------------------------------------------- pprInstr for a SPARC + +#if sparc_TARGET_ARCH + +-- a clumsy hack for now, to handle possible double alignment problems + +-- even clumsier, to allow for RegReg regs that show when doing indexed +-- reads (bytearrays). 
+-- + +-- Translate to the following: +-- add g1,g2,g1 +-- ld [g1],%fn +-- ld [g1+4],%f(n+1) +-- sub g1,g2,g1 -- to restore g1 + +pprInstr (LD F64 (AddrRegReg g1 g2) reg) + = vcat [ + hcat [ptext SLIT("\tadd\t"), pprReg g1,comma,pprReg g2,comma,pprReg g1], + hcat [pp_ld_lbracket, pprReg g1, pp_rbracket_comma, pprReg reg], + hcat [pp_ld_lbracket, pprReg g1, ptext SLIT("+4]"), comma, pprReg (fPair reg)], + hcat [ptext SLIT("\tsub\t"), pprReg g1,comma,pprReg g2,comma,pprReg g1] + ] + +-- Translate to +-- ld [addr],%fn +-- ld [addr+4],%f(n+1) +pprInstr (LD F64 addr reg) | isJust off_addr + = vcat [ + hcat [pp_ld_lbracket, pprAddr addr, pp_rbracket_comma, pprReg reg], + hcat [pp_ld_lbracket, pprAddr addr2, pp_rbracket_comma,pprReg (fPair reg)] + ] + where + off_addr = addrOffset addr 4 + addr2 = case off_addr of Just x -> x + + +pprInstr (LD size addr reg) + = hcat [ + ptext SLIT("\tld"), + pprSize size, + char '\t', + lbrack, + pprAddr addr, + pp_rbracket_comma, + pprReg reg + ] + +-- The same clumsy hack as above + +-- Translate to the following: +-- add g1,g2,g1 +-- st %fn,[g1] +-- st %f(n+1),[g1+4] +-- sub g1,g2,g1 -- to restore g1 +pprInstr (ST F64 reg (AddrRegReg g1 g2)) + = vcat [ + hcat [ptext SLIT("\tadd\t"), pprReg g1,comma,pprReg g2,comma,pprReg g1], + hcat [ptext SLIT("\tst\t"), pprReg reg, pp_comma_lbracket, + pprReg g1, rbrack], + hcat [ptext SLIT("\tst\t"), pprReg (fPair reg), pp_comma_lbracket, + pprReg g1, ptext SLIT("+4]")], + hcat [ptext SLIT("\tsub\t"), pprReg g1,comma,pprReg g2,comma,pprReg g1] + ] + +-- Translate to +-- st %fn,[addr] +-- st %f(n+1),[addr+4] +pprInstr (ST F64 reg addr) | isJust off_addr + = vcat [ + hcat [ptext SLIT("\tst\t"), pprReg reg, pp_comma_lbracket, + pprAddr addr, rbrack], + hcat [ptext SLIT("\tst\t"), pprReg (fPair reg), pp_comma_lbracket, + pprAddr addr2, rbrack] + ] + where + off_addr = addrOffset addr 4 + addr2 = case off_addr of Just x -> x + +-- no distinction is made between signed and unsigned bytes on stores for the +-- Sparc opcodes (at least I cannot see any, and gas is nagging me --SOF), +-- so we call a special-purpose pprSize for ST.. 
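+-- For example, a byte load below comes out as "ldsb" (sign-extending,
+-- via pprSize), while the corresponding byte store is plain "stb"
+-- (via pprStSize): there is no signed store opcode to pick.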
+ +pprInstr (ST size reg addr) + = hcat [ + ptext SLIT("\tst"), + pprStSize size, + char '\t', + pprReg reg, + pp_comma_lbracket, + pprAddr addr, + rbrack + ] + +pprInstr (ADD x cc reg1 ri reg2) + | not x && not cc && riZero ri + = hcat [ ptext SLIT("\tmov\t"), pprReg reg1, comma, pprReg reg2 ] + | otherwise + = pprRegRIReg (if x then SLIT("addx") else SLIT("add")) cc reg1 ri reg2 + +pprInstr (SUB x cc reg1 ri reg2) + | not x && cc && reg2 == g0 + = hcat [ ptext SLIT("\tcmp\t"), pprReg reg1, comma, pprRI ri ] + | not x && not cc && riZero ri + = hcat [ ptext SLIT("\tmov\t"), pprReg reg1, comma, pprReg reg2 ] + | otherwise + = pprRegRIReg (if x then SLIT("subx") else SLIT("sub")) cc reg1 ri reg2 + +pprInstr (AND b reg1 ri reg2) = pprRegRIReg SLIT("and") b reg1 ri reg2 +pprInstr (ANDN b reg1 ri reg2) = pprRegRIReg SLIT("andn") b reg1 ri reg2 + +pprInstr (OR b reg1 ri reg2) + | not b && reg1 == g0 + = let doit = hcat [ ptext SLIT("\tmov\t"), pprRI ri, comma, pprReg reg2 ] + in case ri of + RIReg rrr | rrr == reg2 -> empty + other -> doit + | otherwise + = pprRegRIReg SLIT("or") b reg1 ri reg2 + +pprInstr (ORN b reg1 ri reg2) = pprRegRIReg SLIT("orn") b reg1 ri reg2 + +pprInstr (XOR b reg1 ri reg2) = pprRegRIReg SLIT("xor") b reg1 ri reg2 +pprInstr (XNOR b reg1 ri reg2) = pprRegRIReg SLIT("xnor") b reg1 ri reg2 + +pprInstr (SLL reg1 ri reg2) = pprRegRIReg SLIT("sll") False reg1 ri reg2 +pprInstr (SRL reg1 ri reg2) = pprRegRIReg SLIT("srl") False reg1 ri reg2 +pprInstr (SRA reg1 ri reg2) = pprRegRIReg SLIT("sra") False reg1 ri reg2 + +pprInstr (RDY rd) = ptext SLIT("\trd\t%y,") <> pprReg rd +pprInstr (SMUL b reg1 ri reg2) = pprRegRIReg SLIT("smul") b reg1 ri reg2 +pprInstr (UMUL b reg1 ri reg2) = pprRegRIReg SLIT("umul") b reg1 ri reg2 + +pprInstr (SETHI imm reg) + = hcat [ + ptext SLIT("\tsethi\t"), + pprImm imm, + comma, + pprReg reg + ] + +pprInstr NOP = ptext SLIT("\tnop") + +pprInstr (FABS F32 reg1 reg2) = pprSizeRegReg SLIT("fabs") F32 reg1 reg2 +pprInstr (FABS F64 reg1 reg2) + = (<>) (pprSizeRegReg SLIT("fabs") F32 reg1 reg2) + (if (reg1 == reg2) then empty + else (<>) (char '\n') + (pprSizeRegReg SLIT("fmov") F32 (fPair reg1) (fPair reg2))) + +pprInstr (FADD size reg1 reg2 reg3) + = pprSizeRegRegReg SLIT("fadd") size reg1 reg2 reg3 +pprInstr (FCMP e size reg1 reg2) + = pprSizeRegReg (if e then SLIT("fcmpe") else SLIT("fcmp")) size reg1 reg2 +pprInstr (FDIV size reg1 reg2 reg3) + = pprSizeRegRegReg SLIT("fdiv") size reg1 reg2 reg3 + +pprInstr (FMOV F32 reg1 reg2) = pprSizeRegReg SLIT("fmov") F32 reg1 reg2 +pprInstr (FMOV F64 reg1 reg2) + = (<>) (pprSizeRegReg SLIT("fmov") F32 reg1 reg2) + (if (reg1 == reg2) then empty + else (<>) (char '\n') + (pprSizeRegReg SLIT("fmov") F32 (fPair reg1) (fPair reg2))) + +pprInstr (FMUL size reg1 reg2 reg3) + = pprSizeRegRegReg SLIT("fmul") size reg1 reg2 reg3 + +pprInstr (FNEG F32 reg1 reg2) = pprSizeRegReg SLIT("fneg") F32 reg1 reg2 +pprInstr (FNEG F64 reg1 reg2) + = (<>) (pprSizeRegReg SLIT("fneg") F32 reg1 reg2) + (if (reg1 == reg2) then empty + else (<>) (char '\n') + (pprSizeRegReg SLIT("fmov") F32 (fPair reg1) (fPair reg2))) + +pprInstr (FSQRT size reg1 reg2) = pprSizeRegReg SLIT("fsqrt") size reg1 reg2 +pprInstr (FSUB size reg1 reg2 reg3) = pprSizeRegRegReg SLIT("fsub") size reg1 reg2 reg3 +pprInstr (FxTOy size1 size2 reg1 reg2) + = hcat [ + ptext SLIT("\tf"), + ptext + (case size1 of + I32 -> SLIT("ito") + F32 -> SLIT("sto") + F64 -> SLIT("dto")), + ptext + (case size2 of + I32 -> SLIT("i\t") + F32 -> SLIT("s\t") + F64 -> SLIT("d\t")), + pprReg 
reg1, comma, pprReg reg2 + ] + + +pprInstr (BI cond b lab) + = hcat [ + ptext SLIT("\tb"), pprCond cond, + if b then pp_comma_a else empty, + char '\t', + pprImm lab + ] + +pprInstr (BF cond b lab) + = hcat [ + ptext SLIT("\tfb"), pprCond cond, + if b then pp_comma_a else empty, + char '\t', + pprImm lab + ] + +pprInstr (JMP addr) = (<>) (ptext SLIT("\tjmp\t")) (pprAddr addr) + +pprInstr (CALL (Left imm) n _) + = hcat [ ptext SLIT("\tcall\t"), pprImm imm, comma, int n ] +pprInstr (CALL (Right reg) n _) + = hcat [ ptext SLIT("\tcall\t"), pprReg reg, comma, int n ] + +pprRI :: RI -> Doc +pprRI (RIReg r) = pprReg r +pprRI (RIImm r) = pprImm r + +pprSizeRegReg :: LitString -> MachRep -> Reg -> Reg -> Doc +pprSizeRegReg name size reg1 reg2 + = hcat [ + char '\t', + ptext name, + (case size of + F32 -> ptext SLIT("s\t") + F64 -> ptext SLIT("d\t")), + pprReg reg1, + comma, + pprReg reg2 + ] + +pprSizeRegRegReg :: LitString -> MachRep -> Reg -> Reg -> Reg -> Doc +pprSizeRegRegReg name size reg1 reg2 reg3 + = hcat [ + char '\t', + ptext name, + (case size of + F32 -> ptext SLIT("s\t") + F64 -> ptext SLIT("d\t")), + pprReg reg1, + comma, + pprReg reg2, + comma, + pprReg reg3 + ] + +pprRegRIReg :: LitString -> Bool -> Reg -> RI -> Reg -> Doc +pprRegRIReg name b reg1 ri reg2 + = hcat [ + char '\t', + ptext name, + if b then ptext SLIT("cc\t") else char '\t', + pprReg reg1, + comma, + pprRI ri, + comma, + pprReg reg2 + ] + +pprRIReg :: LitString -> Bool -> RI -> Reg -> Doc +pprRIReg name b ri reg1 + = hcat [ + char '\t', + ptext name, + if b then ptext SLIT("cc\t") else char '\t', + pprRI ri, + comma, + pprReg reg1 + ] + +pp_ld_lbracket = ptext SLIT("\tld\t[") +pp_rbracket_comma = text "]," +pp_comma_lbracket = text ",[" +pp_comma_a = text ",a" + +#endif /* sparc_TARGET_ARCH */ + + +-- ----------------------------------------------------------------------------- +-- pprInstr for PowerPC + +#if powerpc_TARGET_ARCH +pprInstr (LD sz reg addr) = hcat [ + char '\t', + ptext SLIT("l"), + ptext (case sz of + I8 -> SLIT("bz") + I16 -> SLIT("hz") + I32 -> SLIT("wz") + F32 -> SLIT("fs") + F64 -> SLIT("fd")), + case addr of AddrRegImm _ _ -> empty + AddrRegReg _ _ -> char 'x', + char '\t', + pprReg reg, + ptext SLIT(", "), + pprAddr addr + ] +pprInstr (LA sz reg addr) = hcat [ + char '\t', + ptext SLIT("l"), + ptext (case sz of + I8 -> SLIT("ba") + I16 -> SLIT("ha") + I32 -> SLIT("wa") + F32 -> SLIT("fs") + F64 -> SLIT("fd")), + case addr of AddrRegImm _ _ -> empty + AddrRegReg _ _ -> char 'x', + char '\t', + pprReg reg, + ptext SLIT(", "), + pprAddr addr + ] +pprInstr (ST sz reg addr) = hcat [ + char '\t', + ptext SLIT("st"), + pprSize sz, + case addr of AddrRegImm _ _ -> empty + AddrRegReg _ _ -> char 'x', + char '\t', + pprReg reg, + ptext SLIT(", "), + pprAddr addr + ] +pprInstr (STU sz reg addr) = hcat [ + char '\t', + ptext SLIT("st"), + pprSize sz, + ptext SLIT("u\t"), + case addr of AddrRegImm _ _ -> empty + AddrRegReg _ _ -> char 'x', + pprReg reg, + ptext SLIT(", "), + pprAddr addr + ] +pprInstr (LIS reg imm) = hcat [ + char '\t', + ptext SLIT("lis"), + char '\t', + pprReg reg, + ptext SLIT(", "), + pprImm imm + ] +pprInstr (LI reg imm) = hcat [ + char '\t', + ptext SLIT("li"), + char '\t', + pprReg reg, + ptext SLIT(", "), + pprImm imm + ] +pprInstr (MR reg1 reg2) + | reg1 == reg2 = empty + | otherwise = hcat [ + char '\t', + case regClass reg1 of + RcInteger -> ptext SLIT("mr") + _ -> ptext SLIT("fmr"), + char '\t', + pprReg reg1, + ptext SLIT(", "), + pprReg reg2 + ] +pprInstr (CMP sz reg ri) = hcat 
[ + char '\t', + op, + char '\t', + pprReg reg, + ptext SLIT(", "), + pprRI ri + ] + where + op = hcat [ + ptext SLIT("cmp"), + pprSize sz, + case ri of + RIReg _ -> empty + RIImm _ -> char 'i' + ] +pprInstr (CMPL sz reg ri) = hcat [ + char '\t', + op, + char '\t', + pprReg reg, + ptext SLIT(", "), + pprRI ri + ] + where + op = hcat [ + ptext SLIT("cmpl"), + pprSize sz, + case ri of + RIReg _ -> empty + RIImm _ -> char 'i' + ] +pprInstr (BCC cond (BlockId id)) = hcat [ + char '\t', + ptext SLIT("b"), + pprCond cond, + char '\t', + pprCLabel_asm lbl + ] + where lbl = mkAsmTempLabel id + +pprInstr (JMP lbl) = hcat [ -- an alias for b that takes a CLabel + char '\t', + ptext SLIT("b"), + char '\t', + pprCLabel_asm lbl + ] + +pprInstr (MTCTR reg) = hcat [ + char '\t', + ptext SLIT("mtctr"), + char '\t', + pprReg reg + ] +pprInstr (BCTR _) = hcat [ + char '\t', + ptext SLIT("bctr") + ] +pprInstr (BL lbl _) = hcat [ + ptext SLIT("\tbl\t"), + pprCLabel_asm lbl + ] +pprInstr (BCTRL _) = hcat [ + char '\t', + ptext SLIT("bctrl") + ] +pprInstr (ADD reg1 reg2 ri) = pprLogic SLIT("add") reg1 reg2 ri +pprInstr (ADDIS reg1 reg2 imm) = hcat [ + char '\t', + ptext SLIT("addis"), + char '\t', + pprReg reg1, + ptext SLIT(", "), + pprReg reg2, + ptext SLIT(", "), + pprImm imm + ] + +pprInstr (ADDC reg1 reg2 reg3) = pprLogic SLIT("addc") reg1 reg2 (RIReg reg3) +pprInstr (ADDE reg1 reg2 reg3) = pprLogic SLIT("adde") reg1 reg2 (RIReg reg3) +pprInstr (SUBF reg1 reg2 reg3) = pprLogic SLIT("subf") reg1 reg2 (RIReg reg3) +pprInstr (MULLW reg1 reg2 ri@(RIReg _)) = pprLogic SLIT("mullw") reg1 reg2 ri +pprInstr (MULLW reg1 reg2 ri@(RIImm _)) = pprLogic SLIT("mull") reg1 reg2 ri +pprInstr (DIVW reg1 reg2 reg3) = pprLogic SLIT("divw") reg1 reg2 (RIReg reg3) +pprInstr (DIVWU reg1 reg2 reg3) = pprLogic SLIT("divwu") reg1 reg2 (RIReg reg3) + +pprInstr (MULLW_MayOflo reg1 reg2 reg3) = vcat [ + hcat [ ptext SLIT("\tmullwo\t"), pprReg reg1, ptext SLIT(", "), + pprReg reg2, ptext SLIT(", "), + pprReg reg3 ], + hcat [ ptext SLIT("\tmfxer\t"), pprReg reg1 ], + hcat [ ptext SLIT("\trlwinm\t"), pprReg reg1, ptext SLIT(", "), + pprReg reg1, ptext SLIT(", "), + ptext SLIT("2, 31, 31") ] + ] + + -- for some reason, "andi" doesn't exist. + -- we'll use "andi." instead. 
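+-- (More precisely: on PowerPC the immediate AND exists only in its
+-- record forms, "andi." and "andis.", which update cr0 as a side
+-- effect; the condition bits are simply ignored here.  So, for
+-- instance,  AND dst src (RIImm (ImmInt 1))  prints as
+-- "andi.   dst, src, 1", with the registers rendered by pprReg.)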
+pprInstr (AND reg1 reg2 (RIImm imm)) = hcat [ + char '\t', + ptext SLIT("andi."), + char '\t', + pprReg reg1, + ptext SLIT(", "), + pprReg reg2, + ptext SLIT(", "), + pprImm imm + ] +pprInstr (AND reg1 reg2 ri) = pprLogic SLIT("and") reg1 reg2 ri + +pprInstr (OR reg1 reg2 ri) = pprLogic SLIT("or") reg1 reg2 ri +pprInstr (XOR reg1 reg2 ri) = pprLogic SLIT("xor") reg1 reg2 ri + +pprInstr (XORIS reg1 reg2 imm) = hcat [ + char '\t', + ptext SLIT("xoris"), + char '\t', + pprReg reg1, + ptext SLIT(", "), + pprReg reg2, + ptext SLIT(", "), + pprImm imm + ] + +pprInstr (EXTS sz reg1 reg2) = hcat [ + char '\t', + ptext SLIT("exts"), + pprSize sz, + char '\t', + pprReg reg1, + ptext SLIT(", "), + pprReg reg2 + ] + +pprInstr (NEG reg1 reg2) = pprUnary SLIT("neg") reg1 reg2 +pprInstr (NOT reg1 reg2) = pprUnary SLIT("not") reg1 reg2 + +pprInstr (SLW reg1 reg2 ri) = pprLogic SLIT("slw") reg1 reg2 (limitShiftRI ri) +pprInstr (SRW reg1 reg2 ri) = pprLogic SLIT("srw") reg1 reg2 (limitShiftRI ri) +pprInstr (SRAW reg1 reg2 ri) = pprLogic SLIT("sraw") reg1 reg2 (limitShiftRI ri) +pprInstr (RLWINM reg1 reg2 sh mb me) = hcat [ + ptext SLIT("\trlwinm\t"), + pprReg reg1, + ptext SLIT(", "), + pprReg reg2, + ptext SLIT(", "), + int sh, + ptext SLIT(", "), + int mb, + ptext SLIT(", "), + int me + ] + +pprInstr (FADD sz reg1 reg2 reg3) = pprBinaryF SLIT("fadd") sz reg1 reg2 reg3 +pprInstr (FSUB sz reg1 reg2 reg3) = pprBinaryF SLIT("fsub") sz reg1 reg2 reg3 +pprInstr (FMUL sz reg1 reg2 reg3) = pprBinaryF SLIT("fmul") sz reg1 reg2 reg3 +pprInstr (FDIV sz reg1 reg2 reg3) = pprBinaryF SLIT("fdiv") sz reg1 reg2 reg3 +pprInstr (FNEG reg1 reg2) = pprUnary SLIT("fneg") reg1 reg2 + +pprInstr (FCMP reg1 reg2) = hcat [ + char '\t', + ptext SLIT("fcmpu\tcr0, "), + -- Note: we're using fcmpu, not fcmpo + -- The difference is with fcmpo, compare with NaN is an invalid operation. 
+ -- We don't handle invalid fp ops, so we don't care + pprReg reg1, + ptext SLIT(", "), + pprReg reg2 + ] + +pprInstr (FCTIWZ reg1 reg2) = pprUnary SLIT("fctiwz") reg1 reg2 +pprInstr (FRSP reg1 reg2) = pprUnary SLIT("frsp") reg1 reg2 + +pprInstr (CRNOR dst src1 src2) = hcat [ + ptext SLIT("\tcrnor\t"), + int dst, + ptext SLIT(", "), + int src1, + ptext SLIT(", "), + int src2 + ] + +pprInstr (MFCR reg) = hcat [ + char '\t', + ptext SLIT("mfcr"), + char '\t', + pprReg reg + ] + +pprInstr (MFLR reg) = hcat [ + char '\t', + ptext SLIT("mflr"), + char '\t', + pprReg reg + ] + +pprInstr (FETCHPC reg) = vcat [ + ptext SLIT("\tbcl\t20,31,1f"), + hcat [ ptext SLIT("1:\tmflr\t"), pprReg reg ] + ] + +pprInstr _ = panic "pprInstr (ppc)" + +pprLogic op reg1 reg2 ri = hcat [ + char '\t', + ptext op, + case ri of + RIReg _ -> empty + RIImm _ -> char 'i', + char '\t', + pprReg reg1, + ptext SLIT(", "), + pprReg reg2, + ptext SLIT(", "), + pprRI ri + ] + +pprUnary op reg1 reg2 = hcat [ + char '\t', + ptext op, + char '\t', + pprReg reg1, + ptext SLIT(", "), + pprReg reg2 + ] + +pprBinaryF op sz reg1 reg2 reg3 = hcat [ + char '\t', + ptext op, + pprFSize sz, + char '\t', + pprReg reg1, + ptext SLIT(", "), + pprReg reg2, + ptext SLIT(", "), + pprReg reg3 + ] + +pprRI :: RI -> Doc +pprRI (RIReg r) = pprReg r +pprRI (RIImm r) = pprImm r + +pprFSize F64 = empty +pprFSize F32 = char 's' + + -- limit immediate argument for shift instruction to range 0..32 + -- (yes, the maximum is really 32, not 31) +limitShiftRI :: RI -> RI +limitShiftRI (RIImm (ImmInt i)) | i > 32 || i < 0 = RIImm (ImmInt 32) +limitShiftRI x = x + +#endif /* powerpc_TARGET_ARCH */ + + +-- ----------------------------------------------------------------------------- +-- Converting floating-point literals to integrals for printing + +#if __GLASGOW_HASKELL__ >= 504 +newFloatArray :: (Int,Int) -> ST s (STUArray s Int Float) +newFloatArray = newArray_ + +newDoubleArray :: (Int,Int) -> ST s (STUArray s Int Double) +newDoubleArray = newArray_ + +castFloatToCharArray :: STUArray s Int Float -> ST s (STUArray s Int Word8) +castFloatToCharArray = castSTUArray + +castDoubleToCharArray :: STUArray s Int Double -> ST s (STUArray s Int Word8) +castDoubleToCharArray = castSTUArray + +writeFloatArray :: STUArray s Int Float -> Int -> Float -> ST s () +writeFloatArray = writeArray + +writeDoubleArray :: STUArray s Int Double -> Int -> Double -> ST s () +writeDoubleArray = writeArray + +readCharArray :: STUArray s Int Word8 -> Int -> ST s Char +readCharArray arr i = do + w <- readArray arr i + return $! (chr (fromIntegral w)) + +#else + +castFloatToCharArray :: MutableByteArray s t -> ST s (MutableByteArray s t) +castFloatToCharArray = return + +castDoubleToCharArray :: MutableByteArray s t -> ST s (MutableByteArray s t) + + +castDoubleToCharArray = return + +#endif + +-- floatToBytes and doubleToBytes convert to the host's byte +-- order. Providing that we're not cross-compiling for a +-- target with the opposite endianness, this should work ok +-- on all targets. + +-- ToDo: this stuff is very similar to the shenanigans in PprAbs, +-- could they be merged? 
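+
+-- As a rough worked example (assuming a little-endian host): 1.0 :: Float
+-- is 0x3F800000 in IEEE 754 single precision, so floatToBytes 1.0 gives
+-- [0, 0, 128, 63], the bytes in host memory order with the least
+-- significant byte first; a big-endian host would give [63, 128, 0, 0].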
+ +floatToBytes :: Float -> [Int] +floatToBytes f + = runST (do + arr <- newFloatArray ((0::Int),3) + writeFloatArray arr 0 f + arr <- castFloatToCharArray arr + i0 <- readCharArray arr 0 + i1 <- readCharArray arr 1 + i2 <- readCharArray arr 2 + i3 <- readCharArray arr 3 + return (map ord [i0,i1,i2,i3]) + ) + +doubleToBytes :: Double -> [Int] +doubleToBytes d + = runST (do + arr <- newDoubleArray ((0::Int),7) + writeDoubleArray arr 0 d + arr <- castDoubleToCharArray arr + i0 <- readCharArray arr 0 + i1 <- readCharArray arr 1 + i2 <- readCharArray arr 2 + i3 <- readCharArray arr 3 + i4 <- readCharArray arr 4 + i5 <- readCharArray arr 5 + i6 <- readCharArray arr 6 + i7 <- readCharArray arr 7 + return (map ord [i0,i1,i2,i3,i4,i5,i6,i7]) + ) diff --git a/compiler/nativeGen/RegAllocInfo.hs b/compiler/nativeGen/RegAllocInfo.hs new file mode 100644 index 0000000000..98c4e2dfe0 --- /dev/null +++ b/compiler/nativeGen/RegAllocInfo.hs @@ -0,0 +1,850 @@ +----------------------------------------------------------------------------- +-- +-- Machine-specific parts of the register allocator +-- +-- (c) The University of Glasgow 1996-2004 +-- +----------------------------------------------------------------------------- + +#include "nativeGen/NCG.h" + +module RegAllocInfo ( + RegUsage(..), + noUsage, + regUsage, + patchRegs, + jumpDests, + patchJump, + isRegRegMove, + + maxSpillSlots, + mkSpillInstr, + mkLoadInstr, + mkRegRegMoveInstr, + mkBranchInstr + ) where + +#include "HsVersions.h" + +import Cmm ( BlockId ) +import MachOp ( MachRep(..), wordRep ) +import MachInstrs +import MachRegs +import Outputable +import Constants ( rESERVED_C_STACK_BYTES ) +import FastTypes + +-- ----------------------------------------------------------------------------- +-- RegUsage type + +-- @regUsage@ returns the sets of src and destination registers used +-- by a particular instruction. Machine registers that are +-- pre-allocated to stgRegs are filtered out, because they are +-- uninteresting from a register allocation standpoint. (We wouldn't +-- want them to end up on the free list!) As far as we are concerned, +-- the fixed registers simply don't exist (for allocation purposes, +-- anyway). + +-- regUsage doesn't need to do any trickery for jumps and such. Just +-- state precisely the regs read and written by that insn. The +-- consequences of control flow transfers, as far as register +-- allocation goes, are taken care of by the register allocator. 
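+
+-- A small illustration, using the PowerPC case further down and
+-- hypothetical virtual registers v1, v2, v3: for ADD v1 v2 (RIReg v3)
+-- the result is RU [v2, v3] [v1], i.e. v2 and v3 are read and v1 is
+-- written; any pre-allocated machine registers in those positions would
+-- be filtered out of both lists.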
+ +data RegUsage = RU [Reg] [Reg] + +noUsage :: RegUsage +noUsage = RU [] [] + +regUsage :: Instr -> RegUsage + +interesting (VirtualRegI _) = True +interesting (VirtualRegHi _) = True +interesting (VirtualRegF _) = True +interesting (VirtualRegD _) = True +interesting (RealReg i) = isFastTrue (freeReg i) + + +#if alpha_TARGET_ARCH +regUsage instr = case instr of + LD B reg addr -> usage (regAddr addr, [reg, t9]) + LD Bu reg addr -> usage (regAddr addr, [reg, t9]) +-- LD W reg addr -> usage (regAddr addr, [reg, t9]) : UNUSED +-- LD Wu reg addr -> usage (regAddr addr, [reg, t9]) : UNUSED + LD sz reg addr -> usage (regAddr addr, [reg]) + LDA reg addr -> usage (regAddr addr, [reg]) + LDAH reg addr -> usage (regAddr addr, [reg]) + LDGP reg addr -> usage (regAddr addr, [reg]) + LDI sz reg imm -> usage ([], [reg]) + ST B reg addr -> usage (reg : regAddr addr, [t9, t10]) +-- ST W reg addr -> usage (reg : regAddr addr, [t9, t10]) : UNUSED + ST sz reg addr -> usage (reg : regAddr addr, []) + CLR reg -> usage ([], [reg]) + ABS sz ri reg -> usage (regRI ri, [reg]) + NEG sz ov ri reg -> usage (regRI ri, [reg]) + ADD sz ov r1 ar r2 -> usage (r1 : regRI ar, [r2]) + SADD sz sc r1 ar r2 -> usage (r1 : regRI ar, [r2]) + SUB sz ov r1 ar r2 -> usage (r1 : regRI ar, [r2]) + SSUB sz sc r1 ar r2 -> usage (r1 : regRI ar, [r2]) + MUL sz ov r1 ar r2 -> usage (r1 : regRI ar, [r2]) + DIV sz un r1 ar r2 -> usage (r1 : regRI ar, [r2, t9, t10, t11, t12]) + REM sz un r1 ar r2 -> usage (r1 : regRI ar, [r2, t9, t10, t11, t12]) + NOT ri reg -> usage (regRI ri, [reg]) + AND r1 ar r2 -> usage (r1 : regRI ar, [r2]) + ANDNOT r1 ar r2 -> usage (r1 : regRI ar, [r2]) + OR r1 ar r2 -> usage (r1 : regRI ar, [r2]) + ORNOT r1 ar r2 -> usage (r1 : regRI ar, [r2]) + XOR r1 ar r2 -> usage (r1 : regRI ar, [r2]) + XORNOT r1 ar r2 -> usage (r1 : regRI ar, [r2]) + SLL r1 ar r2 -> usage (r1 : regRI ar, [r2]) + SRL r1 ar r2 -> usage (r1 : regRI ar, [r2]) + SRA r1 ar r2 -> usage (r1 : regRI ar, [r2]) + ZAP r1 ar r2 -> usage (r1 : regRI ar, [r2]) + ZAPNOT r1 ar r2 -> usage (r1 : regRI ar, [r2]) + CMP co r1 ar r2 -> usage (r1 : regRI ar, [r2]) + FCLR reg -> usage ([], [reg]) + FABS r1 r2 -> usage ([r1], [r2]) + FNEG sz r1 r2 -> usage ([r1], [r2]) + FADD sz r1 r2 r3 -> usage ([r1, r2], [r3]) + FDIV sz r1 r2 r3 -> usage ([r1, r2], [r3]) + FMUL sz r1 r2 r3 -> usage ([r1, r2], [r3]) + FSUB sz r1 r2 r3 -> usage ([r1, r2], [r3]) + CVTxy sz1 sz2 r1 r2 -> usage ([r1], [r2]) + FCMP sz co r1 r2 r3 -> usage ([r1, r2], [r3]) + FMOV r1 r2 -> usage ([r1], [r2]) + + + -- We assume that all local jumps will be BI/BF/BR. JMP must be out-of-line. 
+ BI cond reg lbl -> usage ([reg], []) + BF cond reg lbl -> usage ([reg], []) + JMP reg addr hint -> RU (mkRegSet (filter interesting (regAddr addr))) freeRegSet + + BSR _ n -> RU (argRegSet n) callClobberedRegSet + JSR reg addr n -> RU (argRegSet n) callClobberedRegSet + + _ -> noUsage + + where + usage (src, dst) = RU (mkRegSet (filter interesting src)) + (mkRegSet (filter interesting dst)) + + interesting (FixedReg _) = False + interesting _ = True + + regAddr (AddrReg r1) = [r1] + regAddr (AddrRegImm r1 _) = [r1] + regAddr (AddrImm _) = [] + + regRI (RIReg r) = [r] + regRI _ = [] + +#endif /* alpha_TARGET_ARCH */ +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + +regUsage instr = case instr of + MOV sz src dst -> usageRW src dst + MOVZxL sz src dst -> usageRW src dst + MOVSxL sz src dst -> usageRW src dst + LEA sz src dst -> usageRW src dst + ADD sz src dst -> usageRM src dst + ADC sz src dst -> usageRM src dst + SUB sz src dst -> usageRM src dst + IMUL sz src dst -> usageRM src dst + IMUL2 sz src -> mkRU (eax:use_R src) [eax,edx] + MUL sz src dst -> usageRM src dst + DIV sz op -> mkRU (eax:edx:use_R op) [eax,edx] + IDIV sz op -> mkRU (eax:edx:use_R op) [eax,edx] + AND sz src dst -> usageRM src dst + OR sz src dst -> usageRM src dst + XOR sz src dst -> usageRM src dst + NOT sz op -> usageM op + NEGI sz op -> usageM op + SHL sz imm dst -> usageRM imm dst + SAR sz imm dst -> usageRM imm dst + SHR sz imm dst -> usageRM imm dst + BT sz imm src -> mkRUR (use_R src) + + PUSH sz op -> mkRUR (use_R op) + POP sz op -> mkRU [] (def_W op) + TEST sz src dst -> mkRUR (use_R src ++ use_R dst) + CMP sz src dst -> mkRUR (use_R src ++ use_R dst) + SETCC cond op -> mkRU [] (def_W op) + JXX cond lbl -> mkRU [] [] + JMP op -> mkRUR (use_R op) + JMP_TBL op ids -> mkRUR (use_R op) + CALL (Left imm) params -> mkRU params callClobberedRegs + CALL (Right reg) params -> mkRU (reg:params) callClobberedRegs + CLTD sz -> mkRU [eax] [edx] + NOP -> mkRU [] [] + +#if i386_TARGET_ARCH + GMOV src dst -> mkRU [src] [dst] + GLD sz src dst -> mkRU (use_EA src) [dst] + GST sz src dst -> mkRUR (src : use_EA dst) + + GLDZ dst -> mkRU [] [dst] + GLD1 dst -> mkRU [] [dst] + + GFTOI src dst -> mkRU [src] [dst] + GDTOI src dst -> mkRU [src] [dst] + + GITOF src dst -> mkRU [src] [dst] + GITOD src dst -> mkRU [src] [dst] + + GADD sz s1 s2 dst -> mkRU [s1,s2] [dst] + GSUB sz s1 s2 dst -> mkRU [s1,s2] [dst] + GMUL sz s1 s2 dst -> mkRU [s1,s2] [dst] + GDIV sz s1 s2 dst -> mkRU [s1,s2] [dst] + + GCMP sz src1 src2 -> mkRUR [src1,src2] + GABS sz src dst -> mkRU [src] [dst] + GNEG sz src dst -> mkRU [src] [dst] + GSQRT sz src dst -> mkRU [src] [dst] + GSIN sz src dst -> mkRU [src] [dst] + GCOS sz src dst -> mkRU [src] [dst] + GTAN sz src dst -> mkRU [src] [dst] +#endif + +#if x86_64_TARGET_ARCH + CVTSS2SD src dst -> mkRU [src] [dst] + CVTSD2SS src dst -> mkRU [src] [dst] + CVTSS2SI src dst -> mkRU (use_R src) [dst] + CVTSD2SI src dst -> mkRU (use_R src) [dst] + CVTSI2SS src dst -> mkRU (use_R src) [dst] + CVTSI2SD src dst -> mkRU (use_R src) [dst] + FDIV sz src dst -> usageRM src dst +#endif + + FETCHGOT reg -> mkRU [] [reg] + FETCHPC reg -> mkRU [] [reg] + + COMMENT _ -> noUsage + DELTA _ -> noUsage + + _other -> panic "regUsage: unrecognised instr" + + where +#if x86_64_TARGET_ARCH + -- call parameters: include %eax, because it is used + -- to pass the number of SSE reg arguments to varargs fns. 
+ params = eax : allArgRegs ++ allFPArgRegs +#endif + + -- 2 operand form; first operand Read; second Written + usageRW :: Operand -> Operand -> RegUsage + usageRW op (OpReg reg) = mkRU (use_R op) [reg] + usageRW op (OpAddr ea) = mkRUR (use_R op ++ use_EA ea) + + -- 2 operand form; first operand Read; second Modified + usageRM :: Operand -> Operand -> RegUsage + usageRM op (OpReg reg) = mkRU (use_R op ++ [reg]) [reg] + usageRM op (OpAddr ea) = mkRUR (use_R op ++ use_EA ea) + + -- 1 operand form; operand Modified + usageM :: Operand -> RegUsage + usageM (OpReg reg) = mkRU [reg] [reg] + usageM (OpAddr ea) = mkRUR (use_EA ea) + + -- Registers defd when an operand is written. + def_W (OpReg reg) = [reg] + def_W (OpAddr ea) = [] + + -- Registers used when an operand is read. + use_R (OpReg reg) = [reg] + use_R (OpImm imm) = [] + use_R (OpAddr ea) = use_EA ea + + -- Registers used to compute an effective address. + use_EA (ImmAddr _ _) = [] + use_EA (AddrBaseIndex base index _) = + use_base base $! use_index index + where use_base (EABaseReg r) x = r : x + use_base _ x = x + use_index EAIndexNone = [] + use_index (EAIndex i _) = [i] + + mkRUR src = src' `seq` RU src' [] + where src' = filter interesting src + + mkRU src dst = src' `seq` dst' `seq` RU src' dst' + where src' = filter interesting src + dst' = filter interesting dst + +#endif /* i386_TARGET_ARCH || x86_64_TARGET_ARCH */ +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +#if sparc_TARGET_ARCH + +regUsage instr = case instr of + LD sz addr reg -> usage (regAddr addr, [reg]) + ST sz reg addr -> usage (reg : regAddr addr, []) + ADD x cc r1 ar r2 -> usage (r1 : regRI ar, [r2]) + SUB x cc r1 ar r2 -> usage (r1 : regRI ar, [r2]) + UMUL cc r1 ar r2 -> usage (r1 : regRI ar, [r2]) + SMUL cc r1 ar r2 -> usage (r1 : regRI ar, [r2]) + RDY rd -> usage ([], [rd]) + AND b r1 ar r2 -> usage (r1 : regRI ar, [r2]) + ANDN b r1 ar r2 -> usage (r1 : regRI ar, [r2]) + OR b r1 ar r2 -> usage (r1 : regRI ar, [r2]) + ORN b r1 ar r2 -> usage (r1 : regRI ar, [r2]) + XOR b r1 ar r2 -> usage (r1 : regRI ar, [r2]) + XNOR b r1 ar r2 -> usage (r1 : regRI ar, [r2]) + SLL r1 ar r2 -> usage (r1 : regRI ar, [r2]) + SRL r1 ar r2 -> usage (r1 : regRI ar, [r2]) + SRA r1 ar r2 -> usage (r1 : regRI ar, [r2]) + SETHI imm reg -> usage ([], [reg]) + FABS s r1 r2 -> usage ([r1], [r2]) + FADD s r1 r2 r3 -> usage ([r1, r2], [r3]) + FCMP e s r1 r2 -> usage ([r1, r2], []) + FDIV s r1 r2 r3 -> usage ([r1, r2], [r3]) + FMOV s r1 r2 -> usage ([r1], [r2]) + FMUL s r1 r2 r3 -> usage ([r1, r2], [r3]) + FNEG s r1 r2 -> usage ([r1], [r2]) + FSQRT s r1 r2 -> usage ([r1], [r2]) + FSUB s r1 r2 r3 -> usage ([r1, r2], [r3]) + FxTOy s1 s2 r1 r2 -> usage ([r1], [r2]) + + -- We assume that all local jumps will be BI/BF. JMP must be out-of-line. 
+ JMP addr -> usage (regAddr addr, []) + + CALL (Left imm) n True -> noUsage + CALL (Left imm) n False -> usage (argRegs n, callClobberedRegs) + CALL (Right reg) n True -> usage ([reg], []) + CALL (Right reg) n False -> usage (reg : (argRegs n), callClobberedRegs) + + _ -> noUsage + where + usage (src, dst) = RU (filter interesting src) + (filter interesting dst) + + regAddr (AddrRegReg r1 r2) = [r1, r2] + regAddr (AddrRegImm r1 _) = [r1] + + regRI (RIReg r) = [r] + regRI _ = [] + +#endif /* sparc_TARGET_ARCH */ +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +#if powerpc_TARGET_ARCH + +regUsage instr = case instr of + LD sz reg addr -> usage (regAddr addr, [reg]) + LA sz reg addr -> usage (regAddr addr, [reg]) + ST sz reg addr -> usage (reg : regAddr addr, []) + STU sz reg addr -> usage (reg : regAddr addr, []) + LIS reg imm -> usage ([], [reg]) + LI reg imm -> usage ([], [reg]) + MR reg1 reg2 -> usage ([reg2], [reg1]) + CMP sz reg ri -> usage (reg : regRI ri,[]) + CMPL sz reg ri -> usage (reg : regRI ri,[]) + BCC cond lbl -> noUsage + MTCTR reg -> usage ([reg],[]) + BCTR targets -> noUsage + BL imm params -> usage (params, callClobberedRegs) + BCTRL params -> usage (params, callClobberedRegs) + ADD reg1 reg2 ri -> usage (reg2 : regRI ri, [reg1]) + ADDC reg1 reg2 reg3-> usage ([reg2,reg3], [reg1]) + ADDE reg1 reg2 reg3-> usage ([reg2,reg3], [reg1]) + ADDIS reg1 reg2 imm -> usage ([reg2], [reg1]) + SUBF reg1 reg2 reg3-> usage ([reg2,reg3], [reg1]) + MULLW reg1 reg2 ri -> usage (reg2 : regRI ri, [reg1]) + DIVW reg1 reg2 reg3-> usage ([reg2,reg3], [reg1]) + DIVWU reg1 reg2 reg3-> usage ([reg2,reg3], [reg1]) + MULLW_MayOflo reg1 reg2 reg3 + -> usage ([reg2,reg3], [reg1]) + AND reg1 reg2 ri -> usage (reg2 : regRI ri, [reg1]) + OR reg1 reg2 ri -> usage (reg2 : regRI ri, [reg1]) + XOR reg1 reg2 ri -> usage (reg2 : regRI ri, [reg1]) + XORIS reg1 reg2 imm -> usage ([reg2], [reg1]) + EXTS siz reg1 reg2 -> usage ([reg2], [reg1]) + NEG reg1 reg2 -> usage ([reg2], [reg1]) + NOT reg1 reg2 -> usage ([reg2], [reg1]) + SLW reg1 reg2 ri -> usage (reg2 : regRI ri, [reg1]) + SRW reg1 reg2 ri -> usage (reg2 : regRI ri, [reg1]) + SRAW reg1 reg2 ri -> usage (reg2 : regRI ri, [reg1]) + RLWINM reg1 reg2 sh mb me + -> usage ([reg2], [reg1]) + FADD sz r1 r2 r3 -> usage ([r2,r3], [r1]) + FSUB sz r1 r2 r3 -> usage ([r2,r3], [r1]) + FMUL sz r1 r2 r3 -> usage ([r2,r3], [r1]) + FDIV sz r1 r2 r3 -> usage ([r2,r3], [r1]) + FNEG r1 r2 -> usage ([r2], [r1]) + FCMP r1 r2 -> usage ([r1,r2], []) + FCTIWZ r1 r2 -> usage ([r2], [r1]) + FRSP r1 r2 -> usage ([r2], [r1]) + MFCR reg -> usage ([], [reg]) + MFLR reg -> usage ([], [reg]) + FETCHPC reg -> usage ([], [reg]) + _ -> noUsage + where + usage (src, dst) = RU (filter interesting src) + (filter interesting dst) + regAddr (AddrRegReg r1 r2) = [r1, r2] + regAddr (AddrRegImm r1 _) = [r1] + + regRI (RIReg r) = [r] + regRI _ = [] +#endif /* powerpc_TARGET_ARCH */ + + +-- ----------------------------------------------------------------------------- +-- Determine the possible destinations from the current instruction. + +-- (we always assume that the next instruction is also a valid destination; +-- if this isn't the case then the jump should be at the end of the basic +-- block). 
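+
+-- For instance, with the PowerPC cases below, jumpDests (BCC cond id) acc
+-- is id : acc and jumpDests (BCTR targets) acc is targets ++ acc; any
+-- other instruction contributes no extra destinations, since falling
+-- through to the next block is covered by the assumption above.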
+ +jumpDests :: Instr -> [BlockId] -> [BlockId] +jumpDests insn acc + = case insn of +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + JXX _ id -> id : acc + JMP_TBL _ ids -> ids ++ acc +#elif powerpc_TARGET_ARCH + BCC _ id -> id : acc + BCTR targets -> targets ++ acc +#endif + _other -> acc + +patchJump :: Instr -> BlockId -> BlockId -> Instr + +patchJump insn old new + = case insn of +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + JXX cc id | id == old -> JXX cc new + JMP_TBL op ids -> error "Cannot patch JMP_TBL" +#elif powerpc_TARGET_ARCH + BCC cc id | id == old -> BCC cc new + BCTR targets -> error "Cannot patch BCTR" +#endif + _other -> insn + +-- ----------------------------------------------------------------------------- +-- 'patchRegs' function + +-- 'patchRegs' takes an instruction and applies the given mapping to +-- all the register references. + +patchRegs :: Instr -> (Reg -> Reg) -> Instr + +#if alpha_TARGET_ARCH + +patchRegs instr env = case instr of + LD sz reg addr -> LD sz (env reg) (fixAddr addr) + LDA reg addr -> LDA (env reg) (fixAddr addr) + LDAH reg addr -> LDAH (env reg) (fixAddr addr) + LDGP reg addr -> LDGP (env reg) (fixAddr addr) + LDI sz reg imm -> LDI sz (env reg) imm + ST sz reg addr -> ST sz (env reg) (fixAddr addr) + CLR reg -> CLR (env reg) + ABS sz ar reg -> ABS sz (fixRI ar) (env reg) + NEG sz ov ar reg -> NEG sz ov (fixRI ar) (env reg) + ADD sz ov r1 ar r2 -> ADD sz ov (env r1) (fixRI ar) (env r2) + SADD sz sc r1 ar r2 -> SADD sz sc (env r1) (fixRI ar) (env r2) + SUB sz ov r1 ar r2 -> SUB sz ov (env r1) (fixRI ar) (env r2) + SSUB sz sc r1 ar r2 -> SSUB sz sc (env r1) (fixRI ar) (env r2) + MUL sz ov r1 ar r2 -> MUL sz ov (env r1) (fixRI ar) (env r2) + DIV sz un r1 ar r2 -> DIV sz un (env r1) (fixRI ar) (env r2) + REM sz un r1 ar r2 -> REM sz un (env r1) (fixRI ar) (env r2) + NOT ar reg -> NOT (fixRI ar) (env reg) + AND r1 ar r2 -> AND (env r1) (fixRI ar) (env r2) + ANDNOT r1 ar r2 -> ANDNOT (env r1) (fixRI ar) (env r2) + OR r1 ar r2 -> OR (env r1) (fixRI ar) (env r2) + ORNOT r1 ar r2 -> ORNOT (env r1) (fixRI ar) (env r2) + XOR r1 ar r2 -> XOR (env r1) (fixRI ar) (env r2) + XORNOT r1 ar r2 -> XORNOT (env r1) (fixRI ar) (env r2) + SLL r1 ar r2 -> SLL (env r1) (fixRI ar) (env r2) + SRL r1 ar r2 -> SRL (env r1) (fixRI ar) (env r2) + SRA r1 ar r2 -> SRA (env r1) (fixRI ar) (env r2) + ZAP r1 ar r2 -> ZAP (env r1) (fixRI ar) (env r2) + ZAPNOT r1 ar r2 -> ZAPNOT (env r1) (fixRI ar) (env r2) + CMP co r1 ar r2 -> CMP co (env r1) (fixRI ar) (env r2) + FCLR reg -> FCLR (env reg) + FABS r1 r2 -> FABS (env r1) (env r2) + FNEG s r1 r2 -> FNEG s (env r1) (env r2) + FADD s r1 r2 r3 -> FADD s (env r1) (env r2) (env r3) + FDIV s r1 r2 r3 -> FDIV s (env r1) (env r2) (env r3) + FMUL s r1 r2 r3 -> FMUL s (env r1) (env r2) (env r3) + FSUB s r1 r2 r3 -> FSUB s (env r1) (env r2) (env r3) + CVTxy s1 s2 r1 r2 -> CVTxy s1 s2 (env r1) (env r2) + FCMP s co r1 r2 r3 -> FCMP s co (env r1) (env r2) (env r3) + FMOV r1 r2 -> FMOV (env r1) (env r2) + BI cond reg lbl -> BI cond (env reg) lbl + BF cond reg lbl -> BF cond (env reg) lbl + JMP reg addr hint -> JMP (env reg) (fixAddr addr) hint + JSR reg addr i -> JSR (env reg) (fixAddr addr) i + _ -> instr + where + fixAddr (AddrReg r1) = AddrReg (env r1) + fixAddr (AddrRegImm r1 i) = AddrRegImm (env r1) i + fixAddr other = other + + fixRI (RIReg r) = RIReg (env r) + fixRI other = other + +#endif /* alpha_TARGET_ARCH */ +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + +patchRegs instr 
env = case instr of + MOV sz src dst -> patch2 (MOV sz) src dst + MOVZxL sz src dst -> patch2 (MOVZxL sz) src dst + MOVSxL sz src dst -> patch2 (MOVSxL sz) src dst + LEA sz src dst -> patch2 (LEA sz) src dst + ADD sz src dst -> patch2 (ADD sz) src dst + ADC sz src dst -> patch2 (ADC sz) src dst + SUB sz src dst -> patch2 (SUB sz) src dst + IMUL sz src dst -> patch2 (IMUL sz) src dst + IMUL2 sz src -> patch1 (IMUL2 sz) src + MUL sz src dst -> patch2 (MUL sz) src dst + IDIV sz op -> patch1 (IDIV sz) op + DIV sz op -> patch1 (DIV sz) op + AND sz src dst -> patch2 (AND sz) src dst + OR sz src dst -> patch2 (OR sz) src dst + XOR sz src dst -> patch2 (XOR sz) src dst + NOT sz op -> patch1 (NOT sz) op + NEGI sz op -> patch1 (NEGI sz) op + SHL sz imm dst -> patch1 (SHL sz imm) dst + SAR sz imm dst -> patch1 (SAR sz imm) dst + SHR sz imm dst -> patch1 (SHR sz imm) dst + BT sz imm src -> patch1 (BT sz imm) src + TEST sz src dst -> patch2 (TEST sz) src dst + CMP sz src dst -> patch2 (CMP sz) src dst + PUSH sz op -> patch1 (PUSH sz) op + POP sz op -> patch1 (POP sz) op + SETCC cond op -> patch1 (SETCC cond) op + JMP op -> patch1 JMP op + JMP_TBL op ids -> patch1 JMP_TBL op $ ids + +#if i386_TARGET_ARCH + GMOV src dst -> GMOV (env src) (env dst) + GLD sz src dst -> GLD sz (lookupAddr src) (env dst) + GST sz src dst -> GST sz (env src) (lookupAddr dst) + + GLDZ dst -> GLDZ (env dst) + GLD1 dst -> GLD1 (env dst) + + GFTOI src dst -> GFTOI (env src) (env dst) + GDTOI src dst -> GDTOI (env src) (env dst) + + GITOF src dst -> GITOF (env src) (env dst) + GITOD src dst -> GITOD (env src) (env dst) + + GADD sz s1 s2 dst -> GADD sz (env s1) (env s2) (env dst) + GSUB sz s1 s2 dst -> GSUB sz (env s1) (env s2) (env dst) + GMUL sz s1 s2 dst -> GMUL sz (env s1) (env s2) (env dst) + GDIV sz s1 s2 dst -> GDIV sz (env s1) (env s2) (env dst) + + GCMP sz src1 src2 -> GCMP sz (env src1) (env src2) + GABS sz src dst -> GABS sz (env src) (env dst) + GNEG sz src dst -> GNEG sz (env src) (env dst) + GSQRT sz src dst -> GSQRT sz (env src) (env dst) + GSIN sz src dst -> GSIN sz (env src) (env dst) + GCOS sz src dst -> GCOS sz (env src) (env dst) + GTAN sz src dst -> GTAN sz (env src) (env dst) +#endif + +#if x86_64_TARGET_ARCH + CVTSS2SD src dst -> CVTSS2SD (env src) (env dst) + CVTSD2SS src dst -> CVTSD2SS (env src) (env dst) + CVTSS2SI src dst -> CVTSS2SI (patchOp src) (env dst) + CVTSD2SI src dst -> CVTSD2SI (patchOp src) (env dst) + CVTSI2SS src dst -> CVTSI2SS (patchOp src) (env dst) + CVTSI2SD src dst -> CVTSI2SD (patchOp src) (env dst) + FDIV sz src dst -> FDIV sz (patchOp src) (patchOp dst) +#endif + + CALL (Left imm) _ -> instr + CALL (Right reg) p -> CALL (Right (env reg)) p + + FETCHGOT reg -> FETCHGOT (env reg) + FETCHPC reg -> FETCHPC (env reg) + + NOP -> instr + COMMENT _ -> instr + DELTA _ -> instr + JXX _ _ -> instr + CLTD _ -> instr + + _other -> panic "patchRegs: unrecognised instr" + + where + patch1 insn op = insn $! patchOp op + patch2 insn src dst = (insn $! patchOp src) $! patchOp dst + + patchOp (OpReg reg) = OpReg $! env reg + patchOp (OpImm imm) = OpImm imm + patchOp (OpAddr ea) = OpAddr $! lookupAddr ea + + lookupAddr (ImmAddr imm off) = ImmAddr imm off + lookupAddr (AddrBaseIndex base index disp) + = ((AddrBaseIndex $! lookupBase base) $! 
lookupIndex index) disp + where + lookupBase EABaseNone = EABaseNone + lookupBase EABaseRip = EABaseRip + lookupBase (EABaseReg r) = EABaseReg (env r) + + lookupIndex EAIndexNone = EAIndexNone + lookupIndex (EAIndex r i) = EAIndex (env r) i + +#endif /* i386_TARGET_ARCH || x86_64_TARGET_ARCH*/ +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +#if sparc_TARGET_ARCH + +patchRegs instr env = case instr of + LD sz addr reg -> LD sz (fixAddr addr) (env reg) + ST sz reg addr -> ST sz (env reg) (fixAddr addr) + ADD x cc r1 ar r2 -> ADD x cc (env r1) (fixRI ar) (env r2) + SUB x cc r1 ar r2 -> SUB x cc (env r1) (fixRI ar) (env r2) + UMUL cc r1 ar r2 -> UMUL cc (env r1) (fixRI ar) (env r2) + SMUL cc r1 ar r2 -> SMUL cc (env r1) (fixRI ar) (env r2) + RDY rd -> RDY (env rd) + AND b r1 ar r2 -> AND b (env r1) (fixRI ar) (env r2) + ANDN b r1 ar r2 -> ANDN b (env r1) (fixRI ar) (env r2) + OR b r1 ar r2 -> OR b (env r1) (fixRI ar) (env r2) + ORN b r1 ar r2 -> ORN b (env r1) (fixRI ar) (env r2) + XOR b r1 ar r2 -> XOR b (env r1) (fixRI ar) (env r2) + XNOR b r1 ar r2 -> XNOR b (env r1) (fixRI ar) (env r2) + SLL r1 ar r2 -> SLL (env r1) (fixRI ar) (env r2) + SRL r1 ar r2 -> SRL (env r1) (fixRI ar) (env r2) + SRA r1 ar r2 -> SRA (env r1) (fixRI ar) (env r2) + SETHI imm reg -> SETHI imm (env reg) + FABS s r1 r2 -> FABS s (env r1) (env r2) + FADD s r1 r2 r3 -> FADD s (env r1) (env r2) (env r3) + FCMP e s r1 r2 -> FCMP e s (env r1) (env r2) + FDIV s r1 r2 r3 -> FDIV s (env r1) (env r2) (env r3) + FMOV s r1 r2 -> FMOV s (env r1) (env r2) + FMUL s r1 r2 r3 -> FMUL s (env r1) (env r2) (env r3) + FNEG s r1 r2 -> FNEG s (env r1) (env r2) + FSQRT s r1 r2 -> FSQRT s (env r1) (env r2) + FSUB s r1 r2 r3 -> FSUB s (env r1) (env r2) (env r3) + FxTOy s1 s2 r1 r2 -> FxTOy s1 s2 (env r1) (env r2) + JMP addr -> JMP (fixAddr addr) + CALL (Left i) n t -> CALL (Left i) n t + CALL (Right r) n t -> CALL (Right (env r)) n t + _ -> instr + where + fixAddr (AddrRegReg r1 r2) = AddrRegReg (env r1) (env r2) + fixAddr (AddrRegImm r1 i) = AddrRegImm (env r1) i + + fixRI (RIReg r) = RIReg (env r) + fixRI other = other + +#endif /* sparc_TARGET_ARCH */ +-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +#if powerpc_TARGET_ARCH + +patchRegs instr env = case instr of + LD sz reg addr -> LD sz (env reg) (fixAddr addr) + LA sz reg addr -> LA sz (env reg) (fixAddr addr) + ST sz reg addr -> ST sz (env reg) (fixAddr addr) + STU sz reg addr -> STU sz (env reg) (fixAddr addr) + LIS reg imm -> LIS (env reg) imm + LI reg imm -> LI (env reg) imm + MR reg1 reg2 -> MR (env reg1) (env reg2) + CMP sz reg ri -> CMP sz (env reg) (fixRI ri) + CMPL sz reg ri -> CMPL sz (env reg) (fixRI ri) + BCC cond lbl -> BCC cond lbl + MTCTR reg -> MTCTR (env reg) + BCTR targets -> BCTR targets + BL imm argRegs -> BL imm argRegs -- argument regs + BCTRL argRegs -> BCTRL argRegs -- cannot be remapped + ADD reg1 reg2 ri -> ADD (env reg1) (env reg2) (fixRI ri) + ADDC reg1 reg2 reg3-> ADDC (env reg1) (env reg2) (env reg3) + ADDE reg1 reg2 reg3-> ADDE (env reg1) (env reg2) (env reg3) + ADDIS reg1 reg2 imm -> ADDIS (env reg1) (env reg2) imm + SUBF reg1 reg2 reg3-> SUBF (env reg1) (env reg2) (env reg3) + MULLW reg1 reg2 ri -> MULLW (env reg1) (env reg2) (fixRI ri) + DIVW reg1 reg2 reg3-> DIVW (env reg1) (env reg2) (env reg3) + DIVWU reg1 reg2 reg3-> DIVWU (env reg1) (env reg2) (env reg3) + MULLW_MayOflo reg1 reg2 reg3 + -> MULLW_MayOflo (env reg1) (env reg2) (env reg3) + AND reg1 reg2 ri -> AND (env reg1) (env reg2) (fixRI ri) + OR reg1 reg2 ri -> 
OR (env reg1) (env reg2) (fixRI ri) + XOR reg1 reg2 ri -> XOR (env reg1) (env reg2) (fixRI ri) + XORIS reg1 reg2 imm -> XORIS (env reg1) (env reg2) imm + EXTS sz reg1 reg2 -> EXTS sz (env reg1) (env reg2) + NEG reg1 reg2 -> NEG (env reg1) (env reg2) + NOT reg1 reg2 -> NOT (env reg1) (env reg2) + SLW reg1 reg2 ri -> SLW (env reg1) (env reg2) (fixRI ri) + SRW reg1 reg2 ri -> SRW (env reg1) (env reg2) (fixRI ri) + SRAW reg1 reg2 ri -> SRAW (env reg1) (env reg2) (fixRI ri) + RLWINM reg1 reg2 sh mb me + -> RLWINM (env reg1) (env reg2) sh mb me + FADD sz r1 r2 r3 -> FADD sz (env r1) (env r2) (env r3) + FSUB sz r1 r2 r3 -> FSUB sz (env r1) (env r2) (env r3) + FMUL sz r1 r2 r3 -> FMUL sz (env r1) (env r2) (env r3) + FDIV sz r1 r2 r3 -> FDIV sz (env r1) (env r2) (env r3) + FNEG r1 r2 -> FNEG (env r1) (env r2) + FCMP r1 r2 -> FCMP (env r1) (env r2) + FCTIWZ r1 r2 -> FCTIWZ (env r1) (env r2) + FRSP r1 r2 -> FRSP (env r1) (env r2) + MFCR reg -> MFCR (env reg) + MFLR reg -> MFLR (env reg) + FETCHPC reg -> FETCHPC (env reg) + _ -> instr + where + fixAddr (AddrRegReg r1 r2) = AddrRegReg (env r1) (env r2) + fixAddr (AddrRegImm r1 i) = AddrRegImm (env r1) i + + fixRI (RIReg r) = RIReg (env r) + fixRI other = other +#endif /* powerpc_TARGET_ARCH */ + +-- ----------------------------------------------------------------------------- +-- Detecting reg->reg moves + +-- The register allocator attempts to eliminate reg->reg moves whenever it can, +-- by assigning the src and dest temporaries to the same real register. + +isRegRegMove :: Instr -> Maybe (Reg,Reg) +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH +-- TMP: +isRegRegMove (MOV _ (OpReg r1) (OpReg r2)) = Just (r1,r2) +#elif powerpc_TARGET_ARCH +isRegRegMove (MR dst src) = Just (src,dst) +#else +#warning ToDo: isRegRegMove +#endif +isRegRegMove _ = Nothing + +-- ----------------------------------------------------------------------------- +-- Generating spill instructions + +mkSpillInstr + :: Reg -- register to spill (should be a real) + -> Int -- current stack delta + -> Int -- spill slot to use + -> Instr +mkSpillInstr reg delta slot + = ASSERT(isRealReg reg) + let + off = spillSlotToOffset slot + in +#ifdef alpha_TARGET_ARCH + {-Alpha: spill below the stack pointer (?)-} + ST sz dyn (spRel (- (off `div` 8))) +#endif +#ifdef i386_TARGET_ARCH + let off_w = (off-delta) `div` 4 + in case regClass reg of + RcInteger -> MOV I32 (OpReg reg) (OpAddr (spRel off_w)) + _ -> GST F80 reg (spRel off_w) {- RcFloat/RcDouble -} +#endif +#ifdef x86_64_TARGET_ARCH + let off_w = (off-delta) `div` 8 + in case regClass reg of + RcInteger -> MOV I64 (OpReg reg) (OpAddr (spRel off_w)) + RcDouble -> MOV F64 (OpReg reg) (OpAddr (spRel off_w)) + -- ToDo: will it work to always spill as a double? + -- does that cause a stall if the data was a float? 
+#endif +#ifdef sparc_TARGET_ARCH + {-SPARC: spill below frame pointer leaving 2 words/spill-} + let{off_w = 1 + (off `div` 4); + sz = case regClass reg of { + RcInteger -> I32; + RcFloat -> F32; + RcDouble -> F64}} + in ST sz reg (fpRel (- off_w)) +#endif +#ifdef powerpc_TARGET_ARCH + let sz = case regClass reg of + RcInteger -> I32 + RcDouble -> F64 + in ST sz reg (AddrRegImm sp (ImmInt (off-delta))) +#endif + + +mkLoadInstr + :: Reg -- register to load (should be a real) + -> Int -- current stack delta + -> Int -- spill slot to use + -> Instr +mkLoadInstr reg delta slot + = ASSERT(isRealReg reg) + let + off = spillSlotToOffset slot + in +#if alpha_TARGET_ARCH + LD sz dyn (spRel (- (off `div` 8))) +#endif +#if i386_TARGET_ARCH + let off_w = (off-delta) `div` 4 + in case regClass reg of { + RcInteger -> MOV I32 (OpAddr (spRel off_w)) (OpReg reg); + _ -> GLD F80 (spRel off_w) reg} {- RcFloat/RcDouble -} +#endif +#if x86_64_TARGET_ARCH + let off_w = (off-delta) `div` 8 + in case regClass reg of + RcInteger -> MOV I64 (OpAddr (spRel off_w)) (OpReg reg) + _ -> MOV F64 (OpAddr (spRel off_w)) (OpReg reg) +#endif +#if sparc_TARGET_ARCH + let{off_w = 1 + (off `div` 4); + sz = case regClass reg of { + RcInteger -> I32; + RcFloat -> F32; + RcDouble -> F64}} + in LD sz (fpRel (- off_w)) reg +#endif +#if powerpc_TARGET_ARCH + let sz = case regClass reg of + RcInteger -> I32 + RcDouble -> F64 + in LD sz reg (AddrRegImm sp (ImmInt (off-delta))) +#endif + +mkRegRegMoveInstr + :: Reg + -> Reg + -> Instr +mkRegRegMoveInstr src dst +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH + = case regClass src of + RcInteger -> MOV wordRep (OpReg src) (OpReg dst) +#if i386_TARGET_ARCH + RcDouble -> GMOV src dst +#else + RcDouble -> MOV F64 (OpReg src) (OpReg dst) +#endif +#elif powerpc_TARGET_ARCH + = MR dst src +#endif + +mkBranchInstr + :: BlockId + -> [Instr] +#if alpha_TARGET_ARCH +mkBranchInstr id = [BR id] +#endif + +#if i386_TARGET_ARCH || x86_64_TARGET_ARCH +mkBranchInstr id = [JXX ALWAYS id] +#endif + +#if sparc_TARGET_ARCH +mkBranchInstr (BlockId id) = [BI ALWAYS False (ImmCLbl (mkAsmTempLabel id)), NOP] +#endif + +#if powerpc_TARGET_ARCH +mkBranchInstr id = [BCC ALWAYS id] +#endif + + +spillSlotSize :: Int +spillSlotSize = IF_ARCH_i386(12, 8) + +maxSpillSlots :: Int +maxSpillSlots = ((rESERVED_C_STACK_BYTES - 64) `div` spillSlotSize) - 1 + +-- convert a spill slot number to a *byte* offset, with no sign: +-- decide on a per arch basis whether you are spilling above or below +-- the C stack pointer. +spillSlotToOffset :: Int -> Int +spillSlotToOffset slot + | slot >= 0 && slot < maxSpillSlots + = 64 + spillSlotSize * slot + | otherwise + = pprPanic "spillSlotToOffset:" + (text "invalid spill location: " <> int slot) diff --git a/compiler/nativeGen/RegisterAlloc.hs b/compiler/nativeGen/RegisterAlloc.hs new file mode 100644 index 0000000000..7d2ab1b6d6 --- /dev/null +++ b/compiler/nativeGen/RegisterAlloc.hs @@ -0,0 +1,1004 @@ +----------------------------------------------------------------------------- +-- +-- The register allocator +-- +-- (c) The University of Glasgow 2004 +-- +----------------------------------------------------------------------------- + +{- +The algorithm is roughly: + + 1) Compute strongly connected components of the basic block list. + + 2) Compute liveness (mapping from pseudo register to + point(s) of death?). + + 3) Walk instructions in each basic block. We keep track of + (a) Free real registers (a bitmap?) 
+ (b) Current assignment of temporaries to machine registers and/or + spill slots (call this the "assignment"). + (c) Partial mapping from basic block ids to a virt-to-loc mapping. + When we first encounter a branch to a basic block, + we fill in its entry in this table with the current mapping. + + For each instruction: + (a) For each real register clobbered by this instruction: + If a temporary resides in it, + If the temporary is live after this instruction, + Move the temporary to another (non-clobbered & free) reg, + or spill it to memory. Mark the temporary as residing + in both memory and a register if it was spilled (it might + need to be read by this instruction). + (ToDo: this is wrong for jump instructions?) + + (b) For each temporary *read* by the instruction: + If the temporary does not have a real register allocation: + - Allocate a real register from the free list. If + the list is empty: + - Find a temporary to spill. Pick one that is + not used in this instruction (ToDo: not + used for a while...) + - generate a spill instruction + - If the temporary was previously spilled, + generate an instruction to read the temp from its spill loc. + (optimisation: if we can see that a real register is going to + be used soon, then don't use it for allocation). + + (c) Update the current assignment + + (d) If the intstruction is a branch: + if the destination block already has a register assignment, + Generate a new block with fixup code and redirect the + jump to the new block. + else, + Update the block id->assignment mapping with the current + assignment. + + (e) Delete all register assignments for temps which are read + (only) and die here. Update the free register list. + + (f) Mark all registers clobbered by this instruction as not free, + and mark temporaries which have been spilled due to clobbering + as in memory (step (a) marks then as in both mem & reg). + + (g) For each temporary *written* by this instruction: + Allocate a real register as for (b), spilling something + else if necessary. + - except when updating the assignment, drop any memory + locations that the temporary was previously in, since + they will be no longer valid after this instruction. + + (h) Delete all register assignments for temps which are + written and die here (there should rarely be any). Update + the free register list. + + (i) Rewrite the instruction with the new mapping. + + (j) For each spilled reg known to be now dead, re-add its stack slot + to the free list. + +-} + +module RegisterAlloc ( + regAlloc + ) where + +#include "HsVersions.h" + +import PprMach +import MachRegs +import MachInstrs +import RegAllocInfo +import Cmm + +import Digraph +import Unique ( Uniquable(getUnique), Unique ) +import UniqSet +import UniqFM +import UniqSupply +import Outputable + +#ifndef DEBUG +import Maybe ( fromJust ) +#endif +import Maybe ( fromMaybe ) +import List ( nub, partition, mapAccumL, groupBy ) +import Monad ( when ) +import DATA_WORD +import DATA_BITS + +-- ----------------------------------------------------------------------------- +-- Some useful types + +type RegSet = UniqSet Reg + +type RegMap a = UniqFM a +emptyRegMap = emptyUFM + +type BlockMap a = UniqFM a +emptyBlockMap = emptyUFM + +-- A basic block where the isntructions are annotated with the registers +-- which are no longer live in the *next* instruction in this sequence. +-- (NB. 
if the instruction is a jump, these registers might still be live +-- at the jump target(s) - you have to check the liveness at the destination +-- block to find out). +type AnnBasicBlock + = GenBasicBlock (Instr, + [Reg], -- registers read (only) which die + [Reg]) -- registers written which die + +-- ----------------------------------------------------------------------------- +-- The free register set + +-- This needs to be *efficient* + +{- Here's an inefficient 'executable specification' of the FreeRegs data type: +type FreeRegs = [RegNo] + +noFreeRegs = 0 +releaseReg n f = if n `elem` f then f else (n : f) +initFreeRegs = allocatableRegs +getFreeRegs cls f = filter ( (==cls) . regClass . RealReg ) f +allocateReg f r = filter (/= r) f +-} + +#if defined(powerpc_TARGET_ARCH) + +-- The PowerPC has 32 integer and 32 floating point registers. +-- This is 32bit PowerPC, so Word64 is inefficient - two Word32s are much +-- better. +-- Note that when getFreeRegs scans for free registers, it starts at register +-- 31 and counts down. This is a hack for the PowerPC - the higher-numbered +-- registers are callee-saves, while the lower regs are caller-saves, so it +-- makes sense to start at the high end. +-- Apart from that, the code does nothing PowerPC-specific, so feel free to +-- add your favourite platform to the #if (if you have 64 registers but only +-- 32-bit words). + +data FreeRegs = FreeRegs !Word32 !Word32 + deriving( Show ) -- The Show is used in an ASSERT + +noFreeRegs :: FreeRegs +noFreeRegs = FreeRegs 0 0 + +releaseReg :: RegNo -> FreeRegs -> FreeRegs +releaseReg r (FreeRegs g f) + | r > 31 = FreeRegs g (f .|. (1 `shiftL` (fromIntegral r - 32))) + | otherwise = FreeRegs (g .|. (1 `shiftL` fromIntegral r)) f + +initFreeRegs :: FreeRegs +initFreeRegs = foldr releaseReg noFreeRegs allocatableRegs + +getFreeRegs :: RegClass -> FreeRegs -> [RegNo] -- lazilly +getFreeRegs cls (FreeRegs g f) + | RcDouble <- cls = go f (0x80000000) 63 + | RcInteger <- cls = go g (0x80000000) 31 + where + go x 0 i = [] + go x m i | x .&. m /= 0 = i : (go x (m `shiftR` 1) $! i-1) + | otherwise = go x (m `shiftR` 1) $! i-1 + +allocateReg :: RegNo -> FreeRegs -> FreeRegs +allocateReg r (FreeRegs g f) + | r > 31 = FreeRegs g (f .&. complement (1 `shiftL` (fromIntegral r - 32))) + | otherwise = FreeRegs (g .&. complement (1 `shiftL` fromIntegral r)) f + +#else + +-- If we have less than 32 registers, or if we have efficient 64-bit words, +-- we will just use a single bitfield. + +#if defined(alpha_TARGET_ARCH) +type FreeRegs = Word64 +#else +type FreeRegs = Word32 +#endif + +noFreeRegs :: FreeRegs +noFreeRegs = 0 + +releaseReg :: RegNo -> FreeRegs -> FreeRegs +releaseReg n f = f .|. (1 `shiftL` n) + +initFreeRegs :: FreeRegs +initFreeRegs = foldr releaseReg noFreeRegs allocatableRegs + +getFreeRegs :: RegClass -> FreeRegs -> [RegNo] -- lazilly +getFreeRegs cls f = go f 0 + where go 0 m = [] + go n m + | n .&. 1 /= 0 && regClass (RealReg m) == cls + = m : (go (n `shiftR` 1) $! (m+1)) + | otherwise + = go (n `shiftR` 1) $! (m+1) + -- ToDo: there's no point looking through all the integer registers + -- in order to find a floating-point one. + +allocateReg :: RegNo -> FreeRegs -> FreeRegs +allocateReg r f = f .&. complement (1 `shiftL` fromIntegral r) + +#endif + +-- ----------------------------------------------------------------------------- +-- The free list of stack slots + +-- This doesn't need to be so efficient. 
It also doesn't really need to be +-- maintained as a set, so we just use an ordinary list (lazy, because it +-- contains all the possible stack slots and there are lots :-). +-- We do one more thing here: We make sure that we always use the same stack +-- slot to spill the same temporary. That way, the stack slot assignments +-- will always match up and we never need to worry about memory-to-memory +-- moves when generating fixup code. + +type StackSlot = Int +data FreeStack = FreeStack [StackSlot] (UniqFM StackSlot) + +completelyFreeStack :: FreeStack +completelyFreeStack = FreeStack [0..maxSpillSlots] emptyUFM + +getFreeStackSlot :: FreeStack -> (FreeStack,Int) +getFreeStackSlot (FreeStack (slot:stack) reserved) + = (FreeStack stack reserved,slot) + +freeStackSlot :: FreeStack -> Int -> FreeStack +freeStackSlot (FreeStack stack reserved) slot + -- NOTE: This is probably terribly, unthinkably slow. + -- But on the other hand, it never gets called, because the allocator + -- currently does not free stack slots. So who cares if it's slow? + | slot `elem` eltsUFM reserved = FreeStack stack reserved + | otherwise = FreeStack (slot:stack) reserved + + +getFreeStackSlotFor :: FreeStack -> Unique -> (FreeStack,Int) +getFreeStackSlotFor fs@(FreeStack _ reserved) reg = + case lookupUFM reserved reg of + Just slot -> (fs,slot) + Nothing -> let (FreeStack stack' _, slot) = getFreeStackSlot fs + in (FreeStack stack' (addToUFM reserved reg slot), slot) + +-- ----------------------------------------------------------------------------- +-- Top level of the register allocator + +regAlloc :: NatCmmTop -> UniqSM NatCmmTop +regAlloc (CmmData sec d) = returnUs $ CmmData sec d +regAlloc (CmmProc info lbl params []) + = returnUs $ CmmProc info lbl params [] -- no blocks to run the regalloc on +regAlloc (CmmProc info lbl params blocks@(first:rest)) + = let + first_id = blockId first + sccs = sccBlocks blocks + (ann_sccs, block_live) = computeLiveness sccs + in linearRegAlloc block_live ann_sccs `thenUs` \final_blocks -> + let ((first':_),rest') = partition ((== first_id) . blockId) final_blocks + in returnUs $ -- pprTrace "Liveness" (ppr block_live) $ + CmmProc info lbl params (first':rest') + +sccBlocks :: [NatBasicBlock] -> [SCC NatBasicBlock] +sccBlocks blocks = stronglyConnComp graph + where + getOutEdges :: [Instr] -> [BlockId] + getOutEdges instrs = foldr jumpDests [] instrs + + graph = [ (block, getUnique id, map getUnique (getOutEdges instrs)) + | block@(BasicBlock id instrs) <- blocks ] + + +-- ----------------------------------------------------------------------------- +-- Computing liveness + +computeLiveness + :: [SCC NatBasicBlock] + -> ([SCC AnnBasicBlock], -- instructions annotated with list of registers + -- which are "dead after this instruction". + BlockMap RegSet) -- blocks annontated with set of live registers + -- on entry to the block. + + -- NOTE: on entry, the SCCs are in "reverse" order: later blocks may transfer + -- control to earlier ones only. The SCCs returned are in the *opposite* + -- order, which is exactly what we want for the next pass. 
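+
+-- Roughly, as an example of the annotation produced (virtual registers are
+-- hypothetical): if an instruction reads v2, does not write it, and v2 is
+-- not needed again later in this block, then v2 lands in that
+-- instruction's "read (only) and dying" list; a register written here but
+-- not needed later in the block lands in the "written and dying" list.
+-- (As noted above, for jump instructions such a register may still be
+-- live at the jump target.)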
+ +computeLiveness sccs + = livenessSCCs emptyBlockMap [] sccs + where + livenessSCCs + :: BlockMap RegSet + -> [SCC AnnBasicBlock] -- accum + -> [SCC NatBasicBlock] + -> ([SCC AnnBasicBlock], BlockMap RegSet) + + livenessSCCs blockmap done [] = (done, blockmap) + livenessSCCs blockmap done + (AcyclicSCC (BasicBlock block_id instrs) : sccs) = + {- pprTrace "live instrs" (ppr (getUnique block_id) $$ + vcat (map (\(instr,regs) -> docToSDoc (pprInstr instr) $$ ppr regs) instrs')) $ + -} + livenessSCCs blockmap' + (AcyclicSCC (BasicBlock block_id instrs'):done) sccs + where (live,instrs') = liveness emptyUniqSet blockmap [] + (reverse instrs) + blockmap' = addToUFM blockmap block_id live + + livenessSCCs blockmap done + (CyclicSCC blocks : sccs) = + livenessSCCs blockmap' (CyclicSCC blocks':done) sccs + where (blockmap', blocks') + = iterateUntilUnchanged linearLiveness equalBlockMaps + blockmap blocks + + iterateUntilUnchanged + :: (a -> b -> (a,c)) -> (a -> a -> Bool) + -> a -> b + -> (a,c) + + iterateUntilUnchanged f eq a b + = head $ + concatMap tail $ + groupBy (\(a1, _) (a2, _) -> eq a1 a2) $ + iterate (\(a, _) -> f a b) $ + (a, error "RegisterAlloc.livenessSCCs") + + + linearLiveness :: BlockMap RegSet -> [NatBasicBlock] + -> (BlockMap RegSet, [AnnBasicBlock]) + linearLiveness = mapAccumL processBlock + + processBlock blockmap input@(BasicBlock block_id instrs) + = (blockmap', BasicBlock block_id instrs') + where (live,instrs') = liveness emptyUniqSet blockmap [] + (reverse instrs) + blockmap' = addToUFM blockmap block_id live + + -- probably the least efficient way to compare two + -- BlockMaps for equality. + equalBlockMaps a b + = a' == b' + where a' = map f $ ufmToList a + b' = map f $ ufmToList b + f (key,elt) = (key, uniqSetToList elt) + + liveness :: RegSet -- live regs + -> BlockMap RegSet -- live regs on entry to other BBs + -> [(Instr,[Reg],[Reg])] -- instructions (accum) + -> [Instr] -- instructions + -> (RegSet, [(Instr,[Reg],[Reg])]) + + liveness liveregs blockmap done [] = (liveregs, done) + liveness liveregs blockmap done (instr:instrs) + = liveness liveregs2 blockmap ((instr,r_dying,w_dying):done) instrs + where + RU read written = regUsage instr + + -- registers that were written here are dead going backwards. + -- registers that were read here are live going backwards. + liveregs1 = (liveregs `delListFromUniqSet` written) + `addListToUniqSet` read + + -- union in the live regs from all the jump destinations of this + -- instruction. + targets = jumpDests instr [] -- where we go from here + liveregs2 = unionManyUniqSets + (liveregs1 : map targetLiveRegs targets) + + targetLiveRegs target = case lookupUFM blockmap target of + Just ra -> ra + Nothing -> emptyBlockMap + + -- registers that are not live beyond this point, are recorded + -- as dying here. + r_dying = [ reg | reg <- read, reg `notElem` written, + not (elementOfUniqSet reg liveregs) ] + + w_dying = [ reg | reg <- written, + not (elementOfUniqSet reg liveregs) ] + + +-- ----------------------------------------------------------------------------- +-- Linear sweep to allocate registers + +data Loc = InReg {-# UNPACK #-} !RegNo + | InMem {-# UNPACK #-} !Int -- stack slot + | InBoth {-# UNPACK #-} !RegNo + {-# UNPACK #-} !Int -- stack slot + deriving (Eq, Show, Ord) + +{- +A temporary can be marked as living in both a register and memory +(InBoth), for example if it was recently loaded from a spill location. 
+This makes it cheap to spill (no save instruction required), but we +have to be careful to turn this into InReg if the value in the +register is changed. + +This is also useful when a temporary is about to be clobbered. We +save it in a spill location, but mark it as InBoth because the current +instruction might still want to read it. +-} + +#ifdef DEBUG +instance Outputable Loc where + ppr l = text (show l) +#endif + +linearRegAlloc + :: BlockMap RegSet -- live regs on entry to each basic block + -> [SCC AnnBasicBlock] -- instructions annotated with "deaths" + -> UniqSM [NatBasicBlock] +linearRegAlloc block_live sccs = linearRA_SCCs emptyBlockMap sccs + where + linearRA_SCCs + :: BlockAssignment + -> [SCC AnnBasicBlock] + -> UniqSM [NatBasicBlock] + linearRA_SCCs block_assig [] = returnUs [] + linearRA_SCCs block_assig + (AcyclicSCC (BasicBlock id instrs) : sccs) + = getUs `thenUs` \us -> + let + (block_assig',(instrs',fixups)) = + case lookupUFM block_assig id of + -- no prior info about this block: assume everything is + -- free and the assignment is empty. + Nothing -> + runR block_assig initFreeRegs + emptyRegMap completelyFreeStack us $ + linearRA [] [] instrs + Just (freeregs,stack,assig) -> + runR block_assig freeregs assig stack us $ + linearRA [] [] instrs + in + linearRA_SCCs block_assig' sccs `thenUs` \moreBlocks -> + returnUs $ BasicBlock id instrs' : fixups ++ moreBlocks + + linearRA_SCCs block_assig + (CyclicSCC blocks : sccs) + = getUs `thenUs` \us -> + let + ((block_assig', us'), blocks') = mapAccumL processBlock + (block_assig, us) + ({-reverse-} blocks) + in + linearRA_SCCs block_assig' sccs `thenUs` \moreBlocks -> + returnUs $ concat blocks' ++ moreBlocks + where + processBlock (block_assig, us0) (BasicBlock id instrs) + = ((block_assig', us'), BasicBlock id instrs' : fixups) + where + (us, us') = splitUniqSupply us0 + (block_assig',(instrs',fixups)) = + case lookupUFM block_assig id of + -- no prior info about this block: assume everything is + -- free and the assignment is empty. + Nothing -> + runR block_assig initFreeRegs + emptyRegMap completelyFreeStack us $ + linearRA [] [] instrs + Just (freeregs,stack,assig) -> + runR block_assig freeregs assig stack us $ + linearRA [] [] instrs + + linearRA :: [Instr] -> [NatBasicBlock] -> [(Instr,[Reg],[Reg])] + -> RegM ([Instr], [NatBasicBlock]) + linearRA instr_acc fixups [] = + return (reverse instr_acc, fixups) + linearRA instr_acc fixups (instr:instrs) = do + (instr_acc', new_fixups) <- raInsn block_live instr_acc instr + linearRA instr_acc' (new_fixups++fixups) instrs + +-- ----------------------------------------------------------------------------- +-- Register allocation for a single instruction + +type BlockAssignment = BlockMap (FreeRegs, FreeStack, RegMap Loc) + +raInsn :: BlockMap RegSet -- Live temporaries at each basic block + -> [Instr] -- new instructions (accum.) + -> (Instr,[Reg],[Reg]) -- the instruction (with "deaths") + -> RegM ( + [Instr], -- new instructions + [NatBasicBlock] -- extra fixup blocks + ) + +raInsn block_live new_instrs (instr@(DELTA n), _, _) = do + setDeltaR n + return (new_instrs, []) + +raInsn block_live new_instrs (instr, r_dying, w_dying) = do + assig <- getAssigR + + -- If we have a reg->reg move between virtual registers, where the + -- src register is not live after this instruction, and the dst + -- register does not already have an assignment, then we can + -- eliminate the instruction. 
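+  -- For example (hypothetical virtual registers): a PowerPC MR v7 v3
+  -- where v3 dies at this instruction and v7 has no assignment yet is
+  -- handled by rebinding v7 to whatever location v3 occupied and emitting
+  -- no code for the move at all.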
+ case isRegRegMove instr of + Just (src,dst) + | src `elem` r_dying, + isVirtualReg dst, + Just loc <- lookupUFM assig src, + not (dst `elemUFM` assig) -> do + setAssigR (addToUFM (delFromUFM assig src) dst loc) + return (new_instrs, []) + + other -> genRaInsn block_live new_instrs instr r_dying w_dying + + +genRaInsn block_live new_instrs instr r_dying w_dying = + case regUsage instr of { RU read written -> + case partition isRealReg written of { (real_written1,virt_written) -> + do + let + real_written = [ r | RealReg r <- real_written1 ] + + -- we don't need to do anything with real registers that are + -- only read by this instr. (the list is typically ~2 elements, + -- so using nub isn't a problem). + virt_read = nub (filter isVirtualReg read) + -- in + + -- (a) save any temporaries which will be clobbered by this instruction + clobber_saves <- saveClobberedTemps real_written r_dying + + {- + freeregs <- getFreeRegsR + assig <- getAssigR + pprTrace "raInsn" (docToSDoc (pprInstr instr) $$ ppr r_dying <+> ppr w_dying $$ ppr virt_read <+> ppr virt_written $$ text (show freeregs) $$ ppr assig) $ do + -} + + -- (b), (c) allocate real regs for all regs read by this instruction. + (r_spills, r_allocd) <- + allocateRegsAndSpill True{-reading-} virt_read [] [] virt_read + + -- (d) Update block map for new destinations + -- NB. do this before removing dead regs from the assignment, because + -- these dead regs might in fact be live in the jump targets (they're + -- only dead in the code that follows in the current basic block). + (fixup_blocks, adjusted_instr) + <- joinToTargets block_live [] instr (jumpDests instr []) + + -- (e) Delete all register assignments for temps which are read + -- (only) and die here. Update the free register list. + releaseRegs r_dying + + -- (f) Mark regs which are clobbered as unallocatable + clobberRegs real_written + + -- (g) Allocate registers for temporaries *written* (only) + (w_spills, w_allocd) <- + allocateRegsAndSpill False{-writing-} virt_written [] [] virt_written + + -- (h) Release registers for temps which are written here and not + -- used again. 
+ releaseRegs w_dying + + let + -- (i) Patch the instruction + patch_map = listToUFM [ (t,RealReg r) | + (t,r) <- zip virt_read r_allocd + ++ zip virt_written w_allocd ] + + patched_instr = patchRegs adjusted_instr patchLookup + patchLookup x = case lookupUFM patch_map x of + Nothing -> x + Just y -> y + -- in + + -- pprTrace "patched" (docToSDoc (pprInstr patched_instr)) $ do + + -- (j) free up stack slots for dead spilled regs + -- TODO (can't be bothered right now) + + return (patched_instr : w_spills ++ reverse r_spills + ++ clobber_saves ++ new_instrs, + fixup_blocks) + }} + +-- ----------------------------------------------------------------------------- +-- releaseRegs + +releaseRegs regs = do + assig <- getAssigR + free <- getFreeRegsR + loop assig free regs + where + loop assig free _ | free `seq` False = undefined + loop assig free [] = do setAssigR assig; setFreeRegsR free; return () + loop assig free (RealReg r : rs) = loop assig (releaseReg r free) rs + loop assig free (r:rs) = + case lookupUFM assig r of + Just (InBoth real _) -> loop (delFromUFM assig r) (releaseReg real free) rs + Just (InReg real) -> loop (delFromUFM assig r) (releaseReg real free) rs + _other -> loop (delFromUFM assig r) free rs + +-- ----------------------------------------------------------------------------- +-- Clobber real registers + +{- +For each temp in a register that is going to be clobbered: + - if the temp dies after this instruction, do nothing + - otherwise, put it somewhere safe (another reg if possible, + otherwise spill and record InBoth in the assignment). + +for allocateRegs on the temps *read*, + - clobbered regs are allocatable. + +for allocateRegs on the temps *written*, + - clobbered regs are not allocatable. +-} + +saveClobberedTemps + :: [RegNo] -- real registers clobbered by this instruction + -> [Reg] -- registers which are no longer live after this insn + -> RegM [Instr] -- return: instructions to spill any temps that will + -- be clobbered. + +saveClobberedTemps [] _ = return [] -- common case +saveClobberedTemps clobbered dying = do + assig <- getAssigR + let + to_spill = [ (temp,reg) | (temp, InReg reg) <- ufmToList assig, + reg `elem` clobbered, + temp `notElem` map getUnique dying ] + -- in + (instrs,assig') <- clobber assig [] to_spill + setAssigR assig' + return instrs + where + clobber assig instrs [] = return (instrs,assig) + clobber assig instrs ((temp,reg):rest) + = do + --ToDo: copy it to another register if possible + (spill,slot) <- spillR (RealReg reg) temp + clobber (addToUFM assig temp (InBoth reg slot)) (spill:instrs) rest + +clobberRegs :: [RegNo] -> RegM () +clobberRegs [] = return () -- common case +clobberRegs clobbered = do + freeregs <- getFreeRegsR + setFreeRegsR $! foldr allocateReg freeregs clobbered + assig <- getAssigR + setAssigR $! clobber assig (ufmToList assig) + where + -- if the temp was InReg and clobbered, then we will have + -- saved it in saveClobberedTemps above. So the only case + -- we have to worry about here is InBoth. Note that this + -- also catches temps which were loaded up during allocation + -- of read registers, not just those saved in saveClobberedTemps. 
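+        -- Concretely (register and slot numbers hypothetical): a temp
+        -- recorded as InBoth 5 2 whose real register 5 is clobbered is
+        -- downgraded to InMem 2, since only its spill-slot copy survives
+        -- the instruction.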
+ clobber assig [] = assig + clobber assig ((temp, InBoth reg slot) : rest) + | reg `elem` clobbered + = clobber (addToUFM assig temp (InMem slot)) rest + clobber assig (entry:rest) + = clobber assig rest + +-- ----------------------------------------------------------------------------- +-- allocateRegsAndSpill + +-- This function does several things: +-- For each temporary referred to by this instruction, +-- we allocate a real register (spilling another temporary if necessary). +-- We load the temporary up from memory if necessary. +-- We also update the register assignment in the process, and +-- the list of free registers and free stack slots. + +allocateRegsAndSpill + :: Bool -- True <=> reading (load up spilled regs) + -> [Reg] -- don't push these out + -> [Instr] -- spill insns + -> [RegNo] -- real registers allocated (accum.) + -> [Reg] -- temps to allocate + -> RegM ([Instr], [RegNo]) + +allocateRegsAndSpill reading keep spills alloc [] + = return (spills,reverse alloc) + +allocateRegsAndSpill reading keep spills alloc (r:rs) = do + assig <- getAssigR + case lookupUFM assig r of + -- case (1a): already in a register + Just (InReg my_reg) -> + allocateRegsAndSpill reading keep spills (my_reg:alloc) rs + + -- case (1b): already in a register (and memory) + -- NB1. if we're writing this register, update its assignemnt to be + -- InReg, because the memory value is no longer valid. + -- NB2. This is why we must process written registers here, even if they + -- are also read by the same instruction. + Just (InBoth my_reg mem) -> do + when (not reading) (setAssigR (addToUFM assig r (InReg my_reg))) + allocateRegsAndSpill reading keep spills (my_reg:alloc) rs + + -- Not already in a register, so we need to find a free one... + loc -> do + freeregs <- getFreeRegsR + + case getFreeRegs (regClass r) freeregs of + + -- case (2): we have a free register + my_reg:_ -> do + spills' <- do_load reading loc my_reg spills + let new_loc + | Just (InMem slot) <- loc, reading = InBoth my_reg slot + | otherwise = InReg my_reg + setAssigR (addToUFM assig r $! new_loc) + setFreeRegsR (allocateReg my_reg freeregs) + allocateRegsAndSpill reading keep spills' (my_reg:alloc) rs + + -- case (3): we need to push something out to free up a register + [] -> do + let + keep' = map getUnique keep + candidates1 = [ (temp,reg,mem) + | (temp, InBoth reg mem) <- ufmToList assig, + temp `notElem` keep', regClass (RealReg reg) == regClass r ] + candidates2 = [ (temp,reg) + | (temp, InReg reg) <- ufmToList assig, + temp `notElem` keep', regClass (RealReg reg) == regClass r ] + -- in + ASSERT2(not (null candidates1 && null candidates2), + text (show freeregs) <+> ppr r <+> ppr assig) do + + case candidates1 of + + -- we have a temporary that is in both register and mem, + -- just free up its register for use. + -- + (temp,my_reg,slot):_ -> do + spills' <- do_load reading loc my_reg spills + let + assig1 = addToUFM assig temp (InMem slot) + assig2 = addToUFM assig1 r (InReg my_reg) + -- in + setAssigR assig2 + allocateRegsAndSpill reading keep spills' (my_reg:alloc) rs + + -- otherwise, we need to spill a temporary that currently + -- resides in a register. + [] -> do + let + (temp_to_push_out, my_reg) = myHead "regalloc" candidates2 + -- TODO: plenty of room for optimisation in choosing which temp + -- to spill. We just pick the first one that isn't used in + -- the current instruction for now. 
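+                -- For instance (uniques and register numbers hypothetical):
+                -- with candidates2 = [(t17, 5), (t23, 6)] we push t17 out,
+                -- emit a spill store for it, record it as InMem in its
+                -- stack slot, and hand register 5 to the temporary we are
+                -- allocating.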
+ -- in + (spill_insn,slot) <- spillR (RealReg my_reg) temp_to_push_out + let + assig1 = addToUFM assig temp_to_push_out (InMem slot) + assig2 = addToUFM assig1 r (InReg my_reg) + -- in + setAssigR assig2 + spills' <- do_load reading loc my_reg spills + allocateRegsAndSpill reading keep (spill_insn:spills') + (my_reg:alloc) rs + where + -- load up a spilled temporary if we need to + do_load True (Just (InMem slot)) reg spills = do + insn <- loadR (RealReg reg) slot + return (insn : spills) + do_load _ _ _ spills = + return spills + +myHead s [] = panic s +myHead s (x:xs) = x + +-- ----------------------------------------------------------------------------- +-- Joining a jump instruction to its targets + +-- The first time we encounter a jump to a particular basic block, we +-- record the assignment of temporaries. The next time we encounter a +-- jump to the same block, we compare our current assignment to the +-- stored one. They might be different if spilling has occrred in one +-- branch; so some fixup code will be required to match up the +-- assignments. + +joinToTargets + :: BlockMap RegSet + -> [NatBasicBlock] + -> Instr + -> [BlockId] + -> RegM ([NatBasicBlock], Instr) + +joinToTargets block_live new_blocks instr [] + = return (new_blocks, instr) +joinToTargets block_live new_blocks instr (dest:dests) = do + block_assig <- getBlockAssigR + assig <- getAssigR + let + -- adjust the assignment to remove any registers which are not + -- live on entry to the destination block. + adjusted_assig = filterUFM_Directly still_live assig + still_live uniq _ = uniq `elemUniqSet_Directly` live_set + + -- and free up those registers which are now free. + to_free = + [ r | (reg, loc) <- ufmToList assig, + not (elemUniqSet_Directly reg live_set), + r <- regsOfLoc loc ] + + regsOfLoc (InReg r) = [r] + regsOfLoc (InBoth r _) = [r] + regsOfLoc (InMem _) = [] + -- in + case lookupUFM block_assig dest of + -- Nothing <=> this is the first time we jumped to this + -- block. + Nothing -> do + freeregs <- getFreeRegsR + let freeregs' = foldr releaseReg freeregs to_free + stack <- getStackR + setBlockAssigR (addToUFM block_assig dest + (freeregs',stack,adjusted_assig)) + joinToTargets block_live new_blocks instr dests + + Just (freeregs,stack,dest_assig) + | ufmToList dest_assig == ufmToList adjusted_assig + -> -- ok, the assignments match + joinToTargets block_live new_blocks instr dests + | otherwise + -> -- need fixup code + do + delta <- getDeltaR + -- Construct a graph of register/spill movements and + -- untangle it component by component. + -- + -- We cut some corners by + -- a) not handling cyclic components + -- b) not handling memory-to-memory moves. + -- + -- Cyclic components seem to occur only very rarely, + -- and we don't need memory-to-memory moves because we + -- make sure that every temporary always gets its own + -- stack slot. 
+ + let graph = [ (loc0, loc0, + [lookupWithDefaultUFM_Directly + dest_assig + (panic "RegisterAlloc.joinToTargets") + vreg] + ) + | (vreg, loc0) <- ufmToList adjusted_assig ] + sccs = stronglyConnCompR graph + + handleComponent (CyclicSCC [one]) = [] + handleComponent (AcyclicSCC (src,_,[dst])) + = makeMove src dst + handleComponent (CyclicSCC things) + = panic $ "Register Allocator: handleComponent: cyclic" + ++ " (workaround: use -fviaC)" + + makeMove (InReg src) (InReg dst) + = [mkRegRegMoveInstr (RealReg src) (RealReg dst)] + makeMove (InMem src) (InReg dst) + = [mkLoadInstr (RealReg dst) delta src] + makeMove (InReg src) (InMem dst) + = [mkSpillInstr (RealReg src) delta dst] + + makeMove (InBoth src _) (InReg dst) + | src == dst = [] + makeMove (InBoth _ src) (InMem dst) + | src == dst = [] + makeMove (InBoth src _) dst + = makeMove (InReg src) dst + makeMove (InReg src) (InBoth dstR dstM) + | src == dstR + = makeMove (InReg src) (InMem dstM) + | otherwise + = makeMove (InReg src) (InReg dstR) + ++ makeMove (InReg src) (InMem dstM) + + makeMove src dst + = panic $ "makeMove (" ++ show src ++ ") (" + ++ show dst ++ ")" + ++ " (workaround: use -fviaC)" + + block_id <- getUniqueR + let block = BasicBlock (BlockId block_id) $ + concatMap handleComponent sccs ++ mkBranchInstr dest + let instr' = patchJump instr dest (BlockId block_id) + joinToTargets block_live (block : new_blocks) instr' dests + where + live_set = lookItUp "joinToTargets" block_live dest + +-- ----------------------------------------------------------------------------- +-- The register allocator's monad. + +-- Here we keep all the state that the register allocator keeps track +-- of as it walks the instructions in a basic block. + +data RA_State + = RA_State { + ra_blockassig :: BlockAssignment, + -- The current mapping from basic blocks to + -- the register assignments at the beginning of that block. + ra_freeregs :: {-#UNPACK#-}!FreeRegs, -- free machine registers + ra_assig :: RegMap Loc, -- assignment of temps to locations + ra_delta :: Int, -- current stack delta + ra_stack :: FreeStack, -- free stack slots for spilling + ra_us :: UniqSupply -- unique supply for generating names + -- for fixup blocks. 
+ } + +newtype RegM a = RegM { unReg :: RA_State -> (# RA_State, a #) } + +instance Monad RegM where + m >>= k = RegM $ \s -> case unReg m s of { (# s, a #) -> unReg (k a) s } + return a = RegM $ \s -> (# s, a #) + +runR :: BlockAssignment -> FreeRegs -> RegMap Loc -> FreeStack -> UniqSupply + -> RegM a -> (BlockAssignment, a) +runR block_assig freeregs assig stack us thing = + case unReg thing (RA_State{ ra_blockassig=block_assig, ra_freeregs=freeregs, + ra_assig=assig, ra_delta=0{-???-}, ra_stack=stack, + ra_us = us }) of + (# RA_State{ ra_blockassig=block_assig }, returned_thing #) + -> (block_assig, returned_thing) + +spillR :: Reg -> Unique -> RegM (Instr, Int) +spillR reg temp = RegM $ \ s@RA_State{ra_delta=delta, ra_stack=stack} -> + let (stack',slot) = getFreeStackSlotFor stack temp + instr = mkSpillInstr reg delta slot + in + (# s{ra_stack=stack'}, (instr,slot) #) + +loadR :: Reg -> Int -> RegM Instr +loadR reg slot = RegM $ \ s@RA_State{ra_delta=delta, ra_stack=stack} -> + (# s, mkLoadInstr reg delta slot #) + +freeSlotR :: Int -> RegM () +freeSlotR slot = RegM $ \ s@RA_State{ra_stack=stack} -> + (# s{ra_stack=freeStackSlot stack slot}, () #) + +getFreeRegsR :: RegM FreeRegs +getFreeRegsR = RegM $ \ s@RA_State{ra_freeregs = freeregs} -> + (# s, freeregs #) + +setFreeRegsR :: FreeRegs -> RegM () +setFreeRegsR regs = RegM $ \ s -> + (# s{ra_freeregs = regs}, () #) + +getAssigR :: RegM (RegMap Loc) +getAssigR = RegM $ \ s@RA_State{ra_assig = assig} -> + (# s, assig #) + +setAssigR :: RegMap Loc -> RegM () +setAssigR assig = RegM $ \ s -> + (# s{ra_assig=assig}, () #) + +getStackR :: RegM FreeStack +getStackR = RegM $ \ s@RA_State{ra_stack = stack} -> + (# s, stack #) + +setStackR :: FreeStack -> RegM () +setStackR stack = RegM $ \ s -> + (# s{ra_stack=stack}, () #) + +getBlockAssigR :: RegM BlockAssignment +getBlockAssigR = RegM $ \ s@RA_State{ra_blockassig = assig} -> + (# s, assig #) + +setBlockAssigR :: BlockAssignment -> RegM () +setBlockAssigR assig = RegM $ \ s -> + (# s{ra_blockassig = assig}, () #) + +setDeltaR :: Int -> RegM () +setDeltaR n = RegM $ \ s -> + (# s{ra_delta = n}, () #) + +getDeltaR :: RegM Int +getDeltaR = RegM $ \s -> (# s, ra_delta s #) + +getUniqueR :: RegM Unique +getUniqueR = RegM $ \s -> + case splitUniqSupply (ra_us s) of + (us1, us2) -> (# s{ra_us = us2}, uniqFromSupply us1 #) + +-- ----------------------------------------------------------------------------- +-- Utils + +#ifdef DEBUG +my_fromJust s p Nothing = pprPanic ("fromJust: " ++ s) p +my_fromJust s p (Just x) = x +#else +my_fromJust _ _ = fromJust +#endif + +lookItUp :: Uniquable b => String -> UniqFM a -> b -> a +lookItUp str fm x = my_fromJust str (ppr (getUnique x)) (lookupUFM fm x) diff --git a/compiler/ndpFlatten/FlattenInfo.hs b/compiler/ndpFlatten/FlattenInfo.hs new file mode 100644 index 0000000000..f759242455 --- /dev/null +++ b/compiler/ndpFlatten/FlattenInfo.hs @@ -0,0 +1,43 @@ +-- $Id$ +-- +-- Copyright (c) 2002 Manuel M T Chakravarty & Gabriele Keller +-- +-- Information for modules outside of the flattening module collection. +-- +--- DESCRIPTION --------------------------------------------------------------- +-- +-- This module contains information that is needed, and thus imported, by +-- modules that are otherwise independent of flattening and may in fact be +-- directly or indirectly imported by some of the flattening-related +-- modules. This is to avoid cyclic module dependencies. 
+-- +--- DOCU ---------------------------------------------------------------------- +-- +-- Language: Haskell 98 +-- +--- TODO ---------------------------------------------------------------------- +-- + +module FlattenInfo ( + namesNeededForFlattening +) where + +import StaticFlags (opt_Flatten) +import NameSet (FreeVars, emptyFVs, mkFVs) +import PrelNames (fstName, andName, orName, lengthPName, replicatePName, + mapPName, bpermutePName, bpermuteDftPName, indexOfPName) + + +-- this is a list of names that need to be available if flattening is +-- performed (EXPORTED) +-- +-- * needs to be kept in sync with the names used in Core generation in +-- `FlattenMonad' and `NDPCoreUtils' +-- +namesNeededForFlattening :: FreeVars +namesNeededForFlattening + | not opt_Flatten = emptyFVs -- none without -fflatten + | otherwise + = mkFVs [fstName, andName, orName, lengthPName, replicatePName, mapPName, + bpermutePName, bpermuteDftPName, indexOfPName] + -- stuff from PrelGHC doesn't have to go here diff --git a/compiler/ndpFlatten/FlattenMonad.hs b/compiler/ndpFlatten/FlattenMonad.hs new file mode 100644 index 0000000000..45405088fc --- /dev/null +++ b/compiler/ndpFlatten/FlattenMonad.hs @@ -0,0 +1,451 @@ +-- $Id$ +-- +-- Copyright (c) [2001..2002] Manuel M T Chakravarty & Gabriele Keller +-- +-- Monad maintaining parallel contexts and substitutions for flattening. +-- +--- DESCRIPTION --------------------------------------------------------------- +-- +-- The flattening transformation needs to perform a fair amount of plumbing. +-- It needs to mainatin a set of variables, called the parallel context for +-- lifting, variable substitutions in case alternatives, and so on. +-- Moreover, we need to manage uniques to create new variables. The monad +-- defined in this module takes care of maintaining this state. +-- +--- DOCU ---------------------------------------------------------------------- +-- +-- Language: Haskell 98 +-- +-- * a parallel context is a set of variables that get vectorised during a +-- lifting transformations (ie, their type changes from `t' to `[:t:]') +-- +-- * all vectorised variables in a parallel context have the same size; we +-- call this also the size of the parallel context +-- +-- * we represent contexts by maps that give the lifted version of a variable +-- (remember that in GHC, variables contain type information that changes +-- during lifting) +-- +--- TODO ---------------------------------------------------------------------- +-- +-- * Assumptions currently made that should (if they turn out to be true) be +-- documented in The Commentary: +-- +-- - Local bindings can be copied without any need to alpha-rename bound +-- variables (or their uniques). Such renaming is only necessary when +-- bindings in a recursive group are replicated; implying that this is +-- required in the case of top-level bindings). (Note: The CoreTidy path +-- generates global uniques before code generation.) +-- +-- * One FIXME left to resolve. 
+-- + +module FlattenMonad ( + + -- monad definition + -- + Flatten, runFlatten, + + -- variable generation + -- + newVar, mkBind, + + -- context management & query operations + -- + extendContext, packContext, liftVar, liftConst, intersectWithContext, + + -- construction of prelude functions + -- + mk'fst, mk'eq, mk'neq, mk'and, mk'or, mk'lengthP, mk'replicateP, mk'mapP, + mk'bpermuteP, mk'bpermuteDftP, mk'indexOfP +) where + +-- standard +import Monad (mplus) + +-- GHC +import Panic (panic) +import Outputable (Outputable(ppr), pprPanic) +import UniqSupply (UniqSupply, splitUniqSupply, uniqFromSupply) +import Var (Var, idType) +import Id (Id, mkSysLocal) +import Name (Name) +import VarSet (VarSet, emptyVarSet, extendVarSet, varSetElems ) +import VarEnv (VarEnv, emptyVarEnv, zipVarEnv, plusVarEnv, + elemVarEnv, lookupVarEnv, lookupVarEnv_NF, delVarEnvList) +import Type (Type, tyConAppTyCon) +import HscTypes (HomePackageTable, + ExternalPackageState(eps_PTE), HscEnv(hsc_HPT), + TyThing(..), lookupType) +import PrelNames ( fstName, andName, orName, + lengthPName, replicatePName, mapPName, bpermutePName, + bpermuteDftPName, indexOfPName) +import TysPrim ( charPrimTyCon, intPrimTyCon, floatPrimTyCon, doublePrimTyCon ) +import PrimOp ( PrimOp(..) ) +import PrelInfo ( primOpId ) +import CoreSyn (Expr(..), Bind(..), CoreBndr, CoreExpr, CoreBind, mkApps) +import CoreUtils (exprType) +import FastString (FastString) + +-- friends +import NDPCoreUtils (parrElemTy) + + +-- definition of the monad +-- ----------------------- + +-- state maintained by the flattening monad +-- +data FlattenState = FlattenState { + + -- our source for uniques + -- + us :: UniqSupply, + + -- environment containing all known names (including all + -- Prelude functions) + -- + env :: Name -> Id, + + -- this variable determines the parallel context; if + -- `Nothing', we are in pure vectorisation mode, no + -- lifting going on + -- + ctxtVar :: Maybe Var, + + -- environment that maps each variable that is + -- vectorised in the current parallel context to the + -- vectorised version of that variable + -- + ctxtEnv :: VarEnv Var, + + -- those variables from the *domain* of `ctxtEnv' that + -- have been used since the last context restriction (cf. 
+ -- `restrictContext') + -- + usedVars :: VarSet + } + +-- initial value of the flattening state +-- +initialFlattenState :: ExternalPackageState + -> HomePackageTable + -> UniqSupply + -> FlattenState +initialFlattenState eps hpt us = + FlattenState { + us = us, + env = lookup, + ctxtVar = Nothing, + ctxtEnv = emptyVarEnv, + usedVars = emptyVarSet + } + where + lookup n = + case lookupType hpt (eps_PTE eps) n of + Just (AnId v) -> v + _ -> pprPanic "FlattenMonad: unknown name:" (ppr n) + +-- the monad representation (EXPORTED ABSTRACTLY) +-- +newtype Flatten a = Flatten { + unFlatten :: (FlattenState -> (a, FlattenState)) + } + +instance Monad Flatten where + return x = Flatten $ \s -> (x, s) + m >>= n = Flatten $ \s -> let + (r, s') = unFlatten m s + in + unFlatten (n r) s' + +-- execute the given flattening computation (EXPORTED) +-- +runFlatten :: HscEnv + -> ExternalPackageState + -> UniqSupply + -> Flatten a + -> a +runFlatten hsc_env eps us m + = fst $ unFlatten m (initialFlattenState eps (hsc_HPT hsc_env) us) + + +-- variable generation +-- ------------------- + +-- generate a new local variable whose name is based on the given lexeme and +-- whose type is as specified in the second argument (EXPORTED) +-- +newVar :: FastString -> Type -> Flatten Var +newVar lexeme ty = Flatten $ \state -> + let + (us1, us2) = splitUniqSupply (us state) + state' = state {us = us2} + in + (mkSysLocal lexeme (uniqFromSupply us1) ty, state') + +-- generate a non-recursive binding using a new binder whose name is derived +-- from the given lexeme (EXPORTED) +-- +mkBind :: FastString -> CoreExpr -> Flatten (CoreBndr, CoreBind) +mkBind lexeme e = + do + v <- newVar lexeme (exprType e) + return (v, NonRec v e) + + +-- context management +-- ------------------ + +-- extend the parallel context by the given set of variables (EXPORTED) +-- +-- * if there is no parallel context at the moment, the first element of the +-- variable list will be used to determine the new parallel context +-- +-- * the second argument is executed in the current context extended with the +-- given variables +-- +-- * the variables must already have been lifted by transforming their type, +-- but they *must* have retained their original name (or, at least, their +-- unique); this is needed so that they match the original variable in +-- variable environments +-- +-- * any trace of the given set of variables has to be removed from the state +-- at the end of this operation +-- +extendContext :: [Var] -> Flatten a -> Flatten a +extendContext [] m = m +extendContext vs m = Flatten $ \state -> + let + extState = state { + ctxtVar = ctxtVar state `mplus` Just (head vs), + ctxtEnv = ctxtEnv state `plusVarEnv` zipVarEnv vs vs + } + (r, extState') = unFlatten m extState + resState = extState' { -- remove `vs' from the result state + ctxtVar = ctxtVar state, + ctxtEnv = ctxtEnv state, + usedVars = usedVars extState' `delVarEnvList` vs + } + in + (r, resState) + +-- execute the second argument in a restricted context (EXPORTED) +-- +-- * all variables in the current parallel context are packed according to +-- the permutation vector associated with the variable passed as the first +-- argument (ie, all elements of vectorised context variables that are +-- invalid in the restricted context are dropped) +-- +-- * the returned list of core binders contains the operations that perform +-- the restriction on all variables in the parallel context that *do* occur +-- during the execution of the second argument (ie, `liftVar' is executed at +-- 
least once on any such variable) +-- +packContext :: Var -> Flatten a -> Flatten (a, [CoreBind]) +packContext perm m = Flatten $ \state -> + let + -- FIXME: To set the packed environment to the unpacked on is a hack of + -- which I am not sure yet (a) whether it works and (b) whether it's + -- really worth it. The one advantages is that, we can use a var set, + -- after all, instead of a var environment. + -- + -- The idea is the following: If we have to pack a variable `x', we + -- generate `let{-NonRec-} x = bpermuteP perm x in ...'. As this is a + -- non-recursive binding, the lhs `x' overshadows the rhs `x' in the + -- body of the let. + -- + -- NB: If we leave it like this, `mkCoreBind' can be simplified. + packedCtxtEnv = ctxtEnv state + packedState = state { + ctxtVar = fmap + (lookupVarEnv_NF packedCtxtEnv) + (ctxtVar state), + ctxtEnv = packedCtxtEnv, + usedVars = emptyVarSet + } + (r, packedState') = unFlatten m packedState + resState = state { -- revert to the unpacked context + ctxtVar = ctxtVar state, + ctxtEnv = ctxtEnv state + } + bndrs = map mkCoreBind . varSetElems . usedVars $ packedState' + + -- generate a binding for the packed variant of a context variable + -- + mkCoreBind var = let + rhs = fst $ unFlatten (mk'bpermuteP (idType var) + (Var perm) + (Var var) + ) state + in + NonRec (lookupVarEnv_NF packedCtxtEnv var) $ rhs + + in + ((r, bndrs), resState) + +-- lift a single variable in the current context (EXPORTED) +-- +-- * if the variable does not occur in the context, it's value is vectorised to +-- match the size of the current context +-- +-- * otherwise, the variable is replaced by whatever the context environment +-- maps it to (this may either be simply the lifted version of the original +-- variable or a packed variant of that variable) +-- +-- * the monad keeps track of all lifted variables that occur in the parallel +-- context, so that `packContext' can determine the correct set of core +-- bindings +-- +liftVar :: Var -> Flatten CoreExpr +liftVar var = Flatten $ \s -> + let + v = ctxtVarErr s + v'elemType = parrElemTy . idType $ v + len = fst $ unFlatten (mk'lengthP v'elemType (Var v)) s + replicated = fst $ unFlatten (mk'replicateP (idType var) len (Var var)) s + in case lookupVarEnv (ctxtEnv s) var of + Just liftedVar -> (Var liftedVar, + s {usedVars = usedVars s `extendVarSet` var}) + Nothing -> (replicated, s) + +-- lift a constant expression in the current context (EXPORTED) +-- +-- * the value of the constant expression is vectorised to match the current +-- parallel context +-- +liftConst :: CoreExpr -> Flatten CoreExpr +liftConst e = Flatten $ \s -> + let + v = ctxtVarErr s + v'elemType = parrElemTy . 
idType $ v
+ len = fst $ unFlatten (mk'lengthP v'elemType (Var v)) s
+ in
+ (fst $ unFlatten (mk'replicateP (exprType e) len e ) s, s)
+
+-- pick those variables of the given set that occur (albeit in lifted form)
+-- in the current parallel context (EXPORTED)
+--
+-- * the variables returned are from the given set and *not* the corresponding
+-- context variables
+--
+intersectWithContext :: VarSet -> Flatten [Var]
+intersectWithContext vs = Flatten $ \s ->
+ let
+ vs' = filter (`elemVarEnv` ctxtEnv s) (varSetElems vs)
+ in
+ (vs', s)
+
+
+-- construct applications of prelude functions
+-- -------------------------------------------
+
+-- NB: keep all the used names listed in `FlattenInfo.namesNeededForFlattening'
+
+-- generate an application of `fst' (EXPORTED)
+--
+mk'fst :: Type -> Type -> CoreExpr -> Flatten CoreExpr
+mk'fst ty1 ty2 a = mkFunApp fstName [Type ty1, Type ty2, a]
+
+-- generate an application of `&&' (EXPORTED)
+--
+mk'and :: CoreExpr -> CoreExpr -> Flatten CoreExpr
+mk'and a1 a2 = mkFunApp andName [a1, a2]
+
+-- generate an application of `||' (EXPORTED)
+--
+mk'or :: CoreExpr -> CoreExpr -> Flatten CoreExpr
+mk'or a1 a2 = mkFunApp orName [a1, a2]
+
+-- generate an application of `==' where the arguments may only be literals
+-- that may occur in a Core case expression (i.e., `Char', `Int', `Float', and
+-- `Double') (EXPORTED)
+--
+mk'eq :: Type -> CoreExpr -> CoreExpr -> Flatten CoreExpr
+mk'eq ty a1 a2 = return (mkApps (Var eqName) [a1, a2])
+ where
+ tc = tyConAppTyCon ty
+ --
+ eqName | tc == charPrimTyCon = primOpId CharEqOp
+ | tc == intPrimTyCon = primOpId IntEqOp
+ | tc == floatPrimTyCon = primOpId FloatEqOp
+ | tc == doublePrimTyCon = primOpId DoubleEqOp
+ | otherwise =
+ pprPanic "FlattenMonad.mk'eq: " (ppr ty)
+
+-- generate an application of `/=' where the arguments may only be literals
+-- that may occur in a Core case expression (i.e., `Char', `Int', `Float', and
+-- `Double') (EXPORTED)
+--
+mk'neq :: Type -> CoreExpr -> CoreExpr -> Flatten CoreExpr
+mk'neq ty a1 a2 = return (mkApps (Var neqName) [a1, a2])
+ where
+ tc = tyConAppTyCon ty
+ --
+ neqName {- | name == charPrimTyConName = neqCharName -}
+ | tc == intPrimTyCon = primOpId IntNeOp
+ {- | name == floatPrimTyConName = neqFloatName -}
+ {- | name == doublePrimTyConName = neqDoubleName -}
+ | otherwise =
+ pprPanic "FlattenMonad.mk'neq: " (ppr ty)
+
+-- generate an application of `lengthP' (EXPORTED)
+--
+mk'lengthP :: Type -> CoreExpr -> Flatten CoreExpr
+mk'lengthP ty a = mkFunApp lengthPName [Type ty, a]
+
+-- generate an application of `replicateP' (EXPORTED)
+--
+mk'replicateP :: Type -> CoreExpr -> CoreExpr -> Flatten CoreExpr
+mk'replicateP ty a1 a2 = mkFunApp replicatePName [Type ty, a1, a2]
+
+-- generate an application of `mapP' (EXPORTED)
+--
+mk'mapP :: Type -> Type -> CoreExpr -> CoreExpr -> Flatten CoreExpr
+mk'mapP ty1 ty2 a1 a2 = mkFunApp mapPName [Type ty1, Type ty2, a1, a2]
+
+-- generate an application of `bpermuteP' (EXPORTED)
+--
+mk'bpermuteP :: Type -> CoreExpr -> CoreExpr -> Flatten CoreExpr
+mk'bpermuteP ty a1 a2 = mkFunApp bpermutePName [Type ty, a1, a2]
+
+-- generate an application of `bpermuteDftP' (EXPORTED)
+--
+mk'bpermuteDftP :: Type -> CoreExpr -> CoreExpr -> CoreExpr -> Flatten CoreExpr
+mk'bpermuteDftP ty a1 a2 a3 = mkFunApp bpermuteDftPName [Type ty, a1, a2, a3]
+
+-- generate an application of `indexOfP' (EXPORTED)
+--
+mk'indexOfP :: Type -> CoreExpr -> CoreExpr -> Flatten CoreExpr
+mk'indexOfP ty a1 a2 = mkFunApp indexOfPName [Type ty, a1, a2]
+
+
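As a quick illustration of how these builders are meant to be chained (a sketch
only, not part of the patch; `mkLitIndex' is a hypothetical name), the index
expression `indexOfP (\x -> x == lit) b' needed by the case-lifting code in
Flattening.hs can be assembled inside the Flatten monad roughly like this:

    mkLitIndex :: Type -> CoreBndr -> Literal -> Flatten CoreExpr
    mkLitIndex ty b lit = do
      eq <- mk'eq ty (Var b) (Lit lit)      -- the comparison  x == lit
      mk'indexOfP ty (Lam b eq) (Var b)     -- indexOfP (\x -> x == lit) b

Flattening.mkIndexOfExpr further below follows exactly this pattern.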
+-- auxilliary functions +-- -------------------- + +-- obtain the context variable, aborting if it is not available (as this +-- signals an internal error in the usage of the `Flatten' monad) +-- +ctxtVarErr :: FlattenState -> Var +ctxtVarErr s = case ctxtVar s of + Nothing -> panic "FlattenMonad.ctxtVarErr: No context variable available!" + Just v -> v + +-- given the name of a known function and a set of arguments (needs to include +-- all needed type arguments), build a Core expression that applies the named +-- function to those arguments +-- +mkFunApp :: Name -> [CoreExpr] -> Flatten CoreExpr +mkFunApp name args = + do + fun <- lookupName name + return $ mkApps (Var fun) args + +-- get the `Id' of a known `Name' +-- +-- * this can be the `Name' of any function that's visible on the toplevel of +-- the current compilation unit +-- +lookupName :: Name -> Flatten Id +lookupName name = Flatten $ \s -> + (env s name, s) diff --git a/compiler/ndpFlatten/Flattening.hs b/compiler/ndpFlatten/Flattening.hs new file mode 100644 index 0000000000..18daaa6323 --- /dev/null +++ b/compiler/ndpFlatten/Flattening.hs @@ -0,0 +1,808 @@ +-- $Id$ +-- +-- Copyright (c) [2001..2002] Manuel M T Chakravarty & Gabriele Keller +-- +-- Vectorisation and lifting +-- +--- DESCRIPTION --------------------------------------------------------------- +-- +-- This module implements the vectorisation and function lifting +-- transformations of the flattening transformation. +-- +--- DOCU ---------------------------------------------------------------------- +-- +-- Language: Haskell 98 with C preprocessor +-- +-- Types: +-- the transformation on types has five purposes: +-- +-- 1) for each type definition, derive the lifted version of this type +-- liftTypeef +-- 2) change the type annotations of functions & variables acc. to rep. +-- flattenType +-- 3) derive the type of a lifted function +-- liftType +-- 4) sumtypes: +-- this is the most fuzzy and complicated part. For each lifted +-- sumtype we need to generate function to access and combine the +-- component arrays +-- +-- NOTE: the type information of variables and data constructors is *not* +-- changed to reflect it's representation. This has to be solved +-- somehow (???, FIXME) using type indexed types +-- +-- Vectorisation: +-- is very naive at the moment. One of the most striking inefficiencies is +-- application vect (app e1 e2) -> app (fst (vect e1) (vect e2)) if e1 is a +-- lambda abstraction. The vectorisation produces a pair consisting of the +-- original and the lifted function, but the lifted version is discarded. +-- I'm also not sure how much of this would be thrown out by the simplifier +-- eventually +-- +-- *) vectorise +-- +-- Conventions: +-- +--- TODO ---------------------------------------------------------------------- +-- +-- * look closer into the definition of type definition (TypeThing or so) +-- + +module Flattening ( + flatten, flattenExpr, +) where + +#include "HsVersions.h" + +-- friends +import NDPCoreUtils (tupleTyArgs, funTyArgs, parrElemTy, isDefault, + isLit, mkPArrTy, mkTuple, isSimpleExpr, substIdEnv) +import FlattenMonad (Flatten, runFlatten, mkBind, extendContext, packContext, + liftVar, liftConst, intersectWithContext, mk'fst, + mk'lengthP, mk'replicateP, mk'mapP, mk'bpermuteDftP, + mk'indexOfP,mk'eq,mk'neq) + +-- GHC +import TcType ( tcIsForAllTy, tcView ) +import TypeRep ( Type(..) 
) +import StaticFlags (opt_Flatten) +import Panic (panic) +import ErrUtils (dumpIfSet_dyn) +import UniqSupply (mkSplitUniqSupply) +import DynFlags (DynFlag(..)) +import Literal (Literal, literalType) +import Var (Var(..), idType, isTyVar) +import Id (setIdType) +import DataCon (DataCon, dataConTag) +import HscTypes ( ModGuts(..), ModGuts, HscEnv(..), hscEPS ) +import CoreFVs (exprFreeVars) +import CoreSyn (Expr(..), Bind(..), Alt(..), AltCon(..), Note(..), + CoreBndr, CoreExpr, CoreBind, mkLams, mkLets, + mkApps, mkIntLitInt) +import PprCore (pprCoreExpr) +import CoreLint (showPass, endPass) + +import CoreUtils (exprType, applyTypeToArg, mkPiType) +import VarEnv (zipVarEnv) +import TysWiredIn (mkTupleTy) +import BasicTypes (Boxity(..)) +import Outputable +import FastString + + +-- FIXME: fro debugging - remove this +import TRACE (trace) + +-- standard +import Monad (liftM, foldM) + +-- toplevel transformation +-- ----------------------- + +-- entry point to the flattening transformation for the compiler driver when +-- compiling a complete module (EXPORTED) +-- +flatten :: HscEnv + -> ModGuts + -> IO ModGuts +flatten hsc_env mod_impl@(ModGuts {mg_binds = binds}) + | not opt_Flatten = return mod_impl -- skip without -fflatten + | otherwise = + do + let dflags = hsc_dflags hsc_env + + eps <- hscEPS hsc_env + us <- mkSplitUniqSupply 'l' -- 'l' as in fLattening + -- + -- announce vectorisation + -- + showPass dflags "Flattening [first phase: vectorisation]" + -- + -- vectorise all toplevel bindings + -- + let binds' = runFlatten hsc_env eps us $ vectoriseTopLevelBinds binds + -- + -- and dump the result if requested + -- + endPass dflags "Flattening [first phase: vectorisation]" + Opt_D_dump_vect binds' + return $ mod_impl {mg_binds = binds'} + +-- entry point to the flattening transformation for the compiler driver when +-- compiling a single expression in interactive mode (EXPORTED) +-- +flattenExpr :: HscEnv + -> CoreExpr -- the expression to be flattened + -> IO CoreExpr +flattenExpr hsc_env expr + | not opt_Flatten = return expr -- skip without -fflatten + | otherwise = + do + let dflags = hsc_dflags hsc_env + eps <- hscEPS hsc_env + + us <- mkSplitUniqSupply 'l' -- 'l' as in fLattening + -- + -- announce vectorisation + -- + showPass dflags "Flattening [first phase: vectorisation]" + -- + -- vectorise the expression + -- + let expr' = fst . 
runFlatten hsc_env eps us $ vectorise expr + -- + -- and dump the result if requested + -- + dumpIfSet_dyn dflags Opt_D_dump_vect "Vectorised expression" + (pprCoreExpr expr') + return expr' + + +-- vectorisation of bindings and expressions +-- ----------------------------------------- + + +vectoriseTopLevelBinds:: [CoreBind] -> Flatten [CoreBind] +vectoriseTopLevelBinds binds = + do + vbinds <- mapM vectoriseBind binds + return (adjustTypeBinds vbinds) + +adjustTypeBinds:: [CoreBind] -> [CoreBind] +adjustTypeBinds vbinds = + let + ids = concat (map extIds vbinds) + idEnv = zipVarEnv ids ids + in map (substIdEnvBind idEnv) vbinds + where + -- FIXME replace by 'bindersOf' + extIds (NonRec b expr) = [b] + extIds (Rec bnds) = map fst bnds + substIdEnvBind idEnv (NonRec b expr) = NonRec b (substIdEnv idEnv expr) + substIdEnvBind idEnv (Rec bnds) + = Rec (map (\ (b,e) -> (b, (substIdEnv idEnv e))) bnds) + +-- vectorise a single core binder +-- +vectoriseBind :: CoreBind -> Flatten CoreBind +vectoriseBind (NonRec b expr) = + liftM (NonRec b) $ liftM fst $ vectorise expr +vectoriseBind (Rec bindings) = + liftM Rec $ mapM vectoriseOne bindings + where + vectoriseOne (b, expr) = + do + (vexpr, ty) <- vectorise expr + return (setIdType b ty, vexpr) + + +-- Searches for function definitions and creates a lifted version for +-- each function. +-- We have only two interesting cases: +-- 1) function application (ex1) (ex2) +-- vectorise both subexpressions. The function will end up becoming a +-- pair (orig. fun, lifted fun), choose first component (in many cases, +-- this is pretty inefficient, since the lifted version is generated +-- although it is clear that it won't be used +-- +-- 2) lambda abstraction +-- any function has to exist in two forms: it's original form and it's +-- lifted form. Therefore, every lambda abstraction is transformed into +-- a pair of functions: the original function and its lifted variant +-- +-- +-- FIXME: currently, I use 'exprType' all over the place - this is terribly +-- inefficient. It should be suffiecient to change 'vectorise' and 'lift' to +-- return the type of the result expression as well. +-- +vectorise:: CoreExpr -> Flatten (CoreExpr, Type) +vectorise (Var id) = + do + let varTy = idType id + let vecTy = vectoriseTy varTy + return (Var (setIdType id vecTy), vecTy) + +vectorise (Lit lit) = + return ((Lit lit), literalType lit) + + +vectorise e@(App expr t@(Type _)) = + do + (vexpr, vexprTy) <- vectorise expr + return ((App vexpr t), applyTypeToArg vexprTy t) + +vectorise (App (Lam b expr) arg) = + do + (varg, argTy) <- vectorise arg + (vexpr, vexprTy) <- vectorise expr + let vb = setIdType b argTy + return ((App (Lam vb vexpr) varg), + applyTypeToArg (mkPiType vb vexprTy) varg) + +-- if vexpr expects a type as first argument +-- application stays just as it is +-- +vectorise (App expr arg) = + do + (vexpr, vexprTy) <- vectorise expr + (varg, vargTy) <- vectorise arg + + if (tcIsForAllTy vexprTy) + then do + let resTy = applyTypeToArg vexprTy varg + return (App vexpr varg, resTy) + else do + let [t1, t2] = tupleTyArgs vexprTy + vexpr' <- mk'fst t1 t2 vexpr + let resTy = applyTypeToArg t1 varg + return ((App vexpr' varg), resTy) -- apply the first component of + -- the vectorized function + +vectorise e@(Lam b expr) + | isTyVar b + = do + (vexpr, vexprTy) <- vectorise expr -- don't vectorise 'b'! 
+ return ((Lam b vexpr), mkPiType b vexprTy) + | otherwise = + do + (vexpr, vexprTy) <- vectorise expr + let vb = setIdType b (vectoriseTy (idType b)) + let ve = Lam vb vexpr + (lexpr, lexprTy) <- lift e + let veTy = mkPiType vb vexprTy + return $ (mkTuple [veTy, lexprTy] [ve, lexpr], + mkTupleTy Boxed 2 [veTy, lexprTy]) + +vectorise (Let bind body) = + do + vbind <- vectoriseBind bind + (vbody, vbodyTy) <- vectorise body + return ((Let vbind vbody), vbodyTy) + +vectorise (Case expr b ty alts) = + do + (vexpr, vexprTy) <- vectorise expr + valts <- mapM vectorise' alts + let res_ty = snd (head valts) + return (Case vexpr (setIdType b vexprTy) res_ty (map fst valts), res_ty) + where vectorise' (con, bs, expr) = + do + (vexpr, vexprTy) <- vectorise expr + return ((con, bs, vexpr), vexprTy) -- FIXME: change type of con + -- and bs + + + +vectorise (Note note expr) = + do + (vexpr, vexprTy) <- vectorise expr -- FIXME: is this ok or does it + return ((Note note vexpr), vexprTy) -- change the validity of note? + +vectorise e@(Type t) = + return (e, t) -- FIXME: panic instead of 't'??? + + +{- +myShowTy (TyVarTy _) = "TyVar " +myShowTy (AppTy t1 t2) = + "AppTy (" ++ (myShowTy t1) ++ ", " ++ (myShowTy t2) ++ ")" +myShowTy (TyConApp _ t) = + "TyConApp TC (" ++ (myShowTy t) ++ ")" +-} + +vectoriseTy :: Type -> Type +vectoriseTy ty | Just ty' <- tcView ty = vectoriseTy ty' + -- Look through notes and synonyms + -- NB: This will discard notes and synonyms, of course + -- ToDo: retain somehow? +vectoriseTy t@(TyVarTy v) = t +vectoriseTy t@(AppTy t1 t2) = + AppTy (vectoriseTy t1) (vectoriseTy t2) +vectoriseTy t@(TyConApp tc ts) = + TyConApp tc (map vectoriseTy ts) +vectoriseTy t@(FunTy t1 t2) = + mkTupleTy Boxed 2 [(FunTy (vectoriseTy t1) (vectoriseTy t2)), + (liftTy t)] +vectoriseTy t@(ForAllTy v ty) = + ForAllTy v (vectoriseTy ty) +vectoriseTy t = t + + +-- liftTy: wrap the type in an array but be careful with function types +-- on the *top level* (is this sufficient???) + +liftTy:: Type -> Type +liftTy ty | Just ty' <- tcView ty = liftTy ty' +liftTy (FunTy t1 t2) = FunTy (liftTy t1) (liftTy t2) +liftTy (ForAllTy tv t) = ForAllTy tv (liftTy t) +liftTy t = mkPArrTy t + + +-- lifting: +-- ---------- +-- * liftType +-- * lift + + +-- liftBinderType: Converts a type 'a' stored in the binder to the +-- representation of '[:a:]' will therefore call liftType +-- +-- lift type, don't change name (incl unique) nor IdInfo. IdInfo looks ok, +-- but I'm not entirely sure about some fields (e.g., strictness info) +liftBinderType:: CoreBndr -> Flatten CoreBndr +liftBinderType bndr = return $ setIdType bndr (liftTy (idType bndr)) + +-- lift: lifts an expression (a -> [:a:]) +-- If the expression is a simple expression, it is treated like a constant +-- expression. +-- If the body of a lambda expression is a simple expression, it is +-- transformed into a mapP +lift:: CoreExpr -> Flatten (CoreExpr, Type) +lift cExpr@(Var id) = + do + lVar@(Var lId) <- liftVar id + return (lVar, idType lId) + +lift cExpr@(Lit lit) = + do + lLit <- liftConst cExpr + return (lLit, exprType lLit) + + +lift (Lam b expr) + | isSimpleExpr expr = liftSimpleFun b expr + | isTyVar b = + do + (lexpr, lexprTy) <- lift expr -- don't lift b! 
+ return (Lam b lexpr, mkPiType b lexprTy) + | otherwise = + do + lb <- liftBinderType b + (lexpr, lexprTy) <- extendContext [lb] (lift expr) + return ((Lam lb lexpr) , mkPiType lb lexprTy) + +lift (App expr1 expr2) = + do + (lexpr1, lexpr1Ty) <- lift expr1 + (lexpr2, _) <- lift expr2 + return ((App lexpr1 lexpr2), applyTypeToArg lexpr1Ty lexpr2) + + +lift (Let (NonRec b expr1) expr2) + |isSimpleExpr expr2 = + do + (lexpr1, _) <- lift expr1 + (lexpr2, lexpr2Ty) <- liftSimpleFun b expr2 + let (t1, t2) = funTyArgs lexpr2Ty + liftM (\x -> (x, liftTy t2)) $ mk'mapP t1 t2 lexpr2 lexpr1 + + | otherwise = + do + (lexpr1, _) <- lift expr1 + lb <- liftBinderType b + (lexpr2, lexpr2Ty) <- extendContext [lb] (lift expr1) + return ((Let (NonRec lb lexpr1) lexpr2), lexpr2Ty) + +lift (Let (Rec binds) expr2) = + do + let (bndVars, exprs) = unzip binds + lBndVars <- mapM liftBinderType bndVars + lexprs <- extendContext bndVars (mapM lift exprs) + (lexpr2, lexpr2Ty) <- extendContext bndVars (lift expr2) + return ((Let (Rec (zip lBndVars (map fst lexprs))) lexpr2), lexpr2Ty) + +-- FIXME: +-- Assumption: alternatives can either be literals or data construtors. +-- Due to type restrictions, I don't think it is possible +-- that they are mixed. +-- The handling of literals and data constructors is completely +-- different +-- +-- +-- let b = expr in alts +-- +-- I think I read somewhere that the default case (if present) is stored +-- in the head of the list. Assume for now this is true, have to check +-- +-- (1) literals +-- (2) data constructors +-- +-- FIXME: optimisation: first, filter out all simple expression and +-- loop (mapP & filter) over all the corresponding values in a single +-- traversal: + +-- (1) splitAlts:: [Alt CoreBndr] -> ([Alt CoreBndr],[Alt CoreBndr]) +-- simple alts reg alts +-- (2) if simpleAlts = [] then (just as before) +-- if regAlts = [] then (the whole thing is just a loop) +-- otherwise (a) compute index vector for simpleAlts (for def permute +-- later on +-- (b) +-- gaw 2004 FIX? 
+lift cExpr@(Case expr b _ alts) = + do + (lExpr, _) <- lift expr + lb <- liftBinderType b -- lift alt-expression + lalts <- if isLit alts + then extendContext [lb] (liftCaseLit b alts) + else extendContext [lb] (liftCaseDataCon b alts) + letWrapper lExpr b lalts + +lift (Note (Coerce t1 t2) expr) = + do + (lexpr, t) <- lift expr + let lt1 = liftTy t1 + return ((Note (Coerce lt1 (liftTy t2)) lexpr), lt1) + +lift (Note note expr) = + do + (lexpr, t) <- lift expr + return ((Note note lexpr), t) + +lift e@(Type t) = return (e, t) + + +-- auxilliary functions for lifting of case statements +-- + +liftCaseDataCon:: CoreBndr -> [Alt CoreBndr] -> + Flatten (([CoreBind], [CoreBind], [CoreBind])) +liftCaseDataCon b [] = + return ([], [], []) +liftCaseDataCon b alls@(alt:alts) + | isDefault alt = + do + (i, e, defAltBndrs) <- liftCaseDataConDefault b alt alts + (is, es, altBndrs) <- liftCaseDataCon' b alts + return (i:is, e:es, defAltBndrs ++ altBndrs) + | otherwise = + liftCaseDataCon' b alls + +liftCaseDataCon':: CoreBndr -> [Alt CoreBndr] -> + Flatten ([CoreBind], [CoreBind], [CoreBind]) +liftCaseDataCon' _ [] = + do + return ([], [], []) + + +liftCaseDataCon' b ((DataAlt dcon, bnds, expr): alts) = + do + (permBnd, exprBnd, packBnd) <- liftSingleDataCon b dcon bnds expr + (permBnds, exprBnds, packBnds) <- liftCaseDataCon' b alts + return (permBnd:permBnds, exprBnd:exprBnds, packBnd ++ packBnds) + + +-- FIXME: is is really necessary to return the binding to the permutation +-- array in the data constructor case, as the representation already +-- contains the extended flag vector +liftSingleDataCon:: CoreBndr -> DataCon -> [CoreBndr] -> CoreExpr -> + Flatten (CoreBind, CoreBind, [CoreBind]) +liftSingleDataCon b dcon bnds expr = + do + let dconId = dataConTag dcon + indexExpr <- mkIndexOfExprDCon (idType b) b dconId + (bb, bbind) <- mkBind FSLIT("is") indexExpr + lbnds <- mapM liftBinderType bnds + ((lExpr, _), bnds') <- packContext bb (extendContext lbnds (lift expr)) + (_, vbind) <- mkBind FSLIT("r") lExpr + return (bbind, vbind, bnds') + +-- FIXME: clean this up. the datacon and the literal case are so +-- similar that it would be easy to use the same function here +-- instead of duplicating all the code. +-- +liftCaseDataConDefault:: CoreBndr -> (Alt CoreBndr) -> [Alt CoreBndr] + -> Flatten (CoreBind, CoreBind, [CoreBind]) +liftCaseDataConDefault b (_, _, def) alts = + do + let dconIds = map (\(DataAlt d, _, _) -> dataConTag d) alts + indexExpr <- mkIndexOfExprDConDft (idType b) b dconIds + (bb, bbind) <- mkBind FSLIT("is") indexExpr + ((lDef, _), bnds) <- packContext bb (lift def) + (_, vbind) <- mkBind FSLIT("r") lDef + return (bbind, vbind, bnds) + +-- liftCaseLit: checks if we have a default case and handles it +-- if necessary +liftCaseLit:: CoreBndr -> [Alt CoreBndr] -> + Flatten ([CoreBind], [CoreBind], [CoreBind]) +liftCaseLit b [] = + return ([], [], []) --FIXME: a case with no cases at all??? +liftCaseLit b alls@(alt:alts) + | isDefault alt = + do + (i, e, defAltBndrs) <- liftCaseLitDefault b alt alts + (is, es, altBndrs) <- liftCaseLit' b alts + return (i:is, e:es, defAltBndrs ++ altBndrs) + | otherwise = + do + liftCaseLit' b alls + +-- liftCaseLitDefault: looks at all the other alternatives which +-- contain a literal and filters all those elements from the +-- array which do not match any of the literals in the other +-- alternatives. 
+liftCaseLitDefault:: CoreBndr -> (Alt CoreBndr) -> [Alt CoreBndr] + -> Flatten (CoreBind, CoreBind, [CoreBind]) +liftCaseLitDefault b (_, _, def) alts = + do + let lits = map (\(LitAlt l, _, _) -> l) alts + indexExpr <- mkIndexOfExprDft (idType b) b lits + (bb, bbind) <- mkBind FSLIT("is") indexExpr + ((lDef, _), bnds) <- packContext bb (lift def) + (_, vbind) <- mkBind FSLIT("r") lDef + return (bbind, vbind, bnds) + +-- FIXME: +-- Assumption: in case of Lit, the list of binders of the alt is empty. +-- +-- returns +-- a list of all vars bound to the expr in the body of the alternative +-- a list of (var, expr) pairs, where var has to be bound to expr +-- by letWrapper +liftCaseLit':: CoreBndr -> [Alt CoreBndr] -> + Flatten ([CoreBind], [CoreBind], [CoreBind]) +liftCaseLit' _ [] = + do + return ([], [], []) +liftCaseLit' b ((LitAlt lit, [], expr):alts) = + do + (permBnd, exprBnd, packBnd) <- liftSingleCaseLit b lit expr + (permBnds, exprBnds, packBnds) <- liftCaseLit' b alts + return (permBnd:permBnds, exprBnd:exprBnds, packBnd ++ packBnds) + +-- lift a single alternative of the form: case b of lit -> expr. +-- +-- It returns the bindings: +-- (a) let b' = indexOfP (mapP (\x -> x == lit) b) +-- +-- (b) lift expr in the packed context. Returns lexpr and the +-- list of binds (bnds) that describe the packed arrays +-- +-- (c) create new var v' to bind lexpr to +-- +-- (d) return (b' = indexOf...., v' = lexpr, bnds) +liftSingleCaseLit:: CoreBndr -> Literal -> CoreExpr -> + Flatten (CoreBind, CoreBind, [CoreBind]) +liftSingleCaseLit b lit expr = + do + indexExpr <- mkIndexOfExpr (idType b) b lit -- (a) + (bb, bbind) <- mkBind FSLIT("is") indexExpr + ((lExpr, t), bnds) <- packContext bb (lift expr) -- (b) + (_, vbind) <- mkBind FSLIT("r") lExpr + return (bbind, vbind, bnds) + +-- letWrapper lExpr b ([indexbnd_i], [exprbnd_i], [pckbnd_ij]) +-- +-- let b = lExpr in +-- let index_bnd_1 in +-- let packbnd_11 in +-- ... packbnd_1m in +-- let exprbnd_1 in .... +-- ... +-- let nvar = replicate dummy (length <current context>) +-- nvar1 = bpermuteDftP index_bnd_1 ... +-- +-- in bpermuteDftP index_bnd_n nvar_(n-1) +-- +letWrapper:: CoreExpr -> CoreBndr ->([CoreBind], [CoreBind], [CoreBind]) -> + Flatten (CoreExpr, Type) +letWrapper lExpr b (indBnds, exprBnds, pckBnds) = + do + (defBpBnds, ty) <- dftbpBinders indBnds exprBnds + let resExpr = getExprOfBind (head defBpBnds) + return ((mkLets (indBnds ++ pckBnds ++ exprBnds ++ defBpBnds) resExpr), ty) + +-- dftbpBinders: return the list of binders necessary to construct the overall +-- result from the subresults computed in the different branches of the case +-- statement. The binding which contains the final result is in the *head* +-- of the result list. +-- +-- dftbpBinders [ind_i = ...] [expr_i = ...] = [dn = ..., d_n-1 = .., d1 = ...] +-- +-- let def = replicate (length of context) undefined +-- d1 = bpermuteDftP dft e1 i1 +-- ..... 
+-- +dftbpBinders:: [CoreBind] -> [CoreBind] -> Flatten ([CoreBind], Type) +dftbpBinders indexBnds exprBnds = + do + let expr = getExprOfBind (head exprBnds) + defVecExpr <- createDftArrayBind expr + ((b, bnds), t) <- dftbpBinders' indexBnds exprBnds defVecExpr + return ((b:bnds),t) + where + dftbpBinders' :: [CoreBind] + -> [CoreBind] + -> CoreBind + -> Flatten ((CoreBind, [CoreBind]), Type) + dftbpBinders' [] [] cBnd = + return ((cBnd, []), panic "dftbpBinders: undefined type") + dftbpBinders' (i:is) (e:es) cBind = + do + let iVar = getVarOfBind i + let eVar = getVarOfBind e + let cVar = getVarOfBind cBind + let ty = idType eVar + newBnd <- mkDftBackpermute ty iVar eVar cVar + ((fBnd, restBnds), _) <- dftbpBinders' is es newBnd + return ((fBnd, (newBnd:restBnds)), liftTy ty) + + dftbpBinders' _ _ _ = + panic "Flattening.dftbpBinders: index and expression binder lists have different length!" + +getExprOfBind:: CoreBind -> CoreExpr +getExprOfBind (NonRec _ expr) = expr + +getVarOfBind:: CoreBind -> Var +getVarOfBind (NonRec b _) = b + + + +-- Optimised Transformation +-- ========================= +-- + +-- liftSimpleFun +-- if variables x_1 to x_i occur in the context *and* free in expr +-- then +-- (liftSimpleExpression expr) => mapP (\ (x1,..xn) -> expr) (x1,..xn) +-- +liftSimpleFun:: CoreBndr -> CoreExpr -> Flatten (CoreExpr, Type) +liftSimpleFun b expr = + do + bndVars <- collectBoundVars expr + let bndVars' = b:bndVars + bndVarsTuple = mkTuple (map idType bndVars') (map Var bndVars') + lamExpr = mkLams (b:bndVars) expr -- FIXME: should be tuple + -- here + let (t1, t2) = funTyArgs . exprType $ lamExpr + mapExpr <- mk'mapP t1 t2 lamExpr bndVarsTuple + let lexpr = mkApps mapExpr [bndVarsTuple] + return (lexpr, undefined) -- FIXME!!!!! + + +collectBoundVars:: CoreExpr -> Flatten [CoreBndr] +collectBoundVars expr = + intersectWithContext (exprFreeVars expr) + + +-- auxilliary routines +-- ------------------- + +-- mkIndexOfExpr b lit -> +-- indexOf (mapP (\x -> x == lit) b) b +-- +mkIndexOfExpr:: Type -> CoreBndr -> Literal -> Flatten CoreExpr +mkIndexOfExpr idType b lit = + do + eqExpr <- mk'eq idType (Var b) (Lit lit) + let lambdaExpr = (Lam b eqExpr) + mk'indexOfP idType lambdaExpr (Var b) + +-- there is FlattenMonad.mk'indexOfP as well as +-- CoreSyn.mkApps and CoreSyn.mkLam, all of which should help here + +-- for case-distinction over data constructors: +-- let b = expr in +-- case b of +-- dcon args -> .... +-- dconId = dataConTag dcon +-- the call "mkIndexOfExprDCon b dconId" computes the core expression for +-- indexOfP (\x -> x == dconId) b) +-- +mkIndexOfExprDCon::Type -> CoreBndr -> Int -> Flatten CoreExpr +mkIndexOfExprDCon idType b dId = + do + let intExpr = mkIntLitInt dId + eqExpr <- mk'eq idType (Var b) intExpr + let lambdaExpr = (Lam b intExpr) + mk'indexOfP idType lambdaExpr (Var b) + + + +-- there is FlattenMonad.mk'indexOfP as well as +-- CoreSyn.mkApps and CoreSyn.mkLam, all of which should help here + +-- mk'IndexOfExprDConDft b dconIds : Generates the index expression for the +-- default case. "dconIds" is a list of all the data constructor idents which +-- are covered by the other cases. +-- indexOfP (\x -> x != dconId_1 && ....) b) +-- +mkIndexOfExprDConDft:: Type -> CoreBndr -> [Int] -> Flatten CoreExpr +mkIndexOfExprDConDft idType b dId = + do + let intExprs = map mkIntLitInt dId + bExpr <- foldM (mk'neq idType) (head intExprs) (tail intExprs) + let lambdaExpr = (Lam b bExpr) + mk'indexOfP idType (Var b) bExpr + + +-- mkIndexOfExprDef b [lit1, lit2,...] 
-> +-- indexOf (\x -> not (x == lit1 || x == lit2 ....) b +mkIndexOfExprDft:: Type -> CoreBndr -> [Literal] -> Flatten CoreExpr +mkIndexOfExprDft idType b lits = + do + let litExprs = map (\l-> Lit l) lits + bExpr <- foldM (mk'neq idType) (head litExprs) (tail litExprs) + let lambdaExpr = (Lam b bExpr) + mk'indexOfP idType bExpr (Var b) + + +-- create a back-permute binder +-- +-- * `mkDftBackpermute ty indexArrayVar srcArrayVar dftArrayVar' creates a +-- Core binding of the form +-- +-- x = bpermuteDftP indexArrayVar srcArrayVar dftArrayVar +-- +-- where `x' is a new local variable +-- +mkDftBackpermute :: Type -> Var -> Var -> Var -> Flatten CoreBind +mkDftBackpermute ty idx src dft = + do + rhs <- mk'bpermuteDftP ty (Var idx) (Var src) (Var dft) + liftM snd $ mkBind FSLIT("dbp") rhs + +-- create a dummy array with elements of the given type, which can be used as +-- default array for the combination of the subresults of the lifted case +-- expression +-- +createDftArrayBind :: CoreExpr -> Flatten CoreBind +createDftArrayBind e = + panic "Flattening.createDftArrayBind: not implemented yet" +{- + do + let ty = parrElemTy . exprType $ expr + len <- mk'lengthP e + rhs <- mk'replicateP ty len err?? + lift snd $ mkBind FSLIT("dft") rhs +FIXME: nicht so einfach; man kann kein "error"-Wert nehmen, denn der w"urde + beim bpermuteDftP sofort evaluiert, aber es ist auch schwer m"oglich einen + generischen Wert f"ur jeden beliebigen Typ zu erfinden. +-} + + + + +-- show functions (the pretty print functions sometimes don't +-- show it the way I want.... + +-- shows just the structure +showCoreExpr (Var _ ) = "Var " +showCoreExpr (Lit _) = "Lit " +showCoreExpr (App e1 e2) = + "(App \n " ++ (showCoreExpr e1) ++ "\n " ++ (showCoreExpr e2) ++ ") " +showCoreExpr (Lam b e) = + "Lam b " ++ (showCoreExpr e) +showCoreExpr (Let bnds expr) = + "Let \n" ++ (showBinds bnds) ++ "in " ++ (showCoreExpr expr) + where showBinds (NonRec b e) = showBind (b,e) + showBinds (Rec bnds) = concat (map showBind bnds) + showBind (b,e) = " b = " ++ (showCoreExpr e)++ "\n" +-- gaw 2004 FIX? +showCoreExpr (Case ex b ty alts) = + "Case b = " ++ (showCoreExpr ex) ++ " of \n" ++ (showAlts alts) + where showAlts _ = "" +showCoreExpr (Note _ ex) = "Note n " ++ (showCoreExpr ex) +showCoreExpr (Type t) = "Type" diff --git a/compiler/ndpFlatten/NDPCoreUtils.hs b/compiler/ndpFlatten/NDPCoreUtils.hs new file mode 100644 index 0000000000..6e6b94f175 --- /dev/null +++ b/compiler/ndpFlatten/NDPCoreUtils.hs @@ -0,0 +1,174 @@ +-- $Id$ +-- +-- Copyright (c) [2001..2002] Manuel M T Chakravarty & Gabriele Keller +-- +-- Auxiliary routines for NDP-related Core transformations. +-- +--- DESCRIPTION --------------------------------------------------------------- +-- +-- This module exports all functions to access and alter the `Type' data +-- structure from modules `Type' and `CoreExpr' from `CoreSyn'. As it is part +-- of the NDP flattening component, the functions provide access to all the +-- fields that are important for the flattening and lifting transformation. 
+-- +--- DOCU ---------------------------------------------------------------------- +-- +-- Language: Haskell 98 +-- +--- TODO ---------------------------------------------------------------------- +-- + +module NDPCoreUtils ( + + -- type inspection functions + -- + tupleTyArgs, -- :: Type -> [Type] + funTyArgs, -- :: Type -> (Type, Type) + parrElemTy, -- :: Type -> Type + + -- Core generation functions + -- + mkTuple, -- :: [Type] -> [CoreExpr] -> CoreExpr + mkInt, -- :: CoreExpr -> CoreExpr + + -- query functions + -- + isDefault, -- :: CoreAlt -> Bool + isLit, -- :: [CoreAlt] -> Bool + isSimpleExpr, -- :: CoreExpr -> Bool + + -- re-exported functions + -- + mkPArrTy, -- :: Type -> Type + boolTy, -- :: Type + + -- substitution + -- + substIdEnv +) where + +-- GHC +import Panic (panic) +import Outputable (Outputable(ppr), pprPanic) +import BasicTypes (Boxity(..)) +import Type (Type, splitTyConApp_maybe, splitFunTy) +import TyCon (isTupleTyCon) +import TysWiredIn (parrTyCon, unitDataConId, tupleCon, intDataCon, mkPArrTy, + boolTy) +import CoreSyn (CoreExpr, CoreAlt, Expr(..), AltCon(..), + Bind(..), mkConApp) +import PprCore ( {- instances -} ) +import Var (Id) +import VarEnv (IdEnv, delVarEnv, delVarEnvList, lookupVarEnv) + +-- friends: don't import any to avoid cyclic imports +-- + + +-- type inspection functions +-- ------------------------- + +-- determines the argument types of a tuple type (EXPORTED) +-- +tupleTyArgs :: Type -> [Type] +tupleTyArgs ty = + case splitTyConApp_maybe ty of + Just (tyCon, argTys) | isTupleTyCon tyCon -> argTys + _ -> + pprPanic "NDPCoreUtils.tupleTyArgs: wrong type: " (ppr ty) + +-- determines the argument and result type of a function type (EXPORTED) +-- +funTyArgs :: Type -> (Type, Type) +funTyArgs = splitFunTy + +-- for a type of the form `[:t:]', yield `t' (EXPORTED) +-- +-- * if the type has any other form, a fatal error occurs +-- +parrElemTy :: Type -> Type +parrElemTy ty = + case splitTyConApp_maybe ty of + Just (tyCon, [argTy]) | tyCon == parrTyCon -> argTy + _ -> + pprPanic "NDPCoreUtils.parrElemTy: wrong type: " (ppr ty) + + +-- Core generation functions +-- ------------------------- + +-- make a tuple construction expression from a list of argument types and +-- argument values (EXPORTED) +-- +-- * the two lists need to be of the same length +-- +mkTuple :: [Type] -> [CoreExpr] -> CoreExpr +mkTuple [] [] = Var unitDataConId +mkTuple [_] [e] = e +mkTuple ts es | length ts == length es = + mkConApp (tupleCon Boxed (length es)) (map Type ts ++ es) +mkTuple _ _ = + panic "NDPCoreUtils.mkTuple: mismatch between number of types and exprs!" 
+ +-- make a boxed integer from an unboxed one (EXPORTED) +-- +mkInt :: CoreExpr -> CoreExpr +mkInt e = mkConApp intDataCon [e] + + +-- query functions +-- --------------- + +-- checks whether a given case alternative is a default alternative (EXPORTED) +-- +isDefault :: CoreAlt -> Bool +isDefault (DEFAULT, _, _) = True +isDefault _ = False + +-- check whether a list of case alternatives in belongs to a case over a +-- literal type (EXPORTED) +-- +isLit :: [CoreAlt] -> Bool +isLit ((DEFAULT, _, _ ):alts) = isLit alts +isLit ((LitAlt _, _, _):_ ) = True +isLit _ = False + +-- FIXME: this function should get a more expressive name and maybe also a +-- more detailed return type (depends on how the analysis goes) +isSimpleExpr:: CoreExpr -> Bool +isSimpleExpr _ = + -- FIXME + False + + +-- Substitution +-- ------------- + +substIdEnv:: IdEnv Id -> CoreExpr -> CoreExpr +substIdEnv env e@(Lit _) = e +substIdEnv env e@(Var id) = + case (lookupVarEnv env id) of + Just v -> (Var v) + _ -> e +substIdEnv env (App e arg) = + App (substIdEnv env e) (substIdEnv env arg) +substIdEnv env (Lam b expr) = + Lam b (substIdEnv (delVarEnv env b) expr) +substIdEnv env (Let (NonRec b expr1) expr2) = + Let (NonRec b (substIdEnv env expr1)) + (substIdEnv (delVarEnv env b) expr2) +substIdEnv env (Let (Rec bnds) expr) = + let + newEnv = delVarEnvList env (map fst bnds) + newExpr = substIdEnv newEnv expr + substBnd (b,e) = (b, substIdEnv newEnv e) + in Let (Rec (map substBnd bnds)) newExpr +substIdEnv env (Case expr b ty alts) = + Case (substIdEnv newEnv expr) b ty (map substAlt alts) + where + newEnv = delVarEnv env b + substAlt (c, bnds, expr) = + (c, bnds, substIdEnv (delVarEnvList env bnds) expr) +substIdEnv env (Note n expr) = + Note n (substIdEnv env expr) +substIdEnv env e@(Type t) = e diff --git a/compiler/ndpFlatten/PArrAnal.hs b/compiler/ndpFlatten/PArrAnal.hs new file mode 100644 index 0000000000..2db56221b2 --- /dev/null +++ b/compiler/ndpFlatten/PArrAnal.hs @@ -0,0 +1,203 @@ +-- $Id$ +-- +-- Copyright (c) 2002 Manuel M T Chakravarty & Gabriele Keller +-- +-- Analysis phase for an optimised flattening transformation +-- +--- DESCRIPTION --------------------------------------------------------------- +-- +-- This module implements an analysis phase that identifies Core expressions +-- that need not be transformed during flattening. The expressions when +-- executed in a parallel context are implemented as an iteration over the +-- original scalar computation, instead of vectorising the computation. This +-- usually improves efficiency by increasing locality and also reduces code +-- size. +-- +--- DOCU ---------------------------------------------------------------------- +-- +-- Language: Haskell 98 with C preprocessor +-- +-- Analyse the expression and annotate each simple subexpression accordingly. +-- +-- The result of the analysis is stored in a new field in IdInfo (has yet to +-- be extended) +-- +-- A simple expression is any expression which is not a function, not of +-- recursive type and does not contain a value of PArray type. Polymorphic +-- variables are simple expressions even though they might be instantiated to +-- a parray value or function. 
+-- +--- TODO ---------------------------------------------------------------------- +-- + +module PArrAnal ( + markScalarExprs -- :: [CoreBind] -> [CoreBind] +) where + +import Panic (panic) +import Outputable (pprPanic, ppr) +import CoreSyn (CoreBind) + +import TypeRep (Type(..)) +import Var (Var(..),Id) +import Literal (Literal) +import CoreSyn (Expr(..),CoreExpr,Bind(..)) +import PprCore ( {- instances -} ) +-- + +data ArrayUsage = Prim | NonPrim | Array + | PolyExpr (Id -> Maybe (ArrayUsage -> ArrayUsage)) + | PolyFun (ArrayUsage -> ArrayUsage) + + +arrUsage:: CoreExpr -> ArrayUsage +arrUsage (Var id) = varArrayUsage id +arrUsage (Lit lit) = litArrayUsage lit +arrUsage (App expr1 expr2) = + let + arr1 = arrUsage expr1 + arr2 = arrUsage expr2 + in + case (arr1, arr2) of + (_, Array) -> Array + (PolyFun f, _) -> f arr2 + (_, _) -> arr1 + +arrUsage (Lam b expr) = + bindType (b, expr) + +arrUsage (Let (NonRec b expr1) expr2) = + arrUsage (App (Lam b expr2) expr1) + +arrUsage (Let (Rec bnds) expr) = + let + t1 = foldr combineArrayUsage Prim (map bindType bnds) + t2 = arrUsage expr + in if isArrayUsage t1 then Array else t2 + +arrUsage (Case expr b _ alts) = + let + t1 = arrUsage expr + t2 = scanType (map (arrUsage . (\ (_,_,x) -> x)) alts) + in scanType [t1, t2] + +arrUsage (Note n expr) = + arrUsage expr + +arrUsage (Type t) = + typeArrayUsage t + +bindType (b, expr) = + let + bT = varArrayUsage b + exprT = arrUsage expr + in case (bT, exprT) of + (Array, _) -> Array + _ -> exprT + +scanType:: [ArrayUsage] -> ArrayUsage +scanType [t] = t +scanType (Array:ts) = Array +scanType (_:ts) = scanType ts + + + +-- the code expression represents a built-in function which generates +-- an array +isArrayGen:: CoreExpr -> Bool +isArrayGen _ = + panic "PArrAnal: isArrayGen: not yet implemented" + +isArrayCon:: CoreExpr -> Bool +isArrayCon _ = + panic "PArrAnal: isArrayCon: not yet implemented" + +markScalarExprs:: [CoreBind] -> [CoreBind] +markScalarExprs _ = + panic "PArrAnal.markScalarExprs: not implemented yet" + + +varArrayUsage:: Id -> ArrayUsage +varArrayUsage = + panic "PArrAnal.varArrayUsage: not yet implented" + +litArrayUsage:: Literal -> ArrayUsage +litArrayUsage = + panic "PArrAnal.litArrayUsage: not yet implented" + + +typeArrayUsage:: Type -> ArrayUsage +typeArrayUsage (TyVarTy tvar) = + PolyExpr (tIdFun tvar) +typeArrayUsage (AppTy _ _) = + panic "PArrAnal.typeArrayUsage: AppTy case not yet implemented" +typeArrayUsage (TyConApp tc tcargs) = + let + tcargsAU = map typeArrayUsage tcargs + tcCombine = foldr combineArrayUsage Prim tcargsAU + in auCon tcCombine +typeArrayUsage t@(PredTy _) = + pprPanic "PArrAnal.typeArrayUsage: encountered 'PredType - shouldn't be here!" + (ppr t) + + +combineArrayUsage:: ArrayUsage -> ArrayUsage -> ArrayUsage +combineArrayUsage Array _ = Array +combineArrayUsage _ Array = Array +combineArrayUsage (PolyExpr f1) (PolyExpr f2) = + PolyExpr f' + where + f' var = + let + f1lookup = f1 var + f2lookup = f2 var + in + case (f1lookup, f2lookup) of + (Nothing, _) -> f2lookup + (_, Nothing) -> f1lookup + (Just f1', Just f2') -> Just ( \e -> (combineArrayUsage (f1' e) (f2' e))) +combineArrayUsage (PolyFun f) (PolyExpr g) = + panic ("PArrAnal.typeArrayUsage: PolyFun as argument in data" ++ + " constructor - should not (?) happen\n") +combineArrayUsage (PolyExpr g) (PolyFun f) = + panic ("PArrAnal.typeArrayUsage: PolyFun as argument in data" ++ + " constructor - should not (?) 
happen\n") +combineArrayUsage NonPrim _ = NonPrim +combineArrayUsage _ NonPrim = NonPrim +combineArrayUsage Prim Prim = Prim + + +isArrayUsage:: ArrayUsage -> Bool +isArrayUsage Array = True +isArrayUsage _ = False + +-- Functions to serve as arguments for PolyExpr +-- --------------------------------------------- + +tIdFun:: Var -> Var -> Maybe (ArrayUsage -> ArrayUsage) +tIdFun t tcomp = + if t == tcomp then + Just auId + else + Nothing + +-- Functions to serve as argument for PolyFun +-- ------------------------------------------- + +auId:: ArrayUsage -> ArrayUsage +auId = id + +auCon:: ArrayUsage -> ArrayUsage +auCon Prim = NonPrim +auCon (PolyExpr f) = PolyExpr f' + where f' v = case f v of + Nothing -> Nothing + Just g -> Just ( \e -> (auCon (g e))) +auCon (PolyFun f) = PolyFun (auCon . f) +auCon _ = Array + +-- traversal of Core expressions +-- ----------------------------- + +-- FIXME: implement + diff --git a/compiler/ndpFlatten/TODO b/compiler/ndpFlatten/TODO new file mode 100644 index 0000000000..e596609205 --- /dev/null +++ b/compiler/ndpFlatten/TODO @@ -0,0 +1,202 @@ + TODO List for Flattening Support in GHC -*-text-*- + ======================================= + +Middle-End Related +~~~~~~~~~~~~~~~~~~ + +Flattening Transformation +~~~~~~~~~~~~~~~~~~~~~~~~~ + +* Complete and test + +* Complete the analysis + +* Type transformation: The idea solution would probably be if we can add some + generic machinery, so that we can define all the rules for handling the type + and value transformations in a library. (The PrelPArr for WayNDP.) + + +Library Related +~~~~~~~~~~~~~~~ + +* Problem with re-exporting PrelPArr from Prelude is that it would also be + visible when -pparr is not given. There should be a mechanism to implicitly + import more than one module (like PERVASIVE modules in M3) + +* We need a PrelPArr-like library for when flattening is used, too. In fact, + we need some library routines that are on the level of merely vectorised + code (eg, for the dummy default vectors), and then, all the `PArrays' stuff + implementing fast unboxed arrays and fusion. + +* Enum is a problem. Ideally, we would like `enumFromToP' and + `enumFromThenToP' to be members of `Enum'. On the other hand, we really do + not want to change `Enum'. The solution for the moment is to define + + enumFromTo x y = mapP toEnum [:fromEnum x .. fromEnum y:] + enumFromThenTo x y z = mapP toEnum [:fromEnum x, fromEnum y .. fromEnum z:] + + like the Haskell Report does for the list versions. This is hopefully + efficient enough as array fusion should fold the two traversals into one. + [DONE] + + +DOCU that should go into the Commentary +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The type constructor [::] +------------------------- + +The array type constructor [::] is quite similar to [] (list constructor) in +that GHC has to know about it (in TysWiredIn); however, there are some +differences: + +* [::] is an abstract type, whereas [] is not + +* if flattening is switched on, all occurences of the type are actually + removed by appropriate program transformations. + +The module PrelPArr that actually implements nested parallel arrays. [::] is +eliminated only if in addition to array support, flattening is activated. It +is just an option rather than the only method to implement those arrays. 
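Picking up the Enum workaround from the Library Related section above: with ordinary lists standing in for [::] and map for mapP, the two definitions read as below. This is a sketch for illustration only, not the PrelPArr code; the L suffix is invented to avoid clashing with the Prelude names.

    enumFromToL :: Enum a => a -> a -> [a]
    enumFromToL x y = map toEnum [fromEnum x .. fromEnum y]

    enumFromThenToL :: Enum a => a -> a -> a -> [a]
    enumFromThenToL x y z = map toEnum [fromEnum x, fromEnum y .. fromEnum z]

    -- e.g. enumFromToL 'a' 'e' == "abcde"; folding the map into the
    -- enumeration by fusion is the efficiency argument made above.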
+ + Flags: -fparr -- syntactic support for parallel arrays (via `PrelPArr') + * Dynamic hsc option; can be reversed with -fno-parr + -fflatten -- flattening transformation + * Static hsc option + -ndp -- this a way option, which implies -fparr and -fflatten + (way options are handled by the driver and are not + directly seen by hsc) + -ddump-vect -- dump Core after vectorisation + * Dynamic hsc option + +* PrelPArr implements array variants of the Prelude list functions plus some + extra functions (also, some list functions (eg, those generating infinite + lists) have been left out. + +* prelude/PrelNames has been extended with all the names from PrelPArr that + need to be known inside the compiler + +* The variable GhcSupportsPArr, which can be set in build.mk decides whether + `PrelPArr' is to be compiled or not. (We probably need to supress compiling + PrelPArr in WayNDP, or rather replace it with a different PrelPArr.) + +* Say something about `TysWiredIn.parrTyCon' as soon as we know how it + actually works... + +Parser and AST Notes: +- Parser and AST is quite straight forward. Essentially, the list cases + duplicated with a name containing `PArr' or `parr' and modified to fit the + slightly different semantics (ie, finite length, strict). +- The value and pattern `[::]' is an empty explicit parallel array (ie, + something of the form `ExplicitPArr ty []' in the AST). This is in contrast + to lists, which use the nil-constructor instead. In the case of parallel + arrays, using a constructor would be rather awkward, as it is not a + constructor-based type. +- Thus, array patterns have the general form `[:p1, p2, ..., pn:]', where n >= + 0. Thus, two array patterns overlap iff they have the same length. +- The type constructor for parallel is internally represented as a + `TyCon.AlgTyCon' with a wired in definition in `TysWiredIn'. + +Desugarer Notes: +- Desugaring of patterns involving parallel arrays: + * In Match.tidy1, we use fake array constructors; ie, any pattern `[:p1, ..., + pn:]' is replaces by the expression `MkPArr<n> p1 ... pn', where + `MkPArr<n>' is the n-ary array constructor. These constructors are fake, + because they are never used to actually represent array values; in fact, + they are removed again before pattern compilation is finished. However, + the use of these fake constructors implies that we need not modify large + parts of the machinery of the pattern matching compiler, as array patterns + are handled like any other constructor pattern. + * Check.simplify_pat introduces the same fake constructors as Match.tidy1 + and removed again by Check.make_con. + * In DsUtils.mkCoAlgCaseMatchResult, we catch the case of array patterns and + generate code as the following example illustrates, where the LHS is the + code that would be produced if array construtors would really exist: + + case v of pa { + MkPArr1 x1 -> e1 + MkPArr2 x2 x3 x4 -> e2 + DFT -> e3 + } + + => + + case lengthP v of + Int# i# -> + case i# of l { + 1 -> let x1 = v!:0 in e1 + 3 -> let x2 = v!:0; x2 = v!:1; x3 = v!:2 in e2 + DFT -> e3 + } + * The desugaring of array comprehensions is in `DsListComp', but follows + rules that are different from that for translating list comprehensions. + Denotationally, it boils down to the same, but the operational + requirements for an efficient implementation of array comprehensions are + rather different. 
+ + [:e | qss:] = <<[:e | qss:]>> () [:():] + + <<[:e' | :]>> pa ea = mapP (\pa -> e') ea + <<[:e' | b , qs:]>> pa ea = <<[:e' | qs:]>> pa (filterP (\pa -> b) ea) + <<[:e' | p <- e, qs:]>> pa ea = + let ef = filterP (\x -> case x of {p -> True; _ -> False}) e + in + <<[:e' | qs:]>> (pa, p) (crossP ea ef) + <<[:e' | let ds, qs:]>> pa ea = + <<[:e' | qs:]>> (pa, (x_1, ..., x_n)) + (mapP (\v@pa -> (v, let ds in (x_1, ..., x_n))) ea) + where + {x_1, ..., x_n} = DV (ds) -- Defined Variables + <<[:e' | qs | qss:]>> pa ea = + <<[:e' | qss:]>> (pa, (x_1, ..., x_n)) + (zipP ea <<[:(x_1, ..., x_n) | qs:]>>) + where + {x_1, ..., x_n} = DV (qs) + + Moreover, we have + + crossP :: [:a:] -> [:b:] -> [:(a, b):] + crossP a1 a2 = let + len1 = lengthP a1 + len2 = lengthP a2 + x1 = concatP $ mapP (replicateP len2) a1 + x2 = concatP $ replicateP len1 a2 + in + zipP x1 x2 + + For a more efficient implementation of `crossP', see `PrelPArr'. + + Optimisations: + - In the `p <- e' rule, if `pa = ()', drop it and simplify the `crossP ea + e' to `e'. + - We assume that fusion will optimise sequences of array processing + combinators. + - Do we want to have the following function? + + mapFilterP :: (a -> Maybe b) -> [:a:] -> [:b:] + + Even with fusion `(mapP (\p -> e) . filterP (\p -> b))' may still result + in redundant pattern matching operations. (Let's wait with this until + we have seen what the Simplifier does to the generated code.) + +Flattening Notes: +* The story about getting access to all the names like "fst" etc that we need + to generate during flattening is quite involved. To have a reasonable + chance to get at the stuff, we need to put flattening inbetween the + desugarer and the simplifier as an extra pass in HscMain.hscMain. After + that point, the persistent compiler state is zapped (for heap space + reduction reasons, I guess) and nothing remains of the imported interfaces + in one shot mode. + + Moreover, to get the Ids that we need into the type environment, we need to + force the renamer to include them. This is done in + RnEnv.getImplicitModuleFVs, which computes all implicitly imported names. + We let it add the names from FlattenInfo.namesNeededForFlattening. + + Given all these arrangements, FlattenMonad can obtain the needed Ids from + the persistent compiler state without much further hassle. + + [It might be worthwhile to document in the non-Flattening part of the + Commentary that the persistent compiler state is zapped after desugaring and + how the free variables determined by the renamer imply which names are + imported.] 
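Referring back to the crossP combinator in the Desugarer Notes above, the same construction on ordinary lists, with length, concat, map, replicate and zip standing in for their P counterparts (crossL is an invented name; this is an illustration, not the PrelPArr implementation):

    crossL :: [a] -> [b] -> [(a, b)]
    crossL a1 a2 = zip x1 x2
      where
        len1 = length a1
        len2 = length a2
        x1   = concat (map (replicate len2) a1)
        x2   = concat (replicate len1 a2)

    -- crossL [1,2] "ab" ==> [(1,'a'),(1,'b'),(2,'a'),(2,'b')]

This is the cartesian product used by the `p <- e' generator rule; as noted above, see PrelPArr for a more efficient implementation.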
diff --git a/compiler/package.conf.in b/compiler/package.conf.in new file mode 100644 index 0000000000..b356e90000 --- /dev/null +++ b/compiler/package.conf.in @@ -0,0 +1,300 @@ +name: PACKAGE +version: VERSION +license: BSD3 +maintainer: glasgow-haskell-users@haskell.org +exposed: False + +exposed-modules: + AsmCodeGen + Bag + BasicTypes + BinIface + Binary + BitSet + Bitmap + BuildTyCl + ByteCodeAsm + ByteCodeFFI + ByteCodeGen + ByteCodeInstr + ByteCodeItbls + ByteCodeLink + CLabel + CSE + CgBindery + CgCallConv + CgCase + CgClosure + CgCon + CgExpr + CgForeignCall + CgHeapery + CgInfoTbls + CgLetNoEscape + CgMonad + CgParallel + CgPrimOp + CgProf + CgStackery + CgTailCall + CgTicky + CgUtils + Check + Class + ClosureInfo + CmdLineParser + Cmm + CmmLex + CmmLint + CmmParse + CmmUtils + CodeGen + CodeOutput + Config + Constants + Convert + CoreFVs + CoreLint + CorePrep + CoreSubst + CoreSyn + CoreTidy + CoreToStg + CoreUnfold + CoreUtils + CostCentre + CprAnalyse + Ctype + DataCon + Demand + Desugar + Digraph + DmdAnal + DriverMkDepend + DriverPhases + DriverPipeline + DsArrows + DsBinds + DsCCall + DsExpr + DsForeign + DsGRHSs + DsListComp + DsMeta + DsMonad + DsUtils + DynFlags + ErrUtils + ExternalCore + FastMutInt + Encoding + FastString + FastTypes + FieldLabel + Finder + FiniteMap + FlattenInfo + FlattenMonad + Flattening + FloatIn + FloatOut + ForeignCall + FunDeps + GHC + Generics + HeaderInfo + HsBinds + HsDecls + HsExpr + HsImpExp + HsLit + HsPat + HsSyn + HsTypes + HsUtils + HscMain + HscStats + HscTypes + IOEnv + Id + IdInfo + IfaceEnv + IfaceSyn + IfaceType + IlxGen + Inst + InstEnv + Java + JavaGen + InteractiveUI + Kind + Lexer + LexCore + LiberateCase + Linker + ListSetOps + Literal + LoadIface + MachCodeGen + MachInstrs + MachOp + MachRegs + Match + MatchCon + MatchLit + Maybes + MkExternalCore + MkId + MkIface + Module + NCGMonad + NDPCoreUtils + Name + NameEnv + NameSet + NewDemand + ObjLink + OccName + OccurAnal + OrdList + Outputable + PArrAnal + PackageConfig + Packages + Panic + Parser + ParserCoreUtils + ParsePkgConf + PositionIndependentCode + PprC + PprCmm + PprCore + PprExternalCore + PprMach + PprTyThing + PrelInfo + PrelNames + PrintJava + PrelRules + Pretty + PrimOp + RdrHsSyn + RdrName + RegAllocInfo + RegisterAlloc + RnBinds + RnEnv + RnExpr + RnHsSyn + RnNames + RnSource + RnTypes + Rules + SAT + SATMonad + SCCfinal + SMRep + SRT + SaAbsInt + SaLib + SetLevels + SimplCore + SimplEnv + SimplMonad + SimplStg + SimplUtils + Simplify + SpecConstr + Specialise + SrcLoc + StaticFlags + StgLint + StgStats + StgSyn + StrictAnal + StringBuffer + SysTools + TcArrows + TcBinds + TcClassDcl + TcDefaults + TcDeriv + TcEnv + TcExpr + TcForeign + TcGenDeriv + TcHsSyn + TcHsType + TcIface + TcInstDcls + TcMType + TcMatches + TcPat + TcRnDriver + TcRnMonad + TcRnTypes + TcRules + TcSimplify + TcSplice + TcTyClsDecls + TcTyDecls + TcType + TcUnify + TidyPgm + TyCon + Type + TypeRep + TysPrim + TysWiredIn + Unify + UniqFM + UniqSet + UniqSupply + Unique + Util + Var + VarEnv + VarSet + WorkWrap + WwLib + +#ifdef INSTALLING +import-dirs: PKG_LIBDIR"/hslibs-imports/ghc" +#else +import-dirs: FPTOOLS_TOP_ABS"/ghc/compiler/stage2/utils", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/basicTypes", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/types", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/hsSyn", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/prelude", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/rename", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/typecheck", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/deSugar", + 
FPTOOLS_TOP_ABS"/ghc/compiler/stage2/ghci", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/coreSyn", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/specialise", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/simplCore", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/stranal", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/stgSyn", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/simplStg", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/codeGen", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/main", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/profiling", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/parser", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/cprAnalysis", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/ndpFlatten", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/iface", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/cmm", + FPTOOLS_TOP_ABS"/ghc/compiler/stage2/nativeGen" +#endif + +#ifdef INSTALLING +library-dirs: LIB_DIR +#else +library-dirs: FPTOOLS_TOP_ABS"/ghc/compiler" +#endif + +hs-libraries: "HSghc" +extra-libraries: +depends: PKG_DEPENDS +haddock-interfaces: HADDOCK_IFACE +haddock-html: HTML_DIR diff --git a/compiler/parser/Ctype.lhs b/compiler/parser/Ctype.lhs new file mode 100644 index 0000000000..dbe4e9f1b0 --- /dev/null +++ b/compiler/parser/Ctype.lhs @@ -0,0 +1,341 @@ +Character classification + +\begin{code} +module Ctype + ( is_ident -- Char# -> Bool + , is_symbol -- Char# -> Bool + , is_any -- Char# -> Bool + , is_space -- Char# -> Bool + , is_lower -- Char# -> Bool + , is_upper -- Char# -> Bool + , is_digit -- Char# -> Bool + , is_alphanum -- Char# -> Bool + + , is_hexdigit, is_octdigit + , hexDigit, octDecDigit + ) where + +#include "HsVersions.h" + +import DATA_INT ( Int32 ) +import DATA_BITS ( Bits((.&.)) ) +import Char ( ord, chr ) +\end{code} + +Bit masks + +\begin{code} +cIdent, cSymbol, cAny, cSpace, cLower, cUpper, cDigit :: Int +cIdent = 1 +cSymbol = 2 +cAny = 4 +cSpace = 8 +cLower = 16 +cUpper = 32 +cDigit = 64 +\end{code} + +The predicates below look costly, but aren't, GHC+GCC do a great job +at the big case below. + +\begin{code} +{-# INLINE is_ctype #-} +is_ctype :: Int -> Char -> Bool +is_ctype mask c = (fromIntegral (charType c) .&. fromIntegral mask) /= (0::Int32) + +is_ident, is_symbol, is_any, is_space, is_lower, is_upper, is_digit :: Char -> Bool +is_ident = is_ctype cIdent +is_symbol = is_ctype cSymbol +is_any = is_ctype cAny +is_space = is_ctype cSpace +is_lower = is_ctype cLower +is_upper = is_ctype cUpper +is_digit = is_ctype cDigit +is_alphanum = is_ctype (cLower+cUpper+cDigit) +\end{code} + +Utils + +\begin{code} +hexDigit :: Char -> Int +hexDigit c | is_digit c = ord c - ord '0' + | otherwise = ord (to_lower c) - ord 'a' + 10 + +octDecDigit :: Char -> Int +octDecDigit c = ord c - ord '0' + +is_hexdigit c + = is_digit c + || (c >= 'a' && c <= 'f') + || (c >= 'A' && c <= 'F') + +is_octdigit c = c >= '0' && c <= '7' + +to_lower c + | c >= 'A' && c <= 'Z' = chr (ord c - (ord 'A' - ord 'a')) + | otherwise = c +\end{code} + +We really mean .|. instead of + below, but GHC currently doesn't do +any constant folding with bitops. 
*sigh* + +\begin{code} +charType :: Char -> Int +charType c = case c of + '\0' -> 0 -- \000 + '\1' -> 0 -- \001 + '\2' -> 0 -- \002 + '\3' -> 0 -- \003 + '\4' -> 0 -- \004 + '\5' -> 0 -- \005 + '\6' -> 0 -- \006 + '\7' -> 0 -- \007 + '\8' -> 0 -- \010 + '\9' -> cAny + cSpace -- \t + '\10' -> cSpace -- \n (not allowed in strings, so !cAny) + '\11' -> cAny + cSpace -- \v + '\12' -> cAny + cSpace -- \f + '\13' -> cAny + cSpace -- ^M + '\14' -> 0 -- \016 + '\15' -> 0 -- \017 + '\16' -> 0 -- \020 + '\17' -> 0 -- \021 + '\18' -> 0 -- \022 + '\19' -> 0 -- \023 + '\20' -> 0 -- \024 + '\21' -> 0 -- \025 + '\22' -> 0 -- \026 + '\23' -> 0 -- \027 + '\24' -> 0 -- \030 + '\25' -> 0 -- \031 + '\26' -> 0 -- \032 + '\27' -> 0 -- \033 + '\28' -> 0 -- \034 + '\29' -> 0 -- \035 + '\30' -> 0 -- \036 + '\31' -> 0 -- \037 + '\32' -> cAny + cSpace -- + '\33' -> cAny + cSymbol -- ! + '\34' -> cAny -- " + '\35' -> cAny + cSymbol -- # + '\36' -> cAny + cSymbol -- $ + '\37' -> cAny + cSymbol -- % + '\38' -> cAny + cSymbol -- & + '\39' -> cAny + cIdent -- ' + '\40' -> cAny -- ( + '\41' -> cAny -- ) + '\42' -> cAny + cSymbol -- * + '\43' -> cAny + cSymbol -- + + '\44' -> cAny -- , + '\45' -> cAny + cSymbol -- - + '\46' -> cAny + cSymbol -- . + '\47' -> cAny + cSymbol -- / + '\48' -> cAny + cIdent + cDigit -- 0 + '\49' -> cAny + cIdent + cDigit -- 1 + '\50' -> cAny + cIdent + cDigit -- 2 + '\51' -> cAny + cIdent + cDigit -- 3 + '\52' -> cAny + cIdent + cDigit -- 4 + '\53' -> cAny + cIdent + cDigit -- 5 + '\54' -> cAny + cIdent + cDigit -- 6 + '\55' -> cAny + cIdent + cDigit -- 7 + '\56' -> cAny + cIdent + cDigit -- 8 + '\57' -> cAny + cIdent + cDigit -- 9 + '\58' -> cAny + cSymbol -- : + '\59' -> cAny -- ; + '\60' -> cAny + cSymbol -- < + '\61' -> cAny + cSymbol -- = + '\62' -> cAny + cSymbol -- > + '\63' -> cAny + cSymbol -- ? 
+ '\64' -> cAny + cSymbol -- @ + '\65' -> cAny + cIdent + cUpper -- A + '\66' -> cAny + cIdent + cUpper -- B + '\67' -> cAny + cIdent + cUpper -- C + '\68' -> cAny + cIdent + cUpper -- D + '\69' -> cAny + cIdent + cUpper -- E + '\70' -> cAny + cIdent + cUpper -- F + '\71' -> cAny + cIdent + cUpper -- G + '\72' -> cAny + cIdent + cUpper -- H + '\73' -> cAny + cIdent + cUpper -- I + '\74' -> cAny + cIdent + cUpper -- J + '\75' -> cAny + cIdent + cUpper -- K + '\76' -> cAny + cIdent + cUpper -- L + '\77' -> cAny + cIdent + cUpper -- M + '\78' -> cAny + cIdent + cUpper -- N + '\79' -> cAny + cIdent + cUpper -- O + '\80' -> cAny + cIdent + cUpper -- P + '\81' -> cAny + cIdent + cUpper -- Q + '\82' -> cAny + cIdent + cUpper -- R + '\83' -> cAny + cIdent + cUpper -- S + '\84' -> cAny + cIdent + cUpper -- T + '\85' -> cAny + cIdent + cUpper -- U + '\86' -> cAny + cIdent + cUpper -- V + '\87' -> cAny + cIdent + cUpper -- W + '\88' -> cAny + cIdent + cUpper -- X + '\89' -> cAny + cIdent + cUpper -- Y + '\90' -> cAny + cIdent + cUpper -- Z + '\91' -> cAny -- [ + '\92' -> cAny + cSymbol -- backslash + '\93' -> cAny -- ] + '\94' -> cAny + cSymbol -- ^ + '\95' -> cAny + cIdent + cLower -- _ + '\96' -> cAny -- ` + '\97' -> cAny + cIdent + cLower -- a + '\98' -> cAny + cIdent + cLower -- b + '\99' -> cAny + cIdent + cLower -- c + '\100' -> cAny + cIdent + cLower -- d + '\101' -> cAny + cIdent + cLower -- e + '\102' -> cAny + cIdent + cLower -- f + '\103' -> cAny + cIdent + cLower -- g + '\104' -> cAny + cIdent + cLower -- h + '\105' -> cAny + cIdent + cLower -- i + '\106' -> cAny + cIdent + cLower -- j + '\107' -> cAny + cIdent + cLower -- k + '\108' -> cAny + cIdent + cLower -- l + '\109' -> cAny + cIdent + cLower -- m + '\110' -> cAny + cIdent + cLower -- n + '\111' -> cAny + cIdent + cLower -- o + '\112' -> cAny + cIdent + cLower -- p + '\113' -> cAny + cIdent + cLower -- q + '\114' -> cAny + cIdent + cLower -- r + '\115' -> cAny + cIdent + cLower -- s + '\116' -> cAny + cIdent + cLower -- t + '\117' -> cAny + cIdent + cLower -- u + '\118' -> cAny + cIdent + cLower -- v + '\119' -> cAny + cIdent + cLower -- w + '\120' -> cAny + cIdent + cLower -- x + '\121' -> cAny + cIdent + cLower -- y + '\122' -> cAny + cIdent + cLower -- z + '\123' -> cAny -- { + '\124' -> cAny + cSymbol -- | + '\125' -> cAny -- } + '\126' -> cAny + cSymbol -- ~ + '\127' -> 0 -- \177 + '\128' -> 0 -- \200 + '\129' -> 0 -- \201 + '\130' -> 0 -- \202 + '\131' -> 0 -- \203 + '\132' -> 0 -- \204 + '\133' -> 0 -- \205 + '\134' -> 0 -- \206 + '\135' -> 0 -- \207 + '\136' -> 0 -- \210 + '\137' -> 0 -- \211 + '\138' -> 0 -- \212 + '\139' -> 0 -- \213 + '\140' -> 0 -- \214 + '\141' -> 0 -- \215 + '\142' -> 0 -- \216 + '\143' -> 0 -- \217 + '\144' -> 0 -- \220 + '\145' -> 0 -- \221 + '\146' -> 0 -- \222 + '\147' -> 0 -- \223 + '\148' -> 0 -- \224 + '\149' -> 0 -- \225 + '\150' -> 0 -- \226 + '\151' -> 0 -- \227 + '\152' -> 0 -- \230 + '\153' -> 0 -- \231 + '\154' -> 0 -- \232 + '\155' -> 0 -- \233 + '\156' -> 0 -- \234 + '\157' -> 0 -- \235 + '\158' -> 0 -- \236 + '\159' -> 0 -- \237 + '\160' -> cSpace -- + '\161' -> cAny + cSymbol -- ¡ + '\162' -> cAny + cSymbol -- ¢ + '\163' -> cAny + cSymbol -- £ + '\164' -> cAny + cSymbol -- ¤ + '\165' -> cAny + cSymbol -- ¥ + '\166' -> cAny + cSymbol -- ¦ + '\167' -> cAny + cSymbol -- § + '\168' -> cAny + cSymbol -- ¨ + '\169' -> cAny + cSymbol -- © + '\170' -> cAny + cSymbol -- ª + '\171' -> cAny + cSymbol -- « + '\172' -> cAny + cSymbol -- ¬ + '\173' -> cAny + cSymbol -- + '\174' -> cAny + cSymbol -- 
® + '\175' -> cAny + cSymbol -- ¯ + '\176' -> cAny + cSymbol -- ° + '\177' -> cAny + cSymbol -- ± + '\178' -> cAny + cSymbol -- ² + '\179' -> cAny + cSymbol -- ³ + '\180' -> cAny + cSymbol -- ´ + '\181' -> cAny + cSymbol -- µ + '\182' -> cAny + cSymbol -- ¶ + '\183' -> cAny + cSymbol -- · + '\184' -> cAny + cSymbol -- ¸ + '\185' -> cAny + cSymbol -- ¹ + '\186' -> cAny + cSymbol -- º + '\187' -> cAny + cSymbol -- » + '\188' -> cAny + cSymbol -- ¼ + '\189' -> cAny + cSymbol -- ½ + '\190' -> cAny + cSymbol -- ¾ + '\191' -> cAny + cSymbol -- ¿ + '\192' -> cAny + cIdent + cUpper -- À + '\193' -> cAny + cIdent + cUpper -- Á + '\194' -> cAny + cIdent + cUpper -- Â + '\195' -> cAny + cIdent + cUpper -- Ã + '\196' -> cAny + cIdent + cUpper -- Ä + '\197' -> cAny + cIdent + cUpper -- Å + '\198' -> cAny + cIdent + cUpper -- Æ + '\199' -> cAny + cIdent + cUpper -- Ç + '\200' -> cAny + cIdent + cUpper -- È + '\201' -> cAny + cIdent + cUpper -- É + '\202' -> cAny + cIdent + cUpper -- Ê + '\203' -> cAny + cIdent + cUpper -- Ë + '\204' -> cAny + cIdent + cUpper -- Ì + '\205' -> cAny + cIdent + cUpper -- Í + '\206' -> cAny + cIdent + cUpper -- Î + '\207' -> cAny + cIdent + cUpper -- Ï + '\208' -> cAny + cIdent + cUpper -- Ð + '\209' -> cAny + cIdent + cUpper -- Ñ + '\210' -> cAny + cIdent + cUpper -- Ò + '\211' -> cAny + cIdent + cUpper -- Ó + '\212' -> cAny + cIdent + cUpper -- Ô + '\213' -> cAny + cIdent + cUpper -- Õ + '\214' -> cAny + cIdent + cUpper -- Ö + '\215' -> cAny + cSymbol + cLower -- × + '\216' -> cAny + cIdent + cUpper -- Ø + '\217' -> cAny + cIdent + cUpper -- Ù + '\218' -> cAny + cIdent + cUpper -- Ú + '\219' -> cAny + cIdent + cUpper -- Û + '\220' -> cAny + cIdent + cUpper -- Ü + '\221' -> cAny + cIdent + cUpper -- Ý + '\222' -> cAny + cIdent + cUpper -- Þ + '\223' -> cAny + cIdent -- ß + '\224' -> cAny + cIdent + cLower -- à + '\225' -> cAny + cIdent + cLower -- á + '\226' -> cAny + cIdent + cLower -- â + '\227' -> cAny + cIdent + cLower -- ã + '\228' -> cAny + cIdent + cLower -- ä + '\229' -> cAny + cIdent + cLower -- å + '\230' -> cAny + cIdent + cLower -- æ + '\231' -> cAny + cIdent + cLower -- ç + '\232' -> cAny + cIdent + cLower -- è + '\233' -> cAny + cIdent + cLower -- é + '\234' -> cAny + cIdent + cLower -- ê + '\235' -> cAny + cIdent + cLower -- ë + '\236' -> cAny + cIdent + cLower -- ì + '\237' -> cAny + cIdent + cLower -- í + '\238' -> cAny + cIdent + cLower -- î + '\239' -> cAny + cIdent + cLower -- ï + '\240' -> cAny + cIdent + cLower -- ð + '\241' -> cAny + cIdent + cLower -- ñ + '\242' -> cAny + cIdent + cLower -- ò + '\243' -> cAny + cIdent + cLower -- ó + '\244' -> cAny + cIdent + cLower -- ô + '\245' -> cAny + cIdent + cLower -- õ + '\246' -> cAny + cIdent + cLower -- ö + '\247' -> cAny + cSymbol -- ÷ + '\248' -> cAny + cIdent -- ø + '\249' -> cAny + cIdent + cLower -- ù + '\250' -> cAny + cIdent + cLower -- ú + '\251' -> cAny + cIdent + cLower -- û + '\252' -> cAny + cIdent + cLower -- ü + '\253' -> cAny + cIdent + cLower -- ý + '\254' -> cAny + cIdent + cLower -- þ + '\255' -> cAny + cIdent + cLower -- ÿ +\end{code} diff --git a/compiler/parser/LexCore.hs b/compiler/parser/LexCore.hs new file mode 100644 index 0000000000..1a545a3e43 --- /dev/null +++ b/compiler/parser/LexCore.hs @@ -0,0 +1,130 @@ +module LexCore where + +import ParserCoreUtils +import Ratio +import Char +import qualified Numeric( readFloat, readDec ) + +isNameChar c = isAlpha c || isDigit c || (c == '_') || (c == '\'') +isKeywordChar c = isAlpha c || (c == '_') + +lexer :: (Token -> P a) -> P a +lexer 
cont [] = cont TKEOF [] +lexer cont ('\n':cs) = \line -> lexer cont cs (line+1) +lexer cont ('-':'>':cs) = cont TKrarrow cs + +lexer cont (c:cs) + | isSpace c = lexer cont cs + | isLower c || (c == '_') = lexName cont TKname (c:cs) + | isUpper c = lexName cont TKcname (c:cs) + | isDigit c || (c == '-') = lexNum cont (c:cs) + +lexer cont ('%':cs) = lexKeyword cont cs +lexer cont ('\'':cs) = lexChar cont cs +lexer cont ('\"':cs) = lexString [] cont cs +lexer cont ('#':cs) = cont TKhash cs +lexer cont ('(':cs) = cont TKoparen cs +lexer cont (')':cs) = cont TKcparen cs +lexer cont ('{':cs) = cont TKobrace cs +lexer cont ('}':cs) = cont TKcbrace cs +lexer cont ('=':cs) = cont TKeq cs +lexer cont (':':':':cs) = cont TKcoloncolon cs +lexer cont ('*':cs) = cont TKstar cs +lexer cont ('.':cs) = cont TKdot cs +lexer cont ('\\':cs) = cont TKlambda cs +lexer cont ('@':cs) = cont TKat cs +lexer cont ('?':cs) = cont TKquestion cs +lexer cont (';':cs) = cont TKsemicolon cs +lexer cont (c:cs) = failP "invalid character" [c] + + + +lexChar cont ('\\':'x':h1:h0:'\'':cs) + | isHexEscape [h1,h0] = cont (TKchar (hexToChar h1 h0)) cs +lexChar cont ('\\':cs) = failP "invalid char character" ('\\':(take 10 cs)) +lexChar cont ('\'':cs) = failP "invalid char character" ['\''] +lexChar cont ('\"':cs) = failP "invalid char character" ['\"'] +lexChar cont (c:'\'':cs) = cont (TKchar c) cs + + +lexString s cont ('\\':'x':h1:h0:cs) + | isHexEscape [h1,h0] = lexString (s++[hexToChar h1 h0]) cont cs +lexString s cont ('\\':cs) = failP "invalid string character" ['\\'] +lexString s cont ('\'':cs) = failP "invalid string character" ['\''] +lexString s cont ('\"':cs) = cont (TKstring s) cs +lexString s cont (c:cs) = lexString (s++[c]) cont cs + +isHexEscape = all (\c -> isHexDigit c && (isDigit c || isLower c)) + +hexToChar h1 h0 = chr (digitToInt h1 * 16 + digitToInt h0) + + +lexNum cont cs = + case cs of + ('-':cs) -> f (-1) cs + _ -> f 1 cs + where f sgn cs = + case span isDigit cs of + (digits,'.':c:rest) + | isDigit c -> cont (TKrational (fromInteger sgn * r)) rest' + where ((r,rest'):_) = readFloat (digits ++ ('.':c:rest)) + -- When reading a floating-point number, which is + -- a bit complicated, use the Haskell 98 library function + (digits,rest) -> cont (TKinteger (sgn * (read digits))) rest + +lexName cont cstr cs = cont (cstr name) rest + where (name,rest) = span isNameChar cs + +lexKeyword cont cs = + case span isKeywordChar cs of + ("module",rest) -> cont TKmodule rest + ("data",rest) -> cont TKdata rest + ("newtype",rest) -> cont TKnewtype rest + ("forall",rest) -> cont TKforall rest + ("rec",rest) -> cont TKrec rest + ("let",rest) -> cont TKlet rest + ("in",rest) -> cont TKin rest + ("case",rest) -> cont TKcase rest + ("of",rest) -> cont TKof rest + ("coerce",rest) -> cont TKcoerce rest + ("note",rest) -> cont TKnote rest + ("external",rest) -> cont TKexternal rest + ("_",rest) -> cont TKwild rest + _ -> failP "invalid keyword" ('%':cs) + + +#if __GLASGOW_HASKELL__ >= 504 +-- The readFloat in the Numeric library will do the job + +readFloat :: (RealFrac a) => ReadS a +readFloat = Numeric.readFloat + +#else +-- Haskell 98's Numeric.readFloat used to have a bogusly restricted signature +-- so it was incapable of reading a rational. +-- So for GHCs that have that old bogus library, here is the code, written out longhand. 
+ +readFloat r = [(fromRational ((n%1)*10^^(k-d)),t) | (n,d,s) <- readFix r, + (k,t) <- readExp s] ++ + [ (0/0, t) | ("NaN",t) <- lex r] ++ + [ (1/0, t) | ("Infinity",t) <- lex r] + where + readFix r = [(read (ds++ds'), length ds', t) + | (ds,d) <- lexDigits r, + (ds',t) <- lexFrac d ] + + lexFrac ('.':ds) = lexDigits ds + lexFrac s = [("",s)] + + readExp (e:s) | e `elem` "eE" = readExp' s + readExp s = [(0,s)] + + readExp' ('-':s) = [(-k,t) | (k,t) <- Numeric.readDec s] + readExp' ('+':s) = Numeric.readDec s + readExp' s = Numeric.readDec s + +lexDigits :: ReadS String +lexDigits s = case span isDigit s of + (cs,s') | not (null cs) -> [(cs,s')] + otherwise -> [] +#endif diff --git a/compiler/parser/Lexer.x b/compiler/parser/Lexer.x new file mode 100644 index 0000000000..4c1b48efc0 --- /dev/null +++ b/compiler/parser/Lexer.x @@ -0,0 +1,1457 @@ +----------------------------------------------------------------------------- +-- (c) The University of Glasgow, 2006 +-- +-- GHC's lexer. +-- +-- This is a combination of an Alex-generated lexer from a regex +-- definition, with some hand-coded bits. +-- +-- Completely accurate information about token-spans within the source +-- file is maintained. Every token has a start and end SrcLoc attached to it. +-- +----------------------------------------------------------------------------- + +-- ToDo / known bugs: +-- - Unicode +-- - parsing integers is a bit slow +-- - readRational is a bit slow +-- +-- Known bugs, that were also in the previous version: +-- - M... should be 3 tokens, not 1. +-- - pragma-end should be only valid in a pragma + +{ +module Lexer ( + Token(..), lexer, pragState, mkPState, PState(..), + P(..), ParseResult(..), getSrcLoc, + failLocMsgP, failSpanMsgP, srcParseFail, + popContext, pushCurrentContext, setLastToken, setSrcLoc, + getLexState, popLexState, pushLexState, + extension, bangPatEnabled + ) where + +#include "HsVersions.h" + +import ErrUtils ( Message ) +import Outputable +import StringBuffer +import FastString +import FastTypes +import SrcLoc +import UniqFM +import DynFlags +import Ctype +import Util ( maybePrefixMatch, readRational ) + +import DATA_BITS +import Data.Char ( chr ) +import Ratio +--import TRACE + +#if __GLASGOW_HASKELL__ >= 605 +import Data.Char ( GeneralCategory(..), generalCategory, isPrint, isUpper ) +#else +import Compat.Unicode ( GeneralCategory(..), generalCategory, isPrint, isUpper ) +#endif +} + +$unispace = \x05 +$whitechar = [\ \t\n\r\f\v\xa0 $unispace] +$white_no_nl = $whitechar # \n + +$ascdigit = 0-9 +$unidigit = \x03 +$decdigit = $ascdigit -- for now, should really be $digit (ToDo) +$digit = [$ascdigit $unidigit] + +$special = [\(\)\,\;\[\]\`\{\}] +$ascsymbol = [\!\#\$\%\&\*\+\.\/\<\=\>\?\@\\\^\|\-\~] +$unisymbol = \x04 +$symbol = [$ascsymbol $unisymbol] # [$special \_\:\"\'] + +$unilarge = \x01 +$asclarge = [A-Z \xc0-\xd6 \xd8-\xde] +$large = [$asclarge $unilarge] + +$unismall = \x02 +$ascsmall = [a-z \xdf-\xf6 \xf8-\xff] +$small = [$ascsmall $unismall \_] + +$unigraphic = \x06 +$graphic = [$small $large $symbol $digit $special $unigraphic \:\"\'] + +$octit = 0-7 +$hexit = [$decdigit A-F a-f] +$symchar = [$symbol \:] +$nl = [\n\r] +$idchar = [$small $large $digit \'] + +@varid = $small $idchar* +@conid = $large $idchar* + +@varsym = $symbol $symchar* +@consym = \: $symchar* + +@decimal = $decdigit+ +@octal = $octit+ +@hexadecimal = $hexit+ +@exponent = [eE] [\-\+]? @decimal + +-- we support the hierarchical module name extension: +@qual = (@conid \.)+ + +@floating_point = @decimal \. 
@decimal @exponent? | @decimal @exponent + +haskell :- + +-- everywhere: skip whitespace and comments +$white_no_nl+ ; + +-- Everywhere: deal with nested comments. We explicitly rule out +-- pragmas, "{-#", so that we don't accidentally treat them as comments. +-- (this can happen even though pragmas will normally take precedence due to +-- longest-match, because pragmas aren't valid in every state, but comments +-- are). +"{-" / { notFollowedBy '#' } { nested_comment } + +-- Single-line comments are a bit tricky. Haskell 98 says that two or +-- more dashes followed by a symbol should be parsed as a varsym, so we +-- have to exclude those. +-- The regex says: "munch all the characters after the dashes, as long as +-- the first one is not a symbol". +"--"\-* [^$symbol :] .* ; +"--"\-* / { atEOL } ; + +-- 'bol' state: beginning of a line. Slurp up all the whitespace (including +-- blank lines) until we find a non-whitespace character, then do layout +-- processing. +-- +-- One slight wibble here: what if the line begins with {-#? In +-- theory, we have to lex the pragma to see if it's one we recognise, +-- and if it is, then we backtrack and do_bol, otherwise we treat it +-- as a nested comment. We don't bother with this: if the line begins +-- with {-#, then we'll assume it's a pragma we know about and go for do_bol. +<bol> { + \n ; + ^\# (line)? { begin line_prag1 } + ^\# pragma .* \n ; -- GCC 3.3 CPP generated, apparently + ^\# \! .* \n ; -- #!, for scripts + () { do_bol } +} + +-- after a layout keyword (let, where, do, of), we begin a new layout +-- context if the curly brace is missing. +-- Careful! This stuff is quite delicate. +<layout, layout_do> { + \{ / { notFollowedBy '-' } { pop_and open_brace } + -- we might encounter {-# here, but {- has been handled already + \n ; + ^\# (line)? { begin line_prag1 } +} + +-- do is treated in a subtly different way, see new_layout_context +<layout> () { new_layout_context True } +<layout_do> () { new_layout_context False } + +-- after a new layout context which was found to be to the left of the +-- previous context, we have generated a '{' token, and we now need to +-- generate a matching '}' token. +<layout_left> () { do_layout_left } + +<0,option_prags,glaexts> \n { begin bol } + +"{-#" $whitechar* (line|LINE) { begin line_prag2 } + +-- single-line line pragmas, of the form +-- # <line> "<file>" <extra-stuff> \n +<line_prag1> $decdigit+ { setLine line_prag1a } +<line_prag1a> \" [$graphic \ ]* \" { setFile line_prag1b } +<line_prag1b> .* { pop } + +-- Haskell-style line pragmas, of the form +-- {-# LINE <line> "<file>" #-} +<line_prag2> $decdigit+ { setLine line_prag2a } +<line_prag2a> \" [$graphic \ ]* \" { setFile line_prag2b } +<line_prag2b> "#-}"|"-}" { pop } + -- NOTE: accept -} at the end of a LINE pragma, for compatibility + -- with older versions of GHC which generated these. + +-- We only want RULES pragmas to be picked up when -fglasgow-exts +-- is on, because the contents of the pragma is always written using +-- glasgow-exts syntax (using forall etc.), so if glasgow exts are not +-- enabled, we're sure to get a parse error. 
+-- (ToDo: we should really emit a warning when ignoring pragmas) +<glaexts> + "{-#" $whitechar* (RULES|rules) { token ITrules_prag } + +<0,option_prags,glaexts> { + "{-#" $whitechar* (INLINE|inline) { token (ITinline_prag True) } + "{-#" $whitechar* (NO(T?)INLINE|no(t?)inline) + { token (ITinline_prag False) } + "{-#" $whitechar* (SPECIALI[SZ]E|speciali[sz]e) + { token ITspec_prag } + "{-#" $whitechar* (SPECIALI[SZ]E|speciali[sz]e) + $whitechar* (INLINE|inline) { token (ITspec_inline_prag True) } + "{-#" $whitechar* (SPECIALI[SZ]E|speciali[sz]e) + $whitechar* (NO(T?)INLINE|no(t?)inline) + { token (ITspec_inline_prag False) } + "{-#" $whitechar* (SOURCE|source) { token ITsource_prag } + "{-#" $whitechar* (DEPRECATED|deprecated) + { token ITdeprecated_prag } + "{-#" $whitechar* (SCC|scc) { token ITscc_prag } + "{-#" $whitechar* (CORE|core) { token ITcore_prag } + "{-#" $whitechar* (UNPACK|unpack) { token ITunpack_prag } + + "{-#" { nested_comment } + + -- ToDo: should only be valid inside a pragma: + "#-}" { token ITclose_prag} +} + +<option_prags> { + "{-#" $whitechar* (OPTIONS|options) { lex_string_prag IToptions_prag } + "{-#" $whitechar* (OPTIONS_GHC|options_ghc) + { lex_string_prag IToptions_prag } + "{-#" $whitechar* (LANGUAGE|language) { token ITlanguage_prag } + "{-#" $whitechar* (INCLUDE|include) { lex_string_prag ITinclude_prag } +} + +-- '0' state: ordinary lexemes +-- 'glaexts' state: glasgow extensions (postfix '#', etc.) + +-- "special" symbols + +<0,glaexts> { + "[:" / { ifExtension parrEnabled } { token ITopabrack } + ":]" / { ifExtension parrEnabled } { token ITcpabrack } +} + +<0,glaexts> { + "[|" / { ifExtension thEnabled } { token ITopenExpQuote } + "[e|" / { ifExtension thEnabled } { token ITopenExpQuote } + "[p|" / { ifExtension thEnabled } { token ITopenPatQuote } + "[d|" / { ifExtension thEnabled } { layout_token ITopenDecQuote } + "[t|" / { ifExtension thEnabled } { token ITopenTypQuote } + "|]" / { ifExtension thEnabled } { token ITcloseQuote } + \$ @varid / { ifExtension thEnabled } { skip_one_varid ITidEscape } + "$(" / { ifExtension thEnabled } { token ITparenEscape } +} + +<0,glaexts> { + "(|" / { ifExtension arrowsEnabled `alexAndPred` notFollowedBySymbol } + { special IToparenbar } + "|)" / { ifExtension arrowsEnabled } { special ITcparenbar } +} + +<0,glaexts> { + \? 
@varid / { ifExtension ipEnabled } { skip_one_varid ITdupipvarid } + \% @varid / { ifExtension ipEnabled } { skip_one_varid ITsplitipvarid } +} + +<glaexts> { + "(#" / { notFollowedBySymbol } { token IToubxparen } + "#)" { token ITcubxparen } + "{|" { token ITocurlybar } + "|}" { token ITccurlybar } +} + +<0,option_prags,glaexts> { + \( { special IToparen } + \) { special ITcparen } + \[ { special ITobrack } + \] { special ITcbrack } + \, { special ITcomma } + \; { special ITsemi } + \` { special ITbackquote } + + \{ { open_brace } + \} { close_brace } +} + +<0,option_prags,glaexts> { + @qual @varid { check_qvarid } + @qual @conid { idtoken qconid } + @varid { varid } + @conid { idtoken conid } +} + +-- after an illegal qvarid, such as 'M.let', +-- we back up and try again in the bad_qvarid state: +<bad_qvarid> { + @conid { pop_and (idtoken conid) } + @qual @conid { pop_and (idtoken qconid) } +} + +<glaexts> { + @qual @varid "#"+ { idtoken qvarid } + @qual @conid "#"+ { idtoken qconid } + @varid "#"+ { varid } + @conid "#"+ { idtoken conid } +} + +-- ToDo: M.(,,,) + +<0,glaexts> { + @qual @varsym { idtoken qvarsym } + @qual @consym { idtoken qconsym } + @varsym { varsym } + @consym { consym } +} + +<0,glaexts> { + @decimal { tok_decimal } + 0[oO] @octal { tok_octal } + 0[xX] @hexadecimal { tok_hexadecimal } +} + +<glaexts> { + @decimal \# { prim_decimal } + 0[oO] @octal \# { prim_octal } + 0[xX] @hexadecimal \# { prim_hexadecimal } +} + +<0,glaexts> @floating_point { strtoken tok_float } +<glaexts> @floating_point \# { init_strtoken 1 prim_float } +<glaexts> @floating_point \# \# { init_strtoken 2 prim_double } + +-- Strings and chars are lexed by hand-written code. The reason is +-- that even if we recognise the string or char here in the regex +-- lexer, we would still have to parse the string afterward in order +-- to convert it to a String. +<0,glaexts> { + \' { lex_char_tok } + \" { lex_string_tok } +} + +{ +-- work around bug in Alex 2.0 +#if __GLASGOW_HASKELL__ < 503 +unsafeAt arr i = arr ! i +#endif + +-- ----------------------------------------------------------------------------- +-- The token type + +data Token + = ITas -- Haskell keywords + | ITcase + | ITclass + | ITdata + | ITdefault + | ITderiving + | ITdo + | ITelse + | IThiding + | ITif + | ITimport + | ITin + | ITinfix + | ITinfixl + | ITinfixr + | ITinstance + | ITlet + | ITmodule + | ITnewtype + | ITof + | ITqualified + | ITthen + | ITtype + | ITwhere + | ITscc -- ToDo: remove (we use {-# SCC "..." 
#-} now) + + | ITforall -- GHC extension keywords + | ITforeign + | ITexport + | ITlabel + | ITdynamic + | ITsafe + | ITthreadsafe + | ITunsafe + | ITstdcallconv + | ITccallconv + | ITdotnet + | ITmdo + + -- Pragmas + | ITinline_prag Bool -- True <=> INLINE, False <=> NOINLINE + | ITspec_prag -- SPECIALISE + | ITspec_inline_prag Bool -- SPECIALISE INLINE (or NOINLINE) + | ITsource_prag + | ITrules_prag + | ITdeprecated_prag + | ITline_prag + | ITscc_prag + | ITcore_prag -- hdaume: core annotations + | ITunpack_prag + | ITclose_prag + | IToptions_prag String + | ITinclude_prag String + | ITlanguage_prag + + | ITdotdot -- reserved symbols + | ITcolon + | ITdcolon + | ITequal + | ITlam + | ITvbar + | ITlarrow + | ITrarrow + | ITat + | ITtilde + | ITdarrow + | ITminus + | ITbang + | ITstar + | ITdot + + | ITbiglam -- GHC-extension symbols + + | ITocurly -- special symbols + | ITccurly + | ITocurlybar -- {|, for type applications + | ITccurlybar -- |}, for type applications + | ITvocurly + | ITvccurly + | ITobrack + | ITopabrack -- [:, for parallel arrays with -fparr + | ITcpabrack -- :], for parallel arrays with -fparr + | ITcbrack + | IToparen + | ITcparen + | IToubxparen + | ITcubxparen + | ITsemi + | ITcomma + | ITunderscore + | ITbackquote + + | ITvarid FastString -- identifiers + | ITconid FastString + | ITvarsym FastString + | ITconsym FastString + | ITqvarid (FastString,FastString) + | ITqconid (FastString,FastString) + | ITqvarsym (FastString,FastString) + | ITqconsym (FastString,FastString) + + | ITdupipvarid FastString -- GHC extension: implicit param: ?x + | ITsplitipvarid FastString -- GHC extension: implicit param: %x + + | ITpragma StringBuffer + + | ITchar Char + | ITstring FastString + | ITinteger Integer + | ITrational Rational + + | ITprimchar Char + | ITprimstring FastString + | ITprimint Integer + | ITprimfloat Rational + | ITprimdouble Rational + + -- MetaHaskell extension tokens + | ITopenExpQuote -- [| or [e| + | ITopenPatQuote -- [p| + | ITopenDecQuote -- [d| + | ITopenTypQuote -- [t| + | ITcloseQuote -- |] + | ITidEscape FastString -- $x + | ITparenEscape -- $( + | ITvarQuote -- ' + | ITtyQuote -- '' + + -- Arrow notation extension + | ITproc + | ITrec + | IToparenbar -- (| + | ITcparenbar -- |) + | ITlarrowtail -- -< + | ITrarrowtail -- >- + | ITLarrowtail -- -<< + | ITRarrowtail -- >>- + + | ITunknown String -- Used when the lexer can't make sense of it + | ITeof -- end of file token +#ifdef DEBUG + deriving Show -- debugging +#endif + +isSpecial :: Token -> Bool +-- If we see M.x, where x is a keyword, but +-- is special, we treat is as just plain M.x, +-- not as a keyword. 
+isSpecial ITas = True +isSpecial IThiding = True +isSpecial ITqualified = True +isSpecial ITforall = True +isSpecial ITexport = True +isSpecial ITlabel = True +isSpecial ITdynamic = True +isSpecial ITsafe = True +isSpecial ITthreadsafe = True +isSpecial ITunsafe = True +isSpecial ITccallconv = True +isSpecial ITstdcallconv = True +isSpecial ITmdo = True +isSpecial _ = False + +-- the bitmap provided as the third component indicates whether the +-- corresponding extension keyword is valid under the extension options +-- provided to the compiler; if the extension corresponding to *any* of the +-- bits set in the bitmap is enabled, the keyword is valid (this setup +-- facilitates using a keyword in two different extensions that can be +-- activated independently) +-- +reservedWordsFM = listToUFM $ + map (\(x, y, z) -> (mkFastString x, (y, z))) + [( "_", ITunderscore, 0 ), + ( "as", ITas, 0 ), + ( "case", ITcase, 0 ), + ( "class", ITclass, 0 ), + ( "data", ITdata, 0 ), + ( "default", ITdefault, 0 ), + ( "deriving", ITderiving, 0 ), + ( "do", ITdo, 0 ), + ( "else", ITelse, 0 ), + ( "hiding", IThiding, 0 ), + ( "if", ITif, 0 ), + ( "import", ITimport, 0 ), + ( "in", ITin, 0 ), + ( "infix", ITinfix, 0 ), + ( "infixl", ITinfixl, 0 ), + ( "infixr", ITinfixr, 0 ), + ( "instance", ITinstance, 0 ), + ( "let", ITlet, 0 ), + ( "module", ITmodule, 0 ), + ( "newtype", ITnewtype, 0 ), + ( "of", ITof, 0 ), + ( "qualified", ITqualified, 0 ), + ( "then", ITthen, 0 ), + ( "type", ITtype, 0 ), + ( "where", ITwhere, 0 ), + ( "_scc_", ITscc, 0 ), -- ToDo: remove + + ( "forall", ITforall, bit tvBit), + ( "mdo", ITmdo, bit glaExtsBit), + + ( "foreign", ITforeign, bit ffiBit), + ( "export", ITexport, bit ffiBit), + ( "label", ITlabel, bit ffiBit), + ( "dynamic", ITdynamic, bit ffiBit), + ( "safe", ITsafe, bit ffiBit), + ( "threadsafe", ITthreadsafe, bit ffiBit), + ( "unsafe", ITunsafe, bit ffiBit), + ( "stdcall", ITstdcallconv, bit ffiBit), + ( "ccall", ITccallconv, bit ffiBit), + ( "dotnet", ITdotnet, bit ffiBit), + + ( "rec", ITrec, bit arrowsBit), + ( "proc", ITproc, bit arrowsBit) + ] + +reservedSymsFM = listToUFM $ + map (\ (x,y,z) -> (mkFastString x,(y,z))) + [ ("..", ITdotdot, 0) + ,(":", ITcolon, 0) -- (:) is a reserved op, + -- meaning only list cons + ,("::", ITdcolon, 0) + ,("=", ITequal, 0) + ,("\\", ITlam, 0) + ,("|", ITvbar, 0) + ,("<-", ITlarrow, 0) + ,("->", ITrarrow, 0) + ,("@", ITat, 0) + ,("~", ITtilde, 0) + ,("=>", ITdarrow, 0) + ,("-", ITminus, 0) + ,("!", ITbang, 0) + + ,("*", ITstar, bit glaExtsBit) -- For data T (a::*) = MkT + ,(".", ITdot, bit tvBit) -- For 'forall a . 
t' + + ,("-<", ITlarrowtail, bit arrowsBit) + ,(">-", ITrarrowtail, bit arrowsBit) + ,("-<<", ITLarrowtail, bit arrowsBit) + ,(">>-", ITRarrowtail, bit arrowsBit) + +#if __GLASGOW_HASKELL__ >= 605 + ,("λ", ITlam, bit glaExtsBit) + ,("∷", ITdcolon, bit glaExtsBit) + ,("⇒", ITdarrow, bit glaExtsBit) + ,("∀", ITforall, bit glaExtsBit) + ,("→", ITrarrow, bit glaExtsBit) + ,("←", ITlarrow, bit glaExtsBit) + ,("⋯", ITdotdot, bit glaExtsBit) +#endif + ] + +-- ----------------------------------------------------------------------------- +-- Lexer actions + +type Action = SrcSpan -> StringBuffer -> Int -> P (Located Token) + +special :: Token -> Action +special tok span _buf len = return (L span tok) + +token, layout_token :: Token -> Action +token t span buf len = return (L span t) +layout_token t span buf len = pushLexState layout >> return (L span t) + +idtoken :: (StringBuffer -> Int -> Token) -> Action +idtoken f span buf len = return (L span $! (f buf len)) + +skip_one_varid :: (FastString -> Token) -> Action +skip_one_varid f span buf len + = return (L span $! f (lexemeToFastString (stepOn buf) (len-1))) + +strtoken :: (String -> Token) -> Action +strtoken f span buf len = + return (L span $! (f $! lexemeToString buf len)) + +init_strtoken :: Int -> (String -> Token) -> Action +-- like strtoken, but drops the last N character(s) +init_strtoken drop f span buf len = + return (L span $! (f $! lexemeToString buf (len-drop))) + +begin :: Int -> Action +begin code _span _str _len = do pushLexState code; lexToken + +pop :: Action +pop _span _buf _len = do popLexState; lexToken + +pop_and :: Action -> Action +pop_and act span buf len = do popLexState; act span buf len + +notFollowedBy char _ _ _ (AI _ _ buf) = atEnd buf || currentChar buf /= char + +notFollowedBySymbol _ _ _ (AI _ _ buf) + = atEnd buf || currentChar buf `notElem` "!#$%&*+./<=>?@\\^|-~" + +atEOL _ _ _ (AI _ _ buf) = atEnd buf || currentChar buf == '\n' + +ifExtension pred bits _ _ _ = pred bits + +{- + nested comments require traversing by hand, they can't be parsed + using regular expressions. +-} +nested_comment :: Action +nested_comment span _str _len = do + input <- getInput + go 1 input + where go 0 input = do setInput input; lexToken + go n input = do + case alexGetChar input of + Nothing -> err input + Just (c,input) -> do + case c of + '-' -> do + case alexGetChar input of + Nothing -> err input + Just ('\125',input) -> go (n-1) input + Just (c,_) -> go n input + '\123' -> do + case alexGetChar input of + Nothing -> err input + Just ('-',input') -> go (n+1) input' + Just (c,input) -> go n input + c -> go n input + + err (AI end _ _) = failLocMsgP (srcSpanStart span) end "unterminated `{-'" + +open_brace, close_brace :: Action +open_brace span _str _len = do + ctx <- getContext + setContext (NoLayout:ctx) + return (L span ITocurly) +close_brace span _str _len = do + popContext + return (L span ITccurly) + +-- We have to be careful not to count M.<varid> as a qualified name +-- when <varid> is a keyword. We hack around this by catching +-- the offending tokens afterward, and re-lexing in a different state. +check_qvarid span buf len = do + case lookupUFM reservedWordsFM var of + Just (keyword,exts) + | not (isSpecial keyword) -> + if exts == 0 + then try_again + else do + b <- extension (\i -> exts .&. 
i /= 0) + if b then try_again + else return token + _other -> return token + where + (mod,var) = splitQualName buf len + token = L span (ITqvarid (mod,var)) + + try_again = do + (AI _ offs _) <- getInput + setInput (AI (srcSpanStart span) (offs-len) buf) + pushLexState bad_qvarid + lexToken + +qvarid buf len = ITqvarid $! splitQualName buf len +qconid buf len = ITqconid $! splitQualName buf len + +splitQualName :: StringBuffer -> Int -> (FastString,FastString) +-- takes a StringBuffer and a length, and returns the module name +-- and identifier parts of a qualified name. Splits at the *last* dot, +-- because of hierarchical module names. +splitQualName orig_buf len = split orig_buf orig_buf + where + split buf dot_buf + | orig_buf `byteDiff` buf >= len = done dot_buf + | c == '.' = found_dot buf' + | otherwise = split buf' dot_buf + where + (c,buf') = nextChar buf + + -- careful, we might get names like M.... + -- so, if the character after the dot is not upper-case, this is + -- the end of the qualifier part. + found_dot buf -- buf points after the '.' + | isUpper c = split buf' buf + | otherwise = done buf + where + (c,buf') = nextChar buf + + done dot_buf = + (lexemeToFastString orig_buf (qual_size - 1), + lexemeToFastString dot_buf (len - qual_size)) + where + qual_size = orig_buf `byteDiff` dot_buf + +varid span buf len = + case lookupUFM reservedWordsFM fs of + Just (keyword,0) -> do + maybe_layout keyword + return (L span keyword) + Just (keyword,exts) -> do + b <- extension (\i -> exts .&. i /= 0) + if b then do maybe_layout keyword + return (L span keyword) + else return (L span (ITvarid fs)) + _other -> return (L span (ITvarid fs)) + where + fs = lexemeToFastString buf len + +conid buf len = ITconid fs + where fs = lexemeToFastString buf len + +qvarsym buf len = ITqvarsym $! splitQualName buf len +qconsym buf len = ITqconsym $! splitQualName buf len + +varsym = sym ITvarsym +consym = sym ITconsym + +sym con span buf len = + case lookupUFM reservedSymsFM fs of + Just (keyword,0) -> return (L span keyword) + Just (keyword,exts) -> do + b <- extension (\i -> exts .&. i /= 0) + if b then return (L span keyword) + else return (L span $! con fs) + _other -> return (L span $! con fs) + where + fs = lexemeToFastString buf len + +tok_decimal span buf len + = return (L span (ITinteger $! parseInteger buf len 10 octDecDigit)) + +tok_octal span buf len + = return (L span (ITinteger $! parseInteger (offsetBytes 2 buf) (len-2) 8 octDecDigit)) + +tok_hexadecimal span buf len + = return (L span (ITinteger $! parseInteger (offsetBytes 2 buf) (len-2) 16 hexDigit)) + +prim_decimal span buf len + = return (L span (ITprimint $! parseInteger buf (len-1) 10 octDecDigit)) + +prim_octal span buf len + = return (L span (ITprimint $! parseInteger (offsetBytes 2 buf) (len-3) 8 octDecDigit)) + +prim_hexadecimal span buf len + = return (L span (ITprimint $! parseInteger (offsetBytes 2 buf) (len-3) 16 hexDigit)) + +tok_float str = ITrational $! readRational str +prim_float str = ITprimfloat $! readRational str +prim_double str = ITprimdouble $! 
readRational str + +-- ----------------------------------------------------------------------------- +-- Layout processing + +-- we're at the first token on a line, insert layout tokens if necessary +do_bol :: Action +do_bol span _str _len = do + pos <- getOffside + case pos of + LT -> do + --trace "layout: inserting '}'" $ do + popContext + -- do NOT pop the lex state, we might have a ';' to insert + return (L span ITvccurly) + EQ -> do + --trace "layout: inserting ';'" $ do + popLexState + return (L span ITsemi) + GT -> do + popLexState + lexToken + +-- certain keywords put us in the "layout" state, where we might +-- add an opening curly brace. +maybe_layout ITdo = pushLexState layout_do +maybe_layout ITmdo = pushLexState layout_do +maybe_layout ITof = pushLexState layout +maybe_layout ITlet = pushLexState layout +maybe_layout ITwhere = pushLexState layout +maybe_layout ITrec = pushLexState layout +maybe_layout _ = return () + +-- Pushing a new implicit layout context. If the indentation of the +-- next token is not greater than the previous layout context, then +-- Haskell 98 says that the new layout context should be empty; that is +-- the lexer must generate {}. +-- +-- We are slightly more lenient than this: when the new context is started +-- by a 'do', then we allow the new context to be at the same indentation as +-- the previous context. This is what the 'strict' argument is for. +-- +new_layout_context strict span _buf _len = do + popLexState + (AI _ offset _) <- getInput + ctx <- getContext + case ctx of + Layout prev_off : _ | + (strict && prev_off >= offset || + not strict && prev_off > offset) -> do + -- token is indented to the left of the previous context. + -- we must generate a {} sequence now. + pushLexState layout_left + return (L span ITvocurly) + other -> do + setContext (Layout offset : ctx) + return (L span ITvocurly) + +do_layout_left span _buf _len = do + popLexState + pushLexState bol -- we must be at the start of a line + return (L span ITvccurly) + +-- ----------------------------------------------------------------------------- +-- LINE pragmas + +setLine :: Int -> Action +setLine code span buf len = do + let line = parseInteger buf len 10 octDecDigit + setSrcLoc (mkSrcLoc (srcSpanFile span) (fromIntegral line - 1) 0) + -- subtract one: the line number refers to the *following* line + popLexState + pushLexState code + lexToken + +setFile :: Int -> Action +setFile code span buf len = do + let file = lexemeToFastString (stepOn buf) (len-2) + setSrcLoc (mkSrcLoc file (srcSpanEndLine span) (srcSpanEndCol span)) + popLexState + pushLexState code + lexToken + + +-- ----------------------------------------------------------------------------- +-- Options, includes and language pragmas. + +lex_string_prag :: (String -> Token) -> Action +lex_string_prag mkTok span buf len + = do input <- getInput + start <- getSrcLoc + tok <- go [] input + end <- getSrcLoc + return (L (mkSrcSpan start end) tok) + where go acc input + = if isString input "#-}" + then do setInput input + return (mkTok (reverse acc)) + else case alexGetChar input of + Just (c,i) -> go (c:acc) i + Nothing -> err input + isString i [] = True + isString i (x:xs) + = case alexGetChar i of + Just (c,i') | c == x -> isString i' xs + _other -> False + err (AI end _ _) = failLocMsgP (srcSpanStart span) end "unterminated options pragma" + + +-- ----------------------------------------------------------------------------- +-- Strings & Chars + +-- This stuff is horrible. I hates it. 
+ +lex_string_tok :: Action +lex_string_tok span buf len = do + tok <- lex_string "" + end <- getSrcLoc + return (L (mkSrcSpan (srcSpanStart span) end) tok) + +lex_string :: String -> P Token +lex_string s = do + i <- getInput + case alexGetChar' i of + Nothing -> lit_error + + Just ('"',i) -> do + setInput i + glaexts <- extension glaExtsEnabled + if glaexts + then do + i <- getInput + case alexGetChar' i of + Just ('#',i) -> do + setInput i + if any (> '\xFF') s + then failMsgP "primitive string literal must contain only characters <= \'\\xFF\'" + else let s' = mkZFastString (reverse s) in + return (ITprimstring s') + -- mkZFastString is a hack to avoid encoding the + -- string in UTF-8. We just want the exact bytes. + _other -> + return (ITstring (mkFastString (reverse s))) + else + return (ITstring (mkFastString (reverse s))) + + Just ('\\',i) + | Just ('&',i) <- next -> do + setInput i; lex_string s + | Just (c,i) <- next, is_space c -> do + setInput i; lex_stringgap s + where next = alexGetChar' i + + Just (c, i) -> do + c' <- lex_char c i + lex_string (c':s) + +lex_stringgap s = do + c <- getCharOrFail + case c of + '\\' -> lex_string s + c | is_space c -> lex_stringgap s + _other -> lit_error + + +lex_char_tok :: Action +-- Here we are basically parsing character literals, such as 'x' or '\n' +-- but, when Template Haskell is on, we additionally spot +-- 'x and ''T, returning ITvarQuote and ITtyQuote respectively, +-- but WIHTOUT CONSUMING the x or T part (the parser does that). +-- So we have to do two characters of lookahead: when we see 'x we need to +-- see if there's a trailing quote +lex_char_tok span buf len = do -- We've seen ' + i1 <- getInput -- Look ahead to first character + let loc = srcSpanStart span + case alexGetChar' i1 of + Nothing -> lit_error + + Just ('\'', i2@(AI end2 _ _)) -> do -- We've seen '' + th_exts <- extension thEnabled + if th_exts then do + setInput i2 + return (L (mkSrcSpan loc end2) ITtyQuote) + else lit_error + + Just ('\\', i2@(AI end2 _ _)) -> do -- We've seen 'backslash + setInput i2 + lit_ch <- lex_escape + mc <- getCharOrFail -- Trailing quote + if mc == '\'' then finish_char_tok loc lit_ch + else do setInput i2; lit_error + + Just (c, i2@(AI end2 _ _)) + | not (isAny c) -> lit_error + | otherwise -> + + -- We've seen 'x, where x is a valid character + -- (i.e. 
not newline etc) but not a quote or backslash + case alexGetChar' i2 of -- Look ahead one more character + Nothing -> lit_error + Just ('\'', i3) -> do -- We've seen 'x' + setInput i3 + finish_char_tok loc c + _other -> do -- We've seen 'x not followed by quote + -- If TH is on, just parse the quote only + th_exts <- extension thEnabled + let (AI end _ _) = i1 + if th_exts then return (L (mkSrcSpan loc end) ITvarQuote) + else do setInput i2; lit_error + +finish_char_tok :: SrcLoc -> Char -> P (Located Token) +finish_char_tok loc ch -- We've already seen the closing quote + -- Just need to check for trailing # + = do glaexts <- extension glaExtsEnabled + i@(AI end _ _) <- getInput + if glaexts then do + case alexGetChar' i of + Just ('#',i@(AI end _ _)) -> do + setInput i + return (L (mkSrcSpan loc end) (ITprimchar ch)) + _other -> + return (L (mkSrcSpan loc end) (ITchar ch)) + else do + return (L (mkSrcSpan loc end) (ITchar ch)) + +lex_char :: Char -> AlexInput -> P Char +lex_char c inp = do + case c of + '\\' -> do setInput inp; lex_escape + c | isAny c -> do setInput inp; return c + _other -> lit_error + +isAny c | c > '\xff' = isPrint c + | otherwise = is_any c + +lex_escape :: P Char +lex_escape = do + c <- getCharOrFail + case c of + 'a' -> return '\a' + 'b' -> return '\b' + 'f' -> return '\f' + 'n' -> return '\n' + 'r' -> return '\r' + 't' -> return '\t' + 'v' -> return '\v' + '\\' -> return '\\' + '"' -> return '\"' + '\'' -> return '\'' + '^' -> do c <- getCharOrFail + if c >= '@' && c <= '_' + then return (chr (ord c - ord '@')) + else lit_error + + 'x' -> readNum is_hexdigit 16 hexDigit + 'o' -> readNum is_octdigit 8 octDecDigit + x | is_digit x -> readNum2 is_digit 10 octDecDigit (octDecDigit x) + + c1 -> do + i <- getInput + case alexGetChar' i of + Nothing -> lit_error + Just (c2,i2) -> + case alexGetChar' i2 of + Nothing -> do setInput i2; lit_error + Just (c3,i3) -> + let str = [c1,c2,c3] in + case [ (c,rest) | (p,c) <- silly_escape_chars, + Just rest <- [maybePrefixMatch p str] ] of + (escape_char,[]):_ -> do + setInput i3 + return escape_char + (escape_char,_:_):_ -> do + setInput i2 + return escape_char + [] -> lit_error + +readNum :: (Char -> Bool) -> Int -> (Char -> Int) -> P Char +readNum is_digit base conv = do + i <- getInput + c <- getCharOrFail + if is_digit c + then readNum2 is_digit base conv (conv c) + else do setInput i; lit_error + +readNum2 is_digit base conv i = do + input <- getInput + read i input + where read i input = do + case alexGetChar' input of + Just (c,input') | is_digit c -> do + read (i*base + conv c) input' + _other -> do + if i >= 0 && i <= 0x10FFFF + then do setInput input; return (chr i) + else lit_error + +silly_escape_chars = [ + ("NUL", '\NUL'), + ("SOH", '\SOH'), + ("STX", '\STX'), + ("ETX", '\ETX'), + ("EOT", '\EOT'), + ("ENQ", '\ENQ'), + ("ACK", '\ACK'), + ("BEL", '\BEL'), + ("BS", '\BS'), + ("HT", '\HT'), + ("LF", '\LF'), + ("VT", '\VT'), + ("FF", '\FF'), + ("CR", '\CR'), + ("SO", '\SO'), + ("SI", '\SI'), + ("DLE", '\DLE'), + ("DC1", '\DC1'), + ("DC2", '\DC2'), + ("DC3", '\DC3'), + ("DC4", '\DC4'), + ("NAK", '\NAK'), + ("SYN", '\SYN'), + ("ETB", '\ETB'), + ("CAN", '\CAN'), + ("EM", '\EM'), + ("SUB", '\SUB'), + ("ESC", '\ESC'), + ("FS", '\FS'), + ("GS", '\GS'), + ("RS", '\RS'), + ("US", '\US'), + ("SP", '\SP'), + ("DEL", '\DEL') + ] + +-- before calling lit_error, ensure that the current input is pointing to +-- the position of the error in the buffer. 
This is so that we can report +-- a correct location to the user, but also so we can detect UTF-8 decoding +-- errors if they occur. +lit_error = lexError "lexical error in string/character literal" + +getCharOrFail :: P Char +getCharOrFail = do + i <- getInput + case alexGetChar' i of + Nothing -> lexError "unexpected end-of-file in string/character literal" + Just (c,i) -> do setInput i; return c + +-- ----------------------------------------------------------------------------- +-- The Parse Monad + +data LayoutContext + = NoLayout + | Layout !Int + +data ParseResult a + = POk PState a + | PFailed + SrcSpan -- The start and end of the text span related to + -- the error. Might be used in environments which can + -- show this span, e.g. by highlighting it. + Message -- The error message + +data PState = PState { + buffer :: StringBuffer, + last_loc :: SrcSpan, -- pos of previous token + last_offs :: !Int, -- offset of the previous token from the + -- beginning of the current line. + -- \t is equal to 8 spaces. + last_len :: !Int, -- len of previous token + loc :: SrcLoc, -- current loc (end of prev token + 1) + extsBitmap :: !Int, -- bitmap that determines permitted extensions + context :: [LayoutContext], + lex_state :: [Int] + } + -- last_loc and last_len are used when generating error messages, + -- and in pushCurrentContext only. Sigh, if only Happy passed the + -- current token to happyError, we could at least get rid of last_len. + -- Getting rid of last_loc would require finding another way to + -- implement pushCurrentContext (which is only called from one place). + +newtype P a = P { unP :: PState -> ParseResult a } + +instance Monad P where + return = returnP + (>>=) = thenP + fail = failP + +returnP :: a -> P a +returnP a = P $ \s -> POk s a + +thenP :: P a -> (a -> P b) -> P b +(P m) `thenP` k = P $ \ s -> + case m s of + POk s1 a -> (unP (k a)) s1 + PFailed span err -> PFailed span err + +failP :: String -> P a +failP msg = P $ \s -> PFailed (last_loc s) (text msg) + +failMsgP :: String -> P a +failMsgP msg = P $ \s -> PFailed (last_loc s) (text msg) + +failLocMsgP :: SrcLoc -> SrcLoc -> String -> P a +failLocMsgP loc1 loc2 str = P $ \s -> PFailed (mkSrcSpan loc1 loc2) (text str) + +failSpanMsgP :: SrcSpan -> String -> P a +failSpanMsgP span msg = P $ \s -> PFailed span (text msg) + +extension :: (Int -> Bool) -> P Bool +extension p = P $ \s -> POk s (p $! 
extsBitmap s) + +getExts :: P Int +getExts = P $ \s -> POk s (extsBitmap s) + +setSrcLoc :: SrcLoc -> P () +setSrcLoc new_loc = P $ \s -> POk s{loc=new_loc} () + +getSrcLoc :: P SrcLoc +getSrcLoc = P $ \s@(PState{ loc=loc }) -> POk s loc + +setLastToken :: SrcSpan -> Int -> P () +setLastToken loc len = P $ \s -> POk s{ last_loc=loc, last_len=len } () + +data AlexInput = AI SrcLoc {-#UNPACK#-}!Int StringBuffer + +alexInputPrevChar :: AlexInput -> Char +alexInputPrevChar (AI _ _ buf) = prevChar buf '\n' + +alexGetChar :: AlexInput -> Maybe (Char,AlexInput) +alexGetChar (AI loc ofs s) + | atEnd s = Nothing + | otherwise = adj_c `seq` loc' `seq` ofs' `seq` s' `seq` + Just (adj_c, (AI loc' ofs' s')) + where (c,s') = nextChar s + loc' = advanceSrcLoc loc c + ofs' = advanceOffs c ofs + + non_graphic = '\x0' + upper = '\x1' + lower = '\x2' + digit = '\x3' + symbol = '\x4' + space = '\x5' + other_graphic = '\x6' + + adj_c + | c <= '\x06' = non_graphic + | c <= '\xff' = c + | otherwise = + case generalCategory c of + UppercaseLetter -> upper + LowercaseLetter -> lower + TitlecaseLetter -> upper + ModifierLetter -> other_graphic + OtherLetter -> other_graphic + NonSpacingMark -> other_graphic + SpacingCombiningMark -> other_graphic + EnclosingMark -> other_graphic + DecimalNumber -> digit + LetterNumber -> other_graphic + OtherNumber -> other_graphic + ConnectorPunctuation -> other_graphic + DashPunctuation -> other_graphic + OpenPunctuation -> other_graphic + ClosePunctuation -> other_graphic + InitialQuote -> other_graphic + FinalQuote -> other_graphic + OtherPunctuation -> other_graphic + MathSymbol -> symbol + CurrencySymbol -> symbol + ModifierSymbol -> symbol + OtherSymbol -> symbol + Space -> space + _other -> non_graphic + +-- This version does not squash unicode characters, it is used when +-- lexing strings. 
+alexGetChar' :: AlexInput -> Maybe (Char,AlexInput) +alexGetChar' (AI loc ofs s) + | atEnd s = Nothing + | otherwise = c `seq` loc' `seq` ofs' `seq` s' `seq` + Just (c, (AI loc' ofs' s')) + where (c,s') = nextChar s + loc' = advanceSrcLoc loc c + ofs' = advanceOffs c ofs + +advanceOffs :: Char -> Int -> Int +advanceOffs '\n' offs = 0 +advanceOffs '\t' offs = (offs `quot` 8 + 1) * 8 +advanceOffs _ offs = offs + 1 + +getInput :: P AlexInput +getInput = P $ \s@PState{ loc=l, last_offs=o, buffer=b } -> POk s (AI l o b) + +setInput :: AlexInput -> P () +setInput (AI l o b) = P $ \s -> POk s{ loc=l, last_offs=o, buffer=b } () + +pushLexState :: Int -> P () +pushLexState ls = P $ \s@PState{ lex_state=l } -> POk s{lex_state=ls:l} () + +popLexState :: P Int +popLexState = P $ \s@PState{ lex_state=ls:l } -> POk s{ lex_state=l } ls + +getLexState :: P Int +getLexState = P $ \s@PState{ lex_state=ls:l } -> POk s ls + +-- for reasons of efficiency, flags indicating language extensions (eg, +-- -fglasgow-exts or -fparr) are represented by a bitmap stored in an unboxed +-- integer + +glaExtsBit, ffiBit, parrBit :: Int +glaExtsBit = 0 +ffiBit = 1 +parrBit = 2 +arrowsBit = 4 +thBit = 5 +ipBit = 6 +tvBit = 7 -- Scoped type variables enables 'forall' keyword +bangPatBit = 8 -- Tells the parser to understand bang-patterns + -- (doesn't affect the lexer) + +glaExtsEnabled, ffiEnabled, parrEnabled :: Int -> Bool +glaExtsEnabled flags = testBit flags glaExtsBit +ffiEnabled flags = testBit flags ffiBit +parrEnabled flags = testBit flags parrBit +arrowsEnabled flags = testBit flags arrowsBit +thEnabled flags = testBit flags thBit +ipEnabled flags = testBit flags ipBit +tvEnabled flags = testBit flags tvBit +bangPatEnabled flags = testBit flags bangPatBit + +-- PState for parsing options pragmas +-- +pragState :: StringBuffer -> SrcLoc -> PState +pragState buf loc = + PState { + buffer = buf, + last_loc = mkSrcSpan loc loc, + last_offs = 0, + last_len = 0, + loc = loc, + extsBitmap = 0, + context = [], + lex_state = [bol, option_prags, 0] + } + + +-- create a parse state +-- +mkPState :: StringBuffer -> SrcLoc -> DynFlags -> PState +mkPState buf loc flags = + PState { + buffer = buf, + last_loc = mkSrcSpan loc loc, + last_offs = 0, + last_len = 0, + loc = loc, + extsBitmap = fromIntegral bitmap, + context = [], + lex_state = [bol, if glaExtsEnabled bitmap then glaexts else 0] + -- we begin in the layout state if toplev_layout is set + } + where + bitmap = glaExtsBit `setBitIf` dopt Opt_GlasgowExts flags + .|. ffiBit `setBitIf` dopt Opt_FFI flags + .|. parrBit `setBitIf` dopt Opt_PArr flags + .|. arrowsBit `setBitIf` dopt Opt_Arrows flags + .|. thBit `setBitIf` dopt Opt_TH flags + .|. ipBit `setBitIf` dopt Opt_ImplicitParams flags + .|. tvBit `setBitIf` dopt Opt_ScopedTypeVariables flags + .|. bangPatBit `setBitIf` dopt Opt_BangPatterns flags + -- + setBitIf :: Int -> Bool -> Int + b `setBitIf` cond | cond = bit b + | otherwise = 0 + +getContext :: P [LayoutContext] +getContext = P $ \s@PState{context=ctx} -> POk s ctx + +setContext :: [LayoutContext] -> P () +setContext ctx = P $ \s -> POk s{context=ctx} () + +popContext :: P () +popContext = P $ \ s@(PState{ buffer = buf, context = ctx, + loc = loc, last_len = len, last_loc = last_loc }) -> + case ctx of + (_:tl) -> POk s{ context = tl } () + [] -> PFailed last_loc (srcParseErr buf len) + +-- Push a new layout context at the indentation of the last token read. +-- This is only used at the outer level of a module when the 'module' +-- keyword is missing. 
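The flag bits above (glaExtsBit, thBit, and so on) are hand-numbered and packed into a single Int so that one unboxed word can be threaded through the lexer and queried with testBit. A stand-alone sketch of the same packing scheme, with a hypothetical Ext type in place of the hand-assigned indices and the DynFlags-driven setBitIf code in mkPState:

import Data.Bits (bit, testBit, (.|.))

-- Each extension gets a bit index, the whole set packs into an Int,
-- and a query is one testBit (compare glaExtsEnabled and friends above).
data Ext = GlaExts | FFI | PArr | Arrows | TH | IP | ScopedTVs | BangPats
  deriving (Enum, Bounded, Show)

packExts :: [Ext] -> Int
packExts = foldr (\e acc -> bit (fromEnum e) .|. acc) 0

hasExt :: Int -> Ext -> Bool
hasExt bitmap e = testBit bitmap (fromEnum e)

-- packExts [FFI, TH] `hasExt` TH     == True
-- packExts [FFI, TH] `hasExt` Arrows == False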
+pushCurrentContext :: P () +pushCurrentContext = P $ \ s@PState{ last_offs=offs, last_len=len, context=ctx } -> + POk s{context = Layout (offs-len) : ctx} () + +getOffside :: P Ordering +getOffside = P $ \s@PState{last_offs=offs, context=stk} -> + let ord = case stk of + (Layout n:_) -> compare offs n + _ -> GT + in POk s ord + +-- --------------------------------------------------------------------------- +-- Construct a parse error + +srcParseErr + :: StringBuffer -- current buffer (placed just after the last token) + -> Int -- length of the previous token + -> Message +srcParseErr buf len + = hcat [ if null token + then ptext SLIT("parse error (possibly incorrect indentation)") + else hcat [ptext SLIT("parse error on input "), + char '`', text token, char '\''] + ] + where token = lexemeToString (offsetBytes (-len) buf) len + +-- Report a parse failure, giving the span of the previous token as +-- the location of the error. This is the entry point for errors +-- detected during parsing. +srcParseFail :: P a +srcParseFail = P $ \PState{ buffer = buf, last_len = len, + last_loc = last_loc } -> + PFailed last_loc (srcParseErr buf len) + +-- A lexical error is reported at a particular position in the source file, +-- not over a token range. +lexError :: String -> P a +lexError str = do + loc <- getSrcLoc + i@(AI end _ buf) <- getInput + reportLexError loc end buf str + +-- ----------------------------------------------------------------------------- +-- This is the top-level function: called from the parser each time a +-- new token is to be read from the input. + +lexer :: (Located Token -> P a) -> P a +lexer cont = do + tok@(L _ tok__) <- lexToken + --trace ("token: " ++ show tok__) $ do + cont tok + +lexToken :: P (Located Token) +lexToken = do + inp@(AI loc1 _ buf) <- getInput + sc <- getLexState + exts <- getExts + case alexScanUser exts inp sc of + AlexEOF -> do let span = mkSrcSpan loc1 loc1 + setLastToken span 0 + return (L span ITeof) + AlexError (AI loc2 _ buf) -> do + reportLexError loc1 loc2 buf "lexical error" + AlexSkip inp2 _ -> do + setInput inp2 + lexToken + AlexToken inp2@(AI end _ buf2) len t -> do + setInput inp2 + let span = mkSrcSpan loc1 end + let bytes = byteDiff buf buf2 + span `seq` setLastToken span bytes + t span buf bytes + +-- ToDo: Alex reports the buffer at the start of the erroneous lexeme, +-- but it would be more informative to report the location where the +-- error was actually discovered, especially if this is a decoding +-- error. +reportLexError loc1 loc2 buf str = + let + c = fst (nextChar buf) + in + if c == '\0' -- decoding errors are mapped to '\0', see utf8DecodeChar# + then failLocMsgP loc2 loc2 "UTF-8 decoding error" + else failLocMsgP loc1 loc2 (str ++ " at character " ++ show c) +} diff --git a/compiler/parser/Parser.y.pp b/compiler/parser/Parser.y.pp new file mode 100644 index 0000000000..3066a0f876 --- /dev/null +++ b/compiler/parser/Parser.y.pp @@ -0,0 +1,1607 @@ +-- -*-haskell-*- +-- --------------------------------------------------------------------------- +-- (c) The University of Glasgow 1997-2003 +--- +-- The GHC grammar. 
+-- +-- Author(s): Simon Marlow, Sven Panne 1997, 1998, 1999 +-- --------------------------------------------------------------------------- + +{ +module Parser ( parseModule, parseStmt, parseIdentifier, parseType, + parseHeader ) where + +#define INCLUDE #include +INCLUDE "HsVersions.h" + +import HsSyn +import RdrHsSyn +import HscTypes ( IsBootInterface, DeprecTxt ) +import Lexer +import RdrName +import TysWiredIn ( unitTyCon, unitDataCon, tupleTyCon, tupleCon, nilDataCon, + listTyCon_RDR, parrTyCon_RDR, consDataCon_RDR ) +import Type ( funTyCon ) +import ForeignCall ( Safety(..), CExportSpec(..), CLabelString, + CCallConv(..), CCallTarget(..), defaultCCallConv + ) +import OccName ( varName, dataName, tcClsName, tvName ) +import DataCon ( DataCon, dataConName ) +import SrcLoc ( Located(..), unLoc, getLoc, noLoc, combineSrcSpans, + SrcSpan, combineLocs, srcLocFile, + mkSrcLoc, mkSrcSpan ) +import Module +import StaticFlags ( opt_SccProfilingOn ) +import Type ( Kind, mkArrowKind, liftedTypeKind ) +import BasicTypes ( Boxity(..), Fixity(..), FixityDirection(..), IPName(..), + Activation(..), defaultInlineSpec ) +import OrdList + +import FastString +import Maybes ( orElse ) +import Outputable +import GLAEXTS +} + +{- +----------------------------------------------------------------------------- +Conflicts: 36 shift/reduce (1.25) + +10 for abiguity in 'if x then y else z + 1' [State 178] + (shift parses as 'if x then y else (z + 1)', as per longest-parse rule) + 10 because op might be: : - ! * . `x` VARSYM CONSYM QVARSYM QCONSYM + +1 for ambiguity in 'if x then y else z :: T' [State 178] + (shift parses as 'if x then y else (z :: T)', as per longest-parse rule) + +4 for ambiguity in 'if x then y else z -< e' [State 178] + (shift parses as 'if x then y else (z -< T)', as per longest-parse rule) + There are four such operators: -<, >-, -<<, >>- + + +2 for ambiguity in 'case v of { x :: T -> T ... } ' [States 11, 253] + Which of these two is intended? + case v of + (x::T) -> T -- Rhs is T + or + case v of + (x::T -> T) -> .. -- Rhs is ... + +10 for ambiguity in 'e :: a `b` c'. Does this mean [States 11, 253] + (e::a) `b` c, or + (e :: (a `b` c)) + As well as `b` we can have !, VARSYM, QCONSYM, and CONSYM, hence 5 cases + Same duplication between states 11 and 253 as the previous case + +1 for ambiguity in 'let ?x ...' [State 329] + the parser can't tell whether the ?x is the lhs of a normal binding or + an implicit binding. Fortunately resolving as shift gives it the only + sensible meaning, namely the lhs of an implicit binding. + +1 for ambiguity in '{-# RULES "name" [ ... #-} [State 382] + we don't know whether the '[' starts the activation or not: it + might be the start of the declaration with the activation being + empty. --SDM 1/4/2002 + +6 for conflicts between `fdecl' and `fdeclDEPRECATED', [States 393,394] + which are resolved correctly, and moreover, + should go away when `fdeclDEPRECATED' is removed. + +1 for ambiguity in '{-# RULES "name" forall = ... #-}' [State 474] + since 'forall' is a valid variable name, we don't know whether + to treat a forall on the input as the beginning of a quantifier + or the beginning of the rule itself. Resolving to shift means + it's always treated as a quantifier, hence the above is disallowed. + This saves explicitly defining a grammar for the rule lhs that + doesn't include 'forall'. 
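To make the first group of conflicts above concrete: resolving as shift gives the longest-parse reading, so an operator after the 'else' branch binds the whole branch. A small check in plain Haskell:

-- Shift resolution == longest parse: the '+' belongs to the else branch.
longestParse :: Int
longestParse = if True then 1 else 2 + 1        -- read as: if True then 1 else (2 + 1)

otherReading :: Int
otherReading = (if True then 1 else 2) + 1      -- the other reading needs parentheses

-- longestParse == 1,  otherReading == 2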
+ +-- --------------------------------------------------------------------------- +-- Adding location info + +This is done in a stylised way using the three macros below, L0, L1 +and LL. Each of these macros can be thought of as having type + + L0, L1, LL :: a -> Located a + +They each add a SrcSpan to their argument. + + L0 adds 'noSrcSpan', used for empty productions + + L1 for a production with a single token on the lhs. Grabs the SrcSpan + from that token. + + LL for a production with >1 token on the lhs. Makes up a SrcSpan from + the first and last tokens. + +These suffice for the majority of cases. However, we must be +especially careful with empty productions: LL won't work if the first +or last token on the lhs can represent an empty span. In these cases, +we have to calculate the span using more of the tokens from the lhs, eg. + + | 'newtype' tycl_hdr '=' newconstr deriving + { L (comb3 $1 $4 $5) + (mkTyData NewType (unLoc $2) [$4] (unLoc $5)) } + +We provide comb3 and comb4 functions which are useful in such cases. + +Be careful: there's no checking that you actually got this right, the +only symptom will be that the SrcSpans of your syntax will be +incorrect. + +/* + * We must expand these macros *before* running Happy, which is why this file is + * Parser.y.pp rather than just Parser.y - we run the C pre-processor first. + */ +#define L0 L noSrcSpan +#define L1 sL (getLoc $1) +#define LL sL (comb2 $1 $>) + +-- ----------------------------------------------------------------------------- + +-} + +%token + '_' { L _ ITunderscore } -- Haskell keywords + 'as' { L _ ITas } + 'case' { L _ ITcase } + 'class' { L _ ITclass } + 'data' { L _ ITdata } + 'default' { L _ ITdefault } + 'deriving' { L _ ITderiving } + 'do' { L _ ITdo } + 'else' { L _ ITelse } + 'hiding' { L _ IThiding } + 'if' { L _ ITif } + 'import' { L _ ITimport } + 'in' { L _ ITin } + 'infix' { L _ ITinfix } + 'infixl' { L _ ITinfixl } + 'infixr' { L _ ITinfixr } + 'instance' { L _ ITinstance } + 'let' { L _ ITlet } + 'module' { L _ ITmodule } + 'newtype' { L _ ITnewtype } + 'of' { L _ ITof } + 'qualified' { L _ ITqualified } + 'then' { L _ ITthen } + 'type' { L _ ITtype } + 'where' { L _ ITwhere } + '_scc_' { L _ ITscc } -- ToDo: remove + + 'forall' { L _ ITforall } -- GHC extension keywords + 'foreign' { L _ ITforeign } + 'export' { L _ ITexport } + 'label' { L _ ITlabel } + 'dynamic' { L _ ITdynamic } + 'safe' { L _ ITsafe } + 'threadsafe' { L _ ITthreadsafe } + 'unsafe' { L _ ITunsafe } + 'mdo' { L _ ITmdo } + 'stdcall' { L _ ITstdcallconv } + 'ccall' { L _ ITccallconv } + 'dotnet' { L _ ITdotnet } + 'proc' { L _ ITproc } -- for arrow notation extension + 'rec' { L _ ITrec } -- for arrow notation extension + + '{-# INLINE' { L _ (ITinline_prag _) } + '{-# SPECIALISE' { L _ ITspec_prag } + '{-# SPECIALISE_INLINE' { L _ (ITspec_inline_prag _) } + '{-# SOURCE' { L _ ITsource_prag } + '{-# RULES' { L _ ITrules_prag } + '{-# CORE' { L _ ITcore_prag } -- hdaume: annotated core + '{-# SCC' { L _ ITscc_prag } + '{-# DEPRECATED' { L _ ITdeprecated_prag } + '{-# UNPACK' { L _ ITunpack_prag } + '#-}' { L _ ITclose_prag } + + '..' { L _ ITdotdot } -- reserved symbols + ':' { L _ ITcolon } + '::' { L _ ITdcolon } + '=' { L _ ITequal } + '\\' { L _ ITlam } + '|' { L _ ITvbar } + '<-' { L _ ITlarrow } + '->' { L _ ITrarrow } + '@' { L _ ITat } + '~' { L _ ITtilde } + '=>' { L _ ITdarrow } + '-' { L _ ITminus } + '!' 
{ L _ ITbang } + '*' { L _ ITstar } + '-<' { L _ ITlarrowtail } -- for arrow notation + '>-' { L _ ITrarrowtail } -- for arrow notation + '-<<' { L _ ITLarrowtail } -- for arrow notation + '>>-' { L _ ITRarrowtail } -- for arrow notation + '.' { L _ ITdot } + + '{' { L _ ITocurly } -- special symbols + '}' { L _ ITccurly } + '{|' { L _ ITocurlybar } + '|}' { L _ ITccurlybar } + vocurly { L _ ITvocurly } -- virtual open curly (from layout) + vccurly { L _ ITvccurly } -- virtual close curly (from layout) + '[' { L _ ITobrack } + ']' { L _ ITcbrack } + '[:' { L _ ITopabrack } + ':]' { L _ ITcpabrack } + '(' { L _ IToparen } + ')' { L _ ITcparen } + '(#' { L _ IToubxparen } + '#)' { L _ ITcubxparen } + '(|' { L _ IToparenbar } + '|)' { L _ ITcparenbar } + ';' { L _ ITsemi } + ',' { L _ ITcomma } + '`' { L _ ITbackquote } + + VARID { L _ (ITvarid _) } -- identifiers + CONID { L _ (ITconid _) } + VARSYM { L _ (ITvarsym _) } + CONSYM { L _ (ITconsym _) } + QVARID { L _ (ITqvarid _) } + QCONID { L _ (ITqconid _) } + QVARSYM { L _ (ITqvarsym _) } + QCONSYM { L _ (ITqconsym _) } + + IPDUPVARID { L _ (ITdupipvarid _) } -- GHC extension + IPSPLITVARID { L _ (ITsplitipvarid _) } -- GHC extension + + CHAR { L _ (ITchar _) } + STRING { L _ (ITstring _) } + INTEGER { L _ (ITinteger _) } + RATIONAL { L _ (ITrational _) } + + PRIMCHAR { L _ (ITprimchar _) } + PRIMSTRING { L _ (ITprimstring _) } + PRIMINTEGER { L _ (ITprimint _) } + PRIMFLOAT { L _ (ITprimfloat _) } + PRIMDOUBLE { L _ (ITprimdouble _) } + +-- Template Haskell +'[|' { L _ ITopenExpQuote } +'[p|' { L _ ITopenPatQuote } +'[t|' { L _ ITopenTypQuote } +'[d|' { L _ ITopenDecQuote } +'|]' { L _ ITcloseQuote } +TH_ID_SPLICE { L _ (ITidEscape _) } -- $x +'$(' { L _ ITparenEscape } -- $( exp ) +TH_VAR_QUOTE { L _ ITvarQuote } -- 'x +TH_TY_QUOTE { L _ ITtyQuote } -- ''T + +%monad { P } { >>= } { return } +%lexer { lexer } { L _ ITeof } +%name parseModule module +%name parseStmt maybe_stmt +%name parseIdentifier identifier +%name parseType ctype +%partial parseHeader header +%tokentype { (Located Token) } +%% + +----------------------------------------------------------------------------- +-- Identifiers; one of the entry points +identifier :: { Located RdrName } + : qvar { $1 } + | qcon { $1 } + | qvarop { $1 } + | qconop { $1 } + +----------------------------------------------------------------------------- +-- Module Header + +-- The place for module deprecation is really too restrictive, but if it +-- was allowed at its natural place just before 'module', we get an ugly +-- s/r conflict with the second alternative. Another solution would be the +-- introduction of a new pragma DEPRECATED_MODULE, but this is not very nice, +-- either, and DEPRECATED is only expected to be used by people who really +-- know what they are doing. 
:-) + +module :: { Located (HsModule RdrName) } + : 'module' modid maybemoddeprec maybeexports 'where' body + {% fileSrcSpan >>= \ loc -> + return (L loc (HsModule (Just $2) $4 (fst $6) (snd $6) $3)) } + | missing_module_keyword top close + {% fileSrcSpan >>= \ loc -> + return (L loc (HsModule Nothing Nothing + (fst $2) (snd $2) Nothing)) } + +missing_module_keyword :: { () } + : {- empty -} {% pushCurrentContext } + +maybemoddeprec :: { Maybe DeprecTxt } + : '{-# DEPRECATED' STRING '#-}' { Just (getSTRING $2) } + | {- empty -} { Nothing } + +body :: { ([LImportDecl RdrName], [LHsDecl RdrName]) } + : '{' top '}' { $2 } + | vocurly top close { $2 } + +top :: { ([LImportDecl RdrName], [LHsDecl RdrName]) } + : importdecls { (reverse $1,[]) } + | importdecls ';' cvtopdecls { (reverse $1,$3) } + | cvtopdecls { ([],$1) } + +cvtopdecls :: { [LHsDecl RdrName] } + : topdecls { cvTopDecls $1 } + +----------------------------------------------------------------------------- +-- Module declaration & imports only + +header :: { Located (HsModule RdrName) } + : 'module' modid maybemoddeprec maybeexports 'where' header_body + {% fileSrcSpan >>= \ loc -> + return (L loc (HsModule (Just $2) $4 $6 [] $3)) } + | missing_module_keyword importdecls + {% fileSrcSpan >>= \ loc -> + return (L loc (HsModule Nothing Nothing $2 [] Nothing)) } + +header_body :: { [LImportDecl RdrName] } + : '{' importdecls { $2 } + | vocurly importdecls { $2 } + +----------------------------------------------------------------------------- +-- The Export List + +maybeexports :: { Maybe [LIE RdrName] } + : '(' exportlist ')' { Just $2 } + | {- empty -} { Nothing } + +exportlist :: { [LIE RdrName] } + : exportlist ',' export { $3 : $1 } + | exportlist ',' { $1 } + | export { [$1] } + | {- empty -} { [] } + + -- No longer allow things like [] and (,,,) to be exported + -- They are built in syntax, always available +export :: { LIE RdrName } + : qvar { L1 (IEVar (unLoc $1)) } + | oqtycon { L1 (IEThingAbs (unLoc $1)) } + | oqtycon '(' '..' ')' { LL (IEThingAll (unLoc $1)) } + | oqtycon '(' ')' { LL (IEThingWith (unLoc $1) []) } + | oqtycon '(' qcnames ')' { LL (IEThingWith (unLoc $1) (reverse $3)) } + | 'module' modid { LL (IEModuleContents (unLoc $2)) } + +qcnames :: { [RdrName] } + : qcnames ',' qcname { unLoc $3 : $1 } + | qcname { [unLoc $1] } + +qcname :: { Located RdrName } -- Variable or data constructor + : qvar { $1 } + | qcon { $1 } + +----------------------------------------------------------------------------- +-- Import Declarations + +-- import decls can be *empty*, or even just a string of semicolons +-- whereas topdecls must contain at least one topdecl. 
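A small module illustrating the deliberate slack in the rules above: the "exportlist ','" alternative allows a trailing comma, and extra semicolons around import declarations are absorbed by 'importdecls'. (Sketch only; it relies on GHC's usual leniency here rather than the strictest reading of Haskell 98.)

module Example ( f, g, ) where   -- trailing comma: the "exportlist ','" production

import Data.List ( sort );       -- a stray ';' after an import is also tolerated

f :: [Int] -> [Int]
f = sort

g :: Int
g = length (f [3, 1, 2])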
+ +importdecls :: { [LImportDecl RdrName] } + : importdecls ';' importdecl { $3 : $1 } + | importdecls ';' { $1 } + | importdecl { [ $1 ] } + | {- empty -} { [] } + +importdecl :: { LImportDecl RdrName } + : 'import' maybe_src optqualified modid maybeas maybeimpspec + { L (comb4 $1 $4 $5 $6) (ImportDecl $4 $2 $3 (unLoc $5) (unLoc $6)) } + +maybe_src :: { IsBootInterface } + : '{-# SOURCE' '#-}' { True } + | {- empty -} { False } + +optqualified :: { Bool } + : 'qualified' { True } + | {- empty -} { False } + +maybeas :: { Located (Maybe Module) } + : 'as' modid { LL (Just (unLoc $2)) } + | {- empty -} { noLoc Nothing } + +maybeimpspec :: { Located (Maybe (Bool, [LIE RdrName])) } + : impspec { L1 (Just (unLoc $1)) } + | {- empty -} { noLoc Nothing } + +impspec :: { Located (Bool, [LIE RdrName]) } + : '(' exportlist ')' { LL (False, reverse $2) } + | 'hiding' '(' exportlist ')' { LL (True, reverse $3) } + +----------------------------------------------------------------------------- +-- Fixity Declarations + +prec :: { Int } + : {- empty -} { 9 } + | INTEGER {% checkPrecP (L1 (fromInteger (getINTEGER $1))) } + +infix :: { Located FixityDirection } + : 'infix' { L1 InfixN } + | 'infixl' { L1 InfixL } + | 'infixr' { L1 InfixR } + +ops :: { Located [Located RdrName] } + : ops ',' op { LL ($3 : unLoc $1) } + | op { L1 [$1] } + +----------------------------------------------------------------------------- +-- Top-Level Declarations + +topdecls :: { OrdList (LHsDecl RdrName) } -- Reversed + : topdecls ';' topdecl { $1 `appOL` $3 } + | topdecls ';' { $1 } + | topdecl { $1 } + +topdecl :: { OrdList (LHsDecl RdrName) } + : tycl_decl { unitOL (L1 (TyClD (unLoc $1))) } + | 'instance' inst_type where + { let (binds,sigs) = cvBindsAndSigs (unLoc $3) + in unitOL (L (comb3 $1 $2 $3) (InstD (InstDecl $2 binds sigs))) } + | 'default' '(' comma_types0 ')' { unitOL (LL $ DefD (DefaultDecl $3)) } + | 'foreign' fdecl { unitOL (LL (unLoc $2)) } + | '{-# DEPRECATED' deprecations '#-}' { $2 } + | '{-# RULES' rules '#-}' { $2 } + | '$(' exp ')' { unitOL (LL $ SpliceD (SpliceDecl $2)) } + | decl { unLoc $1 } + +tycl_decl :: { LTyClDecl RdrName } + : 'type' type '=' ctype + -- Note type on the left of the '='; this allows + -- infix type constructors to be declared + -- + -- Note ctype, not sigtype, on the right + -- We allow an explicit for-all but we don't insert one + -- in type Foo a = (b,b) + -- Instead we just say b is out of scope + {% do { (tc,tvs) <- checkSynHdr $2 + ; return (LL (TySynonym tc tvs $4)) } } + + | data_or_newtype tycl_hdr constrs deriving + { L (comb4 $1 $2 $3 $4) -- We need the location on tycl_hdr + -- in case constrs and deriving are both empty + (mkTyData (unLoc $1) (unLoc $2) Nothing (reverse (unLoc $3)) (unLoc $4)) } + + | data_or_newtype tycl_hdr opt_kind_sig + 'where' gadt_constrlist + deriving + { L (comb4 $1 $2 $4 $5) + (mkTyData (unLoc $1) (unLoc $2) $3 (reverse (unLoc $5)) (unLoc $6)) } + + | 'class' tycl_hdr fds where + { let + (binds,sigs) = cvBindsAndSigs (unLoc $4) + in + L (comb4 $1 $2 $3 $4) (mkClassDecl (unLoc $2) (unLoc $3) sigs + binds) } + +data_or_newtype :: { Located NewOrData } + : 'data' { L1 DataType } + | 'newtype' { L1 NewType } + +opt_kind_sig :: { Maybe Kind } + : { Nothing } + | '::' kind { Just $2 } + +-- tycl_hdr parses the header of a type or class decl, +-- which takes the form +-- T a b +-- Eq a => T a +-- (Eq a, Ord b) => T a b +-- Rather a lot of inlining here, else we get reduce/reduce errors +tycl_hdr :: { Located (LHsContext RdrName, Located RdrName, 
[LHsTyVarBndr RdrName]) } + : context '=>' type {% checkTyClHdr $1 $3 >>= return.LL } + | type {% checkTyClHdr (noLoc []) $1 >>= return.L1 } + +----------------------------------------------------------------------------- +-- Nested declarations + +decls :: { Located (OrdList (LHsDecl RdrName)) } -- Reversed + : decls ';' decl { LL (unLoc $1 `appOL` unLoc $3) } + | decls ';' { LL (unLoc $1) } + | decl { $1 } + | {- empty -} { noLoc nilOL } + + +decllist :: { Located (OrdList (LHsDecl RdrName)) } -- Reversed + : '{' decls '}' { LL (unLoc $2) } + | vocurly decls close { $2 } + +where :: { Located (OrdList (LHsDecl RdrName)) } -- Reversed + -- No implicit parameters + : 'where' decllist { LL (unLoc $2) } + | {- empty -} { noLoc nilOL } + +binds :: { Located (HsLocalBinds RdrName) } -- May have implicit parameters + : decllist { L1 (HsValBinds (cvBindGroup (unLoc $1))) } + | '{' dbinds '}' { LL (HsIPBinds (IPBinds (unLoc $2) emptyLHsBinds)) } + | vocurly dbinds close { L (getLoc $2) (HsIPBinds (IPBinds (unLoc $2) emptyLHsBinds)) } + +wherebinds :: { Located (HsLocalBinds RdrName) } -- May have implicit parameters + : 'where' binds { LL (unLoc $2) } + | {- empty -} { noLoc emptyLocalBinds } + + +----------------------------------------------------------------------------- +-- Transformation Rules + +rules :: { OrdList (LHsDecl RdrName) } -- Reversed + : rules ';' rule { $1 `snocOL` $3 } + | rules ';' { $1 } + | rule { unitOL $1 } + | {- empty -} { nilOL } + +rule :: { LHsDecl RdrName } + : STRING activation rule_forall infixexp '=' exp + { LL $ RuleD (HsRule (getSTRING $1) + ($2 `orElse` AlwaysActive) + $3 $4 placeHolderNames $6 placeHolderNames) } + +activation :: { Maybe Activation } + : {- empty -} { Nothing } + | explicit_activation { Just $1 } + +explicit_activation :: { Activation } -- In brackets + : '[' INTEGER ']' { ActiveAfter (fromInteger (getINTEGER $2)) } + | '[' '~' INTEGER ']' { ActiveBefore (fromInteger (getINTEGER $3)) } + +rule_forall :: { [RuleBndr RdrName] } + : 'forall' rule_var_list '.' { $2 } + | {- empty -} { [] } + +rule_var_list :: { [RuleBndr RdrName] } + : rule_var { [$1] } + | rule_var rule_var_list { $1 : $2 } + +rule_var :: { RuleBndr RdrName } + : varid { RuleBndr $1 } + | '(' varid '::' ctype ')' { RuleBndrSig $2 $4 } + +----------------------------------------------------------------------------- +-- Deprecations (c.f. rules) + +deprecations :: { OrdList (LHsDecl RdrName) } -- Reversed + : deprecations ';' deprecation { $1 `appOL` $3 } + | deprecations ';' { $1 } + | deprecation { $1 } + | {- empty -} { nilOL } + +-- SUP: TEMPORARY HACK, not checking for `module Foo' +deprecation :: { OrdList (LHsDecl RdrName) } + : depreclist STRING + { toOL [ LL $ DeprecD (Deprecation n (getSTRING $2)) + | n <- unLoc $1 ] } + + +----------------------------------------------------------------------------- +-- Foreign import and export declarations + +-- for the time being, the following accepts foreign declarations conforming +-- to the FFI Addendum, Version 1.0 as well as pre-standard declarations +-- +-- * a flag indicates whether pre-standard declarations have been used and +-- triggers a deprecation warning further down the road +-- +-- NB: The first two rules could be combined into one by replacing `safety1' +-- with `safety'. However, the combined rule conflicts with the +-- DEPRECATED rules. 
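Concretely, the first three 'fdecl' alternatives below cover FFI Addendum declarations like the following sketch module (assuming the modern ForeignFunctionInterface pragma in place of the -fglasgow-exts/-fffi flags of the time); the remaining alternatives exist only to keep the pre-standard forms limping along until they are removed:

{-# LANGUAGE ForeignFunctionInterface #-}
module FfiExample where

-- 'import' callconv safety1 fspec
foreign import ccall unsafe "math.h sin"  c_sin :: Double -> Double

-- 'import' callconv fspec  (safety defaults to PlaySafe False)
foreign import ccall "math.h cos"         c_cos :: Double -> Double

-- 'export' callconv fspec
foreign export ccall triple :: Int -> Int

triple :: Int -> Int
triple = (* 3)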
+-- +fdecl :: { LHsDecl RdrName } +fdecl : 'import' callconv safety1 fspec + {% mkImport $2 $3 (unLoc $4) >>= return.LL } + | 'import' callconv fspec + {% do { d <- mkImport $2 (PlaySafe False) (unLoc $3); + return (LL d) } } + | 'export' callconv fspec + {% mkExport $2 (unLoc $3) >>= return.LL } + -- the following syntax is DEPRECATED + | fdecl1DEPRECATED { L1 (ForD (unLoc $1)) } + | fdecl2DEPRECATED { L1 (unLoc $1) } + +fdecl1DEPRECATED :: { LForeignDecl RdrName } +fdecl1DEPRECATED + ----------- DEPRECATED label decls ------------ + : 'label' ext_name varid '::' sigtype + { LL $ ForeignImport $3 $5 (CImport defaultCCallConv (PlaySafe False) nilFS nilFS + (CLabel ($2 `orElse` mkExtName (unLoc $3)))) True } + + ----------- DEPRECATED ccall/stdcall decls ------------ + -- + -- NB: This business with the case expression below may seem overly + -- complicated, but it is necessary to avoid some conflicts. + + -- DEPRECATED variant #1: lack of a calling convention specification + -- (import) + | 'import' {-no callconv-} ext_name safety varid_no_unsafe '::' sigtype + { let + target = StaticTarget ($2 `orElse` mkExtName (unLoc $4)) + in + LL $ ForeignImport $4 $6 (CImport defaultCCallConv $3 nilFS nilFS + (CFunction target)) True } + + -- DEPRECATED variant #2: external name consists of two separate strings + -- (module name and function name) (import) + | 'import' callconv STRING STRING safety varid_no_unsafe '::' sigtype + {% case $2 of + DNCall -> parseError (comb2 $1 $>) "Illegal format of .NET foreign import" + CCall cconv -> return $ + let + imp = CFunction (StaticTarget (getSTRING $4)) + in + LL $ ForeignImport $6 $8 (CImport cconv $5 nilFS nilFS imp) True } + + -- DEPRECATED variant #3: `unsafe' after entity + | 'import' callconv STRING 'unsafe' varid_no_unsafe '::' sigtype + {% case $2 of + DNCall -> parseError (comb2 $1 $>) "Illegal format of .NET foreign import" + CCall cconv -> return $ + let + imp = CFunction (StaticTarget (getSTRING $3)) + in + LL $ ForeignImport $5 $7 (CImport cconv PlayRisky nilFS nilFS imp) True } + + -- DEPRECATED variant #4: use of the special identifier `dynamic' without + -- an explicit calling convention (import) + | 'import' {-no callconv-} 'dynamic' safety varid_no_unsafe '::' sigtype + { LL $ ForeignImport $4 $6 (CImport defaultCCallConv $3 nilFS nilFS + (CFunction DynamicTarget)) True } + + -- DEPRECATED variant #5: use of the special identifier `dynamic' (import) + | 'import' callconv 'dynamic' safety varid_no_unsafe '::' sigtype + {% case $2 of + DNCall -> parseError (comb2 $1 $>) "Illegal format of .NET foreign import" + CCall cconv -> return $ + LL $ ForeignImport $5 $7 (CImport cconv $4 nilFS nilFS + (CFunction DynamicTarget)) True } + + -- DEPRECATED variant #6: lack of a calling convention specification + -- (export) + | 'export' {-no callconv-} ext_name varid '::' sigtype + { LL $ ForeignExport $3 $5 (CExport (CExportStatic ($2 `orElse` mkExtName (unLoc $3)) + defaultCCallConv)) True } + + -- DEPRECATED variant #7: external name consists of two separate strings + -- (module name and function name) (export) + | 'export' callconv STRING STRING varid '::' sigtype + {% case $2 of + DNCall -> parseError (comb2 $1 $>) "Illegal format of .NET foreign import" + CCall cconv -> return $ + LL $ ForeignExport $5 $7 + (CExport (CExportStatic (getSTRING $4) cconv)) True } + + -- DEPRECATED variant #8: use of the special identifier `dynamic' without + -- an explicit calling convention (export) + | 'export' {-no callconv-} 'dynamic' varid '::' sigtype + { LL $ 
ForeignImport $3 $5 (CImport defaultCCallConv (PlaySafe False) nilFS nilFS + CWrapper) True } + + -- DEPRECATED variant #9: use of the special identifier `dynamic' (export) + | 'export' callconv 'dynamic' varid '::' sigtype + {% case $2 of + DNCall -> parseError (comb2 $1 $>) "Illegal format of .NET foreign import" + CCall cconv -> return $ + LL $ ForeignImport $4 $6 + (CImport cconv (PlaySafe False) nilFS nilFS CWrapper) True } + + ----------- DEPRECATED .NET decls ------------ + -- NB: removed the .NET call declaration, as it is entirely subsumed + -- by the new standard FFI declarations + +fdecl2DEPRECATED :: { LHsDecl RdrName } +fdecl2DEPRECATED + : 'import' 'dotnet' 'type' ext_name tycon { LL $ TyClD (ForeignType $5 $4 DNType) } + -- left this one unchanged for the moment as type imports are not + -- covered currently by the FFI standard -=chak + + +callconv :: { CallConv } + : 'stdcall' { CCall StdCallConv } + | 'ccall' { CCall CCallConv } + | 'dotnet' { DNCall } + +safety :: { Safety } + : 'unsafe' { PlayRisky } + | 'safe' { PlaySafe False } + | 'threadsafe' { PlaySafe True } + | {- empty -} { PlaySafe False } + +safety1 :: { Safety } + : 'unsafe' { PlayRisky } + | 'safe' { PlaySafe False } + | 'threadsafe' { PlaySafe True } + -- only needed to avoid conflicts with the DEPRECATED rules + +fspec :: { Located (Located FastString, Located RdrName, LHsType RdrName) } + : STRING var '::' sigtype { LL (L (getLoc $1) (getSTRING $1), $2, $4) } + | var '::' sigtype { LL (noLoc nilFS, $1, $3) } + -- if the entity string is missing, it defaults to the empty string; + -- the meaning of an empty entity string depends on the calling + -- convention + +-- DEPRECATED syntax +ext_name :: { Maybe CLabelString } + : STRING { Just (getSTRING $1) } + | STRING STRING { Just (getSTRING $2) } -- Ignore "module name" for now + | {- empty -} { Nothing } + + +----------------------------------------------------------------------------- +-- Type signatures + +opt_sig :: { Maybe (LHsType RdrName) } + : {- empty -} { Nothing } + | '::' sigtype { Just $2 } + +opt_asig :: { Maybe (LHsType RdrName) } + : {- empty -} { Nothing } + | '::' atype { Just $2 } + +sigtypes1 :: { [LHsType RdrName] } + : sigtype { [ $1 ] } + | sigtype ',' sigtypes1 { $1 : $3 } + +sigtype :: { LHsType RdrName } + : ctype { L1 (mkImplicitHsForAllTy (noLoc []) $1) } + -- Wrap an Implicit forall if there isn't one there already + +sig_vars :: { Located [Located RdrName] } + : sig_vars ',' var { LL ($3 : unLoc $1) } + | var { L1 [$1] } + +----------------------------------------------------------------------------- +-- Types + +strict_mark :: { Located HsBang } + : '!' { L1 HsStrict } + | '{-# UNPACK' '#-}' '!' { LL HsUnbox } + +-- A ctype is a for-all type +ctype :: { LHsType RdrName } + : 'forall' tv_bndrs '.' ctype { LL $ mkExplicitHsForAllTy $2 (noLoc []) $4 } + | context '=>' type { LL $ mkImplicitHsForAllTy $1 $3 } + -- A type of form (context => type) is an *implicit* HsForAllTy + | type { $1 } + +-- We parse a context as a btype so that we don't get reduce/reduce +-- errors in ctype. The basic problem is that +-- (Eq a, Ord a) +-- looks so much like a tuple type. 
We can't tell until we find the => +context :: { LHsContext RdrName } + : btype {% checkContext $1 } + +type :: { LHsType RdrName } + : ipvar '::' gentype { LL (HsPredTy (HsIParam (unLoc $1) $3)) } + | gentype { $1 } + +gentype :: { LHsType RdrName } + : btype { $1 } + | btype qtyconop gentype { LL $ HsOpTy $1 $2 $3 } + | btype tyvarop gentype { LL $ HsOpTy $1 $2 $3 } + | btype '->' ctype { LL $ HsFunTy $1 $3 } + +btype :: { LHsType RdrName } + : btype atype { LL $ HsAppTy $1 $2 } + | atype { $1 } + +atype :: { LHsType RdrName } + : gtycon { L1 (HsTyVar (unLoc $1)) } + | tyvar { L1 (HsTyVar (unLoc $1)) } + | strict_mark atype { LL (HsBangTy (unLoc $1) $2) } + | '(' ctype ',' comma_types1 ')' { LL $ HsTupleTy Boxed ($2:$4) } + | '(#' comma_types1 '#)' { LL $ HsTupleTy Unboxed $2 } + | '[' ctype ']' { LL $ HsListTy $2 } + | '[:' ctype ':]' { LL $ HsPArrTy $2 } + | '(' ctype ')' { LL $ HsParTy $2 } + | '(' ctype '::' kind ')' { LL $ HsKindSig $2 $4 } +-- Generics + | INTEGER { L1 (HsNumTy (getINTEGER $1)) } + +-- An inst_type is what occurs in the head of an instance decl +-- e.g. (Foo a, Gaz b) => Wibble a b +-- It's kept as a single type, with a MonoDictTy at the right +-- hand corner, for convenience. +inst_type :: { LHsType RdrName } + : sigtype {% checkInstType $1 } + +inst_types1 :: { [LHsType RdrName] } + : inst_type { [$1] } + | inst_type ',' inst_types1 { $1 : $3 } + +comma_types0 :: { [LHsType RdrName] } + : comma_types1 { $1 } + | {- empty -} { [] } + +comma_types1 :: { [LHsType RdrName] } + : ctype { [$1] } + | ctype ',' comma_types1 { $1 : $3 } + +tv_bndrs :: { [LHsTyVarBndr RdrName] } + : tv_bndr tv_bndrs { $1 : $2 } + | {- empty -} { [] } + +tv_bndr :: { LHsTyVarBndr RdrName } + : tyvar { L1 (UserTyVar (unLoc $1)) } + | '(' tyvar '::' kind ')' { LL (KindedTyVar (unLoc $2) $4) } + +fds :: { Located [Located ([RdrName], [RdrName])] } + : {- empty -} { noLoc [] } + | '|' fds1 { LL (reverse (unLoc $2)) } + +fds1 :: { Located [Located ([RdrName], [RdrName])] } + : fds1 ',' fd { LL ($3 : unLoc $1) } + | fd { L1 [$1] } + +fd :: { Located ([RdrName], [RdrName]) } + : varids0 '->' varids0 { L (comb3 $1 $2 $3) + (reverse (unLoc $1), reverse (unLoc $3)) } + +varids0 :: { Located [RdrName] } + : {- empty -} { noLoc [] } + | varids0 tyvar { LL (unLoc $2 : unLoc $1) } + +----------------------------------------------------------------------------- +-- Kinds + +kind :: { Kind } + : akind { $1 } + | akind '->' kind { mkArrowKind $1 $3 } + +akind :: { Kind } + : '*' { liftedTypeKind } + | '(' kind ')' { $2 } + + +----------------------------------------------------------------------------- +-- Datatype declarations + +gadt_constrlist :: { Located [LConDecl RdrName] } + : '{' gadt_constrs '}' { LL (unLoc $2) } + | vocurly gadt_constrs close { $2 } + +gadt_constrs :: { Located [LConDecl RdrName] } + : gadt_constrs ';' gadt_constr { LL ($3 : unLoc $1) } + | gadt_constrs ';' { $1 } + | gadt_constr { L1 [$1] } + +-- We allow the following forms: +-- C :: Eq a => a -> T a +-- C :: forall a. Eq a => !a -> T a +-- D { x,y :: a } :: T a +-- forall a. Eq a => D { x,y :: a } :: T a + +gadt_constr :: { LConDecl RdrName } + : con '::' sigtype + { LL (mkGadtDecl $1 $3) } + -- Syntax: Maybe merge the record stuff with the single-case above? 
+ -- (to kill the mostly harmless reduce/reduce error) + -- XXX revisit autrijus + | constr_stuff_record '::' sigtype + { let (con,details) = unLoc $1 in + LL (ConDecl con Implicit [] (noLoc []) details (ResTyGADT $3)) } +{- + | forall context '=>' constr_stuff_record '::' sigtype + { let (con,details) = unLoc $4 in + LL (ConDecl con Implicit (unLoc $1) $2 details (ResTyGADT $6)) } + | forall constr_stuff_record '::' sigtype + { let (con,details) = unLoc $2 in + LL (ConDecl con Implicit (unLoc $1) (noLoc []) details (ResTyGADT $4)) } +-} + + +constrs :: { Located [LConDecl RdrName] } + : {- empty; a GHC extension -} { noLoc [] } + | '=' constrs1 { LL (unLoc $2) } + +constrs1 :: { Located [LConDecl RdrName] } + : constrs1 '|' constr { LL ($3 : unLoc $1) } + | constr { L1 [$1] } + +constr :: { LConDecl RdrName } + : forall context '=>' constr_stuff + { let (con,details) = unLoc $4 in + LL (ConDecl con Explicit (unLoc $1) $2 details ResTyH98) } + | forall constr_stuff + { let (con,details) = unLoc $2 in + LL (ConDecl con Explicit (unLoc $1) (noLoc []) details ResTyH98) } + +forall :: { Located [LHsTyVarBndr RdrName] } + : 'forall' tv_bndrs '.' { LL $2 } + | {- empty -} { noLoc [] } + +constr_stuff :: { Located (Located RdrName, HsConDetails RdrName (LBangType RdrName)) } +-- We parse the constructor declaration +-- C t1 t2 +-- as a btype (treating C as a type constructor) and then convert C to be +-- a data constructor. Reason: it might continue like this: +-- C t1 t2 %: D Int +-- in which case C really would be a type constructor. We can't resolve this +-- ambiguity till we come across the constructor oprerator :% (or not, more usually) + : btype {% mkPrefixCon $1 [] >>= return.LL } + | oqtycon '{' '}' {% mkRecCon $1 [] >>= return.LL } + | oqtycon '{' fielddecls '}' {% mkRecCon $1 $3 >>= return.LL } + | btype conop btype { LL ($2, InfixCon $1 $3) } + +constr_stuff_record :: { Located (Located RdrName, HsConDetails RdrName (LBangType RdrName)) } + : oqtycon '{' '}' {% mkRecCon $1 [] >>= return.sL (comb2 $1 $>) } + | oqtycon '{' fielddecls '}' {% mkRecCon $1 $3 >>= return.sL (comb2 $1 $>) } + +fielddecls :: { [([Located RdrName], LBangType RdrName)] } + : fielddecl ',' fielddecls { unLoc $1 : $3 } + | fielddecl { [unLoc $1] } + +fielddecl :: { Located ([Located RdrName], LBangType RdrName) } + : sig_vars '::' ctype { LL (reverse (unLoc $1), $3) } + +-- We allow the odd-looking 'inst_type' in a deriving clause, so that +-- we can do deriving( forall a. C [a] ) in a newtype (GHC extension). +-- The 'C [a]' part is converted to an HsPredTy by checkInstType +-- We don't allow a context, but that's sorted out by the type checker. +deriving :: { Located (Maybe [LHsType RdrName]) } + : {- empty -} { noLoc Nothing } + | 'deriving' qtycon {% do { let { L loc tv = $2 } + ; p <- checkInstType (L loc (HsTyVar tv)) + ; return (LL (Just [p])) } } + | 'deriving' '(' ')' { LL (Just []) } + | 'deriving' '(' inst_types1 ')' { LL (Just $3) } + -- Glasgow extension: allow partial + -- applications in derivings + +----------------------------------------------------------------------------- +-- Value definitions + +{- There's an awkward overlap with a type signature. Consider + f :: Int -> Int = ...rhs... + Then we can't tell whether it's a type signature or a value + definition with a result signature until we see the '='. + So we have to inline enough to postpone reductions until we know. +-} + +{- + ATTENTION: Dirty Hackery Ahead! 
If the second alternative of vars is var + instead of qvar, we get another shift/reduce-conflict. Consider the + following programs: + + { (^^) :: Int->Int ; } Type signature; only var allowed + + { (^^) :: Int->Int = ... ; } Value defn with result signature; + qvar allowed (because of instance decls) + + We can't tell whether to reduce var to qvar until after we've read the signatures. +-} + +decl :: { Located (OrdList (LHsDecl RdrName)) } + : sigdecl { $1 } + | '!' infixexp rhs {% do { pat <- checkPattern $2; + return (LL $ unitOL $ LL $ ValD $ + PatBind (LL $ BangPat pat) (unLoc $3) + placeHolderType placeHolderNames) } } + | infixexp opt_sig rhs {% do { r <- checkValDef $1 $2 $3; + return (LL $ unitOL (LL $ ValD r)) } } + +rhs :: { Located (GRHSs RdrName) } + : '=' exp wherebinds { L (comb3 $1 $2 $3) $ GRHSs (unguardedRHS $2) (unLoc $3) } + | gdrhs wherebinds { LL $ GRHSs (reverse (unLoc $1)) (unLoc $2) } + +gdrhs :: { Located [LGRHS RdrName] } + : gdrhs gdrh { LL ($2 : unLoc $1) } + | gdrh { L1 [$1] } + +gdrh :: { LGRHS RdrName } + : '|' quals '=' exp { sL (comb2 $1 $>) $ GRHS (reverse (unLoc $2)) $4 } + +sigdecl :: { Located (OrdList (LHsDecl RdrName)) } + : infixexp '::' sigtype + {% do s <- checkValSig $1 $3; + return (LL $ unitOL (LL $ SigD s)) } + -- See the above notes for why we need infixexp here + | var ',' sig_vars '::' sigtype + { LL $ toOL [ LL $ SigD (TypeSig n $5) | n <- $1 : unLoc $3 ] } + | infix prec ops { LL $ toOL [ LL $ SigD (FixSig (FixitySig n (Fixity $2 (unLoc $1)))) + | n <- unLoc $3 ] } + | '{-# INLINE' activation qvar '#-}' + { LL $ unitOL (LL $ SigD (InlineSig $3 (mkInlineSpec $2 (getINLINE $1)))) } + | '{-# SPECIALISE' qvar '::' sigtypes1 '#-}' + { LL $ toOL [ LL $ SigD (SpecSig $2 t defaultInlineSpec) + | t <- $4] } + | '{-# SPECIALISE_INLINE' activation qvar '::' sigtypes1 '#-}' + { LL $ toOL [ LL $ SigD (SpecSig $3 t (mkInlineSpec $2 (getSPEC_INLINE $1))) + | t <- $5] } + | '{-# SPECIALISE' 'instance' inst_type '#-}' + { LL $ unitOL (LL $ SigD (SpecInstSig $3)) } + +----------------------------------------------------------------------------- +-- Expressions + +exp :: { LHsExpr RdrName } + : infixexp '::' sigtype { LL $ ExprWithTySig $1 $3 } + | infixexp '-<' exp { LL $ HsArrApp $1 $3 placeHolderType HsFirstOrderApp True } + | infixexp '>-' exp { LL $ HsArrApp $3 $1 placeHolderType HsFirstOrderApp False } + | infixexp '-<<' exp { LL $ HsArrApp $1 $3 placeHolderType HsHigherOrderApp True } + | infixexp '>>-' exp { LL $ HsArrApp $3 $1 placeHolderType HsHigherOrderApp False} + | infixexp { $1 } + +infixexp :: { LHsExpr RdrName } + : exp10 { $1 } + | infixexp qop exp10 { LL (OpApp $1 $2 (panic "fixity") $3) } + +exp10 :: { LHsExpr RdrName } + : '\\' aexp aexps opt_asig '->' exp + {% checkPatterns ($2 : reverse $3) >>= \ ps -> + return (LL $ HsLam (mkMatchGroup [LL $ Match ps $4 + (GRHSs (unguardedRHS $6) emptyLocalBinds + )])) } + | 'let' binds 'in' exp { LL $ HsLet (unLoc $2) $4 } + | 'if' exp 'then' exp 'else' exp { LL $ HsIf $2 $4 $6 } + | 'case' exp 'of' altslist { LL $ HsCase $2 (mkMatchGroup (unLoc $4)) } + | '-' fexp { LL $ mkHsNegApp $2 } + + | 'do' stmtlist {% let loc = comb2 $1 $2 in + checkDo loc (unLoc $2) >>= \ (stmts,body) -> + return (L loc (mkHsDo DoExpr stmts body)) } + | 'mdo' stmtlist {% let loc = comb2 $1 $2 in + checkDo loc (unLoc $2) >>= \ (stmts,body) -> + return (L loc (mkHsDo (MDoExpr noPostTcTable) stmts body)) } + | scc_annot exp { LL $ if opt_SccProfilingOn + then HsSCC (unLoc $1) $2 + else HsPar $2 } + + | 'proc' aexp '->' exp + 
{% checkPattern $2 >>= \ p -> + return (LL $ HsProc p (LL $ HsCmdTop $4 [] + placeHolderType undefined)) } + -- TODO: is LL right here? + + | '{-# CORE' STRING '#-}' exp { LL $ HsCoreAnn (getSTRING $2) $4 } + -- hdaume: core annotation + | fexp { $1 } + +scc_annot :: { Located FastString } + : '_scc_' STRING { LL $ getSTRING $2 } + | '{-# SCC' STRING '#-}' { LL $ getSTRING $2 } + +fexp :: { LHsExpr RdrName } + : fexp aexp { LL $ HsApp $1 $2 } + | aexp { $1 } + +aexps :: { [LHsExpr RdrName] } + : aexps aexp { $2 : $1 } + | {- empty -} { [] } + +aexp :: { LHsExpr RdrName } + : qvar '@' aexp { LL $ EAsPat $1 $3 } + | '~' aexp { LL $ ELazyPat $2 } +-- | '!' aexp { LL $ EBangPat $2 } + | aexp1 { $1 } + +aexp1 :: { LHsExpr RdrName } + : aexp1 '{' fbinds '}' {% do { r <- mkRecConstrOrUpdate $1 (comb2 $2 $4) + (reverse $3); + return (LL r) }} + | aexp2 { $1 } + +-- Here was the syntax for type applications that I was planning +-- but there are difficulties (e.g. what order for type args) +-- so it's not enabled yet. +-- But this case *is* used for the left hand side of a generic definition, +-- which is parsed as an expression before being munged into a pattern + | qcname '{|' gentype '|}' { LL $ HsApp (sL (getLoc $1) (HsVar (unLoc $1))) + (sL (getLoc $3) (HsType $3)) } + +aexp2 :: { LHsExpr RdrName } + : ipvar { L1 (HsIPVar $! unLoc $1) } + | qcname { L1 (HsVar $! unLoc $1) } + | literal { L1 (HsLit $! unLoc $1) } + | INTEGER { L1 (HsOverLit $! mkHsIntegral (getINTEGER $1)) } + | RATIONAL { L1 (HsOverLit $! mkHsFractional (getRATIONAL $1)) } + | '(' exp ')' { LL (HsPar $2) } + | '(' texp ',' texps ')' { LL $ ExplicitTuple ($2 : reverse $4) Boxed } + | '(#' texps '#)' { LL $ ExplicitTuple (reverse $2) Unboxed } + | '[' list ']' { LL (unLoc $2) } + | '[:' parr ':]' { LL (unLoc $2) } + | '(' infixexp qop ')' { LL $ SectionL $2 $3 } + | '(' qopm infixexp ')' { LL $ SectionR $2 $3 } + | '_' { L1 EWildPat } + + -- MetaHaskell Extension + | TH_ID_SPLICE { L1 $ HsSpliceE (mkHsSplice + (L1 $ HsVar (mkUnqual varName + (getTH_ID_SPLICE $1)))) } -- $x + | '$(' exp ')' { LL $ HsSpliceE (mkHsSplice $2) } -- $( exp ) + + | TH_VAR_QUOTE qvar { LL $ HsBracket (VarBr (unLoc $2)) } + | TH_VAR_QUOTE qcon { LL $ HsBracket (VarBr (unLoc $2)) } + | TH_TY_QUOTE tyvar { LL $ HsBracket (VarBr (unLoc $2)) } + | TH_TY_QUOTE gtycon { LL $ HsBracket (VarBr (unLoc $2)) } + | '[|' exp '|]' { LL $ HsBracket (ExpBr $2) } + | '[t|' ctype '|]' { LL $ HsBracket (TypBr $2) } + | '[p|' infixexp '|]' {% checkPattern $2 >>= \p -> + return (LL $ HsBracket (PatBr p)) } + | '[d|' cvtopbody '|]' { LL $ HsBracket (DecBr (mkGroup $2)) } + + -- arrow notation extension + | '(|' aexp2 cmdargs '|)' { LL $ HsArrForm $2 Nothing (reverse $3) } + +cmdargs :: { [LHsCmdTop RdrName] } + : cmdargs acmd { $2 : $1 } + | {- empty -} { [] } + +acmd :: { LHsCmdTop RdrName } + : aexp2 { L1 $ HsCmdTop $1 [] placeHolderType undefined } + +cvtopbody :: { [LHsDecl RdrName] } + : '{' cvtopdecls0 '}' { $2 } + | vocurly cvtopdecls0 close { $2 } + +cvtopdecls0 :: { [LHsDecl RdrName] } + : {- empty -} { [] } + | cvtopdecls { $1 } + +texp :: { LHsExpr RdrName } + : exp { $1 } + | qopm infixexp { LL $ SectionR $1 $2 } + -- The second production is really here only for bang patterns + -- but + +texps :: { [LHsExpr RdrName] } + : texps ',' texp { $3 : $1 } + | texp { [$1] } + + +----------------------------------------------------------------------------- +-- List expressions + +-- The rules below are little bit contorted to keep lexps left-recursive while +-- avoiding 
another shift/reduce-conflict. + +list :: { LHsExpr RdrName } + : texp { L1 $ ExplicitList placeHolderType [$1] } + | lexps { L1 $ ExplicitList placeHolderType (reverse (unLoc $1)) } + | texp '..' { LL $ ArithSeq noPostTcExpr (From $1) } + | texp ',' exp '..' { LL $ ArithSeq noPostTcExpr (FromThen $1 $3) } + | texp '..' exp { LL $ ArithSeq noPostTcExpr (FromTo $1 $3) } + | texp ',' exp '..' exp { LL $ ArithSeq noPostTcExpr (FromThenTo $1 $3 $5) } + | texp pquals { sL (comb2 $1 $>) $ mkHsDo ListComp (reverse (unLoc $2)) $1 } + +lexps :: { Located [LHsExpr RdrName] } + : lexps ',' texp { LL ($3 : unLoc $1) } + | texp ',' texp { LL [$3,$1] } + +----------------------------------------------------------------------------- +-- List Comprehensions + +pquals :: { Located [LStmt RdrName] } -- Either a singleton ParStmt, + -- or a reversed list of Stmts + : pquals1 { case unLoc $1 of + [qs] -> L1 qs + qss -> L1 [L1 (ParStmt stmtss)] + where + stmtss = [ (reverse qs, undefined) + | qs <- qss ] + } + +pquals1 :: { Located [[LStmt RdrName]] } + : pquals1 '|' quals { LL (unLoc $3 : unLoc $1) } + | '|' quals { L (getLoc $2) [unLoc $2] } + +quals :: { Located [LStmt RdrName] } + : quals ',' qual { LL ($3 : unLoc $1) } + | qual { L1 [$1] } + +----------------------------------------------------------------------------- +-- Parallel array expressions + +-- The rules below are little bit contorted; see the list case for details. +-- Note that, in contrast to lists, we only have finite arithmetic sequences. +-- Moreover, we allow explicit arrays with no element (represented by the nil +-- constructor in the list case). + +parr :: { LHsExpr RdrName } + : { noLoc (ExplicitPArr placeHolderType []) } + | exp { L1 $ ExplicitPArr placeHolderType [$1] } + | lexps { L1 $ ExplicitPArr placeHolderType + (reverse (unLoc $1)) } + | exp '..' exp { LL $ PArrSeq noPostTcExpr (FromTo $1 $3) } + | exp ',' exp '..' exp { LL $ PArrSeq noPostTcExpr (FromThenTo $1 $3 $5) } + | exp pquals { sL (comb2 $1 $>) $ mkHsDo PArrComp (reverse (unLoc $2)) $1 } + +-- We are reusing `lexps' and `pquals' from the list case. 
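For reference, the concrete syntax those 'list'/'pquals' rules accept, in a small sketch (the parallel form is what is nowadays spelt ParallelListComp):

{-# LANGUAGE ParallelListComp #-}
module ListForms where

squares :: [Int]
squares = [ x * x | x <- [1 .. 10] ]        -- texp '..' exp        (FromTo)

odds :: [Int]
odds = take 5 [1, 3 ..]                     -- texp ',' exp '..'    (FromThen)

zipped :: [Int]
zipped = [ x + y | x <- [1, 3 ..]           -- one ParStmt per '|' group
                 | y <- [0 .. 4] ]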
+ +----------------------------------------------------------------------------- +-- Case alternatives + +altslist :: { Located [LMatch RdrName] } + : '{' alts '}' { LL (reverse (unLoc $2)) } + | vocurly alts close { L (getLoc $2) (reverse (unLoc $2)) } + +alts :: { Located [LMatch RdrName] } + : alts1 { L1 (unLoc $1) } + | ';' alts { LL (unLoc $2) } + +alts1 :: { Located [LMatch RdrName] } + : alts1 ';' alt { LL ($3 : unLoc $1) } + | alts1 ';' { LL (unLoc $1) } + | alt { L1 [$1] } + +alt :: { LMatch RdrName } + : infixexp opt_sig alt_rhs {% checkPattern $1 >>= \p -> + return (LL (Match [p] $2 (unLoc $3))) } + +alt_rhs :: { Located (GRHSs RdrName) } + : ralt wherebinds { LL (GRHSs (unLoc $1) (unLoc $2)) } + +ralt :: { Located [LGRHS RdrName] } + : '->' exp { LL (unguardedRHS $2) } + | gdpats { L1 (reverse (unLoc $1)) } + +gdpats :: { Located [LGRHS RdrName] } + : gdpats gdpat { LL ($2 : unLoc $1) } + | gdpat { L1 [$1] } + +gdpat :: { LGRHS RdrName } + : '|' quals '->' exp { sL (comb2 $1 $>) $ GRHS (reverse (unLoc $2)) $4 } + +----------------------------------------------------------------------------- +-- Statement sequences + +stmtlist :: { Located [LStmt RdrName] } + : '{' stmts '}' { LL (unLoc $2) } + | vocurly stmts close { $2 } + +-- do { ;; s ; s ; ; s ;; } +-- The last Stmt should be an expression, but that's hard to enforce +-- here, because we need too much lookahead if we see do { e ; } +-- So we use ExprStmts throughout, and switch the last one over +-- in ParseUtils.checkDo instead +stmts :: { Located [LStmt RdrName] } + : stmt stmts_help { LL ($1 : unLoc $2) } + | ';' stmts { LL (unLoc $2) } + | {- empty -} { noLoc [] } + +stmts_help :: { Located [LStmt RdrName] } -- might be empty + : ';' stmts { LL (unLoc $2) } + | {- empty -} { noLoc [] } + +-- For typing stmts at the GHCi prompt, where +-- the input may consist of just comments. 
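As a quick illustration of the statement forms the 'stmt'/'qual' rules above distinguish (everything is an ExprStmt to the parser; checkDo later insists the last one is an expression), a small sketch:

example :: IO Int
example = do
  line <- getLine              -- exp '<-' exp: the lhs is re-checked as a pattern
  let n = length line          -- 'let' binds
  print n                      -- plain expression statement
  return n                     -- last statement: switched over in checkDo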
+maybe_stmt :: { Maybe (LStmt RdrName) } + : stmt { Just $1 } + | {- nothing -} { Nothing } + +stmt :: { LStmt RdrName } + : qual { $1 } + | infixexp '->' exp {% checkPattern $3 >>= \p -> + return (LL $ mkBindStmt p $1) } + | 'rec' stmtlist { LL $ mkRecStmt (unLoc $2) } + +qual :: { LStmt RdrName } + : exp '<-' exp {% checkPattern $1 >>= \p -> + return (LL $ mkBindStmt p $3) } + | exp { L1 $ mkExprStmt $1 } + | 'let' binds { LL $ LetStmt (unLoc $2) } + +----------------------------------------------------------------------------- +-- Record Field Update/Construction + +fbinds :: { HsRecordBinds RdrName } + : fbinds1 { $1 } + | {- empty -} { [] } + +fbinds1 :: { HsRecordBinds RdrName } + : fbinds1 ',' fbind { $3 : $1 } + | fbind { [$1] } + +fbind :: { (Located RdrName, LHsExpr RdrName) } + : qvar '=' exp { ($1,$3) } + +----------------------------------------------------------------------------- +-- Implicit Parameter Bindings + +dbinds :: { Located [LIPBind RdrName] } + : dbinds ';' dbind { LL ($3 : unLoc $1) } + | dbinds ';' { LL (unLoc $1) } + | dbind { L1 [$1] } +-- | {- empty -} { [] } + +dbind :: { LIPBind RdrName } +dbind : ipvar '=' exp { LL (IPBind (unLoc $1) $3) } + +ipvar :: { Located (IPName RdrName) } + : IPDUPVARID { L1 (Dupable (mkUnqual varName (getIPDUPVARID $1))) } + | IPSPLITVARID { L1 (Linear (mkUnqual varName (getIPSPLITVARID $1))) } + +----------------------------------------------------------------------------- +-- Deprecations + +depreclist :: { Located [RdrName] } +depreclist : deprec_var { L1 [unLoc $1] } + | deprec_var ',' depreclist { LL (unLoc $1 : unLoc $3) } + +deprec_var :: { Located RdrName } +deprec_var : var { $1 } + | con { $1 } + +----------------------------------------- +-- Data constructors +qcon :: { Located RdrName } + : qconid { $1 } + | '(' qconsym ')' { LL (unLoc $2) } + | sysdcon { L1 $ nameRdrName (dataConName (unLoc $1)) } +-- The case of '[:' ':]' is part of the production `parr' + +con :: { Located RdrName } + : conid { $1 } + | '(' consym ')' { LL (unLoc $2) } + | sysdcon { L1 $ nameRdrName (dataConName (unLoc $1)) } + +sysdcon :: { Located DataCon } -- Wired in data constructors + : '(' ')' { LL unitDataCon } + | '(' commas ')' { LL $ tupleCon Boxed $2 } + | '[' ']' { LL nilDataCon } + +conop :: { Located RdrName } + : consym { $1 } + | '`' conid '`' { LL (unLoc $2) } + +qconop :: { Located RdrName } + : qconsym { $1 } + | '`' qconid '`' { LL (unLoc $2) } + +----------------------------------------------------------------------------- +-- Type constructors + +gtycon :: { Located RdrName } -- A "general" qualified tycon + : oqtycon { $1 } + | '(' ')' { LL $ getRdrName unitTyCon } + | '(' commas ')' { LL $ getRdrName (tupleTyCon Boxed $2) } + | '(' '->' ')' { LL $ getRdrName funTyCon } + | '[' ']' { LL $ listTyCon_RDR } + | '[:' ':]' { LL $ parrTyCon_RDR } + +oqtycon :: { Located RdrName } -- An "ordinary" qualified tycon + : qtycon { $1 } + | '(' qtyconsym ')' { LL (unLoc $2) } + +qtyconop :: { Located RdrName } -- Qualified or unqualified + : qtyconsym { $1 } + | '`' qtycon '`' { LL (unLoc $2) } + +qtycon :: { Located RdrName } -- Qualified or unqualified + : QCONID { L1 $! mkQual tcClsName (getQCONID $1) } + | tycon { $1 } + +tycon :: { Located RdrName } -- Unqualified + : CONID { L1 $! mkUnqual tcClsName (getCONID $1) } + +qtyconsym :: { Located RdrName } + : QCONSYM { L1 $! mkQual tcClsName (getQCONSYM $1) } + | tyconsym { $1 } + +tyconsym :: { Located RdrName } + : CONSYM { L1 $! 
mkUnqual tcClsName (getCONSYM $1) } + +----------------------------------------------------------------------------- +-- Operators + +op :: { Located RdrName } -- used in infix decls + : varop { $1 } + | conop { $1 } + +varop :: { Located RdrName } + : varsym { $1 } + | '`' varid '`' { LL (unLoc $2) } + +qop :: { LHsExpr RdrName } -- used in sections + : qvarop { L1 $ HsVar (unLoc $1) } + | qconop { L1 $ HsVar (unLoc $1) } + +qopm :: { LHsExpr RdrName } -- used in sections + : qvaropm { L1 $ HsVar (unLoc $1) } + | qconop { L1 $ HsVar (unLoc $1) } + +qvarop :: { Located RdrName } + : qvarsym { $1 } + | '`' qvarid '`' { LL (unLoc $2) } + +qvaropm :: { Located RdrName } + : qvarsym_no_minus { $1 } + | '`' qvarid '`' { LL (unLoc $2) } + +----------------------------------------------------------------------------- +-- Type variables + +tyvar :: { Located RdrName } +tyvar : tyvarid { $1 } + | '(' tyvarsym ')' { LL (unLoc $2) } + +tyvarop :: { Located RdrName } +tyvarop : '`' tyvarid '`' { LL (unLoc $2) } + | tyvarsym { $1 } + +tyvarid :: { Located RdrName } + : VARID { L1 $! mkUnqual tvName (getVARID $1) } + | special_id { L1 $! mkUnqual tvName (unLoc $1) } + | 'unsafe' { L1 $! mkUnqual tvName FSLIT("unsafe") } + | 'safe' { L1 $! mkUnqual tvName FSLIT("safe") } + | 'threadsafe' { L1 $! mkUnqual tvName FSLIT("threadsafe") } + +tyvarsym :: { Located RdrName } +-- Does not include "!", because that is used for strictness marks +-- or ".", because that separates the quantified type vars from the rest +-- or "*", because that's used for kinds +tyvarsym : VARSYM { L1 $! mkUnqual tvName (getVARSYM $1) } + +----------------------------------------------------------------------------- +-- Variables + +var :: { Located RdrName } + : varid { $1 } + | '(' varsym ')' { LL (unLoc $2) } + +qvar :: { Located RdrName } + : qvarid { $1 } + | '(' varsym ')' { LL (unLoc $2) } + | '(' qvarsym1 ')' { LL (unLoc $2) } +-- We've inlined qvarsym here so that the decision about +-- whether it's a qvar or a var can be postponed until +-- *after* we see the close paren. + +qvarid :: { Located RdrName } + : varid { $1 } + | QVARID { L1 $ mkQual varName (getQVARID $1) } + +varid :: { Located RdrName } + : varid_no_unsafe { $1 } + | 'unsafe' { L1 $! mkUnqual varName FSLIT("unsafe") } + | 'safe' { L1 $! mkUnqual varName FSLIT("safe") } + | 'threadsafe' { L1 $! mkUnqual varName FSLIT("threadsafe") } + +varid_no_unsafe :: { Located RdrName } + : VARID { L1 $! mkUnqual varName (getVARID $1) } + | special_id { L1 $! mkUnqual varName (unLoc $1) } + | 'forall' { L1 $! mkUnqual varName FSLIT("forall") } + +qvarsym :: { Located RdrName } + : varsym { $1 } + | qvarsym1 { $1 } + +qvarsym_no_minus :: { Located RdrName } + : varsym_no_minus { $1 } + | qvarsym1 { $1 } + +qvarsym1 :: { Located RdrName } +qvarsym1 : QVARSYM { L1 $ mkQual varName (getQVARSYM $1) } + +varsym :: { Located RdrName } + : varsym_no_minus { $1 } + | '-' { L1 $ mkUnqual varName FSLIT("-") } + +varsym_no_minus :: { Located RdrName } -- varsym not including '-' + : VARSYM { L1 $ mkUnqual varName (getVARSYM $1) } + | special_sym { L1 $ mkUnqual varName (unLoc $1) } + + +-- These special_ids are treated as keywords in various places, +-- but as ordinary ids elsewhere. 
'special_id' collects all these +-- except 'unsafe' and 'forall' whose treatment differs depending on context +special_id :: { Located FastString } +special_id + : 'as' { L1 FSLIT("as") } + | 'qualified' { L1 FSLIT("qualified") } + | 'hiding' { L1 FSLIT("hiding") } + | 'export' { L1 FSLIT("export") } + | 'label' { L1 FSLIT("label") } + | 'dynamic' { L1 FSLIT("dynamic") } + | 'stdcall' { L1 FSLIT("stdcall") } + | 'ccall' { L1 FSLIT("ccall") } + +special_sym :: { Located FastString } +special_sym : '!' { L1 FSLIT("!") } + | '.' { L1 FSLIT(".") } + | '*' { L1 FSLIT("*") } + +----------------------------------------------------------------------------- +-- Data constructors + +qconid :: { Located RdrName } -- Qualified or unqualified + : conid { $1 } + | QCONID { L1 $ mkQual dataName (getQCONID $1) } + +conid :: { Located RdrName } + : CONID { L1 $ mkUnqual dataName (getCONID $1) } + +qconsym :: { Located RdrName } -- Qualified or unqualified + : consym { $1 } + | QCONSYM { L1 $ mkQual dataName (getQCONSYM $1) } + +consym :: { Located RdrName } + : CONSYM { L1 $ mkUnqual dataName (getCONSYM $1) } + + -- ':' means only list cons + | ':' { L1 $ consDataCon_RDR } + + +----------------------------------------------------------------------------- +-- Literals + +literal :: { Located HsLit } + : CHAR { L1 $ HsChar $ getCHAR $1 } + | STRING { L1 $ HsString $ getSTRING $1 } + | PRIMINTEGER { L1 $ HsIntPrim $ getPRIMINTEGER $1 } + | PRIMCHAR { L1 $ HsCharPrim $ getPRIMCHAR $1 } + | PRIMSTRING { L1 $ HsStringPrim $ getPRIMSTRING $1 } + | PRIMFLOAT { L1 $ HsFloatPrim $ getPRIMFLOAT $1 } + | PRIMDOUBLE { L1 $ HsDoublePrim $ getPRIMDOUBLE $1 } + +----------------------------------------------------------------------------- +-- Layout + +close :: { () } + : vccurly { () } -- context popped in lexer. 
+ | error {% popContext } + +----------------------------------------------------------------------------- +-- Miscellaneous (mostly renamings) + +modid :: { Located Module } + : CONID { L1 $ mkModuleFS (getCONID $1) } + | QCONID { L1 $ let (mod,c) = getQCONID $1 in + mkModuleFS + (mkFastString + (unpackFS mod ++ '.':unpackFS c)) + } + +commas :: { Int } + : commas ',' { $1 + 1 } + | ',' { 2 } + +----------------------------------------------------------------------------- + +{ +happyError :: P a +happyError = srcParseFail + +getVARID (L _ (ITvarid x)) = x +getCONID (L _ (ITconid x)) = x +getVARSYM (L _ (ITvarsym x)) = x +getCONSYM (L _ (ITconsym x)) = x +getQVARID (L _ (ITqvarid x)) = x +getQCONID (L _ (ITqconid x)) = x +getQVARSYM (L _ (ITqvarsym x)) = x +getQCONSYM (L _ (ITqconsym x)) = x +getIPDUPVARID (L _ (ITdupipvarid x)) = x +getIPSPLITVARID (L _ (ITsplitipvarid x)) = x +getCHAR (L _ (ITchar x)) = x +getSTRING (L _ (ITstring x)) = x +getINTEGER (L _ (ITinteger x)) = x +getRATIONAL (L _ (ITrational x)) = x +getPRIMCHAR (L _ (ITprimchar x)) = x +getPRIMSTRING (L _ (ITprimstring x)) = x +getPRIMINTEGER (L _ (ITprimint x)) = x +getPRIMFLOAT (L _ (ITprimfloat x)) = x +getPRIMDOUBLE (L _ (ITprimdouble x)) = x +getTH_ID_SPLICE (L _ (ITidEscape x)) = x +getINLINE (L _ (ITinline_prag b)) = b +getSPEC_INLINE (L _ (ITspec_inline_prag b)) = b + +-- Utilities for combining source spans +comb2 :: Located a -> Located b -> SrcSpan +comb2 = combineLocs + +comb3 :: Located a -> Located b -> Located c -> SrcSpan +comb3 a b c = combineSrcSpans (getLoc a) (combineSrcSpans (getLoc b) (getLoc c)) + +comb4 :: Located a -> Located b -> Located c -> Located d -> SrcSpan +comb4 a b c d = combineSrcSpans (getLoc a) $ combineSrcSpans (getLoc b) $ + combineSrcSpans (getLoc c) (getLoc d) + +-- strict constructor version: +{-# INLINE sL #-} +sL :: SrcSpan -> a -> Located a +sL span a = span `seq` L span a + +-- Make a source location for the file. We're a bit lazy here and just +-- make a point SrcSpan at line 1, column 0. Strictly speaking we should +-- try to find the span of the whole file (ToDo). +fileSrcSpan :: P SrcSpan +fileSrcSpan = do + l <- getSrcLoc; + let loc = mkSrcLoc (srcLocFile l) 1 0; + return (mkSrcSpan loc loc) +} diff --git a/compiler/parser/ParserCore.y b/compiler/parser/ParserCore.y new file mode 100644 index 0000000000..3210583f96 --- /dev/null +++ b/compiler/parser/ParserCore.y @@ -0,0 +1,339 @@ +{ +module ParserCore ( parseCore ) where + +import IfaceSyn +import ForeignCall +import RdrHsSyn +import HsSyn +import RdrName +import OccName +import Kind( Kind(..) ) +import Name( nameOccName, nameModule ) +import Module +import ParserCoreUtils +import LexCore +import Literal +import SrcLoc +import TysPrim( wordPrimTyCon, intPrimTyCon, charPrimTyCon, + floatPrimTyCon, doublePrimTyCon, addrPrimTyCon ) +import TyCon ( TyCon, tyConName ) +import FastString +import Outputable +import Char + +#include "../HsVersions.h" + +} + +%name parseCore +%tokentype { Token } + +%token + '%module' { TKmodule } + '%data' { TKdata } + '%newtype' { TKnewtype } + '%forall' { TKforall } + '%rec' { TKrec } + '%let' { TKlet } + '%in' { TKin } + '%case' { TKcase } + '%of' { TKof } + '%coerce' { TKcoerce } + '%note' { TKnote } + '%external' { TKexternal } + '%_' { TKwild } + '(' { TKoparen } + ')' { TKcparen } + '{' { TKobrace } + '}' { TKcbrace } + '#' { TKhash} + '=' { TKeq } + '::' { TKcoloncolon } + '*' { TKstar } + '->' { TKrarrow } + '\\' { TKlambda} + '@' { TKat } + '.' { TKdot } + '?' 
{ TKquestion} + ';' { TKsemicolon } + NAME { TKname $$ } + CNAME { TKcname $$ } + INTEGER { TKinteger $$ } + RATIONAL { TKrational $$ } + STRING { TKstring $$ } + CHAR { TKchar $$ } + +%monad { P } { thenP } { returnP } +%lexer { lexer } { TKEOF } + +%% + +module :: { HsExtCore RdrName } + : '%module' modid tdefs vdefgs { HsExtCore $2 $3 $4 } + +modid :: { Module } + : CNAME { mkModuleFS (mkFastString $1) } + +------------------------------------------------------------- +-- Type and newtype declarations are in HsSyn syntax + +tdefs :: { [TyClDecl RdrName] } + : {- empty -} {[]} + | tdef ';' tdefs {$1:$3} + +tdef :: { TyClDecl RdrName } + : '%data' q_tc_name tv_bndrs '=' '{' cons1 '}' + { mkTyData DataType (noLoc [], noLoc (ifaceExtRdrName $2), map toHsTvBndr $3) Nothing $6 Nothing } + | '%newtype' q_tc_name tv_bndrs trep + { let tc_rdr = ifaceExtRdrName $2 in + mkTyData NewType (noLoc [], noLoc tc_rdr, map toHsTvBndr $3) Nothing ($4 (rdrNameOcc tc_rdr)) Nothing } + +-- For a newtype we have to invent a fake data constructor name +-- It doesn't matter what it is, because it won't be used +trep :: { OccName -> [LConDecl RdrName] } + : {- empty -} { (\ tc_occ -> []) } + | '=' ty { (\ tc_occ -> let { dc_name = mkRdrUnqual (setOccNameSpace dataName tc_occ) ; + con_info = PrefixCon [toHsType $2] } + in [noLoc $ ConDecl (noLoc dc_name) Explicit [] + (noLoc []) con_info ResTyH98]) } + +cons1 :: { [LConDecl RdrName] } + : con { [$1] } + | con ';' cons1 { $1:$3 } + +con :: { LConDecl RdrName } + : d_pat_occ attv_bndrs hs_atys + { noLoc $ ConDecl (noLoc (mkRdrUnqual $1)) Explicit $2 (noLoc []) (PrefixCon $3) ResTyH98} + | d_pat_occ '::' ty + -- XXX - autrijus - $3 needs to be split into argument and return types! + -- also not sure whether the [] below (quantified vars) appears. + -- also the "PrefixCon []" is wrong. + -- also we want to munge $3 somehow. + -- extractWhatEver to unpack ty into the parts to ConDecl + -- XXX - define it somewhere in RdrHsSyn + { noLoc $ ConDecl (noLoc (mkRdrUnqual $1)) Explicit [] (noLoc []) (PrefixCon []) (undefined $3) } + +attv_bndrs :: { [LHsTyVarBndr RdrName] } + : {- empty -} { [] } + | '@' tv_bndr attv_bndrs { toHsTvBndr $2 : $3 } + +hs_atys :: { [LHsType RdrName] } + : atys { map toHsType $1 } + + +--------------------------------------- +-- Types +--------------------------------------- + +atys :: { [IfaceType] } + : {- empty -} { [] } + | aty atys { $1:$2 } + +aty :: { IfaceType } + : tv_occ { IfaceTyVar $1 } + | q_tc_name { IfaceTyConApp (IfaceTc $1) [] } + | '(' ty ')' { $2 } + +bty :: { IfaceType } + : tv_occ atys { foldl IfaceAppTy (IfaceTyVar $1) $2 } + | q_tc_name atys { IfaceTyConApp (IfaceTc $1) $2 } + | '(' ty ')' { $2 } + +ty :: { IfaceType } + : bty { $1 } + | bty '->' ty { IfaceFunTy $1 $3 } + | '%forall' tv_bndrs '.' 
ty { foldr IfaceForAllTy $4 $2 } + +---------------------------------------------- +-- Bindings are in Iface syntax + +vdefgs :: { [IfaceBinding] } + : {- empty -} { [] } + | let_bind ';' vdefgs { $1 : $3 } + +let_bind :: { IfaceBinding } + : '%rec' '{' vdefs1 '}' { IfaceRec $3 } + | vdef { let (b,r) = $1 + in IfaceNonRec b r } + +vdefs1 :: { [(IfaceIdBndr, IfaceExpr)] } + : vdef { [$1] } + | vdef ';' vdefs1 { $1:$3 } + +vdef :: { (IfaceIdBndr, IfaceExpr) } + : qd_occ '::' ty '=' exp { (($1, $3), $5) } + -- NB: qd_occ includes data constructors, because + -- we allow data-constructor wrappers at top level + -- But we discard the module name, because it must be the + -- same as the module being compiled, and Iface syntax only + -- has OccNames in binding positions + +qd_occ :: { OccName } + : var_occ { $1 } + | d_occ { $1 } + +--------------------------------------- +-- Binders +bndr :: { IfaceBndr } + : '@' tv_bndr { IfaceTvBndr $2 } + | id_bndr { IfaceIdBndr $1 } + +bndrs :: { [IfaceBndr] } + : bndr { [$1] } + | bndr bndrs { $1:$2 } + +id_bndr :: { IfaceIdBndr } + : '(' var_occ '::' ty ')' { ($2,$4) } + +id_bndrs :: { [IfaceIdBndr] } + : {-empty -} { [] } + | id_bndr id_bndrs { $1:$2 } + +tv_bndr :: { IfaceTvBndr } + : tv_occ { ($1, LiftedTypeKind) } + | '(' tv_occ '::' akind ')' { ($2, $4) } + +tv_bndrs :: { [IfaceTvBndr] } + : {- empty -} { [] } + | tv_bndr tv_bndrs { $1:$2 } + +akind :: { IfaceKind } + : '*' { LiftedTypeKind } + | '#' { UnliftedTypeKind } + | '?' { OpenTypeKind } + | '(' kind ')' { $2 } + +kind :: { IfaceKind } + : akind { $1 } + | akind '->' kind { FunKind $1 $3 } + +----------------------------------------- +-- Expressions + +aexp :: { IfaceExpr } + : var_occ { IfaceLcl $1 } + | modid '.' qd_occ { IfaceExt (ExtPkg $1 $3) } + | lit { IfaceLit $1 } + | '(' exp ')' { $2 } + +fexp :: { IfaceExpr } + : fexp aexp { IfaceApp $1 $2 } + | fexp '@' aty { IfaceApp $1 (IfaceType $3) } + | aexp { $1 } + +exp :: { IfaceExpr } + : fexp { $1 } + | '\\' bndrs '->' exp { foldr IfaceLam $4 $2 } + | '%let' let_bind '%in' exp { IfaceLet $2 $4 } +-- gaw 2004 + | '%case' '(' ty ')' aexp '%of' id_bndr + '{' alts1 '}' { IfaceCase $5 (fst $7) $3 $9 } + | '%coerce' aty exp { IfaceNote (IfaceCoerce $2) $3 } + | '%note' STRING exp + { case $2 of + --"SCC" -> IfaceNote (IfaceSCC "scc") $3 + "InlineCall" -> IfaceNote IfaceInlineCall $3 + "InlineMe" -> IfaceNote IfaceInlineMe $3 + } + | '%external' STRING aty { IfaceFCall (ForeignCall.CCall + (CCallSpec (StaticTarget (mkFastString $2)) + CCallConv (PlaySafe False))) + $3 } + +alts1 :: { [IfaceAlt] } + : alt { [$1] } + | alt ';' alts1 { $1:$3 } + +alt :: { IfaceAlt } + : modid '.' d_pat_occ bndrs '->' exp + { (IfaceDataAlt $3, map ifaceBndrName $4, $6) } + -- The external syntax currently includes the types of the + -- the args, but they aren't needed internally + -- Nor is the module qualifier + | lit '->' exp + { (IfaceLitAlt $1, [], $3) } + | '%_' '->' exp + { (IfaceDefault, [], $3) } + +lit :: { Literal } + : '(' INTEGER '::' aty ')' { convIntLit $2 $4 } + | '(' RATIONAL '::' aty ')' { convRatLit $2 $4 } + | '(' CHAR '::' aty ')' { MachChar $2 } + | '(' STRING '::' aty ')' { MachStr (mkFastString $2) } + +tv_occ :: { OccName } + : NAME { mkOccName tvName $1 } + +var_occ :: { OccName } + : NAME { mkVarOcc $1 } + + +-- Type constructor +q_tc_name :: { IfaceExtName } + : modid '.' 
CNAME { ExtPkg $1 (mkOccName tcName $3) } + +-- Data constructor in a pattern or data type declaration; use the dataName, +-- because that's what we expect in Core case patterns +d_pat_occ :: { OccName } + : CNAME { mkOccName dataName $1 } + +-- Data constructor occurrence in an expression; +-- use the varName because that's the worker Id +d_occ :: { OccName } + : CNAME { mkVarOcc $1 } + +{ + +ifaceBndrName (IfaceIdBndr (n,_)) = n +ifaceBndrName (IfaceTvBndr (n,_)) = n + +convIntLit :: Integer -> IfaceType -> Literal +convIntLit i (IfaceTyConApp tc []) + | tc `eqTc` intPrimTyCon = MachInt i + | tc `eqTc` wordPrimTyCon = MachWord i + | tc `eqTc` charPrimTyCon = MachChar (chr (fromInteger i)) + | tc `eqTc` addrPrimTyCon && i == 0 = MachNullAddr +convIntLit i aty + = pprPanic "Unknown integer literal type" (ppr aty) + +convRatLit :: Rational -> IfaceType -> Literal +convRatLit r (IfaceTyConApp tc []) + | tc `eqTc` floatPrimTyCon = MachFloat r + | tc `eqTc` doublePrimTyCon = MachDouble r +convRatLit i aty + = pprPanic "Unknown rational literal type" (ppr aty) + +eqTc :: IfaceTyCon -> TyCon -> Bool -- Ugh! +eqTc (IfaceTc (ExtPkg mod occ)) tycon + = mod == nameModule nm && occ == nameOccName nm + where + nm = tyConName tycon + +-- Tiresomely, we have to generate both HsTypes (in type/class decls) +-- and IfaceTypes (in Core expressions). So we parse them as IfaceTypes, +-- and convert to HsTypes here. But the IfaceTypes we can see here +-- are very limited (see the productions for 'ty', so the translation +-- isn't hard +toHsType :: IfaceType -> LHsType RdrName +toHsType (IfaceTyVar v) = noLoc $ HsTyVar (mkRdrUnqual v) +toHsType (IfaceAppTy t1 t2) = noLoc $ HsAppTy (toHsType t1) (toHsType t2) +toHsType (IfaceFunTy t1 t2) = noLoc $ HsFunTy (toHsType t1) (toHsType t2) +toHsType (IfaceTyConApp (IfaceTc tc) ts) = foldl mkHsAppTy (noLoc $ HsTyVar (ifaceExtRdrName tc)) (map toHsType ts) +toHsType (IfaceForAllTy tv t) = add_forall (toHsTvBndr tv) (toHsType t) + +toHsTvBndr :: IfaceTvBndr -> LHsTyVarBndr RdrName +toHsTvBndr (tv,k) = noLoc $ KindedTyVar (mkRdrUnqual tv) k + +ifaceExtRdrName :: IfaceExtName -> RdrName +ifaceExtRdrName (ExtPkg mod occ) = mkOrig mod occ +ifaceExtRdrName other = pprPanic "ParserCore.ifaceExtRdrName" (ppr other) + +add_forall tv (L _ (HsForAllTy exp tvs cxt t)) + = noLoc $ HsForAllTy exp (tv:tvs) cxt t +add_forall tv t + = noLoc $ HsForAllTy Explicit [tv] (noLoc []) t + +happyError :: P a +happyError s l = failP (show l ++ ": Parse error\n") (take 100 s) l +} + diff --git a/compiler/parser/ParserCoreUtils.hs b/compiler/parser/ParserCoreUtils.hs new file mode 100644 index 0000000000..a590fb5c93 --- /dev/null +++ b/compiler/parser/ParserCoreUtils.hs @@ -0,0 +1,72 @@ +module ParserCoreUtils where + +import IO + +data ParseResult a = OkP a | FailP String +type P a = String -> Int -> ParseResult a + +thenP :: P a -> (a -> P b) -> P b +m `thenP` k = \ s l -> + case m s l of + OkP a -> k a s l + FailP s -> FailP s + +returnP :: a -> P a +returnP m _ _ = OkP m + +failP :: String -> P a +failP s s' _ = FailP (s ++ ":" ++ s') + +getCoreModuleName :: FilePath -> IO String +getCoreModuleName fpath = + catch (do + h <- openFile fpath ReadMode + ls <- hGetContents h + let mo = findMod (words ls) + -- make sure we close up the file right away. 
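        -- (hGetContents is lazy, so force the module name *before* the
        -- hClose below; otherwise the handle could be closed before the
        -- relevant part of the file has actually been read.)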
+ (length mo) `seq` return () + hClose h + return mo) + (\ _ -> return "Main") + where + findMod [] = "Main" + findMod ("%module":m:_) = m + findMod (_:xs) = findMod xs + + +data Token = + TKmodule + | TKdata + | TKnewtype + | TKforall + | TKrec + | TKlet + | TKin + | TKcase + | TKof + | TKcoerce + | TKnote + | TKexternal + | TKwild + | TKoparen + | TKcparen + | TKobrace + | TKcbrace + | TKhash + | TKeq + | TKcoloncolon + | TKstar + | TKrarrow + | TKlambda + | TKat + | TKdot + | TKquestion + | TKsemicolon + | TKname String + | TKcname String + | TKinteger Integer + | TKrational Rational + | TKstring String + | TKchar Char + | TKEOF + diff --git a/compiler/parser/RdrHsSyn.lhs b/compiler/parser/RdrHsSyn.lhs new file mode 100644 index 0000000000..8d59e2b22c --- /dev/null +++ b/compiler/parser/RdrHsSyn.lhs @@ -0,0 +1,869 @@ +% +% (c) The University of Glasgow, 1996-2003 + +Functions over HsSyn specialised to RdrName. + +\begin{code} +module RdrHsSyn ( + extractHsTyRdrTyVars, + extractHsRhoRdrTyVars, extractGenericPatTyVars, + + mkHsOpApp, mkClassDecl, + mkHsNegApp, mkHsIntegral, mkHsFractional, + mkHsDo, mkHsSplice, + mkTyData, mkPrefixCon, mkRecCon, mkInlineSpec, + mkRecConstrOrUpdate, -- HsExp -> [HsFieldUpdate] -> P HsExp + + cvBindGroup, + cvBindsAndSigs, + cvTopDecls, + findSplice, mkGroup, + + -- Stuff to do with Foreign declarations + CallConv(..), + mkImport, -- CallConv -> Safety + -- -> (FastString, RdrName, RdrNameHsType) + -- -> P RdrNameHsDecl + mkExport, -- CallConv + -- -> (FastString, RdrName, RdrNameHsType) + -- -> P RdrNameHsDecl + mkExtName, -- RdrName -> CLabelString + mkGadtDecl, -- Located RdrName -> LHsType RdrName -> ConDecl RdrName + + -- Bunch of functions in the parser monad for + -- checking and constructing values + checkPrecP, -- Int -> P Int + checkContext, -- HsType -> P HsContext + checkPred, -- HsType -> P HsPred + checkTyClHdr, -- LHsContext RdrName -> LHsType RdrName -> P (LHsContext RdrName, Located RdrName, [LHsTyVarBndr RdrName]) + checkSynHdr, -- LHsType RdrName -> P (Located RdrName, [LHsTyVarBndr RdrName]) + checkInstType, -- HsType -> P HsType + checkPattern, -- HsExp -> P HsPat + checkPatterns, -- SrcLoc -> [HsExp] -> P [HsPat] + checkDo, -- [Stmt] -> P [Stmt] + checkMDo, -- [Stmt] -> P [Stmt] + checkValDef, -- (SrcLoc, HsExp, HsRhs, [HsDecl]) -> P HsDecl + checkValSig, -- (SrcLoc, HsExp, HsRhs, [HsDecl]) -> P HsDecl + parseError, -- String -> Pa + ) where + +#include "HsVersions.h" + +import HsSyn -- Lots of it +import RdrName ( RdrName, isRdrTyVar, mkUnqual, rdrNameOcc, + isRdrDataCon, isUnqual, getRdrName, isQual, + setRdrNameSpace ) +import BasicTypes ( maxPrecedence, Activation, InlineSpec(..), alwaysInlineSpec, neverInlineSpec ) +import Lexer ( P, failSpanMsgP, extension, bangPatEnabled ) +import TysWiredIn ( unitTyCon ) +import ForeignCall ( CCallConv, Safety, CCallTarget(..), CExportSpec(..), + DNCallSpec(..), DNKind(..), CLabelString ) +import OccName ( srcDataName, varName, isDataOcc, isTcOcc, + occNameString ) +import SrcLoc +import OrdList ( OrdList, fromOL ) +import Bag ( Bag, emptyBag, snocBag, consBag, foldrBag ) +import Outputable +import FastString +import Panic + +import List ( isSuffixOf, nubBy ) +\end{code} + + +%************************************************************************ +%* * +\subsection{A few functions over HsSyn at RdrName} +%* * +%************************************************************************ + +extractHsTyRdrNames finds the free variables of a HsType +It's used when making the for-alls explicit. 
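For example, the type  a -> [b] -> a  has free type variables a and b,
so making the for-alls explicit gives  forall a b. a -> [b] -> a.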
+ +\begin{code} +extractHsTyRdrTyVars :: LHsType RdrName -> [Located RdrName] +extractHsTyRdrTyVars ty = nubBy eqLocated (extract_lty ty []) + +extractHsRhoRdrTyVars :: LHsContext RdrName -> LHsType RdrName -> [Located RdrName] +-- This one takes the context and tau-part of a +-- sigma type and returns their free type variables +extractHsRhoRdrTyVars ctxt ty + = nubBy eqLocated $ extract_lctxt ctxt (extract_lty ty []) + +extract_lctxt ctxt acc = foldr (extract_pred . unLoc) acc (unLoc ctxt) + +extract_pred (HsClassP cls tys) acc = foldr extract_lty acc tys +extract_pred (HsIParam n ty) acc = extract_lty ty acc + +extract_lty (L loc ty) acc + = case ty of + HsTyVar tv -> extract_tv loc tv acc + HsBangTy _ ty -> extract_lty ty acc + HsAppTy ty1 ty2 -> extract_lty ty1 (extract_lty ty2 acc) + HsListTy ty -> extract_lty ty acc + HsPArrTy ty -> extract_lty ty acc + HsTupleTy _ tys -> foldr extract_lty acc tys + HsFunTy ty1 ty2 -> extract_lty ty1 (extract_lty ty2 acc) + HsPredTy p -> extract_pred p acc + HsOpTy ty1 (L loc tv) ty2 -> extract_tv loc tv (extract_lty ty1 (extract_lty ty2 acc)) + HsParTy ty -> extract_lty ty acc + HsNumTy num -> acc + HsSpliceTy _ -> acc -- Type splices mention no type variables + HsKindSig ty k -> extract_lty ty acc + HsForAllTy exp [] cx ty -> extract_lctxt cx (extract_lty ty acc) + HsForAllTy exp tvs cx ty -> acc ++ (filter ((`notElem` locals) . unLoc) $ + extract_lctxt cx (extract_lty ty [])) + where + locals = hsLTyVarNames tvs + +extract_tv :: SrcSpan -> RdrName -> [Located RdrName] -> [Located RdrName] +extract_tv loc tv acc | isRdrTyVar tv = L loc tv : acc + | otherwise = acc + +extractGenericPatTyVars :: LHsBinds RdrName -> [Located RdrName] +-- Get the type variables out of the type patterns in a bunch of +-- possibly-generic bindings in a class declaration +extractGenericPatTyVars binds + = nubBy eqLocated (foldrBag get [] binds) + where + get (L _ (FunBind { fun_matches = MatchGroup ms _ })) acc = foldr (get_m.unLoc) acc ms + get other acc = acc + + get_m (Match (L _ (TypePat ty) : _) _ _) acc = extract_lty ty acc + get_m other acc = acc +\end{code} + + +%************************************************************************ +%* * +\subsection{Construction functions for Rdr stuff} +%* * +%************************************************************************ + +mkClassDecl builds a RdrClassDecl, filling in the names for tycon and datacon +by deriving them from the name of the class. We fill in the names for the +tycon and datacon corresponding to the class, by deriving them from the +name of the class itself. This saves recording the names in the interface +file (which would be equally good). + +Similarly for mkConDecl, mkClassOpSig and default-method names. + + *** See "THE NAMING STORY" in HsDecls **** + +\begin{code} +mkClassDecl (cxt, cname, tyvars) fds sigs mbinds + = ClassDecl { tcdCtxt = cxt, tcdLName = cname, tcdTyVars = tyvars, + tcdFDs = fds, + tcdSigs = sigs, + tcdMeths = mbinds + } + +mkTyData new_or_data (context, tname, tyvars) ksig data_cons maybe_deriv + = TyData { tcdND = new_or_data, tcdCtxt = context, tcdLName = tname, + tcdTyVars = tyvars, tcdCons = data_cons, + tcdKindSig = ksig, tcdDerivs = maybe_deriv } +\end{code} + +\begin{code} +mkHsNegApp :: LHsExpr RdrName -> HsExpr RdrName +-- RdrName If the type checker sees (negate 3#) it will barf, because negate +-- can't take an unboxed arg. But that is exactly what it will see when +-- we write "-3#". So we have to do the negation right now! 
+mkHsNegApp (L loc e) = f e + where f (HsLit (HsIntPrim i)) = HsLit (HsIntPrim (-i)) + f (HsLit (HsFloatPrim i)) = HsLit (HsFloatPrim (-i)) + f (HsLit (HsDoublePrim i)) = HsLit (HsDoublePrim (-i)) + f expr = NegApp (L loc e) noSyntaxExpr +\end{code} + +%************************************************************************ +%* * +\subsection[cvBinds-etc]{Converting to @HsBinds@, etc.} +%* * +%************************************************************************ + +Function definitions are restructured here. Each is assumed to be recursive +initially, and non recursive definitions are discovered by the dependency +analyser. + + +\begin{code} +-- | Groups together bindings for a single function +cvTopDecls :: OrdList (LHsDecl RdrName) -> [LHsDecl RdrName] +cvTopDecls decls = go (fromOL decls) + where + go :: [LHsDecl RdrName] -> [LHsDecl RdrName] + go [] = [] + go (L l (ValD b) : ds) = L l' (ValD b') : go ds' + where (L l' b', ds') = getMonoBind (L l b) ds + go (d : ds) = d : go ds + +cvBindGroup :: OrdList (LHsDecl RdrName) -> HsValBinds RdrName +cvBindGroup binding + = case (cvBindsAndSigs binding) of { (mbs, sigs) -> + ValBindsIn mbs sigs + } + +cvBindsAndSigs :: OrdList (LHsDecl RdrName) + -> (Bag (LHsBind RdrName), [LSig RdrName]) +-- Input decls contain just value bindings and signatures +cvBindsAndSigs fb = go (fromOL fb) + where + go [] = (emptyBag, []) + go (L l (SigD s) : ds) = (bs, L l s : ss) + where (bs,ss) = go ds + go (L l (ValD b) : ds) = (b' `consBag` bs, ss) + where (b',ds') = getMonoBind (L l b) ds + (bs,ss) = go ds' + +----------------------------------------------------------------------------- +-- Group function bindings into equation groups + +getMonoBind :: LHsBind RdrName -> [LHsDecl RdrName] + -> (LHsBind RdrName, [LHsDecl RdrName]) +-- Suppose (b',ds') = getMonoBind b ds +-- ds is a *reversed* list of parsed bindings +-- b is a MonoBinds that has just been read off the front + +-- Then b' is the result of grouping more equations from ds that +-- belong with b into a single MonoBinds, and ds' is the depleted +-- list of parsed bindings. +-- +-- No AndMonoBinds or EmptyMonoBinds here; just single equations + +getMonoBind (L loc bind@(FunBind { fun_id = L _ f, fun_matches = MatchGroup mtchs _ })) binds + | has_args mtchs + = go mtchs loc binds + where + go mtchs1 loc1 (L loc2 (ValD (FunBind { fun_id = L _ f2, fun_matches = MatchGroup mtchs2 _ })) : binds) + | f == f2 = go (mtchs2++mtchs1) loc binds + where loc = combineSrcSpans loc1 loc2 + go mtchs1 loc binds + = (L loc (bind { fun_matches = mkMatchGroup (reverse mtchs1) }), binds) + -- Reverse the final matches, to get it back in the right order + +getMonoBind bind binds = (bind, binds) + +has_args ((L _ (Match args _ _)) : _) = not (null args) + -- Don't group together FunBinds if they have + -- no arguments. This is necessary now that variable bindings + -- with no arguments are now treated as FunBinds rather + -- than pattern bindings (tests/rename/should_fail/rnfail002). 
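-- For example, the successive equations
--      f 0 = []
--      f n = n : f (n-1)
-- arrive here as two separate FunBinds; getMonoBind merges them into a
-- single FunBind whose MatchGroup carries both equations.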
+\end{code} + +\begin{code} +findSplice :: [LHsDecl a] -> (HsGroup a, Maybe (SpliceDecl a, [LHsDecl a])) +findSplice ds = addl emptyRdrGroup ds + +mkGroup :: [LHsDecl a] -> HsGroup a +mkGroup ds = addImpDecls emptyRdrGroup ds + +addImpDecls :: HsGroup a -> [LHsDecl a] -> HsGroup a +-- The decls are imported, and should not have a splice +addImpDecls group decls = case addl group decls of + (group', Nothing) -> group' + other -> panic "addImpDecls" + +addl :: HsGroup a -> [LHsDecl a] -> (HsGroup a, Maybe (SpliceDecl a, [LHsDecl a])) + -- This stuff reverses the declarations (again) but it doesn't matter + +-- Base cases +addl gp [] = (gp, Nothing) +addl gp (L l d : ds) = add gp l d ds + + +add :: HsGroup a -> SrcSpan -> HsDecl a -> [LHsDecl a] + -> (HsGroup a, Maybe (SpliceDecl a, [LHsDecl a])) + +add gp l (SpliceD e) ds = (gp, Just (e, ds)) + +-- Class declarations: pull out the fixity signatures to the top +add gp@(HsGroup {hs_tyclds = ts, hs_fixds = fs}) l (TyClD d) ds + | isClassDecl d = + let fsigs = [ L l f | L l (FixSig f) <- tcdSigs d ] in + addl (gp { hs_tyclds = L l d : ts, hs_fixds = fsigs ++ fs }) ds + | otherwise = + addl (gp { hs_tyclds = L l d : ts }) ds + +-- Signatures: fixity sigs go a different place than all others +add gp@(HsGroup {hs_fixds = ts}) l (SigD (FixSig f)) ds + = addl (gp {hs_fixds = L l f : ts}) ds +add gp@(HsGroup {hs_valds = ts}) l (SigD d) ds + = addl (gp {hs_valds = add_sig (L l d) ts}) ds + +-- Value declarations: use add_bind +add gp@(HsGroup {hs_valds = ts}) l (ValD d) ds + = addl (gp { hs_valds = add_bind (L l d) ts }) ds + +-- The rest are routine +add gp@(HsGroup {hs_instds = ts}) l (InstD d) ds + = addl (gp { hs_instds = L l d : ts }) ds +add gp@(HsGroup {hs_defds = ts}) l (DefD d) ds + = addl (gp { hs_defds = L l d : ts }) ds +add gp@(HsGroup {hs_fords = ts}) l (ForD d) ds + = addl (gp { hs_fords = L l d : ts }) ds +add gp@(HsGroup {hs_depds = ts}) l (DeprecD d) ds + = addl (gp { hs_depds = L l d : ts }) ds +add gp@(HsGroup {hs_ruleds = ts}) l (RuleD d) ds + = addl (gp { hs_ruleds = L l d : ts }) ds + +add_bind b (ValBindsIn bs sigs) = ValBindsIn (bs `snocBag` b) sigs +add_sig s (ValBindsIn bs sigs) = ValBindsIn bs (s:sigs) +\end{code} + +%************************************************************************ +%* * +\subsection[PrefixToHS-utils]{Utilities for conversion} +%* * +%************************************************************************ + + +\begin{code} +----------------------------------------------------------------------------- +-- mkPrefixCon + +-- When parsing data declarations, we sometimes inadvertently parse +-- a constructor application as a type (eg. in data T a b = C a b `D` E a b) +-- This function splits up the type application, adds any pending +-- arguments, and converts the type constructor back into a data constructor. 
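-- For example, in  data T a b = C a b  the constructor C a b arrives as
-- the type  HsAppTy (HsAppTy (HsTyVar C) a) b;  'split' below peels off
-- the argument types and returns (C, PrefixCon [a,b]), with C moved into
-- the data-constructor name space.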
+ +mkPrefixCon :: LHsType RdrName -> [LBangType RdrName] + -> P (Located RdrName, HsConDetails RdrName (LBangType RdrName)) +mkPrefixCon ty tys + = split ty tys + where + split (L _ (HsAppTy t u)) ts = split t (u : ts) + split (L l (HsTyVar tc)) ts = do data_con <- tyConToDataCon l tc + return (data_con, PrefixCon ts) + split (L l _) _ = parseError l "parse error in data/newtype declaration" + +mkRecCon :: Located RdrName -> [([Located RdrName], LBangType RdrName)] + -> P (Located RdrName, HsConDetails RdrName (LBangType RdrName)) +mkRecCon (L loc con) fields + = do data_con <- tyConToDataCon loc con + return (data_con, RecCon [ (l,t) | (ls,t) <- fields, l <- ls ]) + +tyConToDataCon :: SrcSpan -> RdrName -> P (Located RdrName) +tyConToDataCon loc tc + | isTcOcc (rdrNameOcc tc) + = return (L loc (setRdrNameSpace tc srcDataName)) + | otherwise + = parseError loc (showSDoc (text "Not a constructor:" <+> quotes (ppr tc))) + +---------------------------------------------------------------------------- +-- Various Syntactic Checks + +checkInstType :: LHsType RdrName -> P (LHsType RdrName) +checkInstType (L l t) + = case t of + HsForAllTy exp tvs ctxt ty -> do + dict_ty <- checkDictTy ty + return (L l (HsForAllTy exp tvs ctxt dict_ty)) + + HsParTy ty -> checkInstType ty + + ty -> do dict_ty <- checkDictTy (L l ty) + return (L l (HsForAllTy Implicit [] (noLoc []) dict_ty)) + +checkTyVars :: [LHsType RdrName] -> P [LHsTyVarBndr RdrName] +checkTyVars tvs + = mapM chk tvs + where + -- Check that the name space is correct! + chk (L l (HsKindSig (L _ (HsTyVar tv)) k)) + | isRdrTyVar tv = return (L l (KindedTyVar tv k)) + chk (L l (HsTyVar tv)) + | isRdrTyVar tv = return (L l (UserTyVar tv)) + chk (L l other) + = parseError l "Type found where type variable expected" + +checkSynHdr :: LHsType RdrName -> P (Located RdrName, [LHsTyVarBndr RdrName]) +checkSynHdr ty = do { (_, tc, tvs) <- checkTyClHdr (noLoc []) ty + ; return (tc, tvs) } + +checkTyClHdr :: LHsContext RdrName -> LHsType RdrName + -> P (LHsContext RdrName, Located RdrName, [LHsTyVarBndr RdrName]) +-- The header of a type or class decl should look like +-- (C a, D b) => T a b +-- or T a b +-- or a + b +-- etc +checkTyClHdr (L l cxt) ty + = do (tc, tvs) <- gol ty [] + mapM_ chk_pred cxt + return (L l cxt, tc, tvs) + where + gol (L l ty) acc = go l ty acc + + go l (HsTyVar tc) acc + | not (isRdrTyVar tc) = checkTyVars acc >>= \ tvs -> + return (L l tc, tvs) + go l (HsOpTy t1 tc t2) acc = checkTyVars (t1:t2:acc) >>= \ tvs -> + return (tc, tvs) + go l (HsParTy ty) acc = gol ty acc + go l (HsAppTy t1 t2) acc = gol t1 (t2:acc) + go l other acc = parseError l "Malformed LHS to type of class declaration" + + -- The predicates in a type or class decl must all + -- be HsClassPs. They need not all be type variables, + -- even in Haskell 98. E.g. 
class (Monad m, Monad (t m)) => MonadT t m + chk_pred (L l (HsClassP _ args)) = return () + chk_pred (L l _) + = parseError l "Malformed context in type or class declaration" + + +checkContext :: LHsType RdrName -> P (LHsContext RdrName) +checkContext (L l t) + = check t + where + check (HsTupleTy _ ts) -- (Eq a, Ord b) shows up as a tuple type + = do ctx <- mapM checkPred ts + return (L l ctx) + + check (HsParTy ty) -- to be sure HsParTy doesn't get into the way + = check (unLoc ty) + + check (HsTyVar t) -- Empty context shows up as a unit type () + | t == getRdrName unitTyCon = return (L l []) + + check t + = do p <- checkPred (L l t) + return (L l [p]) + + +checkPred :: LHsType RdrName -> P (LHsPred RdrName) +-- Watch out.. in ...deriving( Show )... we use checkPred on +-- the list of partially applied predicates in the deriving, +-- so there can be zero args. +checkPred (L spn (HsPredTy (HsIParam n ty))) + = return (L spn (HsIParam n ty)) +checkPred (L spn ty) + = check spn ty [] + where + checkl (L l ty) args = check l ty args + + check _loc (HsTyVar t) args | not (isRdrTyVar t) + = return (L spn (HsClassP t args)) + check _loc (HsAppTy l r) args = checkl l (r:args) + check _loc (HsOpTy l (L loc tc) r) args = check loc (HsTyVar tc) (l:r:args) + check _loc (HsParTy t) args = checkl t args + check loc _ _ = parseError loc "malformed class assertion" + +checkDictTy :: LHsType RdrName -> P (LHsType RdrName) +checkDictTy (L spn ty) = check ty [] + where + check (HsTyVar t) args | not (isRdrTyVar t) + = return (L spn (HsPredTy (HsClassP t args))) + check (HsAppTy l r) args = check (unLoc l) (r:args) + check (HsParTy t) args = check (unLoc t) args + check _ _ = parseError spn "Malformed context in instance header" + +--------------------------------------------------------------------------- +-- Checking statements in a do-expression +-- We parse do { e1 ; e2 ; } +-- as [ExprStmt e1, ExprStmt e2] +-- checkDo (a) checks that the last thing is an ExprStmt +-- (b) returns it separately +-- same comments apply for mdo as well + +checkDo = checkDoMDo "a " "'do'" +checkMDo = checkDoMDo "an " "'mdo'" + +checkDoMDo :: String -> String -> SrcSpan -> [LStmt RdrName] -> P ([LStmt RdrName], LHsExpr RdrName) +checkDoMDo pre nm loc [] = parseError loc ("Empty " ++ nm ++ " construct") +checkDoMDo pre nm loc ss = do + check ss + where + check [L l (ExprStmt e _ _)] = return ([], e) + check [L l _] = parseError l ("The last statement in " ++ pre ++ nm ++ + " construct must be an expression") + check (s:ss) = do + (ss',e') <- check ss + return ((s:ss'),e') + +-- ------------------------------------------------------------------------- +-- Checking Patterns. + +-- We parse patterns as expressions and check for valid patterns below, +-- converting the expression into a pattern at the same time. + +checkPattern :: LHsExpr RdrName -> P (LPat RdrName) +checkPattern e = checkLPat e + +checkPatterns :: [LHsExpr RdrName] -> P [LPat RdrName] +checkPatterns es = mapM checkPattern es + +checkLPat :: LHsExpr RdrName -> P (LPat RdrName) +checkLPat e@(L l _) = checkPat l e [] + +checkPat :: SrcSpan -> LHsExpr RdrName -> [LPat RdrName] -> P (LPat RdrName) +checkPat loc (L l (HsVar c)) args + | isRdrDataCon c = return (L loc (ConPatIn (L l c) (PrefixCon args))) +checkPat loc e args -- OK to let this happen even if bang-patterns + -- are not enabled, because there is no valid + -- non-bang-pattern parse of (C ! 
e) + | Just (e', args') <- splitBang e + = do { args'' <- checkPatterns args' + ; checkPat loc e' (args'' ++ args) } +checkPat loc (L _ (HsApp f x)) args + = do { x <- checkLPat x; checkPat loc f (x:args) } +checkPat loc (L _ e) [] + = do { p <- checkAPat loc e; return (L loc p) } +checkPat loc pat _some_args + = patFail loc + +checkAPat loc e = case e of + EWildPat -> return (WildPat placeHolderType) + HsVar x | isQual x -> parseError loc ("Qualified variable in pattern: " + ++ showRdrName x) + | otherwise -> return (VarPat x) + HsLit l -> return (LitPat l) + + -- Overloaded numeric patterns (e.g. f 0 x = x) + -- Negation is recorded separately, so that the literal is zero or +ve + -- NB. Negative *primitive* literals are already handled by + -- RdrHsSyn.mkHsNegApp + HsOverLit pos_lit -> return (mkNPat pos_lit Nothing) + NegApp (L _ (HsOverLit pos_lit)) _ + -> return (mkNPat pos_lit (Just noSyntaxExpr)) + + SectionR (L _ (HsVar bang)) e + | bang == bang_RDR -> checkLPat e >>= (return . BangPat) + ELazyPat e -> checkLPat e >>= (return . LazyPat) + EAsPat n e -> checkLPat e >>= (return . AsPat n) + ExprWithTySig e t -> checkLPat e >>= \e -> + -- Pattern signatures are parsed as sigtypes, + -- but they aren't explicit forall points. Hence + -- we have to remove the implicit forall here. + let t' = case t of + L _ (HsForAllTy Implicit _ (L _ []) ty) -> ty + other -> other + in + return (SigPatIn e t') + + -- n+k patterns + OpApp (L nloc (HsVar n)) (L _ (HsVar plus)) _ + (L _ (HsOverLit lit@(HsIntegral _ _))) + | plus == plus_RDR + -> return (mkNPlusKPat (L nloc n) lit) + + OpApp l op fix r -> checkLPat l >>= \l -> + checkLPat r >>= \r -> + case op of + L cl (HsVar c) | isDataOcc (rdrNameOcc c) + -> return (ConPatIn (L cl c) (InfixCon l r)) + _ -> patFail loc + + HsPar e -> checkLPat e >>= (return . 
ParPat) + ExplicitList _ es -> mapM (\e -> checkLPat e) es >>= \ps -> + return (ListPat ps placeHolderType) + ExplicitPArr _ es -> mapM (\e -> checkLPat e) es >>= \ps -> + return (PArrPat ps placeHolderType) + + ExplicitTuple es b -> mapM (\e -> checkLPat e) es >>= \ps -> + return (TuplePat ps b placeHolderType) + + RecordCon c _ fs -> mapM checkPatField fs >>= \fs -> + return (ConPatIn c (RecCon fs)) +-- Generics + HsType ty -> return (TypePat ty) + _ -> patFail loc + +plus_RDR, bang_RDR :: RdrName +plus_RDR = mkUnqual varName FSLIT("+") -- Hack +bang_RDR = mkUnqual varName FSLIT("!") -- Hack + +checkPatField :: (Located RdrName, LHsExpr RdrName) -> P (Located RdrName, LPat RdrName) +checkPatField (n,e) = do + p <- checkLPat e + return (n,p) + +patFail loc = parseError loc "Parse error in pattern" + + +--------------------------------------------------------------------------- +-- Check Equation Syntax + +checkValDef :: LHsExpr RdrName + -> Maybe (LHsType RdrName) + -> Located (GRHSs RdrName) + -> P (HsBind RdrName) + +checkValDef lhs opt_sig grhss + = do { mb_fun <- isFunLhs lhs + ; case mb_fun of + Just (fun, is_infix, pats) -> checkFunBind (getLoc lhs) + fun is_infix pats opt_sig grhss + Nothing -> checkPatBind lhs grhss } + +checkFunBind lhs_loc fun is_infix pats opt_sig (L rhs_span grhss) + | isQual (unLoc fun) + = parseError (getLoc fun) ("Qualified name in function definition: " ++ + showRdrName (unLoc fun)) + | otherwise + = do ps <- checkPatterns pats + let match_span = combineSrcSpans lhs_loc rhs_span + matches = mkMatchGroup [L match_span (Match ps opt_sig grhss)] + return (FunBind { fun_id = fun, fun_infix = is_infix, fun_matches = matches, + fun_co_fn = idCoercion, bind_fvs = placeHolderNames }) + -- The span of the match covers the entire equation. + -- That isn't quite right, but it'll do for now. + +checkPatBind lhs (L _ grhss) + = do { lhs <- checkPattern lhs + ; return (PatBind lhs grhss placeHolderType placeHolderNames) } + +checkValSig + :: LHsExpr RdrName + -> LHsType RdrName + -> P (Sig RdrName) +checkValSig (L l (HsVar v)) ty + | isUnqual v && not (isDataOcc (rdrNameOcc v)) + = return (TypeSig (L l v) ty) +checkValSig (L l other) ty + = parseError l "Invalid type signature" + +mkGadtDecl + :: Located RdrName + -> LHsType RdrName -- assuming HsType + -> ConDecl RdrName +mkGadtDecl name (L _ (HsForAllTy _ qvars cxt ty)) = ConDecl + { con_name = name + , con_explicit = Implicit + , con_qvars = qvars + , con_cxt = cxt + , con_details = PrefixCon args + , con_res = ResTyGADT res + } + where + (args, res) = splitHsFunType ty +mkGadtDecl name ty = ConDecl + { con_name = name + , con_explicit = Implicit + , con_qvars = [] + , con_cxt = noLoc [] + , con_details = PrefixCon args + , con_res = ResTyGADT res + } + where + (args, res) = splitHsFunType ty + +-- A variable binding is parsed as a FunBind. + + + -- The parser left-associates, so there should + -- not be any OpApps inside the e's +splitBang :: LHsExpr RdrName -> Maybe (LHsExpr RdrName, [LHsExpr RdrName]) +-- Splits (f ! g a b) into (f, [(! 
g), a, g]) +splitBang (L loc (OpApp l_arg bang@(L loc' (HsVar op)) _ r_arg)) + | op == bang_RDR = Just (l_arg, L loc (SectionR bang arg1) : argns) + where + (arg1,argns) = split_bang r_arg [] + split_bang (L _ (HsApp f e)) es = split_bang f (e:es) + split_bang e es = (e,es) +splitBang other = Nothing + +isFunLhs :: LHsExpr RdrName + -> P (Maybe (Located RdrName, Bool, [LHsExpr RdrName])) +-- Just (fun, is_infix, arg_pats) if e is a function LHS +isFunLhs e = go e [] + where + go (L loc (HsVar f)) es + | not (isRdrDataCon f) = return (Just (L loc f, False, es)) + go (L _ (HsApp f e)) es = go f (e:es) + go (L _ (HsPar e)) es@(_:_) = go e es + go e@(L loc (OpApp l (L loc' (HsVar op)) fix r)) es + | Just (e',es') <- splitBang e + = do { bang_on <- extension bangPatEnabled + ; if bang_on then go e' (es' ++ es) + else return (Just (L loc' op, True, (l:r:es))) } + -- No bangs; behave just like the next case + | not (isRdrDataCon op) + = return (Just (L loc' op, True, (l:r:es))) + | otherwise + = do { mb_l <- go l es + ; case mb_l of + Just (op', True, j : k : es') + -> return (Just (op', True, j : op_app : es')) + where + op_app = L loc (OpApp k (L loc' (HsVar op)) fix r) + _ -> return Nothing } + go _ _ = return Nothing + +--------------------------------------------------------------------------- +-- Miscellaneous utilities + +checkPrecP :: Located Int -> P Int +checkPrecP (L l i) + | 0 <= i && i <= maxPrecedence = return i + | otherwise = parseError l "Precedence out of range" + +mkRecConstrOrUpdate + :: LHsExpr RdrName + -> SrcSpan + -> HsRecordBinds RdrName + -> P (HsExpr RdrName) + +mkRecConstrOrUpdate (L l (HsVar c)) loc fs | isRdrDataCon c + = return (RecordCon (L l c) noPostTcExpr fs) +mkRecConstrOrUpdate exp loc fs@(_:_) + = return (RecordUpd exp fs placeHolderType placeHolderType) +mkRecConstrOrUpdate _ loc [] + = parseError loc "Empty record update" + +mkInlineSpec :: Maybe Activation -> Bool -> InlineSpec +-- The Maybe is becuase the user can omit the activation spec (and usually does) +mkInlineSpec Nothing True = alwaysInlineSpec -- INLINE +mkInlineSpec Nothing False = neverInlineSpec -- NOINLINE +mkInlineSpec (Just act) inl = Inline act inl + + +----------------------------------------------------------------------------- +-- utilities for foreign declarations + +-- supported calling conventions +-- +data CallConv = CCall CCallConv -- ccall or stdcall + | DNCall -- .NET + +-- construct a foreign import declaration +-- +mkImport :: CallConv + -> Safety + -> (Located FastString, Located RdrName, LHsType RdrName) + -> P (HsDecl RdrName) +mkImport (CCall cconv) safety (entity, v, ty) = do + importSpec <- parseCImport entity cconv safety v + return (ForD (ForeignImport v ty importSpec False)) +mkImport (DNCall ) _ (entity, v, ty) = do + spec <- parseDImport entity + return $ ForD (ForeignImport v ty (DNImport spec) False) + +-- parse the entity string of a foreign import declaration for the `ccall' or +-- `stdcall' calling convention' +-- +parseCImport :: Located FastString + -> CCallConv + -> Safety + -> Located RdrName + -> P ForeignImport +parseCImport (L loc entity) cconv safety v + -- FIXME: we should allow white space around `dynamic' and `wrapper' -=chak + | entity == FSLIT ("dynamic") = + return $ CImport cconv safety nilFS nilFS (CFunction DynamicTarget) + | entity == FSLIT ("wrapper") = + return $ CImport cconv safety nilFS nilFS CWrapper + | otherwise = parse0 (unpackFS entity) + where + -- using the static keyword? 
+ parse0 (' ': rest) = parse0 rest + parse0 ('s':'t':'a':'t':'i':'c':rest) = parse1 rest + parse0 rest = parse1 rest + -- check for header file name + parse1 "" = parse4 "" nilFS False nilFS + parse1 (' ':rest) = parse1 rest + parse1 str@('&':_ ) = parse2 str nilFS + parse1 str@('[':_ ) = parse3 str nilFS False + parse1 str + | ".h" `isSuffixOf` first = parse2 rest (mkFastString first) + | otherwise = parse4 str nilFS False nilFS + where + (first, rest) = break (\c -> c == ' ' || c == '&' || c == '[') str + -- check for address operator (indicating a label import) + parse2 "" header = parse4 "" header False nilFS + parse2 (' ':rest) header = parse2 rest header + parse2 ('&':rest) header = parse3 rest header True + parse2 str@('[':_ ) header = parse3 str header False + parse2 str header = parse4 str header False nilFS + -- check for library object name + parse3 (' ':rest) header isLbl = parse3 rest header isLbl + parse3 ('[':rest) header isLbl = + case break (== ']') rest of + (lib, ']':rest) -> parse4 rest header isLbl (mkFastString lib) + _ -> parseError loc "Missing ']' in entity" + parse3 str header isLbl = parse4 str header isLbl nilFS + -- check for name of C function + parse4 "" header isLbl lib = build (mkExtName (unLoc v)) header isLbl lib + parse4 (' ':rest) header isLbl lib = parse4 rest header isLbl lib + parse4 str header isLbl lib + | all (== ' ') rest = build (mkFastString first) header isLbl lib + | otherwise = parseError loc "Malformed entity string" + where + (first, rest) = break (== ' ') str + -- + build cid header False lib = return $ + CImport cconv safety header lib (CFunction (StaticTarget cid)) + build cid header True lib = return $ + CImport cconv safety header lib (CLabel cid ) + +-- +-- Unravel a dotnet spec string. +-- +parseDImport :: Located FastString -> P DNCallSpec +parseDImport (L loc entity) = parse0 comps + where + comps = words (unpackFS entity) + + parse0 [] = d'oh + parse0 (x : xs) + | x == "static" = parse1 True xs + | otherwise = parse1 False (x:xs) + + parse1 _ [] = d'oh + parse1 isStatic (x:xs) + | x == "method" = parse2 isStatic DNMethod xs + | x == "field" = parse2 isStatic DNField xs + | x == "ctor" = parse2 isStatic DNConstructor xs + parse1 isStatic xs = parse2 isStatic DNMethod xs + + parse2 _ _ [] = d'oh + parse2 isStatic kind (('[':x):xs) = + case x of + [] -> d'oh + vs | last vs == ']' -> parse3 isStatic kind (init vs) xs + parse2 isStatic kind xs = parse3 isStatic kind "" xs + + parse3 isStatic kind assem [x] = + return (DNCallSpec isStatic kind assem x + -- these will be filled in once known. + (error "FFI-dotnet-args") + (error "FFI-dotnet-result")) + parse3 _ _ _ _ = d'oh + + d'oh = parseError loc "Malformed entity string" + +-- construct a foreign export declaration +-- +mkExport :: CallConv + -> (Located FastString, Located RdrName, LHsType RdrName) + -> P (HsDecl RdrName) +mkExport (CCall cconv) (L loc entity, v, ty) = return $ + ForD (ForeignExport v ty (CExport (CExportStatic entity' cconv)) False) + where + entity' | nullFS entity = mkExtName (unLoc v) + | otherwise = entity +mkExport DNCall (L loc entity, v, ty) = + parseError (getLoc v){-TODO: not quite right-} + "Foreign export is not yet supported for .NET" + +-- Supplying the ext_name in a foreign decl is optional; if it +-- isn't there, the Haskell name is assumed. Note that no transformation +-- of the Haskell name is then performed, so if you foreign export (++), +-- it's external name will be "++". Too bad; it's important because we don't +-- want z-encoding (e.g. 
names with z's in them shouldn't be doubled) +-- +mkExtName :: RdrName -> CLabelString +mkExtName rdrNm = mkFastString (occNameString (rdrNameOcc rdrNm)) +\end{code} + + +----------------------------------------------------------------------------- +-- Misc utils + +\begin{code} +showRdrName :: RdrName -> String +showRdrName r = showSDoc (ppr r) + +parseError :: SrcSpan -> String -> P a +parseError span s = failSpanMsgP span s +\end{code} diff --git a/compiler/parser/cutils.c b/compiler/parser/cutils.c new file mode 100644 index 0000000000..08832f298d --- /dev/null +++ b/compiler/parser/cutils.c @@ -0,0 +1,70 @@ +/* +These utility routines are used various +places in the GHC library. +*/ + +/* For GHC 4.08, we are relying on the fact that RtsFlags has + * compatible layout with the current version, because we're + * #including the current version of RtsFlags.h below. 4.08 didn't + * ship with its own RtsFlags.h, unfortunately. For later GHC + * versions, we #include the correct RtsFlags.h. + */ +#if __GLASGOW_HASKELL__ < 502 +#include "../includes/Rts.h" +#include "../includes/RtsFlags.h" +#else +#include "Rts.h" +#include "RtsFlags.h" +#endif + +#include "HsFFI.h" + +#include <string.h> + +#ifdef HAVE_UNISTD_H +#include <unistd.h> +#endif + +/* +Calling 'strlen' and 'memcpy' directly gives problems with GCC's inliner, +and causes gcc to require too many registers on x84 +*/ + +HsInt +ghc_strlen( HsAddr a ) +{ + return (strlen((char *)a)); +} + +HsInt +ghc_memcmp( HsAddr a1, HsAddr a2, HsInt len ) +{ + return (memcmp((char *)a1, a2, len)); +} + +HsInt +ghc_memcmp_off( HsAddr a1, HsInt i, HsAddr a2, HsInt len ) +{ + return (memcmp((char *)a1 + i, a2, len)); +} + +void +enableTimingStats( void ) /* called from the driver */ +{ +#if __GLASGOW_HASKELL__ >= 411 + RtsFlags.GcFlags.giveStats = ONELINE_GC_STATS; +#endif + /* ignored when bootstrapping with an older GHC */ +} + +void +setHeapSize( HsInt size ) +{ + RtsFlags.GcFlags.heapSizeSuggestion = size / BLOCK_SIZE; + if (RtsFlags.GcFlags.maxHeapSize != 0 && + RtsFlags.GcFlags.heapSizeSuggestion > RtsFlags.GcFlags.maxHeapSize) { + RtsFlags.GcFlags.maxHeapSize = RtsFlags.GcFlags.heapSizeSuggestion; + } +} + + diff --git a/compiler/parser/cutils.h b/compiler/parser/cutils.h new file mode 100644 index 0000000000..c7c1867ded --- /dev/null +++ b/compiler/parser/cutils.h @@ -0,0 +1,16 @@ +/* ----------------------------------------------------------------------------- + * + * Utility C functions. + * + * -------------------------------------------------------------------------- */ + +#include "HsFFI.h" + +// Out-of-line string functions, see PrimPacked.lhs +HsInt ghc_strlen( HsAddr a ); +HsInt ghc_memcmp( HsAddr a1, HsAddr a2, HsInt len ); +HsInt ghc_memcmp_off( HsAddr a1, HsInt i, HsAddr a2, HsInt len ); + + +void enableTimingStats( void ); +void setHeapSize( HsInt size ); diff --git a/compiler/parser/hschooks.c b/compiler/parser/hschooks.c new file mode 100644 index 0000000000..f3e7447a49 --- /dev/null +++ b/compiler/parser/hschooks.c @@ -0,0 +1,55 @@ +/* +These routines customise the error messages +for various bits of the RTS. They are linked +in instead of the defaults. +*/ + +/* For GHC 4.08, we are relying on the fact that RtsFlags has + * compatible layout with the current version, because we're + * #including the current version of RtsFlags.h below. 4.08 didn't + * ship with its own RtsFlags.h, unfortunately. For later GHC + * versions, we #include the correct RtsFlags.h. 
+ */ +#if __GLASGOW_HASKELL__ < 502 +#include "../includes/Rts.h" +#include "../includes/RtsFlags.h" +#else +#include "Rts.h" +#include "RtsFlags.h" +#endif + +#include "HsFFI.h" + +#include <string.h> + +#ifdef HAVE_UNISTD_H +#include <unistd.h> +#endif + +void +defaultsHook (void) +{ + RtsFlags.GcFlags.heapSizeSuggestion = 6*1024*1024 / BLOCK_SIZE; + RtsFlags.GcFlags.maxStkSize = 8*1024*1024 / sizeof(W_); +#if __GLASGOW_HASKELL__ >= 411 + /* GHC < 4.11 didn't have these */ + RtsFlags.GcFlags.giveStats = COLLECT_GC_STATS; + RtsFlags.GcFlags.statsFile = stderr; +#endif +} + +void +OutOfHeapHook (unsigned long request_size/* always zero these days */, + unsigned long heap_size) + /* both in bytes */ +{ + fprintf(stderr, "GHC's heap exhausted: current limit is %lu bytes;\nUse the `-M<size>' option to increase the total heap size.\n", + heap_size); +} + +void +StackOverflowHook (unsigned long stack_size) /* in bytes */ +{ + fprintf(stderr, "GHC stack-space overflow: current limit is %ld bytes.\nUse the `-K<size>' option to increase it.\n", stack_size); +} + diff --git a/compiler/parser/hschooks.h b/compiler/parser/hschooks.h new file mode 100644 index 0000000000..4ce1c0f93d --- /dev/null +++ b/compiler/parser/hschooks.h @@ -0,0 +1,9 @@ +/* ----------------------------------------------------------------------------- + * $Id: hschooks.h,v 1.4 2002/04/22 14:54:10 simonmar Exp $ + * + * Hooks into the RTS from the compiler. + * + * -------------------------------------------------------------------------- */ + +#include "HsFFI.h" + diff --git a/compiler/prelude/ForeignCall.lhs b/compiler/prelude/ForeignCall.lhs new file mode 100644 index 0000000000..2c90a7dc6e --- /dev/null +++ b/compiler/prelude/ForeignCall.lhs @@ -0,0 +1,423 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[Foreign]{Foreign calls} + +\begin{code} +module ForeignCall ( + ForeignCall(..), + Safety(..), playSafe, playThreadSafe, + + CExportSpec(..), CLabelString, isCLabelString, pprCLabelString, + CCallSpec(..), + CCallTarget(..), isDynamicTarget, + CCallConv(..), defaultCCallConv, ccallConvToInt, ccallConvAttribute, + + DNCallSpec(..), DNKind(..), DNType(..), + withDNTypes + ) where + +#include "HsVersions.h" + +import FastString ( FastString, unpackFS ) +import Char ( isAlphaNum ) +import Binary +import Outputable +\end{code} + + +%************************************************************************ +%* * +\subsubsection{Data types} +%* * +%************************************************************************ + +\begin{code} +data ForeignCall + = CCall CCallSpec + | DNCall DNCallSpec + deriving( Eq ) -- We compare them when seeing if an interface + -- has changed (for versioning purposes) + {-! derive: Binary !-} + +-- We may need more clues to distinguish foreign calls +-- but this simple printer will do for now +instance Outputable ForeignCall where + ppr (CCall cc) = ppr cc + ppr (DNCall dn) = ppr dn +\end{code} + + +\begin{code} +data Safety + = PlaySafe -- Might invoke Haskell GC, or do a call back, or + -- switch threads, etc. So make sure things are + -- tidy before the call + Bool -- => True, external function is also re-entrant. + -- [if supported, RTS arranges for the external call + -- to be executed by a separate OS thread, i.e., + -- _concurrently_ to the execution of other Haskell threads.] 
+ + | PlayRisky -- None of the above can happen; the call will return + -- without interacting with the runtime system at all + deriving( Eq, Show ) + -- Show used just for Show Lex.Token, I think + {-! derive: Binary !-} + +instance Outputable Safety where + ppr (PlaySafe False) = ptext SLIT("safe") + ppr (PlaySafe True) = ptext SLIT("threadsafe") + ppr PlayRisky = ptext SLIT("unsafe") + +playSafe :: Safety -> Bool +playSafe PlaySafe{} = True +playSafe PlayRisky = False + +playThreadSafe :: Safety -> Bool +playThreadSafe (PlaySafe x) = x +playThreadSafe _ = False +\end{code} + + +%************************************************************************ +%* * +\subsubsection{Calling C} +%* * +%************************************************************************ + +\begin{code} +data CExportSpec + = CExportStatic -- foreign export ccall foo :: ty + CLabelString -- C Name of exported function + CCallConv + {-! derive: Binary !-} + +data CCallSpec + = CCallSpec CCallTarget -- What to call + CCallConv -- Calling convention to use. + Safety + deriving( Eq ) + {-! derive: Binary !-} +\end{code} + +The call target: + +\begin{code} +data CCallTarget + = StaticTarget CLabelString -- An "unboxed" ccall# to `fn'. + | DynamicTarget -- First argument (an Addr#) is the function pointer + deriving( Eq ) + {-! derive: Binary !-} + +isDynamicTarget :: CCallTarget -> Bool +isDynamicTarget DynamicTarget = True +isDynamicTarget other = False +\end{code} + + +Stuff to do with calling convention: + +ccall: Caller allocates parameters, *and* deallocates them. + +stdcall: Caller allocates parameters, callee deallocates. + Function name has @N after it, where N is number of arg bytes + e.g. _Foo@8 + +ToDo: The stdcall calling convention is x86 (win32) specific, +so perhaps we should emit a warning if it's being used on other +platforms. + +\begin{code} +data CCallConv = CCallConv | StdCallConv + deriving (Eq) + {-! derive: Binary !-} + +instance Outputable CCallConv where + ppr StdCallConv = ptext SLIT("stdcall") + ppr CCallConv = ptext SLIT("ccall") + +defaultCCallConv :: CCallConv +defaultCCallConv = CCallConv + +ccallConvToInt :: CCallConv -> Int +ccallConvToInt StdCallConv = 0 +ccallConvToInt CCallConv = 1 +\end{code} + +Generate the gcc attribute corresponding to the given +calling convention (used by PprAbsC): + +\begin{code} +ccallConvAttribute :: CCallConv -> String +ccallConvAttribute StdCallConv = "__attribute__((__stdcall__))" +ccallConvAttribute CCallConv = "" +\end{code} + +\begin{code} +type CLabelString = FastString -- A C label, completely unencoded + +pprCLabelString :: CLabelString -> SDoc +pprCLabelString lbl = ftext lbl + +isCLabelString :: CLabelString -> Bool -- Checks to see if this is a valid C label +isCLabelString lbl + = all ok (unpackFS lbl) + where + ok c = isAlphaNum c || c == '_' || c == '.' + -- The '.' appears in e.g. "foo.so" in the + -- module part of a ExtName. 
Maybe it should be separate +\end{code} + + +Printing into C files: + +\begin{code} +instance Outputable CExportSpec where + ppr (CExportStatic str _) = pprCLabelString str + +instance Outputable CCallSpec where + ppr (CCallSpec fun cconv safety) + = hcat [ ifPprDebug callconv, ppr_fun fun ] + where + callconv = text "{-" <> ppr cconv <> text "-}" + + gc_suf | playSafe safety = text "_GC" + | otherwise = empty + + ppr_fun DynamicTarget = text "__dyn_ccall" <> gc_suf <+> text "\"\"" + ppr_fun (StaticTarget fn) = text "__ccall" <> gc_suf <+> pprCLabelString fn +\end{code} + + +%************************************************************************ +%* * +\subsubsection{.NET interop} +%* * +%************************************************************************ + +\begin{code} +data DNCallSpec = + DNCallSpec Bool -- True => static method/field + DNKind -- what type of access + String -- assembly + String -- fully qualified method/field name. + [DNType] -- argument types. + DNType -- result type. + deriving ( Eq ) + {-! derive: Binary !-} + +data DNKind + = DNMethod + | DNField + | DNConstructor + deriving ( Eq ) + {-! derive: Binary !-} + +data DNType + = DNByte + | DNBool + | DNChar + | DNDouble + | DNFloat + | DNInt + | DNInt8 + | DNInt16 + | DNInt32 + | DNInt64 + | DNWord8 + | DNWord16 + | DNWord32 + | DNWord64 + | DNPtr + | DNUnit + | DNObject + | DNString + deriving ( Eq ) + {-! derive: Binary !-} + +withDNTypes :: DNCallSpec -> [DNType] -> DNType -> DNCallSpec +withDNTypes (DNCallSpec isStatic k assem nm _ _) argTys resTy + = DNCallSpec isStatic k assem nm argTys resTy + +instance Outputable DNCallSpec where + ppr (DNCallSpec isStatic kind ass nm _ _ ) + = char '"' <> + (if isStatic then text "static" else empty) <+> + (text (case kind of { DNMethod -> "method" ; DNField -> "field"; DNConstructor -> "ctor" })) <+> + (if null ass then char ' ' else char '[' <> text ass <> char ']') <> + text nm <> + char '"' +\end{code} + + + +%************************************************************************ +%* * +\subsubsection{Misc} +%* * +%************************************************************************ + +\begin{code} +{-* Generated by DrIFT-v1.0 : Look, but Don't Touch. 
*-} +instance Binary ForeignCall where + put_ bh (CCall aa) = do + putByte bh 0 + put_ bh aa + put_ bh (DNCall ab) = do + putByte bh 1 + put_ bh ab + get bh = do + h <- getByte bh + case h of + 0 -> do aa <- get bh + return (CCall aa) + _ -> do ab <- get bh + return (DNCall ab) + +instance Binary Safety where + put_ bh (PlaySafe aa) = do + putByte bh 0 + put_ bh aa + put_ bh PlayRisky = do + putByte bh 1 + get bh = do + h <- getByte bh + case h of + 0 -> do aa <- get bh + return (PlaySafe aa) + _ -> do return PlayRisky + +instance Binary CExportSpec where + put_ bh (CExportStatic aa ab) = do + put_ bh aa + put_ bh ab + get bh = do + aa <- get bh + ab <- get bh + return (CExportStatic aa ab) + +instance Binary CCallSpec where + put_ bh (CCallSpec aa ab ac) = do + put_ bh aa + put_ bh ab + put_ bh ac + get bh = do + aa <- get bh + ab <- get bh + ac <- get bh + return (CCallSpec aa ab ac) + +instance Binary CCallTarget where + put_ bh (StaticTarget aa) = do + putByte bh 0 + put_ bh aa + put_ bh DynamicTarget = do + putByte bh 1 + get bh = do + h <- getByte bh + case h of + 0 -> do aa <- get bh + return (StaticTarget aa) + _ -> do return DynamicTarget + +instance Binary CCallConv where + put_ bh CCallConv = do + putByte bh 0 + put_ bh StdCallConv = do + putByte bh 1 + get bh = do + h <- getByte bh + case h of + 0 -> do return CCallConv + _ -> do return StdCallConv + +instance Binary DNCallSpec where + put_ bh (DNCallSpec isStatic kind ass nm _ _) = do + put_ bh isStatic + put_ bh kind + put_ bh ass + put_ bh nm + get bh = do + isStatic <- get bh + kind <- get bh + ass <- get bh + nm <- get bh + return (DNCallSpec isStatic kind ass nm [] undefined) + +instance Binary DNKind where + put_ bh DNMethod = do + putByte bh 0 + put_ bh DNField = do + putByte bh 1 + put_ bh DNConstructor = do + putByte bh 2 + get bh = do + h <- getByte bh + case h of + 0 -> do return DNMethod + 1 -> do return DNField + _ -> do return DNConstructor + +instance Binary DNType where + put_ bh DNByte = do + putByte bh 0 + put_ bh DNBool = do + putByte bh 1 + put_ bh DNChar = do + putByte bh 2 + put_ bh DNDouble = do + putByte bh 3 + put_ bh DNFloat = do + putByte bh 4 + put_ bh DNInt = do + putByte bh 5 + put_ bh DNInt8 = do + putByte bh 6 + put_ bh DNInt16 = do + putByte bh 7 + put_ bh DNInt32 = do + putByte bh 8 + put_ bh DNInt64 = do + putByte bh 9 + put_ bh DNWord8 = do + putByte bh 10 + put_ bh DNWord16 = do + putByte bh 11 + put_ bh DNWord32 = do + putByte bh 12 + put_ bh DNWord64 = do + putByte bh 13 + put_ bh DNPtr = do + putByte bh 14 + put_ bh DNUnit = do + putByte bh 15 + put_ bh DNObject = do + putByte bh 16 + put_ bh DNString = do + putByte bh 17 + + get bh = do + h <- getByte bh + case h of + 0 -> return DNByte + 1 -> return DNBool + 2 -> return DNChar + 3 -> return DNDouble + 4 -> return DNFloat + 5 -> return DNInt + 6 -> return DNInt8 + 7 -> return DNInt16 + 8 -> return DNInt32 + 9 -> return DNInt64 + 10 -> return DNWord8 + 11 -> return DNWord16 + 12 -> return DNWord32 + 13 -> return DNWord64 + 14 -> return DNPtr + 15 -> return DNUnit + 16 -> return DNObject + 17 -> return DNString + +-- Imported from other files :- + +\end{code} diff --git a/compiler/prelude/PrelInfo.lhs b/compiler/prelude/PrelInfo.lhs new file mode 100644 index 0000000000..31457b2b63 --- /dev/null +++ b/compiler/prelude/PrelInfo.lhs @@ -0,0 +1,139 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[PrelInfo]{The @PrelInfo@ interface to the compiler's prelude knowledge} + +\begin{code} +module PrelInfo ( + module 
MkId, + + ghcPrimExports, + wiredInThings, basicKnownKeyNames, + primOpId, + + -- Random other things + maybeCharLikeCon, maybeIntLikeCon, + + -- Class categories + isNumericClass, isStandardClass + + ) where + +#include "HsVersions.h" + +import PrelNames ( basicKnownKeyNames, + hasKey, charDataConKey, intDataConKey, + numericClassKeys, standardClassKeys ) + +import PrimOp ( PrimOp, allThePrimOps, primOpOcc, primOpTag, maxPrimOpTag ) +import DataCon ( DataCon ) +import Id ( Id, idName ) +import MkId ( mkPrimOpId, wiredInIds ) +import MkId -- All of it, for re-export +import Name ( nameOccName ) +import TysPrim ( primTyCons ) +import TysWiredIn ( wiredInTyCons ) +import HscTypes ( TyThing(..), implicitTyThings, GenAvailInfo(..), RdrAvailInfo ) +import Class ( Class, classKey ) +import Type ( funTyCon ) +import TyCon ( tyConName ) +import Util ( isIn ) + +import Array ( Array, array, (!) ) +\end{code} + +%************************************************************************ +%* * +\subsection[builtinNameInfo]{Lookup built-in names} +%* * +%************************************************************************ + +We have two ``builtin name funs,'' one to look up @TyCons@ and +@Classes@, the other to look up values. + +\begin{code} +wiredInThings :: [TyThing] +wiredInThings + = concat + [ -- Wired in TyCons and their implicit Ids + tycon_things + , concatMap implicitTyThings tycon_things + + -- Wired in Ids + , map AnId wiredInIds + + -- PrimOps + , map (AnId . mkPrimOpId) allThePrimOps + ] + where + tycon_things = map ATyCon ([funTyCon] ++ primTyCons ++ wiredInTyCons) +\end{code} + +We let a lot of "non-standard" values be visible, so that we can make +sense of them in interface pragmas. It's cool, though they all have +"non-standard" names, so they won't get past the parser in user code. + +%************************************************************************ +%* * + PrimOpIds +%* * +%************************************************************************ + +\begin{code} +primOpIds :: Array Int Id -- Indexed by PrimOp tag +primOpIds = array (1,maxPrimOpTag) [ (primOpTag op, mkPrimOpId op) + | op <- allThePrimOps] + +primOpId :: PrimOp -> Id +primOpId op = primOpIds ! primOpTag op +\end{code} + + +%************************************************************************ +%* * +\subsection{Export lists for pseudo-modules (GHC.Prim)} +%* * +%************************************************************************ + +GHC.Prim "exports" all the primops and primitive types, some +wired-in Ids. + +\begin{code} +ghcPrimExports :: [RdrAvailInfo] +ghcPrimExports + = map (Avail . nameOccName . idName) ghcPrimIds ++ + map (Avail . primOpOcc) allThePrimOps ++ + [ AvailTC occ [occ] | + n <- funTyCon : primTyCons, let occ = nameOccName (tyConName n) + ] +\end{code} + + +%************************************************************************ +%* * +\subsection{Built-in keys} +%* * +%************************************************************************ + +ToDo: make it do the ``like'' part properly (as in 0.26 and before). 
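The predicates below are nothing more than unique comparisons: a data
constructor is "Char-like" or "Int-like" exactly when its Unique matches the
corresponding wired-in key from PrelNames.  Purely as an illustrative sketch
(not part of this module), the same `hasKey` idiom would test for any other
wired-in constructor; the hypothetical predicate below assumes trueDataConKey
were in scope here:

\begin{code}
-- Illustrative sketch only: the hasKey idiom compares a thing's Unique
-- against a known key exported by PrelNames.
-- maybeTrueLikeCon :: DataCon -> Bool
-- maybeTrueLikeCon con = con `hasKey` trueDataConKey
\end{code}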
+ +\begin{code} +maybeCharLikeCon, maybeIntLikeCon :: DataCon -> Bool +maybeCharLikeCon con = con `hasKey` charDataConKey +maybeIntLikeCon con = con `hasKey` intDataConKey +\end{code} + + +%************************************************************************ +%* * +\subsection{Class predicates} +%* * +%************************************************************************ + +\begin{code} +isNumericClass, isStandardClass :: Class -> Bool + +isNumericClass clas = classKey clas `is_elem` numericClassKeys +isStandardClass clas = classKey clas `is_elem` standardClassKeys +is_elem = isIn "is_X_Class" +\end{code} diff --git a/compiler/prelude/PrelNames.lhs b/compiler/prelude/PrelNames.lhs new file mode 100644 index 0000000000..d656fbf18e --- /dev/null +++ b/compiler/prelude/PrelNames.lhs @@ -0,0 +1,1063 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[PrelNames]{Definitions of prelude modules and names} + + +Nota Bene: all Names defined in here should come from the base package + +* ModuleNames for prelude modules, + e.g. pREL_BASE_Name :: ModuleName + +* Modules for prelude modules + e.g. pREL_Base :: Module + +* Uniques for Ids, DataCons, TyCons and Classes that the compiler + "knows about" in some way + e.g. intTyConKey :: Unique + minusClassOpKey :: Unique + +* Names for Ids, DataCons, TyCons and Classes that the compiler + "knows about" in some way + e.g. intTyConName :: Name + minusName :: Name + One of these Names contains + (a) the module and occurrence name of the thing + (b) its Unique + The may way the compiler "knows about" one of these things is + where the type checker or desugarer needs to look it up. For + example, when desugaring list comprehensions the desugarer + needs to conjure up 'foldr'. It does this by looking up + foldrName in the environment. + +* RdrNames for Ids, DataCons etc that the compiler may emit into + generated code (e.g. for deriving). It's not necessary to know + the uniques for these guys, only their names + + +\begin{code} +module PrelNames ( + Unique, Uniquable(..), hasKey, -- Re-exported for convenience + + ----------------------------------------------------------- + module PrelNames, -- A huge bunch of (a) Names, e.g. intTyConName + -- (b) Uniques e.g. intTyConKey + -- (c) Groups of classes and types + -- (d) miscellaneous things + -- So many that we export them all + ) where + +#include "HsVersions.h" + +import Module ( Module, mkModule ) +import OccName ( dataName, tcName, clsName, varName, mkOccNameFS, + mkVarOccFS ) +import RdrName ( RdrName, nameRdrName, mkOrig, rdrNameOcc, mkUnqual ) +import Unique ( Unique, Uniquable(..), hasKey, + mkPreludeMiscIdUnique, mkPreludeDataConUnique, + mkPreludeTyConUnique, mkPreludeClassUnique, + mkTupleTyConUnique + ) +import BasicTypes ( Boxity(..), Arity ) +import Name ( Name, mkInternalName, mkExternalName, nameModule ) +import SrcLoc ( noSrcLoc ) +import FastString +\end{code} + + +%************************************************************************ +%* * +\subsection{Local Names} +%* * +%************************************************************************ + +This *local* name is used by the interactive stuff + +\begin{code} +itName uniq = mkInternalName uniq (mkOccNameFS varName FSLIT("it")) noSrcLoc +\end{code} + +\begin{code} +-- mkUnboundName makes a place-holder Name; it shouldn't be looked at except possibly +-- during compiler debugging. 
+mkUnboundName :: RdrName -> Name +mkUnboundName rdr_name = mkInternalName unboundKey (rdrNameOcc rdr_name) noSrcLoc + +isUnboundName :: Name -> Bool +isUnboundName name = name `hasKey` unboundKey +\end{code} + + +%************************************************************************ +%* * +\subsection{Known key Names} +%* * +%************************************************************************ + +This section tells what the compiler knows about the assocation of +names with uniques. These ones are the *non* wired-in ones. The +wired in ones are defined in TysWiredIn etc. + +\begin{code} +basicKnownKeyNames :: [Name] +basicKnownKeyNames + = genericTyConNames + ++ typeableClassNames + ++ [ -- Type constructors (synonyms especially) + ioTyConName, ioDataConName, + runMainIOName, + orderingTyConName, + rationalTyConName, + ratioDataConName, + ratioTyConName, + byteArrayTyConName, + mutableByteArrayTyConName, + integerTyConName, smallIntegerDataConName, largeIntegerDataConName, + + -- Classes. *Must* include: + -- classes that are grabbed by key (e.g., eqClassKey) + -- classes in "Class.standardClassKeys" (quite a few) + eqClassName, -- mentioned, derivable + ordClassName, -- derivable + boundedClassName, -- derivable + numClassName, -- mentioned, numeric + enumClassName, -- derivable + monadClassName, + functorClassName, + realClassName, -- numeric + integralClassName, -- numeric + fractionalClassName, -- numeric + floatingClassName, -- numeric + realFracClassName, -- numeric + realFloatClassName, -- numeric + dataClassName, + + -- Numeric stuff + negateName, minusName, + fromRationalName, fromIntegerName, + geName, eqName, + + -- Enum stuff + enumFromName, enumFromThenName, + enumFromThenToName, enumFromToName, + enumFromToPName, enumFromThenToPName, + + -- Monad stuff + thenIOName, bindIOName, returnIOName, failIOName, + failMName, bindMName, thenMName, returnMName, + + -- MonadRec stuff + mfixName, + + -- Arrow stuff + arrAName, composeAName, firstAName, + appAName, choiceAName, loopAName, + + -- Ix stuff + ixClassName, + + -- Show stuff + showClassName, + + -- Read stuff + readClassName, + + -- Stable pointers + newStablePtrName, + + -- Strings and lists + unpackCStringName, unpackCStringAppendName, + unpackCStringFoldrName, unpackCStringUtf8Name, + + -- List operations + concatName, filterName, + zipName, foldrName, buildName, augmentName, appendName, + + -- Parallel array operations + nullPName, lengthPName, replicatePName, mapPName, + filterPName, zipPName, crossPName, indexPName, + toPName, bpermutePName, bpermuteDftPName, indexOfPName, + + -- FFI primitive types that are not wired-in. 
+ stablePtrTyConName, ptrTyConName, funPtrTyConName, addrTyConName, + int8TyConName, int16TyConName, int32TyConName, int64TyConName, + wordTyConName, word8TyConName, word16TyConName, word32TyConName, word64TyConName, + + -- Others + otherwiseIdName, + plusIntegerName, timesIntegerName, + eqStringName, assertName, breakpointName, assertErrorName, + runSTRepName, + printName, fstName, sndName, + + -- MonadFix + monadFixClassName, mfixName, + + -- Splittable class + splittableClassName, splitName, + + -- Other classes + randomClassName, randomGenClassName, monadPlusClassName, + + -- Booleans + andName, orName + + -- The Either type + , eitherTyConName, leftDataConName, rightDataConName + + -- dotnet interop + , objectTyConName, marshalObjectName, unmarshalObjectName + , marshalStringName, unmarshalStringName, checkDotnetResName + ] + +genericTyConNames :: [Name] +genericTyConNames = [crossTyConName, plusTyConName, genUnitTyConName] +\end{code} + + +%************************************************************************ +%* * +\subsection{Module names} +%* * +%************************************************************************ + + +--MetaHaskell Extension Add a new module here +\begin{code} +pRELUDE = mkModule "Prelude" +gHC_PRIM = mkModule "GHC.Prim" -- Primitive types and values +pREL_BASE = mkModule "GHC.Base" +pREL_ENUM = mkModule "GHC.Enum" +pREL_SHOW = mkModule "GHC.Show" +pREL_READ = mkModule "GHC.Read" +pREL_NUM = mkModule "GHC.Num" +pREL_LIST = mkModule "GHC.List" +pREL_PARR = mkModule "GHC.PArr" +pREL_TUP = mkModule "Data.Tuple" +pREL_EITHER = mkModule "Data.Either" +pREL_PACK = mkModule "GHC.Pack" +pREL_CONC = mkModule "GHC.Conc" +pREL_IO_BASE = mkModule "GHC.IOBase" +pREL_ST = mkModule "GHC.ST" +pREL_ARR = mkModule "GHC.Arr" +pREL_BYTEARR = mkModule "PrelByteArr" +pREL_STABLE = mkModule "GHC.Stable" +pREL_ADDR = mkModule "GHC.Addr" +pREL_PTR = mkModule "GHC.Ptr" +pREL_ERR = mkModule "GHC.Err" +pREL_REAL = mkModule "GHC.Real" +pREL_FLOAT = mkModule "GHC.Float" +pREL_TOP_HANDLER= mkModule "GHC.TopHandler" +sYSTEM_IO = mkModule "System.IO" +dYNAMIC = mkModule "Data.Dynamic" +tYPEABLE = mkModule "Data.Typeable" +gENERICS = mkModule "Data.Generics.Basics" +dOTNET = mkModule "GHC.Dotnet" + +rEAD_PREC = mkModule "Text.ParserCombinators.ReadPrec" +lEX = mkModule "Text.Read.Lex" + +mAIN = mkModule "Main" +pREL_INT = mkModule "GHC.Int" +pREL_WORD = mkModule "GHC.Word" +mONAD = mkModule "Control.Monad" +mONAD_FIX = mkModule "Control.Monad.Fix" +aRROW = mkModule "Control.Arrow" +aDDR = mkModule "Addr" +rANDOM = mkModule "System.Random" + +gLA_EXTS = mkModule "GHC.Exts" +rOOT_MAIN = mkModule ":Main" -- Root module for initialisation + -- The ':xxx' makes a module name that the user can never + -- use himself. The z-encoding for ':' is "ZC", so the z-encoded + -- module name still starts with a capital letter, which keeps + -- the z-encoded version consistent. 
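-- A worked instance of the rule above, purely for illustration: z-encoding
-- maps ':' to "ZC" and leaves alphanumerics unchanged, so ":Main" encodes
-- to "ZCMain"; symbols generated for the root module (such as a closure
-- named ZCMain_main_closure) therefore still look like symbols belonging
-- to an ordinary, capitalised module name.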
+ +iNTERACTIVE = mkModule ":Interactive" +thFAKE = mkModule ":THFake" +\end{code} + +%************************************************************************ +%* * +\subsection{Constructing the names of tuples +%* * +%************************************************************************ + +\begin{code} +mkTupleModule :: Boxity -> Arity -> Module +mkTupleModule Boxed 0 = pREL_BASE +mkTupleModule Boxed _ = pREL_TUP +mkTupleModule Unboxed _ = gHC_PRIM +\end{code} + + +%************************************************************************ +%* * + RdrNames +%* * +%************************************************************************ + +\begin{code} +main_RDR_Unqual = mkUnqual varName FSLIT("main") + -- We definitely don't want an Orig RdrName, because + -- main might, in principle, be imported into module Main + +eq_RDR = nameRdrName eqName +ge_RDR = nameRdrName geName +ne_RDR = varQual_RDR pREL_BASE FSLIT("/=") +le_RDR = varQual_RDR pREL_BASE FSLIT("<=") +gt_RDR = varQual_RDR pREL_BASE FSLIT(">") +compare_RDR = varQual_RDR pREL_BASE FSLIT("compare") +ltTag_RDR = dataQual_RDR pREL_BASE FSLIT("LT") +eqTag_RDR = dataQual_RDR pREL_BASE FSLIT("EQ") +gtTag_RDR = dataQual_RDR pREL_BASE FSLIT("GT") + +eqClass_RDR = nameRdrName eqClassName +numClass_RDR = nameRdrName numClassName +ordClass_RDR = nameRdrName ordClassName +enumClass_RDR = nameRdrName enumClassName +monadClass_RDR = nameRdrName monadClassName + +map_RDR = varQual_RDR pREL_BASE FSLIT("map") +append_RDR = varQual_RDR pREL_BASE FSLIT("++") + +foldr_RDR = nameRdrName foldrName +build_RDR = nameRdrName buildName +returnM_RDR = nameRdrName returnMName +bindM_RDR = nameRdrName bindMName +failM_RDR = nameRdrName failMName + +and_RDR = nameRdrName andName + +left_RDR = nameRdrName leftDataConName +right_RDR = nameRdrName rightDataConName + +fromEnum_RDR = varQual_RDR pREL_ENUM FSLIT("fromEnum") +toEnum_RDR = varQual_RDR pREL_ENUM FSLIT("toEnum") + +enumFrom_RDR = nameRdrName enumFromName +enumFromTo_RDR = nameRdrName enumFromToName +enumFromThen_RDR = nameRdrName enumFromThenName +enumFromThenTo_RDR = nameRdrName enumFromThenToName + +ratioDataCon_RDR = nameRdrName ratioDataConName +plusInteger_RDR = nameRdrName plusIntegerName +timesInteger_RDR = nameRdrName timesIntegerName + +ioDataCon_RDR = nameRdrName ioDataConName + +eqString_RDR = nameRdrName eqStringName +unpackCString_RDR = nameRdrName unpackCStringName +unpackCStringFoldr_RDR = nameRdrName unpackCStringFoldrName +unpackCStringUtf8_RDR = nameRdrName unpackCStringUtf8Name + +newStablePtr_RDR = nameRdrName newStablePtrName +addrDataCon_RDR = dataQual_RDR aDDR FSLIT("A#") +wordDataCon_RDR = dataQual_RDR pREL_WORD FSLIT("W#") + +bindIO_RDR = nameRdrName bindIOName +returnIO_RDR = nameRdrName returnIOName + +fromInteger_RDR = nameRdrName fromIntegerName +fromRational_RDR = nameRdrName fromRationalName +minus_RDR = nameRdrName minusName +times_RDR = varQual_RDR pREL_NUM FSLIT("*") +plus_RDR = varQual_RDR pREL_NUM FSLIT("+") + +compose_RDR = varQual_RDR pREL_BASE FSLIT(".") + +not_RDR = varQual_RDR pREL_BASE FSLIT("not") +getTag_RDR = varQual_RDR pREL_BASE FSLIT("getTag") +succ_RDR = varQual_RDR pREL_ENUM FSLIT("succ") +pred_RDR = varQual_RDR pREL_ENUM FSLIT("pred") +minBound_RDR = varQual_RDR pREL_ENUM FSLIT("minBound") +maxBound_RDR = varQual_RDR pREL_ENUM FSLIT("maxBound") +range_RDR = varQual_RDR pREL_ARR FSLIT("range") +inRange_RDR = varQual_RDR pREL_ARR FSLIT("inRange") +index_RDR = varQual_RDR pREL_ARR FSLIT("index") +unsafeIndex_RDR = varQual_RDR pREL_ARR FSLIT("unsafeIndex") 
+unsafeRangeSize_RDR = varQual_RDR pREL_ARR FSLIT("unsafeRangeSize") + +readList_RDR = varQual_RDR pREL_READ FSLIT("readList") +readListDefault_RDR = varQual_RDR pREL_READ FSLIT("readListDefault") +readListPrec_RDR = varQual_RDR pREL_READ FSLIT("readListPrec") +readListPrecDefault_RDR = varQual_RDR pREL_READ FSLIT("readListPrecDefault") +readPrec_RDR = varQual_RDR pREL_READ FSLIT("readPrec") +parens_RDR = varQual_RDR pREL_READ FSLIT("parens") +choose_RDR = varQual_RDR pREL_READ FSLIT("choose") +lexP_RDR = varQual_RDR pREL_READ FSLIT("lexP") + +punc_RDR = dataQual_RDR lEX FSLIT("Punc") +ident_RDR = dataQual_RDR lEX FSLIT("Ident") +symbol_RDR = dataQual_RDR lEX FSLIT("Symbol") + +step_RDR = varQual_RDR rEAD_PREC FSLIT("step") +alt_RDR = varQual_RDR rEAD_PREC FSLIT("+++") +reset_RDR = varQual_RDR rEAD_PREC FSLIT("reset") +prec_RDR = varQual_RDR rEAD_PREC FSLIT("prec") + +showList_RDR = varQual_RDR pREL_SHOW FSLIT("showList") +showList___RDR = varQual_RDR pREL_SHOW FSLIT("showList__") +showsPrec_RDR = varQual_RDR pREL_SHOW FSLIT("showsPrec") +showString_RDR = varQual_RDR pREL_SHOW FSLIT("showString") +showSpace_RDR = varQual_RDR pREL_SHOW FSLIT("showSpace") +showParen_RDR = varQual_RDR pREL_SHOW FSLIT("showParen") + +typeOf_RDR = varQual_RDR tYPEABLE FSLIT("typeOf") +mkTypeRep_RDR = varQual_RDR tYPEABLE FSLIT("mkTyConApp") +mkTyConRep_RDR = varQual_RDR tYPEABLE FSLIT("mkTyCon") + +undefined_RDR = varQual_RDR pREL_ERR FSLIT("undefined") + +crossDataCon_RDR = dataQual_RDR pREL_BASE FSLIT(":*:") +inlDataCon_RDR = dataQual_RDR pREL_BASE FSLIT("Inl") +inrDataCon_RDR = dataQual_RDR pREL_BASE FSLIT("Inr") +genUnitDataCon_RDR = dataQual_RDR pREL_BASE FSLIT("Unit") + +---------------------- +varQual_RDR mod str = mkOrig mod (mkOccNameFS varName str) +tcQual_RDR mod str = mkOrig mod (mkOccNameFS tcName str) +clsQual_RDR mod str = mkOrig mod (mkOccNameFS clsName str) +dataQual_RDR mod str = mkOrig mod (mkOccNameFS dataName str) +\end{code} + +%************************************************************************ +%* * +\subsection{Known-key names} +%* * +%************************************************************************ + +Many of these Names are not really "built in", but some parts of the +compiler (notably the deriving mechanism) need to mention their names, +and it's convenient to write them all down in one place. + +--MetaHaskell Extension add the constrs and the lower case case +-- guys as well (perhaps) e.g. 
see trueDataConName below + + +\begin{code} +runMainIOName = varQual pREL_TOP_HANDLER FSLIT("runMainIO") runMainKey + +orderingTyConName = tcQual pREL_BASE FSLIT("Ordering") orderingTyConKey + +eitherTyConName = tcQual pREL_EITHER FSLIT("Either") eitherTyConKey +leftDataConName = conName eitherTyConName FSLIT("Left") leftDataConKey +rightDataConName = conName eitherTyConName FSLIT("Right") rightDataConKey + +-- Generics +crossTyConName = tcQual pREL_BASE FSLIT(":*:") crossTyConKey +plusTyConName = tcQual pREL_BASE FSLIT(":+:") plusTyConKey +genUnitTyConName = tcQual pREL_BASE FSLIT("Unit") genUnitTyConKey + +-- Base strings Strings +unpackCStringName = varQual pREL_BASE FSLIT("unpackCString#") unpackCStringIdKey +unpackCStringAppendName = varQual pREL_BASE FSLIT("unpackAppendCString#") unpackCStringAppendIdKey +unpackCStringFoldrName = varQual pREL_BASE FSLIT("unpackFoldrCString#") unpackCStringFoldrIdKey +unpackCStringUtf8Name = varQual pREL_BASE FSLIT("unpackCStringUtf8#") unpackCStringUtf8IdKey +eqStringName = varQual pREL_BASE FSLIT("eqString") eqStringIdKey + +-- Base classes (Eq, Ord, Functor) +eqClassName = clsQual pREL_BASE FSLIT("Eq") eqClassKey +eqName = methName eqClassName FSLIT("==") eqClassOpKey +ordClassName = clsQual pREL_BASE FSLIT("Ord") ordClassKey +geName = methName ordClassName FSLIT(">=") geClassOpKey +functorClassName = clsQual pREL_BASE FSLIT("Functor") functorClassKey + +-- Class Monad +monadClassName = clsQual pREL_BASE FSLIT("Monad") monadClassKey +thenMName = methName monadClassName FSLIT(">>") thenMClassOpKey +bindMName = methName monadClassName FSLIT(">>=") bindMClassOpKey +returnMName = methName monadClassName FSLIT("return") returnMClassOpKey +failMName = methName monadClassName FSLIT("fail") failMClassOpKey + +-- Random PrelBase functions +otherwiseIdName = varQual pREL_BASE FSLIT("otherwise") otherwiseIdKey +foldrName = varQual pREL_BASE FSLIT("foldr") foldrIdKey +buildName = varQual pREL_BASE FSLIT("build") buildIdKey +augmentName = varQual pREL_BASE FSLIT("augment") augmentIdKey +appendName = varQual pREL_BASE FSLIT("++") appendIdKey +andName = varQual pREL_BASE FSLIT("&&") andIdKey +orName = varQual pREL_BASE FSLIT("||") orIdKey +assertName = varQual pREL_BASE FSLIT("assert") assertIdKey +breakpointName = varQual pREL_BASE FSLIT("breakpoint") breakpointIdKey +breakpointJumpName + = mkInternalName + breakpointJumpIdKey + (mkOccNameFS varName FSLIT("breakpointJump")) + noSrcLoc + +-- PrelTup +fstName = varQual pREL_TUP FSLIT("fst") fstIdKey +sndName = varQual pREL_TUP FSLIT("snd") sndIdKey + +-- Module PrelNum +numClassName = clsQual pREL_NUM FSLIT("Num") numClassKey +fromIntegerName = methName numClassName FSLIT("fromInteger") fromIntegerClassOpKey +minusName = methName numClassName FSLIT("-") minusClassOpKey +negateName = methName numClassName FSLIT("negate") negateClassOpKey +plusIntegerName = varQual pREL_NUM FSLIT("plusInteger") plusIntegerIdKey +timesIntegerName = varQual pREL_NUM FSLIT("timesInteger") timesIntegerIdKey +integerTyConName = tcQual pREL_NUM FSLIT("Integer") integerTyConKey +smallIntegerDataConName = conName integerTyConName FSLIT("S#") smallIntegerDataConKey +largeIntegerDataConName = conName integerTyConName FSLIT("J#") largeIntegerDataConKey + +-- PrelReal types and classes +rationalTyConName = tcQual pREL_REAL FSLIT("Rational") rationalTyConKey +ratioTyConName = tcQual pREL_REAL FSLIT("Ratio") ratioTyConKey +ratioDataConName = conName ratioTyConName FSLIT(":%") ratioDataConKey +realClassName = clsQual pREL_REAL FSLIT("Real") 
realClassKey +integralClassName = clsQual pREL_REAL FSLIT("Integral") integralClassKey +realFracClassName = clsQual pREL_REAL FSLIT("RealFrac") realFracClassKey +fractionalClassName = clsQual pREL_REAL FSLIT("Fractional") fractionalClassKey +fromRationalName = methName fractionalClassName FSLIT("fromRational") fromRationalClassOpKey + +-- PrelFloat classes +floatingClassName = clsQual pREL_FLOAT FSLIT("Floating") floatingClassKey +realFloatClassName = clsQual pREL_FLOAT FSLIT("RealFloat") realFloatClassKey + +-- Class Ix +ixClassName = clsQual pREL_ARR FSLIT("Ix") ixClassKey + +-- Class Typeable +typeableClassName = clsQual tYPEABLE FSLIT("Typeable") typeableClassKey +typeable1ClassName = clsQual tYPEABLE FSLIT("Typeable1") typeable1ClassKey +typeable2ClassName = clsQual tYPEABLE FSLIT("Typeable2") typeable2ClassKey +typeable3ClassName = clsQual tYPEABLE FSLIT("Typeable3") typeable3ClassKey +typeable4ClassName = clsQual tYPEABLE FSLIT("Typeable4") typeable4ClassKey +typeable5ClassName = clsQual tYPEABLE FSLIT("Typeable5") typeable5ClassKey +typeable6ClassName = clsQual tYPEABLE FSLIT("Typeable6") typeable6ClassKey +typeable7ClassName = clsQual tYPEABLE FSLIT("Typeable7") typeable7ClassKey + +typeableClassNames = [ typeableClassName, typeable1ClassName, typeable2ClassName + , typeable3ClassName, typeable4ClassName, typeable5ClassName + , typeable6ClassName, typeable7ClassName ] + +-- Class Data +dataClassName = clsQual gENERICS FSLIT("Data") dataClassKey + +-- Error module +assertErrorName = varQual pREL_ERR FSLIT("assertError") assertErrorIdKey + +-- Enum module (Enum, Bounded) +enumClassName = clsQual pREL_ENUM FSLIT("Enum") enumClassKey +enumFromName = methName enumClassName FSLIT("enumFrom") enumFromClassOpKey +enumFromToName = methName enumClassName FSLIT("enumFromTo") enumFromToClassOpKey +enumFromThenName = methName enumClassName FSLIT("enumFromThen") enumFromThenClassOpKey +enumFromThenToName = methName enumClassName FSLIT("enumFromThenTo") enumFromThenToClassOpKey +boundedClassName = clsQual pREL_ENUM FSLIT("Bounded") boundedClassKey + +-- List functions +concatName = varQual pREL_LIST FSLIT("concat") concatIdKey +filterName = varQual pREL_LIST FSLIT("filter") filterIdKey +zipName = varQual pREL_LIST FSLIT("zip") zipIdKey + +-- Class Show +showClassName = clsQual pREL_SHOW FSLIT("Show") showClassKey + +-- Class Read +readClassName = clsQual pREL_READ FSLIT("Read") readClassKey + +-- parallel array types and functions +enumFromToPName = varQual pREL_PARR FSLIT("enumFromToP") enumFromToPIdKey +enumFromThenToPName= varQual pREL_PARR FSLIT("enumFromThenToP") enumFromThenToPIdKey +nullPName = varQual pREL_PARR FSLIT("nullP") nullPIdKey +lengthPName = varQual pREL_PARR FSLIT("lengthP") lengthPIdKey +replicatePName = varQual pREL_PARR FSLIT("replicateP") replicatePIdKey +mapPName = varQual pREL_PARR FSLIT("mapP") mapPIdKey +filterPName = varQual pREL_PARR FSLIT("filterP") filterPIdKey +zipPName = varQual pREL_PARR FSLIT("zipP") zipPIdKey +crossPName = varQual pREL_PARR FSLIT("crossP") crossPIdKey +indexPName = varQual pREL_PARR FSLIT("!:") indexPIdKey +toPName = varQual pREL_PARR FSLIT("toP") toPIdKey +bpermutePName = varQual pREL_PARR FSLIT("bpermuteP") bpermutePIdKey +bpermuteDftPName = varQual pREL_PARR FSLIT("bpermuteDftP") bpermuteDftPIdKey +indexOfPName = varQual pREL_PARR FSLIT("indexOfP") indexOfPIdKey + +-- IOBase things +ioTyConName = tcQual pREL_IO_BASE FSLIT("IO") ioTyConKey +ioDataConName = conName ioTyConName FSLIT("IO") ioDataConKey +thenIOName = varQual pREL_IO_BASE 
FSLIT("thenIO") thenIOIdKey +bindIOName = varQual pREL_IO_BASE FSLIT("bindIO") bindIOIdKey +returnIOName = varQual pREL_IO_BASE FSLIT("returnIO") returnIOIdKey +failIOName = varQual pREL_IO_BASE FSLIT("failIO") failIOIdKey + +-- IO things +printName = varQual sYSTEM_IO FSLIT("print") printIdKey + +-- Int, Word, and Addr things +int8TyConName = tcQual pREL_INT FSLIT("Int8") int8TyConKey +int16TyConName = tcQual pREL_INT FSLIT("Int16") int16TyConKey +int32TyConName = tcQual pREL_INT FSLIT("Int32") int32TyConKey +int64TyConName = tcQual pREL_INT FSLIT("Int64") int64TyConKey + +-- Word module +word8TyConName = tcQual pREL_WORD FSLIT("Word8") word8TyConKey +word16TyConName = tcQual pREL_WORD FSLIT("Word16") word16TyConKey +word32TyConName = tcQual pREL_WORD FSLIT("Word32") word32TyConKey +word64TyConName = tcQual pREL_WORD FSLIT("Word64") word64TyConKey +wordTyConName = tcQual pREL_WORD FSLIT("Word") wordTyConKey +wordDataConName = conName wordTyConName FSLIT("W#") wordDataConKey + +-- Addr module +addrTyConName = tcQual aDDR FSLIT("Addr") addrTyConKey + +-- PrelPtr module +ptrTyConName = tcQual pREL_PTR FSLIT("Ptr") ptrTyConKey +funPtrTyConName = tcQual pREL_PTR FSLIT("FunPtr") funPtrTyConKey + +-- Byte array types +byteArrayTyConName = tcQual pREL_BYTEARR FSLIT("ByteArray") byteArrayTyConKey +mutableByteArrayTyConName = tcQual pREL_BYTEARR FSLIT("MutableByteArray") mutableByteArrayTyConKey + +-- Foreign objects and weak pointers +stablePtrTyConName = tcQual pREL_STABLE FSLIT("StablePtr") stablePtrTyConKey +newStablePtrName = varQual pREL_STABLE FSLIT("newStablePtr") newStablePtrIdKey + +-- PrelST module +runSTRepName = varQual pREL_ST FSLIT("runSTRep") runSTRepIdKey + +-- The "split" Id for splittable implicit parameters +splittableClassName = clsQual gLA_EXTS FSLIT("Splittable") splittableClassKey +splitName = methName splittableClassName FSLIT("split") splitIdKey + +-- Recursive-do notation +monadFixClassName = clsQual mONAD_FIX FSLIT("MonadFix") monadFixClassKey +mfixName = methName monadFixClassName FSLIT("mfix") mfixIdKey + +-- Arrow notation +arrAName = varQual aRROW FSLIT("arr") arrAIdKey +composeAName = varQual aRROW FSLIT(">>>") composeAIdKey +firstAName = varQual aRROW FSLIT("first") firstAIdKey +appAName = varQual aRROW FSLIT("app") appAIdKey +choiceAName = varQual aRROW FSLIT("|||") choiceAIdKey +loopAName = varQual aRROW FSLIT("loop") loopAIdKey + +-- Other classes, needed for type defaulting +monadPlusClassName = clsQual mONAD FSLIT("MonadPlus") monadPlusClassKey +randomClassName = clsQual rANDOM FSLIT("Random") randomClassKey +randomGenClassName = clsQual rANDOM FSLIT("RandomGen") randomGenClassKey + +-- dotnet interop +objectTyConName = tcQual dOTNET FSLIT("Object") objectTyConKey + -- objectTyConName was "wTcQual", but that's gone now, and + -- I can't see why it was wired in anyway... 
+unmarshalObjectName = varQual dOTNET FSLIT("unmarshalObject") unmarshalObjectIdKey +marshalObjectName = varQual dOTNET FSLIT("marshalObject") marshalObjectIdKey +marshalStringName = varQual dOTNET FSLIT("marshalString") marshalStringIdKey +unmarshalStringName = varQual dOTNET FSLIT("unmarshalString") unmarshalStringIdKey +checkDotnetResName = varQual dOTNET FSLIT("checkResult") checkDotnetResNameIdKey +\end{code} + +%************************************************************************ +%* * +\subsection{Local helpers} +%* * +%************************************************************************ + +All these are original names; hence mkOrig + +\begin{code} +varQual = mk_known_key_name varName +tcQual = mk_known_key_name tcName +clsQual = mk_known_key_name clsName + +mk_known_key_name space mod str uniq + = mkExternalName uniq mod (mkOccNameFS space str) + Nothing noSrcLoc + +conName :: Name -> FastString -> Unique -> Name +conName tycon occ uniq + = mkExternalName uniq (nameModule tycon) (mkOccNameFS dataName occ) + (Just tycon) noSrcLoc + +methName :: Name -> FastString -> Unique -> Name +methName cls occ uniq + = mkExternalName uniq (nameModule cls) (mkVarOccFS occ) + (Just cls) noSrcLoc +\end{code} + +%************************************************************************ +%* * +\subsubsection[Uniques-prelude-Classes]{@Uniques@ for wired-in @Classes@} +%* * +%************************************************************************ +--MetaHaskell extension hand allocate keys here + +\begin{code} +boundedClassKey = mkPreludeClassUnique 1 +enumClassKey = mkPreludeClassUnique 2 +eqClassKey = mkPreludeClassUnique 3 +floatingClassKey = mkPreludeClassUnique 5 +fractionalClassKey = mkPreludeClassUnique 6 +integralClassKey = mkPreludeClassUnique 7 +monadClassKey = mkPreludeClassUnique 8 +dataClassKey = mkPreludeClassUnique 9 +functorClassKey = mkPreludeClassUnique 10 +numClassKey = mkPreludeClassUnique 11 +ordClassKey = mkPreludeClassUnique 12 +readClassKey = mkPreludeClassUnique 13 +realClassKey = mkPreludeClassUnique 14 +realFloatClassKey = mkPreludeClassUnique 15 +realFracClassKey = mkPreludeClassUnique 16 +showClassKey = mkPreludeClassUnique 17 +ixClassKey = mkPreludeClassUnique 18 + +typeableClassKey = mkPreludeClassUnique 20 +typeable1ClassKey = mkPreludeClassUnique 21 +typeable2ClassKey = mkPreludeClassUnique 22 +typeable3ClassKey = mkPreludeClassUnique 23 +typeable4ClassKey = mkPreludeClassUnique 24 +typeable5ClassKey = mkPreludeClassUnique 25 +typeable6ClassKey = mkPreludeClassUnique 26 +typeable7ClassKey = mkPreludeClassUnique 27 + +monadFixClassKey = mkPreludeClassUnique 28 +splittableClassKey = mkPreludeClassUnique 29 + +monadPlusClassKey = mkPreludeClassUnique 30 +randomClassKey = mkPreludeClassUnique 31 +randomGenClassKey = mkPreludeClassUnique 32 +\end{code} + +%************************************************************************ +%* * +\subsubsection[Uniques-prelude-TyCons]{@Uniques@ for wired-in @TyCons@} +%* * +%************************************************************************ + +\begin{code} +addrPrimTyConKey = mkPreludeTyConUnique 1 +addrTyConKey = mkPreludeTyConUnique 2 +arrayPrimTyConKey = mkPreludeTyConUnique 3 +boolTyConKey = mkPreludeTyConUnique 4 +byteArrayPrimTyConKey = mkPreludeTyConUnique 5 +charPrimTyConKey = mkPreludeTyConUnique 7 +charTyConKey = mkPreludeTyConUnique 8 +doublePrimTyConKey = mkPreludeTyConUnique 9 +doubleTyConKey = mkPreludeTyConUnique 10 +floatPrimTyConKey = mkPreludeTyConUnique 11 +floatTyConKey = mkPreludeTyConUnique 12 
+funTyConKey = mkPreludeTyConUnique 13 +intPrimTyConKey = mkPreludeTyConUnique 14 +intTyConKey = mkPreludeTyConUnique 15 +int8TyConKey = mkPreludeTyConUnique 16 +int16TyConKey = mkPreludeTyConUnique 17 +int32PrimTyConKey = mkPreludeTyConUnique 18 +int32TyConKey = mkPreludeTyConUnique 19 +int64PrimTyConKey = mkPreludeTyConUnique 20 +int64TyConKey = mkPreludeTyConUnique 21 +integerTyConKey = mkPreludeTyConUnique 22 +listTyConKey = mkPreludeTyConUnique 23 +foreignObjPrimTyConKey = mkPreludeTyConUnique 24 +weakPrimTyConKey = mkPreludeTyConUnique 27 +mutableArrayPrimTyConKey = mkPreludeTyConUnique 28 +mutableByteArrayPrimTyConKey = mkPreludeTyConUnique 29 +orderingTyConKey = mkPreludeTyConUnique 30 +mVarPrimTyConKey = mkPreludeTyConUnique 31 +ratioTyConKey = mkPreludeTyConUnique 32 +rationalTyConKey = mkPreludeTyConUnique 33 +realWorldTyConKey = mkPreludeTyConUnique 34 +stablePtrPrimTyConKey = mkPreludeTyConUnique 35 +stablePtrTyConKey = mkPreludeTyConUnique 36 +statePrimTyConKey = mkPreludeTyConUnique 50 +stableNamePrimTyConKey = mkPreludeTyConUnique 51 +stableNameTyConKey = mkPreludeTyConUnique 52 +mutableByteArrayTyConKey = mkPreludeTyConUnique 53 +mutVarPrimTyConKey = mkPreludeTyConUnique 55 +ioTyConKey = mkPreludeTyConUnique 56 +byteArrayTyConKey = mkPreludeTyConUnique 57 +wordPrimTyConKey = mkPreludeTyConUnique 58 +wordTyConKey = mkPreludeTyConUnique 59 +word8TyConKey = mkPreludeTyConUnique 60 +word16TyConKey = mkPreludeTyConUnique 61 +word32PrimTyConKey = mkPreludeTyConUnique 62 +word32TyConKey = mkPreludeTyConUnique 63 +word64PrimTyConKey = mkPreludeTyConUnique 64 +word64TyConKey = mkPreludeTyConUnique 65 +liftedConKey = mkPreludeTyConUnique 66 +unliftedConKey = mkPreludeTyConUnique 67 +anyBoxConKey = mkPreludeTyConUnique 68 +kindConKey = mkPreludeTyConUnique 69 +boxityConKey = mkPreludeTyConUnique 70 +typeConKey = mkPreludeTyConUnique 71 +threadIdPrimTyConKey = mkPreludeTyConUnique 72 +bcoPrimTyConKey = mkPreludeTyConUnique 73 +ptrTyConKey = mkPreludeTyConUnique 74 +funPtrTyConKey = mkPreludeTyConUnique 75 +tVarPrimTyConKey = mkPreludeTyConUnique 76 + +-- Generic Type Constructors +crossTyConKey = mkPreludeTyConUnique 79 +plusTyConKey = mkPreludeTyConUnique 80 +genUnitTyConKey = mkPreludeTyConUnique 81 + +-- Parallel array type constructor +parrTyConKey = mkPreludeTyConUnique 82 + +-- dotnet interop +objectTyConKey = mkPreludeTyConUnique 83 + +eitherTyConKey = mkPreludeTyConUnique 84 + +---------------- Template Haskell ------------------- +-- USES TyConUniques 100-129 +----------------------------------------------------- + +unitTyConKey = mkTupleTyConUnique Boxed 0 +\end{code} + +%************************************************************************ +%* * +\subsubsection[Uniques-prelude-DataCons]{@Uniques@ for wired-in @DataCons@} +%* * +%************************************************************************ + +\begin{code} +charDataConKey = mkPreludeDataConUnique 1 +consDataConKey = mkPreludeDataConUnique 2 +doubleDataConKey = mkPreludeDataConUnique 3 +falseDataConKey = mkPreludeDataConUnique 4 +floatDataConKey = mkPreludeDataConUnique 5 +intDataConKey = mkPreludeDataConUnique 6 +smallIntegerDataConKey = mkPreludeDataConUnique 7 +largeIntegerDataConKey = mkPreludeDataConUnique 8 +nilDataConKey = mkPreludeDataConUnique 11 +ratioDataConKey = mkPreludeDataConUnique 12 +stableNameDataConKey = mkPreludeDataConUnique 14 +trueDataConKey = mkPreludeDataConUnique 15 +wordDataConKey = mkPreludeDataConUnique 16 +ioDataConKey = mkPreludeDataConUnique 17 + +-- Generic data constructors 
+crossDataConKey = mkPreludeDataConUnique 20 +inlDataConKey = mkPreludeDataConUnique 21 +inrDataConKey = mkPreludeDataConUnique 22 +genUnitDataConKey = mkPreludeDataConUnique 23 + +-- Data constructor for parallel arrays +parrDataConKey = mkPreludeDataConUnique 24 + +leftDataConKey = mkPreludeDataConUnique 25 +rightDataConKey = mkPreludeDataConUnique 26 +\end{code} + +%************************************************************************ +%* * +\subsubsection[Uniques-prelude-Ids]{@Uniques@ for wired-in @Ids@ (except @DataCons@)} +%* * +%************************************************************************ + +\begin{code} +absentErrorIdKey = mkPreludeMiscIdUnique 1 +augmentIdKey = mkPreludeMiscIdUnique 3 +appendIdKey = mkPreludeMiscIdUnique 4 +buildIdKey = mkPreludeMiscIdUnique 5 +errorIdKey = mkPreludeMiscIdUnique 6 +foldlIdKey = mkPreludeMiscIdUnique 7 +foldrIdKey = mkPreludeMiscIdUnique 8 +recSelErrorIdKey = mkPreludeMiscIdUnique 9 +integerMinusOneIdKey = mkPreludeMiscIdUnique 10 +integerPlusOneIdKey = mkPreludeMiscIdUnique 11 +integerPlusTwoIdKey = mkPreludeMiscIdUnique 12 +integerZeroIdKey = mkPreludeMiscIdUnique 13 +int2IntegerIdKey = mkPreludeMiscIdUnique 14 +seqIdKey = mkPreludeMiscIdUnique 15 +irrefutPatErrorIdKey = mkPreludeMiscIdUnique 16 +eqStringIdKey = mkPreludeMiscIdUnique 17 +noMethodBindingErrorIdKey = mkPreludeMiscIdUnique 18 +nonExhaustiveGuardsErrorIdKey = mkPreludeMiscIdUnique 19 +runtimeErrorIdKey = mkPreludeMiscIdUnique 20 +parErrorIdKey = mkPreludeMiscIdUnique 21 +parIdKey = mkPreludeMiscIdUnique 22 +patErrorIdKey = mkPreludeMiscIdUnique 23 +realWorldPrimIdKey = mkPreludeMiscIdUnique 24 +recConErrorIdKey = mkPreludeMiscIdUnique 25 +recUpdErrorIdKey = mkPreludeMiscIdUnique 26 +traceIdKey = mkPreludeMiscIdUnique 27 +unpackCStringUtf8IdKey = mkPreludeMiscIdUnique 28 +unpackCStringAppendIdKey = mkPreludeMiscIdUnique 29 +unpackCStringFoldrIdKey = mkPreludeMiscIdUnique 30 +unpackCStringIdKey = mkPreludeMiscIdUnique 31 + +unsafeCoerceIdKey = mkPreludeMiscIdUnique 32 +concatIdKey = mkPreludeMiscIdUnique 33 +filterIdKey = mkPreludeMiscIdUnique 34 +zipIdKey = mkPreludeMiscIdUnique 35 +bindIOIdKey = mkPreludeMiscIdUnique 36 +returnIOIdKey = mkPreludeMiscIdUnique 37 +deRefStablePtrIdKey = mkPreludeMiscIdUnique 38 +newStablePtrIdKey = mkPreludeMiscIdUnique 39 +plusIntegerIdKey = mkPreludeMiscIdUnique 41 +timesIntegerIdKey = mkPreludeMiscIdUnique 42 +printIdKey = mkPreludeMiscIdUnique 43 +failIOIdKey = mkPreludeMiscIdUnique 44 +nullAddrIdKey = mkPreludeMiscIdUnique 46 +voidArgIdKey = mkPreludeMiscIdUnique 47 +splitIdKey = mkPreludeMiscIdUnique 48 +fstIdKey = mkPreludeMiscIdUnique 49 +sndIdKey = mkPreludeMiscIdUnique 50 +otherwiseIdKey = mkPreludeMiscIdUnique 51 +assertIdKey = mkPreludeMiscIdUnique 53 +runSTRepIdKey = mkPreludeMiscIdUnique 54 + +rootMainKey = mkPreludeMiscIdUnique 55 +runMainKey = mkPreludeMiscIdUnique 56 + +andIdKey = mkPreludeMiscIdUnique 57 +orIdKey = mkPreludeMiscIdUnique 58 +thenIOIdKey = mkPreludeMiscIdUnique 59 +lazyIdKey = mkPreludeMiscIdUnique 60 +assertErrorIdKey = mkPreludeMiscIdUnique 61 + +breakpointIdKey = mkPreludeMiscIdUnique 62 +breakpointJumpIdKey = mkPreludeMiscIdUnique 63 + +-- Parallel array functions +nullPIdKey = mkPreludeMiscIdUnique 80 +lengthPIdKey = mkPreludeMiscIdUnique 81 +replicatePIdKey = mkPreludeMiscIdUnique 82 +mapPIdKey = mkPreludeMiscIdUnique 83 +filterPIdKey = mkPreludeMiscIdUnique 84 +zipPIdKey = mkPreludeMiscIdUnique 85 +crossPIdKey = mkPreludeMiscIdUnique 86 +indexPIdKey = mkPreludeMiscIdUnique 87 +toPIdKey = 
mkPreludeMiscIdUnique 88 +enumFromToPIdKey = mkPreludeMiscIdUnique 89 +enumFromThenToPIdKey = mkPreludeMiscIdUnique 90 +bpermutePIdKey = mkPreludeMiscIdUnique 91 +bpermuteDftPIdKey = mkPreludeMiscIdUnique 92 +indexOfPIdKey = mkPreludeMiscIdUnique 93 + +-- dotnet interop +unmarshalObjectIdKey = mkPreludeMiscIdUnique 94 +marshalObjectIdKey = mkPreludeMiscIdUnique 95 +marshalStringIdKey = mkPreludeMiscIdUnique 96 +unmarshalStringIdKey = mkPreludeMiscIdUnique 97 +checkDotnetResNameIdKey = mkPreludeMiscIdUnique 98 + +\end{code} + +Certain class operations from Prelude classes. They get their own +uniques so we can look them up easily when we want to conjure them up +during type checking. + +\begin{code} + -- Just a place holder for unbound variables produced by the renamer: +unboundKey = mkPreludeMiscIdUnique 101 + +fromIntegerClassOpKey = mkPreludeMiscIdUnique 102 +minusClassOpKey = mkPreludeMiscIdUnique 103 +fromRationalClassOpKey = mkPreludeMiscIdUnique 104 +enumFromClassOpKey = mkPreludeMiscIdUnique 105 +enumFromThenClassOpKey = mkPreludeMiscIdUnique 106 +enumFromToClassOpKey = mkPreludeMiscIdUnique 107 +enumFromThenToClassOpKey = mkPreludeMiscIdUnique 108 +eqClassOpKey = mkPreludeMiscIdUnique 109 +geClassOpKey = mkPreludeMiscIdUnique 110 +negateClassOpKey = mkPreludeMiscIdUnique 111 +failMClassOpKey = mkPreludeMiscIdUnique 112 +bindMClassOpKey = mkPreludeMiscIdUnique 113 -- (>>=) +thenMClassOpKey = mkPreludeMiscIdUnique 114 -- (>>) +returnMClassOpKey = mkPreludeMiscIdUnique 117 + +-- Recursive do notation +mfixIdKey = mkPreludeMiscIdUnique 118 + +-- Arrow notation +arrAIdKey = mkPreludeMiscIdUnique 119 +composeAIdKey = mkPreludeMiscIdUnique 120 -- >>> +firstAIdKey = mkPreludeMiscIdUnique 121 +appAIdKey = mkPreludeMiscIdUnique 122 +choiceAIdKey = mkPreludeMiscIdUnique 123 -- ||| +loopAIdKey = mkPreludeMiscIdUnique 124 + +---------------- Template Haskell ------------------- +-- USES IdUniques 200-399 +----------------------------------------------------- +\end{code} + + +%************************************************************************ +%* * +\subsection{Standard groups of types} +%* * +%************************************************************************ + +\begin{code} +numericTyKeys = + [ addrTyConKey + , wordTyConKey + , intTyConKey + , integerTyConKey + , doubleTyConKey + , floatTyConKey + ] + + -- Renamer always imports these data decls replete with constructors + -- so that desugarer can always see their constructors. Ugh! +cCallishTyKeys = + [ addrTyConKey + , wordTyConKey + , byteArrayTyConKey + , mutableByteArrayTyConKey + , stablePtrTyConKey + , int8TyConKey + , int16TyConKey + , int32TyConKey + , int64TyConKey + , word8TyConKey + , word16TyConKey + , word32TyConKey + , word64TyConKey + ] +\end{code} + + +%************************************************************************ +%* * +\subsection[Class-std-groups]{Standard groups of Prelude classes} +%* * +%************************************************************************ + +NOTE: @Eq@ and @Text@ do need to appear in @standardClasses@ +even though every numeric class has these two as a superclass, +because the list of ambiguous dictionaries hasn't been simplified. 
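These groupings feed Haskell 98 defaulting (report section 4.3.4, as noted in
the comments below): an ambiguous type variable is defaulted only when every
constraint on it mentions a standard class and at least one of those classes
is numeric.  A minimal sketch of the user-level behaviour the key lists below
support (ordinary user code, shown commented out for illustration and not
part of this module):

\begin{code}
-- Illustrative sketch, assuming the usual `default (Integer, Double)':
-- the ambiguous type in `show (2 ^ 10)' is defaulted to Integer, because
-- Show and Num are both standard classes and Num is numeric; the
-- expression then evaluates to the string "1024".
-- example :: String
-- example = show (2 ^ 10)
\end{code}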
+ +\begin{code} +numericClassKeys = + [ numClassKey + , realClassKey + , integralClassKey + ] + ++ fractionalClassKeys + +fractionalClassKeys = + [ fractionalClassKey + , floatingClassKey + , realFracClassKey + , realFloatClassKey + ] + + -- the strictness analyser needs to know about numeric types + -- (see SaAbsInt.lhs) +needsDataDeclCtxtClassKeys = -- see comments in TcDeriv + [ readClassKey + ] + +-- The "standard classes" are used in defaulting (Haskell 98 report 4.3.4), +-- and are: "classes defined in the Prelude or a standard library" +standardClassKeys = derivableClassKeys ++ numericClassKeys + ++ [randomClassKey, randomGenClassKey, + functorClassKey, + monadClassKey, monadPlusClassKey] +\end{code} + +@derivableClassKeys@ is also used in checking \tr{deriving} constructs +(@TcDeriv@). + +\begin{code} +derivableClassKeys + = [ eqClassKey, ordClassKey, enumClassKey, ixClassKey, + boundedClassKey, showClassKey, readClassKey ] +\end{code} + diff --git a/compiler/prelude/PrelRules.lhs b/compiler/prelude/PrelRules.lhs new file mode 100644 index 0000000000..9cdddc9065 --- /dev/null +++ b/compiler/prelude/PrelRules.lhs @@ -0,0 +1,447 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[ConFold]{Constant Folder} + +Conceptually, constant folding should be parameterized with the kind +of target machine to get identical behaviour during compilation time +and runtime. We cheat a little bit here... + +ToDo: + check boundaries before folding, e.g. we can fold the Float addition + (i1 + i2) only if it results in a valid Float. + +\begin{code} + +{-# OPTIONS -optc-DNON_POSIX_SOURCE #-} + +module PrelRules ( primOpRules, builtinRules ) where + +#include "HsVersions.h" + +import CoreSyn +import Id ( mkWildId, isPrimOpId_maybe ) +import Literal ( Literal(..), mkMachInt, mkMachWord + , literalType + , word2IntLit, int2WordLit + , narrow8IntLit, narrow16IntLit, narrow32IntLit + , narrow8WordLit, narrow16WordLit, narrow32WordLit + , char2IntLit, int2CharLit + , float2IntLit, int2FloatLit, double2IntLit, int2DoubleLit + , float2DoubleLit, double2FloatLit + ) +import PrimOp ( PrimOp(..), primOpOcc ) +import TysWiredIn ( boolTy, trueDataConId, falseDataConId ) +import TyCon ( tyConDataCons_maybe, isEnumerationTyCon, isNewTyCon ) +import DataCon ( dataConTag, dataConTyCon, dataConWorkId, fIRST_TAG ) +import CoreUtils ( cheapEqExpr, exprIsConApp_maybe ) +import Type ( tyConAppTyCon, coreEqType ) +import OccName ( occNameFS ) +import PrelNames ( unpackCStringFoldrName, unpackCStringFoldrIdKey, hasKey, + eqStringName, unpackCStringIdKey ) +import Maybes ( orElse ) +import Name ( Name ) +import Outputable +import FastString +import StaticFlags ( opt_SimplExcessPrecision ) + +import DATA_BITS ( Bits(..) ) +#if __GLASGOW_HASKELL__ >= 500 +import DATA_WORD ( Word ) +#else +import DATA_WORD ( Word64 ) +#endif +\end{code} + + +\begin{code} +primOpRules :: PrimOp -> Name -> [CoreRule] +primOpRules op op_name = primop_rule op + where + rule_name = occNameFS (primOpOcc op) + rule_name_case = rule_name `appendFS` FSLIT("->case") + + -- A useful shorthand + one_rule rule_fn = [BuiltinRule { ru_name = rule_name, + ru_fn = op_name, + ru_try = rule_fn }] + case_rule rule_fn = [BuiltinRule { ru_name = rule_name_case, + ru_fn = op_name, + ru_try = rule_fn }] + + -- ToDo: something for integer-shift ops? 
+ -- NotOp + + primop_rule TagToEnumOp = one_rule tagToEnumRule + primop_rule DataToTagOp = one_rule dataToTagRule + + -- Int operations + primop_rule IntAddOp = one_rule (twoLits (intOp2 (+))) + primop_rule IntSubOp = one_rule (twoLits (intOp2 (-))) + primop_rule IntMulOp = one_rule (twoLits (intOp2 (*))) + primop_rule IntQuotOp = one_rule (twoLits (intOp2Z quot)) + primop_rule IntRemOp = one_rule (twoLits (intOp2Z rem)) + primop_rule IntNegOp = one_rule (oneLit negOp) + + -- Word operations +#if __GLASGOW_HASKELL__ >= 500 + primop_rule WordAddOp = one_rule (twoLits (wordOp2 (+))) + primop_rule WordSubOp = one_rule (twoLits (wordOp2 (-))) + primop_rule WordMulOp = one_rule (twoLits (wordOp2 (*))) +#endif + primop_rule WordQuotOp = one_rule (twoLits (wordOp2Z quot)) + primop_rule WordRemOp = one_rule (twoLits (wordOp2Z rem)) +#if __GLASGOW_HASKELL__ >= 407 + primop_rule AndOp = one_rule (twoLits (wordBitOp2 (.&.))) + primop_rule OrOp = one_rule (twoLits (wordBitOp2 (.|.))) + primop_rule XorOp = one_rule (twoLits (wordBitOp2 xor)) +#endif + + -- coercions + primop_rule Word2IntOp = one_rule (oneLit (litCoerce word2IntLit)) + primop_rule Int2WordOp = one_rule (oneLit (litCoerce int2WordLit)) + primop_rule Narrow8IntOp = one_rule (oneLit (litCoerce narrow8IntLit)) + primop_rule Narrow16IntOp = one_rule (oneLit (litCoerce narrow16IntLit)) + primop_rule Narrow32IntOp = one_rule (oneLit (litCoerce narrow32IntLit)) + primop_rule Narrow8WordOp = one_rule (oneLit (litCoerce narrow8WordLit)) + primop_rule Narrow16WordOp = one_rule (oneLit (litCoerce narrow16WordLit)) + primop_rule Narrow32WordOp = one_rule (oneLit (litCoerce narrow32WordLit)) + primop_rule OrdOp = one_rule (oneLit (litCoerce char2IntLit)) + primop_rule ChrOp = one_rule (oneLit (litCoerce int2CharLit)) + primop_rule Float2IntOp = one_rule (oneLit (litCoerce float2IntLit)) + primop_rule Int2FloatOp = one_rule (oneLit (litCoerce int2FloatLit)) + primop_rule Double2IntOp = one_rule (oneLit (litCoerce double2IntLit)) + primop_rule Int2DoubleOp = one_rule (oneLit (litCoerce int2DoubleLit)) + -- SUP: Not sure what the standard says about precision in the following 2 cases + primop_rule Float2DoubleOp = one_rule (oneLit (litCoerce float2DoubleLit)) + primop_rule Double2FloatOp = one_rule (oneLit (litCoerce double2FloatLit)) + + -- Float + primop_rule FloatAddOp = one_rule (twoLits (floatOp2 (+))) + primop_rule FloatSubOp = one_rule (twoLits (floatOp2 (-))) + primop_rule FloatMulOp = one_rule (twoLits (floatOp2 (*))) + primop_rule FloatDivOp = one_rule (twoLits (floatOp2Z (/))) + primop_rule FloatNegOp = one_rule (oneLit negOp) + + -- Double + primop_rule DoubleAddOp = one_rule (twoLits (doubleOp2 (+))) + primop_rule DoubleSubOp = one_rule (twoLits (doubleOp2 (-))) + primop_rule DoubleMulOp = one_rule (twoLits (doubleOp2 (*))) + primop_rule DoubleDivOp = one_rule (twoLits (doubleOp2Z (/))) + primop_rule DoubleNegOp = one_rule (oneLit negOp) + + -- Relational operators + primop_rule IntEqOp = one_rule (relop (==)) ++ case_rule (litEq True) + primop_rule IntNeOp = one_rule (relop (/=)) ++ case_rule (litEq False) + primop_rule CharEqOp = one_rule (relop (==)) ++ case_rule (litEq True) + primop_rule CharNeOp = one_rule (relop (/=)) ++ case_rule (litEq False) + + primop_rule IntGtOp = one_rule (relop (>)) + primop_rule IntGeOp = one_rule (relop (>=)) + primop_rule IntLeOp = one_rule (relop (<=)) + primop_rule IntLtOp = one_rule (relop (<)) + + primop_rule CharGtOp = one_rule (relop (>)) + primop_rule CharGeOp = one_rule (relop (>=)) + primop_rule 
CharLeOp = one_rule (relop (<=)) + primop_rule CharLtOp = one_rule (relop (<)) + + primop_rule FloatGtOp = one_rule (relop (>)) + primop_rule FloatGeOp = one_rule (relop (>=)) + primop_rule FloatLeOp = one_rule (relop (<=)) + primop_rule FloatLtOp = one_rule (relop (<)) + primop_rule FloatEqOp = one_rule (relop (==)) + primop_rule FloatNeOp = one_rule (relop (/=)) + + primop_rule DoubleGtOp = one_rule (relop (>)) + primop_rule DoubleGeOp = one_rule (relop (>=)) + primop_rule DoubleLeOp = one_rule (relop (<=)) + primop_rule DoubleLtOp = one_rule (relop (<)) + primop_rule DoubleEqOp = one_rule (relop (==)) + primop_rule DoubleNeOp = one_rule (relop (/=)) + + primop_rule WordGtOp = one_rule (relop (>)) + primop_rule WordGeOp = one_rule (relop (>=)) + primop_rule WordLeOp = one_rule (relop (<=)) + primop_rule WordLtOp = one_rule (relop (<)) + primop_rule WordEqOp = one_rule (relop (==)) + primop_rule WordNeOp = one_rule (relop (/=)) + + primop_rule other = [] + + + relop cmp = twoLits (cmpOp (\ord -> ord `cmp` EQ)) + -- Cunning. cmpOp compares the values to give an Ordering. + -- It applies its argument to that ordering value to turn + -- the ordering into a boolean value. (`cmp` EQ) is just the job. +\end{code} + +%************************************************************************ +%* * +\subsection{Doing the business} +%* * +%************************************************************************ + +ToDo: the reason these all return Nothing is because there used to be +the possibility of an argument being a litlit. Litlits are now gone, +so this could be cleaned up. + +\begin{code} +-------------------------- +litCoerce :: (Literal -> Literal) -> Literal -> Maybe CoreExpr +litCoerce fn lit = Just (Lit (fn lit)) + +-------------------------- +cmpOp :: (Ordering -> Bool) -> Literal -> Literal -> Maybe CoreExpr +cmpOp cmp l1 l2 + = go l1 l2 + where + done res | cmp res = Just trueVal + | otherwise = Just falseVal + + -- These compares are at different types + go (MachChar i1) (MachChar i2) = done (i1 `compare` i2) + go (MachInt i1) (MachInt i2) = done (i1 `compare` i2) + go (MachInt64 i1) (MachInt64 i2) = done (i1 `compare` i2) + go (MachWord i1) (MachWord i2) = done (i1 `compare` i2) + go (MachWord64 i1) (MachWord64 i2) = done (i1 `compare` i2) + go (MachFloat i1) (MachFloat i2) = done (i1 `compare` i2) + go (MachDouble i1) (MachDouble i2) = done (i1 `compare` i2) + go l1 l2 = Nothing + +-------------------------- + +negOp (MachFloat 0.0) = Nothing -- can't represent -0.0 as a Rational +negOp (MachFloat f) = Just (mkFloatVal (-f)) +negOp (MachDouble 0.0) = Nothing +negOp (MachDouble d) = Just (mkDoubleVal (-d)) +negOp (MachInt i) = intResult (-i) +negOp l = Nothing + +-------------------------- +intOp2 op (MachInt i1) (MachInt i2) = intResult (i1 `op` i2) +intOp2 op l1 l2 = Nothing -- Could find LitLit + +intOp2Z op (MachInt i1) (MachInt i2) + | i2 /= 0 = Just (mkIntVal (i1 `op` i2)) +intOp2Z op l1 l2 = Nothing -- LitLit or zero dividend + +-------------------------- +#if __GLASGOW_HASKELL__ >= 500 +wordOp2 op (MachWord w1) (MachWord w2) + = wordResult (w1 `op` w2) +wordOp2 op l1 l2 = Nothing -- Could find LitLit +#endif + +wordOp2Z op (MachWord w1) (MachWord w2) + | w2 /= 0 = Just (mkWordVal (w1 `op` w2)) +wordOp2Z op l1 l2 = Nothing -- LitLit or zero dividend + +#if __GLASGOW_HASKELL__ >= 500 +wordBitOp2 op l1@(MachWord w1) l2@(MachWord w2) + = Just (mkWordVal (w1 `op` w2)) +#else +-- Integer is not an instance of Bits, so we operate on Word64 +wordBitOp2 op l1@(MachWord w1) 
l2@(MachWord w2) + = Just (mkWordVal ((fromIntegral::Word64->Integer) (fromIntegral w1 `op` fromIntegral w2))) +#endif +wordBitOp2 op l1 l2 = Nothing -- Could find LitLit + +-------------------------- +floatOp2 op (MachFloat f1) (MachFloat f2) + = Just (mkFloatVal (f1 `op` f2)) +floatOp2 op l1 l2 = Nothing + +floatOp2Z op (MachFloat f1) (MachFloat f2) + | f2 /= 0 = Just (mkFloatVal (f1 `op` f2)) +floatOp2Z op l1 l2 = Nothing + +-------------------------- +doubleOp2 op (MachDouble f1) (MachDouble f2) + = Just (mkDoubleVal (f1 `op` f2)) +doubleOp2 op l1 l2 = Nothing + +doubleOp2Z op (MachDouble f1) (MachDouble f2) + | f2 /= 0 = Just (mkDoubleVal (f1 `op` f2)) +doubleOp2Z op l1 l2 = Nothing + + +-------------------------- + -- This stuff turns + -- n ==# 3# + -- into + -- case n of + -- 3# -> True + -- m -> False + -- + -- This is a Good Thing, because it allows case-of case things + -- to happen, and case-default absorption to happen. For + -- example: + -- + -- if (n ==# 3#) || (n ==# 4#) then e1 else e2 + -- will transform to + -- case n of + -- 3# -> e1 + -- 4# -> e1 + -- m -> e2 + -- (modulo the usual precautions to avoid duplicating e1) + +litEq :: Bool -- True <=> equality, False <=> inequality + -> RuleFun +litEq is_eq [Lit lit, expr] = do_lit_eq is_eq lit expr +litEq is_eq [expr, Lit lit] = do_lit_eq is_eq lit expr +litEq is_eq other = Nothing + +do_lit_eq is_eq lit expr + = Just (Case expr (mkWildId (literalType lit)) boolTy + [(DEFAULT, [], val_if_neq), + (LitAlt lit, [], val_if_eq)]) + where + val_if_eq | is_eq = trueVal + | otherwise = falseVal + val_if_neq | is_eq = falseVal + | otherwise = trueVal + +-- Note that we *don't* warn the user about overflow. It's not done at +-- runtime either, and compilation of completely harmless things like +-- ((124076834 :: Word32) + (2147483647 :: Word32)) +-- would yield a warning. Instead we simply squash the value into the +-- Int range, but not in a way suitable for cross-compiling... :-( +intResult :: Integer -> Maybe CoreExpr +intResult result + = Just (mkIntVal (toInteger (fromInteger result :: Int))) + +#if __GLASGOW_HASKELL__ >= 500 +wordResult :: Integer -> Maybe CoreExpr +wordResult result + = Just (mkWordVal (toInteger (fromInteger result :: Word))) +#endif +\end{code} + + +%************************************************************************ +%* * +\subsection{Vaguely generic functions +%* * +%************************************************************************ + +\begin{code} +type RuleFun = [CoreExpr] -> Maybe CoreExpr + +twoLits :: (Literal -> Literal -> Maybe CoreExpr) -> RuleFun +twoLits rule [Lit l1, Lit l2] = rule (convFloating l1) (convFloating l2) +twoLits rule _ = Nothing + +oneLit :: (Literal -> Maybe CoreExpr) -> RuleFun +oneLit rule [Lit l1] = rule (convFloating l1) +oneLit rule _ = Nothing + +-- When excess precision is not requested, cut down the precision of the +-- Rational value to that of Float/Double. We confuse host architecture +-- and target architecture here, but it's convenient (and wrong :-). 
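+-- For example, the literal 0.1 is held as the exact Rational 1%10; with
+-- excess precision off, convFloating replaces MachFloat (1%10) by the
+-- Rational of the nearest Float, so that the folded constant agrees with
+-- what the compiled program would compute at run time.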
+convFloating :: Literal -> Literal +convFloating (MachFloat f) | not opt_SimplExcessPrecision = + MachFloat (toRational ((fromRational f) :: Float )) +convFloating (MachDouble d) | not opt_SimplExcessPrecision = + MachDouble (toRational ((fromRational d) :: Double)) +convFloating l = l + + +trueVal = Var trueDataConId +falseVal = Var falseDataConId +mkIntVal i = Lit (mkMachInt i) +mkWordVal w = Lit (mkMachWord w) +mkFloatVal f = Lit (convFloating (MachFloat f)) +mkDoubleVal d = Lit (convFloating (MachDouble d)) +\end{code} + + +%************************************************************************ +%* * +\subsection{Special rules for seq, tagToEnum, dataToTag} +%* * +%************************************************************************ + +\begin{code} +tagToEnumRule [Type ty, Lit (MachInt i)] + = ASSERT( isEnumerationTyCon tycon ) + case filter correct_tag (tyConDataCons_maybe tycon `orElse` []) of + + + [] -> Nothing -- Abstract type + (dc:rest) -> ASSERT( null rest ) + Just (Var (dataConWorkId dc)) + where + correct_tag dc = (dataConTag dc - fIRST_TAG) == tag + tag = fromInteger i + tycon = tyConAppTyCon ty + +tagToEnumRule other = Nothing +\end{code} + +For dataToTag#, we can reduce if either + + (a) the argument is a constructor + (b) the argument is a variable whose unfolding is a known constructor + +\begin{code} +dataToTagRule [Type ty1, Var tag_to_enum `App` Type ty2 `App` tag] + | Just TagToEnumOp <- isPrimOpId_maybe tag_to_enum + , ty1 `coreEqType` ty2 + = Just tag -- dataToTag (tagToEnum x) ==> x + +dataToTagRule [_, val_arg] + | Just (dc,_) <- exprIsConApp_maybe val_arg + = ASSERT( not (isNewTyCon (dataConTyCon dc)) ) + Just (mkIntVal (toInteger (dataConTag dc - fIRST_TAG))) + +dataToTagRule other = Nothing +\end{code} + +%************************************************************************ +%* * +\subsection{Built in rules} +%* * +%************************************************************************ + +\begin{code} +builtinRules :: [CoreRule] +-- Rules for non-primops that can't be expressed using a RULE pragma +builtinRules + = [ BuiltinRule FSLIT("AppendLitString") unpackCStringFoldrName match_append_lit, + BuiltinRule FSLIT("EqString") eqStringName match_eq_string + ] + + +-- The rule is this: +-- unpackFoldrCString# "foo" c (unpackFoldrCString# "baz" c n) = unpackFoldrCString# "foobaz" c n + +match_append_lit [Type ty1, + Lit (MachStr s1), + c1, + Var unpk `App` Type ty2 + `App` Lit (MachStr s2) + `App` c2 + `App` n + ] + | unpk `hasKey` unpackCStringFoldrIdKey && + c1 `cheapEqExpr` c2 + = ASSERT( ty1 `coreEqType` ty2 ) + Just (Var unpk `App` Type ty1 + `App` Lit (MachStr (s1 `appendFS` s2)) + `App` c1 + `App` n) + +match_append_lit other = Nothing + +-- The rule is this: +-- eqString (unpackCString# (Lit s1)) (unpackCString# (Lit s2) = s1==s2 + +match_eq_string [Var unpk1 `App` Lit (MachStr s1), + Var unpk2 `App` Lit (MachStr s2)] + | unpk1 `hasKey` unpackCStringIdKey, + unpk2 `hasKey` unpackCStringIdKey + = Just (if s1 == s2 then trueVal else falseVal) + +match_eq_string other = Nothing +\end{code} diff --git a/compiler/prelude/PrimOp.lhs b/compiler/prelude/PrimOp.lhs new file mode 100644 index 0000000000..a650352280 --- /dev/null +++ b/compiler/prelude/PrimOp.lhs @@ -0,0 +1,461 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[PrimOp]{Primitive operations (machine-level)} + +\begin{code} +module PrimOp ( + PrimOp(..), allThePrimOps, + primOpType, primOpSig, + primOpTag, maxPrimOpTag, primOpOcc, + + primOpOutOfLine, 
primOpNeedsWrapper, + primOpOkForSpeculation, primOpIsCheap, primOpIsDupable, + + getPrimOpResultInfo, PrimOpResultInfo(..) + ) where + +#include "HsVersions.h" + +import TysPrim +import TysWiredIn + +import NewDemand +import Var ( TyVar ) +import OccName ( OccName, pprOccName, mkVarOccFS ) +import TyCon ( TyCon, isPrimTyCon, tyConPrimRep, PrimRep(..) ) +import Type ( Type, mkForAllTys, mkFunTy, mkFunTys, tyConAppTyCon, + typePrimRep ) +import BasicTypes ( Arity, Boxity(..) ) +import Outputable +import FastTypes +\end{code} + +%************************************************************************ +%* * +\subsection[PrimOp-datatype]{Datatype for @PrimOp@ (an enumeration)} +%* * +%************************************************************************ + +These are in \tr{state-interface.verb} order. + +\begin{code} + +-- supplies: +-- data PrimOp = ... +#include "primop-data-decl.hs-incl" +\end{code} + +Used for the Ord instance + +\begin{code} +primOpTag :: PrimOp -> Int +primOpTag op = iBox (tagOf_PrimOp op) + +-- supplies +-- tagOf_PrimOp :: PrimOp -> FastInt +#include "primop-tag.hs-incl" + + +instance Eq PrimOp where + op1 == op2 = tagOf_PrimOp op1 ==# tagOf_PrimOp op2 + +instance Ord PrimOp where + op1 < op2 = tagOf_PrimOp op1 <# tagOf_PrimOp op2 + op1 <= op2 = tagOf_PrimOp op1 <=# tagOf_PrimOp op2 + op1 >= op2 = tagOf_PrimOp op1 >=# tagOf_PrimOp op2 + op1 > op2 = tagOf_PrimOp op1 ># tagOf_PrimOp op2 + op1 `compare` op2 | op1 < op2 = LT + | op1 == op2 = EQ + | otherwise = GT + +instance Outputable PrimOp where + ppr op = pprPrimOp op + +instance Show PrimOp where + showsPrec p op = showsPrecSDoc p (pprPrimOp op) +\end{code} + +An @Enum@-derived list would be better; meanwhile... (ToDo) + +\begin{code} +allThePrimOps :: [PrimOp] +allThePrimOps = +#include "primop-list.hs-incl" +\end{code} + +%************************************************************************ +%* * +\subsection[PrimOp-info]{The essential info about each @PrimOp@} +%* * +%************************************************************************ + +The @String@ in the @PrimOpInfos@ is the ``base name'' by which the user may +refer to the primitive operation. The conventional \tr{#}-for- +unboxed ops is added on later. + +The reason for the funny characters in the names is so we do not +interfere with the programmer's Haskell name spaces. + +We use @PrimKinds@ for the ``type'' information, because they're +(slightly) more convenient to use than @TyCons@. +\begin{code} +data PrimOpInfo + = Dyadic OccName -- string :: T -> T -> T + Type + | Monadic OccName -- string :: T -> T + Type + | Compare OccName -- string :: T -> T -> Bool + Type + + | GenPrimOp OccName -- string :: \/a1..an . T1 -> .. -> Tk -> T + [TyVar] + [Type] + Type + +mkDyadic str ty = Dyadic (mkVarOccFS str) ty +mkMonadic str ty = Monadic (mkVarOccFS str) ty +mkCompare str ty = Compare (mkVarOccFS str) ty +mkGenPrimOp str tvs tys ty = GenPrimOp (mkVarOccFS str) tvs tys ty +\end{code} + +%************************************************************************ +%* * +\subsubsection{Strictness} +%* * +%************************************************************************ + +Not all primops are strict! + +\begin{code} +primOpStrictness :: PrimOp -> Arity -> StrictSig + -- See Demand.StrictnessInfo for discussion of what the results + -- The arity should be the arity of the primop; that's why + -- this function isn't exported. 
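+   -- The default, from primops.txt.pp, makes every argument lazy:
+   --     \ arity -> mkStrictSig (mkTopDmdType (replicate arity lazyDmd) TopRes)
+   -- Primops that need anything stronger say so with an explicit
+   -- 'strictness' attribute in their entry.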
+#include "primop-strictness.hs-incl" +\end{code} + +%************************************************************************ +%* * +\subsubsection[PrimOp-comparison]{PrimOpInfo basic comparison ops} +%* * +%************************************************************************ + +@primOpInfo@ gives all essential information (from which everything +else, notably a type, can be constructed) for each @PrimOp@. + +\begin{code} +primOpInfo :: PrimOp -> PrimOpInfo +#include "primop-primop-info.hs-incl" +\end{code} + +Here are a load of comments from the old primOp info: + +A @Word#@ is an unsigned @Int#@. + +@decodeFloat#@ is given w/ Integer-stuff (it's similar). + +@decodeDouble#@ is given w/ Integer-stuff (it's similar). + +Decoding of floating-point numbers is sorta Integer-related. Encoding +is done with plain ccalls now (see PrelNumExtra.lhs). + +A @Weak@ Pointer is created by the @mkWeak#@ primitive: + + mkWeak# :: k -> v -> f -> State# RealWorld + -> (# State# RealWorld, Weak# v #) + +In practice, you'll use the higher-level + + data Weak v = Weak# v + mkWeak :: k -> v -> IO () -> IO (Weak v) + +The following operation dereferences a weak pointer. The weak pointer +may have been finalized, so the operation returns a result code which +must be inspected before looking at the dereferenced value. + + deRefWeak# :: Weak# v -> State# RealWorld -> + (# State# RealWorld, v, Int# #) + +Only look at v if the Int# returned is /= 0 !! + +The higher-level op is + + deRefWeak :: Weak v -> IO (Maybe v) + +Weak pointers can be finalized early by using the finalize# operation: + + finalizeWeak# :: Weak# v -> State# RealWorld -> + (# State# RealWorld, Int#, IO () #) + +The Int# returned is either + + 0 if the weak pointer has already been finalized, or it has no + finalizer (the third component is then invalid). + + 1 if the weak pointer is still alive, with the finalizer returned + as the third component. + +A {\em stable name/pointer} is an index into a table of stable name +entries. Since the garbage collector is told about stable pointers, +it is safe to pass a stable pointer to external systems such as C +routines. + +\begin{verbatim} +makeStablePtr# :: a -> State# RealWorld -> (# State# RealWorld, StablePtr# a #) +freeStablePtr :: StablePtr# a -> State# RealWorld -> State# RealWorld +deRefStablePtr# :: StablePtr# a -> State# RealWorld -> (# State# RealWorld, a #) +eqStablePtr# :: StablePtr# a -> StablePtr# a -> Int# +\end{verbatim} + +It may seem a bit surprising that @makeStablePtr#@ is a @IO@ +operation since it doesn't (directly) involve IO operations. The +reason is that if some optimisation pass decided to duplicate calls to +@makeStablePtr#@ and we only pass one of the stable pointers over, a +massive space leak can result. Putting it into the IO monad +prevents this. (Another reason for putting them in a monad is to +ensure correct sequencing wrt the side-effecting @freeStablePtr@ +operation.) + +An important property of stable pointers is that if you call +makeStablePtr# twice on the same object you get the same stable +pointer back. + +Note that we can implement @freeStablePtr#@ using @_ccall_@ (and, +besides, it's not likely to be used from Haskell) so it's not a +primop. + +Question: Why @RealWorld@ - won't any instance of @_ST@ do the job? [ADR] + +Stable Names +~~~~~~~~~~~~ + +A stable name is like a stable pointer, but with three important differences: + + (a) You can't deRef one to get back to the original object. + (b) You can convert one to an Int. 
+ (c) You don't need to 'freeStableName' + +The existence of a stable name doesn't guarantee to keep the object it +points to alive (unlike a stable pointer), hence (a). + +Invariants: + + (a) makeStableName always returns the same value for a given + object (same as stable pointers). + + (b) if two stable names are equal, it implies that the objects + from which they were created were the same. + + (c) stableNameToInt always returns the same Int for a given + stable name. + + +-- HWL: The first 4 Int# in all par... annotations denote: +-- name, granularity info, size of result, degree of parallelism +-- Same structure as _seq_ i.e. returns Int# +-- KSW: v, the second arg in parAt# and parAtForNow#, is used only to determine +-- `the processor containing the expression v'; it is not evaluated + +These primops are pretty wierd. + + dataToTag# :: a -> Int (arg must be an evaluated data type) + tagToEnum# :: Int -> a (result type must be an enumerated type) + +The constraints aren't currently checked by the front end, but the +code generator will fall over if they aren't satisfied. + +\begin{code} +#ifdef DEBUG +primOpInfo op = pprPanic "primOpInfo:" (ppr op) +#endif +\end{code} + +%************************************************************************ +%* * +\subsubsection[PrimOp-ool]{Which PrimOps are out-of-line} +%* * +%************************************************************************ + +Some PrimOps need to be called out-of-line because they either need to +perform a heap check or they block. + + +\begin{code} +primOpOutOfLine :: PrimOp -> Bool +#include "primop-out-of-line.hs-incl" +\end{code} + + +primOpOkForSpeculation +~~~~~~~~~~~~~~~~~~~~~~ +Sometimes we may choose to execute a PrimOp even though it isn't +certain that its result will be required; ie execute them +``speculatively''. The same thing as ``cheap eagerness.'' Usually +this is OK, because PrimOps are usually cheap, but it isn't OK for +(a)~expensive PrimOps and (b)~PrimOps which can fail. + +PrimOps that have side effects also should not be executed speculatively. + +Ok-for-speculation also means that it's ok *not* to execute the +primop. For example + case op a b of + r -> 3 +Here the result is not used, so we can discard the primop. Anything +that has side effects mustn't be dicarded in this way, of course! + +See also @primOpIsCheap@ (below). + + +\begin{code} +primOpOkForSpeculation :: PrimOp -> Bool + -- See comments with CoreUtils.exprOkForSpeculation +primOpOkForSpeculation op + = not (primOpHasSideEffects op || primOpOutOfLine op || primOpCanFail op) +\end{code} + + +primOpIsCheap +~~~~~~~~~~~~~ +@primOpIsCheap@, as used in \tr{SimplUtils.lhs}. For now (HACK +WARNING), we just borrow some other predicates for a +what-should-be-good-enough test. "Cheap" means willing to call it more +than once, and/or push it inside a lambda. The latter could change the +behaviour of 'seq' for primops that can fail, so we don't treat them as cheap. + +\begin{code} +primOpIsCheap :: PrimOp -> Bool +primOpIsCheap op = primOpOkForSpeculation op +-- In March 2001, we changed this to +-- primOpIsCheap op = False +-- thereby making *no* primops seem cheap. But this killed eta +-- expansion on case (x ==# y) of True -> \s -> ... +-- which is bad. In particular a loop like +-- doLoop n = loop 0 +-- where +-- loop i | i == n = return () +-- | otherwise = bar i >> loop (i+1) +-- allocated a closure every time round because it doesn't eta expand. 
+-- +-- The problem that originally gave rise to the change was +-- let x = a +# b *# c in x +# x +-- were we don't want to inline x. But primopIsCheap doesn't control +-- that (it's exprIsDupable that does) so the problem doesn't occur +-- even if primOpIsCheap sometimes says 'True'. +\end{code} + +primOpIsDupable +~~~~~~~~~~~~~~~ +primOpIsDupable means that the use of the primop is small enough to +duplicate into different case branches. See CoreUtils.exprIsDupable. + +\begin{code} +primOpIsDupable :: PrimOp -> Bool + -- See comments with CoreUtils.exprIsDupable + -- We say it's dupable it isn't implemented by a C call with a wrapper +primOpIsDupable op = not (primOpNeedsWrapper op) +\end{code} + + +\begin{code} +primOpCanFail :: PrimOp -> Bool +#include "primop-can-fail.hs-incl" +\end{code} + +And some primops have side-effects and so, for example, must not be +duplicated. + +\begin{code} +primOpHasSideEffects :: PrimOp -> Bool +#include "primop-has-side-effects.hs-incl" +\end{code} + +Inline primitive operations that perform calls need wrappers to save +any live variables that are stored in caller-saves registers. + +\begin{code} +primOpNeedsWrapper :: PrimOp -> Bool +#include "primop-needs-wrapper.hs-incl" +\end{code} + +\begin{code} +primOpType :: PrimOp -> Type -- you may want to use primOpSig instead +primOpType op + = case (primOpInfo op) of + Dyadic occ ty -> dyadic_fun_ty ty + Monadic occ ty -> monadic_fun_ty ty + Compare occ ty -> compare_fun_ty ty + + GenPrimOp occ tyvars arg_tys res_ty -> + mkForAllTys tyvars (mkFunTys arg_tys res_ty) + +primOpOcc :: PrimOp -> OccName +primOpOcc op = case (primOpInfo op) of + Dyadic occ _ -> occ + Monadic occ _ -> occ + Compare occ _ -> occ + GenPrimOp occ _ _ _ -> occ + +-- primOpSig is like primOpType but gives the result split apart: +-- (type variables, argument types, result type) +-- It also gives arity, strictness info + +primOpSig :: PrimOp -> ([TyVar], [Type], Type, Arity, StrictSig) +primOpSig op + = (tyvars, arg_tys, res_ty, arity, primOpStrictness op arity) + where + arity = length arg_tys + (tyvars, arg_tys, res_ty) + = case (primOpInfo op) of + Monadic occ ty -> ([], [ty], ty ) + Dyadic occ ty -> ([], [ty,ty], ty ) + Compare occ ty -> ([], [ty,ty], boolTy) + GenPrimOp occ tyvars arg_tys res_ty + -> (tyvars, arg_tys, res_ty) +\end{code} + +\begin{code} +data PrimOpResultInfo + = ReturnsPrim PrimRep + | ReturnsAlg TyCon + +-- Some PrimOps need not return a manifest primitive or algebraic value +-- (i.e. they might return a polymorphic value). These PrimOps *must* +-- be out of line, or the code generator won't work. + +getPrimOpResultInfo :: PrimOp -> PrimOpResultInfo +getPrimOpResultInfo op + = case (primOpInfo op) of + Dyadic _ ty -> ReturnsPrim (typePrimRep ty) + Monadic _ ty -> ReturnsPrim (typePrimRep ty) + Compare _ ty -> ReturnsAlg boolTyCon + GenPrimOp _ _ _ ty | isPrimTyCon tc -> ReturnsPrim (tyConPrimRep tc) + | otherwise -> ReturnsAlg tc + where + tc = tyConAppTyCon ty + -- All primops return a tycon-app result + -- The tycon can be an unboxed tuple, though, which + -- gives rise to a ReturnAlg +\end{code} + +The commutable ops are those for which we will try to move constants +to the right hand side for strength reduction. 
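+
+For illustration, here is a minimal standalone sketch of that normalisation
+step over a toy expression type (Expr, commute and the operator strings are
+illustrative names only, not part of this module; the real rewriting is
+driven by the primop rules earlier in this patch):
+
+\begin{verbatim}
+data Expr = Lit Integer | Var String | Prim String Expr Expr
+
+-- Move a literal argument of a commutable operation to the right-hand
+-- side, so that later rules need only match the (expression, literal) shape.
+commute :: Expr -> Expr
+commute (Prim op l@(Lit _) e)
+  | op `elem` ["+#", "*#"]          -- stands in for commutableOp
+  = Prim op e l
+commute e = e
+\end{verbatim}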
+ +\begin{code} +commutableOp :: PrimOp -> Bool +#include "primop-commutable.hs-incl" +\end{code} + +Utils: +\begin{code} +dyadic_fun_ty ty = mkFunTys [ty, ty] ty +monadic_fun_ty ty = mkFunTy ty ty +compare_fun_ty ty = mkFunTys [ty, ty] boolTy +\end{code} + +Output stuff: +\begin{code} +pprPrimOp :: PrimOp -> SDoc +pprPrimOp other_op = pprOccName (primOpOcc other_op) +\end{code} + diff --git a/compiler/prelude/TysPrim.lhs b/compiler/prelude/TysPrim.lhs new file mode 100644 index 0000000000..2f6168bafb --- /dev/null +++ b/compiler/prelude/TysPrim.lhs @@ -0,0 +1,392 @@ +% +% (c) The AQUA Project, Glasgow University, 1994-1998 +% +\section[TysPrim]{Wired-in knowledge about primitive types} + +\begin{code} +module TysPrim( + alphaTyVars, betaTyVars, alphaTyVar, betaTyVar, gammaTyVar, deltaTyVar, + alphaTy, betaTy, gammaTy, deltaTy, + openAlphaTy, openAlphaTyVar, openAlphaTyVars, + + primTyCons, + + charPrimTyCon, charPrimTy, + intPrimTyCon, intPrimTy, + wordPrimTyCon, wordPrimTy, + addrPrimTyCon, addrPrimTy, + floatPrimTyCon, floatPrimTy, + doublePrimTyCon, doublePrimTy, + + statePrimTyCon, mkStatePrimTy, + realWorldTyCon, realWorldTy, realWorldStatePrimTy, + + arrayPrimTyCon, mkArrayPrimTy, + byteArrayPrimTyCon, byteArrayPrimTy, + mutableArrayPrimTyCon, mkMutableArrayPrimTy, + mutableByteArrayPrimTyCon, mkMutableByteArrayPrimTy, + mutVarPrimTyCon, mkMutVarPrimTy, + + mVarPrimTyCon, mkMVarPrimTy, + tVarPrimTyCon, mkTVarPrimTy, + stablePtrPrimTyCon, mkStablePtrPrimTy, + stableNamePrimTyCon, mkStableNamePrimTy, + bcoPrimTyCon, bcoPrimTy, + weakPrimTyCon, mkWeakPrimTy, + threadIdPrimTyCon, threadIdPrimTy, + + int32PrimTyCon, int32PrimTy, + word32PrimTyCon, word32PrimTy, + + int64PrimTyCon, int64PrimTy, + word64PrimTyCon, word64PrimTy + ) where + +#include "HsVersions.h" + +import Var ( TyVar, mkTyVar ) +import Name ( Name, BuiltInSyntax(..), mkInternalName, mkWiredInName ) +import OccName ( mkOccNameFS, tcName, mkTyVarOcc ) +import TyCon ( TyCon, ArgVrcs, mkPrimTyCon, mkLiftedPrimTyCon, + PrimRep(..) ) +import Type ( mkTyConApp, mkTyConTy, mkTyVarTys, mkTyVarTy, + unliftedTypeKind, liftedTypeKind, openTypeKind, + Kind, mkArrowKinds, + TyThing(..) 
+ ) +import SrcLoc ( noSrcLoc ) +import Unique ( mkAlphaTyVarUnique ) +import PrelNames +import FastString ( FastString, mkFastString ) +import Outputable + +import Char ( ord, chr ) +\end{code} + +%************************************************************************ +%* * +\subsection{Primitive type constructors} +%* * +%************************************************************************ + +\begin{code} +primTyCons :: [TyCon] +primTyCons + = [ addrPrimTyCon + , arrayPrimTyCon + , byteArrayPrimTyCon + , charPrimTyCon + , doublePrimTyCon + , floatPrimTyCon + , intPrimTyCon + , int32PrimTyCon + , int64PrimTyCon + , bcoPrimTyCon + , weakPrimTyCon + , mutableArrayPrimTyCon + , mutableByteArrayPrimTyCon + , mVarPrimTyCon + , tVarPrimTyCon + , mutVarPrimTyCon + , realWorldTyCon + , stablePtrPrimTyCon + , stableNamePrimTyCon + , statePrimTyCon + , threadIdPrimTyCon + , wordPrimTyCon + , word32PrimTyCon + , word64PrimTyCon + ] + +mkPrimTc :: FastString -> Unique -> TyCon -> Name +mkPrimTc fs uniq tycon + = mkWiredInName gHC_PRIM (mkOccNameFS tcName fs) + uniq + Nothing -- No parent object + (ATyCon tycon) -- Relevant TyCon + UserSyntax -- None are built-in syntax + +charPrimTyConName = mkPrimTc FSLIT("Char#") charPrimTyConKey charPrimTyCon +intPrimTyConName = mkPrimTc FSLIT("Int#") intPrimTyConKey intPrimTyCon +int32PrimTyConName = mkPrimTc FSLIT("Int32#") int32PrimTyConKey int32PrimTyCon +int64PrimTyConName = mkPrimTc FSLIT("Int64#") int64PrimTyConKey int64PrimTyCon +wordPrimTyConName = mkPrimTc FSLIT("Word#") wordPrimTyConKey wordPrimTyCon +word32PrimTyConName = mkPrimTc FSLIT("Word32#") word32PrimTyConKey word32PrimTyCon +word64PrimTyConName = mkPrimTc FSLIT("Word64#") word64PrimTyConKey word64PrimTyCon +addrPrimTyConName = mkPrimTc FSLIT("Addr#") addrPrimTyConKey addrPrimTyCon +floatPrimTyConName = mkPrimTc FSLIT("Float#") floatPrimTyConKey floatPrimTyCon +doublePrimTyConName = mkPrimTc FSLIT("Double#") doublePrimTyConKey doublePrimTyCon +statePrimTyConName = mkPrimTc FSLIT("State#") statePrimTyConKey statePrimTyCon +realWorldTyConName = mkPrimTc FSLIT("RealWorld") realWorldTyConKey realWorldTyCon +arrayPrimTyConName = mkPrimTc FSLIT("Array#") arrayPrimTyConKey arrayPrimTyCon +byteArrayPrimTyConName = mkPrimTc FSLIT("ByteArray#") byteArrayPrimTyConKey byteArrayPrimTyCon +mutableArrayPrimTyConName = mkPrimTc FSLIT("MutableArray#") mutableArrayPrimTyConKey mutableArrayPrimTyCon +mutableByteArrayPrimTyConName = mkPrimTc FSLIT("MutableByteArray#") mutableByteArrayPrimTyConKey mutableByteArrayPrimTyCon +mutVarPrimTyConName = mkPrimTc FSLIT("MutVar#") mutVarPrimTyConKey mutVarPrimTyCon +mVarPrimTyConName = mkPrimTc FSLIT("MVar#") mVarPrimTyConKey mVarPrimTyCon +tVarPrimTyConName = mkPrimTc FSLIT("TVar#") tVarPrimTyConKey tVarPrimTyCon +stablePtrPrimTyConName = mkPrimTc FSLIT("StablePtr#") stablePtrPrimTyConKey stablePtrPrimTyCon +stableNamePrimTyConName = mkPrimTc FSLIT("StableName#") stableNamePrimTyConKey stableNamePrimTyCon +bcoPrimTyConName = mkPrimTc FSLIT("BCO#") bcoPrimTyConKey bcoPrimTyCon +weakPrimTyConName = mkPrimTc FSLIT("Weak#") weakPrimTyConKey weakPrimTyCon +threadIdPrimTyConName = mkPrimTc FSLIT("ThreadId#") threadIdPrimTyConKey threadIdPrimTyCon +\end{code} + +%************************************************************************ +%* * +\subsection{Support code} +%* * +%************************************************************************ + +alphaTyVars is a list of type variables for use in templates: + ["a", "b", ..., "z", "t1", "t2", ... 
] + +\begin{code} +tyVarList :: Kind -> [TyVar] +tyVarList kind = [ mkTyVar (mkInternalName (mkAlphaTyVarUnique u) + (mkTyVarOcc (mkFastString name)) + noSrcLoc) kind + | u <- [2..], + let name | c <= 'z' = [c] + | otherwise = 't':show u + where c = chr (u-2 + ord 'a') + ] + +alphaTyVars :: [TyVar] +alphaTyVars = tyVarList liftedTypeKind + +betaTyVars = tail alphaTyVars + +alphaTyVar, betaTyVar, gammaTyVar :: TyVar +(alphaTyVar:betaTyVar:gammaTyVar:deltaTyVar:_) = alphaTyVars + +alphaTys = mkTyVarTys alphaTyVars +(alphaTy:betaTy:gammaTy:deltaTy:_) = alphaTys + + -- openAlphaTyVar is prepared to be instantiated + -- to a lifted or unlifted type variable. It's used for the + -- result type for "error", so that we can have (error Int# "Help") +openAlphaTyVars :: [TyVar] +openAlphaTyVars@(openAlphaTyVar:_) = tyVarList openTypeKind + +openAlphaTy = mkTyVarTy openAlphaTyVar + +vrcPos,vrcZero :: (Bool,Bool) +vrcPos = (True,False) +vrcZero = (False,False) + +vrcsP,vrcsZ,vrcsZP :: ArgVrcs +vrcsP = [vrcPos] +vrcsZ = [vrcZero] +vrcsZP = [vrcZero,vrcPos] +\end{code} + + +%************************************************************************ +%* * +\subsection[TysPrim-basic]{Basic primitive types (@Char#@, @Int#@, etc.)} +%* * +%************************************************************************ + +\begin{code} +-- only used herein +pcPrimTyCon :: Name -> ArgVrcs -> PrimRep -> TyCon +pcPrimTyCon name arg_vrcs rep + = mkPrimTyCon name kind arity arg_vrcs rep + where + arity = length arg_vrcs + kind = mkArrowKinds (replicate arity liftedTypeKind) result_kind + result_kind = unliftedTypeKind -- all primitive types are unlifted + +pcPrimTyCon0 :: Name -> PrimRep -> TyCon +pcPrimTyCon0 name rep + = mkPrimTyCon name result_kind 0 [] rep + where + result_kind = unliftedTypeKind -- all primitive types are unlifted + +charPrimTy = mkTyConTy charPrimTyCon +charPrimTyCon = pcPrimTyCon0 charPrimTyConName WordRep + +intPrimTy = mkTyConTy intPrimTyCon +intPrimTyCon = pcPrimTyCon0 intPrimTyConName IntRep + +int32PrimTy = mkTyConTy int32PrimTyCon +int32PrimTyCon = pcPrimTyCon0 int32PrimTyConName IntRep + +int64PrimTy = mkTyConTy int64PrimTyCon +int64PrimTyCon = pcPrimTyCon0 int64PrimTyConName Int64Rep + +wordPrimTy = mkTyConTy wordPrimTyCon +wordPrimTyCon = pcPrimTyCon0 wordPrimTyConName WordRep + +word32PrimTy = mkTyConTy word32PrimTyCon +word32PrimTyCon = pcPrimTyCon0 word32PrimTyConName WordRep + +word64PrimTy = mkTyConTy word64PrimTyCon +word64PrimTyCon = pcPrimTyCon0 word64PrimTyConName Word64Rep + +addrPrimTy = mkTyConTy addrPrimTyCon +addrPrimTyCon = pcPrimTyCon0 addrPrimTyConName AddrRep + +floatPrimTy = mkTyConTy floatPrimTyCon +floatPrimTyCon = pcPrimTyCon0 floatPrimTyConName FloatRep + +doublePrimTy = mkTyConTy doublePrimTyCon +doublePrimTyCon = pcPrimTyCon0 doublePrimTyConName DoubleRep +\end{code} + + +%************************************************************************ +%* * +\subsection[TysPrim-state]{The @State#@ type (and @_RealWorld@ types)} +%* * +%************************************************************************ + +State# is the primitive, unlifted type of states. It has one type parameter, +thus + State# RealWorld +or + State# s + +where s is a type variable. The only purpose of the type parameter is to +keep different state threads separate. It is represented by nothing at all. + +\begin{code} +mkStatePrimTy ty = mkTyConApp statePrimTyCon [ty] +statePrimTyCon = pcPrimTyCon statePrimTyConName vrcsZ VoidRep +\end{code} + +RealWorld is deeply magical. 
It is *primitive*, but it is not +*unlifted* (hence ptrArg). We never manipulate values of type +RealWorld; it's only used in the type system, to parameterise State#. + +\begin{code} +realWorldTyCon = mkLiftedPrimTyCon realWorldTyConName liftedTypeKind 0 [] PtrRep +realWorldTy = mkTyConTy realWorldTyCon +realWorldStatePrimTy = mkStatePrimTy realWorldTy -- State# RealWorld +\end{code} + +Note: the ``state-pairing'' types are not truly primitive, so they are +defined in \tr{TysWiredIn.lhs}, not here. + + +%************************************************************************ +%* * +\subsection[TysPrim-arrays]{The primitive array types} +%* * +%************************************************************************ + +\begin{code} +arrayPrimTyCon = pcPrimTyCon arrayPrimTyConName vrcsP PtrRep +mutableArrayPrimTyCon = pcPrimTyCon mutableArrayPrimTyConName vrcsZP PtrRep +mutableByteArrayPrimTyCon = pcPrimTyCon mutableByteArrayPrimTyConName vrcsZ PtrRep +byteArrayPrimTyCon = pcPrimTyCon0 byteArrayPrimTyConName PtrRep + +mkArrayPrimTy elt = mkTyConApp arrayPrimTyCon [elt] +byteArrayPrimTy = mkTyConTy byteArrayPrimTyCon +mkMutableArrayPrimTy s elt = mkTyConApp mutableArrayPrimTyCon [s, elt] +mkMutableByteArrayPrimTy s = mkTyConApp mutableByteArrayPrimTyCon [s] +\end{code} + +%************************************************************************ +%* * +\subsection[TysPrim-mut-var]{The mutable variable type} +%* * +%************************************************************************ + +\begin{code} +mutVarPrimTyCon = pcPrimTyCon mutVarPrimTyConName vrcsZP PtrRep + +mkMutVarPrimTy s elt = mkTyConApp mutVarPrimTyCon [s, elt] +\end{code} + +%************************************************************************ +%* * +\subsection[TysPrim-synch-var]{The synchronizing variable type} +%* * +%************************************************************************ + +\begin{code} +mVarPrimTyCon = pcPrimTyCon mVarPrimTyConName vrcsZP PtrRep + +mkMVarPrimTy s elt = mkTyConApp mVarPrimTyCon [s, elt] +\end{code} + +%************************************************************************ +%* * +\subsection[TysPrim-stm-var]{The transactional variable type} +%* * +%************************************************************************ + +\begin{code} +tVarPrimTyCon = pcPrimTyCon tVarPrimTyConName vrcsZP PtrRep + +mkTVarPrimTy s elt = mkTyConApp tVarPrimTyCon [s, elt] +\end{code} + +%************************************************************************ +%* * +\subsection[TysPrim-stable-ptrs]{The stable-pointer type} +%* * +%************************************************************************ + +\begin{code} +stablePtrPrimTyCon = pcPrimTyCon stablePtrPrimTyConName vrcsP AddrRep + +mkStablePtrPrimTy ty = mkTyConApp stablePtrPrimTyCon [ty] +\end{code} + +%************************************************************************ +%* * +\subsection[TysPrim-stable-names]{The stable-name type} +%* * +%************************************************************************ + +\begin{code} +stableNamePrimTyCon = pcPrimTyCon stableNamePrimTyConName vrcsP PtrRep + +mkStableNamePrimTy ty = mkTyConApp stableNamePrimTyCon [ty] +\end{code} + +%************************************************************************ +%* * +\subsection[TysPrim-BCOs]{The ``bytecode object'' type} +%* * +%************************************************************************ + +\begin{code} +bcoPrimTy = mkTyConTy bcoPrimTyCon +bcoPrimTyCon = pcPrimTyCon0 bcoPrimTyConName PtrRep +\end{code} + 
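+
+These builders are what primOpType (in PrimOp) composes to give each primop
+its type.  For example, the type of makeStablePtr#,
+
+   a -> State# RealWorld -> (# State# RealWorld, StablePtr# a #)
+
+would be assembled roughly as follows (a sketch only; mkForAllTys, mkFunTys
+and mkTupleTy live in Type and TysWiredIn rather than here):
+
+\begin{verbatim}
+mkForAllTys [alphaTyVar] $
+  mkFunTys [alphaTy, realWorldStatePrimTy]
+           (mkTupleTy Unboxed 2 [realWorldStatePrimTy,
+                                 mkStablePtrPrimTy alphaTy])
+\end{verbatim}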
+%************************************************************************ +%* * +\subsection[TysPrim-Weak]{The ``weak pointer'' type} +%* * +%************************************************************************ + +\begin{code} +weakPrimTyCon = pcPrimTyCon weakPrimTyConName vrcsP PtrRep + +mkWeakPrimTy v = mkTyConApp weakPrimTyCon [v] +\end{code} + +%************************************************************************ +%* * +\subsection[TysPrim-thread-ids]{The ``thread id'' type} +%* * +%************************************************************************ + +A thread id is represented by a pointer to the TSO itself, to ensure +that they are always unique and we can always find the TSO for a given +thread id. However, this has the unfortunate consequence that a +ThreadId# for a given thread is treated as a root by the garbage +collector and can keep TSOs around for too long. + +Hence the programmer API for thread manipulation uses a weak pointer +to the thread id internally. + +\begin{code} +threadIdPrimTy = mkTyConTy threadIdPrimTyCon +threadIdPrimTyCon = pcPrimTyCon0 threadIdPrimTyConName PtrRep +\end{code} diff --git a/compiler/prelude/TysWiredIn.lhs b/compiler/prelude/TysWiredIn.lhs new file mode 100644 index 0000000000..ceb4df550a --- /dev/null +++ b/compiler/prelude/TysWiredIn.lhs @@ -0,0 +1,549 @@ +% +% (c) The GRASP Project, Glasgow University, 1994-1998 +% +\section[TysWiredIn]{Wired-in knowledge about {\em non-primitive} types} + +This module is about types that can be defined in Haskell, but which +must be wired into the compiler nonetheless. + +This module tracks the ``state interface'' document, ``GHC prelude: +types and operations.'' + +\begin{code} +module TysWiredIn ( + wiredInTyCons, + + boolTy, boolTyCon, boolTyCon_RDR, boolTyConName, + trueDataCon, trueDataConId, true_RDR, + falseDataCon, falseDataConId, false_RDR, + + charTyCon, charDataCon, charTyCon_RDR, + charTy, stringTy, charTyConName, + + + doubleTyCon, doubleDataCon, doubleTy, doubleTyConName, + + floatTyCon, floatDataCon, floatTy, floatTyConName, + + intTyCon, intDataCon, intTyCon_RDR, intDataCon_RDR, intTyConName, + intTy, + + listTyCon, nilDataCon, consDataCon, + listTyCon_RDR, consDataCon_RDR, listTyConName, + mkListTy, + + -- tuples + mkTupleTy, + tupleTyCon, tupleCon, + unitTyCon, unitDataCon, unitDataConId, pairTyCon, + unboxedSingletonTyCon, unboxedSingletonDataCon, + unboxedPairTyCon, unboxedPairDataCon, + + unitTy, + voidTy, + + -- parallel arrays + mkPArrTy, + parrTyCon, parrFakeCon, isPArrTyCon, isPArrFakeCon, + parrTyCon_RDR, parrTyConName + ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} MkId( mkDataConIds ) + +-- friends: +import PrelNames +import TysPrim + +-- others: +import Constants ( mAX_TUPLE_SIZE ) +import Module ( Module ) +import RdrName ( nameRdrName ) +import Name ( Name, BuiltInSyntax(..), nameUnique, nameOccName, + nameModule, mkWiredInName ) +import OccName ( mkOccNameFS, tcName, dataName, mkTupleOcc, + mkDataConWorkerOcc ) +import DataCon ( DataCon, mkDataCon, dataConWorkId, dataConSourceArity ) +import Var ( TyVar, tyVarKind ) +import TyCon ( TyCon, AlgTyConRhs(DataTyCon), tyConDataCons, + mkTupleTyCon, mkAlgTyCon, tyConName ) + +import BasicTypes ( Arity, RecFlag(..), Boxity(..), isBoxed, + StrictnessMark(..) ) + +import Type ( Type, mkTyConTy, mkTyConApp, mkTyVarTy, mkTyVarTys, + TyThing(..) 
) +import Kind ( mkArrowKinds, liftedTypeKind, ubxTupleKind ) +import Unique ( incrUnique, mkTupleTyConUnique, + mkTupleDataConUnique, mkPArrDataConUnique ) +import Array +import FastString +import Outputable + +alpha_tyvar = [alphaTyVar] +alpha_ty = [alphaTy] +\end{code} + + +%************************************************************************ +%* * +\subsection{Wired in type constructors} +%* * +%************************************************************************ + +If you change which things are wired in, make sure you change their +names in PrelNames, so they use wTcQual, wDataQual, etc + +\begin{code} +wiredInTyCons :: [TyCon] -- Excludes tuples +wiredInTyCons = [ unitTyCon -- Not treated like other tuples, because + -- it's defined in GHC.Base, and there's only + -- one of it. We put it in wiredInTyCons so + -- that it'll pre-populate the name cache, so + -- the special case in lookupOrigNameCache + -- doesn't need to look out for it + , boolTyCon + , charTyCon + , doubleTyCon + , floatTyCon + , intTyCon + , listTyCon + , parrTyCon + ] +\end{code} + +\begin{code} +mkWiredInTyConName :: BuiltInSyntax -> Module -> FastString -> Unique -> TyCon -> Name +mkWiredInTyConName built_in mod fs uniq tycon + = mkWiredInName mod (mkOccNameFS tcName fs) uniq + Nothing -- No parent object + (ATyCon tycon) -- Relevant TyCon + built_in + +mkWiredInDataConName :: BuiltInSyntax -> Module -> FastString -> Unique -> DataCon -> Name -> Name +mkWiredInDataConName built_in mod fs uniq datacon parent + = mkWiredInName mod (mkOccNameFS dataName fs) uniq + (Just parent) -- Name of parent TyCon + (ADataCon datacon) -- Relevant DataCon + built_in + +charTyConName = mkWiredInTyConName UserSyntax pREL_BASE FSLIT("Char") charTyConKey charTyCon +charDataConName = mkWiredInDataConName UserSyntax pREL_BASE FSLIT("C#") charDataConKey charDataCon charTyConName +intTyConName = mkWiredInTyConName UserSyntax pREL_BASE FSLIT("Int") intTyConKey intTyCon +intDataConName = mkWiredInDataConName UserSyntax pREL_BASE FSLIT("I#") intDataConKey intDataCon intTyConName + +boolTyConName = mkWiredInTyConName UserSyntax pREL_BASE FSLIT("Bool") boolTyConKey boolTyCon +falseDataConName = mkWiredInDataConName UserSyntax pREL_BASE FSLIT("False") falseDataConKey falseDataCon boolTyConName +trueDataConName = mkWiredInDataConName UserSyntax pREL_BASE FSLIT("True") trueDataConKey trueDataCon boolTyConName +listTyConName = mkWiredInTyConName BuiltInSyntax pREL_BASE FSLIT("[]") listTyConKey listTyCon +nilDataConName = mkWiredInDataConName BuiltInSyntax pREL_BASE FSLIT("[]") nilDataConKey nilDataCon listTyConName +consDataConName = mkWiredInDataConName BuiltInSyntax pREL_BASE FSLIT(":") consDataConKey consDataCon listTyConName + +floatTyConName = mkWiredInTyConName UserSyntax pREL_FLOAT FSLIT("Float") floatTyConKey floatTyCon +floatDataConName = mkWiredInDataConName UserSyntax pREL_FLOAT FSLIT("F#") floatDataConKey floatDataCon floatTyConName +doubleTyConName = mkWiredInTyConName UserSyntax pREL_FLOAT FSLIT("Double") doubleTyConKey doubleTyCon +doubleDataConName = mkWiredInDataConName UserSyntax pREL_FLOAT FSLIT("D#") doubleDataConKey doubleDataCon doubleTyConName + +parrTyConName = mkWiredInTyConName BuiltInSyntax pREL_PARR FSLIT("[::]") parrTyConKey parrTyCon +parrDataConName = mkWiredInDataConName UserSyntax pREL_PARR FSLIT("PArr") parrDataConKey parrDataCon parrTyConName + +boolTyCon_RDR = nameRdrName boolTyConName +false_RDR = nameRdrName falseDataConName +true_RDR = nameRdrName trueDataConName +intTyCon_RDR = nameRdrName 
intTyConName +charTyCon_RDR = nameRdrName charTyConName +intDataCon_RDR = nameRdrName intDataConName +listTyCon_RDR = nameRdrName listTyConName +consDataCon_RDR = nameRdrName consDataConName +parrTyCon_RDR = nameRdrName parrTyConName +\end{code} + + +%************************************************************************ +%* * +\subsection{mkWiredInTyCon} +%* * +%************************************************************************ + +\begin{code} +pcNonRecDataTyCon = pcTyCon False NonRecursive +pcRecDataTyCon = pcTyCon False Recursive + +pcTyCon is_enum is_rec name tyvars argvrcs cons + = tycon + where + tycon = mkAlgTyCon name + (mkArrowKinds (map tyVarKind tyvars) liftedTypeKind) + tyvars + argvrcs + [] -- No stupid theta + (DataTyCon cons is_enum) + [] -- No record selectors + is_rec + True -- All the wired-in tycons have generics + +pcDataCon :: Name -> [TyVar] -> [Type] -> TyCon -> DataCon +pcDataCon = pcDataConWithFixity False + +pcDataConWithFixity :: Bool -> Name -> [TyVar] -> [Type] -> TyCon -> DataCon +-- The Name should be in the DataName name space; it's the name +-- of the DataCon itself. +-- +-- The unique is the first of two free uniques; +-- the first is used for the datacon itself, +-- the second is used for the "worker name" + +pcDataConWithFixity declared_infix dc_name tyvars arg_tys tycon + = data_con + where + data_con = mkDataCon dc_name declared_infix True {- Vanilla -} + (map (const NotMarkedStrict) arg_tys) + [{- No labelled fields -}] + tyvars [] [] arg_tys tycon (mkTyVarTys tyvars) + (mkDataConIds bogus_wrap_name wrk_name data_con) + + + mod = nameModule dc_name + wrk_occ = mkDataConWorkerOcc (nameOccName dc_name) + wrk_key = incrUnique (nameUnique dc_name) + wrk_name = mkWiredInName mod wrk_occ wrk_key + (Just (tyConName tycon)) + (AnId (dataConWorkId data_con)) UserSyntax + bogus_wrap_name = pprPanic "Wired-in data wrapper id" (ppr dc_name) + -- Wired-in types are too simple to need wrappers +\end{code} + + +%************************************************************************ +%* * +\subsection[TysWiredIn-tuples]{The tuple types} +%* * +%************************************************************************ + +\begin{code} +tupleTyCon :: Boxity -> Arity -> TyCon +tupleTyCon boxity i | i > mAX_TUPLE_SIZE = fst (mk_tuple boxity i) -- Build one specially +tupleTyCon Boxed i = fst (boxedTupleArr ! i) +tupleTyCon Unboxed i = fst (unboxedTupleArr ! i) + +tupleCon :: Boxity -> Arity -> DataCon +tupleCon boxity i | i > mAX_TUPLE_SIZE = snd (mk_tuple boxity i) -- Build one specially +tupleCon Boxed i = snd (boxedTupleArr ! i) +tupleCon Unboxed i = snd (unboxedTupleArr ! 
i) + +boxedTupleArr, unboxedTupleArr :: Array Int (TyCon,DataCon) +boxedTupleArr = listArray (0,mAX_TUPLE_SIZE) [mk_tuple Boxed i | i <- [0..mAX_TUPLE_SIZE]] +unboxedTupleArr = listArray (0,mAX_TUPLE_SIZE) [mk_tuple Unboxed i | i <- [0..mAX_TUPLE_SIZE]] + +mk_tuple :: Boxity -> Int -> (TyCon,DataCon) +mk_tuple boxity arity = (tycon, tuple_con) + where + tycon = mkTupleTyCon tc_name tc_kind arity tyvars tuple_con boxity gen_info + mod = mkTupleModule boxity arity + tc_name = mkWiredInName mod (mkTupleOcc tcName boxity arity) tc_uniq + Nothing (ATyCon tycon) BuiltInSyntax + tc_kind = mkArrowKinds (map tyVarKind tyvars) res_kind + res_kind | isBoxed boxity = liftedTypeKind + | otherwise = ubxTupleKind + + tyvars | isBoxed boxity = take arity alphaTyVars + | otherwise = take arity openAlphaTyVars + + tuple_con = pcDataCon dc_name tyvars tyvar_tys tycon + tyvar_tys = mkTyVarTys tyvars + dc_name = mkWiredInName mod (mkTupleOcc dataName boxity arity) dc_uniq + (Just tc_name) (ADataCon tuple_con) BuiltInSyntax + tc_uniq = mkTupleTyConUnique boxity arity + dc_uniq = mkTupleDataConUnique boxity arity + gen_info = True -- Tuples all have generics.. + -- hmm: that's a *lot* of code + +unitTyCon = tupleTyCon Boxed 0 +unitDataCon = head (tyConDataCons unitTyCon) +unitDataConId = dataConWorkId unitDataCon + +pairTyCon = tupleTyCon Boxed 2 + +unboxedSingletonTyCon = tupleTyCon Unboxed 1 +unboxedSingletonDataCon = tupleCon Unboxed 1 + +unboxedPairTyCon = tupleTyCon Unboxed 2 +unboxedPairDataCon = tupleCon Unboxed 2 +\end{code} + +%************************************************************************ +%* * +\subsection[TysWiredIn-boxed-prim]{The ``boxed primitive'' types (@Char@, @Int@, etc)} +%* * +%************************************************************************ + +\begin{code} +-- The Void type is represented as a data type with no constructors +-- It's a built in type (i.e. there's no way to define it in Haskell; +-- the nearest would be +-- +-- data Void = -- No constructors! +-- +-- ) It's lifted; there is only one value of this +-- type, namely "void", whose semantics is just bottom. +-- +-- Haskell 98 drops the definition of a Void type, so we just 'simulate' +-- voidTy using (). +voidTy = unitTy +\end{code} + + +\begin{code} +charTy = mkTyConTy charTyCon + +charTyCon = pcNonRecDataTyCon charTyConName [] [] [charDataCon] +charDataCon = pcDataCon charDataConName [] [charPrimTy] charTyCon + +stringTy = mkListTy charTy -- convenience only +\end{code} + +\begin{code} +intTy = mkTyConTy intTyCon + +intTyCon = pcNonRecDataTyCon intTyConName [] [] [intDataCon] +intDataCon = pcDataCon intDataConName [] [intPrimTy] intTyCon +\end{code} + +\begin{code} +floatTy = mkTyConTy floatTyCon + +floatTyCon = pcNonRecDataTyCon floatTyConName [] [] [floatDataCon] +floatDataCon = pcDataCon floatDataConName [] [floatPrimTy] floatTyCon +\end{code} + +\begin{code} +doubleTy = mkTyConTy doubleTyCon + +doubleTyCon = pcNonRecDataTyCon doubleTyConName [] [] [doubleDataCon] +doubleDataCon = pcDataCon doubleDataConName [] [doublePrimTy] doubleTyCon +\end{code} + + +%************************************************************************ +%* * +\subsection[TysWiredIn-Bool]{The @Bool@ type} +%* * +%************************************************************************ + +An ordinary enumeration type, but deeply wired in. There are no +magical operations on @Bool@ (just the regular Prelude code). + +{\em BEGIN IDLE SPECULATION BY SIMON} + +This is not the only way to encode @Bool@. 
A more obvious coding makes +@Bool@ just a boxed up version of @Bool#@, like this: +\begin{verbatim} +type Bool# = Int# +data Bool = MkBool Bool# +\end{verbatim} + +Unfortunately, this doesn't correspond to what the Report says @Bool@ +looks like! Furthermore, we get slightly less efficient code (I +think) with this coding. @gtInt@ would look like this: + +\begin{verbatim} +gtInt :: Int -> Int -> Bool +gtInt x y = case x of I# x# -> + case y of I# y# -> + case (gtIntPrim x# y#) of + b# -> MkBool b# +\end{verbatim} + +Notice that the result of the @gtIntPrim@ comparison has to be turned +into an integer (here called @b#@), and returned in a @MkBool@ box. + +The @if@ expression would compile to this: +\begin{verbatim} +case (gtInt x y) of + MkBool b# -> case b# of { 1# -> e1; 0# -> e2 } +\end{verbatim} + +I think this code is a little less efficient than the previous code, +but I'm not certain. At all events, corresponding with the Report is +important. The interesting thing is that the language is expressive +enough to describe more than one alternative; and that a type doesn't +necessarily need to be a straightforwardly boxed version of its +primitive counterpart. + +{\em END IDLE SPECULATION BY SIMON} + +\begin{code} +boolTy = mkTyConTy boolTyCon + +boolTyCon = pcTyCon True NonRecursive boolTyConName + [] [] [falseDataCon, trueDataCon] + +falseDataCon = pcDataCon falseDataConName [] [] boolTyCon +trueDataCon = pcDataCon trueDataConName [] [] boolTyCon + +falseDataConId = dataConWorkId falseDataCon +trueDataConId = dataConWorkId trueDataCon +\end{code} + +%************************************************************************ +%* * +\subsection[TysWiredIn-List]{The @List@ type (incl ``build'' magic)} +%* * +%************************************************************************ + +Special syntax, deeply wired in, but otherwise an ordinary algebraic +data types: +\begin{verbatim} +data [] a = [] | a : (List a) +data () = () +data (,) a b = (,,) a b +... +\end{verbatim} + +\begin{code} +mkListTy :: Type -> Type +mkListTy ty = mkTyConApp listTyCon [ty] + +listTyCon = pcRecDataTyCon listTyConName + alpha_tyvar [(True,False)] [nilDataCon, consDataCon] + +nilDataCon = pcDataCon nilDataConName alpha_tyvar [] listTyCon +consDataCon = pcDataConWithFixity True {- Declared infix -} + consDataConName + alpha_tyvar [alphaTy, mkTyConApp listTyCon alpha_ty] listTyCon +-- Interesting: polymorphic recursion would help here. +-- We can't use (mkListTy alphaTy) in the defn of consDataCon, else mkListTy +-- gets the over-specific type (Type -> Type) +\end{code} + +%************************************************************************ +%* * +\subsection[TysWiredIn-Tuples]{The @Tuple@ types} +%* * +%************************************************************************ + +The tuple types are definitely magic, because they form an infinite +family. + +\begin{itemize} +\item +They have a special family of type constructors, of type @TyCon@ +These contain the tycon arity, but don't require a Unique. + +\item +They have a special family of constructors, of type +@Id@. Again these contain their arity but don't need a Unique. + +\item +There should be a magic way of generating the info tables and +entry code for all tuples. + +But at the moment we just compile a Haskell source +file\srcloc{lib/prelude/...} containing declarations like: +\begin{verbatim} +data Tuple0 = Tup0 +data Tuple2 a b = Tup2 a b +data Tuple3 a b c = Tup3 a b c +data Tuple4 a b c d = Tup4 a b c d +... 
+\end{verbatim} +The print-names associated with the magic @Id@s for tuple constructors +``just happen'' to be the same as those generated by these +declarations. + +\item +The instance environment should have a magic way to know +that each tuple type is an instances of classes @Eq@, @Ix@, @Ord@ and +so on. \ToDo{Not implemented yet.} + +\item +There should also be a way to generate the appropriate code for each +of these instances, but (like the info tables and entry code) it is +done by enumeration\srcloc{lib/prelude/InTup?.hs}. +\end{itemize} + +\begin{code} +mkTupleTy :: Boxity -> Int -> [Type] -> Type +mkTupleTy boxity arity tys = mkTyConApp (tupleTyCon boxity arity) tys + +unitTy = mkTupleTy Boxed 0 [] +\end{code} + +%************************************************************************ +%* * +\subsection[TysWiredIn-PArr]{The @[::]@ type} +%* * +%************************************************************************ + +Special syntax for parallel arrays needs some wired in definitions. + +\begin{code} +-- construct a type representing the application of the parallel array +-- constructor +-- +mkPArrTy :: Type -> Type +mkPArrTy ty = mkTyConApp parrTyCon [ty] + +-- represents the type constructor of parallel arrays +-- +-- * this must match the definition in `PrelPArr' +-- +-- NB: Although the constructor is given here, it will not be accessible in +-- user code as it is not in the environment of any compiled module except +-- `PrelPArr'. +-- +parrTyCon :: TyCon +parrTyCon = pcNonRecDataTyCon parrTyConName alpha_tyvar [(True, False)] [parrDataCon] + +parrDataCon :: DataCon +parrDataCon = pcDataCon + parrDataConName + alpha_tyvar -- forall'ed type variables + [intPrimTy, -- 1st argument: Int# + mkTyConApp -- 2nd argument: Array# a + arrayPrimTyCon + alpha_ty] + parrTyCon + +-- check whether a type constructor is the constructor for parallel arrays +-- +isPArrTyCon :: TyCon -> Bool +isPArrTyCon tc = tyConName tc == parrTyConName + +-- fake array constructors +-- +-- * these constructors are never really used to represent array values; +-- however, they are very convenient during desugaring (and, in particular, +-- in the pattern matching compiler) to treat array pattern just like +-- yet another constructor pattern +-- +parrFakeCon :: Arity -> DataCon +parrFakeCon i | i > mAX_TUPLE_SIZE = mkPArrFakeCon i -- build one specially +parrFakeCon i = parrFakeConArr!i + +-- pre-defined set of constructors +-- +parrFakeConArr :: Array Int DataCon +parrFakeConArr = array (0, mAX_TUPLE_SIZE) [(i, mkPArrFakeCon i) + | i <- [0..mAX_TUPLE_SIZE]] + +-- build a fake parallel array constructor for the given arity +-- +mkPArrFakeCon :: Int -> DataCon +mkPArrFakeCon arity = data_con + where + data_con = pcDataCon name [tyvar] tyvarTys parrTyCon + tyvar = head alphaTyVars + tyvarTys = replicate arity $ mkTyVarTy tyvar + nameStr = mkFastString ("MkPArr" ++ show arity) + name = mkWiredInName pREL_PARR (mkOccNameFS dataName nameStr) uniq + Nothing (ADataCon data_con) UserSyntax + uniq = mkPArrDataConUnique arity + +-- checks whether a data constructor is a fake constructor for parallel arrays +-- +isPArrFakeCon :: DataCon -> Bool +isPArrFakeCon dcon = dcon == parrFakeCon (dataConSourceArity dcon) +\end{code} + diff --git a/compiler/prelude/primops.txt.pp b/compiler/prelude/primops.txt.pp new file mode 100644 index 0000000000..13b4b6c97d --- /dev/null +++ b/compiler/prelude/primops.txt.pp @@ -0,0 +1,1687 @@ +----------------------------------------------------------------------- +-- $Id: primops.txt.pp,v 
1.37 2005/11/25 09:46:19 simonmar Exp $ +-- +-- Primitive Operations +-- +----------------------------------------------------------------------- + +-- This file is processed by the utility program genprimopcode to produce +-- a number of include files within the compiler and optionally to produce +-- human-readable documentation. +-- +-- It should first be preprocessed. +-- +-- To add a new primop, you currently need to update the following files: +-- +-- - this file (ghc/compiler/prelude/primops.txt.pp), which includes +-- the type of the primop, and various other properties (its +-- strictness attributes, whether it is defined as a macro +-- or as out-of-line code, etc.) +-- +-- - if the primop is inline (i.e. a macro), then: +-- ghc/compiler/AbsCUtils.lhs (dscCOpStmt) +-- defines the translation of the primop into simpler +-- abstract C operations. +-- +-- - or, for an out-of-line primop: +-- ghc/includes/StgMiscClosures.h (just add the declaration) +-- ghc/rts/PrimOps.cmm (define it here) +-- ghc/rts/Linker.c (declare the symbol for GHCi) +-- +-- - the User's Guide +-- + +-- This file is divided into named sections, each containing or more +-- primop entries. Section headers have the format: +-- +-- section "section-name" {description} +-- +-- This information is used solely when producing documentation; it is +-- otherwise ignored. The description is optional. +-- +-- The format of each primop entry is as follows: +-- +-- primop internal-name "name-in-program-text" type category {description} attributes + +-- The default attribute values which apply if you don't specify +-- other ones. Attribute values can be True, False, or arbitrary +-- text between curly brackets. This is a kludge to enable +-- processors of this file to easily get hold of simple info +-- (eg, out_of_line), whilst avoiding parsing complex expressions +-- needed for strictness and usage info. + +defaults + has_side_effects = False + out_of_line = False + commutable = False + needs_wrapper = False + can_fail = False + strictness = { \ arity -> mkStrictSig (mkTopDmdType (replicate arity lazyDmd) TopRes) } + usage = { nomangle other } + +-- Currently, documentation is produced using latex, so contents of +-- description fields should be legal latex. Descriptions can contain +-- matched pairs of embedded curly brackets. + +#include "MachDeps.h" + +-- We need platform defines (tests for mingw32 below). However, we only +-- test the TARGET platform, which doesn't vary between stages, so the +-- stage1 platform defines are fine: +#include "../stage1/ghc_boot_platform.h" + +section "The word size story." + {Haskell98 specifies that signed integers (type {\tt Int}) + must contain at least 30 bits. GHC always implements {\tt + Int} using the primitive type {\tt Int\#}, whose size equals + the {\tt MachDeps.h} constant {\tt WORD\_SIZE\_IN\_BITS}. + This is normally set based on the {\tt config.h} parameter + {\tt SIZEOF\_HSWORD}, i.e., 32 bits on 32-bit machines, 64 + bits on 64-bit machines. However, it can also be explicitly + set to a smaller number, e.g., 31 bits, to allow the + possibility of using tag bits. Currently GHC itself has only + 32-bit and 64-bit variants, but 30 or 31-bit code can be + exported as an external core file for use in other back ends. + + GHC also implements a primitive unsigned integer type {\tt + Word\#} which always has the same number of bits as {\tt + Int\#}. 
+ + In addition, GHC supports families of explicit-sized integers + and words at 8, 16, 32, and 64 bits, with the usual + arithmetic operations, comparisons, and a range of + conversions. The 8-bit and 16-bit sizes are always + represented as {\tt Int\#} and {\tt Word\#}, and the + operations are implemented in terms of the primops on these + types, with suitable range restrictions on the results (using + the {\tt narrow$n$Int\#} and {\tt narrow$n$Word\#} families + of primops). The 32-bit sizes are represented using {\tt + Int\#} and {\tt Word\#} when {\tt WORD\_SIZE\_IN\_BITS} + $\geq$ 32; otherwise, these are represented using distinct + primitive types {\tt Int32\#} and {\tt Word32\#}. These (when + needed) have a complete set of corresponding operations; + however, nearly all of these are implemented as external C + functions rather than as primops. Exactly the same story + applies to the 64-bit sizes. All of these details are hidden + under the {\tt PrelInt} and {\tt PrelWord} modules, which use + {\tt \#if}-defs to invoke the appropriate types and + operators. + + Word size also matters for the families of primops for + indexing/reading/writing fixed-size quantities at offsets + from an array base, address, or foreign pointer. Here, a + slightly different approach is taken. The names of these + primops are fixed, but their {\it types} vary according to + the value of {\tt WORD\_SIZE\_IN\_BITS}. For example, if word + size is at least 32 bits then an operator like + \texttt{indexInt32Array\#} has type {\tt ByteArr\# -> Int\# + -> Int\#}; otherwise it has type {\tt ByteArr\# -> Int\# -> + Int32\#}. This approach confines the necessary {\tt + \#if}-defs to this file; no conditional compilation is needed + in the files that expose these primops. + + Finally, there are strongly deprecated primops for coercing + between {\tt Addr\#}, the primitive type of machine + addresses, and {\tt Int\#}. These are pretty bogus anyway, + but will work on existing 32-bit and 64-bit GHC targets; they + are completely bogus when tag bits are used in {\tt Int\#}, + so are not available in this case. } + +-- Define synonyms for indexing ops. 
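A hedged sketch (module and function names are invented for the example) of how library code on the far side of these #if-defs consumes one of the word-size-dependent primops: the INT32 synonym defined just below expands to Int# on today's 32-bit and 64-bit targets, so indexInt32Array# boxes straight into Int there.

    {-# LANGUAGE MagicHash #-}
    module Int32IndexDemo where

    import GHC.Exts (ByteArray#, Int (I#), indexInt32Array#)

    -- On targets where WORD_SIZE_IN_BITS >= 32 the primop returns Int#,
    -- so the result boxes directly into Int; on narrower targets this
    -- wrapper would have to go via Int32# instead.
    readInt32At :: ByteArray# -> Int -> Int
    readInt32At ba (I# i) = I# (indexInt32Array# ba i)
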
+ +#if WORD_SIZE_IN_BITS < 32 +#define INT32 Int32# +#define WORD32 Word32# +#else +#define INT32 Int# +#define WORD32 Word# +#endif + +#if WORD_SIZE_IN_BITS < 64 +#define INT64 Int64# +#define WORD64 Word64# +#else +#define INT64 Int# +#define WORD64 Word# +#endif + +------------------------------------------------------------------------ +section "Char#" + {Operations on 31-bit characters.} +------------------------------------------------------------------------ + + +primop CharGtOp "gtChar#" Compare Char# -> Char# -> Bool +primop CharGeOp "geChar#" Compare Char# -> Char# -> Bool + +primop CharEqOp "eqChar#" Compare + Char# -> Char# -> Bool + with commutable = True + +primop CharNeOp "neChar#" Compare + Char# -> Char# -> Bool + with commutable = True + +primop CharLtOp "ltChar#" Compare Char# -> Char# -> Bool +primop CharLeOp "leChar#" Compare Char# -> Char# -> Bool + +primop OrdOp "ord#" GenPrimOp Char# -> Int# + +------------------------------------------------------------------------ +section "Int#" + {Operations on native-size integers (30+ bits).} +------------------------------------------------------------------------ + +primop IntAddOp "+#" Dyadic + Int# -> Int# -> Int# + with commutable = True + +primop IntSubOp "-#" Dyadic Int# -> Int# -> Int# + +primop IntMulOp "*#" + Dyadic Int# -> Int# -> Int# + {Low word of signed integer multiply.} + with commutable = True + +primop IntMulMayOfloOp "mulIntMayOflo#" + Dyadic Int# -> Int# -> Int# + {Return non-zero if there is any possibility that the upper word of a + signed integer multiply might contain useful information. Return + zero only if you are completely sure that no overflow can occur. + On a 32-bit platform, the recommmended implementation is to do a + 32 x 32 -> 64 signed multiply, and subtract result[63:32] from + (result[31] >>signed 31). If this is zero, meaning that the + upper word is merely a sign extension of the lower one, no + overflow can occur. + + On a 64-bit platform it is not always possible to + acquire the top 64 bits of the result. Therefore, a recommended + implementation is to take the absolute value of both operands, and + return 0 iff bits[63:31] of them are zero, since that means that their + magnitudes fit within 31 bits, so the magnitude of the product must fit + into 62 bits. + + If in doubt, return non-zero, but do make an effort to create the + correct answer for small args, since otherwise the performance of + (*) :: Integer -> Integer -> Integer will be poor. + } + with commutable = True + +primop IntQuotOp "quotInt#" Dyadic + Int# -> Int# -> Int# + {Rounds towards zero.} + with can_fail = True + +primop IntRemOp "remInt#" Dyadic + Int# -> Int# -> Int# + {Satisfies \texttt{(quotInt\# x y) *\# y +\# (remInt\# x y) == x}.} + with can_fail = True + +primop IntGcdOp "gcdInt#" Dyadic Int# -> Int# -> Int# + with out_of_line = True + +primop IntNegOp "negateInt#" Monadic Int# -> Int# +primop IntAddCOp "addIntC#" GenPrimOp Int# -> Int# -> (# Int#, Int# #) + {Add with carry. First member of result is (wrapped) sum; + second member is 0 iff no overflow occured.} +primop IntSubCOp "subIntC#" GenPrimOp Int# -> Int# -> (# Int#, Int# #) + {Subtract with carry. 
First member of result is (wrapped) difference; + second member is 0 iff no overflow occured.} + +primop IntGtOp ">#" Compare Int# -> Int# -> Bool +primop IntGeOp ">=#" Compare Int# -> Int# -> Bool + +primop IntEqOp "==#" Compare + Int# -> Int# -> Bool + with commutable = True + +primop IntNeOp "/=#" Compare + Int# -> Int# -> Bool + with commutable = True + +primop IntLtOp "<#" Compare Int# -> Int# -> Bool +primop IntLeOp "<=#" Compare Int# -> Int# -> Bool + +primop ChrOp "chr#" GenPrimOp Int# -> Char# + +primop Int2WordOp "int2Word#" GenPrimOp Int# -> Word# +primop Int2FloatOp "int2Float#" GenPrimOp Int# -> Float# +primop Int2DoubleOp "int2Double#" GenPrimOp Int# -> Double# + +primop Int2IntegerOp "int2Integer#" + GenPrimOp Int# -> (# Int#, ByteArr# #) + with out_of_line = True + +primop ISllOp "uncheckedIShiftL#" GenPrimOp Int# -> Int# -> Int# + {Shift left. Result undefined if shift amount is not + in the range 0 to word size - 1 inclusive.} +primop ISraOp "uncheckedIShiftRA#" GenPrimOp Int# -> Int# -> Int# + {Shift right arithmetic. Result undefined if shift amount is not + in the range 0 to word size - 1 inclusive.} +primop ISrlOp "uncheckedIShiftRL#" GenPrimOp Int# -> Int# -> Int# + {Shift right logical. Result undefined if shift amount is not + in the range 0 to word size - 1 inclusive.} + +------------------------------------------------------------------------ +section "Word#" + {Operations on native-sized unsigned words (30+ bits).} +------------------------------------------------------------------------ + +primop WordAddOp "plusWord#" Dyadic Word# -> Word# -> Word# + with commutable = True + +primop WordSubOp "minusWord#" Dyadic Word# -> Word# -> Word# + +primop WordMulOp "timesWord#" Dyadic Word# -> Word# -> Word# + with commutable = True + +primop WordQuotOp "quotWord#" Dyadic Word# -> Word# -> Word# + with can_fail = True + +primop WordRemOp "remWord#" Dyadic Word# -> Word# -> Word# + with can_fail = True + +primop AndOp "and#" Dyadic Word# -> Word# -> Word# + with commutable = True + +primop OrOp "or#" Dyadic Word# -> Word# -> Word# + with commutable = True + +primop XorOp "xor#" Dyadic Word# -> Word# -> Word# + with commutable = True + +primop NotOp "not#" Monadic Word# -> Word# + +primop SllOp "uncheckedShiftL#" GenPrimOp Word# -> Int# -> Word# + {Shift left logical. Result undefined if shift amount is not + in the range 0 to word size - 1 inclusive.} +primop SrlOp "uncheckedShiftRL#" GenPrimOp Word# -> Int# -> Word# + {Shift right logical. 
Result undefined if shift amount is not + in the range 0 to word size - 1 inclusive.} + +primop Word2IntOp "word2Int#" GenPrimOp Word# -> Int# + +primop Word2IntegerOp "word2Integer#" GenPrimOp + Word# -> (# Int#, ByteArr# #) + with out_of_line = True + +primop WordGtOp "gtWord#" Compare Word# -> Word# -> Bool +primop WordGeOp "geWord#" Compare Word# -> Word# -> Bool +primop WordEqOp "eqWord#" Compare Word# -> Word# -> Bool +primop WordNeOp "neWord#" Compare Word# -> Word# -> Bool +primop WordLtOp "ltWord#" Compare Word# -> Word# -> Bool +primop WordLeOp "leWord#" Compare Word# -> Word# -> Bool + +------------------------------------------------------------------------ +section "Narrowings" + {Explicit narrowing of native-sized ints or words.} +------------------------------------------------------------------------ + +primop Narrow8IntOp "narrow8Int#" Monadic Int# -> Int# +primop Narrow16IntOp "narrow16Int#" Monadic Int# -> Int# +primop Narrow32IntOp "narrow32Int#" Monadic Int# -> Int# +primop Narrow8WordOp "narrow8Word#" Monadic Word# -> Word# +primop Narrow16WordOp "narrow16Word#" Monadic Word# -> Word# +primop Narrow32WordOp "narrow32Word#" Monadic Word# -> Word# + + +#if WORD_SIZE_IN_BITS < 32 +------------------------------------------------------------------------ +section "Int32#" + {Operations on 32-bit integers (Int32\#). This type is only used + if plain Int\# has less than 32 bits. In any case, the operations + are not primops; they are implemented (if needed) as ccalls instead.} +------------------------------------------------------------------------ + +primop Int32ToIntegerOp "int32ToInteger#" GenPrimOp + Int32# -> (# Int#, ByteArr# #) + with out_of_line = True + + +------------------------------------------------------------------------ +section "Word32#" + {Operations on 32-bit unsigned words. This type is only used + if plain Word\# has less than 32 bits. In any case, the operations + are not primops; they are implemented (if needed) as ccalls instead.} +------------------------------------------------------------------------ + +primop Word32ToIntegerOp "word32ToInteger#" GenPrimOp + Word32# -> (# Int#, ByteArr# #) + with out_of_line = True + + +#endif + + +#if WORD_SIZE_IN_BITS < 64 +------------------------------------------------------------------------ +section "Int64#" + {Operations on 64-bit unsigned words. This type is only used + if plain Int\# has less than 64 bits. In any case, the operations + are not primops; they are implemented (if needed) as ccalls instead.} +------------------------------------------------------------------------ + +primop Int64ToIntegerOp "int64ToInteger#" GenPrimOp + Int64# -> (# Int#, ByteArr# #) + with out_of_line = True + +------------------------------------------------------------------------ +section "Word64#" + {Operations on 64-bit unsigned words. This type is only used + if plain Word\# has less than 64 bits. In any case, the operations + are not primops; they are implemented (if needed) as ccalls instead.} +------------------------------------------------------------------------ + +primop Word64ToIntegerOp "word64ToInteger#" GenPrimOp + Word64# -> (# Int#, ByteArr# #) + with out_of_line = True + +#endif + +------------------------------------------------------------------------ +section "Integer#" + {Operations on arbitrary-precision integers. These operations are +implemented via the GMP package. 
An integer is represented as a pair +consisting of an Int\# representing the number of 'limbs' in use and +the sign, and a ByteArr\# containing the 'limbs' themselves. Such pairs +are returned as unboxed pairs, but must be passed as separate +components. + +For .NET these operations are implemented by foreign imports, so the +primops are omitted.} +------------------------------------------------------------------------ + +#ifndef ILX + +primop IntegerAddOp "plusInteger#" GenPrimOp + Int# -> ByteArr# -> Int# -> ByteArr# -> (# Int#, ByteArr# #) + with commutable = True + out_of_line = True + +primop IntegerSubOp "minusInteger#" GenPrimOp + Int# -> ByteArr# -> Int# -> ByteArr# -> (# Int#, ByteArr# #) + with out_of_line = True + +primop IntegerMulOp "timesInteger#" GenPrimOp + Int# -> ByteArr# -> Int# -> ByteArr# -> (# Int#, ByteArr# #) + with commutable = True + out_of_line = True + +primop IntegerGcdOp "gcdInteger#" GenPrimOp + Int# -> ByteArr# -> Int# -> ByteArr# -> (# Int#, ByteArr# #) + {Greatest common divisor.} + with commutable = True + out_of_line = True + +primop IntegerIntGcdOp "gcdIntegerInt#" GenPrimOp + Int# -> ByteArr# -> Int# -> Int# + {Greatest common divisor, where second argument is an ordinary Int\#.} + with out_of_line = True + +primop IntegerDivExactOp "divExactInteger#" GenPrimOp + Int# -> ByteArr# -> Int# -> ByteArr# -> (# Int#, ByteArr# #) + {Divisor is guaranteed to be a factor of dividend.} + with out_of_line = True + +primop IntegerQuotOp "quotInteger#" GenPrimOp + Int# -> ByteArr# -> Int# -> ByteArr# -> (# Int#, ByteArr# #) + {Rounds towards zero.} + with out_of_line = True + +primop IntegerRemOp "remInteger#" GenPrimOp + Int# -> ByteArr# -> Int# -> ByteArr# -> (# Int#, ByteArr# #) + {Satisfies \texttt{plusInteger\# (timesInteger\# (quotInteger\# x y) y) (remInteger\# x y) == x}.} + with out_of_line = True + +primop IntegerCmpOp "cmpInteger#" GenPrimOp + Int# -> ByteArr# -> Int# -> ByteArr# -> Int# + {Returns -1,0,1 according as first argument is less than, equal to, or greater than second argument.} + with needs_wrapper = True + out_of_line = True + +primop IntegerCmpIntOp "cmpIntegerInt#" GenPrimOp + Int# -> ByteArr# -> Int# -> Int# + {Returns -1,0,1 according as first argument is less than, equal to, or greater than second argument, which + is an ordinary Int\#.} + with needs_wrapper = True + out_of_line = True + +primop IntegerQuotRemOp "quotRemInteger#" GenPrimOp + Int# -> ByteArr# -> Int# -> ByteArr# -> (# Int#, ByteArr#, Int#, ByteArr# #) + {Compute quot and rem simulaneously.} + with can_fail = True + out_of_line = True + +primop IntegerDivModOp "divModInteger#" GenPrimOp + Int# -> ByteArr# -> Int# -> ByteArr# -> (# Int#, ByteArr#, Int#, ByteArr# #) + {Compute div and mod simultaneously, where div rounds towards negative infinity + and\texttt{(q,r) = divModInteger\#(x,y)} implies \texttt{plusInteger\# (timesInteger\# q y) r = x}.} + with can_fail = True + out_of_line = True + +primop Integer2IntOp "integer2Int#" GenPrimOp + Int# -> ByteArr# -> Int# + with needs_wrapper = True + out_of_line = True + +primop Integer2WordOp "integer2Word#" GenPrimOp + Int# -> ByteArr# -> Word# + with needs_wrapper = True + out_of_line = True + +#if WORD_SIZE_IN_BITS < 32 +primop IntegerToInt32Op "integerToInt32#" GenPrimOp + Int# -> ByteArr# -> Int32# + +primop IntegerToWord32Op "integerToWord32#" GenPrimOp + Int# -> ByteArr# -> Word32# +#endif + +primop IntegerAndOp "andInteger#" GenPrimOp + Int# -> ByteArr# -> Int# -> ByteArr# -> (# Int#, ByteArr# #) + with out_of_line = 
True + +primop IntegerOrOp "orInteger#" GenPrimOp + Int# -> ByteArr# -> Int# -> ByteArr# -> (# Int#, ByteArr# #) + with out_of_line = True + +primop IntegerXorOp "xorInteger#" GenPrimOp + Int# -> ByteArr# -> Int# -> ByteArr# -> (# Int#, ByteArr# #) + with out_of_line = True + +primop IntegerComplementOp "complementInteger#" GenPrimOp + Int# -> ByteArr# -> (# Int#, ByteArr# #) + with out_of_line = True + +#endif /* ndef ILX */ + +------------------------------------------------------------------------ +section "Double#" + {Operations on double-precision (64 bit) floating-point numbers.} +------------------------------------------------------------------------ + +primop DoubleGtOp ">##" Compare Double# -> Double# -> Bool +primop DoubleGeOp ">=##" Compare Double# -> Double# -> Bool + +primop DoubleEqOp "==##" Compare + Double# -> Double# -> Bool + with commutable = True + +primop DoubleNeOp "/=##" Compare + Double# -> Double# -> Bool + with commutable = True + +primop DoubleLtOp "<##" Compare Double# -> Double# -> Bool +primop DoubleLeOp "<=##" Compare Double# -> Double# -> Bool + +primop DoubleAddOp "+##" Dyadic + Double# -> Double# -> Double# + with commutable = True + +primop DoubleSubOp "-##" Dyadic Double# -> Double# -> Double# + +primop DoubleMulOp "*##" Dyadic + Double# -> Double# -> Double# + with commutable = True + +primop DoubleDivOp "/##" Dyadic + Double# -> Double# -> Double# + with can_fail = True + +primop DoubleNegOp "negateDouble#" Monadic Double# -> Double# + +primop Double2IntOp "double2Int#" GenPrimOp Double# -> Int# +primop Double2FloatOp "double2Float#" GenPrimOp Double# -> Float# + +primop DoubleExpOp "expDouble#" Monadic + Double# -> Double# + with needs_wrapper = True + +primop DoubleLogOp "logDouble#" Monadic + Double# -> Double# + with + needs_wrapper = True + can_fail = True + +primop DoubleSqrtOp "sqrtDouble#" Monadic + Double# -> Double# + with needs_wrapper = True + +primop DoubleSinOp "sinDouble#" Monadic + Double# -> Double# + with needs_wrapper = True + +primop DoubleCosOp "cosDouble#" Monadic + Double# -> Double# + with needs_wrapper = True + +primop DoubleTanOp "tanDouble#" Monadic + Double# -> Double# + with needs_wrapper = True + +primop DoubleAsinOp "asinDouble#" Monadic + Double# -> Double# + with + needs_wrapper = True + can_fail = True + +primop DoubleAcosOp "acosDouble#" Monadic + Double# -> Double# + with + needs_wrapper = True + can_fail = True + +primop DoubleAtanOp "atanDouble#" Monadic + Double# -> Double# + with + needs_wrapper = True + +primop DoubleSinhOp "sinhDouble#" Monadic + Double# -> Double# + with needs_wrapper = True + +primop DoubleCoshOp "coshDouble#" Monadic + Double# -> Double# + with needs_wrapper = True + +primop DoubleTanhOp "tanhDouble#" Monadic + Double# -> Double# + with needs_wrapper = True + +primop DoublePowerOp "**##" Dyadic + Double# -> Double# -> Double# + {Exponentiation.} + with needs_wrapper = True + +primop DoubleDecodeOp "decodeDouble#" GenPrimOp + Double# -> (# Int#, Int#, ByteArr# #) + {Convert to arbitrary-precision integer. 
+ First Int\# in result is the exponent; second Int\# and ByteArr\# represent an Integer\# + holding the mantissa.} + with out_of_line = True + +------------------------------------------------------------------------ +section "Float#" + {Operations on single-precision (32-bit) floating-point numbers.} +------------------------------------------------------------------------ + +primop FloatGtOp "gtFloat#" Compare Float# -> Float# -> Bool +primop FloatGeOp "geFloat#" Compare Float# -> Float# -> Bool + +primop FloatEqOp "eqFloat#" Compare + Float# -> Float# -> Bool + with commutable = True + +primop FloatNeOp "neFloat#" Compare + Float# -> Float# -> Bool + with commutable = True + +primop FloatLtOp "ltFloat#" Compare Float# -> Float# -> Bool +primop FloatLeOp "leFloat#" Compare Float# -> Float# -> Bool + +primop FloatAddOp "plusFloat#" Dyadic + Float# -> Float# -> Float# + with commutable = True + +primop FloatSubOp "minusFloat#" Dyadic Float# -> Float# -> Float# + +primop FloatMulOp "timesFloat#" Dyadic + Float# -> Float# -> Float# + with commutable = True + +primop FloatDivOp "divideFloat#" Dyadic + Float# -> Float# -> Float# + with can_fail = True + +primop FloatNegOp "negateFloat#" Monadic Float# -> Float# + +primop Float2IntOp "float2Int#" GenPrimOp Float# -> Int# + +primop FloatExpOp "expFloat#" Monadic + Float# -> Float# + with needs_wrapper = True + +primop FloatLogOp "logFloat#" Monadic + Float# -> Float# + with needs_wrapper = True + can_fail = True + +primop FloatSqrtOp "sqrtFloat#" Monadic + Float# -> Float# + with needs_wrapper = True + +primop FloatSinOp "sinFloat#" Monadic + Float# -> Float# + with needs_wrapper = True + +primop FloatCosOp "cosFloat#" Monadic + Float# -> Float# + with needs_wrapper = True + +primop FloatTanOp "tanFloat#" Monadic + Float# -> Float# + with needs_wrapper = True + +primop FloatAsinOp "asinFloat#" Monadic + Float# -> Float# + with needs_wrapper = True + can_fail = True + +primop FloatAcosOp "acosFloat#" Monadic + Float# -> Float# + with needs_wrapper = True + can_fail = True + +primop FloatAtanOp "atanFloat#" Monadic + Float# -> Float# + with needs_wrapper = True + +primop FloatSinhOp "sinhFloat#" Monadic + Float# -> Float# + with needs_wrapper = True + +primop FloatCoshOp "coshFloat#" Monadic + Float# -> Float# + with needs_wrapper = True + +primop FloatTanhOp "tanhFloat#" Monadic + Float# -> Float# + with needs_wrapper = True + +primop FloatPowerOp "powerFloat#" Dyadic + Float# -> Float# -> Float# + with needs_wrapper = True + +primop Float2DoubleOp "float2Double#" GenPrimOp Float# -> Double# + +primop FloatDecodeOp "decodeFloat#" GenPrimOp + Float# -> (# Int#, Int#, ByteArr# #) + {Convert to arbitrary-precision integer. 
+ First Int\# in result is the exponent; second Int\# and ByteArr\# represent an Integer\# + holding the mantissa.} + with out_of_line = True + +------------------------------------------------------------------------ +section "Arrays" + {Operations on Array\#.} +------------------------------------------------------------------------ + +primop NewArrayOp "newArray#" GenPrimOp + Int# -> a -> State# s -> (# State# s, MutArr# s a #) + {Create a new mutable array of specified size (in bytes), + in the specified state thread, + with each element containing the specified initial value.} + with + usage = { mangle NewArrayOp [mkP, mkM, mkP] mkM } + out_of_line = True + +primop SameMutableArrayOp "sameMutableArray#" GenPrimOp + MutArr# s a -> MutArr# s a -> Bool + with + usage = { mangle SameMutableArrayOp [mkP, mkP] mkM } + +primop ReadArrayOp "readArray#" GenPrimOp + MutArr# s a -> Int# -> State# s -> (# State# s, a #) + {Read from specified index of mutable array. Result is not yet evaluated.} + with + usage = { mangle ReadArrayOp [mkM, mkP, mkP] mkM } + +primop WriteArrayOp "writeArray#" GenPrimOp + MutArr# s a -> Int# -> a -> State# s -> State# s + {Write to specified index of mutable array.} + with + usage = { mangle WriteArrayOp [mkM, mkP, mkM, mkP] mkR } + has_side_effects = True + +primop IndexArrayOp "indexArray#" GenPrimOp + Array# a -> Int# -> (# a #) + {Read from specified index of immutable array. Result is packaged into + an unboxed singleton; the result itself is not yet evaluated.} + with + usage = { mangle IndexArrayOp [mkM, mkP] mkM } + +primop UnsafeFreezeArrayOp "unsafeFreezeArray#" GenPrimOp + MutArr# s a -> State# s -> (# State# s, Array# a #) + {Make a mutable array immutable, without copying.} + with + usage = { mangle UnsafeFreezeArrayOp [mkM, mkP] mkM } + has_side_effects = True + +primop UnsafeThawArrayOp "unsafeThawArray#" GenPrimOp + Array# a -> State# s -> (# State# s, MutArr# s a #) + {Make an immutable array mutable, without copying.} + with + usage = { mangle UnsafeThawArrayOp [mkM, mkP] mkM } + out_of_line = True + +------------------------------------------------------------------------ +section "Byte Arrays" + {Operations on ByteArray\#. A ByteArray\# is a just a region of + raw memory in the garbage-collected heap, which is not scanned + for pointers. It carries its own size (in bytes). There are + three sets of operations for accessing byte array contents: + index for reading from immutable byte arrays, and read/write + for mutable byte arrays. Each set contains operations for + a range of useful primitive data types. 
Each operation takes + an offset measured in terms of the size fo the primitive type + being read or written.} + +------------------------------------------------------------------------ + +primop NewByteArrayOp_Char "newByteArray#" GenPrimOp + Int# -> State# s -> (# State# s, MutByteArr# s #) + {Create a new mutable byte array of specified size (in bytes), in + the specified state thread.} + with out_of_line = True + +primop NewPinnedByteArrayOp_Char "newPinnedByteArray#" GenPrimOp + Int# -> State# s -> (# State# s, MutByteArr# s #) + {Create a mutable byte array that the GC guarantees not to move.} + with out_of_line = True + +primop ByteArrayContents_Char "byteArrayContents#" GenPrimOp + ByteArr# -> Addr# + {Intended for use with pinned arrays; otherwise very unsafe!} + +primop SameMutableByteArrayOp "sameMutableByteArray#" GenPrimOp + MutByteArr# s -> MutByteArr# s -> Bool + +primop UnsafeFreezeByteArrayOp "unsafeFreezeByteArray#" GenPrimOp + MutByteArr# s -> State# s -> (# State# s, ByteArr# #) + {Make a mutable byte array immutable, without copying.} + with + has_side_effects = True + +primop SizeofByteArrayOp "sizeofByteArray#" GenPrimOp + ByteArr# -> Int# + +primop SizeofMutableByteArrayOp "sizeofMutableByteArray#" GenPrimOp + MutByteArr# s -> Int# + + +primop IndexByteArrayOp_Char "indexCharArray#" GenPrimOp + ByteArr# -> Int# -> Char# + {Read 8-bit character; offset in bytes.} + +primop IndexByteArrayOp_WideChar "indexWideCharArray#" GenPrimOp + ByteArr# -> Int# -> Char# + {Read 31-bit character; offset in 4-byte words.} + +primop IndexByteArrayOp_Int "indexIntArray#" GenPrimOp + ByteArr# -> Int# -> Int# + +primop IndexByteArrayOp_Word "indexWordArray#" GenPrimOp + ByteArr# -> Int# -> Word# + +primop IndexByteArrayOp_Addr "indexAddrArray#" GenPrimOp + ByteArr# -> Int# -> Addr# + +primop IndexByteArrayOp_Float "indexFloatArray#" GenPrimOp + ByteArr# -> Int# -> Float# + +primop IndexByteArrayOp_Double "indexDoubleArray#" GenPrimOp + ByteArr# -> Int# -> Double# + +primop IndexByteArrayOp_StablePtr "indexStablePtrArray#" GenPrimOp + ByteArr# -> Int# -> StablePtr# a + +primop IndexByteArrayOp_Int8 "indexInt8Array#" GenPrimOp + ByteArr# -> Int# -> Int# + +primop IndexByteArrayOp_Int16 "indexInt16Array#" GenPrimOp + ByteArr# -> Int# -> Int# + +primop IndexByteArrayOp_Int32 "indexInt32Array#" GenPrimOp + ByteArr# -> Int# -> INT32 + +primop IndexByteArrayOp_Int64 "indexInt64Array#" GenPrimOp + ByteArr# -> Int# -> INT64 + +primop IndexByteArrayOp_Word8 "indexWord8Array#" GenPrimOp + ByteArr# -> Int# -> Word# + +primop IndexByteArrayOp_Word16 "indexWord16Array#" GenPrimOp + ByteArr# -> Int# -> Word# + +primop IndexByteArrayOp_Word32 "indexWord32Array#" GenPrimOp + ByteArr# -> Int# -> WORD32 + +primop IndexByteArrayOp_Word64 "indexWord64Array#" GenPrimOp + ByteArr# -> Int# -> WORD64 + +primop ReadByteArrayOp_Char "readCharArray#" GenPrimOp + MutByteArr# s -> Int# -> State# s -> (# State# s, Char# #) + {Read 8-bit character; offset in bytes.} + +primop ReadByteArrayOp_WideChar "readWideCharArray#" GenPrimOp + MutByteArr# s -> Int# -> State# s -> (# State# s, Char# #) + {Read 31-bit character; offset in 4-byte words.} + +primop ReadByteArrayOp_Int "readIntArray#" GenPrimOp + MutByteArr# s -> Int# -> State# s -> (# State# s, Int# #) + +primop ReadByteArrayOp_Word "readWordArray#" GenPrimOp + MutByteArr# s -> Int# -> State# s -> (# State# s, Word# #) + +primop ReadByteArrayOp_Addr "readAddrArray#" GenPrimOp + MutByteArr# s -> Int# -> State# s -> (# State# s, Addr# #) + +primop 
ReadByteArrayOp_Float "readFloatArray#" GenPrimOp + MutByteArr# s -> Int# -> State# s -> (# State# s, Float# #) + +primop ReadByteArrayOp_Double "readDoubleArray#" GenPrimOp + MutByteArr# s -> Int# -> State# s -> (# State# s, Double# #) + +primop ReadByteArrayOp_StablePtr "readStablePtrArray#" GenPrimOp + MutByteArr# s -> Int# -> State# s -> (# State# s, StablePtr# a #) + +primop ReadByteArrayOp_Int8 "readInt8Array#" GenPrimOp + MutByteArr# s -> Int# -> State# s -> (# State# s, Int# #) + +primop ReadByteArrayOp_Int16 "readInt16Array#" GenPrimOp + MutByteArr# s -> Int# -> State# s -> (# State# s, Int# #) + +primop ReadByteArrayOp_Int32 "readInt32Array#" GenPrimOp + MutByteArr# s -> Int# -> State# s -> (# State# s, INT32 #) + +primop ReadByteArrayOp_Int64 "readInt64Array#" GenPrimOp + MutByteArr# s -> Int# -> State# s -> (# State# s, INT64 #) + +primop ReadByteArrayOp_Word8 "readWord8Array#" GenPrimOp + MutByteArr# s -> Int# -> State# s -> (# State# s, Word# #) + +primop ReadByteArrayOp_Word16 "readWord16Array#" GenPrimOp + MutByteArr# s -> Int# -> State# s -> (# State# s, Word# #) + +primop ReadByteArrayOp_Word32 "readWord32Array#" GenPrimOp + MutByteArr# s -> Int# -> State# s -> (# State# s, WORD32 #) + +primop ReadByteArrayOp_Word64 "readWord64Array#" GenPrimOp + MutByteArr# s -> Int# -> State# s -> (# State# s, WORD64 #) + +primop WriteByteArrayOp_Char "writeCharArray#" GenPrimOp + MutByteArr# s -> Int# -> Char# -> State# s -> State# s + {Write 8-bit character; offset in bytes.} + with has_side_effects = True + +primop WriteByteArrayOp_WideChar "writeWideCharArray#" GenPrimOp + MutByteArr# s -> Int# -> Char# -> State# s -> State# s + {Write 31-bit character; offset in 4-byte words.} + with has_side_effects = True + +primop WriteByteArrayOp_Int "writeIntArray#" GenPrimOp + MutByteArr# s -> Int# -> Int# -> State# s -> State# s + with has_side_effects = True + +primop WriteByteArrayOp_Word "writeWordArray#" GenPrimOp + MutByteArr# s -> Int# -> Word# -> State# s -> State# s + with has_side_effects = True + +primop WriteByteArrayOp_Addr "writeAddrArray#" GenPrimOp + MutByteArr# s -> Int# -> Addr# -> State# s -> State# s + with has_side_effects = True + +primop WriteByteArrayOp_Float "writeFloatArray#" GenPrimOp + MutByteArr# s -> Int# -> Float# -> State# s -> State# s + with has_side_effects = True + +primop WriteByteArrayOp_Double "writeDoubleArray#" GenPrimOp + MutByteArr# s -> Int# -> Double# -> State# s -> State# s + with has_side_effects = True + +primop WriteByteArrayOp_StablePtr "writeStablePtrArray#" GenPrimOp + MutByteArr# s -> Int# -> StablePtr# a -> State# s -> State# s + with has_side_effects = True + +primop WriteByteArrayOp_Int8 "writeInt8Array#" GenPrimOp + MutByteArr# s -> Int# -> Int# -> State# s -> State# s + with has_side_effects = True + +primop WriteByteArrayOp_Int16 "writeInt16Array#" GenPrimOp + MutByteArr# s -> Int# -> Int# -> State# s -> State# s + with has_side_effects = True + +primop WriteByteArrayOp_Int32 "writeInt32Array#" GenPrimOp + MutByteArr# s -> Int# -> INT32 -> State# s -> State# s + with has_side_effects = True + +primop WriteByteArrayOp_Int64 "writeInt64Array#" GenPrimOp + MutByteArr# s -> Int# -> INT64 -> State# s -> State# s + with has_side_effects = True + +primop WriteByteArrayOp_Word8 "writeWord8Array#" GenPrimOp + MutByteArr# s -> Int# -> Word# -> State# s -> State# s + with has_side_effects = True + +primop WriteByteArrayOp_Word16 "writeWord16Array#" GenPrimOp + MutByteArr# s -> Int# -> Word# -> State# s -> State# s + with has_side_effects = True 
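Interleaving a hedged illustration between the write entries above and below (module and function names are invented, and the 8# byte count assumes a 64-bit word): the mutable byte-array primops in this section are normally driven through an IO or ST state token, as in this small round trip; the IO constructor is imported from its home of this era.

    {-# LANGUAGE MagicHash, UnboxedTuples #-}
    module ByteArrayDemo where

    import GHC.Exts
    import GHC.IOBase (IO (..))   -- IO's constructor lives here in this era

    -- Allocate one machine word, write an Int into it, read it back.
    roundTripInt :: Int -> IO Int
    roundTripInt (I# n) = IO (\s0 ->
      case newByteArray# 8# s0        of { (# s1, mba #) ->
      case writeIntArray# mba 0# n s1 of { s2 ->
      case readIntArray#  mba 0# s2   of { (# s3, m #) ->
      (# s3, I# m #) }}})
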
+ +primop WriteByteArrayOp_Word32 "writeWord32Array#" GenPrimOp + MutByteArr# s -> Int# -> WORD32 -> State# s -> State# s + with has_side_effects = True + +primop WriteByteArrayOp_Word64 "writeWord64Array#" GenPrimOp + MutByteArr# s -> Int# -> WORD64 -> State# s -> State# s + with has_side_effects = True + +------------------------------------------------------------------------ +section "Addr#" + {Addr\# is an arbitrary machine address assumed to point outside + the garbage-collected heap. + + NB: {\tt nullAddr\#::Addr\#} is not a primop, but is defined in MkId.lhs. + It is the null address.} +------------------------------------------------------------------------ + +primop AddrAddOp "plusAddr#" GenPrimOp Addr# -> Int# -> Addr# +primop AddrSubOp "minusAddr#" GenPrimOp Addr# -> Addr# -> Int# + {Result is meaningless if two Addr\#s are so far apart that their + difference doesn't fit in an Int\#.} +primop AddrRemOp "remAddr#" GenPrimOp Addr# -> Int# -> Int# + {Return the remainder when the Addr\# arg, treated like an Int\#, + is divided by the Int\# arg.} +#if (WORD_SIZE_IN_BITS == 32 || WORD_SIZE_IN_BITS == 64) +primop Addr2IntOp "addr2Int#" GenPrimOp Addr# -> Int# + {Coerce directly from address to int. Strongly deprecated.} +primop Int2AddrOp "int2Addr#" GenPrimOp Int# -> Addr# + {Coerce directly from int to address. Strongly deprecated.} +#endif + +primop AddrGtOp "gtAddr#" Compare Addr# -> Addr# -> Bool +primop AddrGeOp "geAddr#" Compare Addr# -> Addr# -> Bool +primop AddrEqOp "eqAddr#" Compare Addr# -> Addr# -> Bool +primop AddrNeOp "neAddr#" Compare Addr# -> Addr# -> Bool +primop AddrLtOp "ltAddr#" Compare Addr# -> Addr# -> Bool +primop AddrLeOp "leAddr#" Compare Addr# -> Addr# -> Bool + +primop IndexOffAddrOp_Char "indexCharOffAddr#" GenPrimOp + Addr# -> Int# -> Char# + {Reads 8-bit character; offset in bytes.} + +primop IndexOffAddrOp_WideChar "indexWideCharOffAddr#" GenPrimOp + Addr# -> Int# -> Char# + {Reads 31-bit character; offset in 4-byte words.} + +primop IndexOffAddrOp_Int "indexIntOffAddr#" GenPrimOp + Addr# -> Int# -> Int# + +primop IndexOffAddrOp_Word "indexWordOffAddr#" GenPrimOp + Addr# -> Int# -> Word# + +primop IndexOffAddrOp_Addr "indexAddrOffAddr#" GenPrimOp + Addr# -> Int# -> Addr# + +primop IndexOffAddrOp_Float "indexFloatOffAddr#" GenPrimOp + Addr# -> Int# -> Float# + +primop IndexOffAddrOp_Double "indexDoubleOffAddr#" GenPrimOp + Addr# -> Int# -> Double# + +primop IndexOffAddrOp_StablePtr "indexStablePtrOffAddr#" GenPrimOp + Addr# -> Int# -> StablePtr# a + +primop IndexOffAddrOp_Int8 "indexInt8OffAddr#" GenPrimOp + Addr# -> Int# -> Int# + +primop IndexOffAddrOp_Int16 "indexInt16OffAddr#" GenPrimOp + Addr# -> Int# -> Int# + +primop IndexOffAddrOp_Int32 "indexInt32OffAddr#" GenPrimOp + Addr# -> Int# -> INT32 + +primop IndexOffAddrOp_Int64 "indexInt64OffAddr#" GenPrimOp + Addr# -> Int# -> INT64 + +primop IndexOffAddrOp_Word8 "indexWord8OffAddr#" GenPrimOp + Addr# -> Int# -> Word# + +primop IndexOffAddrOp_Word16 "indexWord16OffAddr#" GenPrimOp + Addr# -> Int# -> Word# + +primop IndexOffAddrOp_Word32 "indexWord32OffAddr#" GenPrimOp + Addr# -> Int# -> WORD32 + +primop IndexOffAddrOp_Word64 "indexWord64OffAddr#" GenPrimOp + Addr# -> Int# -> WORD64 + +primop ReadOffAddrOp_Char "readCharOffAddr#" GenPrimOp + Addr# -> Int# -> State# s -> (# State# s, Char# #) + {Reads 8-bit character; offset in bytes.} + +primop ReadOffAddrOp_WideChar "readWideCharOffAddr#" GenPrimOp + Addr# -> Int# -> State# s -> (# State# s, Char# #) + {Reads 31-bit character; offset in 4-byte words.} 
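A hedged aside between the read entries above and below: at the source level an Addr# typically comes from a primitive string literal or a Ptr, and the offset reads documented here box up directly. Module and function names are invented; note that the 8-bit read returns a Word#, exactly as listed above, and that no bounds check is performed.

    {-# LANGUAGE MagicHash #-}
    module AddrDemo where

    import GHC.Exts (Int (I#), Word (W#), indexWord8OffAddr#)

    -- "GHC"# is a primitive string literal of type Addr#; indexing past
    -- its end is undefined, so keep i in range.
    byteAt :: Int -> Word
    byteAt (I# i) = W# (indexWord8OffAddr# "GHC"# i)
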
+ +primop ReadOffAddrOp_Int "readIntOffAddr#" GenPrimOp + Addr# -> Int# -> State# s -> (# State# s, Int# #) + +primop ReadOffAddrOp_Word "readWordOffAddr#" GenPrimOp + Addr# -> Int# -> State# s -> (# State# s, Word# #) + +primop ReadOffAddrOp_Addr "readAddrOffAddr#" GenPrimOp + Addr# -> Int# -> State# s -> (# State# s, Addr# #) + +primop ReadOffAddrOp_Float "readFloatOffAddr#" GenPrimOp + Addr# -> Int# -> State# s -> (# State# s, Float# #) + +primop ReadOffAddrOp_Double "readDoubleOffAddr#" GenPrimOp + Addr# -> Int# -> State# s -> (# State# s, Double# #) + +primop ReadOffAddrOp_StablePtr "readStablePtrOffAddr#" GenPrimOp + Addr# -> Int# -> State# s -> (# State# s, StablePtr# a #) + +primop ReadOffAddrOp_Int8 "readInt8OffAddr#" GenPrimOp + Addr# -> Int# -> State# s -> (# State# s, Int# #) + +primop ReadOffAddrOp_Int16 "readInt16OffAddr#" GenPrimOp + Addr# -> Int# -> State# s -> (# State# s, Int# #) + +primop ReadOffAddrOp_Int32 "readInt32OffAddr#" GenPrimOp + Addr# -> Int# -> State# s -> (# State# s, INT32 #) + +primop ReadOffAddrOp_Int64 "readInt64OffAddr#" GenPrimOp + Addr# -> Int# -> State# s -> (# State# s, INT64 #) + +primop ReadOffAddrOp_Word8 "readWord8OffAddr#" GenPrimOp + Addr# -> Int# -> State# s -> (# State# s, Word# #) + +primop ReadOffAddrOp_Word16 "readWord16OffAddr#" GenPrimOp + Addr# -> Int# -> State# s -> (# State# s, Word# #) + +primop ReadOffAddrOp_Word32 "readWord32OffAddr#" GenPrimOp + Addr# -> Int# -> State# s -> (# State# s, WORD32 #) + +primop ReadOffAddrOp_Word64 "readWord64OffAddr#" GenPrimOp + Addr# -> Int# -> State# s -> (# State# s, WORD64 #) + + +primop WriteOffAddrOp_Char "writeCharOffAddr#" GenPrimOp + Addr# -> Int# -> Char# -> State# s -> State# s + with has_side_effects = True + +primop WriteOffAddrOp_WideChar "writeWideCharOffAddr#" GenPrimOp + Addr# -> Int# -> Char# -> State# s -> State# s + with has_side_effects = True + +primop WriteOffAddrOp_Int "writeIntOffAddr#" GenPrimOp + Addr# -> Int# -> Int# -> State# s -> State# s + with has_side_effects = True + +primop WriteOffAddrOp_Word "writeWordOffAddr#" GenPrimOp + Addr# -> Int# -> Word# -> State# s -> State# s + with has_side_effects = True + +primop WriteOffAddrOp_Addr "writeAddrOffAddr#" GenPrimOp + Addr# -> Int# -> Addr# -> State# s -> State# s + with has_side_effects = True + +primop WriteOffAddrOp_Float "writeFloatOffAddr#" GenPrimOp + Addr# -> Int# -> Float# -> State# s -> State# s + with has_side_effects = True + +primop WriteOffAddrOp_Double "writeDoubleOffAddr#" GenPrimOp + Addr# -> Int# -> Double# -> State# s -> State# s + with has_side_effects = True + +primop WriteOffAddrOp_StablePtr "writeStablePtrOffAddr#" GenPrimOp + Addr# -> Int# -> StablePtr# a -> State# s -> State# s + with has_side_effects = True + +primop WriteOffAddrOp_Int8 "writeInt8OffAddr#" GenPrimOp + Addr# -> Int# -> Int# -> State# s -> State# s + with has_side_effects = True + +primop WriteOffAddrOp_Int16 "writeInt16OffAddr#" GenPrimOp + Addr# -> Int# -> Int# -> State# s -> State# s + with has_side_effects = True + +primop WriteOffAddrOp_Int32 "writeInt32OffAddr#" GenPrimOp + Addr# -> Int# -> INT32 -> State# s -> State# s + with has_side_effects = True + +primop WriteOffAddrOp_Int64 "writeInt64OffAddr#" GenPrimOp + Addr# -> Int# -> INT64 -> State# s -> State# s + with has_side_effects = True + +primop WriteOffAddrOp_Word8 "writeWord8OffAddr#" GenPrimOp + Addr# -> Int# -> Word# -> State# s -> State# s + with has_side_effects = True + +primop WriteOffAddrOp_Word16 "writeWord16OffAddr#" GenPrimOp + Addr# -> Int# -> Word# -> State# 
s -> State# s + with has_side_effects = True + +primop WriteOffAddrOp_Word32 "writeWord32OffAddr#" GenPrimOp + Addr# -> Int# -> WORD32 -> State# s -> State# s + with has_side_effects = True + +primop WriteOffAddrOp_Word64 "writeWord64OffAddr#" GenPrimOp + Addr# -> Int# -> WORD64 -> State# s -> State# s + with has_side_effects = True + +------------------------------------------------------------------------ +section "Mutable variables" + {Operations on MutVar\#s, which behave like single-element mutable arrays.} +------------------------------------------------------------------------ + +primop NewMutVarOp "newMutVar#" GenPrimOp + a -> State# s -> (# State# s, MutVar# s a #) + {Create MutVar\# with specified initial value in specified state thread.} + with + usage = { mangle NewMutVarOp [mkM, mkP] mkM } + out_of_line = True + +primop ReadMutVarOp "readMutVar#" GenPrimOp + MutVar# s a -> State# s -> (# State# s, a #) + {Read contents of MutVar\#. Result is not yet evaluated.} + with + usage = { mangle ReadMutVarOp [mkM, mkP] mkM } + +primop WriteMutVarOp "writeMutVar#" GenPrimOp + MutVar# s a -> a -> State# s -> State# s + {Write contents of MutVar\#.} + with + usage = { mangle WriteMutVarOp [mkM, mkM, mkP] mkR } + has_side_effects = True + +primop SameMutVarOp "sameMutVar#" GenPrimOp + MutVar# s a -> MutVar# s a -> Bool + with + usage = { mangle SameMutVarOp [mkP, mkP] mkM } + +-- not really the right type, but we don't know about pairs here. The +-- correct type is +-- +-- MutVar# s a -> (a -> (a,b)) -> State# s -> (# State# s, b #) +-- +primop AtomicModifyMutVarOp "atomicModifyMutVar#" GenPrimOp + MutVar# s a -> (a -> b) -> State# s -> (# State# s, c #) + with + usage = { mangle AtomicModifyMutVarOp [mkP, mkM, mkP] mkM } + has_side_effects = True + out_of_line = True + +------------------------------------------------------------------------ +section "Exceptions" +------------------------------------------------------------------------ + +primop CatchOp "catch#" GenPrimOp + (State# RealWorld -> (# State# RealWorld, a #) ) + -> (b -> State# RealWorld -> (# State# RealWorld, a #) ) + -> State# RealWorld + -> (# State# RealWorld, a #) + with + -- Catch is actually strict in its first argument + -- but we don't want to tell the strictness + -- analyser about that! + usage = { mangle CatchOp [mkM, mkM . (inFun CatchOp mkM mkM), mkP] mkM } + -- [mkO, mkO . (inFun mkM mkO)] mkO + -- might use caught action multiply + out_of_line = True + +primop RaiseOp "raise#" GenPrimOp + a -> b + with + strictness = { \ arity -> mkStrictSig (mkTopDmdType [lazyDmd] BotRes) } + -- NB: result is bottom + usage = { mangle RaiseOp [mkM] mkM } + out_of_line = True + +-- raiseIO# needs to be a primop, because exceptions in the IO monad +-- must be *precise* - we don't want the strictness analyser turning +-- one kind of bottom into another, as it is allowed to do in pure code. 
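Before the raiseIO# entry below, a hedged sketch of how the catch# and raise# primops above get wrapped at the IO level. In the real libraries (GHC.Exception) the handler's argument is fixed to the library's exception type; leaving it fully polymorphic here is unsafe and is done only to show the State#-token plumbing, with the IO constructor again imported from its location of this era.

    {-# LANGUAGE MagicHash, UnboxedTuples #-}
    module CatchDemo where

    import GHC.Exts (catch#, raise#)
    import GHC.IOBase (IO (..))   -- IO's constructor lives here in this era

    -- raise# never returns, so its result type is completely free.
    throwAny :: e -> a
    throwAny = raise#

    -- Unwrapping IO exposes exactly the State#-passing type that catch#
    -- expects for both the protected action and the handler.
    catchAny :: IO a -> (e -> IO a) -> IO a
    catchAny (IO io) handler = IO (catch# io (\e -> unIO (handler e)))
      where unIO (IO m) = m
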
+ +primop RaiseIOOp "raiseIO#" GenPrimOp + a -> State# RealWorld -> (# State# RealWorld, b #) + with + out_of_line = True + +primop BlockAsyncExceptionsOp "blockAsyncExceptions#" GenPrimOp + (State# RealWorld -> (# State# RealWorld, a #)) + -> (State# RealWorld -> (# State# RealWorld, a #)) + with + out_of_line = True + +primop UnblockAsyncExceptionsOp "unblockAsyncExceptions#" GenPrimOp + (State# RealWorld -> (# State# RealWorld, a #)) + -> (State# RealWorld -> (# State# RealWorld, a #)) + with + out_of_line = True + +------------------------------------------------------------------------ +section "STM-accessible Mutable Variables" +------------------------------------------------------------------------ + +primop AtomicallyOp "atomically#" GenPrimOp + (State# RealWorld -> (# State# RealWorld, a #) ) + -> State# RealWorld -> (# State# RealWorld, a #) + with + out_of_line = True + has_side_effects = True + +primop RetryOp "retry#" GenPrimOp + State# RealWorld -> (# State# RealWorld, a #) + with + out_of_line = True + has_side_effects = True + +primop CatchRetryOp "catchRetry#" GenPrimOp + (State# RealWorld -> (# State# RealWorld, a #) ) + -> (State# RealWorld -> (# State# RealWorld, a #) ) + -> (State# RealWorld -> (# State# RealWorld, a #) ) + with + out_of_line = True + has_side_effects = True + +primop CatchSTMOp "catchSTM#" GenPrimOp + (State# RealWorld -> (# State# RealWorld, a #) ) + -> (b -> State# RealWorld -> (# State# RealWorld, a #) ) + -> (State# RealWorld -> (# State# RealWorld, a #) ) + with + out_of_line = True + has_side_effects = True + +primop NewTVarOp "newTVar#" GenPrimOp + a + -> State# s -> (# State# s, TVar# s a #) + {Create a new Tar\# holding a specified initial value.} + with + out_of_line = True + +primop ReadTVarOp "readTVar#" GenPrimOp + TVar# s a + -> State# s -> (# State# s, a #) + {Read contents of TVar\#. Result is not yet evaluated.} + with + out_of_line = True + +primop WriteTVarOp "writeTVar#" GenPrimOp + TVar# s a + -> a + -> State# s -> State# s + {Write contents of TVar\#.} + with + out_of_line = True + has_side_effects = True + +primop SameTVarOp "sameTVar#" GenPrimOp + TVar# s a -> TVar# s a -> Bool + + +------------------------------------------------------------------------ +section "Synchronized Mutable Variables" + {Operations on MVar\#s, which are shared mutable variables + ({\it not} the same as MutVar\#s!). (Note: in a non-concurrent implementation, + (MVar\# a) can be represented by (MutVar\# (Maybe a)).)} +------------------------------------------------------------------------ + + +primop NewMVarOp "newMVar#" GenPrimOp + State# s -> (# State# s, MVar# s a #) + {Create new mvar; initially empty.} + with + usage = { mangle NewMVarOp [mkP] mkR } + out_of_line = True + +primop TakeMVarOp "takeMVar#" GenPrimOp + MVar# s a -> State# s -> (# State# s, a #) + {If mvar is empty, block until it becomes full. + Then remove and return its contents, and set it empty.} + with + usage = { mangle TakeMVarOp [mkM, mkP] mkM } + has_side_effects = True + out_of_line = True + +primop TryTakeMVarOp "tryTakeMVar#" GenPrimOp + MVar# s a -> State# s -> (# State# s, Int#, a #) + {If mvar is empty, immediately return with integer 0 and value undefined. + Otherwise, return with integer 1 and contents of mvar, and set mvar empty.} + with + usage = { mangle TryTakeMVarOp [mkM, mkP] mkM } + has_side_effects = True + out_of_line = True + +primop PutMVarOp "putMVar#" GenPrimOp + MVar# s a -> a -> State# s -> State# s + {If mvar is full, block until it becomes empty. 
+ Then store value arg as its new contents.} + with + usage = { mangle PutMVarOp [mkM, mkM, mkP] mkR } + has_side_effects = True + out_of_line = True + +primop TryPutMVarOp "tryPutMVar#" GenPrimOp + MVar# s a -> a -> State# s -> (# State# s, Int# #) + {If mvar is full, immediately return with integer 0. + Otherwise, store value arg as mvar's new contents, and return with integer 1.} + with + usage = { mangle TryPutMVarOp [mkM, mkM, mkP] mkR } + has_side_effects = True + out_of_line = True + +primop SameMVarOp "sameMVar#" GenPrimOp + MVar# s a -> MVar# s a -> Bool + with + usage = { mangle SameMVarOp [mkP, mkP] mkM } + +primop IsEmptyMVarOp "isEmptyMVar#" GenPrimOp + MVar# s a -> State# s -> (# State# s, Int# #) + {Return 1 if mvar is empty; 0 otherwise.} + with + usage = { mangle IsEmptyMVarOp [mkP, mkP] mkM } + out_of_line = True + +------------------------------------------------------------------------ +section "Delay/wait operations" +------------------------------------------------------------------------ + +primop DelayOp "delay#" GenPrimOp + Int# -> State# s -> State# s + {Sleep specified number of microseconds.} + with + needs_wrapper = True + has_side_effects = True + out_of_line = True + +primop WaitReadOp "waitRead#" GenPrimOp + Int# -> State# s -> State# s + {Block until input is available on specified file descriptor.} + with + needs_wrapper = True + has_side_effects = True + out_of_line = True + +primop WaitWriteOp "waitWrite#" GenPrimOp + Int# -> State# s -> State# s + {Block until output is possible on specified file descriptor.} + with + needs_wrapper = True + has_side_effects = True + out_of_line = True + +#ifdef mingw32_TARGET_OS +primop AsyncReadOp "asyncRead#" GenPrimOp + Int# -> Int# -> Int# -> Addr# -> State# RealWorld-> (# State# RealWorld, Int#, Int# #) + {Asynchronously read bytes from specified file descriptor.} + with + needs_wrapper = True + has_side_effects = True + out_of_line = True + +primop AsyncWriteOp "asyncWrite#" GenPrimOp + Int# -> Int# -> Int# -> Addr# -> State# RealWorld-> (# State# RealWorld, Int#, Int# #) + {Asynchronously write bytes from specified file descriptor.} + with + needs_wrapper = True + has_side_effects = True + out_of_line = True + +primop AsyncDoProcOp "asyncDoProc#" GenPrimOp + Addr# -> Addr# -> State# RealWorld-> (# State# RealWorld, Int#, Int# #) + {Asynchronously perform procedure (first arg), passing it 2nd arg.} + with + needs_wrapper = True + has_side_effects = True + out_of_line = True + +#endif + +------------------------------------------------------------------------ +section "Concurrency primitives" + {(In a non-concurrent implementation, ThreadId\# can be as singleton + type, whose (unique) value is returned by myThreadId\#. 
The + other operations can be omitted.)} +------------------------------------------------------------------------ + +primop ForkOp "fork#" GenPrimOp + a -> State# RealWorld -> (# State# RealWorld, ThreadId# #) + with + usage = { mangle ForkOp [mkO, mkP] mkR } + has_side_effects = True + out_of_line = True + +primop ForkOnOp "forkOn#" GenPrimOp + Int# -> a -> State# RealWorld -> (# State# RealWorld, ThreadId# #) + with + usage = { mangle ForkOnOp [mkO, mkP] mkR } + has_side_effects = True + out_of_line = True + +primop KillThreadOp "killThread#" GenPrimOp + ThreadId# -> a -> State# RealWorld -> State# RealWorld + with + usage = { mangle KillThreadOp [mkP, mkM, mkP] mkR } + has_side_effects = True + out_of_line = True + +primop YieldOp "yield#" GenPrimOp + State# RealWorld -> State# RealWorld + with + has_side_effects = True + out_of_line = True + +primop MyThreadIdOp "myThreadId#" GenPrimOp + State# RealWorld -> (# State# RealWorld, ThreadId# #) + with + out_of_line = True + +primop LabelThreadOp "labelThread#" GenPrimOp + ThreadId# -> Addr# -> State# RealWorld -> State# RealWorld + with + has_side_effects = True + out_of_line = True + +primop IsCurrentThreadBoundOp "isCurrentThreadBound#" GenPrimOp + State# RealWorld -> (# State# RealWorld, Int# #) + with + out_of_line = True + +------------------------------------------------------------------------ +section "Weak pointers" +------------------------------------------------------------------------ + +-- note that tyvar "o" denotes openAlphaTyVar + +primop MkWeakOp "mkWeak#" GenPrimOp + o -> b -> c -> State# RealWorld -> (# State# RealWorld, Weak# b #) + with + usage = { mangle MkWeakOp [mkZ, mkM, mkM, mkP] mkM } + has_side_effects = True + out_of_line = True + +primop DeRefWeakOp "deRefWeak#" GenPrimOp + Weak# a -> State# RealWorld -> (# State# RealWorld, Int#, a #) + with + usage = { mangle DeRefWeakOp [mkM, mkP] mkM } + has_side_effects = True + out_of_line = True + +primop FinalizeWeakOp "finalizeWeak#" GenPrimOp + Weak# a -> State# RealWorld -> (# State# RealWorld, Int#, + (State# RealWorld -> (# State# RealWorld, () #)) #) + with + usage = { mangle FinalizeWeakOp [mkM, mkP] + (mkR . 
(inUB FinalizeWeakOp + [id,id,inFun FinalizeWeakOp mkR mkM])) } + has_side_effects = True + out_of_line = True + +primop TouchOp "touch#" GenPrimOp + o -> State# RealWorld -> State# RealWorld + with + has_side_effects = True + +------------------------------------------------------------------------ +section "Stable pointers and names" +------------------------------------------------------------------------ + +primop MakeStablePtrOp "makeStablePtr#" GenPrimOp + a -> State# RealWorld -> (# State# RealWorld, StablePtr# a #) + with + usage = { mangle MakeStablePtrOp [mkM, mkP] mkM } + has_side_effects = True + out_of_line = True + +primop DeRefStablePtrOp "deRefStablePtr#" GenPrimOp + StablePtr# a -> State# RealWorld -> (# State# RealWorld, a #) + with + usage = { mangle DeRefStablePtrOp [mkM, mkP] mkM } + needs_wrapper = True + has_side_effects = True + out_of_line = True + +primop EqStablePtrOp "eqStablePtr#" GenPrimOp + StablePtr# a -> StablePtr# a -> Int# + with + usage = { mangle EqStablePtrOp [mkP, mkP] mkR } + has_side_effects = True + +primop MakeStableNameOp "makeStableName#" GenPrimOp + a -> State# RealWorld -> (# State# RealWorld, StableName# a #) + with + usage = { mangle MakeStableNameOp [mkZ, mkP] mkR } + needs_wrapper = True + has_side_effects = True + out_of_line = True + +primop EqStableNameOp "eqStableName#" GenPrimOp + StableName# a -> StableName# a -> Int# + with + usage = { mangle EqStableNameOp [mkP, mkP] mkR } + +primop StableNameToIntOp "stableNameToInt#" GenPrimOp + StableName# a -> Int# + with + usage = { mangle StableNameToIntOp [mkP] mkR } + +------------------------------------------------------------------------ +section "Unsafe pointer equality" +-- (#1 Bad Guy: Alistair Reid :) +------------------------------------------------------------------------ + +primop ReallyUnsafePtrEqualityOp "reallyUnsafePtrEquality#" GenPrimOp + a -> a -> Int# + with + usage = { mangle ReallyUnsafePtrEqualityOp [mkZ, mkZ] mkR } + +------------------------------------------------------------------------ +section "Parallelism" +------------------------------------------------------------------------ + +primop ParOp "par#" GenPrimOp + a -> Int# + with + usage = { mangle ParOp [mkO] mkR } + -- Note that Par is lazy to avoid that the sparked thing + -- gets evaluted strictly, which it should *not* be + has_side_effects = True + +-- HWL: The first 4 Int# in all par... annotations denote: +-- name, granularity info, size of result, degree of parallelism +-- Same structure as _seq_ i.e. 
returns Int# +-- KSW: v, the second arg in parAt# and parAtForNow#, is used only to determine +-- `the processor containing the expression v'; it is not evaluated + +primop ParGlobalOp "parGlobal#" GenPrimOp + a -> Int# -> Int# -> Int# -> Int# -> b -> Int# + with + usage = { mangle ParGlobalOp [mkO, mkP, mkP, mkP, mkP, mkM] mkM } + has_side_effects = True + +primop ParLocalOp "parLocal#" GenPrimOp + a -> Int# -> Int# -> Int# -> Int# -> b -> Int# + with + usage = { mangle ParLocalOp [mkO, mkP, mkP, mkP, mkP, mkM] mkM } + has_side_effects = True + +primop ParAtOp "parAt#" GenPrimOp + b -> a -> Int# -> Int# -> Int# -> Int# -> c -> Int# + with + usage = { mangle ParAtOp [mkO, mkZ, mkP, mkP, mkP, mkP, mkM] mkM } + has_side_effects = True + +primop ParAtAbsOp "parAtAbs#" GenPrimOp + a -> Int# -> Int# -> Int# -> Int# -> Int# -> b -> Int# + with + usage = { mangle ParAtAbsOp [mkO, mkP, mkP, mkP, mkP, mkM] mkM } + has_side_effects = True + +primop ParAtRelOp "parAtRel#" GenPrimOp + a -> Int# -> Int# -> Int# -> Int# -> Int# -> b -> Int# + with + usage = { mangle ParAtRelOp [mkO, mkP, mkP, mkP, mkP, mkM] mkM } + has_side_effects = True + +primop ParAtForNowOp "parAtForNow#" GenPrimOp + b -> a -> Int# -> Int# -> Int# -> Int# -> c -> Int# + with + usage = { mangle ParAtForNowOp [mkO, mkZ, mkP, mkP, mkP, mkP, mkM] mkM } + has_side_effects = True + +-- copyable# and noFollow# are yet to be implemented (for GpH) +-- +--primop CopyableOp "copyable#" GenPrimOp +-- a -> Int# +-- with +-- usage = { mangle CopyableOp [mkZ] mkR } +-- has_side_effects = True +-- +--primop NoFollowOp "noFollow#" GenPrimOp +-- a -> Int# +-- with +-- usage = { mangle NoFollowOp [mkZ] mkR } +-- has_side_effects = True + + +------------------------------------------------------------------------ +section "Tag to enum stuff" + {Convert back and forth between values of enumerated types + and small integers.} +------------------------------------------------------------------------ + +primop DataToTagOp "dataToTag#" GenPrimOp + a -> Int# + with + strictness = { \ arity -> mkStrictSig (mkTopDmdType [seqDmd] TopRes) } + -- dataToTag# must have an evaluated argument + +primop TagToEnumOp "tagToEnum#" GenPrimOp + Int# -> a + +------------------------------------------------------------------------ +section "Bytecode operations" + {Support for the bytecode interpreter and linker.} +------------------------------------------------------------------------ + + +primop AddrToHValueOp "addrToHValue#" GenPrimOp + Addr# -> (# a #) + {Convert an Addr\# to a followable type.} + +primop MkApUpd0_Op "mkApUpd0#" GenPrimOp + BCO# -> (# a #) + with + out_of_line = True + +primop NewBCOOp "newBCO#" GenPrimOp + ByteArr# -> ByteArr# -> Array# a -> ByteArr# -> Int# -> ByteArr# -> State# s -> (# State# s, BCO# #) + with + has_side_effects = True + out_of_line = True + +------------------------------------------------------------------------ +section "Coercion" + {{\tt unsafeCoerce\# :: a -> b} is not a primop, but is defined in MkId.lhs.} + +------------------------------------------------------------------------ + + +------------------------------------------------------------------------ +--- --- +------------------------------------------------------------------------ + +thats_all_folks + + + diff --git a/compiler/profiling/CostCentre.lhs b/compiler/profiling/CostCentre.lhs new file mode 100644 index 0000000000..3ee46a88db --- /dev/null +++ b/compiler/profiling/CostCentre.lhs @@ -0,0 +1,373 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 
+% +\section[CostCentre]{The @CostCentre@ data type} + +\begin{code} +module CostCentre ( + CostCentre(..), CcName, IsDupdCC(..), IsCafCC(..), + -- All abstract except to friend: ParseIface.y + + CostCentreStack, + CollectedCCs, + noCCS, subsumedCCS, currentCCS, overheadCCS, dontCareCCS, + noCostCentre, noCCAttached, + noCCSAttached, isCurrentCCS, isSubsumedCCS, currentOrSubsumedCCS, + isDerivedFromCurrentCCS, maybeSingletonCCS, + decomposeCCS, + + mkUserCC, mkAutoCC, mkAllCafsCC, + mkSingletonCCS, dupifyCC, pushCCOnCCS, + isCafCCS, isCafCC, + isSccCountCostCentre, + sccAbleCostCentre, + ccFromThisModule, + + pprCostCentreCore, + costCentreUserName, + + cmpCostCentre -- used for removing dups in a list + ) where + +#include "HsVersions.h" + +import Var ( Id ) +import Name ( getOccName, occNameFS ) +import Module ( Module, moduleFS ) +import Outputable +import FastTypes +import FastString +import Util ( thenCmp ) +\end{code} + +A Cost Centre Stack is something that can be attached to a closure. +This is either: + + - the current cost centre stack (CCCS) + - a pre-defined cost centre stack (there are several + pre-defined CCSs, see below). + +\begin{code} +data CostCentreStack + = NoCCS + + | CurrentCCS -- Pinned on a let(rec)-bound + -- thunk/function/constructor, this says that the + -- cost centre to be attached to the object, when it + -- is allocated, is whatever is in the + -- current-cost-centre-stack register. + + | SubsumedCCS -- Cost centre stack for top-level subsumed functions + -- (CAFs get an AllCafsCC). + -- Its execution costs get subsumed into the caller. + -- This guy is *only* ever pinned on static closures, + -- and is *never* the cost centre for an SCC construct. + + | OverheadCCS -- We charge costs due to the profiling-system + -- doing its work to "overhead". + -- + -- Objects whose CCS is "Overhead" + -- have their *allocation* charged to "overhead", + -- but have the current CCS put into the object + -- itself. + + -- For example, if we transform "f g" to "let + -- g' = g in f g'" (so that something about + -- profiling works better...), then we charge + -- the *allocation* of g' to OverheadCCS, but + -- we put the cost-centre of the call to f + -- (i.e., current CCS) into the g' object. When + -- g' is entered, the CCS of the call + -- to f will be set. + + | DontCareCCS -- We need a CCS to stick in static closures + -- (for data), but we *don't* expect them to + -- accumulate any costs. But we still need + -- the placeholder. This CCS is it. + + | PushCC CostCentre CostCentreStack + -- These are used during code generation as the CCSs + -- attached to closures. A PushCC never appears as + -- the argument to an _scc_. + -- + -- The tail (2nd argument) is either NoCCS, indicating + -- a staticly allocated CCS, or CurrentCCS indicating + -- a dynamically created CCS. We only support + -- statically allocated *singleton* CCSs at the + -- moment, for the purposes of initialising the CCS + -- field of a CAF. + + deriving (Eq, Ord) -- needed for Ord on CLabel +\end{code} + +A Cost Centre is the argument of an _scc_ expression. + +\begin{code} +data CostCentre + = NoCostCentre -- Having this constructor avoids having + -- to use "Maybe CostCentre" all the time. + + | NormalCC { + cc_name :: CcName, -- Name of the cost centre itself + cc_mod :: Module, -- Name of module defining this CC. + cc_is_dupd :: IsDupdCC, -- see below + cc_is_caf :: IsCafCC -- see below + } + + | AllCafsCC { + cc_mod :: Module -- Name of module defining this CC. 
+ } + +type CcName = FastString + +data IsDupdCC + = OriginalCC -- This says how the CC is *used*. Saying that + | DupdCC -- it is DupdCC doesn't make it a different + -- CC, just that it a sub-expression which has + -- been moved ("dupd") into a different scope. + -- + -- The point about a dupd SCC is that we don't + -- count entries to it, because it's not the + -- "original" one. + -- + -- In the papers, it's called "SCCsub", + -- i.e. SCCsub CC == SCC DupdCC, + -- but we are trying to avoid confusion between + -- "subd" and "subsumed". So we call the former + -- "dupd". + +data IsCafCC = CafCC | NotCafCC + +-- synonym for triple which describes the cost centre info in the generated +-- code for a module. +type CollectedCCs + = ( [CostCentre] -- local cost-centres that need to be decl'd + , [CostCentre] -- "extern" cost-centres + , [CostCentreStack] -- pre-defined "singleton" cost centre stacks + ) +\end{code} + +WILL: Would there be any merit to recording ``I am now using a +cost-centre from another module''? I don't know if this would help a +user; it might be interesting to us to know how much computation is +being moved across module boundaries. + +SIMON: Maybe later... + +\begin{code} + +noCCS = NoCCS +subsumedCCS = SubsumedCCS +currentCCS = CurrentCCS +overheadCCS = OverheadCCS +dontCareCCS = DontCareCCS + +noCostCentre = NoCostCentre +\end{code} + +Predicates on Cost-Centre Stacks + +\begin{code} +noCCSAttached NoCCS = True +noCCSAttached _ = False + +noCCAttached NoCostCentre = True +noCCAttached _ = False + +isCurrentCCS CurrentCCS = True +isCurrentCCS _ = False + +isSubsumedCCS SubsumedCCS = True +isSubsumedCCS _ = False + +isCafCCS (PushCC cc NoCCS) = isCafCC cc +isCafCCS _ = False + +isDerivedFromCurrentCCS CurrentCCS = True +isDerivedFromCurrentCCS (PushCC _ ccs) = isDerivedFromCurrentCCS ccs +isDerivedFromCurrentCCS _ = False + +currentOrSubsumedCCS SubsumedCCS = True +currentOrSubsumedCCS CurrentCCS = True +currentOrSubsumedCCS _ = False + +maybeSingletonCCS (PushCC cc NoCCS) = Just cc +maybeSingletonCCS _ = Nothing +\end{code} + +Building cost centres + +\begin{code} +mkUserCC :: FastString -> Module -> CostCentre +mkUserCC cc_name mod + = NormalCC { cc_name = cc_name, cc_mod = mod, + cc_is_dupd = OriginalCC, cc_is_caf = NotCafCC {-might be changed-} + } + +mkAutoCC :: Id -> Module -> IsCafCC -> CostCentre +mkAutoCC id mod is_caf + = NormalCC { cc_name = occNameFS (getOccName id), cc_mod = mod, + cc_is_dupd = OriginalCC, cc_is_caf = is_caf + } + +mkAllCafsCC m = AllCafsCC { cc_mod = m } + + + +mkSingletonCCS :: CostCentre -> CostCentreStack +mkSingletonCCS cc = pushCCOnCCS cc NoCCS + +pushCCOnCCS :: CostCentre -> CostCentreStack -> CostCentreStack +pushCCOnCCS = PushCC + +dupifyCC cc = cc {cc_is_dupd = DupdCC} + +isCafCC, isDupdCC :: CostCentre -> Bool + +isCafCC (AllCafsCC {}) = True +isCafCC (NormalCC {cc_is_caf = CafCC}) = True +isCafCC _ = False + +isDupdCC (NormalCC {cc_is_dupd = DupdCC}) = True +isDupdCC _ = False + +isSccCountCostCentre :: CostCentre -> Bool + -- Is this a cost-centre which records scc counts + +#if DEBUG +isSccCountCostCentre NoCostCentre = panic "isSccCount:NoCostCentre" +#endif +isSccCountCostCentre cc | isCafCC cc = False + | isDupdCC cc = False + | otherwise = True + +sccAbleCostCentre :: CostCentre -> Bool + -- Is this a cost-centre which can be sccd ? 
+ +#if DEBUG +sccAbleCostCentre NoCostCentre = panic "sccAbleCC:NoCostCentre" +#endif +sccAbleCostCentre cc | isCafCC cc = False + | otherwise = True + +ccFromThisModule :: CostCentre -> Module -> Bool +ccFromThisModule cc m = cc_mod cc == m +\end{code} + +\begin{code} +instance Eq CostCentre where + c1 == c2 = case c1 `cmpCostCentre` c2 of { EQ -> True; _ -> False } + +instance Ord CostCentre where + compare = cmpCostCentre + +cmpCostCentre :: CostCentre -> CostCentre -> Ordering + +cmpCostCentre (AllCafsCC {cc_mod = m1}) (AllCafsCC {cc_mod = m2}) = m1 `compare` m2 + +cmpCostCentre (NormalCC {cc_name = n1, cc_mod = m1, cc_is_caf = c1}) + (NormalCC {cc_name = n2, cc_mod = m2, cc_is_caf = c2}) + -- first key is module name, then we use "kinds" (which include + -- names) and finally the caf flag + = (m1 `compare` m2) `thenCmp` (n1 `compare` n2) `thenCmp` (c1 `cmp_caf` c2) + +cmpCostCentre other_1 other_2 + = let + tag1 = tag_CC other_1 + tag2 = tag_CC other_2 + in + if tag1 <# tag2 then LT else GT + where + tag_CC (NormalCC {}) = (_ILIT 1 :: FastInt) + tag_CC (AllCafsCC {}) = _ILIT 2 + +cmp_caf NotCafCC CafCC = LT +cmp_caf NotCafCC NotCafCC = EQ +cmp_caf CafCC CafCC = EQ +cmp_caf CafCC NotCafCC = GT + +decomposeCCS :: CostCentreStack -> ([CostCentre],CostCentreStack) +decomposeCCS (PushCC cc ccs) = (cc:more, ccs') + where (more,ccs') = decomposeCCS ccs +decomposeCCS ccs = ([],ccs) +\end{code} + +----------------------------------------------------------------------------- +Printing Cost Centre Stacks. + +The outputable instance for CostCentreStack prints the CCS as a C +expression. + +NOTE: Not all cost centres are suitable for using in a static +initializer. In particular, the PushCC forms where the tail is CCCS +may only be used in inline C code because they expand to a +non-constant C expression. + +\begin{code} +instance Outputable CostCentreStack where + ppr NoCCS = ptext SLIT("NO_CCS") + ppr CurrentCCS = ptext SLIT("CCCS") + ppr OverheadCCS = ptext SLIT("CCS_OVERHEAD") + ppr DontCareCCS = ptext SLIT("CCS_DONT_CARE") + ppr SubsumedCCS = ptext SLIT("CCS_SUBSUMED") + ppr (PushCC cc NoCCS) = ppr cc <> ptext SLIT("_ccs") + ppr (PushCC cc ccs) = ptext SLIT("PushCostCentre") <> + parens (ppr ccs <> comma <> + parens(ptext SLIT("void *")) <> ppr cc) +\end{code} + +----------------------------------------------------------------------------- +Printing Cost Centres. + +There are several different ways in which we might want to print a +cost centre: + + - the name of the cost centre, for profiling output (a C string) + - the label, i.e. C label for cost centre in .hc file. + - the debugging name, for output in -ddump things + - the interface name, for printing in _scc_ exprs in iface files. + +The last 3 are derived from costCentreStr below. The first is given +by costCentreName. + +\begin{code} +instance Outputable CostCentre where + ppr cc = getPprStyle $ \ sty -> + if codeStyle sty + then ppCostCentreLbl cc + else text (costCentreUserName cc) + +-- Printing in an interface file or in Core generally +pprCostCentreCore (AllCafsCC {cc_mod = m}) + = text "__sccC" <+> braces (ppr_mod m) +pprCostCentreCore (NormalCC {cc_name = n, cc_mod = m, + cc_is_caf = caf, cc_is_dupd = dup}) + = text "__scc" <+> braces (hsep [ + ftext (zEncodeFS n), + ppr_mod m, + pp_dup dup, + pp_caf caf + ]) + +pp_dup DupdCC = char '!' 
+pp_dup other = empty + +pp_caf CafCC = text "__C" +pp_caf other = empty + +ppr_mod m = ftext (zEncodeFS (moduleFS m)) + +-- Printing as a C label +ppCostCentreLbl (NoCostCentre) = text "NONE_cc" +ppCostCentreLbl (AllCafsCC {cc_mod = m}) = ppr m <> text "_CAFs_cc" +ppCostCentreLbl (NormalCC {cc_name = n, cc_mod = m, cc_is_caf = is_caf}) + = ppr_mod m <> ftext (zEncodeFS n) <> + text (case is_caf of { CafCC -> "_CAF"; _ -> "" }) <> text "_cc" + +-- This is the name to go in the user-displayed string, +-- recorded in the cost centre declaration +costCentreUserName (NoCostCentre) = "NO_CC" +costCentreUserName (AllCafsCC {}) = "CAF" +costCentreUserName cc@(NormalCC {cc_name = name, cc_is_caf = is_caf}) + = case is_caf of { CafCC -> "CAF:"; _ -> "" } ++ unpackFS name +\end{code} diff --git a/compiler/profiling/NOTES b/compiler/profiling/NOTES new file mode 100644 index 0000000000..c50cf562e3 --- /dev/null +++ b/compiler/profiling/NOTES @@ -0,0 +1,301 @@ +Profiling Implementation Notes -- June/July/Sept 1994 +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Simon and Will + +Pre-code-generator-ish +~~~~~~~~~~~~~~~~~~~~~~ + +* Automagic insertion of _sccs_ on... + + - If -auto is specified, add _scc_ on each *exported* top-level definition. + NB this includes CAFs. Done by addAutoCostCentres (Core-to-Core pass). + + - If -auto-all is specified, add _scc_ on *all* top-level definitions. + Done by same pass. + + - Always: just before code generation of module M, onto any CAF + which hasn't already got an explicit cost centre attached, pin + "AllCAFs-M". + + Done by finalStgMassageForProfiling (final STG-to-STG pass) + + Only the one-off costs of evaluating the CAFs will be attributed + to the AllCAFs-M cost centre. We hope that these costs will be + small; since the _scc_s are introduced automatically it's + confusing to attribute any significant costs to them. However if + there *are* significant one-off costs we'd better know about it. + + Why so late in the compilation process? We aren't *absolutely* + sure what is and isn't a CAF until *just* before code generation. + So we don't want to mark them as such until then. + + - Individual DICTs + + We do it in the desugarer, because that's the *only* point at + which we *know* exactly what bindings are introduced by + overloading. NB should include bindings for selected methods, eg + + f d = let op = _scc_ DICT op_sel d in + ...op...op...op + + The DICT CC ensures that: + (a) [minor] that the selection cost is separately attributed + (b) [major] that the cost of executing op is attributed to + its call site, eg + + ...(scc "a" op)...(scc "b" op)...(scc "c" op)... + +* Automagic "boxing" of higher-order args: + + finalStgMassageForProfiling (final STG-to-STG pass) + + This (as well as CAF stuff above) is really quite separate + from the other business of finalStgMassageForProfiling + (collecting up CostCentres that need to be + declared/registered). + + But throwing it all into the pot together means that we don't + have to have Yet Another STG Syntax Walker. + + Furthermore, these "boxes" are really just let-bindings that + many other parts of the compiler will happily substitute away! + Doing them at the very last instant prevents this. + + A down side of doing these so late is that we get lots of + "let"s, which if generated earlier and not substituted away, + could be floated outwards. 
Having them floated outwards would + lessen the chance of skewing profiling results (because of + gratuitous "let"s added by the compiler into the inner loop of + some program...). The allocation itself will be attributed to + profiling overhead; the only thing which'll be skewed is time measurement. + + So if we have, post-boxing-higher-order-args... + + _scc_ "foo" ( let f' = [f] \ [] f + in + map f' xs ) + + ... we want "foo" to be put in the thunk for "f'", but we want the + allocation cost (heap census stuff) to be attr to OVERHEAD. + + As an example of what could be improved + f = _scc_ "f" (g h) + To save dynamic allocation, we could have a static closure for h: + h_inf = _scc_ "f" h + f = _scc_ "f" (g h_inf) + + + + + +Code generator-ish +~~~~~~~~~~~~~~~~~~ + +(1) _Entry_ code for a closure *usually* sets CC from the closure, + at the fast entry point + + Exceptions: + + (a) Top-level subsumed functions (i.e., w/ no _scc_ on them) + + Refrain from setting CC from the closure + + (b) Constructors + + Again, refrain. (This is *new*) + + Reasons: (i) The CC will be zapped very shortly by the restore + of the enclosing CC when we return to the eval'ing "case". + (ii) Any intervening updates will indirect to this existing + constructor (...mumble... new update mechanism... mumble...) + +(2) "_scc_ cc expr" + + Set current CC to "cc". + No later "restore" of the previous CC is reqd. + +(3) "case e of { ...alts... }" expression (eval) + + Save CC before eval'ing scrutinee + Restore CC at the start of the case-alternative(s) + +(4) _Updates_ : updatee gets current CC + + (???? not sure this is OK yet 94/07/04) + + Reasons: + + * Constructors : want to be insensitive to return-in-heap vs + return-in-regs. For example, + + f x = _scc_ "f" (x, x) + + The pair (x,x) would get CC of "f" if returned-in-heap; + therefore, updatees should get CC of "f". + + * PAPs : Example: + + f x = _scc_ "f" (let g = \ y -> ... in g) + + At the moment of update (updatePAP?), CC is "f", which + is what we want to set it to if the "updatee" is entered + + When we enter the PAP ("please put the arguments back so I can + use them"), we restore the setup as at the moment the + arg-satisfaction check failed. + + Be careful! UPDATE_PAP is called from the arg-satis check, + which is before the fast entry point. So the cost centre + won't yet have been set from the closure which has just + been entered. Solution: in UPDATE_PAP see if the cost centre inside + the function closure which is being entered is "SUB"; if so, use + the current cost centre to update the updatee; otherwise use that + inside the function closure. (See the computation of cc_pap + in rule 16_l for lexical semantics.) + + +(5) CAFs + +CAFs get their own cost centre. Ie + + x = e +is transformed to + x = _scc_ "CAF:x" e + +Or sometimes we lump all the CAFs in a module together. +(Reporting issue or code-gen issue?) + + + +Hybrid stuff +~~~~~~~~~~~~ + +The problem: + + f = _scc_ "CAF:f" (let g = \xy -> ... + in (g,g)) + +Now, g has cost-centre "CAF:f", and is returned as part of +the result. So whenever the function embedded in the result +is called, the costs will accumulate to "CAF:f". This is +particularly (de)pressing for dictionaries, which contain lots +of functions. + +Solution: + + A. Whenever in case (1) above we would otherwise "set the CC from the + closure", we *refrain* from doing so if + (a) the closure is a function, not a thunk; and + (b) the cost-centre in the closure is a CAF cost centre. + + B. 
Whenever we enter a thunk [at least, one which might return a function] + we save the current cost centre in the update frame. Then, UPDATE_PAP + restores the saved cost centre from the update frame iff the cost + centre at the point of update (cc_pap in (4) above) is a CAF cost centre. + + It isn't necessary to save and possibly-restore the cost centre for + thunks which will certainly return a constructor, because the + cost centre is about to be restored anyway by the enclosing case. + +Both A and B are runtime tests. For A, consider: + + f = _scc_ "CAF:f" (g 2) + + h y = _scc_ "h" g (y+y) + + g x = let w = \p -> ... + in (w,w) + + +Now, in the call to g from h, the cost-centre on w will be "h", and +indeed all calls to the result of the call should be attributed to +"h". + + ... _scc_ "x1" (let (t,_) = h 2 in t 3) ... + + Costs of executing (w 3) attributed to "h". + +But in the call to g from f, the cost-centre on w will be +"CAF:f", and calls to w should be attributed to the call site. + + ..._scc_ "x2" (let (t,_) = f in t 3)... + + Costs of executing (w 3) attributed to "x2". + + + Remaining problem + +Consider + + _scc_ "CAF:f" (if expensive then g 2 else g 3) + +where g is a function with arity 2. In theory we should +restore the enclosing cost centre once we've reduced to +(g 2) or (g 3). In practice this is pretty tiresome; and pretty rare. + +A quick fix: given (_scc_ "CAF" e) where e might be function-valued +(in practice we usually know, because CAF sccs are top level), transform to + + _scc_ "CAF" (let f = e in f) + + + + + +============ + +scc cc x ===> x + + UNLESS + +(a) cc is a user-defined, non-dup'd cost + centre (so we care about entry counts) + +OR + +(b) cc is not a CAF/DICT cost centre and x is top-level subsumed + function. + [If x is lambda/let bound it'll have a cost centre + attached dynamically.] + + To repeat, the transformation is OK if + x is a not top-level subsumed function + OR + cc is a CAF/DICT cost centre and x is a top-level + subsumed function + + + +(scc cc e) x ===> (scc cc e x) + + OK????? IFF + +cc is not CAF/DICT --- remains to be proved!!!!!! +True for lex +False for eval +Can we tell which in hybrid? + +eg Is this ok? + + (scc "f" (scc "CAF" (\x.b))) y ==> (scc "f" (scc "CAF" (\x.b) y)) + + +\x -> (scc cc e) ===> (scc cc \x->e) + + OK IFF cc is not CAF/DICT + + +scc cc1 (scc cc2 e)) ===> scc cc2 e + + IFF not interested in cc1's entry count + AND cc2 is not CAF/DICT + +(scc cc1 ... (scc cc2 e) ...) ===> (scc cc1 ... e ...) + + IFF cc2 is CAF/DICT + AND e is a lambda not appearing as the RHS of a let + OR + e is a variable not bound to SUB + + diff --git a/compiler/profiling/SCCfinal.lhs b/compiler/profiling/SCCfinal.lhs new file mode 100644 index 0000000000..c95db9c358 --- /dev/null +++ b/compiler/profiling/SCCfinal.lhs @@ -0,0 +1,411 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[SCCfinal]{Modify and collect code generation for final STG program} + +This is now a sort-of-normal STG-to-STG pass (WDP 94/06), run by stg2stg. + +* Traverses the STG program collecting the cost centres. These are + required to declare the cost centres at the start of code + generation. + + Note: because of cross-module unfolding, some of these cost centres + may be from other modules. But will still have to give them + "extern" declarations. + +* Puts on CAF cost-centres if the user has asked for individual CAF + cost-centres. + +* Ditto for individual DICT cost-centres. + +* Boxes top-level inherited functions passed as arguments. 
+ +* "Distributes" given cost-centres to all as-yet-unmarked RHSs. + +\begin{code} +module SCCfinal ( stgMassageForProfiling ) where + +#include "HsVersions.h" + +import StgSyn + +import Packages ( HomeModules ) +import StaticFlags ( opt_AutoSccsOnIndividualCafs ) +import CostCentre -- lots of things +import Id ( Id ) +import Module ( Module ) +import UniqSupply ( uniqFromSupply, splitUniqSupply, UniqSupply ) +import Unique ( Unique ) +import VarSet +import ListSetOps ( removeDups ) +import Outputable + +infixr 9 `thenMM`, `thenMM_` +\end{code} + +\begin{code} +stgMassageForProfiling + :: HomeModules + -> Module -- module name + -> UniqSupply -- unique supply + -> [StgBinding] -- input + -> (CollectedCCs, [StgBinding]) + +stgMassageForProfiling pdeps mod_name us stg_binds + = let + ((local_ccs, extern_ccs, cc_stacks), + stg_binds2) + = initMM mod_name us (do_top_bindings stg_binds) + + (fixed_ccs, fixed_cc_stacks) + = if opt_AutoSccsOnIndividualCafs + then ([],[]) -- don't need "all CAFs" CC + -- (for Prelude, we use PreludeCC) + else ([all_cafs_cc], [all_cafs_ccs]) + + local_ccs_no_dups = fst (removeDups cmpCostCentre local_ccs) + extern_ccs_no_dups = fst (removeDups cmpCostCentre extern_ccs) + in + ((fixed_ccs ++ local_ccs_no_dups, + extern_ccs_no_dups, + fixed_cc_stacks ++ cc_stacks), stg_binds2) + where + + all_cafs_cc = mkAllCafsCC mod_name + all_cafs_ccs = mkSingletonCCS all_cafs_cc + + ---------- + do_top_bindings :: [StgBinding] -> MassageM [StgBinding] + + do_top_bindings [] = returnMM [] + + do_top_bindings (StgNonRec b rhs : bs) + = do_top_rhs b rhs `thenMM` \ rhs' -> + addTopLevelIshId b ( + do_top_bindings bs `thenMM` \bs' -> + returnMM (StgNonRec b rhs' : bs') + ) + + do_top_bindings (StgRec pairs : bs) + = addTopLevelIshIds binders ( + mapMM do_pair pairs `thenMM` \ pairs2 -> + do_top_bindings bs `thenMM` \ bs' -> + returnMM (StgRec pairs2 : bs') + ) + where + binders = map fst pairs + do_pair (b, rhs) + = do_top_rhs b rhs `thenMM` \ rhs2 -> + returnMM (b, rhs2) + + ---------- + do_top_rhs :: Id -> StgRhs -> MassageM StgRhs + + do_top_rhs binder (StgRhsClosure _ bi fv u srt [] (StgSCC cc (StgConApp con args))) + | not (isSccCountCostCentre cc) && not (isDllConApp pdeps con args) + -- Trivial _scc_ around nothing but static data + -- Eliminate _scc_ ... 
and turn into StgRhsCon + + -- isDllConApp checks for LitLit args too + = returnMM (StgRhsCon dontCareCCS con args) + +{- Can't do this one with cost-centre stacks: --SDM + do_top_rhs binder (StgRhsClosure no_cc bi fv u [] (StgSCC ty cc expr)) + | (noCCSAttached no_cc || currentOrSubsumedCCS no_cc) + && not (isSccCountCostCentre cc) + -- Top level CAF without a cost centre attached + -- Attach and collect cc of trivial _scc_ in body + = collectCC cc `thenMM_` + set_prevailing_cc cc (do_expr expr) `thenMM` \ expr' -> + returnMM (StgRhsClosure cc bi fv u [] expr') +-} + + do_top_rhs binder (StgRhsClosure no_cc bi fv u srt [] body) + | noCCSAttached no_cc || currentOrSubsumedCCS no_cc + -- Top level CAF without a cost centre attached + -- Attach CAF cc (collect if individual CAF ccs) + = (if opt_AutoSccsOnIndividualCafs + then let cc = mkAutoCC binder mod_name CafCC + ccs = mkSingletonCCS cc + in + collectCC cc `thenMM_` + collectCCS ccs `thenMM_` + returnMM ccs + else + returnMM all_cafs_ccs) `thenMM` \ caf_ccs -> + set_prevailing_cc caf_ccs (do_expr body) `thenMM` \ body' -> + returnMM (StgRhsClosure caf_ccs bi fv u srt [] body') + + do_top_rhs binder (StgRhsClosure cc bi fv u srt [] body) + -- Top level CAF with cost centre attached + -- Should this be a CAF cc ??? Does this ever occur ??? + = pprPanic "SCCfinal: CAF with cc:" (ppr cc) + + do_top_rhs binder (StgRhsClosure no_ccs bi fv u srt args body) + -- Top level function, probably subsumed + | noCCSAttached no_ccs + = set_lambda_cc (do_expr body) `thenMM` \ body' -> + returnMM (StgRhsClosure subsumedCCS bi fv u srt args body') + + | otherwise + = pprPanic "SCCfinal: CAF with cc:" (ppr no_ccs) + + do_top_rhs binder (StgRhsCon ccs con args) + -- Top-level (static) data is not counted in heap + -- profiles; nor do we set CCCS from it; so we + -- just slam in dontCareCostCentre + = returnMM (StgRhsCon dontCareCCS con args) + + ------ + do_expr :: StgExpr -> MassageM StgExpr + + do_expr (StgLit l) = returnMM (StgLit l) + + do_expr (StgApp fn args) + = boxHigherOrderArgs (StgApp fn) args + + do_expr (StgConApp con args) + = boxHigherOrderArgs (\args -> StgConApp con args) args + + do_expr (StgOpApp con args res_ty) + = boxHigherOrderArgs (\args -> StgOpApp con args res_ty) args + + do_expr (StgSCC cc expr) -- Ha, we found a cost centre! 
+ = collectCC cc `thenMM_` + do_expr expr `thenMM` \ expr' -> + returnMM (StgSCC cc expr') + + do_expr (StgCase expr fv1 fv2 bndr srt alt_type alts) + = do_expr expr `thenMM` \ expr' -> + mapMM do_alt alts `thenMM` \ alts' -> + returnMM (StgCase expr' fv1 fv2 bndr srt alt_type alts') + where + do_alt (id, bs, use_mask, e) + = do_expr e `thenMM` \ e' -> + returnMM (id, bs, use_mask, e') + + do_expr (StgLet b e) + = do_let b e `thenMM` \ (b,e) -> + returnMM (StgLet b e) + + do_expr (StgLetNoEscape lvs1 lvs2 b e) + = do_let b e `thenMM` \ (b,e) -> + returnMM (StgLetNoEscape lvs1 lvs2 b e) + +#ifdef DEBUG + do_expr other = pprPanic "SCCfinal.do_expr" (ppr other) +#endif + + ---------------------------------- + + do_let (StgNonRec b rhs) e + = do_rhs rhs `thenMM` \ rhs' -> + addTopLevelIshId b ( + do_expr e `thenMM` \ e' -> + returnMM (StgNonRec b rhs',e') + ) + + do_let (StgRec pairs) e + = addTopLevelIshIds binders ( + mapMM do_pair pairs `thenMM` \ pairs' -> + do_expr e `thenMM` \ e' -> + returnMM (StgRec pairs', e') + ) + where + binders = map fst pairs + do_pair (b, rhs) + = do_rhs rhs `thenMM` \ rhs2 -> + returnMM (b, rhs2) + + ---------------------------------- + do_rhs :: StgRhs -> MassageM StgRhs + -- We play much the same game as we did in do_top_rhs above; + -- but we don't have to worry about cafs etc. + +{- + do_rhs (StgRhsClosure closure_cc bi fv u [] (StgSCC ty cc (StgCon (DataCon con) args _))) + | not (isSccCountCostCentre cc) + = collectCC cc `thenMM_` + returnMM (StgRhsCon cc con args) +-} + + do_rhs (StgRhsClosure _ bi fv u srt args expr) + = slurpSCCs currentCCS expr `thenMM` \ (expr', ccs) -> + do_expr expr' `thenMM` \ expr'' -> + returnMM (StgRhsClosure ccs bi fv u srt args expr'') + where + slurpSCCs ccs (StgSCC cc e) + = collectCC cc `thenMM_` + slurpSCCs (cc `pushCCOnCCS` ccs) e + slurpSCCs ccs e + = returnMM (e, ccs) + + do_rhs (StgRhsCon cc con args) + = returnMM (StgRhsCon currentCCS con args) +\end{code} + +%************************************************************************ +%* * +\subsection{Boxing higher-order args} +%* * +%************************************************************************ + +Boxing is *turned off* at the moment, until we can figure out how to +do it properly in general. 
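+
+For illustration only (a sketch of what the transformation is intended to
+do, not code from this module): with boxing enabled, an application that
+passes a top-levelish function as an argument would be rewritten roughly as
+
+\begin{verbatim}
+	_scc_ "foo" (map f xs)   ===>   _scc_ "foo" (let sf = f in map sf xs)
+\end{verbatim}
+
+so that the argument becomes a locally let-bound closure (a "box") carrying
+the current cost-centre stack, rather than the top-level (subsumed) f.
+See the notes on boxing higher-order args in profiling/NOTES.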
+ +\begin{code} +boxHigherOrderArgs + :: ([StgArg] -> StgExpr) + -- An application lacking its arguments + -> [StgArg] -- arguments which we might box + -> MassageM StgExpr + +#ifndef PROF_DO_BOXING +boxHigherOrderArgs almost_expr args + = returnMM (almost_expr args) +#else +boxHigherOrderArgs almost_expr args + = getTopLevelIshIds `thenMM` \ ids -> + mapAccumMM (do_arg ids) [] args `thenMM` \ (let_bindings, new_args) -> + returnMM (foldr (mk_stg_let currentCCS) (almost_expr new_args) let_bindings) + where + --------------- + + do_arg ids bindings arg@(StgVarArg old_var) + | (not (isLocalVar old_var) || elemVarSet old_var ids) + && isFunTy (dropForAlls var_type) + = -- make a trivial let-binding for the top-level function + getUniqueMM `thenMM` \ uniq -> + let + new_var = mkSysLocal FSLIT("sf") uniq var_type + in + returnMM ( (new_var, old_var) : bindings, StgVarArg new_var ) + where + var_type = idType old_var + + do_arg ids bindings arg = returnMM (bindings, arg) + + --------------- + mk_stg_let :: CostCentreStack -> (Id, Id) -> StgExpr -> StgExpr + + mk_stg_let cc (new_var, old_var) body + = let + rhs_body = StgApp old_var [{-args-}] + rhs_closure = StgRhsClosure cc stgArgOcc [{-fvs-}] ReEntrant NoSRT{-eeek!!!-} [{-args-}] rhs_body + in + StgLet (StgNonRec new_var rhs_closure) body + where + bOGUS_LVs = emptyUniqSet -- easier to print than: panic "mk_stg_let: LVs" +#endif +\end{code} + +%************************************************************************ +%* * +\subsection{Boring monad stuff for this} +%* * +%************************************************************************ + +\begin{code} +type MassageM result + = Module -- module name + -> CostCentreStack -- prevailing CostCentre + -- if none, subsumedCosts at top-level + -- currentCostCentre at nested levels + -> UniqSupply + -> VarSet -- toplevel-ish Ids for boxing + -> CollectedCCs + -> (CollectedCCs, result) + +-- the initMM function also returns the final CollectedCCs + +initMM :: Module -- module name, which we may consult + -> UniqSupply + -> MassageM a + -> (CollectedCCs, a) + +initMM mod_name init_us m = m mod_name noCCS init_us emptyVarSet ([],[],[]) + +thenMM :: MassageM a -> (a -> MassageM b) -> MassageM b +thenMM_ :: MassageM a -> (MassageM b) -> MassageM b + +thenMM expr cont mod scope_cc us ids ccs + = case splitUniqSupply us of { (s1, s2) -> + case (expr mod scope_cc s1 ids ccs) of { (ccs2, result) -> + cont result mod scope_cc s2 ids ccs2 }} + +thenMM_ expr cont mod scope_cc us ids ccs + = case splitUniqSupply us of { (s1, s2) -> + case (expr mod scope_cc s1 ids ccs) of { (ccs2, _) -> + cont mod scope_cc s2 ids ccs2 }} + +returnMM :: a -> MassageM a +returnMM result mod scope_cc us ids ccs = (ccs, result) + +nopMM :: MassageM () +nopMM mod scope_cc us ids ccs = (ccs, ()) + +mapMM :: (a -> MassageM b) -> [a] -> MassageM [b] +mapMM f [] = returnMM [] +mapMM f (m:ms) + = f m `thenMM` \ r -> + mapMM f ms `thenMM` \ rs -> + returnMM (r:rs) + +mapAccumMM :: (acc -> x -> MassageM (acc, y)) -> acc -> [x] -> MassageM (acc, [y]) +mapAccumMM f b [] = returnMM (b, []) +mapAccumMM f b (m:ms) + = f b m `thenMM` \ (b2, r) -> + mapAccumMM f b2 ms `thenMM` \ (b3, rs) -> + returnMM (b3, r:rs) + +getUniqueMM :: MassageM Unique +getUniqueMM mod scope_cc us ids ccs = (ccs, uniqFromSupply us) + +addTopLevelIshId :: Id -> MassageM a -> MassageM a +addTopLevelIshId id scope mod scope_cc us ids ccs + | isCurrentCCS scope_cc = scope mod scope_cc us ids ccs + | otherwise = scope mod scope_cc us (extendVarSet ids id) ccs + 
+addTopLevelIshIds :: [Id] -> MassageM a -> MassageM a +addTopLevelIshIds [] cont = cont +addTopLevelIshIds (id:ids) cont + = addTopLevelIshId id (addTopLevelIshIds ids cont) + +getTopLevelIshIds :: MassageM VarSet +getTopLevelIshIds mod scope_cc us ids ccs = (ccs, ids) +\end{code} + +The prevailing CCS is used to tell whether we're in a top-levelish +position, where top-levelish is defined as "not inside a lambda". +Prevailing CCs used to be used for something much more complicated, +I'm sure --SDM + +\begin{code} +set_lambda_cc :: MassageM a -> MassageM a +set_lambda_cc action mod scope_cc us ids ccs + = action mod currentCCS us ids ccs + +set_prevailing_cc :: CostCentreStack -> MassageM a -> MassageM a +set_prevailing_cc cc_to_set_to action mod scope_cc us ids ccs + = action mod cc_to_set_to us ids ccs + +get_prevailing_cc :: MassageM CostCentreStack +get_prevailing_cc mod scope_cc us ids ccs = (ccs, scope_cc) +\end{code} + +\begin{code} +collectCC :: CostCentre -> MassageM () + +collectCC cc mod_name scope_cc us ids (local_ccs, extern_ccs, ccss) + = ASSERT(not (noCCAttached cc)) + if (cc `ccFromThisModule` mod_name) then + ((cc : local_ccs, extern_ccs, ccss), ()) + else -- must declare it "extern" + ((local_ccs, cc : extern_ccs, ccss), ()) + +collectCCS :: CostCentreStack -> MassageM () + +collectCCS ccs mod_name scope_cc us ids (local_ccs, extern_ccs, ccss) + = ASSERT(not (noCCSAttached ccs)) + ((local_ccs, extern_ccs, ccs : ccss), ()) +\end{code} diff --git a/compiler/rename/RnBinds.lhs b/compiler/rename/RnBinds.lhs new file mode 100644 index 0000000000..13035e72e2 --- /dev/null +++ b/compiler/rename/RnBinds.lhs @@ -0,0 +1,660 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[RnBinds]{Renaming and dependency analysis of bindings} + +This module does renaming and dependency analysis on value bindings in +the abstract syntax. It does {\em not} do cycle-checks on class or +type-synonym declarations; those cannot be done at this stage because +they may be affected by renaming (which isn't fully worked out yet). + +\begin{code} +module RnBinds ( + rnTopBinds, + rnLocalBindsAndThen, rnValBindsAndThen, rnValBinds, trimWith, + rnMethodBinds, renameSigs, + rnMatchGroup, rnGRHSs + ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} RnExpr( rnLExpr, rnStmts ) + +import HsSyn +import RdrHsSyn +import RnHsSyn +import TcRnMonad +import RnTypes ( rnHsSigType, rnLHsType, rnHsTypeFVs, + rnLPat, rnPatsAndThen, patSigErr, checkPrecMatch ) +import RnEnv ( bindLocatedLocalsRn, lookupLocatedBndrRn, + lookupLocatedInstDeclBndr, newIPNameRn, + lookupLocatedSigOccRn, bindPatSigTyVars, bindPatSigTyVarsFV, + bindLocalFixities, bindSigTyVarsFV, + warnUnusedLocalBinds, mapFvRn, extendTyVarEnvFVRn, + ) +import DynFlags ( DynFlag(..) ) +import Name ( Name, nameOccName, nameSrcLoc ) +import NameEnv +import NameSet +import PrelNames ( isUnboundName ) +import RdrName ( RdrName, rdrNameOcc ) +import SrcLoc ( mkSrcSpan, Located(..), unLoc ) +import ListSetOps ( findDupsEq ) +import BasicTypes ( RecFlag(..) ) +import Digraph ( SCC(..), stronglyConnComp ) +import Bag +import Outputable +import Maybes ( orElse, isJust ) +import Util ( filterOut ) +import Monad ( foldM ) +\end{code} + +-- ToDo: Put the annotations into the monad, so that they arrive in the proper +-- place and can be used when complaining. 
+ +The code tree received by the function @rnBinds@ contains definitions +in where-clauses which are all apparently mutually recursive, but which may +not really depend upon each other. For example, in the top level program +\begin{verbatim} +f x = y where a = x + y = x +\end{verbatim} +the definitions of @a@ and @y@ do not depend on each other at all. +Unfortunately, the typechecker cannot always check such definitions. +\footnote{Mycroft, A. 1984. Polymorphic type schemes and recursive +definitions. In Proceedings of the International Symposium on Programming, +Toulouse, pp. 217-39. LNCS 167. Springer Verlag.} +However, the typechecker usually can check definitions in which only the +strongly connected components have been collected into recursive bindings. +This is precisely what the function @rnBinds@ does. + +ToDo: deal with case where a single monobinds binds the same variable +twice. + +The vertag tag is a unique @Int@; the tags only need to be unique +within one @MonoBinds@, so that unique-Int plumbing is done explicitly +(heavy monad machinery not needed). + + +%************************************************************************ +%* * +%* naming conventions * +%* * +%************************************************************************ + +\subsection[name-conventions]{Name conventions} + +The basic algorithm involves walking over the tree and returning a tuple +containing the new tree plus its free variables. Some functions, such +as those walking polymorphic bindings (HsBinds) and qualifier lists in +list comprehensions (@Quals@), return the variables bound in local +environments. These are then used to calculate the free variables of the +expression evaluated in these environments. + +Conventions for variable names are as follows: +\begin{itemize} +\item +new code is given a prime to distinguish it from the old. + +\item +a set of variables defined in @Exp@ is written @dvExp@ + +\item +a set of variables free in @Exp@ is written @fvExp@ +\end{itemize} + +%************************************************************************ +%* * +%* analysing polymorphic bindings (HsBindGroup, HsBind) +%* * +%************************************************************************ + +\subsubsection[dep-HsBinds]{Polymorphic bindings} + +Non-recursive expressions are reconstructed without any changes at top +level, although their component expressions may have to be altered. +However, non-recursive expressions are currently not expected as +\Haskell{} programs, and this code should not be executed. + +Monomorphic bindings contain information that is returned in a tuple +(a @FlatMonoBinds@) containing: + +\begin{enumerate} +\item +a unique @Int@ that serves as the ``vertex tag'' for this binding. + +\item +the name of a function or the names in a pattern. These are a set +referred to as @dvLhs@, the defined variables of the left hand side. + +\item +the free variables of the body. These are referred to as @fvBody@. + +\item +the definition's actual code. This is referred to as just @code@. +\end{enumerate} + +The function @nonRecDvFv@ returns two sets of variables. The first is +the set of variables defined in the set of monomorphic bindings, while the +second is the set of free variables in those bindings. + +The set of variables defined in a non-recursive binding is just the +union of all of them, as @union@ removes duplicates. 
However, the +free variables in each successive set of cumulative bindings is the +union of those in the previous set plus those of the newest binding after +the defined variables of the previous set have been removed. + +@rnMethodBinds@ deals only with the declarations in class and +instance declarations. It expects only to see @FunMonoBind@s, and +it expects the global environment to contain bindings for the binders +(which are all class operations). + +%************************************************************************ +%* * +\subsubsection{ Top-level bindings} +%* * +%************************************************************************ + +@rnTopMonoBinds@ assumes that the environment already +contains bindings for the binders of this particular binding. + +\begin{code} +rnTopBinds :: HsValBinds RdrName -> RnM (HsValBinds Name, DefUses) + +-- The binders of the binding are in scope already; +-- the top level scope resolution does that + +rnTopBinds binds + = do { is_boot <- tcIsHsBoot + ; if is_boot then rnTopBindsBoot binds + else rnTopBindsSrc binds } + +rnTopBindsBoot :: HsValBinds RdrName -> RnM (HsValBinds Name, DefUses) +-- A hs-boot file has no bindings. +-- Return a single HsBindGroup with empty binds and renamed signatures +rnTopBindsBoot (ValBindsIn mbinds sigs) + = do { checkErr (isEmptyLHsBinds mbinds) (bindsInHsBootFile mbinds) + ; sigs' <- renameSigs okHsBootSig sigs + ; return (ValBindsOut [] sigs', usesOnly (hsSigsFVs sigs')) } + +rnTopBindsSrc :: HsValBinds RdrName -> RnM (HsValBinds Name, DefUses) +rnTopBindsSrc binds@(ValBindsIn mbinds _) + = do { (binds', dus) <- rnValBinds noTrim binds + + -- Warn about missing signatures, + ; let { ValBindsOut _ sigs' = binds' + ; ty_sig_vars = mkNameSet [ unLoc n | L _ (TypeSig n _) <- sigs'] + ; un_sigd_bndrs = duDefs dus `minusNameSet` ty_sig_vars } + + ; warn_missing_sigs <- doptM Opt_WarnMissingSigs + ; ifM (warn_missing_sigs) + (mappM_ missingSigWarn (nameSetToList un_sigd_bndrs)) + + ; return (binds', dus) + } +\end{code} + + + +%********************************************************* +%* * + HsLocalBinds +%* * +%********************************************************* + +\begin{code} +rnLocalBindsAndThen + :: HsLocalBinds RdrName + -> (HsLocalBinds Name -> RnM (result, FreeVars)) + -> RnM (result, FreeVars) +-- This version (a) assumes that the binding vars are not already in scope +-- (b) removes the binders from the free vars of the thing inside +-- The parser doesn't produce ThenBinds +rnLocalBindsAndThen EmptyLocalBinds thing_inside + = thing_inside EmptyLocalBinds + +rnLocalBindsAndThen (HsValBinds val_binds) thing_inside + = rnValBindsAndThen val_binds $ \ val_binds' -> + thing_inside (HsValBinds val_binds') + +rnLocalBindsAndThen (HsIPBinds binds) thing_inside + = rnIPBinds binds `thenM` \ (binds',fv_binds) -> + thing_inside (HsIPBinds binds') `thenM` \ (thing, fvs_thing) -> + returnM (thing, fvs_thing `plusFV` fv_binds) + +------------- +rnIPBinds (IPBinds ip_binds _no_dict_binds) + = do { (ip_binds', fvs_s) <- mapAndUnzipM (wrapLocFstM rnIPBind) ip_binds + ; return (IPBinds ip_binds' emptyLHsBinds, plusFVs fvs_s) } + +rnIPBind (IPBind n expr) + = newIPNameRn n `thenM` \ name -> + rnLExpr expr `thenM` \ (expr',fvExpr) -> + return (IPBind name expr', fvExpr) +\end{code} + + +%************************************************************************ +%* * + ValBinds +%* * +%************************************************************************ + +\begin{code} +rnValBindsAndThen :: HsValBinds RdrName + -> 
(HsValBinds Name -> RnM (result, FreeVars)) + -> RnM (result, FreeVars) + +rnValBindsAndThen binds@(ValBindsIn mbinds sigs) thing_inside + = -- Extract all the binders in this group, and extend the + -- current scope, inventing new names for the new binders + -- This also checks that the names form a set + bindLocatedLocalsRn doc mbinders_w_srclocs $ \ bndrs -> + + -- Then install local fixity declarations + -- Notice that they scope over thing_inside too + bindLocalFixities [sig | L _ (FixSig sig) <- sigs ] $ + + -- Do the business + rnValBinds (trimWith bndrs) binds `thenM` \ (binds, bind_dus) -> + + -- Now do the "thing inside" + thing_inside binds `thenM` \ (result,result_fvs) -> + + -- Final error checking + let + all_uses = duUses bind_dus `plusFV` result_fvs + -- duUses: It's important to return all the uses, not the 'real uses' + -- used for warning about unused bindings. Otherwise consider: + -- x = 3 + -- y = let p = x in 'x' -- NB: p not used + -- If we don't "see" the dependency of 'y' on 'x', we may put the + -- bindings in the wrong order, and the type checker will complain + -- that x isn't in scope + + unused_bndrs = [ b | b <- bndrs, not (b `elemNameSet` all_uses)] + in + warnUnusedLocalBinds unused_bndrs `thenM_` + + returnM (result, delListFromNameSet all_uses bndrs) + where + mbinders_w_srclocs = collectHsBindLocatedBinders mbinds + doc = text "In the binding group for:" + <+> pprWithCommas ppr (map unLoc mbinders_w_srclocs) + +--------------------- +rnValBinds :: (FreeVars -> FreeVars) + -> HsValBinds RdrName + -> RnM (HsValBinds Name, DefUses) +-- Assumes the binders of the binding are in scope already + +rnValBinds trim (ValBindsIn mbinds sigs) + = do { sigs' <- rename_sigs sigs + + ; binds_w_dus <- mapBagM (rnBind (mkSigTvFn sigs') trim) mbinds + + ; let (binds', bind_dus) = depAnalBinds binds_w_dus + + -- We do the check-sigs after renaming the bindings, + -- so that we have convenient access to the binders + ; check_sigs (okBindSig (duDefs bind_dus)) sigs' + + ; return (ValBindsOut binds' sigs', + usesOnly (hsSigsFVs sigs') `plusDU` bind_dus) } + + +--------------------- +depAnalBinds :: Bag (LHsBind Name, [Name], Uses) + -> ([(RecFlag, LHsBinds Name)], DefUses) +-- Dependency analysis; this is important so that +-- unused-binding reporting is accurate +depAnalBinds binds_w_dus + = (map get_binds sccs, map get_du sccs) + where + sccs = stronglyConnComp edges + + keyd_nodes = bagToList binds_w_dus `zip` [0::Int ..] + + edges = [ (node, key, [key | n <- nameSetToList uses, + Just key <- [lookupNameEnv key_map n] ]) + | (node@(_,_,uses), key) <- keyd_nodes ] + + key_map :: NameEnv Int -- Which binding it comes from + key_map = mkNameEnv [(bndr, key) | ((_, bndrs, _), key) <- keyd_nodes + , bndr <- bndrs ] + + get_binds (AcyclicSCC (bind, _, _)) = (NonRecursive, unitBag bind) + get_binds (CyclicSCC binds_w_dus) = (Recursive, listToBag [b | (b,d,u) <- binds_w_dus]) + + get_du (AcyclicSCC (_, bndrs, uses)) = (Just (mkNameSet bndrs), uses) + get_du (CyclicSCC binds_w_dus) = (Just defs, uses) + where + defs = mkNameSet [b | (_,bs,_) <- binds_w_dus, b <- bs] + uses = unionManyNameSets [u | (_,_,u) <- binds_w_dus] + + +--------------------- +-- Bind the top-level forall'd type variables in the sigs. +-- E.g f :: a -> a +-- f = rhs +-- The 'a' scopes over the rhs +-- +-- NB: there'll usually be just one (for a function binding) +-- but if there are many, one may shadow the rest; too bad! 
+-- e.g x :: [a] -> [a] +-- y :: [(a,a)] -> a +-- (x,y) = e +-- In e, 'a' will be in scope, and it'll be the one from 'y'! + +mkSigTvFn :: [LSig Name] -> (Name -> [Name]) +-- Return a lookup function that maps an Id Name to the names +-- of the type variables that should scope over its body.. +mkSigTvFn sigs + = \n -> lookupNameEnv env n `orElse` [] + where + env :: NameEnv [Name] + env = mkNameEnv [ (name, map hsLTyVarName ltvs) + | L _ (TypeSig (L _ name) + (L _ (HsForAllTy Explicit ltvs _ _))) <- sigs] + -- Note the pattern-match on "Explicit"; we only bind + -- type variables from signatures with an explicit top-level for-all + +-- The trimming function trims the free vars we attach to a +-- binding so that it stays reasonably small +noTrim :: FreeVars -> FreeVars +noTrim fvs = fvs -- Used at top level + +trimWith :: [Name] -> FreeVars -> FreeVars +-- Nested bindings; trim by intersection with the names bound here +trimWith bndrs = intersectNameSet (mkNameSet bndrs) + +--------------------- +rnBind :: (Name -> [Name]) -- Signature tyvar function + -> (FreeVars -> FreeVars) -- Trimming function for rhs free vars + -> LHsBind RdrName + -> RnM (LHsBind Name, [Name], Uses) +rnBind sig_fn trim (L loc (PatBind { pat_lhs = pat, pat_rhs = grhss })) + = setSrcSpan loc $ + do { (pat', pat_fvs) <- rnLPat pat + + ; let bndrs = collectPatBinders pat' + + ; (grhss', fvs) <- bindSigTyVarsFV (concatMap sig_fn bndrs) $ + rnGRHSs PatBindRhs grhss + + ; return (L loc (PatBind { pat_lhs = pat', pat_rhs = grhss', + pat_rhs_ty = placeHolderType, bind_fvs = trim fvs }), + bndrs, pat_fvs `plusFV` fvs) } + +rnBind sig_fn trim (L loc (FunBind { fun_id = name, fun_infix = inf, fun_matches = matches })) + = setSrcSpan loc $ + do { new_name <- lookupLocatedBndrRn name + ; let plain_name = unLoc new_name + + ; (matches', fvs) <- bindSigTyVarsFV (sig_fn plain_name) $ + rnMatchGroup (FunRhs plain_name) matches + + ; checkPrecMatch inf plain_name matches' + + ; return (L loc (FunBind { fun_id = new_name, fun_infix = inf, fun_matches = matches', + bind_fvs = trim fvs, fun_co_fn = idCoercion }), + [plain_name], fvs) + } +\end{code} + + +@rnMethodBinds@ is used for the method bindings of a class and an instance +declaration. Like @rnBinds@ but without dependency analysis. + +NOTA BENE: we record each {\em binder} of a method-bind group as a free variable. +That's crucial when dealing with an instance decl: +\begin{verbatim} + instance Foo (T a) where + op x = ... +\end{verbatim} +This might be the {\em sole} occurrence of @op@ for an imported class @Foo@, +and unless @op@ occurs we won't treat the type signature of @op@ in the class +decl for @Foo@ as a source of instance-decl gates. But we should! Indeed, +in many ways the @op@ in an instance decl is just like an occurrence, not +a binder. 
+ +\begin{code} +rnMethodBinds :: Name -- Class name + -> [Name] -- Names for generic type variables + -> LHsBinds RdrName + -> RnM (LHsBinds Name, FreeVars) + +rnMethodBinds cls gen_tyvars binds + = foldM do_one (emptyBag,emptyFVs) (bagToList binds) + where do_one (binds,fvs) bind = do + (bind', fvs_bind) <- rnMethodBind cls gen_tyvars bind + return (bind' `unionBags` binds, fvs_bind `plusFV` fvs) + +rnMethodBind cls gen_tyvars (L loc (FunBind { fun_id = name, fun_infix = inf, + fun_matches = MatchGroup matches _ })) + = setSrcSpan loc $ + lookupLocatedInstDeclBndr cls name `thenM` \ sel_name -> + let plain_name = unLoc sel_name in + -- We use the selector name as the binder + + mapFvRn (rn_match plain_name) matches `thenM` \ (new_matches, fvs) -> + let + new_group = MatchGroup new_matches placeHolderType + in + checkPrecMatch inf plain_name new_group `thenM_` + returnM (unitBag (L loc (FunBind { fun_id = sel_name, fun_infix = inf, fun_matches = new_group, + bind_fvs = fvs, fun_co_fn = idCoercion })), + fvs `addOneFV` plain_name) + -- The 'fvs' field isn't used for method binds + where + -- Truly gruesome; bring into scope the correct members of the generic + -- type variables. See comments in RnSource.rnSourceDecl(ClassDecl) + rn_match sel_name match@(L _ (Match (L _ (TypePat ty) : _) _ _)) + = extendTyVarEnvFVRn gen_tvs $ + rnMatch (FunRhs sel_name) match + where + tvs = map (rdrNameOcc.unLoc) (extractHsTyRdrTyVars ty) + gen_tvs = [tv | tv <- gen_tyvars, nameOccName tv `elem` tvs] + + rn_match sel_name match = rnMatch (FunRhs sel_name) match + + +-- Can't handle method pattern-bindings which bind multiple methods. +rnMethodBind cls gen_tyvars mbind@(L loc (PatBind other_pat _ _ _)) + = addLocErr mbind methodBindErr `thenM_` + returnM (emptyBag, emptyFVs) +\end{code} + + +%************************************************************************ +%* * +\subsubsection[dep-Sigs]{Signatures (and user-pragmas for values)} +%* * +%************************************************************************ + +@renameSigs@ checks for: +\begin{enumerate} +\item more than one sig for one thing; +\item signatures given for things not bound here; +\item with suitably flaggery, that all top-level things have type signatures. +\end{enumerate} +% +At the moment we don't gather free-var info from the types in +signatures. We'd only need this if we wanted to report unused tyvars. + +\begin{code} +renameSigs :: (LSig Name -> Bool) -> [LSig RdrName] -> RnM [LSig Name] +-- Renames the signatures and performs error checks +renameSigs ok_sig sigs + = do { sigs' <- rename_sigs sigs + ; check_sigs ok_sig sigs' + ; return sigs' } + +---------------------- +rename_sigs :: [LSig RdrName] -> RnM [LSig Name] +rename_sigs sigs = mappM (wrapLocM renameSig) + (filter (not . isFixityLSig) sigs) + -- Remove fixity sigs which have been dealt with already + +---------------------- +check_sigs :: (LSig Name -> Bool) -> [LSig Name] -> RnM () +-- Used for class and instance decls, as well as regular bindings +check_sigs ok_sig sigs + -- Check for (a) duplicate signatures + -- (b) signatures for things not in this group + = do { mappM_ unknownSigErr (filter (not . 
ok_sig) sigs') + ; mappM_ dupSigDeclErr (findDupsEq eqHsSig sigs') } + where + -- Don't complain about an unbound name again + sigs' = filterOut bad_name sigs + bad_name sig = case sigName sig of + Just n -> isUnboundName n + other -> False + +-- We use lookupLocatedSigOccRn in the signatures, which is a little bit unsatisfactory +-- because this won't work for: +-- instance Foo T where +-- {-# INLINE op #-} +-- Baz.op = ... +-- We'll just rename the INLINE prag to refer to whatever other 'op' +-- is in scope. (I'm assuming that Baz.op isn't in scope unqualified.) +-- Doesn't seem worth much trouble to sort this. + +renameSig :: Sig RdrName -> RnM (Sig Name) +-- FixitSig is renamed elsewhere. +renameSig (TypeSig v ty) + = lookupLocatedSigOccRn v `thenM` \ new_v -> + rnHsSigType (quotes (ppr v)) ty `thenM` \ new_ty -> + returnM (TypeSig new_v new_ty) + +renameSig (SpecInstSig ty) + = rnLHsType (text "A SPECIALISE instance pragma") ty `thenM` \ new_ty -> + returnM (SpecInstSig new_ty) + +renameSig (SpecSig v ty inl) + = lookupLocatedSigOccRn v `thenM` \ new_v -> + rnHsSigType (quotes (ppr v)) ty `thenM` \ new_ty -> + returnM (SpecSig new_v new_ty inl) + +renameSig (InlineSig v s) + = lookupLocatedSigOccRn v `thenM` \ new_v -> + returnM (InlineSig new_v s) +\end{code} + + +************************************************************************ +* * +\subsection{Match} +* * +************************************************************************ + +\begin{code} +rnMatchGroup :: HsMatchContext Name -> MatchGroup RdrName -> RnM (MatchGroup Name, FreeVars) +rnMatchGroup ctxt (MatchGroup ms _) + = mapFvRn (rnMatch ctxt) ms `thenM` \ (new_ms, ms_fvs) -> + returnM (MatchGroup new_ms placeHolderType, ms_fvs) + +rnMatch :: HsMatchContext Name -> LMatch RdrName -> RnM (LMatch Name, FreeVars) +rnMatch ctxt = wrapLocFstM (rnMatch' ctxt) + +rnMatch' ctxt match@(Match pats maybe_rhs_sig grhss) + = + -- Deal with the rhs type signature + bindPatSigTyVarsFV rhs_sig_tys $ + doptM Opt_GlasgowExts `thenM` \ opt_GlasgowExts -> + (case maybe_rhs_sig of + Nothing -> returnM (Nothing, emptyFVs) + Just ty | opt_GlasgowExts -> rnHsTypeFVs doc_sig ty `thenM` \ (ty', ty_fvs) -> + returnM (Just ty', ty_fvs) + | otherwise -> addLocErr ty patSigErr `thenM_` + returnM (Nothing, emptyFVs) + ) `thenM` \ (maybe_rhs_sig', ty_fvs) -> + + -- Now the main event + rnPatsAndThen ctxt pats $ \ pats' -> + rnGRHSs ctxt grhss `thenM` \ (grhss', grhss_fvs) -> + + returnM (Match pats' maybe_rhs_sig' grhss', grhss_fvs `plusFV` ty_fvs) + -- The bindPatSigTyVarsFV and rnPatsAndThen will remove the bound FVs + where + rhs_sig_tys = case maybe_rhs_sig of + Nothing -> [] + Just ty -> [ty] + doc_sig = text "In a result type-signature" +\end{code} + + +%************************************************************************ +%* * +\subsubsection{Guarded right-hand sides (GRHSs)} +%* * +%************************************************************************ + +\begin{code} +rnGRHSs :: HsMatchContext Name -> GRHSs RdrName -> RnM (GRHSs Name, FreeVars) + +rnGRHSs ctxt (GRHSs grhss binds) + = rnLocalBindsAndThen binds $ \ binds' -> + mapFvRn (rnGRHS ctxt) grhss `thenM` \ (grhss', fvGRHSs) -> + returnM (GRHSs grhss' binds', fvGRHSs) + +rnGRHS :: HsMatchContext Name -> LGRHS RdrName -> RnM (LGRHS Name, FreeVars) +rnGRHS ctxt = wrapLocFstM (rnGRHS' ctxt) + +rnGRHS' ctxt (GRHS guards rhs) + = do { opt_GlasgowExts <- doptM Opt_GlasgowExts + ; ((guards', rhs'), fvs) <- rnStmts (PatGuard ctxt) guards $ + rnLExpr rhs + + ; checkM (opt_GlasgowExts || 
is_standard_guard guards') + (addWarn (nonStdGuardErr guards')) + + ; return (GRHS guards' rhs', fvs) } + where + -- Standard Haskell 1.4 guards are just a single boolean + -- expression, rather than a list of qualifiers as in the + -- Glasgow extension + is_standard_guard [] = True + is_standard_guard [L _ (ExprStmt _ _ _)] = True + is_standard_guard other = False +\end{code} + +%************************************************************************ +%* * +\subsection{Error messages} +%* * +%************************************************************************ + +\begin{code} +dupSigDeclErr sigs@(L loc sig : _) + = addErrAt loc $ + vcat [ptext SLIT("Duplicate") <+> what_it_is <> colon, + nest 2 (vcat (map ppr_sig sigs))] + where + what_it_is = hsSigDoc sig + ppr_sig (L loc sig) = ppr loc <> colon <+> ppr sig + +unknownSigErr (L loc sig) + = addErrAt loc $ + sep [ptext SLIT("Misplaced") <+> what_it_is <> colon, ppr sig] + where + what_it_is = hsSigDoc sig + +missingSigWarn var + = addWarnAt (mkSrcSpan loc loc) $ + sep [ptext SLIT("Definition but no type signature for"), quotes (ppr var)] + where + loc = nameSrcLoc var -- TODO: make a proper span + +methodBindErr mbind + = hang (ptext SLIT("Pattern bindings (except simple variables) not allowed in instance declarations")) + 2 (ppr mbind) + +bindsInHsBootFile mbinds + = hang (ptext SLIT("Bindings in hs-boot files are not allowed")) + 2 (ppr mbinds) + +nonStdGuardErr guards + = hang (ptext SLIT("accepting non-standard pattern guards (-fglasgow-exts to suppress this message)")) + 4 (interpp'SP guards) +\end{code} diff --git a/compiler/rename/RnEnv.lhs b/compiler/rename/RnEnv.lhs new file mode 100644 index 0000000000..2be3bfd5c0 --- /dev/null +++ b/compiler/rename/RnEnv.lhs @@ -0,0 +1,811 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[RnEnv]{Environment manipulation for the renamer monad} + +\begin{code} +module RnEnv ( + newTopSrcBinder, + lookupLocatedBndrRn, lookupBndrRn, + lookupLocatedTopBndrRn, lookupTopBndrRn, + lookupLocatedOccRn, lookupOccRn, + lookupLocatedGlobalOccRn, lookupGlobalOccRn, + lookupLocalDataTcNames, lookupSrcOcc_maybe, + lookupFixityRn, lookupTyFixityRn, lookupLocatedSigOccRn, + lookupLocatedInstDeclBndr, + lookupSyntaxName, lookupSyntaxTable, lookupImportedName, + + newLocalsRn, newIPNameRn, + bindLocalNames, bindLocalNamesFV, + bindLocatedLocalsFV, bindLocatedLocalsRn, + bindSigTyVarsFV, bindPatSigTyVars, bindPatSigTyVarsFV, + bindTyVarsRn, extendTyVarEnvFVRn, + bindLocalFixities, + + checkDupNames, mapFvRn, + warnUnusedMatches, warnUnusedModules, warnUnusedImports, + warnUnusedTopBinds, warnUnusedLocalBinds, + dataTcOccs, unknownNameErr, + ) where + +#include "HsVersions.h" + +import LoadIface ( loadHomeInterface, loadSrcInterface ) +import IfaceEnv ( lookupOrig, newGlobalBinder, newIPName ) +import HsSyn ( FixitySig(..), HsExpr(..), SyntaxExpr, SyntaxTable, + LHsTyVarBndr, LHsType, + Fixity, hsLTyVarLocNames, replaceTyVarName ) +import RdrHsSyn ( extractHsTyRdrTyVars ) +import RdrName ( RdrName, rdrNameModule, isQual, isUnqual, isOrig, + mkRdrUnqual, setRdrNameSpace, rdrNameOcc, + pprGlobalRdrEnv, lookupGRE_RdrName, + isExact_maybe, isSrcRdrName, + GlobalRdrElt(..), GlobalRdrEnv, lookupGlobalRdrEnv, + isLocalGRE, extendLocalRdrEnv, elemLocalRdrEnv, lookupLocalRdrEnv, + Provenance(..), pprNameProvenance, + importSpecLoc, importSpecModule + ) +import HscTypes ( availNames, ModIface(..), FixItem(..), lookupFixity ) +import TcRnMonad +import Name ( Name, nameIsLocalOrFrom, 
mkInternalName, isWiredInName, + nameSrcLoc, nameOccName, nameModule, nameParent, isExternalName ) +import NameSet +import OccName ( tcName, isDataOcc, pprNonVarNameSpace, occNameSpace, + reportIfUnused ) +import Module ( Module ) +import PrelNames ( mkUnboundName, rOOT_MAIN, iNTERACTIVE, consDataConKey, hasKey ) +import UniqSupply +import BasicTypes ( IPName, mapIPName ) +import SrcLoc ( SrcSpan, srcSpanStart, Located(..), eqLocated, unLoc, + srcLocSpan, getLoc, combineSrcSpans, srcSpanStartLine, srcSpanEndLine ) +import Outputable +import Util ( sortLe ) +import ListSetOps ( removeDups ) +import List ( nubBy ) +import Monad ( when ) +import DynFlags +\end{code} + +%********************************************************* +%* * + Source-code binders +%* * +%********************************************************* + +\begin{code} +newTopSrcBinder :: Module -> Maybe Name -> Located RdrName -> RnM Name +newTopSrcBinder this_mod mb_parent (L loc rdr_name) + | Just name <- isExact_maybe rdr_name + = -- This is here to catch + -- (a) Exact-name binders created by Template Haskell + -- (b) The PrelBase defn of (say) [] and similar, for which + -- the parser reads the special syntax and returns an Exact RdrName + -- We are at a binding site for the name, so check first that it + -- the current module is the correct one; otherwise GHC can get + -- very confused indeed. This test rejects code like + -- data T = (,) Int Int + -- unless we are in GHC.Tup + ASSERT2( isExternalName name, ppr name ) + do checkErr (this_mod == nameModule name) + (badOrigBinding rdr_name) + returnM name + + + | isOrig rdr_name + = do checkErr (rdr_mod == this_mod || rdr_mod == rOOT_MAIN) + (badOrigBinding rdr_name) + -- When reading External Core we get Orig names as binders, + -- but they should agree with the module gotten from the monad + -- + -- We can get built-in syntax showing up here too, sadly. If you type + -- data T = (,,,) + -- the constructor is parsed as a type, and then RdrHsSyn.tyConToDataCon + -- uses setRdrNameSpace to make it into a data constructors. At that point + -- the nice Exact name for the TyCon gets swizzled to an Orig name. + -- Hence the badOrigBinding error message. + -- + -- Except for the ":Main.main = ..." definition inserted into + -- the Main module; ugh! + + -- Because of this latter case, we call newGlobalBinder with a module from + -- the RdrName, not from the environment. In principle, it'd be fine to + -- have an arbitrary mixture of external core definitions in a single module, + -- (apart from module-initialisation issues, perhaps). + newGlobalBinder rdr_mod (rdrNameOcc rdr_name) mb_parent + (srcSpanStart loc) --TODO, should pass the whole span + + | otherwise + = newGlobalBinder this_mod (rdrNameOcc rdr_name) mb_parent (srcSpanStart loc) + where + rdr_mod = rdrNameModule rdr_name +\end{code} + +%********************************************************* +%* * + Source code occurrences +%* * +%********************************************************* + +Looking up a name in the RnEnv. 
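+
+For example (an illustrative fragment, not taken from the compiler): when
+renaming
+
+\begin{verbatim}
+	f x = x + y
+\end{verbatim}
+
+the occurrence of x is found in the LocalRdrEnv, which was extended when
+the pattern bound x; y is not there, so the lookup falls through to the
+global environment (see lookupGlobalOccRn below).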
+ +\begin{code} +lookupLocatedBndrRn :: Located RdrName -> RnM (Located Name) +lookupLocatedBndrRn = wrapLocM lookupBndrRn + +lookupBndrRn :: RdrName -> RnM Name +-- NOTE: assumes that the SrcSpan of the binder has already been setSrcSpan'd +lookupBndrRn rdr_name + = getLocalRdrEnv `thenM` \ local_env -> + case lookupLocalRdrEnv local_env rdr_name of + Just name -> returnM name + Nothing -> lookupTopBndrRn rdr_name + +lookupLocatedTopBndrRn :: Located RdrName -> RnM (Located Name) +lookupLocatedTopBndrRn = wrapLocM lookupTopBndrRn + +lookupTopBndrRn :: RdrName -> RnM Name +-- Look up a top-level source-code binder. We may be looking up an unqualified 'f', +-- and there may be several imported 'f's too, which must not confuse us. +-- For example, this is OK: +-- import Foo( f ) +-- infix 9 f -- The 'f' here does not need to be qualified +-- f x = x -- Nor here, of course +-- So we have to filter out the non-local ones. +-- +-- A separate function (importsFromLocalDecls) reports duplicate top level +-- decls, so here it's safe just to choose an arbitrary one. +-- +-- There should never be a qualified name in a binding position in Haskell, +-- but there can be if we have read in an external-Core file. +-- The Haskell parser checks for the illegal qualified name in Haskell +-- source files, so we don't need to do so here. + +lookupTopBndrRn rdr_name + | Just name <- isExact_maybe rdr_name + = returnM name + + | isOrig rdr_name + -- This deals with the case of derived bindings, where + -- we don't bother to call newTopSrcBinder first + -- We assume there is no "parent" name + = do { loc <- getSrcSpanM + ; newGlobalBinder (rdrNameModule rdr_name) + (rdrNameOcc rdr_name) Nothing (srcSpanStart loc) } + + | otherwise + = do { mb_gre <- lookupGreLocalRn rdr_name + ; case mb_gre of + Nothing -> unboundName rdr_name + Just gre -> returnM (gre_name gre) } + +-- lookupLocatedSigOccRn is used for type signatures and pragmas +-- Is this valid? +-- module A +-- import M( f ) +-- f :: Int -> Int +-- f x = x +-- It's clear that the 'f' in the signature must refer to A.f +-- The Haskell98 report does not stipulate this, but it will! +-- So we must treat the 'f' in the signature in the same way +-- as the binding occurrence of 'f', using lookupBndrRn +lookupLocatedSigOccRn :: Located RdrName -> RnM (Located Name) +lookupLocatedSigOccRn = lookupLocatedBndrRn + +-- lookupInstDeclBndr is used for the binders in an +-- instance declaration. Here we use the class name to +-- disambiguate. 
+ +lookupLocatedInstDeclBndr :: Name -> Located RdrName -> RnM (Located Name) +lookupLocatedInstDeclBndr cls = wrapLocM (lookupInstDeclBndr cls) + +lookupInstDeclBndr :: Name -> RdrName -> RnM Name +lookupInstDeclBndr cls_name rdr_name + | isUnqual rdr_name -- Find all the things the rdr-name maps to + = do { -- and pick the one with the right parent name + let { is_op gre = cls_name == nameParent (gre_name gre) + ; occ = rdrNameOcc rdr_name + ; lookup_fn env = filter is_op (lookupGlobalRdrEnv env occ) } + ; mb_gre <- lookupGreRn_help rdr_name lookup_fn + ; case mb_gre of + Just gre -> return (gre_name gre) + Nothing -> do { addErr (unknownInstBndrErr cls_name rdr_name) + ; return (mkUnboundName rdr_name) } } + + | otherwise -- Occurs in derived instances, where we just + -- refer directly to the right method + = ASSERT2( not (isQual rdr_name), ppr rdr_name ) + -- NB: qualified names are rejected by the parser + lookupImportedName rdr_name + +newIPNameRn :: IPName RdrName -> TcRnIf m n (IPName Name) +newIPNameRn ip_rdr = newIPName (mapIPName rdrNameOcc ip_rdr) + +-------------------------------------------------- +-- Occurrences +-------------------------------------------------- + +lookupLocatedOccRn :: Located RdrName -> RnM (Located Name) +lookupLocatedOccRn = wrapLocM lookupOccRn + +-- lookupOccRn looks up an occurrence of a RdrName +lookupOccRn :: RdrName -> RnM Name +lookupOccRn rdr_name + = getLocalRdrEnv `thenM` \ local_env -> + case lookupLocalRdrEnv local_env rdr_name of + Just name -> returnM name + Nothing -> lookupGlobalOccRn rdr_name + +lookupLocatedGlobalOccRn :: Located RdrName -> RnM (Located Name) +lookupLocatedGlobalOccRn = wrapLocM lookupGlobalOccRn + +lookupGlobalOccRn :: RdrName -> RnM Name +-- lookupGlobalOccRn is like lookupOccRn, except that it looks in the global +-- environment. It's used only for +-- record field names +-- class op names in class and instance decls + +lookupGlobalOccRn rdr_name + | not (isSrcRdrName rdr_name) + = lookupImportedName rdr_name + + | otherwise + = -- First look up the name in the normal environment. + lookupGreRn rdr_name `thenM` \ mb_gre -> + case mb_gre of { + Just gre -> returnM (gre_name gre) ; + Nothing -> + + -- We allow qualified names on the command line to refer to + -- *any* name exported by any module in scope, just as if + -- there was an "import qualified M" declaration for every + -- module. + getModule `thenM` \ mod -> + if isQual rdr_name && mod == iNTERACTIVE then + -- This test is not expensive, + lookupQualifiedName rdr_name -- and only happens for failed lookups + else + unboundName rdr_name } + +lookupImportedName :: RdrName -> TcRnIf m n Name +-- Lookup the occurrence of an imported name +-- The RdrName is *always* qualified or Exact +-- Treat it as an original name, and conjure up the Name +-- Usually it's Exact or Orig, but it can be Qual if it +-- comes from an hi-boot file. (This minor infelicity is +-- just to reduce duplication in the parser.) 
+lookupImportedName rdr_name + | Just n <- isExact_maybe rdr_name + -- This happens in derived code + = returnM n + + | otherwise -- Always Orig, even when reading a .hi-boot file + = ASSERT( not (isUnqual rdr_name) ) + lookupOrig (rdrNameModule rdr_name) (rdrNameOcc rdr_name) + +unboundName :: RdrName -> RnM Name +unboundName rdr_name + = do { addErr (unknownNameErr rdr_name) + ; env <- getGlobalRdrEnv; + ; traceRn (vcat [unknownNameErr rdr_name, + ptext SLIT("Global envt is:"), + nest 3 (pprGlobalRdrEnv env)]) + ; returnM (mkUnboundName rdr_name) } + +-------------------------------------------------- +-- Lookup in the Global RdrEnv of the module +-------------------------------------------------- + +lookupSrcOcc_maybe :: RdrName -> RnM (Maybe Name) +-- No filter function; does not report an error on failure +lookupSrcOcc_maybe rdr_name + = do { mb_gre <- lookupGreRn rdr_name + ; case mb_gre of + Nothing -> returnM Nothing + Just gre -> returnM (Just (gre_name gre)) } + +------------------------- +lookupGreRn :: RdrName -> RnM (Maybe GlobalRdrElt) +-- Just look up the RdrName in the GlobalRdrEnv +lookupGreRn rdr_name + = lookupGreRn_help rdr_name (lookupGRE_RdrName rdr_name) + +lookupGreLocalRn :: RdrName -> RnM (Maybe GlobalRdrElt) +-- Similar, but restricted to locally-defined things +lookupGreLocalRn rdr_name + = lookupGreRn_help rdr_name lookup_fn + where + lookup_fn env = filter isLocalGRE (lookupGRE_RdrName rdr_name env) + +lookupGreRn_help :: RdrName -- Only used in error message + -> (GlobalRdrEnv -> [GlobalRdrElt]) -- Lookup function + -> RnM (Maybe GlobalRdrElt) +-- Checks for exactly one match; reports deprecations +-- Returns Nothing, without error, if too few +lookupGreRn_help rdr_name lookup + = do { env <- getGlobalRdrEnv + ; case lookup env of + [] -> returnM Nothing + [gre] -> returnM (Just gre) + gres -> do { addNameClashErrRn rdr_name gres + ; returnM (Just (head gres)) } } + +------------------------------ +-- GHCi support +------------------------------ + +-- A qualified name on the command line can refer to any module at all: we +-- try to load the interface if we don't already have it. +lookupQualifiedName :: RdrName -> RnM Name +lookupQualifiedName rdr_name + = let + mod = rdrNameModule rdr_name + occ = rdrNameOcc rdr_name + in + -- Note: we want to behave as we would for a source file import here, + -- and respect hiddenness of modules/packages, hence loadSrcInterface. 
+ loadSrcInterface doc mod False `thenM` \ iface -> + + case [ (mod,occ) | + (mod,avails) <- mi_exports iface, + avail <- avails, + name <- availNames avail, + name == occ ] of + ((mod,occ):ns) -> ASSERT (null ns) + lookupOrig mod occ + _ -> unboundName rdr_name + where + doc = ptext SLIT("Need to find") <+> ppr rdr_name +\end{code} + +%********************************************************* +%* * + Fixities +%* * +%********************************************************* + +\begin{code} +lookupLocalDataTcNames :: RdrName -> RnM [Name] +-- GHC extension: look up both the tycon and data con +-- for con-like things +-- Complain if neither is in scope +lookupLocalDataTcNames rdr_name + | Just n <- isExact_maybe rdr_name + -- Special case for (:), which doesn't get into the GlobalRdrEnv + = return [n] -- For this we don't need to try the tycon too + | otherwise + = do { mb_gres <- mapM lookupGreLocalRn (dataTcOccs rdr_name) + ; case [gre_name gre | Just gre <- mb_gres] of + [] -> do { addErr (unknownNameErr rdr_name) + ; return [] } + names -> return names + } + +-------------------------------- +bindLocalFixities :: [FixitySig RdrName] -> RnM a -> RnM a +-- Used for nested fixity decls +-- No need to worry about type constructors here, +-- Should check for duplicates but we don't +bindLocalFixities fixes thing_inside + | null fixes = thing_inside + | otherwise = mappM rn_sig fixes `thenM` \ new_bit -> + extendFixityEnv new_bit thing_inside + where + rn_sig (FixitySig lv@(L loc v) fix) + = addLocM lookupBndrRn lv `thenM` \ new_v -> + returnM (new_v, (FixItem (rdrNameOcc v) fix loc)) +\end{code} + +-------------------------------- +lookupFixity is a bit strange. + +* Nested local fixity decls are put in the local fixity env, which we + find with getFixtyEnv + +* Imported fixities are found in the HIT or PIT + +* Top-level fixity decls in this module may be for Names that are + either Global (constructors, class operations) + or Local/Exported (everything else) + (See notes with RnNames.getLocalDeclBinders for why we have this split.) + We put them all in the local fixity environment + +\begin{code} +lookupFixityRn :: Name -> RnM Fixity +lookupFixityRn name + = getModule `thenM` \ this_mod -> + if nameIsLocalOrFrom this_mod name + then -- It's defined in this module + getFixityEnv `thenM` \ local_fix_env -> + traceRn (text "lookupFixityRn" <+> (ppr name $$ ppr local_fix_env)) `thenM_` + returnM (lookupFixity local_fix_env name) + + else -- It's imported + -- For imported names, we have to get their fixities by doing a + -- loadHomeInterface, and consulting the Ifaces that comes back + -- from that, because the interface file for the Name might not + -- have been loaded yet. Why not? Suppose you import module A, + -- which exports a function 'f', thus; + -- module CurrentModule where + -- import A( f ) + -- module A( f ) where + -- import B( f ) + -- Then B isn't loaded right away (after all, it's possible that + -- nothing from B will be used). When we come across a use of + -- 'f', we need to know its fixity, and it's then, and only + -- then, that we load B.hi. That is what's happening here. + -- + -- loadHomeInterface will find B.hi even if B is a hidden module, + -- and that's what we want. 
+ loadHomeInterface doc name `thenM` \ iface -> + returnM (mi_fix_fn iface (nameOccName name)) + where + doc = ptext SLIT("Checking fixity for") <+> ppr name + +--------------- +lookupTyFixityRn :: Located Name -> RnM Fixity +lookupTyFixityRn (L loc n) + = doptM Opt_GlasgowExts `thenM` \ glaExts -> + when (not glaExts) + (setSrcSpan loc $ addWarn (infixTyConWarn n)) `thenM_` + lookupFixityRn n + +--------------- +dataTcOccs :: RdrName -> [RdrName] +-- If the input is a data constructor, return both it and a type +-- constructor. This is useful when we aren't sure which we are +-- looking at. +dataTcOccs rdr_name + | Just n <- isExact_maybe rdr_name -- Ghastly special case + , n `hasKey` consDataConKey = [rdr_name] -- see note below + | isDataOcc occ = [rdr_name_tc, rdr_name] + | otherwise = [rdr_name] + where + occ = rdrNameOcc rdr_name + rdr_name_tc = setRdrNameSpace rdr_name tcName + +-- If the user typed "[]" or "(,,)", we'll generate an Exact RdrName, +-- and setRdrNameSpace generates an Orig, which is fine +-- But it's not fine for (:), because there *is* no corresponding type +-- constructor. If we generate an Orig tycon for GHC.Base.(:), it'll +-- appear to be in scope (because Orig's simply allocate a new name-cache +-- entry) and then we get an error when we use dataTcOccs in +-- TcRnDriver.tcRnGetInfo. Large sigh. +\end{code} + +%************************************************************************ +%* * + Rebindable names + Dealing with rebindable syntax is driven by the + Opt_NoImplicitPrelude dynamic flag. + + In "deriving" code we don't want to use rebindable syntax + so we switch off the flag locally + +%* * +%************************************************************************ + +Haskell 98 says that when you say "3" you get the "fromInteger" from the +Standard Prelude, regardless of what is in scope. However, to experiment +with having a language that is less coupled to the standard prelude, we're +trying a non-standard extension that instead gives you whatever "Prelude.fromInteger" +happens to be in scope. Then you can + import Prelude () + import MyPrelude as Prelude +to get the desired effect. + +At the moment this just happens for + * fromInteger, fromRational on literals (in expressions and patterns) + * negate (in expressions) + * minus (arising from n+k patterns) + * "do" notation + +We store the relevant Name in the HsSyn tree, in + * HsIntegral/HsFractional + * NegApp + * NPlusKPat + * HsDo +respectively. Initially, we just store the "standard" name (PrelNames.fromIntegralName, +fromRationalName etc), but the renamer changes this to the appropriate user +name if Opt_NoImplicitPrelude is on. That is what lookupSyntaxName does. + +We treat the orignal (standard) names as free-vars too, because the type checker +checks the type of the user thing against the type of the standard thing. 
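
For concreteness, here is a minimal sketch (an editor's illustration, not part of this
patch) of a client module written in that style. `MyPrelude' and `MyInt' are hypothetical
names, and the sketch assumes the implicit Prelude has been switched off with
-fno-implicit-prelude (the Opt_NoImplicitPrelude flag mentioned above):

    {-# OPTIONS_GHC -fno-implicit-prelude #-}
    module Client where

    import Prelude ()             -- hide the standard Prelude entirely
    import MyPrelude as Prelude   -- hypothetical replacement Prelude

    three :: MyInt                -- MyInt is assumed to come from MyPrelude
    three = 3                     -- this literal is renamed to MyPrelude.fromInteger 3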
+ +\begin{code} +lookupSyntaxName :: Name -- The standard name + -> RnM (SyntaxExpr Name, FreeVars) -- Possibly a non-standard name +lookupSyntaxName std_name + = doptM Opt_ImplicitPrelude `thenM` \ implicit_prelude -> + if implicit_prelude then normal_case + else + -- Get the similarly named thing from the local environment + lookupOccRn (mkRdrUnqual (nameOccName std_name)) `thenM` \ usr_name -> + returnM (HsVar usr_name, unitFV usr_name) + where + normal_case = returnM (HsVar std_name, emptyFVs) + +lookupSyntaxTable :: [Name] -- Standard names + -> RnM (SyntaxTable Name, FreeVars) -- See comments with HsExpr.ReboundNames +lookupSyntaxTable std_names + = doptM Opt_ImplicitPrelude `thenM` \ implicit_prelude -> + if implicit_prelude then normal_case + else + -- Get the similarly named thing from the local environment + mappM (lookupOccRn . mkRdrUnqual . nameOccName) std_names `thenM` \ usr_names -> + + returnM (std_names `zip` map HsVar usr_names, mkFVs usr_names) + where + normal_case = returnM (std_names `zip` map HsVar std_names, emptyFVs) +\end{code} + + +%********************************************************* +%* * +\subsection{Binding} +%* * +%********************************************************* + +\begin{code} +newLocalsRn :: [Located RdrName] -> RnM [Name] +newLocalsRn rdr_names_w_loc + = newUniqueSupply `thenM` \ us -> + returnM (zipWith mk rdr_names_w_loc (uniqsFromSupply us)) + where + mk (L loc rdr_name) uniq + | Just name <- isExact_maybe rdr_name = name + -- This happens in code generated by Template Haskell + | otherwise = ASSERT2( isUnqual rdr_name, ppr rdr_name ) + -- We only bind unqualified names here + -- lookupRdrEnv doesn't even attempt to look up a qualified RdrName + mkInternalName uniq (rdrNameOcc rdr_name) (srcSpanStart loc) + +bindLocatedLocalsRn :: SDoc -- Documentation string for error message + -> [Located RdrName] + -> ([Name] -> RnM a) + -> RnM a +bindLocatedLocalsRn doc_str rdr_names_w_loc enclosed_scope + = -- Check for duplicate names + checkDupNames doc_str rdr_names_w_loc `thenM_` + + -- Warn about shadowing, but only in source modules + ifOptM Opt_WarnNameShadowing + (checkShadowing doc_str rdr_names_w_loc) `thenM_` + + -- Make fresh Names and extend the environment + newLocalsRn rdr_names_w_loc `thenM` \ names -> + getLocalRdrEnv `thenM` \ local_env -> + setLocalRdrEnv (extendLocalRdrEnv local_env names) + (enclosed_scope names) + + +bindLocalNames :: [Name] -> RnM a -> RnM a +bindLocalNames names enclosed_scope + = getLocalRdrEnv `thenM` \ name_env -> + setLocalRdrEnv (extendLocalRdrEnv name_env names) + enclosed_scope + +bindLocalNamesFV :: [Name] -> RnM (a, FreeVars) -> RnM (a, FreeVars) +bindLocalNamesFV names enclosed_scope + = do { (result, fvs) <- bindLocalNames names enclosed_scope + ; returnM (result, delListFromNameSet fvs names) } + + +------------------------------------- + -- binLocalsFVRn is the same as bindLocalsRn + -- except that it deals with free vars +bindLocatedLocalsFV :: SDoc -> [Located RdrName] -> ([Name] -> RnM (a,FreeVars)) + -> RnM (a, FreeVars) +bindLocatedLocalsFV doc rdr_names enclosed_scope + = bindLocatedLocalsRn doc rdr_names $ \ names -> + enclosed_scope names `thenM` \ (thing, fvs) -> + returnM (thing, delListFromNameSet fvs names) + +------------------------------------- +bindTyVarsRn :: SDoc -> [LHsTyVarBndr RdrName] + -> ([LHsTyVarBndr Name] -> RnM a) + -> RnM a +-- Haskell-98 binding of type variables; e.g. 
within a data type decl +bindTyVarsRn doc_str tyvar_names enclosed_scope + = let + located_tyvars = hsLTyVarLocNames tyvar_names + in + bindLocatedLocalsRn doc_str located_tyvars $ \ names -> + enclosed_scope (zipWith replace tyvar_names names) + where + replace (L loc n1) n2 = L loc (replaceTyVarName n1 n2) + +bindPatSigTyVars :: [LHsType RdrName] -> ([Name] -> RnM a) -> RnM a + -- Find the type variables in the pattern type + -- signatures that must be brought into scope +bindPatSigTyVars tys thing_inside + = do { scoped_tyvars <- doptM Opt_ScopedTypeVariables + ; if not scoped_tyvars then + thing_inside [] + else + do { name_env <- getLocalRdrEnv + ; let locd_tvs = [ tv | ty <- tys + , tv <- extractHsTyRdrTyVars ty + , not (unLoc tv `elemLocalRdrEnv` name_env) ] + nubbed_tvs = nubBy eqLocated locd_tvs + -- The 'nub' is important. For example: + -- f (x :: t) (y :: t) = .... + -- We don't want to complain about binding t twice! + + ; bindLocatedLocalsRn doc_sig nubbed_tvs thing_inside }} + where + doc_sig = text "In a pattern type-signature" + +bindPatSigTyVarsFV :: [LHsType RdrName] + -> RnM (a, FreeVars) + -> RnM (a, FreeVars) +bindPatSigTyVarsFV tys thing_inside + = bindPatSigTyVars tys $ \ tvs -> + thing_inside `thenM` \ (result,fvs) -> + returnM (result, fvs `delListFromNameSet` tvs) + +bindSigTyVarsFV :: [Name] + -> RnM (a, FreeVars) + -> RnM (a, FreeVars) +bindSigTyVarsFV tvs thing_inside + = do { scoped_tyvars <- doptM Opt_ScopedTypeVariables + ; if not scoped_tyvars then + thing_inside + else + bindLocalNamesFV tvs thing_inside } + +extendTyVarEnvFVRn :: [Name] -> RnM (a, FreeVars) -> RnM (a, FreeVars) + -- This function is used only in rnSourceDecl on InstDecl +extendTyVarEnvFVRn tyvars thing_inside = bindLocalNamesFV tyvars thing_inside + +------------------------------------- +checkDupNames :: SDoc + -> [Located RdrName] + -> RnM () +checkDupNames doc_str rdr_names_w_loc + = -- Check for duplicated names in a binding group + mappM_ (dupNamesErr doc_str) dups + where + (_, dups) = removeDups (\n1 n2 -> unLoc n1 `compare` unLoc n2) rdr_names_w_loc + +------------------------------------- +checkShadowing doc_str loc_rdr_names + = getLocalRdrEnv `thenM` \ local_env -> + getGlobalRdrEnv `thenM` \ global_env -> + let + check_shadow (L loc rdr_name) + | rdr_name `elemLocalRdrEnv` local_env + || not (null (lookupGRE_RdrName rdr_name global_env )) + = setSrcSpan loc $ addWarn (shadowedNameWarn doc_str rdr_name) + | otherwise = returnM () + in + mappM_ check_shadow loc_rdr_names +\end{code} + + +%************************************************************************ +%* * +\subsection{Free variable manipulation} +%* * +%************************************************************************ + +\begin{code} +-- A useful utility +mapFvRn f xs = mappM f xs `thenM` \ stuff -> + let + (ys, fvs_s) = unzip stuff + in + returnM (ys, plusFVs fvs_s) +\end{code} + + +%************************************************************************ +%* * +\subsection{Envt utility functions} +%* * +%************************************************************************ + +\begin{code} +warnUnusedModules :: [(Module,SrcSpan)] -> RnM () +warnUnusedModules mods + = ifOptM Opt_WarnUnusedImports (mappM_ bleat mods) + where + bleat (mod,loc) = setSrcSpan loc $ addWarn (mk_warn mod) + mk_warn m = vcat [ptext SLIT("Module") <+> quotes (ppr m) + <+> text "is imported, but nothing from it is used,", + nest 2 (ptext SLIT("except perhaps instances visible in") + <+> quotes (ppr m)), + ptext SLIT("To suppress this 
warning, use:") + <+> ptext SLIT("import") <+> ppr m <> parens empty ] + + +warnUnusedImports, warnUnusedTopBinds :: [GlobalRdrElt] -> RnM () +warnUnusedImports gres = ifOptM Opt_WarnUnusedImports (warnUnusedGREs gres) +warnUnusedTopBinds gres = ifOptM Opt_WarnUnusedBinds (warnUnusedGREs gres) + +warnUnusedLocalBinds, warnUnusedMatches :: [Name] -> RnM () +warnUnusedLocalBinds names = ifOptM Opt_WarnUnusedBinds (warnUnusedLocals names) +warnUnusedMatches names = ifOptM Opt_WarnUnusedMatches (warnUnusedLocals names) + +------------------------- +-- Helpers +warnUnusedGREs gres + = warnUnusedBinds [(n,Just p) | GRE {gre_name = n, gre_prov = p} <- gres] + +warnUnusedLocals names + = warnUnusedBinds [(n,Nothing) | n<-names] + +warnUnusedBinds :: [(Name,Maybe Provenance)] -> RnM () +warnUnusedBinds names = mappM_ warnUnusedName (filter reportable names) + where reportable (name,_) + | isWiredInName name = False -- Don't report unused wired-in names + -- Otherwise we get a zillion warnings + -- from Data.Tuple + | otherwise = reportIfUnused (nameOccName name) + +------------------------- + +warnUnusedName :: (Name, Maybe Provenance) -> RnM () +warnUnusedName (name, prov) + = addWarnAt loc $ + sep [msg <> colon, + nest 2 $ pprNonVarNameSpace (occNameSpace (nameOccName name)) + <+> quotes (ppr name)] + -- TODO should be a proper span + where + (loc,msg) = case prov of + Just (Imported is) + -> (importSpecLoc imp_spec, imp_from (importSpecModule imp_spec)) + where + imp_spec = head is + other -> (srcLocSpan (nameSrcLoc name), unused_msg) + + unused_msg = text "Defined but not used" + imp_from mod = text "Imported from" <+> quotes (ppr mod) <+> text "but not used" +\end{code} + +\begin{code} +addNameClashErrRn rdr_name (np1:nps) + = addErr (vcat [ptext SLIT("Ambiguous occurrence") <+> quotes (ppr rdr_name), + ptext SLIT("It could refer to") <+> vcat (msg1 : msgs)]) + where + msg1 = ptext SLIT("either") <+> mk_ref np1 + msgs = [ptext SLIT(" or") <+> mk_ref np | np <- nps] + mk_ref gre = quotes (ppr (gre_name gre)) <> comma <+> pprNameProvenance gre + +shadowedNameWarn doc shadow + = hsep [ptext SLIT("This binding for"), + quotes (ppr shadow), + ptext SLIT("shadows an existing binding")] + $$ doc + +unknownNameErr rdr_name + = sep [ptext SLIT("Not in scope:"), + nest 2 $ pprNonVarNameSpace (occNameSpace (rdrNameOcc rdr_name)) + <+> quotes (ppr rdr_name)] + +unknownInstBndrErr cls op + = quotes (ppr op) <+> ptext SLIT("is not a (visible) method of class") <+> quotes (ppr cls) + +badOrigBinding name + = ptext SLIT("Illegal binding of built-in syntax:") <+> ppr (rdrNameOcc name) + -- The rdrNameOcc is because we don't want to print Prelude.(,) + +dupNamesErr :: SDoc -> [Located RdrName] -> RnM () +dupNamesErr descriptor located_names + = setSrcSpan big_loc $ + addErr (vcat [ptext SLIT("Conflicting definitions for") <+> quotes (ppr name1), + locations, + descriptor]) + where + L _ name1 = head located_names + locs = map getLoc located_names + big_loc = foldr1 combineSrcSpans locs + one_line = srcSpanStartLine big_loc == srcSpanEndLine big_loc + locations | one_line = empty + | otherwise = ptext SLIT("Bound at:") <+> + vcat (map ppr (sortLe (<=) locs)) + +infixTyConWarn op + = vcat [ftext FSLIT("Accepting non-standard infix type constructor") <+> quotes (ppr op), + ftext FSLIT("Use -fglasgow-exts to avoid this warning")] +\end{code} diff --git a/compiler/rename/RnExpr.hi-boot-6 b/compiler/rename/RnExpr.hi-boot-6 new file mode 100644 index 0000000000..8f6c7f154b --- /dev/null +++ 
b/compiler/rename/RnExpr.hi-boot-6 @@ -0,0 +1,11 @@ +module RnExpr where
+
+rnLExpr :: HsExpr.LHsExpr RdrName.RdrName
+ -> TcRnTypes.RnM (HsExpr.LHsExpr Name.Name, NameSet.FreeVars)
+
+rnStmts :: forall thing.
+ HsExpr.HsStmtContext Name.Name -> [HsExpr.LStmt RdrName.RdrName]
+ -> TcRnTypes.RnM (thing, NameSet.FreeVars)
+ -> TcRnTypes.RnM (([HsExpr.LStmt Name.Name], thing), NameSet.FreeVars)
+
+
diff --git a/compiler/rename/RnExpr.lhs b/compiler/rename/RnExpr.lhs new file mode 100644 index 0000000000..716a85a3b3 --- /dev/null +++ b/compiler/rename/RnExpr.lhs @@ -0,0 +1,996 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[RnExpr]{Renaming of expressions} + +Basically dependency analysis. + +Handles @Match@, @GRHSs@, @HsExpr@, and @Qualifier@ datatypes. In +general, all of these functions return a renamed thing, and a set of +free variables. + +\begin{code} +module RnExpr ( + rnLExpr, rnExpr, rnStmts + ) where + +#include "HsVersions.h" + +import RnSource ( rnSrcDecls, rnSplice, checkTH ) +import RnBinds ( rnLocalBindsAndThen, rnValBinds, + rnMatchGroup, trimWith ) +import HsSyn +import RnHsSyn +import TcRnMonad +import RnEnv +import OccName ( plusOccEnv ) +import RnNames ( getLocalDeclBinders, extendRdrEnvRn ) +import RnTypes ( rnHsTypeFVs, rnLPat, rnOverLit, rnPatsAndThen, rnLit, + mkOpFormRn, mkOpAppRn, mkNegAppRn, checkSectionPrec, + dupFieldErr, checkTupSize ) +import DynFlags ( DynFlag(..) ) +import BasicTypes ( FixityDirection(..) ) +import PrelNames ( thFAKE, hasKey, assertIdKey, assertErrorName, + loopAName, choiceAName, appAName, arrAName, composeAName, firstAName, + negateName, thenMName, bindMName, failMName ) +#if defined(GHCI) && defined(BREAKPOINT) +import PrelNames ( breakpointJumpName, undefined_RDR, breakpointIdKey ) +import UniqFM ( eltsUFM ) +import DynFlags ( GhcMode(..) ) +import SrcLoc ( srcSpanFile, srcSpanStartLine ) +import Name ( isTyVarName ) +#endif +import Name ( Name, nameOccName, nameIsLocalOrFrom ) +import NameSet +import RdrName ( RdrName, emptyGlobalRdrEnv, extendLocalRdrEnv, lookupLocalRdrEnv ) +import LoadIface ( loadHomeInterface ) +import UniqFM ( isNullUFM ) +import UniqSet ( emptyUniqSet ) +import List ( nub ) +import Util ( isSingleton ) +import ListSetOps ( removeDups ) +import Maybes ( expectJust ) +import Outputable +import SrcLoc ( Located(..), unLoc, getLoc, cmpLocated ) +import FastString + +import List ( unzip4 ) +\end{code} + + +%************************************************************************ +%* * +\subsubsection{Expressions} +%* * +%************************************************************************ + +\begin{code} +rnExprs :: [LHsExpr RdrName] -> RnM ([LHsExpr Name], FreeVars) +rnExprs ls = rnExprs' ls emptyUniqSet + where + rnExprs' [] acc = returnM ([], acc) + rnExprs' (expr:exprs) acc + = rnLExpr expr `thenM` \ (expr', fvExpr) -> + + -- Now we do a "seq" on the free vars because typically it's small + -- or empty, especially in very long lists of constants + let + acc' = acc `plusFV` fvExpr + in + (grubby_seqNameSet acc' rnExprs') exprs acc' `thenM` \ (exprs', fvExprs) -> + returnM (expr':exprs', fvExprs) + +-- Grubby little function to do "seq" on namesets; replace by proper seq when GHC can do seq +grubby_seqNameSet ns result | isNullUFM ns = result + | otherwise = result +\end{code} + +Variables. We look up the variable and return the resulting name. 
+ +\begin{code} +rnLExpr :: LHsExpr RdrName -> RnM (LHsExpr Name, FreeVars) +rnLExpr = wrapLocFstM rnExpr + +rnExpr :: HsExpr RdrName -> RnM (HsExpr Name, FreeVars) + +rnExpr (HsVar v) + = do name <- lookupOccRn v + localRdrEnv <- getLocalRdrEnv + lclEnv <- getLclEnv + ignore_asserts <- doptM Opt_IgnoreAsserts + ignore_breakpoints <- doptM Opt_IgnoreBreakpoints + let conds = [ (name `hasKey` assertIdKey + && not ignore_asserts, + do (e, fvs) <- mkAssertErrorExpr + return (e, fvs `addOneFV` name)) +#if defined(GHCI) && defined(BREAKPOINT) + , (name `hasKey` breakpointIdKey + && not ignore_breakpoints, + do ghcMode <- getGhcMode + case ghcMode of + Interactive + -> do let isWantedName = not.isTyVarName + (e, fvs) <- mkBreakPointExpr (filter isWantedName (eltsUFM localRdrEnv)) + return (e, fvs `addOneFV` name) + _ -> return (HsVar name, unitFV name) + ) +#endif + ] + case lookup True conds of + Just action -> action + Nothing -> return (HsVar name, unitFV name) + +rnExpr (HsIPVar v) + = newIPNameRn v `thenM` \ name -> + returnM (HsIPVar name, emptyFVs) + +rnExpr (HsLit lit) + = rnLit lit `thenM_` + returnM (HsLit lit, emptyFVs) + +rnExpr (HsOverLit lit) + = rnOverLit lit `thenM` \ (lit', fvs) -> + returnM (HsOverLit lit', fvs) + +rnExpr (HsApp fun arg) + = rnLExpr fun `thenM` \ (fun',fvFun) -> + rnLExpr arg `thenM` \ (arg',fvArg) -> + returnM (HsApp fun' arg', fvFun `plusFV` fvArg) + +rnExpr (OpApp e1 op _ e2) + = rnLExpr e1 `thenM` \ (e1', fv_e1) -> + rnLExpr e2 `thenM` \ (e2', fv_e2) -> + rnLExpr op `thenM` \ (op'@(L _ (HsVar op_name)), fv_op) -> + + -- Deal with fixity + -- When renaming code synthesised from "deriving" declarations + -- we used to avoid fixity stuff, but we can't easily tell any + -- more, so I've removed the test. Adding HsPars in TcGenDeriv + -- should prevent bad things happening. + lookupFixityRn op_name `thenM` \ fixity -> + mkOpAppRn e1' op' fixity e2' `thenM` \ final_e -> + + returnM (final_e, + fv_e1 `plusFV` fv_op `plusFV` fv_e2) + +rnExpr (NegApp e _) + = rnLExpr e `thenM` \ (e', fv_e) -> + lookupSyntaxName negateName `thenM` \ (neg_name, fv_neg) -> + mkNegAppRn e' neg_name `thenM` \ final_e -> + returnM (final_e, fv_e `plusFV` fv_neg) + +rnExpr (HsPar e) + = rnLExpr e `thenM` \ (e', fvs_e) -> + returnM (HsPar e', fvs_e) + +-- Template Haskell extensions +-- Don't ifdef-GHCI them because we want to fail gracefully +-- (not with an rnExpr crash) in a stage-1 compiler. 
+rnExpr e@(HsBracket br_body) + = checkTH e "bracket" `thenM_` + rnBracket br_body `thenM` \ (body', fvs_e) -> + returnM (HsBracket body', fvs_e) + +rnExpr e@(HsSpliceE splice) + = rnSplice splice `thenM` \ (splice', fvs) -> + returnM (HsSpliceE splice', fvs) + +rnExpr section@(SectionL expr op) + = rnLExpr expr `thenM` \ (expr', fvs_expr) -> + rnLExpr op `thenM` \ (op', fvs_op) -> + checkSectionPrec InfixL section op' expr' `thenM_` + returnM (SectionL expr' op', fvs_op `plusFV` fvs_expr) + +rnExpr section@(SectionR op expr) + = rnLExpr op `thenM` \ (op', fvs_op) -> + rnLExpr expr `thenM` \ (expr', fvs_expr) -> + checkSectionPrec InfixR section op' expr' `thenM_` + returnM (SectionR op' expr', fvs_op `plusFV` fvs_expr) + +rnExpr (HsCoreAnn ann expr) + = rnLExpr expr `thenM` \ (expr', fvs_expr) -> + returnM (HsCoreAnn ann expr', fvs_expr) + +rnExpr (HsSCC lbl expr) + = rnLExpr expr `thenM` \ (expr', fvs_expr) -> + returnM (HsSCC lbl expr', fvs_expr) + +rnExpr (HsLam matches) + = rnMatchGroup LambdaExpr matches `thenM` \ (matches', fvMatch) -> + returnM (HsLam matches', fvMatch) + +rnExpr (HsCase expr matches) + = rnLExpr expr `thenM` \ (new_expr, e_fvs) -> + rnMatchGroup CaseAlt matches `thenM` \ (new_matches, ms_fvs) -> + returnM (HsCase new_expr new_matches, e_fvs `plusFV` ms_fvs) + +rnExpr (HsLet binds expr) + = rnLocalBindsAndThen binds $ \ binds' -> + rnLExpr expr `thenM` \ (expr',fvExpr) -> + returnM (HsLet binds' expr', fvExpr) + +rnExpr e@(HsDo do_or_lc stmts body _) + = do { ((stmts', body'), fvs) <- rnStmts do_or_lc stmts $ + rnLExpr body + ; return (HsDo do_or_lc stmts' body' placeHolderType, fvs) } + +rnExpr (ExplicitList _ exps) + = rnExprs exps `thenM` \ (exps', fvs) -> + returnM (ExplicitList placeHolderType exps', fvs `addOneFV` listTyCon_name) + +rnExpr (ExplicitPArr _ exps) + = rnExprs exps `thenM` \ (exps', fvs) -> + returnM (ExplicitPArr placeHolderType exps', fvs) + +rnExpr e@(ExplicitTuple exps boxity) + = checkTupSize tup_size `thenM_` + rnExprs exps `thenM` \ (exps', fvs) -> + returnM (ExplicitTuple exps' boxity, fvs `addOneFV` tycon_name) + where + tup_size = length exps + tycon_name = tupleTyCon_name boxity tup_size + +rnExpr (RecordCon con_id _ rbinds) + = lookupLocatedOccRn con_id `thenM` \ conname -> + rnRbinds "construction" rbinds `thenM` \ (rbinds', fvRbinds) -> + returnM (RecordCon conname noPostTcExpr rbinds', + fvRbinds `addOneFV` unLoc conname) + +rnExpr (RecordUpd expr rbinds _ _) + = rnLExpr expr `thenM` \ (expr', fvExpr) -> + rnRbinds "update" rbinds `thenM` \ (rbinds', fvRbinds) -> + returnM (RecordUpd expr' rbinds' placeHolderType placeHolderType, + fvExpr `plusFV` fvRbinds) + +rnExpr (ExprWithTySig expr pty) + = rnLExpr expr `thenM` \ (expr', fvExpr) -> + rnHsTypeFVs doc pty `thenM` \ (pty', fvTy) -> + returnM (ExprWithTySig expr' pty', fvExpr `plusFV` fvTy) + where + doc = text "In an expression type signature" + +rnExpr (HsIf p b1 b2) + = rnLExpr p `thenM` \ (p', fvP) -> + rnLExpr b1 `thenM` \ (b1', fvB1) -> + rnLExpr b2 `thenM` \ (b2', fvB2) -> + returnM (HsIf p' b1' b2', plusFVs [fvP, fvB1, fvB2]) + +rnExpr (HsType a) + = rnHsTypeFVs doc a `thenM` \ (t, fvT) -> + returnM (HsType t, fvT) + where + doc = text "In a type argument" + +rnExpr (ArithSeq _ seq) + = rnArithSeq seq `thenM` \ (new_seq, fvs) -> + returnM (ArithSeq noPostTcExpr new_seq, fvs) + +rnExpr (PArrSeq _ seq) + = rnArithSeq seq `thenM` \ (new_seq, fvs) -> + returnM (PArrSeq noPostTcExpr new_seq, fvs) +\end{code} + +These three are pattern syntax appearing in expressions. 
+Since all the symbols are reservedops we can simply reject them. +We return a (bogus) EWildPat in each case. + +\begin{code} +rnExpr e@EWildPat = patSynErr e +rnExpr e@(EAsPat {}) = patSynErr e +rnExpr e@(ELazyPat {}) = patSynErr e +\end{code} + +%************************************************************************ +%* * + Arrow notation +%* * +%************************************************************************ + +\begin{code} +rnExpr (HsProc pat body) + = newArrowScope $ + rnPatsAndThen ProcExpr [pat] $ \ [pat'] -> + rnCmdTop body `thenM` \ (body',fvBody) -> + returnM (HsProc pat' body', fvBody) + +rnExpr (HsArrApp arrow arg _ ho rtl) + = select_arrow_scope (rnLExpr arrow) `thenM` \ (arrow',fvArrow) -> + rnLExpr arg `thenM` \ (arg',fvArg) -> + returnM (HsArrApp arrow' arg' placeHolderType ho rtl, + fvArrow `plusFV` fvArg) + where + select_arrow_scope tc = case ho of + HsHigherOrderApp -> tc + HsFirstOrderApp -> escapeArrowScope tc + +-- infix form +rnExpr (HsArrForm op (Just _) [arg1, arg2]) + = escapeArrowScope (rnLExpr op) + `thenM` \ (op'@(L _ (HsVar op_name)),fv_op) -> + rnCmdTop arg1 `thenM` \ (arg1',fv_arg1) -> + rnCmdTop arg2 `thenM` \ (arg2',fv_arg2) -> + + -- Deal with fixity + + lookupFixityRn op_name `thenM` \ fixity -> + mkOpFormRn arg1' op' fixity arg2' `thenM` \ final_e -> + + returnM (final_e, + fv_arg1 `plusFV` fv_op `plusFV` fv_arg2) + +rnExpr (HsArrForm op fixity cmds) + = escapeArrowScope (rnLExpr op) `thenM` \ (op',fvOp) -> + rnCmdArgs cmds `thenM` \ (cmds',fvCmds) -> + returnM (HsArrForm op' fixity cmds', fvOp `plusFV` fvCmds) + +rnExpr other = pprPanic "rnExpr: unexpected expression" (ppr other) + -- DictApp, DictLam, TyApp, TyLam +\end{code} + + +%************************************************************************ +%* * + Arrow commands +%* * +%************************************************************************ + +\begin{code} +rnCmdArgs [] = returnM ([], emptyFVs) +rnCmdArgs (arg:args) + = rnCmdTop arg `thenM` \ (arg',fvArg) -> + rnCmdArgs args `thenM` \ (args',fvArgs) -> + returnM (arg':args', fvArg `plusFV` fvArgs) + + +rnCmdTop = wrapLocFstM rnCmdTop' + where + rnCmdTop' (HsCmdTop cmd _ _ _) + = rnLExpr (convertOpFormsLCmd cmd) `thenM` \ (cmd', fvCmd) -> + let + cmd_names = [arrAName, composeAName, firstAName] ++ + nameSetToList (methodNamesCmd (unLoc cmd')) + in + -- Generate the rebindable syntax for the monad + lookupSyntaxTable cmd_names `thenM` \ (cmd_names', cmd_fvs) -> + + returnM (HsCmdTop cmd' [] placeHolderType cmd_names', + fvCmd `plusFV` cmd_fvs) + +--------------------------------------------------- +-- convert OpApp's in a command context to HsArrForm's + +convertOpFormsLCmd :: LHsCmd id -> LHsCmd id +convertOpFormsLCmd = fmap convertOpFormsCmd + +convertOpFormsCmd :: HsCmd id -> HsCmd id + +convertOpFormsCmd (HsApp c e) = HsApp (convertOpFormsLCmd c) e +convertOpFormsCmd (HsLam match) = HsLam (convertOpFormsMatch match) +convertOpFormsCmd (OpApp c1 op fixity c2) + = let + arg1 = L (getLoc c1) $ HsCmdTop (convertOpFormsLCmd c1) [] placeHolderType [] + arg2 = L (getLoc c2) $ HsCmdTop (convertOpFormsLCmd c2) [] placeHolderType [] + in + HsArrForm op (Just fixity) [arg1, arg2] + +convertOpFormsCmd (HsPar c) = HsPar (convertOpFormsLCmd c) + +-- gaw 2004 +convertOpFormsCmd (HsCase exp matches) + = HsCase exp (convertOpFormsMatch matches) + +convertOpFormsCmd (HsIf exp c1 c2) + = HsIf exp (convertOpFormsLCmd c1) (convertOpFormsLCmd c2) + +convertOpFormsCmd (HsLet binds cmd) + = HsLet binds (convertOpFormsLCmd cmd) + 
+convertOpFormsCmd (HsDo ctxt stmts body ty) + = HsDo ctxt (map (fmap convertOpFormsStmt) stmts) + (convertOpFormsLCmd body) ty + +-- Anything else is unchanged. This includes HsArrForm (already done), +-- things with no sub-commands, and illegal commands (which will be +-- caught by the type checker) +convertOpFormsCmd c = c + +convertOpFormsStmt (BindStmt pat cmd _ _) + = BindStmt pat (convertOpFormsLCmd cmd) noSyntaxExpr noSyntaxExpr +convertOpFormsStmt (ExprStmt cmd _ _) + = ExprStmt (convertOpFormsLCmd cmd) noSyntaxExpr placeHolderType +convertOpFormsStmt (RecStmt stmts lvs rvs es binds) + = RecStmt (map (fmap convertOpFormsStmt) stmts) lvs rvs es binds +convertOpFormsStmt stmt = stmt + +convertOpFormsMatch (MatchGroup ms ty) + = MatchGroup (map (fmap convert) ms) ty + where convert (Match pat mty grhss) + = Match pat mty (convertOpFormsGRHSs grhss) + +convertOpFormsGRHSs (GRHSs grhss binds) + = GRHSs (map convertOpFormsGRHS grhss) binds + +convertOpFormsGRHS = fmap convert + where + convert (GRHS stmts cmd) = GRHS stmts (convertOpFormsLCmd cmd) + +--------------------------------------------------- +type CmdNeeds = FreeVars -- Only inhabitants are + -- appAName, choiceAName, loopAName + +-- find what methods the Cmd needs (loop, choice, apply) +methodNamesLCmd :: LHsCmd Name -> CmdNeeds +methodNamesLCmd = methodNamesCmd . unLoc + +methodNamesCmd :: HsCmd Name -> CmdNeeds + +methodNamesCmd cmd@(HsArrApp _arrow _arg _ HsFirstOrderApp _rtl) + = emptyFVs +methodNamesCmd cmd@(HsArrApp _arrow _arg _ HsHigherOrderApp _rtl) + = unitFV appAName +methodNamesCmd cmd@(HsArrForm {}) = emptyFVs + +methodNamesCmd (HsPar c) = methodNamesLCmd c + +methodNamesCmd (HsIf p c1 c2) + = methodNamesLCmd c1 `plusFV` methodNamesLCmd c2 `addOneFV` choiceAName + +methodNamesCmd (HsLet b c) = methodNamesLCmd c + +methodNamesCmd (HsDo sc stmts body ty) + = methodNamesStmts stmts `plusFV` methodNamesLCmd body + +methodNamesCmd (HsApp c e) = methodNamesLCmd c + +methodNamesCmd (HsLam match) = methodNamesMatch match + +methodNamesCmd (HsCase scrut matches) + = methodNamesMatch matches `addOneFV` choiceAName + +methodNamesCmd other = emptyFVs + -- Other forms can't occur in commands, but it's not convenient + -- to error here so we just do what's convenient. + -- The type checker will complain later + +--------------------------------------------------- +methodNamesMatch (MatchGroup ms ty) + = plusFVs (map do_one ms) + where + do_one (L _ (Match pats sig_ty grhss)) = methodNamesGRHSs grhss + +------------------------------------------------- +-- gaw 2004 +methodNamesGRHSs (GRHSs grhss binds) = plusFVs (map methodNamesGRHS grhss) + +------------------------------------------------- +methodNamesGRHS (L _ (GRHS stmts rhs)) = methodNamesLCmd rhs + +--------------------------------------------------- +methodNamesStmts stmts = plusFVs (map methodNamesLStmt stmts) + +--------------------------------------------------- +methodNamesLStmt = methodNamesStmt . 
unLoc + +methodNamesStmt (ExprStmt cmd _ _) = methodNamesLCmd cmd +methodNamesStmt (BindStmt pat cmd _ _) = methodNamesLCmd cmd +methodNamesStmt (RecStmt stmts _ _ _ _) + = methodNamesStmts stmts `addOneFV` loopAName +methodNamesStmt (LetStmt b) = emptyFVs +methodNamesStmt (ParStmt ss) = emptyFVs + -- ParStmt can't occur in commands, but it's not convenient to error + -- here so we just do what's convenient +\end{code} + + +%************************************************************************ +%* * + Arithmetic sequences +%* * +%************************************************************************ + +\begin{code} +rnArithSeq (From expr) + = rnLExpr expr `thenM` \ (expr', fvExpr) -> + returnM (From expr', fvExpr) + +rnArithSeq (FromThen expr1 expr2) + = rnLExpr expr1 `thenM` \ (expr1', fvExpr1) -> + rnLExpr expr2 `thenM` \ (expr2', fvExpr2) -> + returnM (FromThen expr1' expr2', fvExpr1 `plusFV` fvExpr2) + +rnArithSeq (FromTo expr1 expr2) + = rnLExpr expr1 `thenM` \ (expr1', fvExpr1) -> + rnLExpr expr2 `thenM` \ (expr2', fvExpr2) -> + returnM (FromTo expr1' expr2', fvExpr1 `plusFV` fvExpr2) + +rnArithSeq (FromThenTo expr1 expr2 expr3) + = rnLExpr expr1 `thenM` \ (expr1', fvExpr1) -> + rnLExpr expr2 `thenM` \ (expr2', fvExpr2) -> + rnLExpr expr3 `thenM` \ (expr3', fvExpr3) -> + returnM (FromThenTo expr1' expr2' expr3', + plusFVs [fvExpr1, fvExpr2, fvExpr3]) +\end{code} + + +%************************************************************************ +%* * +\subsubsection{@Rbinds@s and @Rpats@s: in record expressions} +%* * +%************************************************************************ + +\begin{code} +rnRbinds str rbinds + = mappM_ field_dup_err dup_fields `thenM_` + mapFvRn rn_rbind rbinds `thenM` \ (rbinds', fvRbind) -> + returnM (rbinds', fvRbind) + where + (_, dup_fields) = removeDups cmpLocated [ f | (f,_) <- rbinds ] + + field_dup_err dups = mappM_ (\f -> addLocErr f (dupFieldErr str)) dups + + rn_rbind (field, expr) + = lookupLocatedGlobalOccRn field `thenM` \ fieldname -> + rnLExpr expr `thenM` \ (expr', fvExpr) -> + returnM ((fieldname, expr'), fvExpr `addOneFV` unLoc fieldname) +\end{code} + +%************************************************************************ +%* * + Template Haskell brackets +%* * +%************************************************************************ + +\begin{code} +rnBracket (VarBr n) = do { name <- lookupOccRn n + ; this_mod <- getModule + ; checkM (nameIsLocalOrFrom this_mod name) $ -- Reason: deprecation checking asumes the + do { loadHomeInterface msg name -- home interface is loaded, and this is the + ; return () } -- only way that is going to happen + ; returnM (VarBr name, unitFV name) } + where + msg = ptext SLIT("Need interface for Template Haskell quoted Name") + +rnBracket (ExpBr e) = do { (e', fvs) <- rnLExpr e + ; return (ExpBr e', fvs) } +rnBracket (PatBr p) = do { (p', fvs) <- rnLPat p + ; return (PatBr p', fvs) } +rnBracket (TypBr t) = do { (t', fvs) <- rnHsTypeFVs doc t + ; return (TypBr t', fvs) } + where + doc = ptext SLIT("In a Template-Haskell quoted type") +rnBracket (DecBr group) + = do { gbl_env <- getGblEnv + + ; let gbl_env1 = gbl_env { tcg_mod = thFAKE } + -- Note the thFAKE. The top-level names from the bracketed + -- declarations will go into the name cache, and we don't want them to + -- confuse the Names for the current module. + -- By using a pretend module, thFAKE, we keep them safely out of the way. 
+ + ; names <- getLocalDeclBinders gbl_env1 group + ; rdr_env' <- extendRdrEnvRn emptyGlobalRdrEnv names + -- Furthermore, the names in the bracket shouldn't conflict with + -- existing top-level names E.g. + -- foo = 1 + -- bar = [d| foo = 1|] + -- But both 'foo's get a LocalDef provenance, so we'd get a complaint unless + -- we start with an emptyGlobalRdrEnv + + ; setGblEnv (gbl_env { tcg_rdr_env = tcg_rdr_env gbl_env1 `plusOccEnv` rdr_env', + tcg_dus = emptyDUs }) $ do + -- Notice plusOccEnv, not plusGlobalRdrEnv. In this situation we want + -- to *shadow* top-level bindings. (See the 'foo' example above.) + -- If we don't shadow, we'll get an ambiguity complaint when we do + -- a lookupTopBndrRn (which uses lookupGreLocalRn) on the binder of the 'foo' + -- + -- Furthermore, arguably if the splice does define foo, that should hide + -- any foo's further out + -- + -- The emptyDUs is so that we just collect uses for this group alone + + { (tcg_env, group') <- rnSrcDecls group + -- Discard the tcg_env; it contains only extra info about fixity + ; return (DecBr group', allUses (tcg_dus tcg_env)) } } +\end{code} + +%************************************************************************ +%* * +\subsubsection{@Stmt@s: in @do@ expressions} +%* * +%************************************************************************ + +\begin{code} +rnStmts :: HsStmtContext Name -> [LStmt RdrName] + -> RnM (thing, FreeVars) + -> RnM (([LStmt Name], thing), FreeVars) + +rnStmts (MDoExpr _) = rnMDoStmts +rnStmts ctxt = rnNormalStmts ctxt + +rnNormalStmts :: HsStmtContext Name -> [LStmt RdrName] + -> RnM (thing, FreeVars) + -> RnM (([LStmt Name], thing), FreeVars) +-- Used for cases *other* than recursive mdo +-- Implements nested scopes + +rnNormalStmts ctxt [] thing_inside + = do { (thing, fvs) <- thing_inside + ; return (([],thing), fvs) } + +rnNormalStmts ctxt (L loc stmt : stmts) thing_inside + = do { ((stmt', (stmts', thing)), fvs) + <- rnStmt ctxt stmt $ + rnNormalStmts ctxt stmts thing_inside + ; return (((L loc stmt' : stmts'), thing), fvs) } + +rnStmt :: HsStmtContext Name -> Stmt RdrName + -> RnM (thing, FreeVars) + -> RnM ((Stmt Name, thing), FreeVars) + +rnStmt ctxt (ExprStmt expr _ _) thing_inside + = do { (expr', fv_expr) <- rnLExpr expr + ; (then_op, fvs1) <- lookupSyntaxName thenMName + ; (thing, fvs2) <- thing_inside + ; return ((ExprStmt expr' then_op placeHolderType, thing), + fv_expr `plusFV` fvs1 `plusFV` fvs2) } + +rnStmt ctxt (BindStmt pat expr _ _) thing_inside + = do { (expr', fv_expr) <- rnLExpr expr + -- The binders do not scope over the expression + ; (bind_op, fvs1) <- lookupSyntaxName bindMName + ; (fail_op, fvs2) <- lookupSyntaxName failMName + ; rnPatsAndThen (StmtCtxt ctxt) [pat] $ \ [pat'] -> do + { (thing, fvs3) <- thing_inside + ; return ((BindStmt pat' expr' bind_op fail_op, thing), + fv_expr `plusFV` fvs1 `plusFV` fvs2 `plusFV` fvs3) }} + -- fv_expr shouldn't really be filtered by the rnPatsAndThen + -- but it does not matter because the names are unique + +rnStmt ctxt (LetStmt binds) thing_inside + = do { checkErr (ok ctxt binds) + (badIpBinds (ptext SLIT("a parallel list comprehension:")) binds) + ; rnLocalBindsAndThen binds $ \ binds' -> do + { (thing, fvs) <- thing_inside + ; return ((LetStmt binds', thing), fvs) }} + where + -- We do not allow implicit-parameter bindings in a parallel + -- list comprehension. I'm not sure what it might mean. 
+ ok (ParStmtCtxt _) (HsIPBinds _) = False + ok _ _ = True + +rnStmt ctxt (RecStmt rec_stmts _ _ _ _) thing_inside + = bindLocatedLocalsRn doc (collectLStmtsBinders rec_stmts) $ \ bndrs -> + rn_rec_stmts bndrs rec_stmts `thenM` \ segs -> + thing_inside `thenM` \ (thing, fvs) -> + let + segs_w_fwd_refs = addFwdRefs segs + (ds, us, fs, rec_stmts') = unzip4 segs_w_fwd_refs + later_vars = nameSetToList (plusFVs ds `intersectNameSet` fvs) + fwd_vars = nameSetToList (plusFVs fs) + uses = plusFVs us + rec_stmt = RecStmt rec_stmts' later_vars fwd_vars [] emptyLHsBinds + in + returnM ((rec_stmt, thing), uses `plusFV` fvs) + where + doc = text "In a recursive do statement" + +rnStmt ctxt (ParStmt segs) thing_inside + = do { opt_GlasgowExts <- doptM Opt_GlasgowExts + ; checkM opt_GlasgowExts parStmtErr + ; orig_lcl_env <- getLocalRdrEnv + ; ((segs',thing), fvs) <- go orig_lcl_env [] segs + ; return ((ParStmt segs', thing), fvs) } + where +-- type ParSeg id = [([LStmt id], [id])] +-- go :: NameSet -> [ParSeg RdrName] +-- -> RnM (([ParSeg Name], thing), FreeVars) + + go orig_lcl_env bndrs [] + = do { let { (bndrs', dups) = removeDups cmpByOcc bndrs + ; inner_env = extendLocalRdrEnv orig_lcl_env bndrs' } + ; mappM dupErr dups + ; (thing, fvs) <- setLocalRdrEnv inner_env thing_inside + ; return (([], thing), fvs) } + + go orig_lcl_env bndrs_so_far ((stmts, _) : segs) + = do { ((stmts', (bndrs, segs', thing)), fvs) + <- rnNormalStmts par_ctxt stmts $ do + { -- Find the Names that are bound by stmts + lcl_env <- getLocalRdrEnv + ; let { rdr_bndrs = collectLStmtsBinders stmts + ; bndrs = map ( expectJust "rnStmt" + . lookupLocalRdrEnv lcl_env + . unLoc) rdr_bndrs + ; new_bndrs = nub bndrs ++ bndrs_so_far + -- The nub is because there might be shadowing + -- x <- e1; x <- e2 + -- So we'll look up (Unqual x) twice, getting + -- the second binding both times, which is the + } -- one we want + + -- Typecheck the thing inside, passing on all + -- the Names bound, but separately; revert the envt + ; ((segs', thing), fvs) <- setLocalRdrEnv orig_lcl_env $ + go orig_lcl_env new_bndrs segs + + -- Figure out which of the bound names are used + ; let used_bndrs = filter (`elemNameSet` fvs) bndrs + ; return ((used_bndrs, segs', thing), fvs) } + + ; let seg' = (stmts', bndrs) + ; return (((seg':segs'), thing), + delListFromNameSet fvs bndrs) } + + par_ctxt = ParStmtCtxt ctxt + + cmpByOcc n1 n2 = nameOccName n1 `compare` nameOccName n2 + dupErr vs = addErr (ptext SLIT("Duplicate binding in parallel list comprehension for:") + <+> quotes (ppr (head vs))) +\end{code} + + +%************************************************************************ +%* * +\subsubsection{mdo expressions} +%* * +%************************************************************************ + +\begin{code} +type FwdRefs = NameSet +type Segment stmts = (Defs, + Uses, -- May include defs + FwdRefs, -- A subset of uses that are + -- (a) used before they are bound in this segment, or + -- (b) used here, and bound in subsequent segments + stmts) -- Either Stmt or [Stmt] + + +---------------------------------------------------- +rnMDoStmts :: [LStmt RdrName] + -> RnM (thing, FreeVars) + -> RnM (([LStmt Name], thing), FreeVars) +rnMDoStmts stmts thing_inside + = -- Step1: bring all the binders of the mdo into scope + -- Remember that this also removes the binders from the + -- finally-returned free-vars + bindLocatedLocalsRn doc (collectLStmtsBinders stmts) $ \ bndrs -> + do { + -- Step 2: Rename each individual stmt, making a + -- singleton segment. 
At this stage the FwdRefs field + -- isn't finished: it's empty for all except a BindStmt + -- for which it's the fwd refs within the bind itself + -- (This set may not be empty, because we're in a recursive + -- context.) + segs <- rn_rec_stmts bndrs stmts + + ; (thing, fvs_later) <- thing_inside + + ; let + -- Step 3: Fill in the fwd refs. + -- The segments are all singletons, but their fwd-ref + -- field mentions all the things used by the segment + -- that are bound after their use + segs_w_fwd_refs = addFwdRefs segs + + -- Step 4: Group together the segments to make bigger segments + -- Invariant: in the result, no segment uses a variable + -- bound in a later segment + grouped_segs = glomSegments segs_w_fwd_refs + + -- Step 5: Turn the segments into Stmts + -- Use RecStmt when and only when there are fwd refs + -- Also gather up the uses from the end towards the + -- start, so we can tell the RecStmt which things are + -- used 'after' the RecStmt + (stmts', fvs) = segsToStmts grouped_segs fvs_later + + ; return ((stmts', thing), fvs) } + where + doc = text "In a recursive mdo-expression" + +--------------------------------------------- +rn_rec_stmts :: [Name] -> [LStmt RdrName] -> RnM [Segment (LStmt Name)] +rn_rec_stmts bndrs stmts = mappM (rn_rec_stmt bndrs) stmts `thenM` \ segs_s -> + returnM (concat segs_s) + +---------------------------------------------------- +rn_rec_stmt :: [Name] -> LStmt RdrName -> RnM [Segment (LStmt Name)] + -- Rename a Stmt that is inside a RecStmt (or mdo) + -- Assumes all binders are already in scope + -- Turns each stmt into a singleton Stmt + +rn_rec_stmt all_bndrs (L loc (ExprStmt expr _ _)) + = rnLExpr expr `thenM` \ (expr', fvs) -> + lookupSyntaxName thenMName `thenM` \ (then_op, fvs1) -> + returnM [(emptyNameSet, fvs `plusFV` fvs1, emptyNameSet, + L loc (ExprStmt expr' then_op placeHolderType))] + +rn_rec_stmt all_bndrs (L loc (BindStmt pat expr _ _)) + = rnLExpr expr `thenM` \ (expr', fv_expr) -> + rnLPat pat `thenM` \ (pat', fv_pat) -> + lookupSyntaxName bindMName `thenM` \ (bind_op, fvs1) -> + lookupSyntaxName failMName `thenM` \ (fail_op, fvs2) -> + let + bndrs = mkNameSet (collectPatBinders pat') + fvs = fv_expr `plusFV` fv_pat `plusFV` fvs1 `plusFV` fvs2 + in + returnM [(bndrs, fvs, bndrs `intersectNameSet` fvs, + L loc (BindStmt pat' expr' bind_op fail_op))] + +rn_rec_stmt all_bndrs (L loc (LetStmt binds@(HsIPBinds _))) + = do { addErr (badIpBinds (ptext SLIT("an mdo expression")) binds) + ; failM } + +rn_rec_stmt all_bndrs (L loc (LetStmt (HsValBinds binds))) + = rnValBinds (trimWith all_bndrs) binds `thenM` \ (binds', du_binds) -> + returnM [(duDefs du_binds, duUses du_binds, + emptyNameSet, L loc (LetStmt (HsValBinds binds')))] + +rn_rec_stmt all_bndrs (L loc (RecStmt stmts _ _ _ _)) -- Flatten Rec inside Rec + = rn_rec_stmts all_bndrs stmts + +rn_rec_stmt all_bndrs stmt@(L _ (ParStmt _)) -- Syntactically illegal in mdo + = pprPanic "rn_rec_stmt" (ppr stmt) + +--------------------------------------------- +addFwdRefs :: [Segment a] -> [Segment a] +-- So far the segments only have forward refs *within* the Stmt +-- (which happens for bind: x <- ...x...) 
+-- This function adds the cross-seg fwd ref info + +addFwdRefs pairs + = fst (foldr mk_seg ([], emptyNameSet) pairs) + where + mk_seg (defs, uses, fwds, stmts) (segs, later_defs) + = (new_seg : segs, all_defs) + where + new_seg = (defs, uses, new_fwds, stmts) + all_defs = later_defs `unionNameSets` defs + new_fwds = fwds `unionNameSets` (uses `intersectNameSet` later_defs) + -- Add the downstream fwd refs here + +---------------------------------------------------- +-- Glomming the singleton segments of an mdo into +-- minimal recursive groups. +-- +-- At first I thought this was just strongly connected components, but +-- there's an important constraint: the order of the stmts must not change. +-- +-- Consider +-- mdo { x <- ...y... +-- p <- z +-- y <- ...x... +-- q <- x +-- z <- y +-- r <- x } +-- +-- Here, the first stmt mention 'y', which is bound in the third. +-- But that means that the innocent second stmt (p <- z) gets caught +-- up in the recursion. And that in turn means that the binding for +-- 'z' has to be included... and so on. +-- +-- Start at the tail { r <- x } +-- Now add the next one { z <- y ; r <- x } +-- Now add one more { q <- x ; z <- y ; r <- x } +-- Now one more... but this time we have to group a bunch into rec +-- { rec { y <- ...x... ; q <- x ; z <- y } ; r <- x } +-- Now one more, which we can add on without a rec +-- { p <- z ; +-- rec { y <- ...x... ; q <- x ; z <- y } ; +-- r <- x } +-- Finally we add the last one; since it mentions y we have to +-- glom it togeher with the first two groups +-- { rec { x <- ...y...; p <- z ; y <- ...x... ; +-- q <- x ; z <- y } ; +-- r <- x } + +glomSegments :: [Segment (LStmt Name)] -> [Segment [LStmt Name]] + +glomSegments [] = [] +glomSegments ((defs,uses,fwds,stmt) : segs) + -- Actually stmts will always be a singleton + = (seg_defs, seg_uses, seg_fwds, seg_stmts) : others + where + segs' = glomSegments segs + (extras, others) = grab uses segs' + (ds, us, fs, ss) = unzip4 extras + + seg_defs = plusFVs ds `plusFV` defs + seg_uses = plusFVs us `plusFV` uses + seg_fwds = plusFVs fs `plusFV` fwds + seg_stmts = stmt : concat ss + + grab :: NameSet -- The client + -> [Segment a] + -> ([Segment a], -- Needed by the 'client' + [Segment a]) -- Not needed by the client + -- The result is simply a split of the input + grab uses dus + = (reverse yeses, reverse noes) + where + (noes, yeses) = span not_needed (reverse dus) + not_needed (defs,_,_,_) = not (intersectsNameSet defs uses) + + +---------------------------------------------------- +segsToStmts :: [Segment [LStmt Name]] + -> FreeVars -- Free vars used 'later' + -> ([LStmt Name], FreeVars) + +segsToStmts [] fvs_later = ([], fvs_later) +segsToStmts ((defs, uses, fwds, ss) : segs) fvs_later + = ASSERT( not (null ss) ) + (new_stmt : later_stmts, later_uses `plusFV` uses) + where + (later_stmts, later_uses) = segsToStmts segs fvs_later + new_stmt | non_rec = head ss + | otherwise = L (getLoc (head ss)) $ + RecStmt ss (nameSetToList used_later) (nameSetToList fwds) + [] emptyLHsBinds + where + non_rec = isSingleton ss && isEmptyNameSet fwds + used_later = defs `intersectNameSet` later_uses + -- The ones needed after the RecStmt +\end{code} + +%************************************************************************ +%* * +\subsubsection{breakpoint utils} +%* * +%************************************************************************ + +\begin{code} +#if defined(GHCI) && defined(BREAKPOINT) +mkBreakPointExpr :: [Name] -> RnM (HsExpr Name, FreeVars) +mkBreakPointExpr scope + = do 
sloc <- getSrcSpanM + undef <- lookupOccRn undefined_RDR + let inLoc = L sloc + lHsApp x y = inLoc (HsApp x y) + mkExpr fnName args = mkExpr' fnName (reverse args) + mkExpr' fnName [] = inLoc (HsVar fnName) + mkExpr' fnName (arg:args) + = lHsApp (mkExpr' fnName args) (inLoc arg) + expr = unLoc $ mkExpr breakpointJumpName [mkScopeArg scope, HsVar undef, HsLit msg] + mkScopeArg args + = unLoc $ mkExpr undef (map HsVar args) + msg = HsString (mkFastString (unpackFS (srcSpanFile sloc) ++ ":" ++ show (srcSpanStartLine sloc))) + return (expr, emptyFVs) +#endif +\end{code} + +%************************************************************************ +%* * +\subsubsection{Assertion utils} +%* * +%************************************************************************ + +\begin{code} +mkAssertErrorExpr :: RnM (HsExpr Name, FreeVars) +-- Return an expression for (assertError "Foo.hs:27") +mkAssertErrorExpr + = getSrcSpanM `thenM` \ sloc -> + let + expr = HsApp (L sloc (HsVar assertErrorName)) (L sloc (HsLit msg)) + msg = HsStringPrim (mkFastString (showSDoc (ppr sloc))) + in + returnM (expr, emptyFVs) +\end{code} + +%************************************************************************ +%* * +\subsubsection{Errors} +%* * +%************************************************************************ + +\begin{code} +patSynErr e = do { addErr (sep [ptext SLIT("Pattern syntax in expression context:"), + nest 4 (ppr e)]) + ; return (EWildPat, emptyFVs) } + +parStmtErr = addErr (ptext SLIT("Illegal parallel list comprehension: use -fglasgow-exts")) + +badIpBinds what binds + = hang (ptext SLIT("Implicit-parameter bindings illegal in") <+> what) + 2 (ppr binds) +\end{code} diff --git a/compiler/rename/RnExpr.lhs-boot b/compiler/rename/RnExpr.lhs-boot new file mode 100644 index 0000000000..b03f50a890 --- /dev/null +++ b/compiler/rename/RnExpr.lhs-boot @@ -0,0 +1,17 @@ +\begin{code}
+module RnExpr where
+import HsSyn
+import Name ( Name )
+import NameSet ( FreeVars )
+import RdrName ( RdrName )
+import TcRnTypes
+
+rnLExpr :: LHsExpr RdrName
+ -> RnM (LHsExpr Name, FreeVars)
+
+rnStmts :: forall thing.
+ HsStmtContext Name -> [LStmt RdrName]
+ -> RnM (thing, FreeVars)
+ -> RnM (([LStmt Name], thing), FreeVars)
+\end{code}
+
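The boot interface above exists to break an import cycle inside the renamer: RnSource.lhs (added later in this patch) imports rnLExpr with "import {-# SOURCE #-} RnExpr( rnLExpr )", and that SOURCE import is resolved against this stub rather than against the full RnExpr module. As a rough sketch of the general hs-boot pattern, using hypothetical modules A and B that are not part of this patch:

    -- A.hs-boot: a truncated interface, playing the role of RnExpr.lhs-boot
    module A where
    data T
    f :: T -> Int

    -- B.hs: breaks the A <-> B cycle by importing the boot interface
    module B where
    import {-# SOURCE #-} A ( T, f )
    g :: T -> Int
    g x = f x + 1

    -- A.hs: the real module; it may import B freely
    module A ( T(..), f, h ) where
    import B ( g )
    data T = MkT Int
    f :: T -> Int
    f (MkT n) = n
    h :: T -> Int
    h = g

GHC compiles A.hs-boot first, then B.hs against it, then A.hs, checking that the real A provides what the boot file promised.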
diff --git a/compiler/rename/RnHsSyn.lhs b/compiler/rename/RnHsSyn.lhs new file mode 100644 index 0000000000..6752218b29 --- /dev/null +++ b/compiler/rename/RnHsSyn.lhs @@ -0,0 +1,156 @@ +% +% (c) The AQUA Project, Glasgow University, 1996-1998 +% +\section[RnHsSyn]{Specialisations of the @HsSyn@ syntax for the renamer} + +\begin{code} +module RnHsSyn( + -- Names + charTyCon_name, listTyCon_name, parrTyCon_name, tupleTyCon_name, + extractHsTyVars, extractHsTyNames, extractHsTyNames_s, + extractFunDepNames, extractHsCtxtTyNames, extractHsPredTyNames, + + -- Free variables + hsSigsFVs, hsSigFVs, conDeclFVs, bangTyFVs, + + maybeGenericMatch + ) where + +#include "HsVersions.h" + +import HsSyn +import Class ( FunDep ) +import TysWiredIn ( tupleTyCon, listTyCon, parrTyCon, charTyCon ) +import Name ( Name, getName, isTyVarName ) +import NameSet +import BasicTypes ( Boxity ) +import SrcLoc ( Located(..), unLoc ) +\end{code} + +%************************************************************************ +%* * +\subsection{Free variables} +%* * +%************************************************************************ + +These free-variable finders returns tycons and classes too. + +\begin{code} +charTyCon_name, listTyCon_name, parrTyCon_name :: Name +charTyCon_name = getName charTyCon +listTyCon_name = getName listTyCon +parrTyCon_name = getName parrTyCon + +tupleTyCon_name :: Boxity -> Int -> Name +tupleTyCon_name boxity n = getName (tupleTyCon boxity n) + +extractHsTyVars :: LHsType Name -> NameSet +extractHsTyVars x = filterNameSet isTyVarName (extractHsTyNames x) + +extractFunDepNames :: FunDep Name -> NameSet +extractFunDepNames (ns1, ns2) = mkNameSet ns1 `unionNameSets` mkNameSet ns2 + +extractHsTyNames :: LHsType Name -> NameSet +extractHsTyNames ty + = getl ty + where + getl (L _ ty) = get ty + + get (HsAppTy ty1 ty2) = getl ty1 `unionNameSets` getl ty2 + get (HsListTy ty) = unitNameSet listTyCon_name `unionNameSets` getl ty + get (HsPArrTy ty) = unitNameSet parrTyCon_name `unionNameSets` getl ty + get (HsTupleTy con tys) = extractHsTyNames_s tys + get (HsFunTy ty1 ty2) = getl ty1 `unionNameSets` getl ty2 + get (HsPredTy p) = extractHsPredTyNames p + get (HsOpTy ty1 op ty2) = getl ty1 `unionNameSets` getl ty2 `unionNameSets` unitNameSet (unLoc op) + get (HsParTy ty) = getl ty + get (HsBangTy _ ty) = getl ty + get (HsNumTy n) = emptyNameSet + get (HsTyVar tv) = unitNameSet tv + get (HsSpliceTy _) = emptyNameSet -- Type splices mention no type variables + get (HsKindSig ty k) = getl ty + get (HsForAllTy _ tvs + ctxt ty) = (extractHsCtxtTyNames ctxt + `unionNameSets` getl ty) + `minusNameSet` + mkNameSet (hsLTyVarNames tvs) + +extractHsTyNames_s :: [LHsType Name] -> NameSet +extractHsTyNames_s tys = foldr (unionNameSets . extractHsTyNames) emptyNameSet tys + +extractHsCtxtTyNames :: LHsContext Name -> NameSet +extractHsCtxtTyNames (L _ ctxt) + = foldr (unionNameSets . extractHsPredTyNames . unLoc) emptyNameSet ctxt + +-- You don't import or export implicit parameters, +-- so don't mention the IP names +extractHsPredTyNames (HsClassP cls tys) + = unitNameSet cls `unionNameSets` extractHsTyNames_s tys +extractHsPredTyNames (HsIParam n ty) + = extractHsTyNames ty +\end{code} + + +%************************************************************************ +%* * +\subsection{Free variables of declarations} +%* * +%************************************************************************ + +Return the Names that must be in scope if we are to use this declaration. 
+In all cases this is set up for interface-file declarations: + - for class decls we ignore the bindings + - for instance decls likewise, plus the pragmas + - for rule decls, we ignore HsRules + - for data decls, we ignore derivings + + *** See "THE NAMING STORY" in HsDecls **** + +\begin{code} +---------------- +hsSigsFVs :: [LSig Name] -> FreeVars +hsSigsFVs sigs = plusFVs (map (hsSigFVs.unLoc) sigs) + +hsSigFVs (TypeSig v ty) = extractHsTyNames ty +hsSigFVs (SpecInstSig ty) = extractHsTyNames ty +hsSigFVs (SpecSig v ty inl) = extractHsTyNames ty +hsSigFVs other = emptyFVs + +---------------- +conDeclFVs (L _ (ConDecl { con_qvars = tyvars, con_cxt = context, + con_details = details, con_res = res_ty})) + = delFVs (map hsLTyVarName tyvars) $ + extractHsCtxtTyNames context `plusFV` + conDetailsFVs details `plusFV` + conResTyFVs res_ty + +conResTyFVs ResTyH98 = emptyFVs +conResTyFVs (ResTyGADT ty) = extractHsTyNames ty + +conDetailsFVs (PrefixCon btys) = plusFVs (map bangTyFVs btys) +conDetailsFVs (InfixCon bty1 bty2) = bangTyFVs bty1 `plusFV` bangTyFVs bty2 +conDetailsFVs (RecCon flds) = plusFVs [bangTyFVs bty | (_, bty) <- flds] + +bangTyFVs bty = extractHsTyNames (getBangType bty) +\end{code} + + +%************************************************************************ +%* * +\subsection{A few functions on generic defintions +%* * +%************************************************************************ + +These functions on generics are defined over Matches Name, which is +why they are here and not in HsMatches. + +\begin{code} +maybeGenericMatch :: LMatch Name -> Maybe (HsType Name, LMatch Name) + -- Tells whether a Match is for a generic definition + -- and extract the type from a generic match and put it at the front + +maybeGenericMatch (L loc (Match (L _ (TypePat (L _ ty)) : pats) sig_ty grhss)) + = Just (ty, L loc (Match pats sig_ty grhss)) + +maybeGenericMatch other_match = Nothing +\end{code} diff --git a/compiler/rename/RnNames.lhs b/compiler/rename/RnNames.lhs new file mode 100644 index 0000000000..654c101cd5 --- /dev/null +++ b/compiler/rename/RnNames.lhs @@ -0,0 +1,1138 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[RnNames]{Extracting imported and top-level names in scope} + +\begin{code} +module RnNames ( + rnImports, mkRdrEnvAndImports, importsFromLocalDecls, + rnExports, mkExportNameSet, + getLocalDeclBinders, extendRdrEnvRn, + reportUnusedNames, reportDeprecations + ) where + +#include "HsVersions.h" + +import DynFlags ( DynFlag(..), GhcMode(..) ) +import HsSyn ( IE(..), ieName, ImportDecl(..), LImportDecl, + ForeignDecl(..), HsGroup(..), HsValBinds(..), + Sig(..), collectHsBindLocatedBinders, tyClDeclNames, + LIE ) +import RnEnv +import IfaceEnv ( ifaceExportNames ) +import LoadIface ( loadSrcInterface ) +import TcRnMonad hiding (LIE) + +import FiniteMap +import PrelNames ( pRELUDE, isUnboundName, main_RDR_Unqual ) +import Module ( Module, moduleString, unitModuleEnv, + lookupModuleEnv, moduleEnvElts, foldModuleEnv ) +import Name ( Name, nameSrcLoc, nameOccName, nameModule, isWiredInName, + nameParent, nameParent_maybe, isExternalName, + isBuiltInSyntax ) +import NameSet +import NameEnv +import OccName ( srcDataName, isTcOcc, pprNonVarNameSpace, + occNameSpace, + OccEnv, mkOccEnv, lookupOccEnv, emptyOccEnv, + extendOccEnv ) +import HscTypes ( GenAvailInfo(..), AvailInfo, + HomePackageTable, PackageIfaceTable, + unQualInScope, + Deprecs(..), ModIface(..), Dependencies(..), + lookupIface, ExternalPackageState(..) 
+ ) +import Packages ( PackageIdH(..) ) +import RdrName ( RdrName, rdrNameOcc, setRdrNameSpace, + GlobalRdrEnv, mkGlobalRdrEnv, GlobalRdrElt(..), + emptyGlobalRdrEnv, plusGlobalRdrEnv, globalRdrEnvElts, + extendGlobalRdrEnv, lookupGlobalRdrEnv, unQualOK, lookupGRE_Name, + Provenance(..), ImportSpec(..), ImpDeclSpec(..), ImpItemSpec(..), + importSpecLoc, importSpecModule, isLocalGRE, pprNameProvenance ) +import Outputable +import Maybes ( isNothing, catMaybes, mapCatMaybes, seqMaybe, orElse ) +import SrcLoc ( Located(..), mkGeneralSrcSpan, + unLoc, noLoc, srcLocSpan, SrcSpan ) +import BasicTypes ( DeprecTxt ) +import DriverPhases ( isHsBoot ) +import Util ( notNull ) +import List ( partition ) +import IO ( openFile, IOMode(..) ) +\end{code} + + + +%************************************************************************ +%* * + rnImports +%* * +%************************************************************************ + +\begin{code} +rnImports :: [LImportDecl RdrName] -> RnM [LImportDecl Name] +rnImports imports + -- PROCESS IMPORT DECLS + -- Do the non {- SOURCE -} ones first, so that we get a helpful + -- warning for {- SOURCE -} ones that are unnecessary + = do this_mod <- getModule + implicit_prelude <- doptM Opt_ImplicitPrelude + let all_imports = mk_prel_imports this_mod implicit_prelude ++ imports + (source, ordinary) = partition is_source_import all_imports + is_source_import (L _ (ImportDecl _ is_boot _ _ _)) = is_boot + get_imports = rnImportDecl this_mod + + stuff1 <- mapM get_imports ordinary + stuff2 <- mapM get_imports source + return (stuff1 ++ stuff2) + where +-- NB: opt_NoImplicitPrelude is slightly different to import Prelude (); +-- because the former doesn't even look at Prelude.hi for instance +-- declarations, whereas the latter does. 
+ mk_prel_imports this_mod implicit_prelude + | this_mod == pRELUDE + || explicit_prelude_import + || not implicit_prelude + = [] + | otherwise = [preludeImportDecl] + explicit_prelude_import + = notNull [ () | L _ (ImportDecl mod _ _ _ _) <- imports, + unLoc mod == pRELUDE ] + +preludeImportDecl :: LImportDecl RdrName +preludeImportDecl + = L loc $ + ImportDecl (L loc pRELUDE) + False {- Not a boot interface -} + False {- Not qualified -} + Nothing {- No "as" -} + Nothing {- No import list -} + where + loc = mkGeneralSrcSpan FSLIT("Implicit import declaration") + +mkRdrEnvAndImports :: [LImportDecl Name] -> RnM (GlobalRdrEnv, ImportAvails) +mkRdrEnvAndImports imports + = do this_mod <- getModule + let get_imports = importsFromImportDecl this_mod + stuff <- mapM get_imports imports + let (imp_gbl_envs, imp_avails) = unzip stuff + gbl_env :: GlobalRdrEnv + gbl_env = foldr plusGlobalRdrEnv emptyGlobalRdrEnv imp_gbl_envs + + all_avails :: ImportAvails + all_avails = foldr plusImportAvails emptyImportAvails imp_avails + -- ALL DONE + return (gbl_env, all_avails) + +\end{code} + +\begin{code} +rnImportDecl :: Module + -> LImportDecl RdrName + -> RnM (LImportDecl Name) +rnImportDecl this_mod (L loc importDecl@(ImportDecl loc_imp_mod_name want_boot qual_only as_mod imp_details)) + = setSrcSpan loc $ + do iface <- loadSrcInterface doc imp_mod_name want_boot + let qual_mod_name = case as_mod of + Nothing -> imp_mod_name + Just another_name -> another_name + imp_spec = ImpDeclSpec { is_mod = imp_mod_name, is_qual = qual_only, + is_dloc = loc, is_as = qual_mod_name } + total_avails <- ifaceExportNames (mi_exports iface) + importDecl' <- rnImportDecl' iface imp_spec importDecl total_avails + return (L loc importDecl') + where imp_mod_name = unLoc loc_imp_mod_name + doc = ppr imp_mod_name <+> ptext SLIT("is directly imported") + +rnImportDecl' :: ModIface -> ImpDeclSpec -> ImportDecl RdrName -> NameSet -> RnM (ImportDecl Name) +rnImportDecl' iface decl_spec (ImportDecl mod_name want_boot qual_only as_mod Nothing) all_names + = return $ ImportDecl mod_name want_boot qual_only as_mod Nothing +rnImportDecl' iface decl_spec (ImportDecl mod_name want_boot qual_only as_mod (Just (want_hiding,import_items))) all_names + = do import_items_mbs <- mapM (srcSpanWrapper) import_items + let rn_import_items = concat . catMaybes $ import_items_mbs + return $ ImportDecl mod_name want_boot qual_only as_mod (Just (want_hiding,rn_import_items)) + where + srcSpanWrapper (L span ieRdr) + = setSrcSpan span $ + case get_item ieRdr of + Nothing + -> do addErr (badImportItemErr iface decl_spec ieRdr) + return Nothing + Just ieNames + -> return (Just [L span ie | ie <- ieNames]) + occ_env :: OccEnv Name -- Maps OccName to corresponding Name + occ_env = mkOccEnv [(nameOccName n, n) | n <- nameSetToList all_names] + -- This env will have entries for data constructors too, + -- they won't make any difference because naked entities like T + -- in an import list map to TcOccs, not VarOccs. + + sub_env :: NameEnv [Name] + sub_env = mkSubNameEnv all_names + + get_item :: IE RdrName -> Maybe [IE Name] + -- Empty result for a bad item. + -- Singleton result is typical case. + -- Can have two when we are hiding, and mention C which might be + -- both a class and a data constructor. 
+ get_item item@(IEModuleContents _) + = Nothing + get_item (IEThingAll tc) + = do name <- check_name tc + return [IEThingAll name] + get_item (IEThingAbs tc) + | want_hiding -- hiding ( C ) + -- Here the 'C' can be a data constructor + -- *or* a type/class, or even both + = case catMaybes [check_name tc, check_name (setRdrNameSpace tc srcDataName)] of + [] -> Nothing + names -> return [ IEThingAbs n | n <- names ] + | otherwise + = do name <- check_name tc + return [IEThingAbs name] + get_item (IEThingWith n ns) -- import (C (A,B)) + = do name <- check_name n + let env = mkOccEnv [(nameOccName s, s) | s <- subNames sub_env name] + mb_names = map (lookupOccEnv env . rdrNameOcc) ns + names <- sequence mb_names + return [IEThingWith name names] + get_item (IEVar n) + = do name <- check_name n + return [IEVar name] + + check_name :: RdrName -> Maybe Name + check_name rdrName + = lookupOccEnv occ_env (rdrNameOcc rdrName) + + +importsFromImportDecl :: Module + -> LImportDecl Name + -> RnM (GlobalRdrEnv, ImportAvails) + +importsFromImportDecl this_mod + (L loc (ImportDecl loc_imp_mod_name want_boot qual_only as_mod imp_details)) + = + setSrcSpan loc $ + + -- If there's an error in loadInterface, (e.g. interface + -- file not found) we get lots of spurious errors from 'filterImports' + let + imp_mod_name = unLoc loc_imp_mod_name + doc = ppr imp_mod_name <+> ptext SLIT("is directly imported") + in + loadSrcInterface doc imp_mod_name want_boot `thenM` \ iface -> + + -- Compiler sanity check: if the import didn't say + -- {-# SOURCE #-} we should not get a hi-boot file + WARN( not want_boot && mi_boot iface, ppr imp_mod_name ) + + -- Issue a user warning for a redundant {- SOURCE -} import + -- NB that we arrange to read all the ordinary imports before + -- any of the {- SOURCE -} imports + warnIf (want_boot && not (mi_boot iface)) + (warnRedundantSourceImport imp_mod_name) `thenM_` + + let + imp_mod = mi_module iface + deprecs = mi_deprecs iface + is_orph = mi_orphan iface + deps = mi_deps iface + + filtered_exports = filter not_this_mod (mi_exports iface) + not_this_mod (mod,_) = mod /= this_mod + -- If the module exports anything defined in this module, just ignore it. + -- Reason: otherwise it looks as if there are two local definition sites + -- for the thing, and an error gets reported. Easiest thing is just to + -- filter them out up front. This situation only arises if a module + -- imports itself, or another module that imported it. (Necessarily, + -- this invoves a loop.) + -- + -- Tiresome consequence: if you say + -- module A where + -- import B( AType ) + -- type AType = ... + -- + -- module B( AType ) where + -- import {-# SOURCE #-} A( AType ) + -- + -- then you'll get a 'B does not export AType' message. Oh well. 
+ + qual_mod_name = case as_mod of + Nothing -> imp_mod_name + Just another_name -> another_name + imp_spec = ImpDeclSpec { is_mod = imp_mod_name, is_qual = qual_only, + is_dloc = loc, is_as = qual_mod_name } + in + -- Get the total imports, and filter them according to the import list + ifaceExportNames filtered_exports `thenM` \ total_avails -> + filterImports iface imp_spec + imp_details total_avails `thenM` \ (avail_env, gbl_env) -> + + getDOpts `thenM` \ dflags -> + + let + -- Compute new transitive dependencies + + orphans | is_orph = ASSERT( not (imp_mod_name `elem` dep_orphs deps) ) + imp_mod_name : dep_orphs deps + | otherwise = dep_orphs deps + + (dependent_mods, dependent_pkgs) + = case mi_package iface of + HomePackage -> + -- Imported module is from the home package + -- Take its dependent modules and add imp_mod itself + -- Take its dependent packages unchanged + -- + -- NB: (dep_mods deps) might include a hi-boot file + -- for the module being compiled, CM. Do *not* filter + -- this out (as we used to), because when we've + -- finished dealing with the direct imports we want to + -- know if any of them depended on CM.hi-boot, in + -- which case we should do the hi-boot consistency + -- check. See LoadIface.loadHiBootInterface + ((imp_mod_name, want_boot) : dep_mods deps, dep_pkgs deps) + + ExtPackage pkg -> + -- Imported module is from another package + -- Dump the dependent modules + -- Add the package imp_mod comes from to the dependent packages + ASSERT2( not (pkg `elem` dep_pkgs deps), ppr pkg <+> ppr (dep_pkgs deps) ) + ([], pkg : dep_pkgs deps) + + -- True <=> import M () + import_all = case imp_details of + Just (is_hiding, ls) -> not is_hiding && null ls + other -> False + + -- unqual_avails is the Avails that are visible in *unqualified* form + -- We need to know this so we know what to export when we see + -- module M ( module P ) where ... + -- Then we must export whatever came from P unqualified. + imports = ImportAvails { + imp_env = unitModuleEnv qual_mod_name avail_env, + imp_mods = unitModuleEnv imp_mod (imp_mod, import_all, loc), + imp_orphs = orphans, + imp_dep_mods = mkModDeps dependent_mods, + imp_dep_pkgs = dependent_pkgs } + + in + -- Complain if we import a deprecated module + ifOptM Opt_WarnDeprecations ( + case deprecs of + DeprecAll txt -> addWarn (moduleDeprec imp_mod_name txt) + other -> returnM () + ) `thenM_` + + returnM (gbl_env, imports) + +warnRedundantSourceImport mod_name + = ptext SLIT("Unnecessary {- SOURCE -} in the import of module") + <+> quotes (ppr mod_name) +\end{code} + + +%************************************************************************ +%* * + importsFromLocalDecls +%* * +%************************************************************************ + +From the top-level declarations of this module produce + * the lexical environment + * the ImportAvails +created by its bindings. + +Complain about duplicate bindings + +\begin{code} +importsFromLocalDecls :: HsGroup RdrName -> RnM TcGblEnv +importsFromLocalDecls group + = do { gbl_env <- getGblEnv + + ; names <- getLocalDeclBinders gbl_env group + + ; implicit_prelude <- doptM Opt_ImplicitPrelude + ; let { + -- Optimisation: filter out names for built-in syntax + -- They just clutter up the environment (esp tuples), and the parser + -- will generate Exact RdrNames for them, so the cluttered + -- envt is no use. To avoid doing this filter all the time, + -- we use -fno-implicit-prelude as a clue that the filter is + -- worth while. 
Really, it's only useful for GHC.Base and GHC.Tuple. + -- + -- It's worth doing because it makes the environment smaller for + -- every module that imports the Prelude + -- + -- Note: don't filter the gbl_env (hence all_names, not filered_all_names + -- in defn of gres above). Stupid reason: when parsing + -- data type decls, the constructors start as Exact tycon-names, + -- and then get turned into data con names by zapping the name space; + -- but that stops them being Exact, so they get looked up. + -- Ditto in fixity decls; e.g. infix 5 : + -- Sigh. It doesn't matter because it only affects the Data.Tuple really. + -- The important thing is to trim down the exports. + filtered_names + | implicit_prelude = names + | otherwise = filter (not . isBuiltInSyntax) names ; + + ; this_mod = tcg_mod gbl_env + ; imports = emptyImportAvails { + imp_env = unitModuleEnv this_mod $ + mkNameSet filtered_names + } + } + + ; rdr_env' <- extendRdrEnvRn (tcg_rdr_env gbl_env) names + + ; returnM (gbl_env { tcg_rdr_env = rdr_env', + tcg_imports = imports `plusImportAvails` tcg_imports gbl_env }) + } + +extendRdrEnvRn :: GlobalRdrEnv -> [Name] -> RnM GlobalRdrEnv +-- Add the new locally-bound names one by one, checking for duplicates as +-- we do so. Remember that in Template Haskell the duplicates +-- might *already be* in the GlobalRdrEnv from higher up the module +extendRdrEnvRn rdr_env names + = foldlM add_local rdr_env names + where + add_local rdr_env name + | gres <- lookupGlobalRdrEnv rdr_env (nameOccName name) + , (dup_gre:_) <- filter isLocalGRE gres -- Check for existing *local* defns + = do { addDupDeclErr (gre_name dup_gre) name + ; return rdr_env } + | otherwise + = return (extendGlobalRdrEnv rdr_env new_gre) + where + new_gre = GRE {gre_name = name, gre_prov = LocalDef} +\end{code} + +@getLocalDeclBinders@ returns the names for an @HsDecl@. It's +used for source code. + + *** See "THE NAMING STORY" in HsDecls **** + +\begin{code} +getLocalDeclBinders :: TcGblEnv -> HsGroup RdrName -> RnM [Name] +getLocalDeclBinders gbl_env (HsGroup {hs_valds = ValBindsIn val_decls val_sigs, + hs_tyclds = tycl_decls, + hs_fords = foreign_decls }) + = do { tc_names_s <- mappM new_tc tycl_decls + ; val_names <- mappM new_simple val_bndrs + ; return (foldr (++) val_names tc_names_s) } + where + mod = tcg_mod gbl_env + is_hs_boot = isHsBoot (tcg_src gbl_env) ; + val_bndrs | is_hs_boot = sig_hs_bndrs + | otherwise = for_hs_bndrs ++ val_hs_bndrs + -- In a hs-boot file, the value binders come from the + -- *signatures*, and there should be no foreign binders + + new_simple rdr_name = newTopSrcBinder mod Nothing rdr_name + + sig_hs_bndrs = [nm | L _ (TypeSig nm _) <- val_sigs] + val_hs_bndrs = collectHsBindLocatedBinders val_decls + for_hs_bndrs = [nm | L _ (ForeignImport nm _ _ _) <- foreign_decls] + + new_tc tc_decl + = do { main_name <- newTopSrcBinder mod Nothing main_rdr + ; sub_names <- mappM (newTopSrcBinder mod (Just main_name)) sub_rdrs + ; return (main_name : sub_names) } + where + (main_rdr : sub_rdrs) = tyClDeclNames (unLoc tc_decl) +\end{code} + + +%************************************************************************ +%* * +\subsection{Filtering imports} +%* * +%************************************************************************ + +@filterImports@ takes the @ExportEnv@ telling what the imported module makes +available, and filters it through the import spec (if any). 
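For instance (an illustrative fragment, not part of this patch), given a module M that exports T(MkT), f and g:

    import M ( T(..), f )    -- keeps T, MkT and f; g is dropped
    import M hiding ( g )    -- keeps everything M exports except g
    import M                 -- no import spec: everything is kept

The first two forms reach the Just (want_hiding, items) equation below, with want_hiding False and True respectively; the bare import is the Nothing case.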
+ +\begin{code} +filterImports :: ModIface + -> ImpDeclSpec -- The span for the entire import decl + -> Maybe (Bool, [LIE Name]) -- Import spec; True => hiding + -> NameSet -- What's available + -> RnM (NameSet, -- What's imported (qualified or unqualified) + GlobalRdrEnv) -- Same again, but in GRE form + + -- Complains if import spec mentions things that the module doesn't export + -- Warns/informs if import spec contains duplicates. + +mkGenericRdrEnv decl_spec names + = mkGlobalRdrEnv [ GRE { gre_name = name, gre_prov = Imported [imp_spec] } + | name <- nameSetToList names ] + where + imp_spec = ImpSpec { is_decl = decl_spec, is_item = ImpAll } + +filterImports iface decl_spec Nothing all_names + = return (all_names, mkGenericRdrEnv decl_spec all_names) + +filterImports iface decl_spec (Just (want_hiding, import_items)) all_names + = mapM (addLocM get_item) import_items >>= \gres_s -> + let gres = concat gres_s + specified_names = mkNameSet (map gre_name gres) + in if not want_hiding then + return (specified_names, mkGlobalRdrEnv gres) + else let keep n = not (n `elemNameSet` specified_names) + pruned_avails = filterNameSet keep all_names + in return (pruned_avails, mkGenericRdrEnv decl_spec pruned_avails) + where + sub_env :: NameEnv [Name] -- Classify each name by its parent + sub_env = mkSubNameEnv all_names + + succeed_with :: Bool -> [Name] -> RnM [GlobalRdrElt] + succeed_with all_explicit names + = do { loc <- getSrcSpanM + ; returnM (map (mk_gre loc) names) } + where + mk_gre loc name = GRE { gre_name = name, + gre_prov = Imported [imp_spec] } + where + imp_spec = ImpSpec { is_decl = decl_spec, is_item = item_spec } + item_spec = ImpSome { is_explicit = explicit, is_iloc = loc } + explicit = all_explicit || isNothing (nameParent_maybe name) + + get_item :: IE Name -> RnM [GlobalRdrElt] + -- Empty result for a bad item. + -- Singleton result is typical case. + -- Can have two when we are hiding, and mention C which might be + -- both a class and a data constructor. + get_item item@(IEModuleContents _) + -- This case should be filtered out by 'rnImports'. + = panic "filterImports: IEModuleContents?" + + get_item (IEThingAll name) + = case subNames sub_env name of + [] -> -- This occurs when you import T(..), but + -- only export T abstractly. + do ifOptM Opt_WarnDodgyImports (addWarn (dodgyImportWarn name)) + succeed_with False [name] + names -> succeed_with False (name:names) + + get_item (IEThingAbs name) + = succeed_with True [name] + + get_item (IEThingWith name names) + = succeed_with True (name:names) + get_item (IEVar name) + = succeed_with True [name] + +\end{code} + + +%************************************************************************ +%* * +\subsection{Export list processing} +%* * +%************************************************************************ + +Processing the export list. + +You might think that we should record things that appear in the export +list as ``occurrences'' (using @addOccurrenceName@), but you'd be +wrong. We do check (here) that they are in scope, but there is no +need to slurp in their actual declaration (which is what +@addOccurrenceName@ forces). + +Indeed, doing so would big trouble when compiling @PrelBase@, because +it re-exports @GHC@, which includes @takeMVar#@, whose type includes +@ConcBase.StateAndSynchVar#@, and so on... 
+ +\begin{code} +type ExportAccum -- The type of the accumulating parameter of + -- the main worker function in rnExports + = ([Module], -- 'module M's seen so far + ExportOccMap, -- Tracks exported occurrence names + NameSet) -- The accumulated exported stuff +emptyExportAccum = ([], emptyOccEnv, emptyNameSet) + +type ExportOccMap = OccEnv (Name, IE RdrName) + -- Tracks what a particular exported OccName + -- in an export list refers to, and which item + -- it came from. It's illegal to export two distinct things + -- that have the same occurrence name + +rnExports :: Maybe [LIE RdrName] + -> RnM (Maybe [LIE Name]) +rnExports Nothing = return Nothing +rnExports (Just exports) + = do TcGblEnv { tcg_imports = ImportAvails { imp_env = imp_env } } <- getGblEnv + let sub_env :: NameEnv [Name] -- Classify each name by its parent + sub_env = mkSubNameEnv (foldModuleEnv unionNameSets emptyNameSet imp_env) + rnExport (IEVar rdrName) + = do name <- lookupGlobalOccRn rdrName + return (IEVar name) + rnExport (IEThingAbs rdrName) + = do name <- lookupGlobalOccRn rdrName + return (IEThingAbs name) + rnExport (IEThingAll rdrName) + = do name <- lookupGlobalOccRn rdrName + return (IEThingAll name) + rnExport ie@(IEThingWith rdrName rdrNames) + = do name <- lookupGlobalOccRn rdrName + if isUnboundName name + then return (IEThingWith name []) + else do + let env = mkOccEnv [(nameOccName s, s) | s <- subNames sub_env name] + mb_names = map (lookupOccEnv env . rdrNameOcc) rdrNames + if any isNothing mb_names + then do addErr (exportItemErr ie) + return (IEThingWith name []) + else return (IEThingWith name (catMaybes mb_names)) + rnExport (IEModuleContents mod) + = return (IEModuleContents mod) + rn_exports <- mapM (wrapLocM rnExport) exports + return (Just rn_exports) + +mkExportNameSet :: Bool -- False => no 'module M(..) where' header at all + -> Maybe ([LIE Name], [LIE RdrName]) -- Nothing => no explicit export list + -> RnM NameSet + -- Complains if two distinct exports have same OccName + -- Warns about identical exports. + -- Complains about exports items not in scope + +mkExportNameSet explicit_mod exports + = do TcGblEnv { tcg_rdr_env = rdr_env, + tcg_imports = imports } <- getGblEnv + + -- If the module header is omitted altogether, then behave + -- as if the user had written "module Main(main) where..." + -- EXCEPT in interactive mode, when we behave as if he had + -- written "module Main where ..." 
+ -- Reason: don't want to complain about 'main' not in scope + -- in interactive mode + ghc_mode <- getGhcMode + real_exports <- case () of + () | explicit_mod + -> return exports + | ghc_mode == Interactive + -> return Nothing + | otherwise + -> do mainName <- lookupGlobalOccRn main_RDR_Unqual + return (Just ([noLoc (IEVar mainName)] + ,[noLoc (IEVar main_RDR_Unqual)])) + -- ToDo: the 'noLoc' here is unhelpful if 'main' turns out to be out of scope + exports_from_avail real_exports rdr_env imports + + +exports_from_avail Nothing rdr_env imports + = -- Export all locally-defined things + -- We do this by filtering the global RdrEnv, + -- keeping only things that are locally-defined + return (mkNameSet [ gre_name gre + | gre <- globalRdrEnvElts rdr_env, + isLocalGRE gre ]) + +exports_from_avail (Just (items,origItems)) rdr_env (ImportAvails { imp_env = imp_env }) + = do (_, _, exports) <- foldlM do_litem emptyExportAccum (zip items origItems) + return exports + where + sub_env :: NameEnv [Name] -- Classify each name by its parent + sub_env = mkSubNameEnv (foldModuleEnv unionNameSets emptyNameSet imp_env) + + do_litem :: ExportAccum -> (LIE Name, LIE RdrName) -> RnM ExportAccum + do_litem acc (ieName, ieRdr) + = addLocM (exports_from_item acc (unLoc ieRdr)) ieName + + exports_from_item :: ExportAccum -> IE RdrName -> IE Name -> RnM ExportAccum + exports_from_item acc@(mods, occs, exports) ieRdr@(IEModuleContents mod) ie + | mod `elem` mods -- Duplicate export of M + = do { warn_dup_exports <- doptM Opt_WarnDuplicateExports ; + warnIf warn_dup_exports (dupModuleExport mod) ; + returnM acc } + + | otherwise + = case lookupModuleEnv imp_env mod of + Nothing -> do addErr (modExportErr mod) + return acc + Just names + -> do let new_exports = filterNameSet (inScopeUnqual rdr_env) names + -- This check_occs not only finds conflicts between this item + -- and others, but also internally within this item. That is, + -- if 'M.x' is in scope in several ways, we'll have several + -- members of mod_avails with the same OccName. + occs' <- check_occs ieRdr occs (nameSetToList new_exports) + return (mod:mods, occs', exports `unionNameSets` new_exports) + + exports_from_item acc@(mods, occs, exports) ieRdr ie + = if isUnboundName (ieName ie) + then return acc -- Avoid error cascade + else let new_exports = filterAvail ie sub_env in + do -- checkErr (not (null (drop 1 new_exports))) (exportItemErr ie) + checkForDodgyExport ie new_exports + occs' <- check_occs ieRdr occs new_exports + return (mods, occs', addListToNameSet exports new_exports) + +------------------------------- +filterAvail :: IE Name -- Wanted + -> NameEnv [Name] -- Maps type/class names to their sub-names + -> [Name] + +filterAvail (IEVar n) subs = [n] +filterAvail (IEThingAbs n) subs = [n] +filterAvail (IEThingAll n) subs = n : subNames subs n +filterAvail (IEThingWith n ns) subs = n : ns +filterAvail (IEModuleContents _) _ = panic "filterAvail" + +subNames :: NameEnv [Name] -> Name -> [Name] +subNames env n = lookupNameEnv env n `orElse` [] + +mkSubNameEnv :: NameSet -> NameEnv [Name] +-- Maps types and classes to their constructors/classops respectively +-- This mapping just makes it easier to deal with A(..) 
export items +mkSubNameEnv names + = foldNameSet add_name emptyNameEnv names + where + add_name name env + | Just parent <- nameParent_maybe name + = extendNameEnv_C (\ns _ -> name:ns) env parent [name] + | otherwise = env + +------------------------------- +inScopeUnqual :: GlobalRdrEnv -> Name -> Bool +-- Checks whether the Name is in scope unqualified, +-- regardless of whether it's ambiguous or not +inScopeUnqual env n = any unQualOK (lookupGRE_Name env n) + +------------------------------- +checkForDodgyExport :: IE Name -> [Name] -> RnM () +checkForDodgyExport ie@(IEThingAll tc) [n] + | isTcOcc (nameOccName n) = addWarn (dodgyExportWarn tc) + -- This occurs when you export T(..), but + -- only import T abstractly, or T is a synonym. + -- The single [n] is the type or class itself + | otherwise = addErr (exportItemErr ie) + -- This happes if you export x(..), which is bogus +checkForDodgyExport _ _ = return () + +------------------------------- +check_occs :: IE RdrName -> ExportOccMap -> [Name] -> RnM ExportOccMap +check_occs ie occs names + = foldlM check occs names + where + check occs name + = case lookupOccEnv occs name_occ of + Nothing -> returnM (extendOccEnv occs name_occ (name, ie)) + + Just (name', ie') + | name == name' -- Duplicate export + -> do { warn_dup_exports <- doptM Opt_WarnDuplicateExports ; + warnIf warn_dup_exports (dupExportWarn name_occ ie ie') ; + returnM occs } + + | otherwise -- Same occ name but different names: an error + -> do { global_env <- getGlobalRdrEnv ; + addErr (exportClashErr global_env name name' ie ie') ; + returnM occs } + where + name_occ = nameOccName name +\end{code} + +%********************************************************* +%* * + Deprecations +%* * +%********************************************************* + +\begin{code} +reportDeprecations :: TcGblEnv -> RnM () +reportDeprecations tcg_env + = ifOptM Opt_WarnDeprecations $ + do { (eps,hpt) <- getEpsAndHpt + -- By this time, typechecking is complete, + -- so the PIT is fully populated + ; mapM_ (check hpt (eps_PIT eps)) all_gres } + where + used_names = allUses (tcg_dus tcg_env) + -- Report on all deprecated uses; hence allUses + all_gres = globalRdrEnvElts (tcg_rdr_env tcg_env) + + check hpt pit (GRE {gre_name = name, gre_prov = Imported (imp_spec:_)}) + | name `elemNameSet` used_names + , Just deprec_txt <- lookupDeprec hpt pit name + = setSrcSpan (importSpecLoc imp_spec) $ + addWarn (sep [ptext SLIT("Deprecated use of") <+> + pprNonVarNameSpace (occNameSpace (nameOccName name)) <+> + quotes (ppr name), + (parens imp_msg) <> colon, + (ppr deprec_txt) ]) + where + name_mod = nameModule name + imp_mod = importSpecModule imp_spec + imp_msg = ptext SLIT("imported from") <+> ppr imp_mod <> extra + extra | imp_mod == name_mod = empty + | otherwise = ptext SLIT(", but defined in") <+> ppr name_mod + + check hpt pit ok_gre = returnM () -- Local, or not used, or not deprectated + -- The Imported pattern-match: don't deprecate locally defined names + -- For a start, we may be exporting a deprecated thing + -- Also we may use a deprecated thing in the defn of another + -- deprecated things. 
We may even use a deprecated thing in + -- the defn of a non-deprecated thing, when changing a module's + -- interface + +lookupDeprec :: HomePackageTable -> PackageIfaceTable + -> Name -> Maybe DeprecTxt +lookupDeprec hpt pit n + = case lookupIface hpt pit (nameModule n) of + Just iface -> mi_dep_fn iface n `seqMaybe` -- Bleat if the thing, *or + mi_dep_fn iface (nameParent n) -- its parent*, is deprec'd + Nothing + | isWiredInName n -> Nothing + -- We have not necessarily loaded the .hi file for a + -- wired-in name (yet), although we *could*. + -- And we never deprecate them + + | otherwise -> pprPanic "lookupDeprec" (ppr n) + -- By now all the interfaces should have been loaded + +gre_is_used :: NameSet -> GlobalRdrElt -> Bool +gre_is_used used_names gre = gre_name gre `elemNameSet` used_names +\end{code} + +%********************************************************* +%* * + Unused names +%* * +%********************************************************* + +\begin{code} +reportUnusedNames :: Maybe [LIE RdrName] -- Export list + -> TcGblEnv -> RnM () +reportUnusedNames export_decls gbl_env + = do { traceRn ((text "RUN") <+> (ppr (tcg_dus gbl_env))) + ; warnUnusedTopBinds unused_locals + ; warnUnusedModules unused_imp_mods + ; warnUnusedImports unused_imports + ; warnDuplicateImports defined_and_used + ; printMinimalImports minimal_imports } + where + used_names, all_used_names :: NameSet + used_names = findUses (tcg_dus gbl_env) emptyNameSet + -- NB: currently, if f x = g, we only treat 'g' as used if 'f' is used + -- Hence findUses + + all_used_names = used_names `unionNameSets` + mkNameSet (mapCatMaybes nameParent_maybe (nameSetToList used_names)) + -- A use of C implies a use of T, + -- if C was brought into scope by T(..) or T(C) + + -- Collect the defined names from the in-scope environment + defined_names :: [GlobalRdrElt] + defined_names = globalRdrEnvElts (tcg_rdr_env gbl_env) + + -- Note that defined_and_used, defined_but_not_used + -- are both [GRE]; that's why we need defined_and_used + -- rather than just all_used_names + defined_and_used, defined_but_not_used :: [GlobalRdrElt] + (defined_and_used, defined_but_not_used) + = partition (gre_is_used all_used_names) defined_names + + -- Filter out the ones that are + -- (a) defined in this module, and + -- (b) not defined by a 'deriving' clause + -- The latter have an Internal Name, so we can filter them out easily + unused_locals :: [GlobalRdrElt] + unused_locals = filter is_unused_local defined_but_not_used + is_unused_local :: GlobalRdrElt -> Bool + is_unused_local gre = isLocalGRE gre && isExternalName (gre_name gre) + + unused_imports :: [GlobalRdrElt] + unused_imports = filter unused_imp defined_but_not_used + unused_imp (GRE {gre_prov = Imported imp_specs}) + = not (all (module_unused . importSpecModule) imp_specs) + && or [exp | ImpSpec { is_item = ImpSome { is_explicit = exp } } <- imp_specs] + -- Don't complain about unused imports if we've already said the + -- entire import is unused + unused_imp other = False + + -- To figure out the minimal set of imports, start with the things + -- that are in scope (i.e. in gbl_env). Then just combine them + -- into a bunch of avails, so they are properly grouped + -- + -- BUG WARNING: this does not deal properly with qualified imports! 
+ minimal_imports :: FiniteMap Module AvailEnv + minimal_imports0 = foldr add_expall emptyFM expall_mods + minimal_imports1 = foldr add_name minimal_imports0 defined_and_used + minimal_imports = foldr add_inst_mod minimal_imports1 direct_import_mods + -- The last line makes sure that we retain all direct imports + -- even if we import nothing explicitly. + -- It's not necessarily redundant to import such modules. Consider + -- module This + -- import M () + -- + -- The import M() is not *necessarily* redundant, even if + -- we suck in no instance decls from M (e.g. it contains + -- no instance decls, or This contains no code). It may be + -- that we import M solely to ensure that M's orphan instance + -- decls (or those in its imports) are visible to people who + -- import This. Sigh. + -- There's really no good way to detect this, so the error message + -- in RnEnv.warnUnusedModules is weakened instead + + -- We've carefully preserved the provenance so that we can + -- construct minimal imports that import the name by (one of) + -- the same route(s) as the programmer originally did. + add_name (GRE {gre_name = n, gre_prov = Imported imp_specs}) acc + = addToFM_C plusAvailEnv acc (importSpecModule (head imp_specs)) + (unitAvailEnv (mk_avail n (nameParent_maybe n))) + add_name other acc + = acc + + -- Modules mentioned as 'module M' in the export list + expall_mods = case export_decls of + Nothing -> [] + Just es -> [m | L _ (IEModuleContents m) <- es] + + -- This is really bogus. The idea is that if we see 'module M' in + -- the export list we must retain the import decls that drive it + -- If we aren't careful we might see + -- module A( module M ) where + -- import M + -- import N + -- and suppose that N exports everything that M does. Then we + -- must not drop the import of M even though N brings it all into + -- scope. + -- + -- BUG WARNING: 'module M' exports aside, what if M.x is mentioned?! + -- + -- The reason that add_expall is bogus is that it doesn't take + -- qualified imports into account. But it's an improvement. + add_expall mod acc = addToFM_C plusAvailEnv acc mod emptyAvailEnv + + -- n is the name of the thing, p is the name of its parent + mk_avail n (Just p) = AvailTC p [p,n] + mk_avail n Nothing | isTcOcc (nameOccName n) = AvailTC n [n] + | otherwise = Avail n + + add_inst_mod (mod,_,_) acc + | mod `elemFM` acc = acc -- We import something already + | otherwise = addToFM acc mod emptyAvailEnv + where + -- Add an empty collection of imports for a module + -- from which we have sucked only instance decls + + imports = tcg_imports gbl_env + + direct_import_mods :: [(Module, Bool, SrcSpan)] + -- See the type of the imp_mods for this triple + direct_import_mods = moduleEnvElts (imp_mods imports) + + -- unused_imp_mods are the directly-imported modules + -- that are not mentioned in minimal_imports1 + -- [Note: not 'minimal_imports', because that includes directly-imported + -- modules even if we use nothing from them; see notes above] + -- + -- BUG WARNING: does not deal correctly with multiple imports of the same module + -- becuase direct_import_mods has only one entry per module + unused_imp_mods = [(mod,loc) | (mod,no_imp,loc) <- direct_import_mods, + not (mod `elemFM` minimal_imports1), + mod /= pRELUDE, + not no_imp] + -- The not no_imp part is not to complain about + -- import M (), which is an idiom for importing + -- instance declarations + + module_unused :: Module -> Bool + module_unused mod = any (((==) mod) . 
fst) unused_imp_mods + +--------------------- +warnDuplicateImports :: [GlobalRdrElt] -> RnM () +-- Given the GREs for names that are used, figure out which imports +-- could be omitted without changing the top-level environment. +-- +-- NB: Given import Foo( T ) +-- import qualified Foo +-- we do not report a duplicate import, even though Foo.T is brought +-- into scope by both, because there's nothing you can *omit* without +-- changing the top-level environment. So we complain only if it's +-- explicitly named in both imports or neither. +-- +-- Furthermore, we complain about Foo.T only if +-- there is no complaint about (unqualified) T + +warnDuplicateImports gres + = ifOptM Opt_WarnUnusedImports $ + sequenceM_ [ warn name pr + -- The 'head' picks the first offending group + -- for this particular name + | GRE { gre_name = name, gre_prov = Imported imps } <- gres + , pr <- redundants imps ] + where + warn name (red_imp, cov_imp) + = addWarnAt (importSpecLoc red_imp) + (vcat [ptext SLIT("Redundant import of:") <+> quotes pp_name, + ptext SLIT("It is also") <+> ppr cov_imp]) + where + pp_name | is_qual red_decl = ppr (is_as red_decl) <> dot <> ppr occ + | otherwise = ppr occ + occ = nameOccName name + red_decl = is_decl red_imp + + redundants :: [ImportSpec] -> [(ImportSpec,ImportSpec)] + -- The returned pair is (redundant-import, covering-import) + redundants imps + = [ (red_imp, cov_imp) + | red_imp <- imps + , cov_imp <- take 1 (filter (covers red_imp) imps) ] + + -- "red_imp" is a putative redundant import + -- "cov_imp" potentially covers it + -- This test decides whether red_imp could be dropped + -- + -- NOTE: currently the test does not warn about + -- import M( x ) + -- imoprt N( x ) + -- even if the same underlying 'x' is involved, because dropping + -- either import would change the qualified names in scope (M.x, N.x) + -- But if the qualified names aren't used, the import is indeed redundant + -- Sadly we don't know that. Oh well. 
+ covers red_imp@(ImpSpec { is_decl = red_decl, is_item = red_item }) + cov_imp@(ImpSpec { is_decl = cov_decl, is_item = cov_item }) + | red_loc == cov_loc + = False -- Ignore diagonal elements + | not (is_as red_decl == is_as cov_decl) + = False -- They bring into scope different qualified names + | not (is_qual red_decl) && is_qual cov_decl + = False -- Covering one doesn't bring unqualified name into scope + | red_selective + = not cov_selective -- Redundant one is selective and covering one isn't + || red_later -- Both are explicit; tie-break using red_later + | otherwise + = not cov_selective -- Neither import is selective + && (is_mod red_decl == is_mod cov_decl) -- They import the same module + && red_later -- Tie-break + where + red_loc = importSpecLoc red_imp + cov_loc = importSpecLoc cov_imp + red_later = red_loc > cov_loc + cov_selective = selectiveImpItem cov_item + red_selective = selectiveImpItem red_item + +selectiveImpItem :: ImpItemSpec -> Bool +selectiveImpItem ImpAll = False +selectiveImpItem (ImpSome {}) = True + +-- ToDo: deal with original imports with 'qualified' and 'as M' clauses +printMinimalImports :: FiniteMap Module AvailEnv -- Minimal imports + -> RnM () +printMinimalImports imps + = ifOptM Opt_D_dump_minimal_imports $ do { + + mod_ies <- mappM to_ies (fmToList imps) ; + this_mod <- getModule ; + rdr_env <- getGlobalRdrEnv ; + ioToTcRn (do { h <- openFile (mkFilename this_mod) WriteMode ; + printForUser h (unQualInScope rdr_env) + (vcat (map ppr_mod_ie mod_ies)) }) + } + where + mkFilename this_mod = moduleString this_mod ++ ".imports" + ppr_mod_ie (mod_name, ies) + | mod_name == pRELUDE + = empty + | null ies -- Nothing except instances comes from here + = ptext SLIT("import") <+> ppr mod_name <> ptext SLIT("() -- Instances only") + | otherwise + = ptext SLIT("import") <+> ppr mod_name <> + parens (fsep (punctuate comma (map ppr ies))) + + to_ies (mod, avail_env) = do ies <- mapM to_ie (availEnvElts avail_env) + returnM (mod, ies) + + to_ie :: AvailInfo -> RnM (IE Name) + -- The main trick here is that if we're importing all the constructors + -- we want to say "T(..)", but if we're importing only a subset we want + -- to say "T(A,B,C)". So we have to find out what the module exports. 
+ to_ie (Avail n) = returnM (IEVar n) + to_ie (AvailTC n [m]) = ASSERT( n==m ) + returnM (IEThingAbs n) + to_ie (AvailTC n ns) + = loadSrcInterface doc n_mod False `thenM` \ iface -> + case [xs | (m,as) <- mi_exports iface, + m == n_mod, + AvailTC x xs <- as, + x == nameOccName n] of + [xs] | all_used xs -> returnM (IEThingAll n) + | otherwise -> returnM (IEThingWith n (filter (/= n) ns)) + other -> pprTrace "to_ie" (ppr n <+> ppr n_mod <+> ppr other) $ + returnM (IEVar n) + where + all_used avail_occs = all (`elem` map nameOccName ns) avail_occs + doc = text "Compute minimal imports from" <+> ppr n + n_mod = nameModule n +\end{code} + + +%************************************************************************ +%* * +\subsection{Errors} +%* * +%************************************************************************ + +\begin{code} +badImportItemErr iface decl_spec ie + = sep [ptext SLIT("Module"), quotes (ppr (is_mod decl_spec)), source_import, + ptext SLIT("does not export"), quotes (ppr ie)] + where + source_import | mi_boot iface = ptext SLIT("(hi-boot interface)") + | otherwise = empty + +dodgyImportWarn item = dodgyMsg (ptext SLIT("import")) item +dodgyExportWarn item = dodgyMsg (ptext SLIT("export")) item + +dodgyMsg kind tc + = sep [ ptext SLIT("The") <+> kind <+> ptext SLIT("item") <+> quotes (ppr (IEThingAll tc)), + ptext SLIT("suggests that") <+> quotes (ppr tc) <+> ptext SLIT("has constructor or class methods"), + ptext SLIT("but it has none; it is a type synonym or abstract type or class") ] + +modExportErr mod + = hsep [ ptext SLIT("Unknown module in export list: module"), quotes (ppr mod)] + +exportItemErr export_item + = sep [ ptext SLIT("The export item") <+> quotes (ppr export_item), + ptext SLIT("attempts to export constructors or class methods that are not visible here") ] + +exportClashErr global_env name1 name2 ie1 ie2 + = vcat [ ptext SLIT("Conflicting exports for") <+> quotes (ppr occ) <> colon + , ppr_export ie1 name1 + , ppr_export ie2 name2 ] + where + occ = nameOccName name1 + ppr_export ie name = nest 2 (quotes (ppr ie) <+> ptext SLIT("exports") <+> + quotes (ppr name) <+> pprNameProvenance (get_gre name)) + + -- get_gre finds a GRE for the Name, so that we can show its provenance + get_gre name + = case lookupGRE_Name global_env name of + (gre:_) -> gre + [] -> pprPanic "exportClashErr" (ppr name) + +addDupDeclErr :: Name -> Name -> TcRn () +addDupDeclErr name_a name_b + = addErrAt (srcLocSpan loc2) $ + vcat [ptext SLIT("Multiple declarations of") <+> quotes (ppr name1), + ptext SLIT("Declared at:") <+> vcat [ppr (nameSrcLoc name1), ppr loc2]] + where + loc2 = nameSrcLoc name2 + (name1,name2) | nameSrcLoc name_a > nameSrcLoc name_b = (name_b,name_a) + | otherwise = (name_a,name_b) + -- Report the error at the later location + +dupExportWarn occ_name ie1 ie2 + = hsep [quotes (ppr occ_name), + ptext SLIT("is exported by"), quotes (ppr ie1), + ptext SLIT("and"), quotes (ppr ie2)] + +dupModuleExport mod + = hsep [ptext SLIT("Duplicate"), + quotes (ptext SLIT("Module") <+> ppr mod), + ptext SLIT("in export list")] + +moduleDeprec mod txt + = sep [ ptext SLIT("Module") <+> quotes (ppr mod) <+> ptext SLIT("is deprecated:"), + nest 4 (ppr txt) ] +\end{code} diff --git a/compiler/rename/RnSource.hi-boot-5 b/compiler/rename/RnSource.hi-boot-5 new file mode 100644 index 0000000000..1ec4d52522 --- /dev/null +++ b/compiler/rename/RnSource.hi-boot-5 @@ -0,0 +1,13 @@ +__interface RnSource 1 0 where +__export RnSource rnBindsAndThen rnBinds rnSrcDecls; + +1 rnBindsAndThen :: 
__forall [b] => [HsBinds.HsBindGroup RdrName.RdrName] + -> ([HsBinds.HsBindGroup Name.Name] + -> TcRnTypes.RnM (b, NameSet.FreeVars)) + -> TcRnTypes.RnM (b, NameSet.FreeVars) ; + +1 rnBinds :: [HsBinds.HsBindGroup RdrName.RdrName] + -> TcRnTypes.RnM ([HsBinds.HsBindGroup Name.Name], NameSet.DefUses) ; + +1 rnSrcDecls :: HsDecls.HsGroup RdrName.RdrName + -> TcRnTypes.RnM (TcRnTypes.TcGblEnv, HsDecls.HsGroup Name.Name) diff --git a/compiler/rename/RnSource.lhs b/compiler/rename/RnSource.lhs new file mode 100644 index 0000000000..9150440aee --- /dev/null +++ b/compiler/rename/RnSource.lhs @@ -0,0 +1,722 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[RnSource]{Main pass of renamer} + +\begin{code} +module RnSource ( + rnSrcDecls, addTcgDUs, + rnTyClDecls, checkModDeprec, + rnSplice, checkTH + ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} RnExpr( rnLExpr ) + +import HsSyn +import RdrName ( RdrName, isRdrDataCon, elemLocalRdrEnv, globalRdrEnvElts, + GlobalRdrElt(..), isLocalGRE ) +import RdrHsSyn ( extractGenericPatTyVars, extractHsRhoRdrTyVars ) +import RnHsSyn +import RnTypes ( rnLHsType, rnLHsTypes, rnHsSigType, rnHsTypeFVs, rnContext ) +import RnBinds ( rnTopBinds, rnMethodBinds, renameSigs ) +import RnEnv ( lookupLocalDataTcNames, + lookupLocatedTopBndrRn, lookupLocatedOccRn, + lookupOccRn, newLocalsRn, + bindLocatedLocalsFV, bindPatSigTyVarsFV, + bindTyVarsRn, extendTyVarEnvFVRn, + bindLocalNames, checkDupNames, mapFvRn + ) +import TcRnMonad + +import HscTypes ( FixityEnv, FixItem(..), + Deprecations, Deprecs(..), DeprecTxt, plusDeprecs ) +import Class ( FunDep ) +import Name ( Name, nameOccName ) +import NameSet +import NameEnv +import OccName ( occEnvElts ) +import Outputable +import SrcLoc ( Located(..), unLoc, getLoc, noLoc ) +import DynFlags ( DynFlag(..) ) +import Maybes ( seqMaybe ) +import Maybe ( isNothing ) +import BasicTypes ( Boxity(..) ) +\end{code} + +@rnSourceDecl@ `renames' declarations. +It simultaneously performs dependency analysis and precedence parsing. +It also does the following error checks: +\begin{enumerate} +\item +Checks that tyvars are used properly. This includes checking +for undefined tyvars, and tyvars in contexts that are ambiguous. +(Some of this checking has now been moved to module @TcMonoType@, +since we don't have functional dependency information at this point.) +\item +Checks that all variable occurences are defined. +\item +Checks the @(..)@ etc constraints in the export list. 
+\end{enumerate} + + +\begin{code} +rnSrcDecls :: HsGroup RdrName -> RnM (TcGblEnv, HsGroup Name) + +rnSrcDecls (HsGroup { hs_valds = val_decls, + hs_tyclds = tycl_decls, + hs_instds = inst_decls, + hs_fixds = fix_decls, + hs_depds = deprec_decls, + hs_fords = foreign_decls, + hs_defds = default_decls, + hs_ruleds = rule_decls }) + + = do { -- Deal with deprecations (returns only the extra deprecations) + deprecs <- rnSrcDeprecDecls deprec_decls ; + updGblEnv (\gbl -> gbl { tcg_deprecs = tcg_deprecs gbl `plusDeprecs` deprecs }) + $ do { + + -- Deal with top-level fixity decls + -- (returns the total new fixity env) + fix_env <- rnSrcFixityDeclsEnv fix_decls ; + rn_fix_decls <- rnSrcFixityDecls fix_decls ; + updGblEnv (\gbl -> gbl { tcg_fix_env = fix_env }) + $ do { + + -- Rename other declarations + traceRn (text "Start rnmono") ; + (rn_val_decls, bind_dus) <- rnTopBinds val_decls ; + traceRn (text "finish rnmono" <+> ppr rn_val_decls) ; + + -- You might think that we could build proper def/use information + -- for type and class declarations, but they can be involved + -- in mutual recursion across modules, and we only do the SCC + -- analysis for them in the type checker. + -- So we content ourselves with gathering uses only; that + -- means we'll only report a declaration as unused if it isn't + -- mentioned at all. Ah well. + (rn_tycl_decls, src_fvs1) + <- mapFvRn (wrapLocFstM rnTyClDecl) tycl_decls ; + (rn_inst_decls, src_fvs2) + <- mapFvRn (wrapLocFstM rnSrcInstDecl) inst_decls ; + (rn_rule_decls, src_fvs3) + <- mapFvRn (wrapLocFstM rnHsRuleDecl) rule_decls ; + (rn_foreign_decls, src_fvs4) + <- mapFvRn (wrapLocFstM rnHsForeignDecl) foreign_decls ; + (rn_default_decls, src_fvs5) + <- mapFvRn (wrapLocFstM rnDefaultDecl) default_decls ; + + let { + rn_group = HsGroup { hs_valds = rn_val_decls, + hs_tyclds = rn_tycl_decls, + hs_instds = rn_inst_decls, + hs_fixds = rn_fix_decls, + hs_depds = [], + hs_fords = rn_foreign_decls, + hs_defds = rn_default_decls, + hs_ruleds = rn_rule_decls } ; + + other_fvs = plusFVs [src_fvs1, src_fvs2, src_fvs3, + src_fvs4, src_fvs5] ; + src_dus = bind_dus `plusDU` usesOnly other_fvs + -- Note: src_dus will contain *uses* for locally-defined types + -- and classes, but no *defs* for them. (Because rnTyClDecl + -- returns only the uses.) This is a little + -- surprising but it doesn't actually matter at all. 
+ } ; + + traceRn (text "finish rnSrc" <+> ppr rn_group) ; + traceRn (text "finish Dus" <+> ppr src_dus ) ; + tcg_env <- getGblEnv ; + return (tcg_env `addTcgDUs` src_dus, rn_group) + }}} + +rnTyClDecls :: [LTyClDecl RdrName] -> RnM [LTyClDecl Name] +rnTyClDecls tycl_decls = do + (decls', fvs) <- mapFvRn (wrapLocFstM rnTyClDecl) tycl_decls + return decls' + +addTcgDUs :: TcGblEnv -> DefUses -> TcGblEnv +addTcgDUs tcg_env dus = tcg_env { tcg_dus = tcg_dus tcg_env `plusDU` dus } +\end{code} + + +%********************************************************* +%* * + Source-code fixity declarations +%* * +%********************************************************* + +\begin{code} +rnSrcFixityDecls :: [LFixitySig RdrName] -> RnM [LFixitySig Name] +rnSrcFixityDecls fix_decls + = do fix_decls <- mapM rnFixityDecl fix_decls + return (concat fix_decls) + +rnFixityDecl :: LFixitySig RdrName -> RnM [LFixitySig Name] +rnFixityDecl (L loc (FixitySig (L nameLoc rdr_name) fixity)) + = do names <- lookupLocalDataTcNames rdr_name + return [ L loc (FixitySig (L nameLoc name) fixity) + | name <- names ] + +rnSrcFixityDeclsEnv :: [LFixitySig RdrName] -> RnM FixityEnv +rnSrcFixityDeclsEnv fix_decls + = getGblEnv `thenM` \ gbl_env -> + foldlM rnFixityDeclEnv (tcg_fix_env gbl_env) + fix_decls `thenM` \ fix_env -> + traceRn (text "fixity env" <+> pprFixEnv fix_env) `thenM_` + returnM fix_env + +rnFixityDeclEnv :: FixityEnv -> LFixitySig RdrName -> RnM FixityEnv +rnFixityDeclEnv fix_env (L loc (FixitySig rdr_name fixity)) + = setSrcSpan loc $ + -- GHC extension: look up both the tycon and data con + -- for con-like things + -- If neither are in scope, report an error; otherwise + -- add both to the fixity env + addLocM lookupLocalDataTcNames rdr_name `thenM` \ names -> + foldlM add fix_env names + where + add fix_env name + = case lookupNameEnv fix_env name of + Just (FixItem _ _ loc') + -> addLocErr rdr_name (dupFixityDecl loc') `thenM_` + returnM fix_env + Nothing -> returnM (extendNameEnv fix_env name fix_item) + where + fix_item = FixItem (nameOccName name) fixity (getLoc rdr_name) + +pprFixEnv :: FixityEnv -> SDoc +pprFixEnv env + = pprWithCommas (\ (FixItem n f _) -> ppr f <+> ppr n) + (nameEnvElts env) + +dupFixityDecl loc rdr_name + = vcat [ptext SLIT("Multiple fixity declarations for") <+> quotes (ppr rdr_name), + ptext SLIT("also at ") <+> ppr loc + ] +\end{code} + + +%********************************************************* +%* * + Source-code deprecations declarations +%* * +%********************************************************* + +For deprecations, all we do is check that the names are in scope. +It's only imported deprecations, dealt with in RnIfaces, that we +gather them together. 
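Concretely, a source-level deprecation is a pragma attached to a name defined in the module. A small illustrative module (not part of this patch, only a sketch):

    module M ( oldFn, newFn ) where

    {-# DEPRECATED oldFn "use newFn instead" #-}

    oldFn :: Int -> Int
    oldFn = newFn

    newFn :: Int -> Int
    newFn = (+ 1)

rnSrcDeprecDecls below only has to resolve oldFn to its Name and record the message text; the warnings themselves are emitted at use sites in importing modules, by reportDeprecations earlier in this patch.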
+ +\begin{code} +rnSrcDeprecDecls :: [LDeprecDecl RdrName] -> RnM Deprecations +rnSrcDeprecDecls [] + = returnM NoDeprecs + +rnSrcDeprecDecls decls + = mappM (addLocM rn_deprec) decls `thenM` \ pairs_s -> + returnM (DeprecSome (mkNameEnv (concat pairs_s))) + where + rn_deprec (Deprecation rdr_name txt) + = lookupLocalDataTcNames rdr_name `thenM` \ names -> + returnM [(name, (nameOccName name, txt)) | name <- names] + +checkModDeprec :: Maybe DeprecTxt -> Deprecations +-- Check for a module deprecation; done once at top level +checkModDeprec Nothing = NoDeprecs +checkModDeprec (Just txt) = DeprecAll txt +\end{code} + +%********************************************************* +%* * +\subsection{Source code declarations} +%* * +%********************************************************* + +\begin{code} +rnDefaultDecl (DefaultDecl tys) + = mapFvRn (rnHsTypeFVs doc_str) tys `thenM` \ (tys', fvs) -> + returnM (DefaultDecl tys', fvs) + where + doc_str = text "In a `default' declaration" +\end{code} + +%********************************************************* +%* * +\subsection{Foreign declarations} +%* * +%********************************************************* + +\begin{code} +rnHsForeignDecl (ForeignImport name ty spec isDeprec) + = lookupLocatedTopBndrRn name `thenM` \ name' -> + rnHsTypeFVs (fo_decl_msg name) ty `thenM` \ (ty', fvs) -> + returnM (ForeignImport name' ty' spec isDeprec, fvs) + +rnHsForeignDecl (ForeignExport name ty spec isDeprec) + = lookupLocatedOccRn name `thenM` \ name' -> + rnHsTypeFVs (fo_decl_msg name) ty `thenM` \ (ty', fvs) -> + returnM (ForeignExport name' ty' spec isDeprec, fvs ) + -- NB: a foreign export is an *occurrence site* for name, so + -- we add it to the free-variable list. It might, for example, + -- be imported from another module + +fo_decl_msg name = ptext SLIT("In the foreign declaration for") <+> ppr name +\end{code} + + +%********************************************************* +%* * +\subsection{Instance declarations} +%* * +%********************************************************* + +\begin{code} +rnSrcInstDecl (InstDecl inst_ty mbinds uprags) + -- Used for both source and interface file decls + = rnHsSigType (text "an instance decl") inst_ty `thenM` \ inst_ty' -> + + -- Rename the bindings + -- The typechecker (not the renamer) checks that all + -- the bindings are for the right class + let + meth_doc = text "In the bindings in an instance declaration" + meth_names = collectHsBindLocatedBinders mbinds + (inst_tyvars, _, cls,_) = splitHsInstDeclTy (unLoc inst_ty') + in + checkDupNames meth_doc meth_names `thenM_` + extendTyVarEnvForMethodBinds inst_tyvars ( + -- (Slightly strangely) the forall-d tyvars scope over + -- the method bindings too + rnMethodBinds cls [] mbinds + ) `thenM` \ (mbinds', meth_fvs) -> + -- Rename the prags and signatures. + -- Note that the type variables are not in scope here, + -- so that instance Eq a => Eq (T a) where + -- {-# SPECIALISE instance Eq a => Eq (T [a]) #-} + -- works OK. 
+ -- + -- But the (unqualified) method names are in scope + let + binders = collectHsBindBinders mbinds' + ok_sig = okInstDclSig (mkNameSet binders) + in + bindLocalNames binders (renameSigs ok_sig uprags) `thenM` \ uprags' -> + + returnM (InstDecl inst_ty' mbinds' uprags', + meth_fvs `plusFV` hsSigsFVs uprags' + `plusFV` extractHsTyNames inst_ty') +\end{code} + +For the method bindings in class and instance decls, we extend the +type variable environment iff -fglasgow-exts + +\begin{code} +extendTyVarEnvForMethodBinds tyvars thing_inside + = doptM Opt_GlasgowExts `thenM` \ opt_GlasgowExts -> + if opt_GlasgowExts then + extendTyVarEnvFVRn (map hsLTyVarName tyvars) thing_inside + else + thing_inside +\end{code} + + +%********************************************************* +%* * +\subsection{Rules} +%* * +%********************************************************* + +\begin{code} +rnHsRuleDecl (HsRule rule_name act vars lhs fv_lhs rhs fv_rhs) + = bindPatSigTyVarsFV (collectRuleBndrSigTys vars) $ + + bindLocatedLocalsFV doc (map get_var vars) $ \ ids -> + mapFvRn rn_var (vars `zip` ids) `thenM` \ (vars', fv_vars) -> + + rnLExpr lhs `thenM` \ (lhs', fv_lhs') -> + rnLExpr rhs `thenM` \ (rhs', fv_rhs') -> + let + mb_bad = validRuleLhs ids lhs' + in + checkErr (isNothing mb_bad) + (badRuleLhsErr rule_name lhs' mb_bad) `thenM_` + let + bad_vars = [var | var <- ids, not (var `elemNameSet` fv_lhs')] + in + mappM (addErr . badRuleVar rule_name) bad_vars `thenM_` + returnM (HsRule rule_name act vars' lhs' fv_lhs' rhs' fv_rhs', + fv_vars `plusFV` fv_lhs' `plusFV` fv_rhs') + where + doc = text "In the transformation rule" <+> ftext rule_name + + get_var (RuleBndr v) = v + get_var (RuleBndrSig v _) = v + + rn_var (RuleBndr (L loc v), id) + = returnM (RuleBndr (L loc id), emptyFVs) + rn_var (RuleBndrSig (L loc v) t, id) + = rnHsTypeFVs doc t `thenM` \ (t', fvs) -> + returnM (RuleBndrSig (L loc id) t', fvs) +\end{code} + +Check the shape of a transformation rule LHS. Currently +we only allow LHSs of the form @(f e1 .. en)@, where @f@ is +not one of the @forall@'d variables. We also restrict the form of the LHS so +that it may be plausibly matched. Basically you only get to write ordinary +applications. (E.g. a case expression is not allowed: too elaborate.) + +NB: if you add new cases here, make sure you add new ones to TcRule.ruleLhsTvs + +\begin{code} +validRuleLhs :: [Name] -> LHsExpr Name -> Maybe (HsExpr Name) +-- Nothing => OK +-- Just e => Not ok, and e is the offending expression +validRuleLhs foralls lhs + = checkl lhs + where + checkl (L loc e) = check e + + check (OpApp e1 op _ e2) = checkl op `seqMaybe` checkl_e e1 `seqMaybe` checkl_e e2 + check (HsApp e1 e2) = checkl e1 `seqMaybe` checkl_e e2 + check (HsVar v) | v `notElem` foralls = Nothing + check other = Just other -- Failure + + checkl_e (L loc e) = check_e e + + check_e (HsVar v) = Nothing + check_e (HsPar e) = checkl_e e + check_e (HsLit e) = Nothing + check_e (HsOverLit e) = Nothing + + check_e (OpApp e1 op _ e2) = checkl_e e1 `seqMaybe` checkl_e op `seqMaybe` checkl_e e2 + check_e (HsApp e1 e2) = checkl_e e1 `seqMaybe` checkl_e e2 + check_e (NegApp e _) = checkl_e e + check_e (ExplicitList _ es) = checkl_es es + check_e (ExplicitTuple es _) = checkl_es es + check_e other = Just other -- Fails + + checkl_es es = foldr (seqMaybe . 
checkl_e) Nothing es + +badRuleLhsErr name lhs (Just bad_e) + = sep [ptext SLIT("Rule") <+> ftext name <> colon, + nest 4 (vcat [ptext SLIT("Illegal expression:") <+> ppr bad_e, + ptext SLIT("in left-hand side:") <+> ppr lhs])] + $$ + ptext SLIT("LHS must be of form (f e1 .. en) where f is not forall'd") + +badRuleVar name var + = sep [ptext SLIT("Rule") <+> doubleQuotes (ftext name) <> colon, + ptext SLIT("Forall'd variable") <+> quotes (ppr var) <+> + ptext SLIT("does not appear on left hand side")] +\end{code} + + +%********************************************************* +%* * +\subsection{Type, class and iface sig declarations} +%* * +%********************************************************* + +@rnTyDecl@ uses the `global name function' to create a new type +declaration in which local names have been replaced by their original +names, reporting any unknown names. + +Renaming type variables is a pain. Because they now contain uniques, +it is necessary to pass in an association list which maps a parsed +tyvar to its @Name@ representation. +In some cases (type signatures of values), +it is even necessary to go over the type first +in order to get the set of tyvars used by it, make an assoc list, +and then go over it again to rename the tyvars! +However, we can also do some scoping checks at the same time. + +\begin{code} +rnTyClDecl (ForeignType {tcdLName = name, tcdFoType = fo_type, tcdExtName = ext_name}) + = lookupLocatedTopBndrRn name `thenM` \ name' -> + returnM (ForeignType {tcdLName = name', tcdFoType = fo_type, tcdExtName = ext_name}, + emptyFVs) + +rnTyClDecl (TyData {tcdND = new_or_data, tcdCtxt = context, tcdLName = tycon, + tcdTyVars = tyvars, tcdCons = condecls, + tcdKindSig = sig, tcdDerivs = derivs}) + | is_vanilla -- Normal Haskell data type decl + = ASSERT( isNothing sig ) -- In normal H98 form, kind signature on the + -- data type is syntactically illegal + bindTyVarsRn data_doc tyvars $ \ tyvars' -> + do { tycon' <- lookupLocatedTopBndrRn tycon + ; context' <- rnContext data_doc context + ; (derivs', deriv_fvs) <- rn_derivs derivs + ; checkDupNames data_doc con_names + ; condecls' <- rnConDecls (unLoc tycon') condecls + ; returnM (TyData {tcdND = new_or_data, tcdCtxt = context', tcdLName = tycon', + tcdTyVars = tyvars', tcdKindSig = Nothing, tcdCons = condecls', + tcdDerivs = derivs'}, + delFVs (map hsLTyVarName tyvars') $ + extractHsCtxtTyNames context' `plusFV` + plusFVs (map conDeclFVs condecls') `plusFV` + deriv_fvs) } + + | otherwise -- GADT + = do { tycon' <- lookupLocatedTopBndrRn tycon + ; checkTc (null (unLoc context)) (badGadtStupidTheta tycon) + ; tyvars' <- bindTyVarsRn data_doc tyvars + (\ tyvars' -> return tyvars') + -- For GADTs, the type variables in the declaration + -- do not scope over the constructor signatures + -- data T a where { T1 :: forall b. 
b-> b } + ; (derivs', deriv_fvs) <- rn_derivs derivs + ; checkDupNames data_doc con_names + ; condecls' <- rnConDecls (unLoc tycon') condecls + ; returnM (TyData {tcdND = new_or_data, tcdCtxt = noLoc [], tcdLName = tycon', + tcdTyVars = tyvars', tcdCons = condecls', tcdKindSig = sig, + tcdDerivs = derivs'}, + plusFVs (map conDeclFVs condecls') `plusFV` deriv_fvs) } + + where + is_vanilla = case condecls of -- Yuk + [] -> True + L _ (ConDecl { con_res = ResTyH98 }) : _ -> True + other -> False + + data_doc = text "In the data type declaration for" <+> quotes (ppr tycon) + con_names = map con_names_helper condecls + + con_names_helper (L _ c) = con_name c + + rn_derivs Nothing = returnM (Nothing, emptyFVs) + rn_derivs (Just ds) = rnLHsTypes data_doc ds `thenM` \ ds' -> + returnM (Just ds', extractHsTyNames_s ds') + +rnTyClDecl (TySynonym {tcdLName = name, tcdTyVars = tyvars, tcdSynRhs = ty}) + = lookupLocatedTopBndrRn name `thenM` \ name' -> + bindTyVarsRn syn_doc tyvars $ \ tyvars' -> + rnHsTypeFVs syn_doc ty `thenM` \ (ty', fvs) -> + returnM (TySynonym {tcdLName = name', tcdTyVars = tyvars', + tcdSynRhs = ty'}, + delFVs (map hsLTyVarName tyvars') fvs) + where + syn_doc = text "In the declaration for type synonym" <+> quotes (ppr name) + +rnTyClDecl (ClassDecl {tcdCtxt = context, tcdLName = cname, + tcdTyVars = tyvars, tcdFDs = fds, tcdSigs = sigs, + tcdMeths = mbinds}) + = lookupLocatedTopBndrRn cname `thenM` \ cname' -> + + -- Tyvars scope over superclass context and method signatures + bindTyVarsRn cls_doc tyvars ( \ tyvars' -> + rnContext cls_doc context `thenM` \ context' -> + rnFds cls_doc fds `thenM` \ fds' -> + renameSigs okClsDclSig sigs `thenM` \ sigs' -> + returnM (tyvars', context', fds', sigs') + ) `thenM` \ (tyvars', context', fds', sigs') -> + + -- Check the signatures + -- First process the class op sigs (op_sigs), then the fixity sigs (non_op_sigs). + let + sig_rdr_names_w_locs = [op | L _ (TypeSig op _) <- sigs] + in + checkDupNames sig_doc sig_rdr_names_w_locs `thenM_` + -- Typechecker is responsible for checking that we only + -- give default-method bindings for things in this class. + -- The renamer *could* check this for class decls, but can't + -- for instance decls. + + -- The newLocals call is tiresome: given a generic class decl + -- class C a where + -- op :: a -> a + -- op {| x+y |} (Inl a) = ... + -- op {| x+y |} (Inr b) = ... + -- op {| a*b |} (a*b) = ... + -- we want to name both "x" tyvars with the same unique, so that they are + -- easy to group together in the typechecker. 
+ extendTyVarEnvForMethodBinds tyvars' ( + getLocalRdrEnv `thenM` \ name_env -> + let + meth_rdr_names_w_locs = collectHsBindLocatedBinders mbinds + gen_rdr_tyvars_w_locs = + [ tv | tv <- extractGenericPatTyVars mbinds, + not (unLoc tv `elemLocalRdrEnv` name_env) ] + in + checkDupNames meth_doc meth_rdr_names_w_locs `thenM_` + newLocalsRn gen_rdr_tyvars_w_locs `thenM` \ gen_tyvars -> + rnMethodBinds (unLoc cname') gen_tyvars mbinds + ) `thenM` \ (mbinds', meth_fvs) -> + + returnM (ClassDecl { tcdCtxt = context', tcdLName = cname', tcdTyVars = tyvars', + tcdFDs = fds', tcdSigs = sigs', tcdMeths = mbinds'}, + delFVs (map hsLTyVarName tyvars') $ + extractHsCtxtTyNames context' `plusFV` + plusFVs (map extractFunDepNames (map unLoc fds')) `plusFV` + hsSigsFVs sigs' `plusFV` + meth_fvs) + where + meth_doc = text "In the default-methods for class" <+> ppr cname + cls_doc = text "In the declaration for class" <+> ppr cname + sig_doc = text "In the signatures for class" <+> ppr cname + +badGadtStupidTheta tycon + = vcat [ptext SLIT("No context is allowed on a GADT-style data declaration"), + ptext SLIT("(You can put a context on each contructor, though.)")] +\end{code} + +%********************************************************* +%* * +\subsection{Support code for type/data declarations} +%* * +%********************************************************* + +\begin{code} +rnConDecls :: Name -> [LConDecl RdrName] -> RnM [LConDecl Name] +rnConDecls tycon condecls + = mappM (wrapLocM rnConDecl) condecls + +rnConDecl :: ConDecl RdrName -> RnM (ConDecl Name) +rnConDecl (ConDecl name expl tvs cxt details res_ty) + = do { addLocM checkConName name + + ; new_name <- lookupLocatedTopBndrRn name + ; name_env <- getLocalRdrEnv + + -- For H98 syntax, the tvs are the existential ones + -- For GADT syntax, the tvs are all the quantified tyvars + -- Hence the 'filter' in the ResTyH98 case only + ; let not_in_scope = not . (`elemLocalRdrEnv` name_env) . 
unLoc + arg_tys = hsConArgs details + implicit_tvs = case res_ty of + ResTyH98 -> filter not_in_scope $ + get_rdr_tvs arg_tys + ResTyGADT ty -> get_rdr_tvs (ty : arg_tys) + tvs' = case expl of + Explicit -> tvs + Implicit -> userHsTyVarBndrs implicit_tvs + + ; bindTyVarsRn doc tvs' $ \new_tyvars -> do + { new_context <- rnContext doc cxt + ; new_details <- rnConDetails doc details + ; new_res_ty <- rnConResult doc res_ty + ; let rv = ConDecl new_name expl new_tyvars new_context new_details new_res_ty + ; traceRn (text "****** - autrijus" <> ppr rv) + ; return rv } } + where + doc = text "In the definition of data constructor" <+> quotes (ppr name) + get_rdr_tvs tys = extractHsRhoRdrTyVars cxt (noLoc (HsTupleTy Boxed tys)) + +rnConResult _ ResTyH98 = return ResTyH98 +rnConResult doc (ResTyGADT ty) = do + ty' <- rnHsSigType doc ty + return $ ResTyGADT ty' + +rnConDetails doc (PrefixCon tys) + = mappM (rnLHsType doc) tys `thenM` \ new_tys -> + returnM (PrefixCon new_tys) + +rnConDetails doc (InfixCon ty1 ty2) + = rnLHsType doc ty1 `thenM` \ new_ty1 -> + rnLHsType doc ty2 `thenM` \ new_ty2 -> + returnM (InfixCon new_ty1 new_ty2) + +rnConDetails doc (RecCon fields) + = checkDupNames doc field_names `thenM_` + mappM (rnField doc) fields `thenM` \ new_fields -> + returnM (RecCon new_fields) + where + field_names = [fld | (fld, _) <- fields] + +rnField doc (name, ty) + = lookupLocatedTopBndrRn name `thenM` \ new_name -> + rnLHsType doc ty `thenM` \ new_ty -> + returnM (new_name, new_ty) + +-- This data decl will parse OK +-- data T = a Int +-- treating "a" as the constructor. +-- It is really hard to make the parser spot this malformation. +-- So the renamer has to check that the constructor is legal +-- +-- We can get an operator as the constructor, even in the prefix form: +-- data T = :% Int Int +-- from interface files, which always print in prefix form + +checkConName name = checkErr (isRdrDataCon name) (badDataCon name) + +badDataCon name + = hsep [ptext SLIT("Illegal data constructor name"), quotes (ppr name)] +\end{code} + + +%********************************************************* +%* * +\subsection{Support code to rename types} +%* * +%********************************************************* + +\begin{code} +rnFds :: SDoc -> [Located (FunDep RdrName)] -> RnM [Located (FunDep Name)] + +rnFds doc fds + = mappM (wrapLocM rn_fds) fds + where + rn_fds (tys1, tys2) + = rnHsTyVars doc tys1 `thenM` \ tys1' -> + rnHsTyVars doc tys2 `thenM` \ tys2' -> + returnM (tys1', tys2') + +rnHsTyVars doc tvs = mappM (rnHsTyvar doc) tvs +rnHsTyvar doc tyvar = lookupOccRn tyvar +\end{code} + + +%********************************************************* +%* * + Splices +%* * +%********************************************************* + +Note [Splices] +~~~~~~~~~~~~~~ +Consider + f = ... + h = ...$(thing "f")... + +The splice can expand into literally anything, so when we do dependency +analysis we must assume that it might mention 'f'. So we simply treat +all locally-defined names as mentioned by any splice. This is terribly +brutal, but I don't see what else to do. For example, it'll mean +that every locally-defined thing will appear to be used, so no unused-binding +warnings. But if we miss the dependency, then we might typecheck 'h' before 'f', +and that will crash the type checker because 'f' isn't in scope. + +Currently, I'm not treating a splice as also mentioning every import, +which is a bit inconsistent -- but there are a lot of them. 
We might +thereby get some bogus unused-import warnings, but we won't crash the +type checker. Not very satisfactory really. + +\begin{code} +rnSplice :: HsSplice RdrName -> RnM (HsSplice Name, FreeVars) +rnSplice (HsSplice n expr) + = do { checkTH expr "splice" + ; loc <- getSrcSpanM + ; [n'] <- newLocalsRn [L loc n] + ; (expr', fvs) <- rnLExpr expr + + -- Ugh! See Note [Splices] above + ; lcl_rdr <- getLocalRdrEnv + ; gbl_rdr <- getGlobalRdrEnv + ; let gbl_names = mkNameSet [gre_name gre | gre <- globalRdrEnvElts gbl_rdr, + isLocalGRE gre] + lcl_names = mkNameSet (occEnvElts lcl_rdr) + + ; return (HsSplice n' expr', fvs `plusFV` lcl_names `plusFV` gbl_names) } + +#ifdef GHCI +checkTH e what = returnM () -- OK +#else +checkTH e what -- Raise an error in a stage-1 compiler + = addErr (vcat [ptext SLIT("Template Haskell") <+> text what <+> + ptext SLIT("illegal in a stage-1 compiler"), + nest 2 (ppr e)]) +#endif +\end{code} diff --git a/compiler/rename/RnTypes.lhs b/compiler/rename/RnTypes.lhs new file mode 100644 index 0000000000..d7d435ce97 --- /dev/null +++ b/compiler/rename/RnTypes.lhs @@ -0,0 +1,766 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[RnSource]{Main pass of renamer} + +\begin{code} +module RnTypes ( + -- Type related stuff + rnHsType, rnLHsType, rnLHsTypes, rnContext, + rnHsSigType, rnHsTypeFVs, + + -- Patterns and literals + rnLPat, rnPat, rnPatsAndThen, -- Here because it's not part + rnLit, rnOverLit, -- of any mutual recursion + + -- Precence related stuff + mkOpAppRn, mkNegAppRn, mkOpFormRn, + checkPrecMatch, checkSectionPrec, + + -- Error messages + dupFieldErr, patSigErr, checkTupSize + ) where + +import DynFlags ( DynFlag(Opt_WarnUnusedMatches, Opt_GlasgowExts) ) + +import HsSyn +import RdrHsSyn ( extractHsRhoRdrTyVars ) +import RnHsSyn ( extractHsTyNames, parrTyCon_name, tupleTyCon_name, + listTyCon_name + ) +import RnEnv ( lookupOccRn, lookupBndrRn, lookupSyntaxName, + lookupLocatedOccRn, lookupLocatedBndrRn, + lookupLocatedGlobalOccRn, bindTyVarsRn, + lookupFixityRn, lookupTyFixityRn, + mapFvRn, warnUnusedMatches, + newIPNameRn, bindPatSigTyVarsFV, bindLocatedLocalsFV ) +import TcRnMonad +import RdrName ( RdrName, elemLocalRdrEnv ) +import PrelNames ( eqClassName, integralClassName, geName, eqName, + negateName, minusName, lengthPName, indexPName, + plusIntegerName, fromIntegerName, timesIntegerName, + ratioDataConName, fromRationalName ) +import TypeRep ( funTyCon ) +import Constants ( mAX_TUPLE_SIZE ) +import Name ( Name ) +import SrcLoc ( SrcSpan, Located(..), unLoc, noLoc, combineLocs ) +import NameSet + +import Literal ( inIntRange, inCharRange ) +import BasicTypes ( compareFixity, funTyFixity, negateFixity, + Fixity(..), FixityDirection(..) ) +import ListSetOps ( removeDups ) +import Outputable + +#include "HsVersions.h" +\end{code} + +These type renamers are in a separate module, rather than in (say) RnSource, +to break several loop. + +%********************************************************* +%* * +\subsection{Renaming types} +%* * +%********************************************************* + +\begin{code} +rnHsTypeFVs :: SDoc -> LHsType RdrName -> RnM (LHsType Name, FreeVars) +rnHsTypeFVs doc_str ty + = rnLHsType doc_str ty `thenM` \ ty' -> + returnM (ty', extractHsTyNames ty') + +rnHsSigType :: SDoc -> LHsType RdrName -> RnM (LHsType Name) + -- rnHsSigType is used for source-language type signatures, + -- which use *implicit* universal quantification. 
+rnHsSigType doc_str ty + = rnLHsType (text "In the type signature for" <+> doc_str) ty +\end{code} + +rnHsType is here because we call it from loadInstDecl, and I didn't +want a gratuitous knot. + +\begin{code} +rnLHsType :: SDoc -> LHsType RdrName -> RnM (LHsType Name) +rnLHsType doc = wrapLocM (rnHsType doc) + +rnHsType :: SDoc -> HsType RdrName -> RnM (HsType Name) + +rnHsType doc (HsForAllTy Implicit _ ctxt ty) + -- Implicit quantifiction in source code (no kinds on tyvars) + -- Given the signature C => T we universally quantify + -- over FV(T) \ {in-scope-tyvars} + = getLocalRdrEnv `thenM` \ name_env -> + let + mentioned = extractHsRhoRdrTyVars ctxt ty + + -- Don't quantify over type variables that are in scope; + -- when GlasgowExts is off, there usually won't be any, except for + -- class signatures: + -- class C a where { op :: a -> a } + forall_tyvars = filter (not . (`elemLocalRdrEnv` name_env) . unLoc) mentioned + tyvar_bndrs = userHsTyVarBndrs forall_tyvars + in + rnForAll doc Implicit tyvar_bndrs ctxt ty + +rnHsType doc (HsForAllTy Explicit forall_tyvars ctxt tau) + -- Explicit quantification. + -- Check that the forall'd tyvars are actually + -- mentioned in the type, and produce a warning if not + = let + mentioned = map unLoc (extractHsRhoRdrTyVars ctxt tau) + forall_tyvar_names = hsLTyVarLocNames forall_tyvars + + -- Explicitly quantified but not mentioned in ctxt or tau + warn_guys = filter ((`notElem` mentioned) . unLoc) forall_tyvar_names + in + mappM_ (forAllWarn doc tau) warn_guys `thenM_` + rnForAll doc Explicit forall_tyvars ctxt tau + +rnHsType doc (HsTyVar tyvar) + = lookupOccRn tyvar `thenM` \ tyvar' -> + returnM (HsTyVar tyvar') + +rnHsType doc (HsOpTy ty1 (L loc op) ty2) + = setSrcSpan loc ( + lookupOccRn op `thenM` \ op' -> + let + l_op' = L loc op' + in + lookupTyFixityRn l_op' `thenM` \ fix -> + rnLHsType doc ty1 `thenM` \ ty1' -> + rnLHsType doc ty2 `thenM` \ ty2' -> + mkHsOpTyRn (\t1 t2 -> HsOpTy t1 l_op' t2) (ppr op') fix ty1' ty2' + ) + +rnHsType doc (HsParTy ty) + = rnLHsType doc ty `thenM` \ ty' -> + returnM (HsParTy ty') + +rnHsType doc (HsBangTy b ty) + = rnLHsType doc ty `thenM` \ ty' -> + returnM (HsBangTy b ty') + +rnHsType doc (HsNumTy i) + | i == 1 = returnM (HsNumTy i) + | otherwise = addErr err_msg `thenM_` returnM (HsNumTy i) + where + err_msg = ptext SLIT("Only unit numeric type pattern is valid") + + +rnHsType doc (HsFunTy ty1 ty2) + = rnLHsType doc ty1 `thenM` \ ty1' -> + -- Might find a for-all as the arg of a function type + rnLHsType doc ty2 `thenM` \ ty2' -> + -- Or as the result. This happens when reading Prelude.hi + -- when we find return :: forall m. Monad m -> forall a. a -> m a + + -- Check for fixity rearrangements + mkHsOpTyRn HsFunTy (ppr funTyCon) funTyFixity ty1' ty2' + +rnHsType doc (HsListTy ty) + = rnLHsType doc ty `thenM` \ ty' -> + returnM (HsListTy ty') + +rnHsType doc (HsKindSig ty k) + = rnLHsType doc ty `thenM` \ ty' -> + returnM (HsKindSig ty' k) + +rnHsType doc (HsPArrTy ty) + = rnLHsType doc ty `thenM` \ ty' -> + returnM (HsPArrTy ty') + +-- Unboxed tuples are allowed to have poly-typed arguments. These +-- sometimes crop up as a result of CPR worker-wrappering dictionaries. 
+rnHsType doc (HsTupleTy tup_con tys) + = mappM (rnLHsType doc) tys `thenM` \ tys' -> + returnM (HsTupleTy tup_con tys') + +rnHsType doc (HsAppTy ty1 ty2) + = rnLHsType doc ty1 `thenM` \ ty1' -> + rnLHsType doc ty2 `thenM` \ ty2' -> + returnM (HsAppTy ty1' ty2') + +rnHsType doc (HsPredTy pred) + = rnPred doc pred `thenM` \ pred' -> + returnM (HsPredTy pred') + +rnHsType doc (HsSpliceTy _) + = do { addErr (ptext SLIT("Type splices are not yet implemented")) + ; failM } + +rnLHsTypes doc tys = mappM (rnLHsType doc) tys +\end{code} + + +\begin{code} +rnForAll :: SDoc -> HsExplicitForAll -> [LHsTyVarBndr RdrName] + -> LHsContext RdrName -> LHsType RdrName -> RnM (HsType Name) + +rnForAll doc exp [] (L _ []) (L _ ty) = rnHsType doc ty + -- One reason for this case is that a type like Int# + -- starts off as (HsForAllTy Nothing [] Int), in case + -- there is some quantification. Now that we have quantified + -- and discovered there are no type variables, it's nicer to turn + -- it into plain Int. If it were Int# instead of Int, we'd actually + -- get an error, because the body of a genuine for-all is + -- of kind *. + +rnForAll doc exp forall_tyvars ctxt ty + = bindTyVarsRn doc forall_tyvars $ \ new_tyvars -> + rnContext doc ctxt `thenM` \ new_ctxt -> + rnLHsType doc ty `thenM` \ new_ty -> + returnM (HsForAllTy exp new_tyvars new_ctxt new_ty) + -- Retain the same implicit/explicit flag as before + -- so that we can later print it correctly +\end{code} + + +%************************************************************************ +%* * + Fixities and precedence parsing +%* * +%************************************************************************ + +@mkOpAppRn@ deals with operator fixities. The argument expressions +are assumed to be already correctly arranged. It needs the fixities +recorded in the OpApp nodes, because fixity info applies to the things +the programmer actually wrote, so you can't find it out from the Name. + +Furthermore, the second argument is guaranteed not to be another +operator application. Why? Because the parser parses all +operator appications left-associatively, EXCEPT negation, which +we need to handle specially. +Infix types are read in a *right-associative* way, so that + a `op` b `op` c +is always read in as + a `op` (b `op` c) + +mkHsOpTyRn rearranges where necessary. The two arguments +have already been renamed and rearranged. It's made rather tiresome +by the presence of ->, which is a separate syntactic construct. 
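To see the re-association in isolation, here is a small standalone sketch
(illustrative only: the toy Expr type and the made-up fixity table are not
GHC's; the real work is done by mkOpAppRn and mkHsOpTyRn below, driven by
BasicTypes.compareFixity):

\begin{verbatim}
-- Toy model: rotate a left-associated parse tree according to precedence.
data Assoc = LeftA | RightA | NonA deriving Eq
data Fix   = Fix Int Assoc
data Expr  = Var String | Op String Expr Expr deriving Show

fixity :: String -> Fix              -- made-up fixities for the example
fixity "*" = Fix 7 LeftA
fixity _   = Fix 6 LeftA

reassoc :: Expr -> Expr
reassoc (Op op2 (Op op1 e11 e12) e2)
  | assocRight (fixity op1) (fixity op2)    -- op2 binds more tightly:
  = Op op1 e11 (reassoc (Op op2 e12 e2))    -- rotate to e11 `op1` (e12 `op2` e2)
reassoc e = e

assocRight :: Fix -> Fix -> Bool
assocRight (Fix p1 a1) (Fix p2 a2)
  | p1 < p2   = True
  | p1 > p2   = False
  | otherwise = a1 == RightA && a2 == RightA   -- equal precedence: both right-assoc

-- reassoc (Op "*" (Op "+" (Var "a") (Var "b")) (Var "c"))
--   ==> Op "+" (Var "a") (Op "*" (Var "b") (Var "c")),  i.e.  a + (b * c)
\end{verbatim}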
+ +\begin{code} +--------------- +-- Building (ty1 `op1` (ty21 `op2` ty22)) +mkHsOpTyRn :: (LHsType Name -> LHsType Name -> HsType Name) + -> SDoc -> Fixity -> LHsType Name -> LHsType Name + -> RnM (HsType Name) + +mkHsOpTyRn mk1 pp_op1 fix1 ty1 (L loc2 (HsOpTy ty21 op2 ty22)) + = do { fix2 <- lookupTyFixityRn op2 + ; mk_hs_op_ty mk1 pp_op1 fix1 ty1 + (\t1 t2 -> HsOpTy t1 op2 t2) + (ppr op2) fix2 ty21 ty22 loc2 } + +mkHsOpTyRn mk1 pp_op1 fix1 ty1 ty2@(L loc2 (HsFunTy ty21 ty22)) + = mk_hs_op_ty mk1 pp_op1 fix1 ty1 + HsFunTy (ppr funTyCon) funTyFixity ty21 ty22 loc2 + +mkHsOpTyRn mk1 pp_op1 fix1 ty1 ty2 -- Default case, no rearrangment + = return (mk1 ty1 ty2) + +--------------- +mk_hs_op_ty :: (LHsType Name -> LHsType Name -> HsType Name) + -> SDoc -> Fixity -> LHsType Name + -> (LHsType Name -> LHsType Name -> HsType Name) + -> SDoc -> Fixity -> LHsType Name -> LHsType Name -> SrcSpan + -> RnM (HsType Name) +mk_hs_op_ty mk1 pp_op1 fix1 ty1 + mk2 pp_op2 fix2 ty21 ty22 loc2 + | nofix_error = do { addErr (precParseErr (quotes pp_op1,fix1) + (quotes pp_op2,fix2)) + ; return (mk1 ty1 (L loc2 (mk2 ty21 ty22))) } + | associate_right = return (mk1 ty1 (L loc2 (mk2 ty21 ty22))) + | otherwise = do { -- Rearrange to ((ty1 `op1` ty21) `op2` ty22) + new_ty <- mkHsOpTyRn mk1 pp_op1 fix1 ty1 ty21 + ; return (mk2 (noLoc new_ty) ty22) } + where + (nofix_error, associate_right) = compareFixity fix1 fix2 + + +--------------------------- +mkOpAppRn :: LHsExpr Name -- Left operand; already rearranged + -> LHsExpr Name -> Fixity -- Operator and fixity + -> LHsExpr Name -- Right operand (not an OpApp, but might + -- be a NegApp) + -> RnM (HsExpr Name) + +-- (e11 `op1` e12) `op2` e2 +mkOpAppRn e1@(L _ (OpApp e11 op1 fix1 e12)) op2 fix2 e2 + | nofix_error + = addErr (precParseErr (ppr_op op1,fix1) (ppr_op op2,fix2)) `thenM_` + returnM (OpApp e1 op2 fix2 e2) + + | associate_right + = mkOpAppRn e12 op2 fix2 e2 `thenM` \ new_e -> + returnM (OpApp e11 op1 fix1 (L loc' new_e)) + where + loc'= combineLocs e12 e2 + (nofix_error, associate_right) = compareFixity fix1 fix2 + +--------------------------- +-- (- neg_arg) `op` e2 +mkOpAppRn e1@(L _ (NegApp neg_arg neg_name)) op2 fix2 e2 + | nofix_error + = addErr (precParseErr (pp_prefix_minus,negateFixity) (ppr_op op2,fix2)) `thenM_` + returnM (OpApp e1 op2 fix2 e2) + + | associate_right + = mkOpAppRn neg_arg op2 fix2 e2 `thenM` \ new_e -> + returnM (NegApp (L loc' new_e) neg_name) + where + loc' = combineLocs neg_arg e2 + (nofix_error, associate_right) = compareFixity negateFixity fix2 + +--------------------------- +-- e1 `op` - neg_arg +mkOpAppRn e1 op1 fix1 e2@(L _ (NegApp neg_arg _)) -- NegApp can occur on the right + | not associate_right -- We *want* right association + = addErr (precParseErr (ppr_op op1, fix1) (pp_prefix_minus, negateFixity)) `thenM_` + returnM (OpApp e1 op1 fix1 e2) + where + (_, associate_right) = compareFixity fix1 negateFixity + +--------------------------- +-- Default case +mkOpAppRn e1 op fix e2 -- Default case, no rearrangment + = ASSERT2( right_op_ok fix (unLoc e2), + ppr e1 $$ text "---" $$ ppr op $$ text "---" $$ ppr fix $$ text "---" $$ ppr e2 + ) + returnM (OpApp e1 op fix e2) + +-- Parser left-associates everything, but +-- derived instances may have correctly-associated things to +-- in the right operarand. 
So we just check that the right operand is OK +right_op_ok fix1 (OpApp _ _ fix2 _) + = not error_please && associate_right + where + (error_please, associate_right) = compareFixity fix1 fix2 +right_op_ok fix1 other + = True + +-- Parser initially makes negation bind more tightly than any other operator +-- And "deriving" code should respect this (use HsPar if not) +mkNegAppRn :: LHsExpr id -> SyntaxExpr id -> RnM (HsExpr id) +mkNegAppRn neg_arg neg_name + = ASSERT( not_op_app (unLoc neg_arg) ) + returnM (NegApp neg_arg neg_name) + +not_op_app (OpApp _ _ _ _) = False +not_op_app other = True + +--------------------------- +mkOpFormRn :: LHsCmdTop Name -- Left operand; already rearranged + -> LHsExpr Name -> Fixity -- Operator and fixity + -> LHsCmdTop Name -- Right operand (not an infix) + -> RnM (HsCmd Name) + +-- (e11 `op1` e12) `op2` e2 +mkOpFormRn a1@(L loc (HsCmdTop (L _ (HsArrForm op1 (Just fix1) [a11,a12])) _ _ _)) + op2 fix2 a2 + | nofix_error + = addErr (precParseErr (ppr_op op1,fix1) (ppr_op op2,fix2)) `thenM_` + returnM (HsArrForm op2 (Just fix2) [a1, a2]) + + | associate_right + = mkOpFormRn a12 op2 fix2 a2 `thenM` \ new_c -> + returnM (HsArrForm op1 (Just fix1) + [a11, L loc (HsCmdTop (L loc new_c) [] placeHolderType [])]) + -- TODO: locs are wrong + where + (nofix_error, associate_right) = compareFixity fix1 fix2 + +-- Default case +mkOpFormRn arg1 op fix arg2 -- Default case, no rearrangment + = returnM (HsArrForm op (Just fix) [arg1, arg2]) + + +-------------------------------------- +mkConOpPatRn :: Located Name -> Fixity -> LPat Name -> LPat Name + -> RnM (Pat Name) + +mkConOpPatRn op2 fix2 p1@(L loc (ConPatIn op1 (InfixCon p11 p12))) p2 + = lookupFixityRn (unLoc op1) `thenM` \ fix1 -> + let + (nofix_error, associate_right) = compareFixity fix1 fix2 + in + if nofix_error then + addErr (precParseErr (ppr_op op1,fix1) (ppr_op op2,fix2)) `thenM_` + returnM (ConPatIn op2 (InfixCon p1 p2)) + else + if associate_right then + mkConOpPatRn op2 fix2 p12 p2 `thenM` \ new_p -> + returnM (ConPatIn op1 (InfixCon p11 (L loc new_p))) -- XXX loc right? + else + returnM (ConPatIn op2 (InfixCon p1 p2)) + +mkConOpPatRn op fix p1 p2 -- Default case, no rearrangment + = ASSERT( not_op_pat (unLoc p2) ) + returnM (ConPatIn op (InfixCon p1 p2)) + +not_op_pat (ConPatIn _ (InfixCon _ _)) = False +not_op_pat other = True + +-------------------------------------- +checkPrecMatch :: Bool -> Name -> MatchGroup Name -> RnM () + -- True indicates an infix lhs + -- See comments with rnExpr (OpApp ...) 
about "deriving" + +checkPrecMatch False fn match + = returnM () +checkPrecMatch True op (MatchGroup ms _) + = mapM_ check ms + where + check (L _ (Match (p1:p2:_) _ _)) + = checkPrec op (unLoc p1) False `thenM_` + checkPrec op (unLoc p2) True + + check _ = panic "checkPrecMatch" + +checkPrec op (ConPatIn op1 (InfixCon _ _)) right + = lookupFixityRn op `thenM` \ op_fix@(Fixity op_prec op_dir) -> + lookupFixityRn (unLoc op1) `thenM` \ op1_fix@(Fixity op1_prec op1_dir) -> + let + inf_ok = op1_prec > op_prec || + (op1_prec == op_prec && + (op1_dir == InfixR && op_dir == InfixR && right || + op1_dir == InfixL && op_dir == InfixL && not right)) + + info = (ppr_op op, op_fix) + info1 = (ppr_op op1, op1_fix) + (infol, infor) = if right then (info, info1) else (info1, info) + in + checkErr inf_ok (precParseErr infol infor) + +checkPrec op pat right + = returnM () + +-- Check precedence of (arg op) or (op arg) respectively +-- If arg is itself an operator application, then either +-- (a) its precedence must be higher than that of op +-- (b) its precedency & associativity must be the same as that of op +checkSectionPrec :: FixityDirection -> HsExpr RdrName + -> LHsExpr Name -> LHsExpr Name -> RnM () +checkSectionPrec direction section op arg + = case unLoc arg of + OpApp _ op fix _ -> go_for_it (ppr_op op) fix + NegApp _ _ -> go_for_it pp_prefix_minus negateFixity + other -> returnM () + where + L _ (HsVar op_name) = op + go_for_it pp_arg_op arg_fix@(Fixity arg_prec assoc) + = lookupFixityRn op_name `thenM` \ op_fix@(Fixity op_prec _) -> + checkErr (op_prec < arg_prec + || op_prec == arg_prec && direction == assoc) + (sectionPrecErr (ppr_op op_name, op_fix) + (pp_arg_op, arg_fix) section) +\end{code} + +Precedence-related error messages + +\begin{code} +precParseErr op1 op2 + = hang (ptext SLIT("precedence parsing error")) + 4 (hsep [ptext SLIT("cannot mix"), ppr_opfix op1, ptext SLIT("and"), + ppr_opfix op2, + ptext SLIT("in the same infix expression")]) + +sectionPrecErr op arg_op section + = vcat [ptext SLIT("The operator") <+> ppr_opfix op <+> ptext SLIT("of a section"), + nest 4 (ptext SLIT("must have lower precedence than the operand") <+> ppr_opfix arg_op), + nest 4 (ptext SLIT("in the section:") <+> quotes (ppr section))] + +pp_prefix_minus = ptext SLIT("prefix `-'") +ppr_op op = quotes (ppr op) -- Here, op can be a Name or a (Var n), where n is a Name +ppr_opfix (pp_op, fixity) = pp_op <+> brackets (ppr fixity) +\end{code} + +%********************************************************* +%* * +\subsection{Contexts and predicates} +%* * +%********************************************************* + +\begin{code} +rnContext :: SDoc -> LHsContext RdrName -> RnM (LHsContext Name) +rnContext doc = wrapLocM (rnContext' doc) + +rnContext' :: SDoc -> HsContext RdrName -> RnM (HsContext Name) +rnContext' doc ctxt = mappM (rnLPred doc) ctxt + +rnLPred :: SDoc -> LHsPred RdrName -> RnM (LHsPred Name) +rnLPred doc = wrapLocM (rnPred doc) + +rnPred doc (HsClassP clas tys) + = lookupOccRn clas `thenM` \ clas_name -> + rnLHsTypes doc tys `thenM` \ tys' -> + returnM (HsClassP clas_name tys') + +rnPred doc (HsIParam n ty) + = newIPNameRn n `thenM` \ name -> + rnLHsType doc ty `thenM` \ ty' -> + returnM (HsIParam name ty') +\end{code} + + +********************************************************* +* * +\subsection{Patterns} +* * +********************************************************* + +\begin{code} +rnPatsAndThen :: HsMatchContext Name + -> [LPat RdrName] + -> ([LPat Name] -> RnM (a, FreeVars)) + -> RnM (a, 
FreeVars) +-- Bring into scope all the binders and type variables +-- bound by the patterns; then rename the patterns; then +-- do the thing inside. +-- +-- Note that we do a single bindLocalsRn for all the +-- matches together, so that we spot the repeated variable in +-- f x x = 1 + +rnPatsAndThen ctxt pats thing_inside + = bindPatSigTyVarsFV pat_sig_tys $ + bindLocatedLocalsFV doc_pat bndrs $ \ new_bndrs -> + rnLPats pats `thenM` \ (pats', pat_fvs) -> + thing_inside pats' `thenM` \ (res, res_fvs) -> + + let + unused_binders = filter (not . (`elemNameSet` res_fvs)) new_bndrs + in + warnUnusedMatches unused_binders `thenM_` + returnM (res, res_fvs `plusFV` pat_fvs) + where + pat_sig_tys = collectSigTysFromPats pats + bndrs = collectLocatedPatsBinders pats + doc_pat = ptext SLIT("In") <+> pprMatchContext ctxt + +rnLPats :: [LPat RdrName] -> RnM ([LPat Name], FreeVars) +rnLPats ps = mapFvRn rnLPat ps + +rnLPat :: LPat RdrName -> RnM (LPat Name, FreeVars) +rnLPat = wrapLocFstM rnPat + +-- ----------------------------------------------------------------------------- +-- rnPat + +rnPat :: Pat RdrName -> RnM (Pat Name, FreeVars) + +rnPat (WildPat _) = returnM (WildPat placeHolderType, emptyFVs) + +rnPat (VarPat name) + = lookupBndrRn name `thenM` \ vname -> + returnM (VarPat vname, emptyFVs) + +rnPat (SigPatIn pat ty) + = doptM Opt_GlasgowExts `thenM` \ glaExts -> + + if glaExts + then rnLPat pat `thenM` \ (pat', fvs1) -> + rnHsTypeFVs doc ty `thenM` \ (ty', fvs2) -> + returnM (SigPatIn pat' ty', fvs1 `plusFV` fvs2) + + else addErr (patSigErr ty) `thenM_` + rnPat (unLoc pat) -- XXX shouldn't throw away the loc + where + doc = text "In a pattern type-signature" + +rnPat (LitPat lit) + = rnLit lit `thenM_` + returnM (LitPat lit, emptyFVs) + +rnPat (NPat lit mb_neg eq _) + = rnOverLit lit `thenM` \ (lit', fvs1) -> + (case mb_neg of + Nothing -> returnM (Nothing, emptyFVs) + Just _ -> lookupSyntaxName negateName `thenM` \ (neg, fvs) -> + returnM (Just neg, fvs) + ) `thenM` \ (mb_neg', fvs2) -> + lookupSyntaxName eqName `thenM` \ (eq', fvs3) -> + returnM (NPat lit' mb_neg' eq' placeHolderType, + fvs1 `plusFV` fvs2 `plusFV` fvs3 `addOneFV` eqClassName) + -- Needed to find equality on pattern + +rnPat (NPlusKPat name lit _ _) + = rnOverLit lit `thenM` \ (lit', fvs1) -> + lookupLocatedBndrRn name `thenM` \ name' -> + lookupSyntaxName minusName `thenM` \ (minus, fvs2) -> + lookupSyntaxName geName `thenM` \ (ge, fvs3) -> + returnM (NPlusKPat name' lit' ge minus, + fvs1 `plusFV` fvs2 `plusFV` fvs3 `addOneFV` integralClassName) + -- The Report says that n+k patterns must be in Integral + +rnPat (LazyPat pat) + = rnLPat pat `thenM` \ (pat', fvs) -> + returnM (LazyPat pat', fvs) + +rnPat (BangPat pat) + = rnLPat pat `thenM` \ (pat', fvs) -> + returnM (BangPat pat', fvs) + +rnPat (AsPat name pat) + = rnLPat pat `thenM` \ (pat', fvs) -> + lookupLocatedBndrRn name `thenM` \ vname -> + returnM (AsPat vname pat', fvs) + +rnPat (ConPatIn con stuff) = rnConPat con stuff + + +rnPat (ParPat pat) + = rnLPat pat `thenM` \ (pat', fvs) -> + returnM (ParPat pat', fvs) + +rnPat (ListPat pats _) + = rnLPats pats `thenM` \ (patslist, fvs) -> + returnM (ListPat patslist placeHolderType, fvs `addOneFV` listTyCon_name) + +rnPat (PArrPat pats _) + = rnLPats pats `thenM` \ (patslist, fvs) -> + returnM (PArrPat patslist placeHolderType, + fvs `plusFV` implicit_fvs `addOneFV` parrTyCon_name) + where + implicit_fvs = mkFVs [lengthPName, indexPName] + +rnPat (TuplePat pats boxed _) + = checkTupSize tup_size `thenM_` + rnLPats pats 
`thenM` \ (patslist, fvs) -> + returnM (TuplePat patslist boxed placeHolderType, + fvs `addOneFV` tycon_name) + where + tup_size = length pats + tycon_name = tupleTyCon_name boxed tup_size + +rnPat (TypePat name) = + rnHsTypeFVs (text "In a type pattern") name `thenM` \ (name', fvs) -> + returnM (TypePat name', fvs) + +-- ----------------------------------------------------------------------------- +-- rnConPat + +rnConPat con (PrefixCon pats) + = lookupLocatedOccRn con `thenM` \ con' -> + rnLPats pats `thenM` \ (pats', fvs) -> + returnM (ConPatIn con' (PrefixCon pats'), fvs `addOneFV` unLoc con') + +rnConPat con (RecCon rpats) + = lookupLocatedOccRn con `thenM` \ con' -> + rnRpats rpats `thenM` \ (rpats', fvs) -> + returnM (ConPatIn con' (RecCon rpats'), fvs `addOneFV` unLoc con') + +rnConPat con (InfixCon pat1 pat2) + = lookupLocatedOccRn con `thenM` \ con' -> + rnLPat pat1 `thenM` \ (pat1', fvs1) -> + rnLPat pat2 `thenM` \ (pat2', fvs2) -> + lookupFixityRn (unLoc con') `thenM` \ fixity -> + mkConOpPatRn con' fixity pat1' pat2' `thenM` \ pat' -> + returnM (pat', fvs1 `plusFV` fvs2 `addOneFV` unLoc con') + +-- ----------------------------------------------------------------------------- +-- rnRpats + +rnRpats :: [(Located RdrName, LPat RdrName)] + -> RnM ([(Located Name, LPat Name)], FreeVars) +rnRpats rpats + = mappM_ field_dup_err dup_fields `thenM_` + mapFvRn rn_rpat rpats `thenM` \ (rpats', fvs) -> + returnM (rpats', fvs) + where + (_, dup_fields) = removeDups compare [ unLoc f | (f,_) <- rpats ] + + field_dup_err dups = addErr (dupFieldErr "pattern" dups) + + rn_rpat (field, pat) + = lookupLocatedGlobalOccRn field `thenM` \ fieldname -> + rnLPat pat `thenM` \ (pat', fvs) -> + returnM ((fieldname, pat'), fvs `addOneFV` unLoc fieldname) + +\end{code} + + +%************************************************************************ +%* * +\subsubsection{Literals} +%* * +%************************************************************************ + +When literals occur we have to make sure +that the types and classes they involve +are made available. + +\begin{code} +rnLit :: HsLit -> RnM () +rnLit (HsChar c) = checkErr (inCharRange c) (bogusCharError c) +rnLit other = returnM () + +rnOverLit (HsIntegral i _) + = lookupSyntaxName fromIntegerName `thenM` \ (from_integer_name, fvs) -> + if inIntRange i then + returnM (HsIntegral i from_integer_name, fvs) + else let + extra_fvs = mkFVs [plusIntegerName, timesIntegerName] + -- Big integer literals are built, using + and *, + -- out of small integers (DsUtils.mkIntegerLit) + -- [NB: plusInteger, timesInteger aren't rebindable... + -- they are used to construct the argument to fromInteger, + -- which is the rebindable one.] + in + returnM (HsIntegral i from_integer_name, fvs `plusFV` extra_fvs) + +rnOverLit (HsFractional i _) + = lookupSyntaxName fromRationalName `thenM` \ (from_rat_name, fvs) -> + let + extra_fvs = mkFVs [ratioDataConName, plusIntegerName, timesIntegerName] + -- We have to make sure that the Ratio type is imported with + -- its constructor, because literals of type Ratio t are + -- built with that constructor. + -- The Rational type is needed too, but that will come in + -- as part of the type for fromRational. 
+ -- The plus/times integer operations may be needed to construct the numerator + -- and denominator (see DsUtils.mkIntegerLit) + in + returnM (HsFractional i from_rat_name, fvs `plusFV` extra_fvs) +\end{code} + + + +%********************************************************* +%* * +\subsection{Errors} +%* * +%********************************************************* + +\begin{code} +checkTupSize :: Int -> RnM () +checkTupSize tup_size + | tup_size <= mAX_TUPLE_SIZE + = returnM () + | otherwise + = addErr (sep [ptext SLIT("A") <+> int tup_size <> ptext SLIT("-tuple is too large for GHC"), + nest 2 (parens (ptext SLIT("max size is") <+> int mAX_TUPLE_SIZE)), + nest 2 (ptext SLIT("Workaround: use nested tuples or define a data type"))]) + +forAllWarn doc ty (L loc tyvar) + = ifOptM Opt_WarnUnusedMatches $ + setSrcSpan loc $ + addWarn (sep [ptext SLIT("The universally quantified type variable") <+> quotes (ppr tyvar), + nest 4 (ptext SLIT("does not appear in the type") <+> quotes (ppr ty))] + $$ + doc + ) + +bogusCharError c + = ptext SLIT("character literal out of range: '\\") <> char c <> char '\'' + +patSigErr ty + = (ptext SLIT("Illegal signature in pattern:") <+> ppr ty) + $$ nest 4 (ptext SLIT("Use -fglasgow-exts to permit it")) + +dupFieldErr str dup + = hsep [ptext SLIT("duplicate field name"), + quotes (ppr dup), + ptext SLIT("in record"), text str] +\end{code} diff --git a/compiler/rename/rename.tex b/compiler/rename/rename.tex new file mode 100644 index 0000000000..b3f8e1d770 --- /dev/null +++ b/compiler/rename/rename.tex @@ -0,0 +1,18 @@ +\documentstyle{report} +\input{lit-style} + +\begin{document} +\centerline{{\Large{rename}}} +\tableofcontents + +\input{Rename} % {Renaming and dependency analysis passes} +\input{RnSource} % {Main pass of renamer} +\input{RnMonad} % {The monad used by the renamer} +\input{RnEnv} % {Environment manipulation for the renamer monad} +\input{RnHsSyn} % {Specialisations of the @HsSyn@ syntax for the renamer} +\input{RnNames} % {Extracting imported and top-level names in scope} +\input{RnExpr} % {Renaming of expressions} +\input{RnBinds} % {Renaming and dependency analysis of bindings} +\input{RnIfaces} % {Cacheing and Renaming of Interfaces} + +\end{document} diff --git a/compiler/simplCore/CSE.lhs b/compiler/simplCore/CSE.lhs new file mode 100644 index 0000000000..2e8489a295 --- /dev/null +++ b/compiler/simplCore/CSE.lhs @@ -0,0 +1,290 @@ +% +% (c) The AQUA Project, Glasgow University, 1993-1998 +% +\section{Common subexpression} + +\begin{code} +module CSE ( + cseProgram + ) where + +#include "HsVersions.h" + +import DynFlags ( DynFlag(..), DynFlags ) +import Id ( Id, idType, idWorkerInfo ) +import IdInfo ( workerExists ) +import CoreUtils ( hashExpr, cheapEqExpr, exprIsBig, mkAltExpr, exprIsCheap ) +import DataCon ( isUnboxedTupleCon ) +import Type ( tyConAppArgs ) +import CoreSyn +import VarEnv +import CoreLint ( showPass, endPass ) +import Outputable +import Util ( mapAccumL, lengthExceeds ) +import UniqFM +\end{code} + + + Simple common sub-expression + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +When we see + x1 = C a b + x2 = C x1 b +we build up a reverse mapping: C a b -> x1 + C x1 b -> x2 +and apply that to the rest of the program. + +When we then see + y1 = C a b + y2 = C y1 b +we replace the C a b with x1. But then we *dont* want to +add x1 -> y1 to the mapping. Rather, we want the reverse, y1 -> x1 +so that a subsequent binding + y2 = C y1 b +will get transformed to C x1 b, and then to x2. 
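Spelling the example out (an illustrative trace; the real structures are the
CSEMap reverse mapping and the Id-to-Id substitution defined below):

\begin{verbatim}
	binding		reverse map gains	var->var subst gains
	x1 = C a b	C a b  -> x1		(nothing)
	x2 = C x1 b	C x1 b -> x2		(nothing)
	y1 = C a b	(hit: C a b -> x1)	y1 -> x1
	y2 = C y1 b	-- substitute first: C y1 b becomes C x1 b
			(hit: C x1 b -> x2)	y2 -> x2
\end{verbatim}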
+
+So we carry an extra var->var substitution which we apply *before* looking up in the
+reverse mapping.
+
+
+[Note: SHADOWING]
+~~~~~~~~~~~~~~~~~
+We have to be careful about shadowing.
+For example, consider
+	f = \x -> let y = x+x in
+	              h = \x -> x+x
+	          in ...
+
+Here we must *not* do CSE on the inner x+x!  The simplifier used to guarantee no
+shadowing, but it doesn't any more (it proved too hard), so we clone as we go.
+We can simply add clones to the substitution already described.
+
+However, we do NOT clone type variables.  It's just too hard, because then we need
+to run the substitution over types and IdInfo.  No no no.  Instead, if a shadowed
+type variable ever shows up we just throw away the whole CSE mapping (see addBinder
+below).
+
+(In fact, I think the simplifier does guarantee no-shadowing for type variables.)
+
+
+[Note: case binders 1]
+~~~~~~~~~~~~~~~~~~~~~~
+Consider
+
+	f = \x -> case x of wild {
+			(a:as) -> case a of wild1 {
+				    (p,q) -> ...(wild1:as)...
+
+Here, (wild1:as) is morally the same as (a:as) and hence equal to wild.
+But that's not quite obvious.  In general we want to keep it as (wild1:as),
+but for CSE purposes that's a bad idea.
+
+So we add the binding (wild1 -> a) to the extra var->var mapping.
+Notice this is exactly backwards to what the simplifier does, which is
+to try to replace uses of a with uses of wild1.
+
+[Note: case binders 2]
+~~~~~~~~~~~~~~~~~~~~~~
+Consider
+	case (h x) of y -> ...(h x)...
+
+We'd like to replace (h x) in the alternative by y.  But because of
+the preceding [Note: case binders 1], we only want to add the mapping
+	scrutinee -> case binder
+to the reverse CSE mapping if the scrutinee is a non-trivial expression.
+(If the scrutinee is a simple variable we want to add the mapping
+	case binder -> scrutinee
+to the substitution instead.)
+
+[Note: unboxed tuple case binders]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Consider
+	case f x of t { (# a,b #) ->
+	case ... of
+		True -> f x
+		False -> 0 }
+
+We must not replace (f x) by t, because t is an unboxed-tuple binder.
+Instead, we should replace (f x) by (# a,b #).  That is, the "reverse mapping" is
+	f x --> (# a,b #)
+That is why the CSEMap has pairs of expressions.
+
+
+%************************************************************************
+%*                                                                      *
+\section{Common subexpression}
+%*                                                                      *
+%************************************************************************
+
+\begin{code}
+cseProgram :: DynFlags -> [CoreBind] -> IO [CoreBind]
+
+cseProgram dflags binds
+  = do {
+	showPass dflags "Common sub-expression";
+	let { binds' = cseBinds emptyCSEnv binds };
+	endPass dflags "Common sub-expression" Opt_D_dump_cse binds'
+    }
+
+cseBinds :: CSEnv -> [CoreBind] -> [CoreBind]
+cseBinds env []     = []
+cseBinds env (b:bs) = (b':bs')
+		    where
+		      (env1, b') = cseBind  env  b
+		      bs'        = cseBinds env1 bs
+
+cseBind :: CSEnv -> CoreBind -> (CSEnv, CoreBind)
+cseBind env (NonRec b e) = let (env', (b',e')) = do_one env (b, e)
+			   in (env', NonRec b' e')
+cseBind env (Rec pairs)  = let (env', pairs') = mapAccumL do_one env pairs
+			   in (env', Rec pairs')
+
+
+do_one env (id, rhs)
+  = case lookupCSEnv env rhs' of
+	Just (Var other_id) -> (extendSubst env' id other_id,     (id', Var other_id))
+	Just other_expr     -> (env',                              (id', other_expr))
+	Nothing             -> (addCSEnvItem env' rhs' (Var id'),  (id', rhs'))
+  where
+    (env', id') = addBinder env id
+    rhs' | not (workerExists (idWorkerInfo id)) = cseExpr env' rhs
+
+	 -- Hack alert: don't do CSE on wrapper RHSs.
+	 -- Otherwise we find:
+	 --	$wf = h
+	 --	f = \x -> ...$wf...
+	 -- ===>
+	 --	f = \x -> ...h...
+	 -- But the WorkerInfo for f still says $wf, which is now dead!
+ | otherwise = rhs + + +tryForCSE :: CSEnv -> CoreExpr -> CoreExpr +tryForCSE env (Type t) = Type t +tryForCSE env expr = case lookupCSEnv env expr' of + Just smaller_expr -> smaller_expr + Nothing -> expr' + where + expr' = cseExpr env expr + +cseExpr :: CSEnv -> CoreExpr -> CoreExpr +cseExpr env (Type t) = Type t +cseExpr env (Lit lit) = Lit lit +cseExpr env (Var v) = Var (lookupSubst env v) +cseExpr env (App f a) = App (cseExpr env f) (tryForCSE env a) +cseExpr env (Note n e) = Note n (cseExpr env e) +cseExpr env (Lam b e) = let (env', b') = addBinder env b + in Lam b' (cseExpr env' e) +cseExpr env (Let bind e) = let (env', bind') = cseBind env bind + in Let bind' (cseExpr env' e) +cseExpr env (Case scrut bndr ty alts) = Case scrut' bndr' ty (cseAlts env' scrut' bndr bndr' alts) + where + scrut' = tryForCSE env scrut + (env', bndr') = addBinder env bndr + + +cseAlts env scrut' bndr bndr' [(DataAlt con, args, rhs)] + | isUnboxedTupleCon con + -- Unboxed tuples are special because the case binder isn't + -- a real values. See [Note: unboxed tuple case binders] + = [(DataAlt con, args', tryForCSE new_env rhs)] + where + (env', args') = addBinders env args + new_env | exprIsCheap scrut' = env' + | otherwise = extendCSEnv env' scrut' tup_value + tup_value = mkAltExpr (DataAlt con) args' (tyConAppArgs (idType bndr)) + +cseAlts env scrut' bndr bndr' alts + = map cse_alt alts + where + (con_target, alt_env) + = case scrut' of + Var v' -> (v', extendSubst env bndr v') -- See [Note: case binder 1] + -- map: bndr -> v' + + other -> (bndr', extendCSEnv env scrut' (Var bndr')) -- See [Note: case binder 2] + -- map: scrut' -> bndr' + + arg_tys = tyConAppArgs (idType bndr) + + cse_alt (DataAlt con, args, rhs) + | not (null args) + -- Don't try CSE if there are no args; it just increases the number + -- of live vars. E.g. + -- case x of { True -> ....True.... } + -- Don't replace True by x! 
+ -- Hence the 'null args', which also deal with literals and DEFAULT + = (DataAlt con, args', tryForCSE new_env rhs) + where + (env', args') = addBinders alt_env args + new_env = extendCSEnv env' (mkAltExpr (DataAlt con) args' arg_tys) + (Var con_target) + + cse_alt (con, args, rhs) + = (con, args', tryForCSE env' rhs) + where + (env', args') = addBinders alt_env args +\end{code} + + +%************************************************************************ +%* * +\section{The CSE envt} +%* * +%************************************************************************ + +\begin{code} +data CSEnv = CS CSEMap InScopeSet (IdEnv Id) + -- Simple substitution + +type CSEMap = UniqFM [(CoreExpr, CoreExpr)] -- This is the reverse mapping + -- It maps the hash-code of an expression e to list of (e,e') pairs + -- This means that it's good to replace e by e' + -- INVARIANT: The expr in the range has already been CSE'd + +emptyCSEnv = CS emptyUFM emptyInScopeSet emptyVarEnv + +lookupCSEnv :: CSEnv -> CoreExpr -> Maybe CoreExpr +lookupCSEnv (CS cs _ _) expr + = case lookupUFM cs (hashExpr expr) of + Nothing -> Nothing + Just pairs -> lookup_list pairs expr + +lookup_list :: [(CoreExpr,CoreExpr)] -> CoreExpr -> Maybe CoreExpr +lookup_list [] expr = Nothing +lookup_list ((e,e'):es) expr | cheapEqExpr e expr = Just e' + | otherwise = lookup_list es expr + +addCSEnvItem env expr expr' | exprIsBig expr = env + | otherwise = extendCSEnv env expr expr' + -- We don't try to CSE big expressions, because they are expensive to compare + -- (and are unlikely to be the same anyway) + +extendCSEnv (CS cs in_scope sub) expr expr' + = CS (addToUFM_C combine cs hash [(expr, expr')]) in_scope sub + where + hash = hashExpr expr + combine old new = WARN( result `lengthExceeds` 4, text "extendCSEnv: long list:" <+> ppr result ) + result + where + result = new ++ old + +lookupSubst (CS _ _ sub) x = case lookupVarEnv sub x of + Just y -> y + Nothing -> x + +extendSubst (CS cs in_scope sub) x y = CS cs in_scope (extendVarEnv sub x y) + +addBinder :: CSEnv -> Id -> (CSEnv, Id) +addBinder env@(CS cs in_scope sub) v + | not (v `elemInScopeSet` in_scope) = (CS cs (extendInScopeSet in_scope v) sub, v) + | isId v = (CS cs (extendInScopeSet in_scope v') (extendVarEnv sub v v'), v') + | not (isId v) = WARN( True, ppr v ) + (CS emptyUFM in_scope sub, v) + -- This last case is the unusual situation where we have shadowing of + -- a type variable; we have to discard the CSE mapping + -- See "IMPORTANT NOTE" at the top + where + v' = uniqAway in_scope v + +addBinders :: CSEnv -> [Id] -> (CSEnv, [Id]) +addBinders env vs = mapAccumL addBinder env vs +\end{code} diff --git a/compiler/simplCore/FloatIn.lhs b/compiler/simplCore/FloatIn.lhs new file mode 100644 index 0000000000..0e8edb5930 --- /dev/null +++ b/compiler/simplCore/FloatIn.lhs @@ -0,0 +1,464 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +%************************************************************************ +%* * +\section[FloatIn]{Floating Inwards pass} +%* * +%************************************************************************ + +The main purpose of @floatInwards@ is floating into branches of a +case, so that we don't allocate things, save them on the stack, and +then discover that they aren't needed in the chosen branch. + +\begin{code} +module FloatIn ( floatInwards ) where + +#include "HsVersions.h" + +import DynFlags ( DynFlags, DynFlag(..) 
) +import CoreSyn +import CoreUtils ( exprIsHNF, exprIsDupable ) +import CoreLint ( showPass, endPass ) +import CoreFVs ( CoreExprWithFVs, freeVars, freeVarsOf ) +import Id ( isOneShotBndr ) +import Var ( Id, idType ) +import Type ( isUnLiftedType ) +import VarSet +import Util ( zipEqual, zipWithEqual, count ) +import Outputable +\end{code} + +Top-level interface function, @floatInwards@. Note that we do not +actually float any bindings downwards from the top-level. + +\begin{code} +floatInwards :: DynFlags -> [CoreBind] -> IO [CoreBind] + +floatInwards dflags binds + = do { + showPass dflags "Float inwards"; + let { binds' = map fi_top_bind binds }; + endPass dflags "Float inwards" Opt_D_verbose_core2core binds' + {- no specific flag for dumping float-in -} + } + + where + fi_top_bind (NonRec binder rhs) + = NonRec binder (fiExpr [] (freeVars rhs)) + fi_top_bind (Rec pairs) + = Rec [ (b, fiExpr [] (freeVars rhs)) | (b, rhs) <- pairs ] +\end{code} + +%************************************************************************ +%* * +\subsection{Mail from Andr\'e [edited]} +%* * +%************************************************************************ + +{\em Will wrote: What??? I thought the idea was to float as far +inwards as possible, no matter what. This is dropping all bindings +every time it sees a lambda of any kind. Help! } + +You are assuming we DO DO full laziness AFTER floating inwards! We +have to [not float inside lambdas] if we don't. + +If we indeed do full laziness after the floating inwards (we could +check the compilation flags for that) then I agree we could be more +aggressive and do float inwards past lambdas. + +Actually we are not doing a proper full laziness (see below), which +was another reason for not floating inwards past a lambda. + +This can easily be fixed. The problem is that we float lets outwards, +but there are a few expressions which are not let bound, like case +scrutinees and case alternatives. After floating inwards the +simplifier could decide to inline the let and the laziness would be +lost, e.g. + +\begin{verbatim} +let a = expensive ==> \b -> case expensive of ... +in \ b -> case a of ... +\end{verbatim} +The fix is +\begin{enumerate} +\item +to let bind the algebraic case scrutinees (done, I think) and +the case alternatives (except the ones with an +unboxed type)(not done, I think). This is best done in the +SetLevels.lhs module, which tags things with their level numbers. +\item +do the full laziness pass (floating lets outwards). +\item +simplify. The simplifier inlines the (trivial) lets that were + created but were not floated outwards. +\end{enumerate} + +With the fix I think Will's suggestion that we can gain even more from +strictness by floating inwards past lambdas makes sense. + +We still gain even without going past lambdas, as things may be +strict in the (new) context of a branch (where it was floated to) or +of a let rhs, e.g. +\begin{verbatim} +let a = something case x of +in case x of alt1 -> case something of a -> a + a + alt1 -> a + a ==> alt2 -> b + alt2 -> b + +let a = something let b = case something of a -> a + a +in let b = a + a ==> in (b,b) +in (b,b) +\end{verbatim} +Also, even if a is not found to be strict in the new context and is +still left as a let, if the branch is not taken (or b is not entered) +the closure for a is not built. 
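Before diving into the code, here is a tiny standalone sketch of the "drop
point" decision that drives the whole pass (illustrative only: the names and
types below are made up; the real partitioning is done by sepBindsByDropPoint,
used in the code that follows). A floated binding is pushed into the one
branch that mentions its binder, and is dropped here if several do:

\begin{verbatim}
import qualified Data.Set as Set

type Var  = String
type Bind = (Var, [Var])          -- (bound var, free vars of its RHS)

dropPoints :: [Set.Set Var]       -- free variables of each drop point (branch)
           -> [Bind]              -- bindings being floated inwards
           -> ([Bind], [[Bind]])  -- (keep here, what to drop into each branch)
dropPoints branch_fvs binds = foldr place ([], map (const []) branch_fvs) binds
  where
    place b@(v, _) (here, per_branch)
      = case [i | (i, fvs) <- zip [0 ..] branch_fvs, v `Set.member` fvs] of
          [i] -> (here, [ if j == i then b : bs else bs       -- used in exactly one
                        | (j, bs) <- zip [0 ..] per_branch ]) --   branch: push it in
          _   -> (b : here, per_branch)                       -- otherwise keep it here

-- dropPoints [Set.fromList ["x","f"], Set.fromList ["g"]] [("x", ["v"])]
--   ==> ([], [[("x",["v"])], []])    -- x is only wanted in the first branch
\end{verbatim}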
+ +%************************************************************************ +%* * +\subsection{Main floating-inwards code} +%* * +%************************************************************************ + +\begin{code} +type FreeVarsSet = IdSet + +type FloatingBinds = [(CoreBind, FreeVarsSet)] + -- In reverse dependency order (innermost bindiner first) + + -- The FreeVarsSet is the free variables of the binding. In the case + -- of recursive bindings, the set doesn't include the bound + -- variables. + +fiExpr :: FloatingBinds -- Binds we're trying to drop + -- as far "inwards" as possible + -> CoreExprWithFVs -- Input expr + -> CoreExpr -- Result + +fiExpr to_drop (_, AnnVar v) = mkCoLets' to_drop (Var v) + +fiExpr to_drop (_, AnnType ty) = ASSERT( null to_drop ) + Type ty + +fiExpr to_drop (_, AnnLit lit) = Lit lit +\end{code} + +Applications: we do float inside applications, mainly because we +need to get at all the arguments. The next simplifier run will +pull out any silly ones. + +\begin{code} +fiExpr to_drop (_,AnnApp fun arg) + = mkCoLets' drop_here (App (fiExpr fun_drop fun) (fiExpr arg_drop arg)) + where + [drop_here, fun_drop, arg_drop] = sepBindsByDropPoint False [freeVarsOf fun, freeVarsOf arg] to_drop +\end{code} + +We are careful about lambdas: + +* We must be careful about floating inside inside a value lambda. + That risks losing laziness. + The float-out pass might rescue us, but then again it might not. + +* We must be careful about type lambdas too. At one time we did, and + there is no risk of duplicating work thereby, but we do need to be + careful. In particular, here is a bad case (it happened in the + cichelli benchmark: + let v = ... + in let f = /\t -> \a -> ... + ==> + let f = /\t -> let v = ... in \a -> ... + This is bad as now f is an updatable closure (update PAP) + and has arity 0. + +So we treat lambda in groups, using the following rule: + + Float inside a group of lambdas only if + they are all either type lambdas or one-shot lambdas. + + Otherwise drop all the bindings outside the group. + +\begin{code} + -- Hack alert! We only float in through one-shot lambdas, + -- not (as you might guess) through big lambdas. + -- Reason: we float *out* past big lambdas (see the test in the Lam + -- case of FloatOut.floatExpr) and we don't want to float straight + -- back in again. + -- + -- It *is* important to float into one-shot lambdas, however; + -- see the remarks with noFloatIntoRhs. +fiExpr to_drop lam@(_, AnnLam _ _) + | all is_one_shot bndrs -- Float in + = mkLams bndrs (fiExpr to_drop body) + + | otherwise -- Dump it all here + = mkCoLets' to_drop (mkLams bndrs (fiExpr [] body)) + + where + (bndrs, body) = collectAnnBndrs lam +\end{code} + +We don't float lets inwards past an SCC. + ToDo: keep info on current cc, and when passing + one, if it is not the same, annotate all lets in binds with current + cc, change current cc to the new one and float binds into expr. + +\begin{code} +fiExpr to_drop (_, AnnNote note@(SCC cc) expr) + = -- Wimp out for now + mkCoLets' to_drop (Note note (fiExpr [] expr)) + +fiExpr to_drop (_, AnnNote InlineCall expr) + = -- Wimp out for InlineCall; keep it close + -- the the call it annotates + mkCoLets' to_drop (Note InlineCall (fiExpr [] expr)) + +fiExpr to_drop (_, AnnNote InlineMe expr) + = -- Ditto... 
don't float anything into an INLINE expression + mkCoLets' to_drop (Note InlineMe (fiExpr [] expr)) + +fiExpr to_drop (_, AnnNote note@(Coerce _ _) expr) + = -- Just float in past coercion + Note note (fiExpr to_drop expr) + +fiExpr to_drop (_, AnnNote note@(CoreNote _) expr) + = Note note (fiExpr to_drop expr) +\end{code} + +For @Lets@, the possible ``drop points'' for the \tr{to_drop} +bindings are: (a)~in the body, (b1)~in the RHS of a NonRec binding, +or~(b2), in each of the RHSs of the pairs of a @Rec@. + +Note that we do {\em weird things} with this let's binding. Consider: +\begin{verbatim} +let + w = ... +in { + let v = ... w ... + in ... v .. w ... +} +\end{verbatim} +Look at the inner \tr{let}. As \tr{w} is used in both the bind and +body of the inner let, we could panic and leave \tr{w}'s binding where +it is. But \tr{v} is floatable further into the body of the inner let, and +{\em then} \tr{w} will also be only in the body of that inner let. + +So: rather than drop \tr{w}'s binding here, we add it onto the list of +things to drop in the outer let's body, and let nature take its +course. + +\begin{code} +fiExpr to_drop (_,AnnLet (AnnNonRec id rhs@(rhs_fvs, ann_rhs)) body) + = fiExpr new_to_drop body + where + body_fvs = freeVarsOf body + + final_body_fvs | noFloatIntoRhs ann_rhs + || isUnLiftedType (idType id) = body_fvs `unionVarSet` rhs_fvs + | otherwise = body_fvs + -- See commments with letrec below + -- No point in floating in only to float straight out again + -- Ditto ok-for-speculation unlifted RHSs + + [shared_binds, rhs_binds, body_binds] = sepBindsByDropPoint False [rhs_fvs, final_body_fvs] to_drop + + new_to_drop = body_binds ++ -- the bindings used only in the body + [(NonRec id rhs', rhs_fvs')] ++ -- the new binding itself + shared_binds -- the bindings used both in rhs and body + + -- Push rhs_binds into the right hand side of the binding + rhs' = fiExpr rhs_binds rhs + rhs_fvs' = rhs_fvs `unionVarSet` floatedBindsFVs rhs_binds + +fiExpr to_drop (_,AnnLet (AnnRec bindings) body) + = fiExpr new_to_drop body + where + rhss = map snd bindings + + rhss_fvs = map freeVarsOf rhss + body_fvs = freeVarsOf body + + -- Add to body_fvs the free vars of any RHS that has + -- a lambda at the top. This has the effect of making it seem + -- that such things are used in the body as well, and hence prevents + -- them getting floated in. The big idea is to avoid turning: + -- let x# = y# +# 1# + -- in + -- letrec f = \z. ...x#...f... + -- in ... + -- into + -- letrec f = let x# = y# +# 1# in \z. ...x#...f... in ... + -- + -- Because now we can't float the let out again, because a letrec + -- can't have unboxed bindings. + + final_body_fvs = foldr (unionVarSet . 
get_extras) body_fvs rhss + get_extras (rhs_fvs, rhs) | noFloatIntoRhs rhs = rhs_fvs + | otherwise = emptyVarSet + + (shared_binds:body_binds:rhss_binds) = sepBindsByDropPoint False (final_body_fvs:rhss_fvs) to_drop + + new_to_drop = -- the bindings used only in the body + body_binds ++ + -- the new binding itself + [(Rec (fi_bind rhss_binds bindings), rhs_fvs')] ++ + -- the bindings used both in rhs and body or in more than one rhs + shared_binds + + rhs_fvs' = unionVarSet (unionVarSets rhss_fvs) + (unionVarSets (map floatedBindsFVs rhss_binds)) + + -- Push rhs_binds into the right hand side of the binding + fi_bind :: [FloatingBinds] -- one per "drop pt" conjured w/ fvs_of_rhss + -> [(Id, CoreExprWithFVs)] + -> [(Id, CoreExpr)] + + fi_bind to_drops pairs + = [ (binder, fiExpr to_drop rhs) + | ((binder, rhs), to_drop) <- zipEqual "fi_bind" pairs to_drops ] +\end{code} + +For @Case@, the possible ``drop points'' for the \tr{to_drop} +bindings are: (a)~inside the scrutinee, (b)~inside one of the +alternatives/default [default FVs always {\em first}!]. + +\begin{code} +fiExpr to_drop (_, AnnCase scrut case_bndr ty alts) + = mkCoLets' drop_here1 $ + mkCoLets' drop_here2 $ + Case (fiExpr scrut_drops scrut) case_bndr ty + (zipWith fi_alt alts_drops_s alts) + where + -- Float into the scrut and alts-considered-together just like App + [drop_here1, scrut_drops, alts_drops] = sepBindsByDropPoint False [scrut_fvs, all_alts_fvs] to_drop + + -- Float into the alts with the is_case flag set + (drop_here2 : alts_drops_s) = sepBindsByDropPoint True alts_fvs alts_drops + + scrut_fvs = freeVarsOf scrut + alts_fvs = map alt_fvs alts + all_alts_fvs = unionVarSets alts_fvs + alt_fvs (con, args, rhs) = foldl delVarSet (freeVarsOf rhs) (case_bndr:args) + -- Delete case_bndr and args from free vars of rhs + -- to get free vars of alt + + fi_alt to_drop (con, args, rhs) = (con, args, fiExpr to_drop rhs) + +noFloatIntoRhs (AnnNote InlineMe _) = True +noFloatIntoRhs (AnnLam b _) = not (is_one_shot b) + -- IMPORTANT: don't say 'True' for a RHS with a one-shot lambda at the top. + -- This makes a big difference for things like + -- f x# = let x = I# x# + -- in let j = \() -> ...x... + -- in if <condition> then normal-path else j () + -- If x is used only in the error case join point, j, we must float the + -- boxing constructor into it, else we box it every time which is very bad + -- news indeed. + +noFloatIntoRhs rhs = exprIsHNF (deAnnotate' rhs) -- We'd just float right back out again... + +is_one_shot b = isId b && isOneShotBndr b +\end{code} + + +%************************************************************************ +%* * +\subsection{@sepBindsByDropPoint@} +%* * +%************************************************************************ + +This is the crucial function. The idea is: We have a wad of bindings +that we'd like to distribute inside a collection of {\em drop points}; +insides the alternatives of a \tr{case} would be one example of some +drop points; the RHS and body of a non-recursive \tr{let} binding +would be another (2-element) collection. + +So: We're given a list of sets-of-free-variables, one per drop point, +and a list of floating-inwards bindings. If a binding can go into +only one drop point (without suddenly making something out-of-scope), +in it goes. If a binding is used inside {\em multiple} drop points, +then it has to go in a you-must-drop-it-above-all-these-drop-points +point. + +We have to maintain the order on these drop-point-related lists. 
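+
+A small invented example may help.  Suppose the candidate floaters are
+[a = e1, b = e2, c = e3] and the drop points are the two alternatives
+of
+
+\begin{verbatim}
+case x of
+  P -> ...a...c...
+  Q -> ...b...c...
+\end{verbatim}
+
+Then a is used only in the P alternative and b only in the Q
+alternative, so each can be dropped into its own branch.  But c is
+used in both, so (unless the duplicate-small-bindings heuristic for
+case expressions fires) it goes into the "drop it here" pile that ends
+up above the whole case.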
+ +\begin{code} +sepBindsByDropPoint + :: Bool -- True <=> is case expression + -> [FreeVarsSet] -- One set of FVs per drop point + -> FloatingBinds -- Candidate floaters + -> [FloatingBinds] -- FIRST one is bindings which must not be floated + -- inside any drop point; the rest correspond + -- one-to-one with the input list of FV sets + +-- Every input floater is returned somewhere in the result; +-- none are dropped, not even ones which don't seem to be +-- free in *any* of the drop-point fvs. Why? Because, for example, +-- a binding (let x = E in B) might have a specialised version of +-- x (say x') stored inside x, but x' isn't free in E or B. + +type DropBox = (FreeVarsSet, FloatingBinds) + +sepBindsByDropPoint is_case drop_pts [] + = [] : [[] | p <- drop_pts] -- cut to the chase scene; it happens + +sepBindsByDropPoint is_case drop_pts floaters + = go floaters (map (\fvs -> (fvs, [])) (emptyVarSet : drop_pts)) + where + go :: FloatingBinds -> [DropBox] -> [FloatingBinds] + -- The *first* one in the argument list is the drop_here set + -- The FloatingBinds in the lists are in the reverse of + -- the normal FloatingBinds order; that is, they are the right way round! + + go [] drop_boxes = map (reverse . snd) drop_boxes + + go (bind_w_fvs@(bind, bind_fvs) : binds) drop_boxes@(here_box : fork_boxes) + = go binds new_boxes + where + -- "here" means the group of bindings dropped at the top of the fork + + (used_here : used_in_flags) = [ any (`elemVarSet` fvs) (bindersOf bind) + | (fvs, drops) <- drop_boxes] + + drop_here = used_here || not can_push + + -- For case expressions we duplicate the binding if it is + -- reasonably small, and if it is not used in all the RHSs + -- This is good for situations like + -- let x = I# y in + -- case e of + -- C -> error x + -- D -> error x + -- E -> ...not mentioning x... + + n_alts = length used_in_flags + n_used_alts = count id used_in_flags -- returns number of Trues in list. + + can_push = n_used_alts == 1 -- Used in just one branch + || (is_case && -- We are looking at case alternatives + n_used_alts > 1 && -- It's used in more than one + n_used_alts < n_alts && -- ...but not all + bindIsDupable bind) -- and we can duplicate the binding + + new_boxes | drop_here = (insert here_box : fork_boxes) + | otherwise = (here_box : new_fork_boxes) + + new_fork_boxes = zipWithEqual "FloatIn.sepBinds" insert_maybe fork_boxes used_in_flags + + insert :: DropBox -> DropBox + insert (fvs,drops) = (fvs `unionVarSet` bind_fvs, bind_w_fvs:drops) + + insert_maybe box True = insert box + insert_maybe box False = box + + +floatedBindsFVs :: FloatingBinds -> FreeVarsSet +floatedBindsFVs binds = unionVarSets (map snd binds) + +mkCoLets' :: FloatingBinds -> CoreExpr -> CoreExpr +mkCoLets' to_drop e = foldl (flip (Let . fst)) e to_drop + -- Remember to_drop is in *reverse* dependency order + +bindIsDupable (Rec prs) = all (exprIsDupable . snd) prs +bindIsDupable (NonRec b r) = exprIsDupable r +\end{code} diff --git a/compiler/simplCore/FloatOut.lhs b/compiler/simplCore/FloatOut.lhs new file mode 100644 index 0000000000..988bd53015 --- /dev/null +++ b/compiler/simplCore/FloatOut.lhs @@ -0,0 +1,443 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[FloatOut]{Float bindings outwards (towards the top level)} + +``Long-distance'' floating of bindings towards the top level. 
+ +\begin{code} +module FloatOut ( floatOutwards ) where + +#include "HsVersions.h" + +import CoreSyn +import CoreUtils ( mkSCC, exprIsHNF, exprIsTrivial ) + +import DynFlags ( DynFlags, DynFlag(..), FloatOutSwitches(..) ) +import ErrUtils ( dumpIfSet_dyn ) +import CostCentre ( dupifyCC, CostCentre ) +import Id ( Id, idType ) +import Type ( isUnLiftedType ) +import CoreLint ( showPass, endPass ) +import SetLevels ( Level(..), LevelledExpr, LevelledBind, + setLevels, ltMajLvl, ltLvl, isTopLvl ) +import UniqSupply ( UniqSupply ) +import List ( partition ) +import Outputable +import Util ( notNull ) +\end{code} + + ----------------- + Overall game plan + ----------------- + +The Big Main Idea is: + + To float out sub-expressions that can thereby get outside + a non-one-shot value lambda, and hence may be shared. + + +To achieve this we may need to do two thing: + + a) Let-bind the sub-expression: + + f (g x) ==> let lvl = f (g x) in lvl + + Now we can float the binding for 'lvl'. + + b) More than that, we may need to abstract wrt a type variable + + \x -> ... /\a -> let v = ...a... in .... + + Here the binding for v mentions 'a' but not 'x'. So we + abstract wrt 'a', to give this binding for 'v': + + vp = /\a -> ...a... + v = vp a + + Now the binding for vp can float out unimpeded. + I can't remember why this case seemed important enough to + deal with, but I certainly found cases where important floats + didn't happen if we did not abstract wrt tyvars. + +With this in mind we can also achieve another goal: lambda lifting. +We can make an arbitrary (function) binding float to top level by +abstracting wrt *all* local variables, not just type variables, leaving +a binding that can be floated right to top level. Whether or not this +happens is controlled by a flag. + + +Random comments +~~~~~~~~~~~~~~~ + +At the moment we never float a binding out to between two adjacent +lambdas. For example: + +@ + \x y -> let t = x+x in ... +===> + \x -> let t = x+x in \y -> ... +@ +Reason: this is less efficient in the case where the original lambda +is never partially applied. + +But there's a case I've seen where this might not be true. Consider: +@ +elEm2 x ys + = elem' x ys + where + elem' _ [] = False + elem' x (y:ys) = x==y || elem' x ys +@ +It turns out that this generates a subexpression of the form +@ + \deq x ys -> let eq = eqFromEqDict deq in ... +@ +vwhich might usefully be separated to +@ + \deq -> let eq = eqFromEqDict deq in \xy -> ... +@ +Well, maybe. We don't do this at the moment. 
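+
+For the record, here is a small invented sketch of the lambda-lifting
+variant mentioned in the game plan above (abstracting over *all* free
+local variables, not just type variables):
+
+\begin{verbatim}
+f = \x -> let g = \y -> x + y
+          in ...g...
+===>
+g' = \x y -> x + y
+f  = \x -> ...(g' x)...
+\end{verbatim}
+
+Now the binding for g' has no free local variables, so it can float
+right to the top level; whether we actually do this is governed by the
+FloatOutSwitches passed to floatOutwards.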
+ +\begin{code} +type FloatBind = (Level, CoreBind) -- INVARIANT: a FloatBind is always lifted +type FloatBinds = [FloatBind] +\end{code} + +%************************************************************************ +%* * +\subsection[floatOutwards]{@floatOutwards@: let-floating interface function} +%* * +%************************************************************************ + +\begin{code} +floatOutwards :: FloatOutSwitches + -> DynFlags + -> UniqSupply + -> [CoreBind] -> IO [CoreBind] + +floatOutwards float_sws dflags us pgm + = do { + showPass dflags float_msg ; + + let { annotated_w_levels = setLevels float_sws pgm us ; + (fss, binds_s') = unzip (map floatTopBind annotated_w_levels) + } ; + + dumpIfSet_dyn dflags Opt_D_verbose_core2core "Levels added:" + (vcat (map ppr annotated_w_levels)); + + let { (tlets, ntlets, lams) = get_stats (sum_stats fss) }; + + dumpIfSet_dyn dflags Opt_D_dump_simpl_stats "FloatOut stats:" + (hcat [ int tlets, ptext SLIT(" Lets floated to top level; "), + int ntlets, ptext SLIT(" Lets floated elsewhere; from "), + int lams, ptext SLIT(" Lambda groups")]); + + endPass dflags float_msg Opt_D_verbose_core2core (concat binds_s') + {- no specific flag for dumping float-out -} + } + where + float_msg = showSDoc (text "Float out" <+> parens (sws float_sws)) + sws (FloatOutSw lam const) = pp_not lam <+> text "lambdas" <> comma <+> + pp_not const <+> text "constants" + pp_not True = empty + pp_not False = text "not" + +floatTopBind bind@(NonRec _ _) + = case (floatBind bind) of { (fs, floats, bind') -> + (fs, floatsToBinds floats ++ [bind']) + } + +floatTopBind bind@(Rec _) + = case (floatBind bind) of { (fs, floats, Rec pairs') -> + WARN( notNull floats, ppr bind $$ ppr floats ) + (fs, [Rec (floatsToBindPairs floats ++ pairs')]) } +\end{code} + +%************************************************************************ +%* * +\subsection[FloatOut-Bind]{Floating in a binding (the business end)} +%* * +%************************************************************************ + + +\begin{code} +floatBind :: LevelledBind + -> (FloatStats, FloatBinds, CoreBind) + +floatBind (NonRec (TB name level) rhs) + = case (floatNonRecRhs level rhs) of { (fs, rhs_floats, rhs') -> + (fs, rhs_floats, NonRec name rhs') } + +floatBind bind@(Rec pairs) + = case (unzip3 (map do_pair pairs)) of { (fss, rhss_floats, new_pairs) -> + + if not (isTopLvl bind_dest_level) then + -- Standard case; the floated bindings can't mention the + -- binders, because they couldn't be escaping a major level + -- if so. + (sum_stats fss, concat rhss_floats, Rec new_pairs) + else + -- In a recursive binding, *destined for* the top level + -- (only), the rhs floats may contain references to the + -- bound things. For example + -- f = ...(let v = ...f... in b) ... + -- might get floated to + -- v = ...f... + -- f = ... b ... + -- and hence we must (pessimistically) make all the floats recursive + -- with the top binding. Later dependency analysis will unravel it. 
+ -- + -- This can only happen for bindings destined for the top level, + -- because only then will partitionByMajorLevel allow through a binding + -- that only differs in its minor level + (sum_stats fss, [], + Rec (new_pairs ++ floatsToBindPairs (concat rhss_floats))) + } + where + bind_dest_level = getBindLevel bind + + do_pair (TB name level, rhs) + = case (floatRhs level rhs) of { (fs, rhs_floats, rhs') -> + (fs, rhs_floats, (name, rhs')) + } +\end{code} + +%************************************************************************ + +\subsection[FloatOut-Expr]{Floating in expressions} +%* * +%************************************************************************ + +\begin{code} +floatExpr, floatRhs, floatNonRecRhs + :: Level + -> LevelledExpr + -> (FloatStats, FloatBinds, CoreExpr) + +floatRhs lvl arg -- Used rec rhss, and case-alternative rhss + = case (floatExpr lvl arg) of { (fsa, floats, arg') -> + case (partitionByMajorLevel lvl floats) of { (floats', heres) -> + -- Dump bindings that aren't going to escape from a lambda; + -- in particular, we must dump the ones that are bound by + -- the rec or case alternative + (fsa, floats', install heres arg') }} + +floatNonRecRhs lvl arg -- Used for nested non-rec rhss, and fn args + = case (floatExpr lvl arg) of { (fsa, floats, arg') -> + -- Dump bindings that aren't going to escape from a lambda + -- This isn't a scoping issue (the binder isn't in scope in the RHS of a non-rec binding) + -- Rather, it is to avoid floating the x binding out of + -- f (let x = e in b) + -- unnecessarily. But we first test for values or trival rhss, + -- because (in particular) we don't want to insert new bindings between + -- the "=" and the "\". E.g. + -- f = \x -> let <bind> in <body> + -- We do not want + -- f = let <bind> in \x -> <body> + -- (a) The simplifier will immediately float it further out, so we may + -- as well do so right now; in general, keeping rhss as manifest + -- values is good + -- (b) If a float-in pass follows immediately, it might add yet more + -- bindings just after the '='. And some of them might (correctly) + -- be strict even though the 'let f' is lazy, because f, being a value, + -- gets its demand-info zapped by the simplifier. + if exprIsHNF arg' || exprIsTrivial arg' then + (fsa, floats, arg') + else + case (partitionByMajorLevel lvl floats) of { (floats', heres) -> + (fsa, floats', install heres arg') }} + +floatExpr _ (Var v) = (zeroStats, [], Var v) +floatExpr _ (Type ty) = (zeroStats, [], Type ty) +floatExpr _ (Lit lit) = (zeroStats, [], Lit lit) + +floatExpr lvl (App e a) + = case (floatExpr lvl e) of { (fse, floats_e, e') -> + case (floatNonRecRhs lvl a) of { (fsa, floats_a, a') -> + (fse `add_stats` fsa, floats_e ++ floats_a, App e' a') }} + +floatExpr lvl lam@(Lam _ _) + = let + (bndrs_w_lvls, body) = collectBinders lam + bndrs = [b | TB b _ <- bndrs_w_lvls] + lvls = [l | TB b l <- bndrs_w_lvls] + + -- For the all-tyvar case we are prepared to pull + -- the lets out, to implement the float-out-of-big-lambda + -- transform; but otherwise we only float bindings that are + -- going to escape a value lambda. + -- In particular, for one-shot lambdas we don't float things + -- out; we get no saving by so doing. 
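+	--
+	-- A tiny invented sketch of the big-lambda case: in
+	--	/\a -> let v = e in b	(where e does not mention a)
+	-- we are happy to produce
+	--	let v = e in /\a -> b
+	-- even though no value lambda is escaped, so that v does not
+	-- get trapped under the big lambda.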
+ partition_fn | all isTyVar bndrs = partitionByLevel + | otherwise = partitionByMajorLevel + in + case (floatExpr (last lvls) body) of { (fs, floats, body') -> + + -- Dump any bindings which absolutely cannot go any further + case (partition_fn (head lvls) floats) of { (floats', heres) -> + + (add_to_stats fs floats', floats', mkLams bndrs (install heres body')) + }} + +floatExpr lvl (Note note@(SCC cc) expr) + = case (floatExpr lvl expr) of { (fs, floating_defns, expr') -> + let + -- Annotate bindings floated outwards past an scc expression + -- with the cc. We mark that cc as "duplicated", though. + + annotated_defns = annotate (dupifyCC cc) floating_defns + in + (fs, annotated_defns, Note note expr') } + where + annotate :: CostCentre -> FloatBinds -> FloatBinds + + annotate dupd_cc defn_groups + = [ (level, ann_bind floater) | (level, floater) <- defn_groups ] + where + ann_bind (NonRec binder rhs) + = NonRec binder (mkSCC dupd_cc rhs) + + ann_bind (Rec pairs) + = Rec [(binder, mkSCC dupd_cc rhs) | (binder, rhs) <- pairs] + +floatExpr lvl (Note InlineMe expr) -- Other than SCCs + = case floatExpr InlineCtxt expr of { (fs, floating_defns, expr') -> + -- There can be some floating_defns, arising from + -- ordinary lets that were there all the time. It seems + -- more efficient to test once here than to avoid putting + -- them into floating_defns (which would mean testing for + -- inlineCtxt at every let) + (fs, [], Note InlineMe (install floating_defns expr')) } -- See notes in SetLevels + +floatExpr lvl (Note note expr) -- Other than SCCs + = case (floatExpr lvl expr) of { (fs, floating_defns, expr') -> + (fs, floating_defns, Note note expr') } + +floatExpr lvl (Let (NonRec (TB bndr bndr_lvl) rhs) body) + | isUnLiftedType (idType bndr) -- Treat unlifted lets just like a case + = case floatExpr lvl rhs of { (fs, rhs_floats, rhs') -> + case floatRhs bndr_lvl body of { (fs, body_floats, body') -> + (fs, rhs_floats ++ body_floats, Let (NonRec bndr rhs') body') }} + +floatExpr lvl (Let bind body) + = case (floatBind bind) of { (fsb, rhs_floats, bind') -> + case (floatExpr lvl body) of { (fse, body_floats, body') -> + (add_stats fsb fse, + rhs_floats ++ [(bind_lvl, bind')] ++ body_floats, + body') }} + where + bind_lvl = getBindLevel bind + +floatExpr lvl (Case scrut (TB case_bndr case_lvl) ty alts) + = case floatExpr lvl scrut of { (fse, fde, scrut') -> + case floatList float_alt alts of { (fsa, fda, alts') -> + (add_stats fse fsa, fda ++ fde, Case scrut' case_bndr ty alts') + }} + where + -- Use floatRhs for the alternatives, so that we + -- don't gratuitiously float bindings out of the RHSs + float_alt (con, bs, rhs) + = case (floatRhs case_lvl rhs) of { (fs, rhs_floats, rhs') -> + (fs, rhs_floats, (con, [b | TB b _ <- bs], rhs')) } + + +floatList :: (a -> (FloatStats, FloatBinds, b)) -> [a] -> (FloatStats, FloatBinds, [b]) +floatList f [] = (zeroStats, [], []) +floatList f (a:as) = case f a of { (fs_a, binds_a, b) -> + case floatList f as of { (fs_as, binds_as, bs) -> + (fs_a `add_stats` fs_as, binds_a ++ binds_as, b:bs) }} +\end{code} + +%************************************************************************ +%* * +\subsection{Utility bits for floating stats} +%* * +%************************************************************************ + +I didn't implement this with unboxed numbers. I don't want to be too +strict in this stuff, as it is rarely turned on. 
(WDP 95/09) + +\begin{code} +data FloatStats + = FlS Int -- Number of top-floats * lambda groups they've been past + Int -- Number of non-top-floats * lambda groups they've been past + Int -- Number of lambda (groups) seen + +get_stats (FlS a b c) = (a, b, c) + +zeroStats = FlS 0 0 0 + +sum_stats xs = foldr add_stats zeroStats xs + +add_stats (FlS a1 b1 c1) (FlS a2 b2 c2) + = FlS (a1 + a2) (b1 + b2) (c1 + c2) + +add_to_stats (FlS a b c) floats + = FlS (a + length top_floats) (b + length other_floats) (c + 1) + where + (top_floats, other_floats) = partition to_very_top floats + + to_very_top (my_lvl, _) = isTopLvl my_lvl +\end{code} + + +%************************************************************************ +%* * +\subsection{Utility bits for floating} +%* * +%************************************************************************ + +\begin{code} +getBindLevel (NonRec (TB _ lvl) _) = lvl +getBindLevel (Rec (((TB _ lvl), _) : _)) = lvl +\end{code} + +\begin{code} +partitionByMajorLevel, partitionByLevel + :: Level -- Partitioning level + + -> FloatBinds -- Defns to be divided into 2 piles... + + -> (FloatBinds, -- Defns with level strictly < partition level, + FloatBinds) -- The rest + + +partitionByMajorLevel ctxt_lvl defns + = partition float_further defns + where + -- Float it if we escape a value lambda, or if we get to the top level + float_further (my_lvl, bind) = my_lvl `ltMajLvl` ctxt_lvl || isTopLvl my_lvl + -- The isTopLvl part says that if we can get to the top level, say "yes" anyway + -- This means that + -- x = f e + -- transforms to + -- lvl = e + -- x = f lvl + -- which is as it should be + +partitionByLevel ctxt_lvl defns + = partition float_further defns + where + float_further (my_lvl, _) = my_lvl `ltLvl` ctxt_lvl +\end{code} + +\begin{code} +floatsToBinds :: FloatBinds -> [CoreBind] +floatsToBinds floats = map snd floats + +floatsToBindPairs :: FloatBinds -> [(Id,CoreExpr)] + +floatsToBindPairs floats = concat (map mk_pairs floats) + where + mk_pairs (_, Rec pairs) = pairs + mk_pairs (_, NonRec binder rhs) = [(binder,rhs)] + +install :: FloatBinds -> CoreExpr -> CoreExpr + +install defn_groups expr + = foldr install_group expr defn_groups + where + install_group (_, defns) body = Let defns body +\end{code} diff --git a/compiler/simplCore/LiberateCase.lhs b/compiler/simplCore/LiberateCase.lhs new file mode 100644 index 0000000000..c29a5b9c68 --- /dev/null +++ b/compiler/simplCore/LiberateCase.lhs @@ -0,0 +1,317 @@ +% +% (c) The AQUA Project, Glasgow University, 1994-1998 +% +\section[LiberateCase]{Unroll recursion to allow evals to be lifted from a loop} + +\begin{code} +module LiberateCase ( liberateCase ) where + +#include "HsVersions.h" + +import DynFlags ( DynFlags, DynFlag(..) ) +import StaticFlags ( opt_LiberateCaseThreshold ) +import CoreLint ( showPass, endPass ) +import CoreSyn +import CoreUnfold ( couldBeSmallEnoughToInline ) +import Id ( Id, setIdName, idName, setIdNotExported ) +import VarEnv +import Name ( localiseName ) +import Outputable +import Util ( notNull ) +\end{code} + +This module walks over @Core@, and looks for @case@ on free variables. +The criterion is: + if there is case on a free on the route to the recursive call, + then the recursive call is replaced with an unfolding. + +Example + +\begin{verbatim} +f = \ t -> case v of + V a b -> a : f t +\end{verbatim} + +=> the inner f is replaced. 
+ +\begin{verbatim} +f = \ t -> case v of + V a b -> a : (letrec + f = \ t -> case v of + V a b -> a : f t + in f) t +\end{verbatim} +(note the NEED for shadowing) + +=> Simplify + +\begin{verbatim} +f = \ t -> case v of + V a b -> a : (letrec + f = \ t -> a : f t + in f t) +\begin{verbatim} + +Better code, because 'a' is free inside the inner letrec, rather +than needing projection from v. + +Other examples we'd like to catch with this kind of transformation + + last [] = error + last (x:[]) = x + last (x:xs) = last xs + +We'd like to avoid the redundant pattern match, transforming to + + last [] = error + last (x:[]) = x + last (x:(y:ys)) = last' y ys + where + last' y [] = y + last' _ (y:ys) = last' y ys + + (is this necessarily an improvement) + + +Similarly drop: + + drop n [] = [] + drop 0 xs = xs + drop n (x:xs) = drop (n-1) xs + +Would like to pass n along unboxed. + + +To think about (Apr 94) +~~~~~~~~~~~~~~ + +Main worry: duplicating code excessively. At the moment we duplicate +the entire binding group once at each recursive call. But there may +be a group of recursive calls which share a common set of evaluated +free variables, in which case the duplication is a plain waste. + +Another thing we could consider adding is some unfold-threshold thing, +so that we'll only duplicate if the size of the group rhss isn't too +big. + +Data types +~~~~~~~~~~ + +The ``level'' of a binder tells how many +recursive defns lexically enclose the binding +A recursive defn "encloses" its RHS, not its +scope. For example: +\begin{verbatim} + letrec f = let g = ... in ... + in + let h = ... + in ... +\end{verbatim} +Here, the level of @f@ is zero, the level of @g@ is one, +and the level of @h@ is zero (NB not one). + +\begin{code} +type LibCaseLevel = Int + +topLevel :: LibCaseLevel +topLevel = 0 +\end{code} + +\begin{code} +data LibCaseEnv + = LibCaseEnv + Int -- Bomb-out size for deciding if + -- potential liberatees are too big. + -- (passed in from cmd-line args) + + LibCaseLevel -- Current level + + (IdEnv LibCaseLevel) -- Binds all non-top-level in-scope Ids + -- (top-level and imported things have + -- a level of zero) + + (IdEnv CoreBind) -- Binds *only* recursively defined + -- Ids, to their own binding group, + -- and *only* in their own RHSs + + [(Id,LibCaseLevel)] -- Each of these Ids was scrutinised by an + -- enclosing case expression, with the + -- specified number of enclosing + -- recursive bindings; furthermore, + -- the Id is bound at a lower level + -- than the case expression. 
The + -- order is insignificant; it's a bag + -- really + +initEnv :: Int -> LibCaseEnv +initEnv bomb_size = LibCaseEnv bomb_size 0 emptyVarEnv emptyVarEnv [] + +bombOutSize (LibCaseEnv bomb_size _ _ _ _) = bomb_size +\end{code} + + +Programs +~~~~~~~~ +\begin{code} +liberateCase :: DynFlags -> [CoreBind] -> IO [CoreBind] +liberateCase dflags binds + = do { + showPass dflags "Liberate case" ; + let { binds' = do_prog (initEnv opt_LiberateCaseThreshold) binds } ; + endPass dflags "Liberate case" Opt_D_verbose_core2core binds' + {- no specific flag for dumping -} + } + where + do_prog env [] = [] + do_prog env (bind:binds) = bind' : do_prog env' binds + where + (env', bind') = libCaseBind env bind +\end{code} + +Bindings +~~~~~~~~ + +\begin{code} +libCaseBind :: LibCaseEnv -> CoreBind -> (LibCaseEnv, CoreBind) + +libCaseBind env (NonRec binder rhs) + = (addBinders env [binder], NonRec binder (libCase env rhs)) + +libCaseBind env (Rec pairs) + = (env_body, Rec pairs') + where + (binders, rhss) = unzip pairs + + env_body = addBinders env binders + + pairs' = [(binder, libCase env_rhs rhs) | (binder,rhs) <- pairs] + + env_rhs = if all rhs_small_enough rhss then extended_env else env + + -- We extend the rec-env by binding each Id to its rhs, first + -- processing the rhs with an *un-extended* environment, so + -- that the same process doesn't occur for ever! + -- + extended_env = addRecBinds env [ (adjust binder, libCase env_body rhs) + | (binder, rhs) <- pairs ] + + -- Two subtle things: + -- (a) Reset the export flags on the binders so + -- that we don't get name clashes on exported things if the + -- local binding floats out to top level. This is most unlikely + -- to happen, since the whole point concerns free variables. + -- But resetting the export flag is right regardless. + -- + -- (b) Make the name an Internal one. External Names should never be + -- nested; if it were floated to the top level, we'd get a name + -- clash at code generation time. 
+ adjust bndr = setIdNotExported (setIdName bndr (localiseName (idName bndr))) + + rhs_small_enough rhs = couldBeSmallEnoughToInline lIBERATE_BOMB_SIZE rhs + lIBERATE_BOMB_SIZE = bombOutSize env +\end{code} + + +Expressions +~~~~~~~~~~~ + +\begin{code} +libCase :: LibCaseEnv + -> CoreExpr + -> CoreExpr + +libCase env (Var v) = libCaseId env v +libCase env (Lit lit) = Lit lit +libCase env (Type ty) = Type ty +libCase env (App fun arg) = App (libCase env fun) (libCase env arg) +libCase env (Note note body) = Note note (libCase env body) + +libCase env (Lam binder body) + = Lam binder (libCase (addBinders env [binder]) body) + +libCase env (Let bind body) + = Let bind' (libCase env_body body) + where + (env_body, bind') = libCaseBind env bind + +libCase env (Case scrut bndr ty alts) + = Case (libCase env scrut) bndr ty (map (libCaseAlt env_alts) alts) + where + env_alts = addBinders env_with_scrut [bndr] + env_with_scrut = case scrut of + Var scrut_var -> addScrutedVar env scrut_var + other -> env + +libCaseAlt env (con,args,rhs) = (con, args, libCase (addBinders env args) rhs) +\end{code} + +Ids +~~~ +\begin{code} +libCaseId :: LibCaseEnv -> Id -> CoreExpr +libCaseId env v + | Just the_bind <- lookupRecId env v -- It's a use of a recursive thing + , notNull free_scruts -- with free vars scrutinised in RHS + = Let the_bind (Var v) + + | otherwise + = Var v + + where + rec_id_level = lookupLevel env v + free_scruts = freeScruts env rec_id_level +\end{code} + + + +Utility functions +~~~~~~~~~~~~~~~~~ +\begin{code} +addBinders :: LibCaseEnv -> [CoreBndr] -> LibCaseEnv +addBinders (LibCaseEnv bomb lvl lvl_env rec_env scruts) binders + = LibCaseEnv bomb lvl lvl_env' rec_env scruts + where + lvl_env' = extendVarEnvList lvl_env (binders `zip` repeat lvl) + +addRecBinds :: LibCaseEnv -> [(Id,CoreExpr)] -> LibCaseEnv +addRecBinds (LibCaseEnv bomb lvl lvl_env rec_env scruts) pairs + = LibCaseEnv bomb lvl' lvl_env' rec_env' scruts + where + lvl' = lvl + 1 + lvl_env' = extendVarEnvList lvl_env [(binder,lvl) | (binder,_) <- pairs] + rec_env' = extendVarEnvList rec_env [(binder, Rec pairs) | (binder,_) <- pairs] + +addScrutedVar :: LibCaseEnv + -> Id -- This Id is being scrutinised by a case expression + -> LibCaseEnv + +addScrutedVar env@(LibCaseEnv bomb lvl lvl_env rec_env scruts) scrut_var + | bind_lvl < lvl + = LibCaseEnv bomb lvl lvl_env rec_env scruts' + -- Add to scruts iff the scrut_var is being scrutinised at + -- a deeper level than its defn + + | otherwise = env + where + scruts' = (scrut_var, lvl) : scruts + bind_lvl = case lookupVarEnv lvl_env scrut_var of + Just lvl -> lvl + Nothing -> topLevel + +lookupRecId :: LibCaseEnv -> Id -> Maybe CoreBind +lookupRecId (LibCaseEnv bomb lvl lvl_env rec_env scruts) id + = lookupVarEnv rec_env id + +lookupLevel :: LibCaseEnv -> Id -> LibCaseLevel +lookupLevel (LibCaseEnv bomb lvl lvl_env rec_env scruts) id + = case lookupVarEnv lvl_env id of + Just lvl -> lvl + Nothing -> topLevel + +freeScruts :: LibCaseEnv + -> LibCaseLevel -- Level of the recursive Id + -> [Id] -- Ids that are scrutinised between the binding + -- of the recursive Id and here +freeScruts (LibCaseEnv bomb lvl lvl_env rec_env scruts) rec_bind_lvl + = [v | (v,scrut_lvl) <- scruts, scrut_lvl > rec_bind_lvl] +\end{code} diff --git a/compiler/simplCore/OccurAnal.lhs b/compiler/simplCore/OccurAnal.lhs new file mode 100644 index 0000000000..90a565f4dd --- /dev/null +++ b/compiler/simplCore/OccurAnal.lhs @@ -0,0 +1,823 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% 
+%************************************************************************ +%* * +\section[OccurAnal]{Occurrence analysis pass} +%* * +%************************************************************************ + +The occurrence analyser re-typechecks a core expression, returning a new +core expression with (hopefully) improved usage information. + +\begin{code} +module OccurAnal ( + occurAnalysePgm, occurAnalyseExpr + ) where + +#include "HsVersions.h" + +import CoreSyn +import CoreFVs ( idRuleVars ) +import CoreUtils ( exprIsTrivial, isDefaultAlt ) +import Id ( isDataConWorkId, isOneShotBndr, setOneShotLambda, + idOccInfo, setIdOccInfo, isLocalId, + isExportedId, idArity, idSpecialisation, + idType, idUnique, Id + ) +import IdInfo ( isEmptySpecInfo ) +import BasicTypes ( OccInfo(..), isOneOcc, InterestingCxt ) + +import VarSet +import VarEnv + +import Type ( isFunTy, dropForAlls ) +import Maybes ( orElse ) +import Digraph ( stronglyConnCompR, SCC(..) ) +import PrelNames ( buildIdKey, foldrIdKey, runSTRepIdKey, augmentIdKey ) +import Unique ( Unique ) +import UniqFM ( keysUFM ) +import Util ( zipWithEqual, mapAndUnzip ) +import Outputable +\end{code} + + +%************************************************************************ +%* * +\subsection[OccurAnal-main]{Counting occurrences: main function} +%* * +%************************************************************************ + +Here's the externally-callable interface: + +\begin{code} +occurAnalysePgm :: [CoreBind] -> [CoreBind] +occurAnalysePgm binds + = snd (go initOccEnv binds) + where + go :: OccEnv -> [CoreBind] -> (UsageDetails, [CoreBind]) + go env [] + = (emptyDetails, []) + go env (bind:binds) + = (final_usage, bind' ++ binds') + where + (bs_usage, binds') = go env binds + (final_usage, bind') = occAnalBind env bind bs_usage + +occurAnalyseExpr :: CoreExpr -> CoreExpr + -- Do occurrence analysis, and discard occurence info returned +occurAnalyseExpr expr = snd (occAnal initOccEnv expr) +\end{code} + + +%************************************************************************ +%* * +\subsection[OccurAnal-main]{Counting occurrences: main function} +%* * +%************************************************************************ + +Bindings +~~~~~~~~ + +\begin{code} +type IdWithOccInfo = Id -- An Id with fresh PragmaInfo attached + +type Node details = (details, Unique, [Unique]) -- The Ints are gotten from the Unique, + -- which is gotten from the Id. +type Details1 = (Id, UsageDetails, CoreExpr) +type Details2 = (IdWithOccInfo, CoreExpr) + + +occAnalBind :: OccEnv + -> CoreBind + -> UsageDetails -- Usage details of scope + -> (UsageDetails, -- Of the whole let(rec) + [CoreBind]) + +occAnalBind env (NonRec binder rhs) body_usage + | not (binder `usedIn` body_usage) -- It's not mentioned + = (body_usage, []) + + | otherwise -- It's mentioned in the body + = (final_body_usage `combineUsageDetails` rhs_usage, + [NonRec tagged_binder rhs']) + + where + (final_body_usage, tagged_binder) = tagBinder body_usage binder + (rhs_usage, rhs') = occAnalRhs env tagged_binder rhs +\end{code} + +Dropping dead code for recursive bindings is done in a very simple way: + + the entire set of bindings is dropped if none of its binders are + mentioned in its body; otherwise none are. + +This seems to miss an obvious improvement. +@ + letrec f = ...g... + g = ...f... + in + ...g... + +===> + + letrec f = ...g... + g = ...(...g...)... + in + ...g... +@ + +Now @f@ is unused. 
But dependency analysis will sort this out into a +@letrec@ for @g@ and a @let@ for @f@, and then @f@ will get dropped. +It isn't easy to do a perfect job in one blow. Consider + +@ + letrec f = ...g... + g = ...h... + h = ...k... + k = ...m... + m = ...m... + in + ...m... +@ + + +\begin{code} +occAnalBind env (Rec pairs) body_usage + = foldr (_scc_ "occAnalBind.dofinal" do_final_bind) (body_usage, []) sccs + where + analysed_pairs :: [Details1] + analysed_pairs = [ (bndr, rhs_usage, rhs') + | (bndr, rhs) <- pairs, + let (rhs_usage, rhs') = occAnalRhs env bndr rhs + ] + + sccs :: [SCC (Node Details1)] + sccs = _scc_ "occAnalBind.scc" stronglyConnCompR edges + + + ---- stuff for dependency analysis of binds ------------------------------- + edges :: [Node Details1] + edges = _scc_ "occAnalBind.assoc" + [ (details, idUnique id, edges_from rhs_usage) + | details@(id, rhs_usage, rhs) <- analysed_pairs + ] + + -- (a -> b) means a mentions b + -- Given the usage details (a UFM that gives occ info for each free var of + -- the RHS) we can get the list of free vars -- or rather their Int keys -- + -- by just extracting the keys from the finite map. Grimy, but fast. + -- Previously we had this: + -- [ bndr | bndr <- bndrs, + -- maybeToBool (lookupVarEnv rhs_usage bndr)] + -- which has n**2 cost, and this meant that edges_from alone + -- consumed 10% of total runtime! + edges_from :: UsageDetails -> [Unique] + edges_from rhs_usage = _scc_ "occAnalBind.edges_from" + keysUFM rhs_usage + + ---- stuff to "re-constitute" bindings from dependency-analysis info ------ + + -- Non-recursive SCC + do_final_bind (AcyclicSCC ((bndr, rhs_usage, rhs'), _, _)) (body_usage, binds_so_far) + | not (bndr `usedIn` body_usage) + = (body_usage, binds_so_far) -- Dead code + | otherwise + = (combined_usage, new_bind : binds_so_far) + where + total_usage = combineUsageDetails body_usage rhs_usage + (combined_usage, tagged_bndr) = tagBinder total_usage bndr + new_bind = NonRec tagged_bndr rhs' + + -- Recursive SCC + do_final_bind (CyclicSCC cycle) (body_usage, binds_so_far) + | not (any (`usedIn` body_usage) bndrs) -- NB: look at body_usage, not total_usage + = (body_usage, binds_so_far) -- Dead code + | otherwise + = (combined_usage, final_bind:binds_so_far) + where + details = [details | (details, _, _) <- cycle] + bndrs = [bndr | (bndr, _, _) <- details] + rhs_usages = [rhs_usage | (_, rhs_usage, _) <- details] + total_usage = foldr combineUsageDetails body_usage rhs_usages + (combined_usage, tagged_bndrs) = tagBinders total_usage bndrs + final_bind = Rec (reOrderRec env new_cycle) + + new_cycle = CyclicSCC (zipWithEqual "occAnalBind" mk_new_bind tagged_bndrs cycle) + mk_new_bind tagged_bndr ((_, _, rhs'), key, keys) = ((tagged_bndr, rhs'), key, keys) +\end{code} + +@reOrderRec@ is applied to the list of (binder,rhs) pairs for a cyclic +strongly connected component (there's guaranteed to be a cycle). It returns the +same pairs, but + a) in a better order, + b) with some of the Ids having a IMustNotBeINLINEd pragma + +The "no-inline" Ids are sufficient to break all cycles in the SCC. This means +that the simplifier can guarantee not to loop provided it never records an inlining +for these no-inline guys. + +Furthermore, the order of the binds is such that if we neglect dependencies +on the no-inline Ids then the binds are topologically sorted. This means +that the simplifier will generally do a good job if it works from top bottom, +recording inlinings for any Ids which aren't marked as "no-inline" as it goes. 
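+
+A small invented example of the choice (the scoring function appears
+below):
+
+\begin{verbatim}
+letrec f = g			-- trivial RHS; high score
+       g = \x -> ...f...	-- ordinary function; low score
+in ...f...
+\end{verbatim}
+
+Here g, having the lower score, is picked as the loop breaker and
+marked IAmALoopBreaker, so the simplifier never records an inlining
+for it and cannot loop; f, by contrast, stays inlinable, so the
+trivial binding f = g can be substituted away.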
+ +============== +[June 98: I don't understand the following paragraphs, and I've + changed the a=b case again so that it isn't a special case any more.] + +Here's a case that bit me: + + letrec + a = b + b = \x. BIG + in + ...a...a...a.... + +Re-ordering doesn't change the order of bindings, but there was no loop-breaker. + +My solution was to make a=b bindings record b as Many, rather like INLINE bindings. +Perhaps something cleverer would suffice. +=============== + +You might think that you can prevent non-termination simply by making +sure that we simplify a recursive binding's RHS in an environment that +simply clones the recursive Id. But no. Consider + + letrec f = \x -> let z = f x' in ... + + in + let n = f y + in + case n of { ... } + +We bind n to its *simplified* RHS, we then *re-simplify* it when +we inline n. Then we may well inline f; and then the same thing +happens with z! + +I don't think it's possible to prevent non-termination by environment +manipulation in this way. Apart from anything else, successive +iterations of the simplifier may unroll recursive loops in cases like +that above. The idea of beaking every recursive loop with an +IMustNotBeINLINEd pragma is much much better. + + +\begin{code} +reOrderRec + :: OccEnv + -> SCC (Node Details2) + -> [Details2] + -- Sorted into a plausible order. Enough of the Ids have + -- dontINLINE pragmas that there are no loops left. + + -- Non-recursive case +reOrderRec env (AcyclicSCC (bind, _, _)) = [bind] + + -- Common case of simple self-recursion +reOrderRec env (CyclicSCC [bind]) + = [(setIdOccInfo tagged_bndr IAmALoopBreaker, rhs)] + where + ((tagged_bndr, rhs), _, _) = bind + +reOrderRec env (CyclicSCC (bind : binds)) + = -- Choose a loop breaker, mark it no-inline, + -- do SCC analysis on the rest, and recursively sort them out + concat (map (reOrderRec env) (stronglyConnCompR unchosen)) + ++ + [(setIdOccInfo tagged_bndr IAmALoopBreaker, rhs)] + + where + (chosen_pair, unchosen) = choose_loop_breaker bind (score bind) [] binds + (tagged_bndr, rhs) = chosen_pair + + -- This loop looks for the bind with the lowest score + -- to pick as the loop breaker. The rest accumulate in + choose_loop_breaker (details,_,_) loop_sc acc [] + = (details, acc) -- Done + + choose_loop_breaker loop_bind loop_sc acc (bind : binds) + | sc < loop_sc -- Lower score so pick this new one + = choose_loop_breaker bind sc (loop_bind : acc) binds + + | otherwise -- No lower so don't pick it + = choose_loop_breaker loop_bind loop_sc (bind : acc) binds + where + sc = score bind + + score :: Node Details2 -> Int -- Higher score => less likely to be picked as loop breaker + score ((bndr, rhs), _, _) + | exprIsTrivial rhs = 4 -- Practically certain to be inlined + -- Used to have also: && not (isExportedId bndr) + -- But I found this sometimes cost an extra iteration when we have + -- rec { d = (a,b); a = ...df...; b = ...df...; df = d } + -- where df is the exported dictionary. 
Then df makes a really + -- bad choice for loop breaker + + | not_fun_ty (idType bndr) = 3 -- Data types help with cases + -- This used to have a lower score than inlineCandidate, but + -- it's *really* helpful if dictionaries get inlined fast, + -- so I'm experimenting with giving higher priority to data-typed things + + | inlineCandidate bndr rhs = 2 -- Likely to be inlined + + | not (isEmptySpecInfo (idSpecialisation bndr)) = 1 + -- Avoid things with specialisations; we'd like + -- to take advantage of them in the subsequent bindings + + | otherwise = 0 + + inlineCandidate :: Id -> CoreExpr -> Bool + inlineCandidate id (Note InlineMe _) = True + inlineCandidate id rhs = isOneOcc (idOccInfo id) + + -- Real example (the Enum Ordering instance from PrelBase): + -- rec f = \ x -> case d of (p,q,r) -> p x + -- g = \ x -> case d of (p,q,r) -> q x + -- d = (v, f, g) + -- + -- Here, f and g occur just once; but we can't inline them into d. + -- On the other hand we *could* simplify those case expressions if + -- we didn't stupidly choose d as the loop breaker. + -- But we won't because constructor args are marked "Many". + + not_fun_ty ty = not (isFunTy (dropForAlls ty)) +\end{code} + +@occAnalRhs@ deals with the question of bindings where the Id is marked +by an INLINE pragma. For these we record that anything which occurs +in its RHS occurs many times. This pessimistically assumes that ths +inlined binder also occurs many times in its scope, but if it doesn't +we'll catch it next time round. At worst this costs an extra simplifier pass. +ToDo: try using the occurrence info for the inline'd binder. + +[March 97] We do the same for atomic RHSs. Reason: see notes with reOrderRec. +[June 98, SLPJ] I've undone this change; I don't understand it. See notes with reOrderRec. + + +\begin{code} +occAnalRhs :: OccEnv + -> Id -> CoreExpr -- Binder and rhs + -- For non-recs the binder is alrady tagged + -- with occurrence info + -> (UsageDetails, CoreExpr) + +occAnalRhs env id rhs + = (final_usage, rhs') + where + (rhs_usage, rhs') = occAnal ctxt rhs + ctxt | certainly_inline id = env + | otherwise = rhsCtxt + -- Note that we generally use an rhsCtxt. This tells the occ anal n + -- that it's looking at an RHS, which has an effect in occAnalApp + -- + -- But there's a problem. Consider + -- x1 = a0 : [] + -- x2 = a1 : x1 + -- x3 = a2 : x2 + -- g = f x3 + -- First time round, it looks as if x1 and x2 occur as an arg of a + -- let-bound constructor ==> give them a many-occurrence. + -- But then x3 is inlined (unconditionally as it happens) and + -- next time round, x2 will be, and the next time round x1 will be + -- Result: multiple simplifier iterations. Sigh. + -- Crude solution: use rhsCtxt for things that occur just once... + + certainly_inline id = case idOccInfo id of + OneOcc in_lam one_br _ -> not in_lam && one_br + other -> False + + -- [March 98] A new wrinkle is that if the binder has specialisations inside + -- it then we count the specialised Ids as "extra rhs's". That way + -- the "parent" keeps the specialised "children" alive. If the parent + -- dies (because it isn't referenced any more), then the children will + -- die too unless they are already referenced directly. 
+ + final_usage = addRuleUsage rhs_usage id + +addRuleUsage :: UsageDetails -> Id -> UsageDetails +-- Add the usage from RULES in Id to the usage +addRuleUsage usage id + = foldVarSet add usage (idRuleVars id) + where + add v u = addOneOcc u v NoOccInfo -- Give a non-committal binder info + -- (i.e manyOcc) because many copies + -- of the specialised thing can appear +\end{code} + +Expressions +~~~~~~~~~~~ +\begin{code} +occAnal :: OccEnv + -> CoreExpr + -> (UsageDetails, -- Gives info only about the "interesting" Ids + CoreExpr) + +occAnal env (Type t) = (emptyDetails, Type t) +occAnal env (Var v) = (mkOneOcc env v False, Var v) + -- At one stage, I gathered the idRuleVars for v here too, + -- which in a way is the right thing to do. + -- Btu that went wrong right after specialisation, when + -- the *occurrences* of the overloaded function didn't have any + -- rules in them, so the *specialised* versions looked as if they + -- weren't used at all. +\end{code} + +We regard variables that occur as constructor arguments as "dangerousToDup": + +\begin{verbatim} +module A where +f x = let y = expensive x in + let z = (True,y) in + (case z of {(p,q)->q}, case z of {(p,q)->q}) +\end{verbatim} + +We feel free to duplicate the WHNF (True,y), but that means +that y may be duplicated thereby. + +If we aren't careful we duplicate the (expensive x) call! +Constructors are rather like lambdas in this way. + +\begin{code} +occAnal env expr@(Lit lit) = (emptyDetails, expr) +\end{code} + +\begin{code} +occAnal env (Note InlineMe body) + = case occAnal env body of { (usage, body') -> + (mapVarEnv markMany usage, Note InlineMe body') + } + +occAnal env (Note note@(SCC cc) body) + = case occAnal env body of { (usage, body') -> + (mapVarEnv markInsideSCC usage, Note note body') + } + +occAnal env (Note note body) + = case occAnal env body of { (usage, body') -> + (usage, Note note body') + } +\end{code} + +\begin{code} +occAnal env app@(App fun arg) + = occAnalApp env (collectArgs app) False + +-- Ignore type variables altogether +-- (a) occurrences inside type lambdas only not marked as InsideLam +-- (b) type variables not in environment + +occAnal env expr@(Lam x body) | isTyVar x + = case occAnal env body of { (body_usage, body') -> + (body_usage, Lam x body') + } + +-- For value lambdas we do a special hack. Consider +-- (\x. \y. ...x...) +-- If we did nothing, x is used inside the \y, so would be marked +-- as dangerous to dup. But in the common case where the abstraction +-- is applied to two arguments this is over-pessimistic. +-- So instead, we just mark each binder with its occurrence +-- info in the *body* of the multiple lambda. +-- Then, the simplifier is careful when partially applying lambdas. + +occAnal env expr@(Lam _ _) + = case occAnal env_body body of { (body_usage, body') -> + let + (final_usage, tagged_binders) = tagBinders body_usage binders + -- URGH! Sept 99: we don't seem to be able to use binders' here, because + -- we get linear-typed things in the resulting program that we can't handle yet. + -- (e.g. 
PrelShow) TODO + + really_final_usage = if linear then + final_usage + else + mapVarEnv markInsideLam final_usage + in + (really_final_usage, + mkLams tagged_binders body') } + where + env_body = vanillaCtxt -- Body is (no longer) an RhsContext + (binders, body) = collectBinders expr + binders' = oneShotGroup env binders + linear = all is_one_shot binders' + is_one_shot b = isId b && isOneShotBndr b + +occAnal env (Case scrut bndr ty alts) + = case occ_anal_scrut scrut alts of { (scrut_usage, scrut') -> + case mapAndUnzip (occAnalAlt env bndr) alts of { (alts_usage_s, alts') -> + let + alts_usage = foldr1 combineAltsUsageDetails alts_usage_s + alts_usage' = addCaseBndrUsage alts_usage + (alts_usage1, tagged_bndr) = tagBinder alts_usage' bndr + total_usage = scrut_usage `combineUsageDetails` alts_usage1 + in + total_usage `seq` (total_usage, Case scrut' tagged_bndr ty alts') }} + where + -- The case binder gets a usage of either "many" or "dead", never "one". + -- Reason: we like to inline single occurrences, to eliminate a binding, + -- but inlining a case binder *doesn't* eliminate a binding. + -- We *don't* want to transform + -- case x of w { (p,q) -> f w } + -- into + -- case x of w { (p,q) -> f (p,q) } + addCaseBndrUsage usage = case lookupVarEnv usage bndr of + Nothing -> usage + Just occ -> extendVarEnv usage bndr (markMany occ) + + occ_anal_scrut (Var v) (alt1 : other_alts) + | not (null other_alts) || not (isDefaultAlt alt1) + = (mkOneOcc env v True, Var v) + occ_anal_scrut scrut alts = occAnal vanillaCtxt scrut + -- No need for rhsCtxt + +occAnal env (Let bind body) + = case occAnal env body of { (body_usage, body') -> + case occAnalBind env bind body_usage of { (final_usage, new_binds) -> + (final_usage, mkLets new_binds body') }} + +occAnalArgs env args + = case mapAndUnzip (occAnal arg_env) args of { (arg_uds_s, args') -> + (foldr combineUsageDetails emptyDetails arg_uds_s, args')} + where + arg_env = vanillaCtxt +\end{code} + +Applications are dealt with specially because we want +the "build hack" to work. + +\begin{code} +-- Hack for build, fold, runST +occAnalApp env (Var fun, args) is_rhs + = case args_stuff of { (args_uds, args') -> + let + -- We mark the free vars of the argument of a constructor or PAP + -- as "many", if it is the RHS of a let(rec). + -- This means that nothing gets inlined into a constructor argument + -- position, which is what we want. Typically those constructor + -- arguments are just variables, or trivial expressions. + -- + -- This is the *whole point* of the isRhsEnv predicate + final_args_uds + | isRhsEnv env, + isDataConWorkId fun || valArgCount args < idArity fun + = mapVarEnv markMany args_uds + | otherwise = args_uds + in + (fun_uds `combineUsageDetails` final_args_uds, mkApps (Var fun) args') } + where + fun_uniq = idUnique fun + fun_uds = mkOneOcc env fun (valArgCount args > 0) + args_stuff | fun_uniq == buildIdKey = appSpecial env 2 [True,True] args + | fun_uniq == augmentIdKey = appSpecial env 2 [True,True] args + | fun_uniq == foldrIdKey = appSpecial env 3 [False,True] args + | fun_uniq == runSTRepIdKey = appSpecial env 2 [True] args + -- (foldr k z xs) may call k many times, but it never + -- shares a partial application of k; hence [False,True] + -- This means we can optimise + -- foldr (\x -> let v = ...x... in \y -> ...v...) 
z xs + -- by floating in the v + + | otherwise = occAnalArgs env args + + +occAnalApp env (fun, args) is_rhs + = case occAnal (addAppCtxt env args) fun of { (fun_uds, fun') -> + -- The addAppCtxt is a bit cunning. One iteration of the simplifier + -- often leaves behind beta redexs like + -- (\x y -> e) a1 a2 + -- Here we would like to mark x,y as one-shot, and treat the whole + -- thing much like a let. We do this by pushing some True items + -- onto the context stack. + + case occAnalArgs env args of { (args_uds, args') -> + let + final_uds = fun_uds `combineUsageDetails` args_uds + in + (final_uds, mkApps fun' args') }} + +appSpecial :: OccEnv + -> Int -> CtxtTy -- Argument number, and context to use for it + -> [CoreExpr] + -> (UsageDetails, [CoreExpr]) +appSpecial env n ctxt args + = go n args + where + arg_env = vanillaCtxt + + go n [] = (emptyDetails, []) -- Too few args + + go 1 (arg:args) -- The magic arg + = case occAnal (setCtxt arg_env ctxt) arg of { (arg_uds, arg') -> + case occAnalArgs env args of { (args_uds, args') -> + (combineUsageDetails arg_uds args_uds, arg':args') }} + + go n (arg:args) + = case occAnal arg_env arg of { (arg_uds, arg') -> + case go (n-1) args of { (args_uds, args') -> + (combineUsageDetails arg_uds args_uds, arg':args') }} +\end{code} + + +Case alternatives +~~~~~~~~~~~~~~~~~ +If the case binder occurs at all, the other binders effectively do too. +For example + case e of x { (a,b) -> rhs } +is rather like + let x = (a,b) in rhs +If e turns out to be (e1,e2) we indeed get something like + let a = e1; b = e2; x = (a,b) in rhs + +\begin{code} +occAnalAlt env case_bndr (con, bndrs, rhs) + = case occAnal env rhs of { (rhs_usage, rhs') -> + let + (final_usage, tagged_bndrs) = tagBinders rhs_usage bndrs + final_bndrs | case_bndr `elemVarEnv` final_usage = bndrs + | otherwise = tagged_bndrs + -- Leave the binders untagged if the case + -- binder occurs at all; see note above + in + (final_usage, (con, final_bndrs, rhs')) } +\end{code} + + +%************************************************************************ +%* * +\subsection[OccurAnal-types]{OccEnv} +%* * +%************************************************************************ + +\begin{code} +data OccEnv + = OccEnv OccEncl -- Enclosing context information + CtxtTy -- Tells about linearity + +-- OccEncl is used to control whether to inline into constructor arguments +-- For example: +-- x = (p,q) -- Don't inline p or q +-- y = /\a -> (p a, q a) -- Still don't inline p or q +-- z = f (p,q) -- Do inline p,q; it may make a rule fire +-- So OccEncl tells enought about the context to know what to do when +-- we encounter a contructor application or PAP. + +data OccEncl + = OccRhs -- RHS of let(rec), albeit perhaps inside a type lambda + -- Don't inline into constructor args here + | OccVanilla -- Argument of function, body of lambda, scruintee of case etc. 
+ -- Do inline into constructor args here + +type CtxtTy = [Bool] + -- [] No info + -- + -- True:ctxt Analysing a function-valued expression that will be + -- applied just once + -- + -- False:ctxt Analysing a function-valued expression that may + -- be applied many times; but when it is, + -- the CtxtTy inside applies + +initOccEnv :: OccEnv +initOccEnv = OccEnv OccRhs [] + +vanillaCtxt = OccEnv OccVanilla [] +rhsCtxt = OccEnv OccRhs [] + +isRhsEnv (OccEnv OccRhs _) = True +isRhsEnv (OccEnv OccVanilla _) = False + +setCtxt :: OccEnv -> CtxtTy -> OccEnv +setCtxt (OccEnv encl _) ctxt = OccEnv encl ctxt + +oneShotGroup :: OccEnv -> [CoreBndr] -> [CoreBndr] + -- The result binders have one-shot-ness set that they might not have had originally. + -- This happens in (build (\cn -> e)). Here the occurrence analyser + -- linearity context knows that c,n are one-shot, and it records that fact in + -- the binder. This is useful to guide subsequent float-in/float-out tranformations + +oneShotGroup (OccEnv encl ctxt) bndrs + = go ctxt bndrs [] + where + go ctxt [] rev_bndrs = reverse rev_bndrs + + go (lin_ctxt:ctxt) (bndr:bndrs) rev_bndrs + | isId bndr = go ctxt bndrs (bndr':rev_bndrs) + where + bndr' | lin_ctxt = setOneShotLambda bndr + | otherwise = bndr + + go ctxt (bndr:bndrs) rev_bndrs = go ctxt bndrs (bndr:rev_bndrs) + +addAppCtxt (OccEnv encl ctxt) args + = OccEnv encl (replicate (valArgCount args) True ++ ctxt) +\end{code} + +%************************************************************************ +%* * +\subsection[OccurAnal-types]{OccEnv} +%* * +%************************************************************************ + +\begin{code} +type UsageDetails = IdEnv OccInfo -- A finite map from ids to their usage + +combineUsageDetails, combineAltsUsageDetails + :: UsageDetails -> UsageDetails -> UsageDetails + +combineUsageDetails usage1 usage2 + = plusVarEnv_C addOccInfo usage1 usage2 + +combineAltsUsageDetails usage1 usage2 + = plusVarEnv_C orOccInfo usage1 usage2 + +addOneOcc :: UsageDetails -> Id -> OccInfo -> UsageDetails +addOneOcc usage id info + = plusVarEnv_C addOccInfo usage (unitVarEnv id info) + -- ToDo: make this more efficient + +emptyDetails = (emptyVarEnv :: UsageDetails) + +usedIn :: Id -> UsageDetails -> Bool +v `usedIn` details = isExportedId v || v `elemVarEnv` details + +tagBinders :: UsageDetails -- Of scope + -> [Id] -- Binders + -> (UsageDetails, -- Details with binders removed + [IdWithOccInfo]) -- Tagged binders + +tagBinders usage binders + = let + usage' = usage `delVarEnvList` binders + uss = map (setBinderOcc usage) binders + in + usage' `seq` (usage', uss) + +tagBinder :: UsageDetails -- Of scope + -> Id -- Binders + -> (UsageDetails, -- Details with binders removed + IdWithOccInfo) -- Tagged binders + +tagBinder usage binder + = let + usage' = usage `delVarEnv` binder + binder' = setBinderOcc usage binder + in + usage' `seq` (usage', binder') + +setBinderOcc :: UsageDetails -> CoreBndr -> CoreBndr +setBinderOcc usage bndr + | isTyVar bndr = bndr + | isExportedId bndr = case idOccInfo bndr of + NoOccInfo -> bndr + other -> setIdOccInfo bndr NoOccInfo + -- Don't use local usage info for visible-elsewhere things + -- BUT *do* erase any IAmALoopBreaker annotation, because we're + -- about to re-generate it and it shouldn't be "sticky" + + | otherwise = setIdOccInfo bndr occ_info + where + occ_info = lookupVarEnv usage bndr `orElse` IAmDead +\end{code} + + +%************************************************************************ +%* * +\subsection{Operations over 
OccInfo} +%* * +%************************************************************************ + +\begin{code} +mkOneOcc :: OccEnv -> Id -> InterestingCxt -> UsageDetails +mkOneOcc env id int_cxt + | isLocalId id = unitVarEnv id (OneOcc False True int_cxt) + | otherwise = emptyDetails + +markMany, markInsideLam, markInsideSCC :: OccInfo -> OccInfo + +markMany IAmDead = IAmDead +markMany other = NoOccInfo + +markInsideSCC occ = markMany occ + +markInsideLam (OneOcc _ one_br int_cxt) = OneOcc True one_br int_cxt +markInsideLam occ = occ + +addOccInfo, orOccInfo :: OccInfo -> OccInfo -> OccInfo + +addOccInfo IAmDead info2 = info2 +addOccInfo info1 IAmDead = info1 +addOccInfo info1 info2 = NoOccInfo + +-- (orOccInfo orig new) is used +-- when combining occurrence info from branches of a case + +orOccInfo IAmDead info2 = info2 +orOccInfo info1 IAmDead = info1 +orOccInfo (OneOcc in_lam1 one_branch1 int_cxt1) + (OneOcc in_lam2 one_branch2 int_cxt2) + = OneOcc (in_lam1 || in_lam2) + False -- False, because it occurs in both branches + (int_cxt1 && int_cxt2) + +orOccInfo info1 info2 = NoOccInfo +\end{code} diff --git a/compiler/simplCore/SAT.lhs b/compiler/simplCore/SAT.lhs new file mode 100644 index 0000000000..81f3c4c406 --- /dev/null +++ b/compiler/simplCore/SAT.lhs @@ -0,0 +1,214 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +%************************************************************************ +%* * +\section[SAT]{Static Argument Transformation pass} +%* * +%************************************************************************ + +96/03: We aren't using the static-argument transformation right now. + +May be seen as removing invariants from loops: +Arguments of recursive functions that do not change in recursive +calls are removed from the recursion, which is done locally +and only passes the arguments which effectively change. + +Example: +map = /\ ab -> \f -> \xs -> case xs of + [] -> [] + (a:b) -> f a : map f b + +as map is recursively called with the same argument f (unmodified) +we transform it to + +map = /\ ab -> \f -> \xs -> let map' ys = case ys of + [] -> [] + (a:b) -> f a : map' b + in map' xs + +Notice that for a compiler that uses lambda lifting this is +useless as map' will be transformed back to what map was. + +We could possibly do the same for big lambdas, but we don't as +they will eventually be removed in later stages of the compiler, +therefore there is no penalty in keeping them. + +Experimental Evidence: Heap: +/- 7% + Instrs: Always improves for 2 or more Static Args. 
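+
+For comparison, the shape the transformation aims for can also be written
+directly at source level.  A hypothetical hand-written example (the names
+are illustrative only; this is not output of the pass):
+
+    sumBy :: (a -> Int) -> [a] -> Int
+    sumBy f xs0 = go 0 xs0
+      where
+        -- f is static: it is free in go rather than passed to every call
+        go acc []     = acc
+        go acc (x:xs) = go (acc + f x) xs
+
+Only the arguments that actually change (the accumulator and the list)
+are passed around the recursion; the static argument f is captured once
+by the local loop.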
+ +\begin{code} +module SAT ( doStaticArgs ) where + +#include "HsVersions.h" + +import Panic ( panic ) + +doStaticArgs = panic "SAT.doStaticArgs (ToDo)" + +{- LATER: to end of file: + +import SATMonad +import Util +\end{code} + +\begin{code} +doStaticArgs :: [CoreBind] -> UniqSupply -> [CoreBind] + +doStaticArgs binds + = do { + showPass "Static argument"; + let { binds' = initSAT (mapSAT sat_bind binds) }; + endPass "Static argument" + False -- No specific flag for dumping SAT + binds' + } + where + sat_bind (NonRec binder expr) + = emptyEnvSAT `thenSAT_` + satExpr expr `thenSAT` (\ expr' -> + returnSAT (NonRec binder expr') ) + sat_bind (Rec [(binder,rhs)]) + = emptyEnvSAT `thenSAT_` + insSAEnv binder (getArgLists rhs) `thenSAT_` + satExpr rhs `thenSAT` (\ rhs' -> + saTransform binder rhs') + sat_bind (Rec pairs) + = emptyEnvSAT `thenSAT_` + mapSAT satExpr rhss `thenSAT` \ rhss' -> + returnSAT (Rec (zipEqual "doStaticArgs" binders rhss')) + where + (binders, rhss) = unzip pairs +\end{code} + +\begin{code} +satAtom (VarArg v) + = updSAEnv (Just (v,([],[]))) `thenSAT_` + returnSAT () + +satAtom _ = returnSAT () +\end{code} + +\begin{code} +satExpr :: CoreExpr -> SatM CoreExpr + +satExpr var@(Var v) + = updSAEnv (Just (v,([],[]))) `thenSAT_` + returnSAT var + +satExpr lit@(Lit _) = returnSAT lit + +satExpr e@(Prim prim ty args) + = mapSAT satAtom args `thenSAT_` + returnSAT e + +satExpr (Lam binders body) + = satExpr body `thenSAT` \ body' -> + returnSAT (Lam binders body') + +satExpr (CoTyLam tyvar body) + = satExpr body `thenSAT` (\ body' -> + returnSAT (CoTyLam tyvar body') ) + +satExpr app@(App _ _) + = getAppArgs app + +satExpr app@(CoTyApp _ _) + = getAppArgs app + +satExpr (Case expr alts) + = satExpr expr `thenSAT` \ expr' -> + sat_alts alts `thenSAT` \ alts' -> + returnSAT (Case expr' alts') + where + sat_alts (AlgAlts alts deflt) + = mapSAT satAlgAlt alts `thenSAT` \ alts' -> + sat_default deflt `thenSAT` \ deflt' -> + returnSAT (AlgAlts alts' deflt') + where + satAlgAlt (con, params, rhs) + = satExpr rhs `thenSAT` \ rhs' -> + returnSAT (con, params, rhs') + + sat_alts (PrimAlts alts deflt) + = mapSAT satPrimAlt alts `thenSAT` \ alts' -> + sat_default deflt `thenSAT` \ deflt' -> + returnSAT (PrimAlts alts' deflt') + where + satPrimAlt (lit, rhs) + = satExpr rhs `thenSAT` \ rhs' -> + returnSAT (lit, rhs') + + sat_default NoDefault + = returnSAT NoDefault + sat_default (BindDefault binder rhs) + = satExpr rhs `thenSAT` \ rhs' -> + returnSAT (BindDefault binder rhs') + +satExpr (Let (NonRec binder rhs) body) + = satExpr body `thenSAT` \ body' -> + satExpr rhs `thenSAT` \ rhs' -> + returnSAT (Let (NonRec binder rhs') body') + +satExpr (Let (Rec [(binder,rhs)]) body) + = satExpr body `thenSAT` \ body' -> + insSAEnv binder (getArgLists rhs) `thenSAT_` + satExpr rhs `thenSAT` \ rhs' -> + saTransform binder rhs' `thenSAT` \ binding -> + returnSAT (Let binding body') + +satExpr (Let (Rec binds) body) + = let + (binders, rhss) = unzip binds + in + satExpr body `thenSAT` \ body' -> + mapSAT satExpr rhss `thenSAT` \ rhss' -> + returnSAT (Let (Rec (zipEqual "satExpr:Rec" binders rhss')) body') + +satExpr (Note note expr) + = satExpr expr `thenSAT` \ expr2 -> + returnSAT (Note note expr2) +\end{code} + +\begin{code} +getAppArgs :: CoreExpr -> SatM CoreExpr + +getAppArgs app + = get app `thenSAT` \ (app',result) -> + updSAEnv result `thenSAT_` + returnSAT app' + where + get :: CoreExpr + -> SatM (CoreExpr, Maybe (Id, SATInfo)) + + get (CoTyApp e ty) + = get e `thenSAT` \ (e',result) -> + 
returnSAT ( + CoTyApp e' ty, + case result of + Nothing -> Nothing + Just (v,(tv,lv)) -> Just (v,(tv++[Static ty],lv)) + ) + + get (App e a) + = get e `thenSAT` \ (e', result) -> + satAtom a `thenSAT_` + let si = case a of + (VarArg v) -> Static v + _ -> NotStatic + in + returnSAT ( + App e' a, + case result of + Just (v,(tv,lv)) -> Just (v,(tv,lv++[si])) + Nothing -> Nothing + ) + + get var@(Var v) + = returnSAT (var, Just (v,([],[]))) + + get e + = satExpr e `thenSAT` \ e2 -> + returnSAT (e2, Nothing) +-} +\end{code} diff --git a/compiler/simplCore/SATMonad.lhs b/compiler/simplCore/SATMonad.lhs new file mode 100644 index 0000000000..9786f448af --- /dev/null +++ b/compiler/simplCore/SATMonad.lhs @@ -0,0 +1,263 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +%************************************************************************ +%* * +\section[SATMonad]{The Static Argument Transformation pass Monad} +%* * +%************************************************************************ + +96/03: We aren't using the static-argument transformation right now. + +\begin{code} +module SATMonad where + +#include "HsVersions.h" + +import Panic ( panic ) + +junk_from_SATMonad = panic "SATMonad.junk" + +{- LATER: to end of file: + +module SATMonad ( + SATInfo(..), updSAEnv, + SatM(..), initSAT, emptyEnvSAT, + returnSAT, thenSAT, thenSAT_, mapSAT, getSATInfo, newSATName, + getArgLists, Arg(..), insSAEnv, saTransform, + + SATEnv(..), isStatic, dropStatics + ) where + +import Type ( mkTyVarTy, mkSigmaTy, + splitSigmaTy, splitFunTys, + glueTyArgs, substTy, + InstTyEnv(..) + ) +import MkId ( mkSysLocal ) +import Id ( idType, idName, mkLocalId ) +import UniqSupply +import Util + +infixr 9 `thenSAT`, `thenSAT_` +\end{code} + +%************************************************************************ +%* * +\subsection{Static Argument Transformation Environment} +%* * +%************************************************************************ + +\begin{code} +type SATEnv = IdEnv SATInfo + +type SATInfo = ([Arg Type],[Arg Id]) + +data Arg a = Static a | NotStatic + deriving Eq + +delOneFromSAEnv v us env + = ((), delVarEnv env v) + +updSAEnv :: Maybe (Id,SATInfo) -> SatM () +updSAEnv Nothing + = returnSAT () +updSAEnv (Just (b,(tyargs,args))) + = getSATInfo b `thenSAT` (\ r -> + case r of + Nothing -> returnSAT () + Just (tyargs',args') -> delOneFromSAEnv b `thenSAT_` + insSAEnv b (checkArgs tyargs tyargs', + checkArgs args args') + ) + +checkArgs as [] = notStatics (length as) +checkArgs [] as = notStatics (length as) +checkArgs (a:as) (a':as') | a == a' = a:checkArgs as as' +checkArgs (_:as) (_:as') = NotStatic:checkArgs as as' + +notStatics :: Int -> [Arg a] +notStatics n = nOfThem n NotStatic + +insSAEnv :: Id -> SATInfo -> SatM () +insSAEnv b info us env + = ((), extendVarEnv env b info) +\end{code} + +%************************************************************************ +%* * +\subsection{Static Argument Transformation Monad} +%* * +%************************************************************************ + +Two items of state to thread around: a UniqueSupply and a SATEnv. 
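+
+A tiny, hypothetical usage sketch (the combinators themselves are defined
+just below): record an occurrence of v applied to no arguments (exactly
+what satExpr does when it meets a bare Var; any previously recorded static
+arguments of v become NotStatic), and then read back what the environment
+now knows about v:
+
+    seenBare :: Id -> SatM (Maybe SATInfo)
+    seenBare v = updSAEnv (Just (v, ([], []))) `thenSAT_`
+                 getSATInfo v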
+ +\begin{code} +type SatM result + = UniqSupply -> SATEnv -> (result, SATEnv) + +initSAT :: SatM a -> UniqSupply -> a + +initSAT f us = fst (f us emptyVarEnv) + +thenSAT m k us env + = case splitUniqSupply us of { (s1, s2) -> + case m s1 env of { (m_result, menv) -> + k m_result s2 menv }} + +thenSAT_ m k us env + = case splitUniqSupply us of { (s1, s2) -> + case m s1 env of { (_, menv) -> + k s2 menv }} + +emptyEnvSAT :: SatM () +emptyEnvSAT us _ = ((), emptyVarEnv) + +returnSAT v us env = (v, env) + +mapSAT f [] = returnSAT [] +mapSAT f (x:xs) + = f x `thenSAT` \ x' -> + mapSAT f xs `thenSAT` \ xs' -> + returnSAT (x':xs') +\end{code} + +%************************************************************************ +%* * +\subsection{Utility Functions} +%* * +%************************************************************************ + +\begin{code} +getSATInfo :: Id -> SatM (Maybe SATInfo) +getSATInfo var us env + = (lookupVarEnv env var, env) + +newSATName :: Id -> Type -> SatM Id +newSATName id ty us env + = case (getUnique us) of { unique -> + let + new_name = mkCompoundName SLIT("$sat") unique (idName id) + in + (mkLocalId new_name ty, env) } + +getArgLists :: CoreExpr -> ([Arg Type],[Arg Id]) +getArgLists expr + = let + (tvs, lambda_bounds, body) = collectBinders expr + in + ([ Static (mkTyVarTy tv) | tv <- tvs ], + [ Static v | v <- lambda_bounds ]) + +dropArgs :: CoreExpr -> CoreExpr +dropArgs (Lam _ e) = dropArgs e +dropArgs (CoTyLam _ e) = dropArgs e +dropArgs e = e +\end{code} + +We implement saTransform using shadowing of binders, that is +we transform +map = \f as -> case as of + [] -> [] + (a':as') -> let x = f a' + y = map f as' + in x:y +to +map = \f as -> let map = \f as -> map' as + in let rec map' = \as -> case as of + [] -> [] + (a':as') -> let x = f a' + y = map f as' + in x:y + in map' as + +the inner map should get inlined and eliminated. +\begin{code} +saTransform :: Id -> CoreExpr -> SatM CoreBinding +saTransform binder rhs + = getSATInfo binder `thenSAT` \ r -> + case r of + -- [Andre] test: do it only if we have more than one static argument. + --Just (tyargs,args) | any isStatic args + Just (tyargs,args) | (filter isStatic args) `lengthExceeds` 1 + -> newSATName binder (new_ty tyargs args) `thenSAT` \ binder' -> + mkNewRhs binder binder' tyargs args rhs `thenSAT` \ new_rhs -> + trace ("SAT "++ show (length (filter isStatic args))) ( + returnSAT (NonRec binder new_rhs) + ) + _ -> returnSAT (Rec [(binder, rhs)]) + where + mkNewRhs binder binder' tyargs args rhs + = let + non_static_args :: [Id] + non_static_args + = get_nsa args (snd (getArgLists rhs)) + where + get_nsa :: [Arg a] -> [Arg a] -> [a] + get_nsa [] _ = [] + get_nsa _ [] = [] + get_nsa (NotStatic:args) (Static v:as) = v:get_nsa args as + get_nsa (_:args) (_:as) = get_nsa args as + + local_body = foldl App (Var binder') + [VarArg a | a <- non_static_args] + + nonrec_rhs = origLams local_body + + -- HACK! The following is a fake SysLocal binder with + -- *the same* unique as binder. + -- the reason for this is the following: + -- this binder *will* get inlined but if it happen to be + -- a top level binder it is never removed as dead code, + -- therefore we have to remove that information (of it being + -- top-level or exported somehow.) + -- A better fix is to use binder directly but with the TopLevel + -- tag (or Exported tag) modified. 
+ fake_binder = mkSysLocal SLIT("sat") + (getUnique binder) + (idType binder) + rec_body = mkValLam non_static_args + ( Let (NonRec fake_binder nonrec_rhs) + {-in-} (dropArgs rhs)) + in + returnSAT ( + origLams (Let (Rec [(binder',rec_body)]) {-in-} local_body) + ) + where + origLams = origLams' rhs + where + origLams' (Lam v e) e' = Lam v (origLams' e e') + origLams' (CoTyLam ty e) e' = CoTyLam ty (origLams' e e') + origLams' _ e' = e' + + new_ty tyargs args + = substTy (mk_inst_tyenv tyargs tv_tmpl) + (mkSigmaTy tv_tmpl' dict_tys' tau_ty') + where + -- get type info for the local function: + (tv_tmpl, dict_tys, tau_ty) = (splitSigmaTy . idType) binder + (reg_arg_tys, res_type) = splitFunTys tau_ty + + -- now, we drop the ones that are + -- static, that is, the ones we will not pass to the local function + tv_tmpl' = dropStatics tyargs tv_tmpl + + (args1, args2) = splitAtList dict_tys args + dict_tys' = dropStatics args1 dict_tys + reg_arg_tys' = dropStatics args2 reg_arg_tys + + tau_ty' = glueTyArgs reg_arg_tys' res_type + + mk_inst_tyenv [] _ = emptyVarEnv + mk_inst_tyenv (Static s:args) (t:ts) = extendVarEnv (mk_inst_tyenv args ts) t s + mk_inst_tyenv (_:args) (_:ts) = mk_inst_tyenv args ts + +dropStatics [] t = t +dropStatics (Static _:args) (t:ts) = dropStatics args ts +dropStatics (_:args) (t:ts) = t:dropStatics args ts + +isStatic :: Arg a -> Bool +isStatic NotStatic = False +isStatic _ = True +-} +\end{code} diff --git a/compiler/simplCore/SetLevels.lhs b/compiler/simplCore/SetLevels.lhs new file mode 100644 index 0000000000..f8ab29dcd5 --- /dev/null +++ b/compiler/simplCore/SetLevels.lhs @@ -0,0 +1,847 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section{SetLevels} + + *************************** + Overview + *************************** + +1. We attach binding levels to Core bindings, in preparation for floating + outwards (@FloatOut@). + +2. We also let-ify many expressions (notably case scrutinees), so they + will have a fighting chance of being floated sensible. + +3. We clone the binders of any floatable let-binding, so that when it is + floated out it will be unique. (This used to be done by the simplifier + but the latter now only ensures that there's no shadowing; indeed, even + that may not be true.) + + NOTE: this can't be done using the uniqAway idea, because the variable + must be unique in the whole program, not just its current scope, + because two variables in different scopes may float out to the + same top level place + + NOTE: Very tiresomely, we must apply this substitution to + the rules stored inside a variable too. + + We do *not* clone top-level bindings, because some of them must not change, + but we *do* clone bindings that are heading for the top level + +4. In the expression + case x of wild { p -> ...wild... } + we substitute x for wild in the RHS of the case alternatives: + case x of wild { p -> ...x... } + This means that a sub-expression involving x is not "trapped" inside the RHS. + And it's not inconvenient because we already have a substitution. + + Note that this is EXACTLY BACKWARDS from the what the simplifier does. + The simplifier tries to get rid of occurrences of x, in favour of wild, + in the hope that there will only be one remaining occurrence of x, namely + the scrutinee of the case, and we can inline it. 
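+
+A small, schematic example of points 1 and 2 (the names are illustrative,
+and whether the binding really is floated depends on the flags and cost
+heuristics described below):
+
+\begin{verbatim}
+        f = \x -> g (h y) x        -- y is bound outside f
+==>
+        lvl = h y                  -- after FloatOut has used the levels
+        f = \x -> g lvl x
+\end{verbatim}
+
+This pass itself only let-binds the maximal free expression (h y) and tags
+the new binder with the maximum level of the free variables of (h y); the
+subsequent FloatOut pass does the actual moving.  If the destination is the
+top level, (h y) is allocated once instead of on every call of f.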
+ +\begin{code} +module SetLevels ( + setLevels, + + Level(..), tOP_LEVEL, + LevelledBind, LevelledExpr, + + incMinorLvl, ltMajLvl, ltLvl, isTopLvl, isInlineCtxt + ) where + +#include "HsVersions.h" + +import CoreSyn + +import DynFlags ( FloatOutSwitches(..) ) +import CoreUtils ( exprType, exprIsTrivial, exprIsCheap, mkPiTypes ) +import CoreFVs -- all of it +import CoreSubst ( Subst, emptySubst, extendInScope, extendIdSubst, + cloneIdBndr, cloneRecIdBndrs ) +import Id ( Id, idType, mkSysLocal, isOneShotLambda, + zapDemandIdInfo, + idSpecialisation, idWorkerInfo, setIdInfo + ) +import IdInfo ( workerExists, vanillaIdInfo, isEmptySpecInfo ) +import Var ( Var ) +import VarSet +import VarEnv +import Name ( getOccName ) +import OccName ( occNameString ) +import Type ( isUnLiftedType, Type ) +import BasicTypes ( TopLevelFlag(..) ) +import UniqSupply +import Util ( sortLe, isSingleton, count ) +import Outputable +import FastString +\end{code} + +%************************************************************************ +%* * +\subsection{Level numbers} +%* * +%************************************************************************ + +\begin{code} +data Level = InlineCtxt -- A level that's used only for + -- the context parameter ctxt_lvl + | Level Int -- Level number of enclosing lambdas + Int -- Number of big-lambda and/or case expressions between + -- here and the nearest enclosing lambda +\end{code} + +The {\em level number} on a (type-)lambda-bound variable is the +nesting depth of the (type-)lambda which binds it. The outermost lambda +has level 1, so (Level 0 0) means that the variable is bound outside any lambda. + +On an expression, it's the maximum level number of its free +(type-)variables. On a let(rec)-bound variable, it's the level of its +RHS. On a case-bound variable, it's the number of enclosing lambdas. + +Top-level variables: level~0. Those bound on the RHS of a top-level +definition but ``before'' a lambda; e.g., the \tr{x} in (levels shown +as ``subscripts'')... +\begin{verbatim} +a_0 = let b_? = ... in + x_1 = ... b ... in ... +\end{verbatim} + +The main function @lvlExpr@ carries a ``context level'' (@ctxt_lvl@). +That's meant to be the level number of the enclosing binder in the +final (floated) program. If the level number of a sub-expression is +less than that of the context, then it might be worth let-binding the +sub-expression so that it will indeed float. + +If you can float to level @Level 0 0@ worth doing so because then your +allocation becomes static instead of dynamic. We always start with +context @Level 0 0@. + + +InlineCtxt +~~~~~~~~~~ +@InlineCtxt@ very similar to @Level 0 0@, but is used for one purpose: +to say "don't float anything out of here". That's exactly what we +want for the body of an INLINE, where we don't want to float anything +out at all. See notes with lvlMFE below. + +But, check this out: + +-- At one time I tried the effect of not float anything out of an InlineMe, +-- but it sometimes works badly. For example, consider PrelArr.done. It +-- has the form __inline (\d. e) +-- where e doesn't mention d. If we float this to +-- __inline (let x = e in \d. x) +-- things are bad. The inliner doesn't even inline it because it doesn't look +-- like a head-normal form. So it seems a lesser evil to let things float. +-- In SetLevels we do set the context to (Level 0 0) when we get to an InlineMe +-- which discourages floating out. + +So the conclusion is: don't do any floating at all inside an InlineMe. 
+(In the above example, don't float the {x=e} out of the \d.) + +One particular case is that of workers: we don't want to float the +call to the worker outside the wrapper, otherwise the worker might get +inlined into the floated expression, and an importing module won't see +the worker at all. + +\begin{code} +type LevelledExpr = TaggedExpr Level +type LevelledBind = TaggedBind Level + +tOP_LEVEL = Level 0 0 +iNLINE_CTXT = InlineCtxt + +incMajorLvl :: Level -> Level +-- For InlineCtxt we ignore any inc's; we don't want +-- to do any floating at all; see notes above +incMajorLvl InlineCtxt = InlineCtxt +incMajorLvl (Level major minor) = Level (major+1) 0 + +incMinorLvl :: Level -> Level +incMinorLvl InlineCtxt = InlineCtxt +incMinorLvl (Level major minor) = Level major (minor+1) + +maxLvl :: Level -> Level -> Level +maxLvl InlineCtxt l2 = l2 +maxLvl l1 InlineCtxt = l1 +maxLvl l1@(Level maj1 min1) l2@(Level maj2 min2) + | (maj1 > maj2) || (maj1 == maj2 && min1 > min2) = l1 + | otherwise = l2 + +ltLvl :: Level -> Level -> Bool +ltLvl any_lvl InlineCtxt = False +ltLvl InlineCtxt (Level _ _) = True +ltLvl (Level maj1 min1) (Level maj2 min2) + = (maj1 < maj2) || (maj1 == maj2 && min1 < min2) + +ltMajLvl :: Level -> Level -> Bool + -- Tells if one level belongs to a difft *lambda* level to another +ltMajLvl any_lvl InlineCtxt = False +ltMajLvl InlineCtxt (Level maj2 _) = 0 < maj2 +ltMajLvl (Level maj1 _) (Level maj2 _) = maj1 < maj2 + +isTopLvl :: Level -> Bool +isTopLvl (Level 0 0) = True +isTopLvl other = False + +isInlineCtxt :: Level -> Bool +isInlineCtxt InlineCtxt = True +isInlineCtxt other = False + +instance Outputable Level where + ppr InlineCtxt = text "<INLINE>" + ppr (Level maj min) = hcat [ char '<', int maj, char ',', int min, char '>' ] + +instance Eq Level where + InlineCtxt == InlineCtxt = True + (Level maj1 min1) == (Level maj2 min2) = maj1==maj2 && min1==min2 + l1 == l2 = False +\end{code} + + +%************************************************************************ +%* * +\subsection{Main level-setting code} +%* * +%************************************************************************ + +\begin{code} +setLevels :: FloatOutSwitches + -> [CoreBind] + -> UniqSupply + -> [LevelledBind] + +setLevels float_lams binds us + = initLvl us (do_them binds) + where + -- "do_them"'s main business is to thread the monad along + -- It gives each top binding the same empty envt, because + -- things unbound in the envt have level number zero implicitly + do_them :: [CoreBind] -> LvlM [LevelledBind] + + do_them [] = returnLvl [] + do_them (b:bs) + = lvlTopBind init_env b `thenLvl` \ (lvld_bind, _) -> + do_them bs `thenLvl` \ lvld_binds -> + returnLvl (lvld_bind : lvld_binds) + + init_env = initialEnv float_lams + +lvlTopBind env (NonRec binder rhs) + = lvlBind TopLevel tOP_LEVEL env (AnnNonRec binder (freeVars rhs)) + -- Rhs can have no free vars! + +lvlTopBind env (Rec pairs) + = lvlBind TopLevel tOP_LEVEL env (AnnRec [(b,freeVars rhs) | (b,rhs) <- pairs]) +\end{code} + +%************************************************************************ +%* * +\subsection{Setting expression levels} +%* * +%************************************************************************ + +\begin{code} +lvlExpr :: Level -- ctxt_lvl: Level of enclosing expression + -> LevelEnv -- Level of in-scope names/tyvars + -> CoreExprWithFVs -- input expression + -> LvlM LevelledExpr -- Result expression +\end{code} + +The @ctxt_lvl@ is, roughly, the level of the innermost enclosing +binder. 
Here's an example + + v = \x -> ...\y -> let r = case (..x..) of + ..x.. + in .. + +When looking at the rhs of @r@, @ctxt_lvl@ will be 1 because that's +the level of @r@, even though it's inside a level-2 @\y@. It's +important that @ctxt_lvl@ is 1 and not 2 in @r@'s rhs, because we +don't want @lvlExpr@ to turn the scrutinee of the @case@ into an MFE +--- because it isn't a *maximal* free expression. + +If there were another lambda in @r@'s rhs, it would get level-2 as well. + +\begin{code} +lvlExpr _ _ (_, AnnType ty) = returnLvl (Type ty) +lvlExpr _ env (_, AnnVar v) = returnLvl (lookupVar env v) +lvlExpr _ env (_, AnnLit lit) = returnLvl (Lit lit) + +lvlExpr ctxt_lvl env (_, AnnApp fun arg) + = lvl_fun fun `thenLvl` \ fun' -> + lvlMFE False ctxt_lvl env arg `thenLvl` \ arg' -> + returnLvl (App fun' arg') + where +-- gaw 2004 + lvl_fun (_, AnnCase _ _ _ _) = lvlMFE True ctxt_lvl env fun + lvl_fun other = lvlExpr ctxt_lvl env fun + -- We don't do MFE on partial applications generally, + -- but we do if the function is big and hairy, like a case + +lvlExpr ctxt_lvl env (_, AnnNote InlineMe expr) +-- Don't float anything out of an InlineMe; hence the iNLINE_CTXT + = lvlExpr iNLINE_CTXT env expr `thenLvl` \ expr' -> + returnLvl (Note InlineMe expr') + +lvlExpr ctxt_lvl env (_, AnnNote note expr) + = lvlExpr ctxt_lvl env expr `thenLvl` \ expr' -> + returnLvl (Note note expr') + +-- We don't split adjacent lambdas. That is, given +-- \x y -> (x+1,y) +-- we don't float to give +-- \x -> let v = x+y in \y -> (v,y) +-- Why not? Because partial applications are fairly rare, and splitting +-- lambdas makes them more expensive. + +lvlExpr ctxt_lvl env expr@(_, AnnLam bndr rhs) + = lvlMFE True new_lvl new_env body `thenLvl` \ new_body -> + returnLvl (mkLams new_bndrs new_body) + where + (bndrs, body) = collectAnnBndrs expr + (new_lvl, new_bndrs) = lvlLamBndrs ctxt_lvl bndrs + new_env = extendLvlEnv env new_bndrs + -- At one time we called a special verion of collectBinders, + -- which ignored coercions, because we don't want to split + -- a lambda like this (\x -> coerce t (\s -> ...)) + -- This used to happen quite a bit in state-transformer programs, + -- but not nearly so much now non-recursive newtypes are transparent. + -- [See SetLevels rev 1.50 for a version with this approach.] + +lvlExpr ctxt_lvl env (_, AnnLet (AnnNonRec bndr rhs) body) + | isUnLiftedType (idType bndr) + -- Treat unlifted let-bindings (let x = b in e) just like (case b of x -> e) + -- That is, leave it exactly where it is + -- We used to float unlifted bindings too (e.g. to get a cheap primop + -- outside a lambda (to see how, look at lvlBind in rev 1.58) + -- but an unrelated change meant that these unlifed bindings + -- could get to the top level which is bad. And there's not much point; + -- unlifted bindings are always cheap, and so hardly worth floating. 
+ = lvlExpr ctxt_lvl env rhs `thenLvl` \ rhs' -> + lvlExpr incd_lvl env' body `thenLvl` \ body' -> + returnLvl (Let (NonRec bndr' rhs') body') + where + incd_lvl = incMinorLvl ctxt_lvl + bndr' = TB bndr incd_lvl + env' = extendLvlEnv env [bndr'] + +lvlExpr ctxt_lvl env (_, AnnLet bind body) + = lvlBind NotTopLevel ctxt_lvl env bind `thenLvl` \ (bind', new_env) -> + lvlExpr ctxt_lvl new_env body `thenLvl` \ body' -> + returnLvl (Let bind' body') + +lvlExpr ctxt_lvl env (_, AnnCase expr case_bndr ty alts) + = lvlMFE True ctxt_lvl env expr `thenLvl` \ expr' -> + let + alts_env = extendCaseBndrLvlEnv env expr' case_bndr incd_lvl + in + mapLvl (lvl_alt alts_env) alts `thenLvl` \ alts' -> + returnLvl (Case expr' (TB case_bndr incd_lvl) ty alts') + where + incd_lvl = incMinorLvl ctxt_lvl + + lvl_alt alts_env (con, bs, rhs) + = lvlMFE True incd_lvl new_env rhs `thenLvl` \ rhs' -> + returnLvl (con, bs', rhs') + where + bs' = [ TB b incd_lvl | b <- bs ] + new_env = extendLvlEnv alts_env bs' +\end{code} + +@lvlMFE@ is just like @lvlExpr@, except that it might let-bind +the expression, so that it can itself be floated. + +[NOTE: unlifted MFEs] +We don't float unlifted MFEs, which potentially loses big opportunites. +For example: + \x -> f (h y) +where h :: Int -> Int# is expensive. We'd like to float the (h y) outside +the \x, but we don't because it's unboxed. Possible solution: box it. + +\begin{code} +lvlMFE :: Bool -- True <=> strict context [body of case or let] + -> Level -- Level of innermost enclosing lambda/tylam + -> LevelEnv -- Level of in-scope names/tyvars + -> CoreExprWithFVs -- input expression + -> LvlM LevelledExpr -- Result expression + +lvlMFE strict_ctxt ctxt_lvl env (_, AnnType ty) + = returnLvl (Type ty) + + +lvlMFE strict_ctxt ctxt_lvl env ann_expr@(fvs, _) + | isUnLiftedType ty -- Can't let-bind it; see [NOTE: unlifted MFEs] + || isInlineCtxt ctxt_lvl -- Don't float out of an __inline__ context + || exprIsTrivial expr -- Never float if it's trivial + || not good_destination + = -- Don't float it out + lvlExpr ctxt_lvl env ann_expr + + | otherwise -- Float it out! + = lvlFloatRhs abs_vars dest_lvl env ann_expr `thenLvl` \ expr' -> + newLvlVar "lvl" abs_vars ty `thenLvl` \ var -> + returnLvl (Let (NonRec (TB var dest_lvl) expr') + (mkVarApps (Var var) abs_vars)) + where + expr = deAnnotate ann_expr + ty = exprType expr + dest_lvl = destLevel env fvs (isFunction ann_expr) + abs_vars = abstractVars dest_lvl env fvs + + -- A decision to float entails let-binding this thing, and we only do + -- that if we'll escape a value lambda, or will go to the top level. + good_destination + | dest_lvl `ltMajLvl` ctxt_lvl -- Escapes a value lambda + = not (exprIsCheap expr) || isTopLvl dest_lvl + -- Even if it escapes a value lambda, we only + -- float if it's not cheap (unless it'll get all the + -- way to the top). I've seen cases where we + -- float dozens of tiny free expressions, which cost + -- more to allocate than to evaluate. + -- NB: exprIsCheap is also true of bottom expressions, which + -- is good; we don't want to share them + -- + -- It's only Really Bad to float a cheap expression out of a + -- strict context, because that builds a thunk that otherwise + -- would never be built. So another alternative would be to + -- add + -- || (strict_ctxt && not (exprIsBottom expr)) + -- to the condition above. We should really try this out. 
+ + | otherwise -- Does not escape a value lambda + = isTopLvl dest_lvl -- Only float if we are going to the top level + && floatConsts env -- and the floatConsts flag is on + && not strict_ctxt -- Don't float from a strict context + -- We are keen to float something to the top level, even if it does not + -- escape a lambda, because then it needs no allocation. But it's controlled + -- by a flag, because doing this too early loses opportunities for RULES + -- which (needless to say) are important in some nofib programs + -- (gcd is an example). + -- + -- Beware: + -- concat = /\ a -> foldr ..a.. (++) [] + -- was getting turned into + -- concat = /\ a -> lvl a + -- lvl = /\ a -> foldr ..a.. (++) [] + -- which is pretty stupid. Hence the strict_ctxt test +\end{code} + + +%************************************************************************ +%* * +\subsection{Bindings} +%* * +%************************************************************************ + +The binding stuff works for top level too. + +\begin{code} +lvlBind :: TopLevelFlag -- Used solely to decide whether to clone + -> Level -- Context level; might be Top even for bindings nested in the RHS + -- of a top level binding + -> LevelEnv + -> CoreBindWithFVs + -> LvlM (LevelledBind, LevelEnv) + +lvlBind top_lvl ctxt_lvl env (AnnNonRec bndr rhs@(rhs_fvs,_)) + | isInlineCtxt ctxt_lvl -- Don't do anything inside InlineMe + = lvlExpr ctxt_lvl env rhs `thenLvl` \ rhs' -> + returnLvl (NonRec (TB bndr ctxt_lvl) rhs', env) + + | null abs_vars + = -- No type abstraction; clone existing binder + lvlExpr dest_lvl env rhs `thenLvl` \ rhs' -> + cloneVar top_lvl env bndr ctxt_lvl dest_lvl `thenLvl` \ (env', bndr') -> + returnLvl (NonRec (TB bndr' dest_lvl) rhs', env') + + | otherwise + = -- Yes, type abstraction; create a new binder, extend substitution, etc + lvlFloatRhs abs_vars dest_lvl env rhs `thenLvl` \ rhs' -> + newPolyBndrs dest_lvl env abs_vars [bndr] `thenLvl` \ (env', [bndr']) -> + returnLvl (NonRec (TB bndr' dest_lvl) rhs', env') + + where + bind_fvs = rhs_fvs `unionVarSet` idFreeVars bndr + abs_vars = abstractVars dest_lvl env bind_fvs + dest_lvl = destLevel env bind_fvs (isFunction rhs) +\end{code} + + +\begin{code} +lvlBind top_lvl ctxt_lvl env (AnnRec pairs) + | isInlineCtxt ctxt_lvl -- Don't do anything inside InlineMe + = mapLvl (lvlExpr ctxt_lvl env) rhss `thenLvl` \ rhss' -> + returnLvl (Rec ([TB b ctxt_lvl | b <- bndrs] `zip` rhss'), env) + + | null abs_vars + = cloneRecVars top_lvl env bndrs ctxt_lvl dest_lvl `thenLvl` \ (new_env, new_bndrs) -> + mapLvl (lvlExpr ctxt_lvl new_env) rhss `thenLvl` \ new_rhss -> + returnLvl (Rec ([TB b dest_lvl | b <- new_bndrs] `zip` new_rhss), new_env) + + | isSingleton pairs && count isId abs_vars > 1 + = -- Special case for self recursion where there are + -- several variables carried around: build a local loop: + -- poly_f = \abs_vars. \lam_vars . letrec f = \lam_vars. rhs in f lam_vars + -- This just makes the closures a bit smaller. 
If we don't do + -- this, allocation rises significantly on some programs + -- + -- We could elaborate it for the case where there are several + -- mutually functions, but it's quite a bit more complicated + -- + -- This all seems a bit ad hoc -- sigh + let + (bndr,rhs) = head pairs + (rhs_lvl, abs_vars_w_lvls) = lvlLamBndrs dest_lvl abs_vars + rhs_env = extendLvlEnv env abs_vars_w_lvls + in + cloneVar NotTopLevel rhs_env bndr rhs_lvl rhs_lvl `thenLvl` \ (rhs_env', new_bndr) -> + let + (lam_bndrs, rhs_body) = collectAnnBndrs rhs + (body_lvl, new_lam_bndrs) = lvlLamBndrs rhs_lvl lam_bndrs + body_env = extendLvlEnv rhs_env' new_lam_bndrs + in + lvlExpr body_lvl body_env rhs_body `thenLvl` \ new_rhs_body -> + newPolyBndrs dest_lvl env abs_vars [bndr] `thenLvl` \ (poly_env, [poly_bndr]) -> + returnLvl (Rec [(TB poly_bndr dest_lvl, + mkLams abs_vars_w_lvls $ + mkLams new_lam_bndrs $ + Let (Rec [(TB new_bndr rhs_lvl, mkLams new_lam_bndrs new_rhs_body)]) + (mkVarApps (Var new_bndr) lam_bndrs))], + poly_env) + + | otherwise -- Non-null abs_vars + = newPolyBndrs dest_lvl env abs_vars bndrs `thenLvl` \ (new_env, new_bndrs) -> + mapLvl (lvlFloatRhs abs_vars dest_lvl new_env) rhss `thenLvl` \ new_rhss -> + returnLvl (Rec ([TB b dest_lvl | b <- new_bndrs] `zip` new_rhss), new_env) + + where + (bndrs,rhss) = unzip pairs + + -- Finding the free vars of the binding group is annoying + bind_fvs = (unionVarSets [ idFreeVars bndr `unionVarSet` rhs_fvs + | (bndr, (rhs_fvs,_)) <- pairs]) + `minusVarSet` + mkVarSet bndrs + + dest_lvl = destLevel env bind_fvs (all isFunction rhss) + abs_vars = abstractVars dest_lvl env bind_fvs + +---------------------------------------------------- +-- Three help functons for the type-abstraction case + +lvlFloatRhs abs_vars dest_lvl env rhs + = lvlExpr rhs_lvl rhs_env rhs `thenLvl` \ rhs' -> + returnLvl (mkLams abs_vars_w_lvls rhs') + where + (rhs_lvl, abs_vars_w_lvls) = lvlLamBndrs dest_lvl abs_vars + rhs_env = extendLvlEnv env abs_vars_w_lvls +\end{code} + + +%************************************************************************ +%* * +\subsection{Deciding floatability} +%* * +%************************************************************************ + +\begin{code} +lvlLamBndrs :: Level -> [CoreBndr] -> (Level, [TaggedBndr Level]) +-- Compute the levels for the binders of a lambda group +-- The binders returned are exactly the same as the ones passed, +-- but they are now paired with a level +lvlLamBndrs lvl [] + = (lvl, []) + +lvlLamBndrs lvl bndrs + = go (incMinorLvl lvl) + False -- Havn't bumped major level in this group + [] bndrs + where + go old_lvl bumped_major rev_lvld_bndrs (bndr:bndrs) + | isId bndr && -- Go to the next major level if this is a value binder, + not bumped_major && -- and we havn't already gone to the next level (one jump per group) + not (isOneShotLambda bndr) -- and it isn't a one-shot lambda + = go new_lvl True (TB bndr new_lvl : rev_lvld_bndrs) bndrs + + | otherwise + = go old_lvl bumped_major (TB bndr old_lvl : rev_lvld_bndrs) bndrs + + where + new_lvl = incMajorLvl old_lvl + + go old_lvl _ rev_lvld_bndrs [] + = (old_lvl, reverse rev_lvld_bndrs) + -- a lambda like this (\x -> coerce t (\s -> ...)) + -- This happens quite a bit in state-transformer programs +\end{code} + +\begin{code} + -- Destintion level is the max Id level of the expression + -- (We'll abstract the type variables, if any.) 
+destLevel :: LevelEnv -> VarSet -> Bool -> Level +destLevel env fvs is_function + | floatLams env + && is_function = tOP_LEVEL -- Send functions to top level; see + -- the comments with isFunction + | otherwise = maxIdLevel env fvs + +isFunction :: CoreExprWithFVs -> Bool +-- The idea here is that we want to float *functions* to +-- the top level. This saves no work, but +-- (a) it can make the host function body a lot smaller, +-- and hence inlinable. +-- (b) it can also save allocation when the function is recursive: +-- h = \x -> letrec f = \y -> ...f...y...x... +-- in f x +-- becomes +-- f = \x y -> ...(f x)...y...x... +-- h = \x -> f x x +-- No allocation for f now. +-- We may only want to do this if there are sufficiently few free +-- variables. We certainly only want to do it for values, and not for +-- constructors. So the simple thing is just to look for lambdas +isFunction (_, AnnLam b e) | isId b = True + | otherwise = isFunction e +isFunction (_, AnnNote n e) = isFunction e +isFunction other = False +\end{code} + + +%************************************************************************ +%* * +\subsection{Free-To-Level Monad} +%* * +%************************************************************************ + +\begin{code} +type LevelEnv = (FloatOutSwitches, + VarEnv Level, -- Domain is *post-cloned* TyVars and Ids + Subst, -- Domain is pre-cloned Ids; tracks the in-scope set + -- so that subtitution is capture-avoiding + IdEnv ([Var], LevelledExpr)) -- Domain is pre-cloned Ids + -- We clone let-bound variables so that they are still + -- distinct when floated out; hence the SubstEnv/IdEnv. + -- (see point 3 of the module overview comment). + -- We also use these envs when making a variable polymorphic + -- because we want to float it out past a big lambda. + -- + -- The SubstEnv and IdEnv always implement the same mapping, but the + -- SubstEnv maps to CoreExpr and the IdEnv to LevelledExpr + -- Since the range is always a variable or type application, + -- there is never any difference between the two, but sadly + -- the types differ. The SubstEnv is used when substituting in + -- a variable's IdInfo; the IdEnv when we find a Var. + -- + -- In addition the IdEnv records a list of tyvars free in the + -- type application, just so we don't have to call freeVars on + -- the type application repeatedly. + -- + -- The domain of the both envs is *pre-cloned* Ids, though + -- + -- The domain of the VarEnv Level is the *post-cloned* Ids + +initialEnv :: FloatOutSwitches -> LevelEnv +initialEnv float_lams = (float_lams, emptyVarEnv, emptySubst, emptyVarEnv) + +floatLams :: LevelEnv -> Bool +floatLams (FloatOutSw float_lams _, _, _, _) = float_lams + +floatConsts :: LevelEnv -> Bool +floatConsts (FloatOutSw _ float_consts, _, _, _) = float_consts + +extendLvlEnv :: LevelEnv -> [TaggedBndr Level] -> LevelEnv +-- Used when *not* cloning +extendLvlEnv (float_lams, lvl_env, subst, id_env) prs + = (float_lams, + foldl add_lvl lvl_env prs, + foldl del_subst subst prs, + foldl del_id id_env prs) + where + add_lvl env (TB v l) = extendVarEnv env v l + del_subst env (TB v _) = extendInScope env v + del_id env (TB v _) = delVarEnv env v + -- We must remove any clone for this variable name in case of + -- shadowing. This bit me in the following case + -- (in nofib/real/gg/Spark.hs): + -- + -- case ds of wild { + -- ... -> case e of wild { + -- ... -> ... wild ... 
+ -- } + -- } + -- + -- The inside occurrence of @wild@ was being replaced with @ds@, + -- incorrectly, because the SubstEnv was still lying around. Ouch! + -- KSW 2000-07. + +-- extendCaseBndrLvlEnv adds the mapping case-bndr->scrut-var if it can +-- (see point 4 of the module overview comment) +extendCaseBndrLvlEnv (float_lams, lvl_env, subst, id_env) (Var scrut_var) case_bndr lvl + = (float_lams, + extendVarEnv lvl_env case_bndr lvl, + extendIdSubst subst case_bndr (Var scrut_var), + extendVarEnv id_env case_bndr ([scrut_var], Var scrut_var)) + +extendCaseBndrLvlEnv env scrut case_bndr lvl + = extendLvlEnv env [TB case_bndr lvl] + +extendPolyLvlEnv dest_lvl (float_lams, lvl_env, subst, id_env) abs_vars bndr_pairs + = (float_lams, + foldl add_lvl lvl_env bndr_pairs, + foldl add_subst subst bndr_pairs, + foldl add_id id_env bndr_pairs) + where + add_lvl env (v,v') = extendVarEnv env v' dest_lvl + add_subst env (v,v') = extendIdSubst env v (mkVarApps (Var v') abs_vars) + add_id env (v,v') = extendVarEnv env v ((v':abs_vars), mkVarApps (Var v') abs_vars) + +extendCloneLvlEnv lvl (float_lams, lvl_env, _, id_env) new_subst bndr_pairs + = (float_lams, + foldl add_lvl lvl_env bndr_pairs, + new_subst, + foldl add_id id_env bndr_pairs) + where + add_lvl env (v,v') = extendVarEnv env v' lvl + add_id env (v,v') = extendVarEnv env v ([v'], Var v') + + +maxIdLevel :: LevelEnv -> VarSet -> Level +maxIdLevel (_, lvl_env,_,id_env) var_set + = foldVarSet max_in tOP_LEVEL var_set + where + max_in in_var lvl = foldr max_out lvl (case lookupVarEnv id_env in_var of + Just (abs_vars, _) -> abs_vars + Nothing -> [in_var]) + + max_out out_var lvl + | isId out_var = case lookupVarEnv lvl_env out_var of + Just lvl' -> maxLvl lvl' lvl + Nothing -> lvl + | otherwise = lvl -- Ignore tyvars in *maxIdLevel* + +lookupVar :: LevelEnv -> Id -> LevelledExpr +lookupVar (_, _, _, id_env) v = case lookupVarEnv id_env v of + Just (_, expr) -> expr + other -> Var v + +abstractVars :: Level -> LevelEnv -> VarSet -> [Var] + -- Find the variables in fvs, free vars of the target expresion, + -- whose level is greater than the destination level + -- These are the ones we are going to abstract out +abstractVars dest_lvl env fvs + = uniq (sortLe le [var | fv <- varSetElems fvs, var <- absVarsOf dest_lvl env fv]) + where + -- Sort the variables so we don't get + -- mixed-up tyvars and Ids; it's just messy + v1 `le` v2 = case (isId v1, isId v2) of + (True, False) -> False + (False, True) -> True + other -> v1 <= v2 -- Same family + + uniq :: [Var] -> [Var] + -- Remove adjacent duplicates; the sort will have brought them together + uniq (v1:v2:vs) | v1 == v2 = uniq (v2:vs) + | otherwise = v1 : uniq (v2:vs) + uniq vs = vs + +absVarsOf :: Level -> LevelEnv -> Var -> [Var] + -- If f is free in the expression, and f maps to poly_f a b c in the + -- current substitution, then we must report a b c as candidate type + -- variables +absVarsOf dest_lvl (_, lvl_env, _, id_env) v + | isId v + = [zap av2 | av1 <- lookup_avs v, av2 <- add_tyvars av1, abstract_me av2] + + | otherwise + = if abstract_me v then [v] else [] + + where + abstract_me v = case lookupVarEnv lvl_env v of + Just lvl -> dest_lvl `ltLvl` lvl + Nothing -> False + + lookup_avs v = case lookupVarEnv id_env v of + Just (abs_vars, _) -> abs_vars + Nothing -> [v] + + add_tyvars v | isId v = v : varSetElems (idFreeTyVars v) + | otherwise = [v] + + -- We are going to lambda-abstract, so nuke any IdInfo, + -- and add the tyvars of the Id (if necessary) + zap v | isId v = WARN( 
workerExists (idWorkerInfo v) || + not (isEmptySpecInfo (idSpecialisation v)), + text "absVarsOf: discarding info on" <+> ppr v ) + setIdInfo v vanillaIdInfo + | otherwise = v +\end{code} + +\begin{code} +type LvlM result = UniqSM result + +initLvl = initUs_ +thenLvl = thenUs +returnLvl = returnUs +mapLvl = mapUs +\end{code} + +\begin{code} +newPolyBndrs dest_lvl env abs_vars bndrs + = getUniquesUs `thenLvl` \ uniqs -> + let + new_bndrs = zipWith mk_poly_bndr bndrs uniqs + in + returnLvl (extendPolyLvlEnv dest_lvl env abs_vars (bndrs `zip` new_bndrs), new_bndrs) + where + mk_poly_bndr bndr uniq = mkSysLocal (mkFastString str) uniq poly_ty + where + str = "poly_" ++ occNameString (getOccName bndr) + poly_ty = mkPiTypes abs_vars (idType bndr) + + +newLvlVar :: String + -> [CoreBndr] -> Type -- Abstract wrt these bndrs + -> LvlM Id +newLvlVar str vars body_ty + = getUniqueUs `thenLvl` \ uniq -> + returnUs (mkSysLocal (mkFastString str) uniq (mkPiTypes vars body_ty)) + +-- The deeply tiresome thing is that we have to apply the substitution +-- to the rules inside each Id. Grr. But it matters. + +cloneVar :: TopLevelFlag -> LevelEnv -> Id -> Level -> Level -> LvlM (LevelEnv, Id) +cloneVar TopLevel env v ctxt_lvl dest_lvl + = returnUs (env, v) -- Don't clone top level things +cloneVar NotTopLevel env@(_,_,subst,_) v ctxt_lvl dest_lvl + = ASSERT( isId v ) + getUs `thenLvl` \ us -> + let + (subst', v1) = cloneIdBndr subst us v + v2 = zap_demand ctxt_lvl dest_lvl v1 + env' = extendCloneLvlEnv dest_lvl env subst' [(v,v2)] + in + returnUs (env', v2) + +cloneRecVars :: TopLevelFlag -> LevelEnv -> [Id] -> Level -> Level -> LvlM (LevelEnv, [Id]) +cloneRecVars TopLevel env vs ctxt_lvl dest_lvl + = returnUs (env, vs) -- Don't clone top level things +cloneRecVars NotTopLevel env@(_,_,subst,_) vs ctxt_lvl dest_lvl + = ASSERT( all isId vs ) + getUs `thenLvl` \ us -> + let + (subst', vs1) = cloneRecIdBndrs subst us vs + vs2 = map (zap_demand ctxt_lvl dest_lvl) vs1 + env' = extendCloneLvlEnv dest_lvl env subst' (vs `zip` vs2) + in + returnUs (env', vs2) + + -- VERY IMPORTANT: we must zap the demand info + -- if the thing is going to float out past a lambda +zap_demand dest_lvl ctxt_lvl id + | ctxt_lvl == dest_lvl = id -- Stays put + | otherwise = zapDemandIdInfo id -- Floats out +\end{code} + diff --git a/compiler/simplCore/SimplCore.lhs b/compiler/simplCore/SimplCore.lhs new file mode 100644 index 0000000000..a386a3d6b0 --- /dev/null +++ b/compiler/simplCore/SimplCore.lhs @@ -0,0 +1,674 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[SimplCore]{Driver for simplifying @Core@ programs} + +\begin{code} +module SimplCore ( core2core, simplifyExpr ) where + +#include "HsVersions.h" + +import DynFlags ( CoreToDo(..), SimplifierSwitch(..), + SimplifierMode(..), DynFlags, DynFlag(..), dopt, + getCoreToDo ) +import CoreSyn +import HscTypes ( HscEnv(..), ModGuts(..), ExternalPackageState(..), + Dependencies( dep_mods ), + hscEPS, hptRules ) +import CSE ( cseProgram ) +import Rules ( RuleBase, emptyRuleBase, mkRuleBase, unionRuleBase, + extendRuleBaseList, pprRuleBase, ruleCheckProgram, + addSpecInfo, addIdSpecialisations ) +import PprCore ( pprCoreBindings, pprCoreExpr, pprRules ) +import OccurAnal ( occurAnalysePgm, occurAnalyseExpr ) +import IdInfo ( setNewStrictnessInfo, newStrictnessInfo, + setWorkerInfo, workerInfo, + setSpecInfo, specInfo, specInfoRules ) +import CoreUtils ( coreBindsSize ) +import Simplify ( simplTopBinds, simplExpr ) +import SimplEnv ( SimplEnv, simplBinders, 
mkSimplEnv, setInScopeSet ) +import SimplMonad +import ErrUtils ( dumpIfSet, dumpIfSet_dyn, showPass ) +import CoreLint ( endPass ) +import FloatIn ( floatInwards ) +import FloatOut ( floatOutwards ) +import Id ( Id, modifyIdInfo, idInfo, isExportedId, isLocalId, + idSpecialisation, idName ) +import VarSet +import VarEnv +import NameEnv ( lookupNameEnv ) +import LiberateCase ( liberateCase ) +import SAT ( doStaticArgs ) +import Specialise ( specProgram) +import SpecConstr ( specConstrProgram) +import DmdAnal ( dmdAnalPgm ) +import WorkWrap ( wwTopBinds ) +#ifdef OLD_STRICTNESS +import StrictAnal ( saBinds ) +import CprAnalyse ( cprAnalyse ) +#endif + +import UniqSupply ( UniqSupply, mkSplitUniqSupply, splitUniqSupply ) +import IO ( hPutStr, stderr ) +import Outputable +import List ( partition ) +import Maybes ( orElse ) +\end{code} + +%************************************************************************ +%* * +\subsection{The driver for the simplifier} +%* * +%************************************************************************ + +\begin{code} +core2core :: HscEnv + -> ModGuts + -> IO ModGuts + +core2core hsc_env guts + = do + let dflags = hsc_dflags hsc_env + core_todos = getCoreToDo dflags + + us <- mkSplitUniqSupply 's' + let (cp_us, ru_us) = splitUniqSupply us + + -- COMPUTE THE RULE BASE TO USE + (imp_rule_base, guts') <- prepareRules hsc_env guts ru_us + + -- DO THE BUSINESS + (stats, guts'') <- doCorePasses hsc_env imp_rule_base cp_us + (zeroSimplCount dflags) + guts' core_todos + + dumpIfSet_dyn dflags Opt_D_dump_simpl_stats + "Grand total simplifier statistics" + (pprSimplCount stats) + + return guts'' + + +simplifyExpr :: DynFlags -- includes spec of what core-to-core passes to do + -> CoreExpr + -> IO CoreExpr +-- simplifyExpr is called by the driver to simplify an +-- expression typed in at the interactive prompt +simplifyExpr dflags expr + = do { + ; showPass dflags "Simplify" + + ; us <- mkSplitUniqSupply 's' + + ; let (expr', _counts) = initSmpl dflags us $ + simplExprGently gentleSimplEnv expr + + ; dumpIfSet_dyn dflags Opt_D_dump_simpl "Simplified expression" + (pprCoreExpr expr') + + ; return expr' + } + +gentleSimplEnv :: SimplEnv +gentleSimplEnv = mkSimplEnv SimplGently + (isAmongSimpl []) + emptyRuleBase + +doCorePasses :: HscEnv + -> RuleBase -- the imported main rule base + -> UniqSupply -- uniques + -> SimplCount -- simplifier stats + -> ModGuts -- local binds in (with rules attached) + -> [CoreToDo] -- which passes to do + -> IO (SimplCount, ModGuts) + +doCorePasses hsc_env rb us stats guts [] + = return (stats, guts) + +doCorePasses hsc_env rb us stats guts (to_do : to_dos) + = do + let (us1, us2) = splitUniqSupply us + (stats1, guts1) <- doCorePass to_do hsc_env us1 rb guts + doCorePasses hsc_env rb us2 (stats `plusSimplCount` stats1) guts1 to_dos + +doCorePass (CoreDoSimplify mode sws) = _scc_ "Simplify" simplifyPgm mode sws +doCorePass CoreCSE = _scc_ "CommonSubExpr" trBinds cseProgram +doCorePass CoreLiberateCase = _scc_ "LiberateCase" trBinds liberateCase +doCorePass CoreDoFloatInwards = _scc_ "FloatInwards" trBinds floatInwards +doCorePass (CoreDoFloatOutwards f) = _scc_ "FloatOutwards" trBindsU (floatOutwards f) +doCorePass CoreDoStaticArgs = _scc_ "StaticArgs" trBinds doStaticArgs +doCorePass CoreDoStrictness = _scc_ "Stranal" trBinds dmdAnalPgm +doCorePass CoreDoWorkerWrapper = _scc_ "WorkWrap" trBindsU wwTopBinds +doCorePass CoreDoSpecialising = _scc_ "Specialise" trBindsU specProgram +doCorePass CoreDoSpecConstr = _scc_ "SpecConstr" trBindsU 
specConstrProgram +doCorePass CoreDoGlomBinds = trBinds glomBinds +doCorePass CoreDoPrintCore = observe printCore +doCorePass (CoreDoRuleCheck phase pat) = observe (ruleCheck phase pat) +doCorePass CoreDoNothing = observe (\ _ _ -> return ()) +#ifdef OLD_STRICTNESS +doCorePass CoreDoOldStrictness = _scc_ "OldStrictness" trBinds doOldStrictness +#endif + +#ifdef OLD_STRICTNESS +doOldStrictness dfs binds + = do binds1 <- saBinds dfs binds + binds2 <- cprAnalyse dfs binds1 + return binds2 +#endif + +printCore _ binds = dumpIfSet True "Print Core" (pprCoreBindings binds) + +ruleCheck phase pat dflags binds = do showPass dflags "RuleCheck" + printDump (ruleCheckProgram phase pat binds) + +-- Most passes return no stats and don't change rules +trBinds :: (DynFlags -> [CoreBind] -> IO [CoreBind]) + -> HscEnv -> UniqSupply -> RuleBase -> ModGuts + -> IO (SimplCount, ModGuts) +trBinds do_pass hsc_env us rb guts + = do { binds' <- do_pass dflags (mg_binds guts) + ; return (zeroSimplCount dflags, guts { mg_binds = binds' }) } + where + dflags = hsc_dflags hsc_env + +trBindsU :: (DynFlags -> UniqSupply -> [CoreBind] -> IO [CoreBind]) + -> HscEnv -> UniqSupply -> RuleBase -> ModGuts + -> IO (SimplCount, ModGuts) +trBindsU do_pass hsc_env us rb guts + = do { binds' <- do_pass dflags us (mg_binds guts) + ; return (zeroSimplCount dflags, guts { mg_binds = binds' }) } + where + dflags = hsc_dflags hsc_env + +-- Observer passes just peek; don't modify the bindings at all +observe :: (DynFlags -> [CoreBind] -> IO a) + -> HscEnv -> UniqSupply -> RuleBase -> ModGuts + -> IO (SimplCount, ModGuts) +observe do_pass hsc_env us rb guts + = do { binds <- do_pass dflags (mg_binds guts) + ; return (zeroSimplCount dflags, guts) } + where + dflags = hsc_dflags hsc_env +\end{code} + + + +%************************************************************************ +%* * +\subsection{Dealing with rules} +%* * +%************************************************************************ + +-- prepareLocalRuleBase takes the CoreBinds and rules defined in this module. +-- It attaches those rules that are for local Ids to their binders, and +-- returns the remainder attached to Ids in an IdSet. 
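+
+-- As a hypothetical source-level illustration (the function norm and the
+-- rules are made up; sort stands for an imported function):
+--
+--	{-# RULES
+--	  "norm/norm" forall x.  norm (norm x) = norm x
+--	  "sort/sort" forall xs. sort (sort xs) = sort xs
+--	  #-}
+--
+-- The head of the first rule, norm, is defined in this module, so that rule
+-- is attached to norm's binder in mg_binds; the head of the second, sort,
+-- is imported, so that rule stays behind as an orphan rule (it ends up in
+-- mg_rules and in the rule base returned for imported things).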
+ +\begin{code} +prepareRules :: HscEnv + -> ModGuts + -> UniqSupply + -> IO (RuleBase, -- Rule base for imported things, incl + -- (a) rules defined in this module (orphans) + -- (b) rules from other modules in home package + -- but not things from other packages + + ModGuts) -- Modified fields are + -- (a) Bindings have rules attached, + -- (b) Rules are now just orphan rules + +prepareRules hsc_env@(HscEnv { hsc_dflags = dflags, hsc_HPT = hpt }) + guts@(ModGuts { mg_binds = binds, mg_deps = deps, mg_rules = local_rules }) + us + = do { let -- Simplify the local rules; boringly, we need to make an in-scope set + -- from the local binders, to avoid warnings from Simplify.simplVar + local_ids = mkInScopeSet (mkVarSet (bindersOfBinds binds)) + env = setInScopeSet gentleSimplEnv local_ids + (better_rules,_) = initSmpl dflags us (mapSmpl (simplRule env) local_rules) + home_pkg_rules = hptRules hsc_env (dep_mods deps) + + -- Find the rules for locally-defined Ids; then we can attach them + -- to the binders in the top-level bindings + -- + -- Reason + -- - It makes the rules easier to look up + -- - It means that transformation rules and specialisations for + -- locally defined Ids are handled uniformly + -- - It keeps alive things that are referred to only from a rule + -- (the occurrence analyser knows about rules attached to Ids) + -- - It makes sure that, when we apply a rule, the free vars + -- of the RHS are more likely to be in scope + -- - The imported rules are carried in the in-scope set + -- which is extended on each iteration by the new wave of + -- local binders; any rules which aren't on the binding will + -- thereby get dropped + (rules_for_locals, rules_for_imps) = partition isLocalRule better_rules + local_rule_base = extendRuleBaseList emptyRuleBase rules_for_locals + binds_w_rules = updateBinders local_rule_base binds + + hpt_rule_base = mkRuleBase home_pkg_rules + imp_rule_base = extendRuleBaseList hpt_rule_base rules_for_imps + + ; dumpIfSet_dyn dflags Opt_D_dump_rules "Transformation rules" + (vcat [text "Local rules", pprRules better_rules, + text "", + text "Imported rules", pprRuleBase imp_rule_base]) + + ; return (imp_rule_base, guts { mg_binds = binds_w_rules, + mg_rules = rules_for_imps }) + } + +updateBinders :: RuleBase -> [CoreBind] -> [CoreBind] +updateBinders local_rules binds + = map update_bndrs binds + where + update_bndrs (NonRec b r) = NonRec (update_bndr b) r + update_bndrs (Rec prs) = Rec [(update_bndr b, r) | (b,r) <- prs] + + update_bndr bndr = case lookupNameEnv local_rules (idName bndr) of + Nothing -> bndr + Just rules -> bndr `addIdSpecialisations` rules + -- The binder might have some existing rules, + -- arising from specialisation pragmas +\end{code} + + +We must do some gentle simplification on the template (but not the RHS) +of each rule. The case that forced me to add this was the fold/build rule, +which without simplification looked like: + fold k z (build (/\a. g a)) ==> ... +This doesn't match unless you do eta reduction on the build argument. + +\begin{code} +simplRule env rule@(BuiltinRule {}) + = returnSmpl rule +simplRule env rule@(Rule { ru_bndrs = bndrs, ru_args = args, ru_rhs = rhs }) + = simplBinders env bndrs `thenSmpl` \ (env, bndrs') -> + mapSmpl (simplExprGently env) args `thenSmpl` \ args' -> + simplExprGently env rhs `thenSmpl` \ rhs' -> + returnSmpl (rule { ru_bndrs = bndrs', ru_args = args', ru_rhs = rhs' }) + +-- It's important that simplExprGently does eta reduction. 
+-- For example, in a rule like: +-- augment g (build h) +-- we do not want to get +-- augment (\a. g a) (build h) +-- otherwise we don't match when given an argument like +-- (\a. h a a) +-- +-- The simplifier does indeed do eta reduction (it's in +-- Simplify.completeLam) but only if -O is on. +\end{code} + +\begin{code} +simplExprGently :: SimplEnv -> CoreExpr -> SimplM CoreExpr +-- Simplifies an expression +-- does occurrence analysis, then simplification +-- and repeats (twice currently) because one pass +-- alone leaves tons of crud. +-- Used (a) for user expressions typed in at the interactive prompt +-- (b) the LHS and RHS of a RULE +-- +-- The name 'Gently' suggests that the SimplifierMode is SimplGently, +-- and in fact that is so.... but the 'Gently' in simplExprGently doesn't +-- enforce that; it just simplifies the expression twice + +simplExprGently env expr + = simplExpr env (occurAnalyseExpr expr) `thenSmpl` \ expr1 -> + simplExpr env (occurAnalyseExpr expr1) +\end{code} + + +%************************************************************************ +%* * +\subsection{Glomming} +%* * +%************************************************************************ + +\begin{code} +glomBinds :: DynFlags -> [CoreBind] -> IO [CoreBind] +-- Glom all binds together in one Rec, in case any +-- transformations have introduced any new dependencies +-- +-- NB: the global invariant is this: +-- *** the top level bindings are never cloned, and are always unique *** +-- +-- We sort them into dependency order, but applying transformation rules may +-- make something at the top refer to something at the bottom: +-- f = \x -> p (q x) +-- h = \y -> 3 +-- +-- RULE: p (q x) = h x +-- +-- Applying this rule makes f refer to h, +-- although it doesn't appear to in the source program. +-- This pass lets us control where it happens. +-- +-- NOTICE that this cannot happen for rules whose head is a locally-defined +-- function. It only happens for rules whose head is an imported function +-- (p in the example above). So, for example, the rule had been +-- RULE: f (p x) = h x +-- then the rule for f would be attached to f itself (in its IdInfo) +-- by prepareLocalRuleBase and h would be regarded by the occurrency +-- analyser as free in f. + +glomBinds dflags binds + = do { showPass dflags "GlomBinds" ; + let { recd_binds = [Rec (flattenBinds binds)] } ; + return recd_binds } + -- Not much point in printing the result... 
+ -- just consumes output bandwidth +\end{code} + + +%************************************************************************ +%* * +\subsection{The driver for the simplifier} +%* * +%************************************************************************ + +\begin{code} +simplifyPgm :: SimplifierMode + -> [SimplifierSwitch] + -> HscEnv + -> UniqSupply + -> RuleBase + -> ModGuts + -> IO (SimplCount, ModGuts) -- New bindings + +simplifyPgm mode switches hsc_env us imp_rule_base guts + = do { + showPass dflags "Simplify"; + + (termination_msg, it_count, counts_out, binds') + <- do_iteration us 1 (zeroSimplCount dflags) (mg_binds guts) ; + + dumpIfSet (dopt Opt_D_verbose_core2core dflags + && dopt Opt_D_dump_simpl_stats dflags) + "Simplifier statistics" + (vcat [text termination_msg <+> text "after" <+> ppr it_count <+> text "iterations", + text "", + pprSimplCount counts_out]); + + endPass dflags "Simplify" Opt_D_verbose_core2core binds'; + + return (counts_out, guts { mg_binds = binds' }) + } + where + dflags = hsc_dflags hsc_env + phase_info = case mode of + SimplGently -> "gentle" + SimplPhase n -> show n + + sw_chkr = isAmongSimpl switches + max_iterations = intSwitchSet sw_chkr MaxSimplifierIterations `orElse` 2 + + do_iteration us iteration_no counts binds + -- iteration_no is the number of the iteration we are + -- about to begin, with '1' for the first + | iteration_no > max_iterations -- Stop if we've run out of iterations + = do { +#ifdef DEBUG + if max_iterations > 2 then + hPutStr stderr ("NOTE: Simplifier still going after " ++ + show max_iterations ++ + " iterations; bailing out.\n") + else + return (); +#endif + -- Subtract 1 from iteration_no to get the + -- number of iterations we actually completed + return ("Simplifier baled out", iteration_no - 1, counts, binds) + } + + -- Try and force thunks off the binds; significantly reduces + -- space usage, especially with -O. JRS, 000620. + | let sz = coreBindsSize binds in sz == sz + = do { + -- Occurrence analysis + let { tagged_binds = _scc_ "OccAnal" occurAnalysePgm binds } ; + dumpIfSet_dyn dflags Opt_D_dump_occur_anal "Occurrence analysis" + (pprCoreBindings tagged_binds); + + -- Get any new rules, and extend the rule base + -- We need to do this regularly, because simplification can + -- poke on IdInfo thunks, which in turn brings in new rules + -- behind the scenes. Otherwise there's a danger we'll simply + -- miss the rules for Ids hidden inside imported inlinings + eps <- hscEPS hsc_env ; + let { rule_base' = unionRuleBase imp_rule_base (eps_rule_base eps) + ; simpl_env = mkSimplEnv mode sw_chkr rule_base' } ; + + -- Simplify the program + -- We do this with a *case* not a *let* because lazy pattern + -- matching bit us with bad space leak! + -- With a let, we ended up with + -- let + -- t = initSmpl ... + -- counts' = snd t + -- in + -- case t of {(_,counts') -> if counts'=0 then ... } + -- So the conditional didn't force counts', because the + -- selection got duplicated. Sigh! 
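+	-- (The case forces the result pair once and binds its components
+	-- directly, so no lazy selector thunk gets duplicated.)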
+ case initSmpl dflags us1 (_scc_ "SimplTopBinds" simplTopBinds simpl_env tagged_binds) of { + (binds', counts') -> do { + + let { all_counts = counts `plusSimplCount` counts' + ; herald = "Simplifier phase " ++ phase_info ++ + ", iteration " ++ show iteration_no ++ + " out of " ++ show max_iterations + } ; + + -- Stop if nothing happened; don't dump output + if isZeroSimplCount counts' then + return ("Simplifier reached fixed point", iteration_no, + all_counts, binds') + else do { + -- Short out indirections + -- We do this *after* at least one run of the simplifier + -- because indirection-shorting uses the export flag on *occurrences* + -- and that isn't guaranteed to be ok until after the first run propagates + -- stuff from the binding site to its occurrences + let { binds'' = _scc_ "ZapInd" shortOutIndirections binds' } ; + + -- Dump the result of this iteration + dumpIfSet_dyn dflags Opt_D_dump_simpl_iterations herald + (pprSimplCount counts') ; + endPass dflags herald Opt_D_dump_simpl_iterations binds'' ; + + -- Loop + do_iteration us2 (iteration_no + 1) all_counts binds'' + } } } } + where + (us1, us2) = splitUniqSupply us +\end{code} + + +%************************************************************************ +%* * + Shorting out indirections +%* * +%************************************************************************ + +If we have this: + + x_local = <expression> + ...bindings... + x_exported = x_local + +where x_exported is exported, and x_local is not, then we replace it with this: + + x_exported = <expression> + x_local = x_exported + ...bindings... + +Without this we never get rid of the x_exported = x_local thing. This +save a gratuitous jump (from \tr{x_exported} to \tr{x_local}), and +makes strictness information propagate better. This used to happen in +the final phase, but it's tidier to do it here. + +STRICTNESS: if we have done strictness analysis, we want the strictness info on +x_local to transfer to x_exported. Hence the copyIdInfo call. + +RULES: we want to *add* any RULES for x_local to x_exported. + +Note [Rules and indirection-zapping] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Problem: what if x_exported has a RULE that mentions something in ...bindings...? +Then the things mentioned can be out of scope! Solution + a) Make sure that in this pass the usage-info from x_exported is + available for ...bindings... + b) If there are any such RULES, rec-ify the entire top-level. + It'll get sorted out next time round + +Messing up the rules +~~~~~~~~~~~~~~~~~~~~ +The example that went bad on me at one stage was this one: + + iterate :: (a -> a) -> a -> [a] + [Exported] + iterate = iterateList + + iterateFB c f x = x `c` iterateFB c f (f x) + iterateList f x = x : iterateList f (f x) + [Not exported] + + {-# RULES + "iterate" forall f x. iterate f x = build (\c _n -> iterateFB c f x) + "iterateFB" iterateFB (:) = iterateList + #-} + +This got shorted out to: + + iterateList :: (a -> a) -> a -> [a] + iterateList = iterate + + iterateFB c f x = x `c` iterateFB c f (f x) + iterate f x = x : iterate f (f x) + + {-# RULES + "iterate" forall f x. 
iterate f x = build (\c _n -> iterateFB c f x) + "iterateFB" iterateFB (:) = iterate + #-} + +And now we get an infinite loop in the rule system + iterate f x -> build (\cn -> iterateFB c f x) + -> iterateFB (:) f x + -> iterate f x + +Tiresome old solution: + don't do shorting out if f has rewrite rules (see shortableIdInfo) + +New solution (I think): + use rule switching-off pragmas to get rid + of iterateList in the first place + + +Other remarks +~~~~~~~~~~~~~ +If more than one exported thing is equal to a local thing (i.e., the +local thing really is shared), then we do one only: +\begin{verbatim} + x_local = .... + x_exported1 = x_local + x_exported2 = x_local +==> + x_exported1 = .... + + x_exported2 = x_exported1 +\end{verbatim} + +We rely on prior eta reduction to simplify things like +\begin{verbatim} + x_exported = /\ tyvars -> x_local tyvars +==> + x_exported = x_local +\end{verbatim} +Hence,there's a possibility of leaving unchanged something like this: +\begin{verbatim} + x_local = .... + x_exported1 = x_local Int +\end{verbatim} +By the time we've thrown away the types in STG land this +could be eliminated. But I don't think it's very common +and it's dangerous to do this fiddling in STG land +because we might elminate a binding that's mentioned in the +unfolding for something. + +\begin{code} +type IndEnv = IdEnv Id -- Maps local_id -> exported_id + +shortOutIndirections :: [CoreBind] -> [CoreBind] +shortOutIndirections binds + | isEmptyVarEnv ind_env = binds + | no_need_to_flatten = binds' + | otherwise = [Rec (flattenBinds binds')] -- See Note [Rules and indirect-zapping] + where + ind_env = makeIndEnv binds + exp_ids = varSetElems ind_env -- These exported Ids are the subjects + exp_id_set = mkVarSet exp_ids -- of the indirection-elimination + no_need_to_flatten = all (null . specInfoRules . idSpecialisation) exp_ids + binds' = concatMap zap binds + + zap (NonRec bndr rhs) = [NonRec b r | (b,r) <- zapPair (bndr,rhs)] + zap (Rec pairs) = [Rec (concatMap zapPair pairs)] + + zapPair (bndr, rhs) + | bndr `elemVarSet` exp_id_set = [] + | Just exp_id <- lookupVarEnv ind_env bndr = [(transferIdInfo exp_id bndr, rhs), + (bndr, Var exp_id)] + | otherwise = [(bndr,rhs)] + +makeIndEnv :: [CoreBind] -> IndEnv +makeIndEnv binds + = foldr add_bind emptyVarEnv binds + where + add_bind :: CoreBind -> IndEnv -> IndEnv + add_bind (NonRec exported_id rhs) env = add_pair (exported_id, rhs) env + add_bind (Rec pairs) env = foldr add_pair env pairs + + add_pair :: (Id,CoreExpr) -> IndEnv -> IndEnv + add_pair (exported_id, Var local_id) env + | shortMeOut env exported_id local_id = extendVarEnv env local_id exported_id + add_pair (exported_id, rhs) env + = env + +shortMeOut ind_env exported_id local_id +-- The if-then-else stuff is just so I can get a pprTrace to see +-- how often I don't get shorting out becuase of IdInfo stuff + = if isExportedId exported_id && -- Only if this is exported + + isLocalId local_id && -- Only if this one is defined in this + -- module, so that we *can* change its + -- binding to be the exported thing! 
+ + not (isExportedId local_id) && -- Only if this one is not itself exported, + -- since the transformation will nuke it + + not (local_id `elemVarEnv` ind_env) -- Only if not already substituted for + then + True + +{- No longer needed + if isEmptySpecInfo (specInfo (idInfo exported_id)) -- Only if no rules + then True -- See note on "Messing up rules" + else +#ifdef DEBUG + pprTrace "shortMeOut:" (ppr exported_id) +#endif + False +-} + else + False + + +----------------- +transferIdInfo :: Id -> Id -> Id +transferIdInfo exported_id local_id + = modifyIdInfo transfer exported_id + where + local_info = idInfo local_id + transfer exp_info = exp_info `setNewStrictnessInfo` newStrictnessInfo local_info + `setWorkerInfo` workerInfo local_info + `setSpecInfo` addSpecInfo (specInfo exp_info) + (specInfo local_info) +\end{code} diff --git a/compiler/simplCore/SimplEnv.lhs b/compiler/simplCore/SimplEnv.lhs new file mode 100644 index 0000000000..00f035e513 --- /dev/null +++ b/compiler/simplCore/SimplEnv.lhs @@ -0,0 +1,741 @@ +% +% (c) The AQUA Project, Glasgow University, 1993-1998 +% +\section[SimplMonad]{The simplifier Monad} + +\begin{code} +module SimplEnv ( + InId, InBind, InExpr, InAlt, InArg, InType, InBinder, + OutId, OutTyVar, OutBind, OutExpr, OutAlt, OutArg, OutType, OutBinder, + + -- The simplifier mode + setMode, getMode, + + -- Switch checker + SwitchChecker, SwitchResult(..), getSwitchChecker, getSimplIntSwitch, + isAmongSimpl, intSwitchSet, switchIsOn, + + setEnclosingCC, getEnclosingCC, + + -- Environments + SimplEnv, mkSimplEnv, extendIdSubst, extendTvSubst, + zapSubstEnv, setSubstEnv, + getInScope, setInScope, setInScopeSet, modifyInScope, addNewInScopeIds, + getRules, refineSimplEnv, + + SimplSR(..), mkContEx, substId, + + simplNonRecBndr, simplRecBndrs, simplLamBndr, simplLamBndrs, + simplBinder, simplBinders, addLetIdInfo, + substExpr, substTy, + + -- Floats + FloatsWith, FloatsWithExpr, + Floats, emptyFloats, isEmptyFloats, unitFloat, addFloats, flattenFloats, + allLifted, wrapFloats, floatBinds, + addAuxiliaryBind, + ) where + +#include "HsVersions.h" + +import SimplMonad +import Id ( Id, idType, idOccInfo, idUnfolding, setIdUnfolding ) +import IdInfo ( IdInfo, vanillaIdInfo, occInfo, setOccInfo, specInfo, setSpecInfo, + arityInfo, setArityInfo, workerInfo, setWorkerInfo, + unfoldingInfo, setUnfoldingInfo, isEmptySpecInfo, + unknownArity, workerExists + ) +import CoreSyn +import Unify ( TypeRefinement ) +import Rules ( RuleBase ) +import CoreUtils ( needsCaseBinding ) +import CostCentre ( CostCentreStack, subsumedCCS ) +import Var +import VarEnv +import VarSet ( isEmptyVarSet ) +import OrdList + +import qualified CoreSubst ( Subst, mkSubst, substExpr, substSpec, substWorker ) +import qualified Type ( substTy, substTyVarBndr ) + +import Type ( Type, TvSubst(..), TvSubstEnv, composeTvSubst, + isUnLiftedType, seqType, tyVarsOfType ) +import BasicTypes ( OccInfo(..), isFragileOcc ) +import DynFlags ( SimplifierMode(..) 
) +import Util ( mapAccumL ) +import Outputable +\end{code} + +%************************************************************************ +%* * +\subsection[Simplify-types]{Type declarations} +%* * +%************************************************************************ + +\begin{code} +type InBinder = CoreBndr +type InId = Id -- Not yet cloned +type InType = Type -- Ditto +type InBind = CoreBind +type InExpr = CoreExpr +type InAlt = CoreAlt +type InArg = CoreArg + +type OutBinder = CoreBndr +type OutId = Id -- Cloned +type OutTyVar = TyVar -- Cloned +type OutType = Type -- Cloned +type OutBind = CoreBind +type OutExpr = CoreExpr +type OutAlt = CoreAlt +type OutArg = CoreArg +\end{code} + +%************************************************************************ +%* * +\subsubsection{The @SimplEnv@ type} +%* * +%************************************************************************ + + +\begin{code} +data SimplEnv + = SimplEnv { + seMode :: SimplifierMode, + seChkr :: SwitchChecker, + seCC :: CostCentreStack, -- The enclosing CCS (when profiling) + + -- Rules from other modules + seExtRules :: RuleBase, + + -- The current set of in-scope variables + -- They are all OutVars, and all bound in this module + seInScope :: InScopeSet, -- OutVars only + + -- The current substitution + seTvSubst :: TvSubstEnv, -- InTyVar |--> OutType + seIdSubst :: SimplIdSubst -- InId |--> OutExpr + } + +type SimplIdSubst = IdEnv SimplSR -- IdId |--> OutExpr + +data SimplSR + = DoneEx OutExpr -- Completed term + | DoneId OutId OccInfo -- Completed term variable, with occurrence info + | ContEx TvSubstEnv -- A suspended substitution + SimplIdSubst + InExpr +\end{code} + + +seInScope: + The in-scope part of Subst includes *all* in-scope TyVars and Ids + The elements of the set may have better IdInfo than the + occurrences of in-scope Ids, and (more important) they will + have a correctly-substituted type. So we use a lookup in this + set to replace occurrences + + The Ids in the InScopeSet are replete with their Rules, + and as we gather info about the unfolding of an Id, we replace + it in the in-scope set. + + The in-scope set is actually a mapping OutVar -> OutVar, and + in case expressions we sometimes bind + +seIdSubst: + The substitution is *apply-once* only, because InIds and OutIds can overlap. + For example, we generally omit mappings + a77 -> a77 + from the substitution, when we decide not to clone a77, but it's quite + legitimate to put the mapping in the substitution anyway. + + Indeed, we do so when we want to pass fragile OccInfo to the + occurrences of the variable; we add a substitution + x77 -> DoneId x77 occ + to record x's occurrence information.] + + Furthermore, consider + let x = case k of I# x77 -> ... in + let y = case k of I# x77 -> ... in ... + and suppose the body is strict in both x and y. Then the simplifier + will pull the first (case k) to the top; so the second (case k) will + cancel out, mapping x77 to, well, x77! But one is an in-Id and the + other is an out-Id. + + Of course, the substitution *must* applied! Things in its domain + simply aren't necessarily bound in the result. + +* substId adds a binding (DoneId new_id occ) to the substitution if + EITHER the Id's unique has changed + OR the Id has interesting occurrence information + So in effect you can only get to interesting occurrence information + by looking up the *old* Id; it's not really attached to the new id + at all. + + Note, though that the substitution isn't necessarily extended + if the type changes. Why not? 
Because of the next point: + +* We *always, always* finish by looking up in the in-scope set + any variable that doesn't get a DoneEx or DoneVar hit in the substitution. + Reason: so that we never finish up with a "old" Id in the result. + An old Id might point to an old unfolding and so on... which gives a space leak. + + [The DoneEx and DoneVar hits map to "new" stuff.] + +* It follows that substExpr must not do a no-op if the substitution is empty. + substType is free to do so, however. + +* When we come to a let-binding (say) we generate new IdInfo, including an + unfolding, attach it to the binder, and add this newly adorned binder to + the in-scope set. So all subsequent occurrences of the binder will get mapped + to the full-adorned binder, which is also the one put in the binding site. + +* The in-scope "set" usually maps x->x; we use it simply for its domain. + But sometimes we have two in-scope Ids that are synomyms, and should + map to the same target: x->x, y->x. Notably: + case y of x { ... } + That's why the "set" is actually a VarEnv Var + + +Note [GADT type refinement] +~~~~~~~~~~~~~~~~~~~~~~~~~~~ +When we come to a GADT pattern match that refines the in-scope types, we + a) Refine the types of the Ids in the in-scope set, seInScope. + For exmaple, consider + data T a where + Foo :: T (Bool -> Bool) + + (\ (x::T a) (y::a) -> case x of { Foo -> y True } + + Technically this is well-typed, but exprType will barf on the + (y True) unless we refine the type on y's occurrence. + + b) Refine the range of the type substitution, seTvSubst. + Very similar reason to (a). + + NB: we don't refine the range of the SimplIdSubst, because it's always + interpreted relative to the seInScope (see substId) + +For (b) we need to be a little careful. Specifically, we compose the refinement +with the type substitution. Suppose + The substitution was [a->b, b->a] + and the refinement was [b->Int] + Then we want [a->Int, b->a] + +But also if + The substitution was [a->b] + and the refinement was [b->Int] + Then we want [a->Int, b->Int] + becuase b might be both an InTyVar and OutTyVar + + +\begin{code} +mkSimplEnv :: SimplifierMode -> SwitchChecker -> RuleBase -> SimplEnv +mkSimplEnv mode switches rules + = SimplEnv { seChkr = switches, seCC = subsumedCCS, + seMode = mode, seInScope = emptyInScopeSet, + seExtRules = rules, + seTvSubst = emptyVarEnv, seIdSubst = emptyVarEnv } + -- The top level "enclosing CC" is "SUBSUMED". 
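+	-- The environment starts with an empty in-scope set and empty
+	-- substitutions; both get extended as the simplifier passes
+	-- under binders.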
+ +--------------------- +getSwitchChecker :: SimplEnv -> SwitchChecker +getSwitchChecker env = seChkr env + +--------------------- +getMode :: SimplEnv -> SimplifierMode +getMode env = seMode env + +setMode :: SimplifierMode -> SimplEnv -> SimplEnv +setMode mode env = env { seMode = mode } + +--------------------- +getEnclosingCC :: SimplEnv -> CostCentreStack +getEnclosingCC env = seCC env + +setEnclosingCC :: SimplEnv -> CostCentreStack -> SimplEnv +setEnclosingCC env cc = env {seCC = cc} + +--------------------- +extendIdSubst :: SimplEnv -> Id -> SimplSR -> SimplEnv +extendIdSubst env@(SimplEnv {seIdSubst = subst}) var res + = env {seIdSubst = extendVarEnv subst var res} + +extendTvSubst :: SimplEnv -> TyVar -> Type -> SimplEnv +extendTvSubst env@(SimplEnv {seTvSubst = subst}) var res + = env {seTvSubst = extendVarEnv subst var res} + +--------------------- +getInScope :: SimplEnv -> InScopeSet +getInScope env = seInScope env + +setInScopeSet :: SimplEnv -> InScopeSet -> SimplEnv +setInScopeSet env in_scope = env {seInScope = in_scope} + +setInScope :: SimplEnv -> SimplEnv -> SimplEnv +setInScope env env_with_in_scope = setInScopeSet env (getInScope env_with_in_scope) + +addNewInScopeIds :: SimplEnv -> [CoreBndr] -> SimplEnv + -- The new Ids are guaranteed to be freshly allocated +addNewInScopeIds env@(SimplEnv { seInScope = in_scope, seIdSubst = id_subst }) vs + = env { seInScope = in_scope `extendInScopeSetList` vs, + seIdSubst = id_subst `delVarEnvList` vs } + -- Why delete? Consider + -- let x = a*b in (x, \x -> x+3) + -- We add [x |-> a*b] to the substitution, but we must + -- *delete* it from the substitution when going inside + -- the (\x -> ...)! + +modifyInScope :: SimplEnv -> CoreBndr -> CoreBndr -> SimplEnv +modifyInScope env@(SimplEnv {seInScope = in_scope}) v v' + = env {seInScope = modifyInScopeSet in_scope v v'} + +--------------------- +zapSubstEnv :: SimplEnv -> SimplEnv +zapSubstEnv env = env {seTvSubst = emptyVarEnv, seIdSubst = emptyVarEnv} + +setSubstEnv :: SimplEnv -> TvSubstEnv -> SimplIdSubst -> SimplEnv +setSubstEnv env tvs ids = env { seTvSubst = tvs, seIdSubst = ids } + +mkContEx :: SimplEnv -> InExpr -> SimplSR +mkContEx (SimplEnv { seTvSubst = tvs, seIdSubst = ids }) e = ContEx tvs ids e + +isEmptySimplSubst :: SimplEnv -> Bool +isEmptySimplSubst (SimplEnv { seTvSubst = tvs, seIdSubst = ids }) + = isEmptyVarEnv tvs && isEmptyVarEnv ids + +--------------------- +getRules :: SimplEnv -> RuleBase +getRules = seExtRules +\end{code} + + GADT stuff + +Given an idempotent substitution, generated by the unifier, use it to +refine the environment + +\begin{code} +refineSimplEnv :: SimplEnv -> TypeRefinement -> SimplEnv +-- The TvSubstEnv is the refinement, and it refines OutTyVars into OutTypes +refineSimplEnv env@(SimplEnv { seTvSubst = tv_subst, seInScope = in_scope }) + (refine_tv_subst, all_bound_here) + = env { seTvSubst = composeTvSubst in_scope refine_tv_subst tv_subst, + seInScope = in_scope' } + where + in_scope' + | all_bound_here = in_scope + -- The tvs are the tyvars bound here. 
If only they + -- are refined, there's no need to do anything + | otherwise = mapInScopeSet refine_id in_scope + + refine_id v -- Only refine its type; any rules will get + -- refined if they are used (I hope) + | isId v = setIdType v (Type.substTy refine_subst (idType v)) + | otherwise = v + refine_subst = TvSubst in_scope refine_tv_subst +\end{code} + +%************************************************************************ +%* * + Substitution of Vars +%* * +%************************************************************************ + + +\begin{code} +substId :: SimplEnv -> Id -> SimplSR +substId (SimplEnv { seInScope = in_scope, seIdSubst = ids }) v + | not (isLocalId v) + = DoneId v NoOccInfo + | otherwise -- A local Id + = case lookupVarEnv ids v of + Just (DoneId v occ) -> DoneId (refine v) occ + Just res -> res + Nothing -> let v' = refine v + in DoneId v' (idOccInfo v') + -- We don't put LoopBreakers in the substitution (unless then need + -- to be cloned for name-clash rasons), so the idOccInfo is + -- very important! If isFragileOcc returned True for + -- loop breakers we could avoid this call, but at the expense + -- of adding more to the substitution, and building new Ids + -- a bit more often than really necessary + where + -- Get the most up-to-date thing from the in-scope set + -- Even though it isn't in the substitution, it may be in + -- the in-scope set with a different type (we only use the + -- substitution if the unique changes). + refine v = case lookupInScope in_scope v of + Just v' -> v' + Nothing -> WARN( True, ppr v ) v -- This is an error! +\end{code} + + +%************************************************************************ +%* * +\section{Substituting an Id binder} +%* * +%************************************************************************ + + +These functions are in the monad only so that they can be made strict via seq. + +\begin{code} +simplBinders, simplLamBndrs + :: SimplEnv -> [InBinder] -> SimplM (SimplEnv, [OutBinder]) +simplBinders env bndrs = mapAccumLSmpl simplBinder env bndrs +simplLamBndrs env bndrs = mapAccumLSmpl simplLamBndr env bndrs + +------------- +simplBinder :: SimplEnv -> InBinder -> SimplM (SimplEnv, OutBinder) +-- Used for lambda and case-bound variables +-- Clone Id if necessary, substitute type +-- Return with IdInfo already substituted, but (fragile) occurrence info zapped +-- The substitution is extended only if the variable is cloned, because +-- we *don't* need to use it to track occurrence info. +simplBinder env bndr + | isTyVar bndr = do { let (env', tv) = substTyVarBndr env bndr + ; seqTyVar tv `seq` return (env', tv) } + | otherwise = do { let (env', id) = substIdBndr env bndr + ; seqId id `seq` return (env', id) } + +------------- +simplLamBndr :: SimplEnv -> Var -> SimplM (SimplEnv, Var) +-- Used for lambda binders. These sometimes have unfoldings added by +-- the worker/wrapper pass that must be preserved, becuase they can't +-- be reconstructed from context. For example: +-- f x = case x of (a,b) -> fw a b x +-- fw a b x{=(a,b)} = ... +-- The "{=(a,b)}" is an unfolding we can't reconstruct otherwise. 
+simplLamBndr env bndr + | not (isId bndr && hasSomeUnfolding old_unf) = simplBinder env bndr -- Normal case + | otherwise = seqId id2 `seq` return (env', id2) + where + old_unf = idUnfolding bndr + (env', id1) = substIdBndr env bndr + id2 = id1 `setIdUnfolding` substUnfolding env old_unf + +-------------- +substIdBndr :: SimplEnv -> Id -- Substitition and Id to transform + -> (SimplEnv, Id) -- Transformed pair + +-- Returns with: +-- * Unique changed if necessary +-- * Type substituted +-- * Unfolding zapped +-- * Rules, worker, lbvar info all substituted +-- * Fragile occurrence info zapped +-- * The in-scope set extended with the returned Id +-- * The substitution extended with a DoneId if unique changed +-- In this case, the var in the DoneId is the same as the +-- var returned + +substIdBndr env@(SimplEnv { seInScope = in_scope, seIdSubst = id_subst}) + old_id + = (env { seInScope = in_scope `extendInScopeSet` new_id, + seIdSubst = new_subst }, new_id) + where + -- id1 is cloned if necessary + id1 = uniqAway in_scope old_id + + -- id2 has its type zapped + id2 = substIdType env id1 + + -- new_id has the final IdInfo + subst = mkCoreSubst env + new_id = maybeModifyIdInfo (substIdInfo subst) id2 + + -- Extend the substitution if the unique has changed + -- See the notes with substTyVarBndr for the delSubstEnv + new_subst | new_id /= old_id + = extendVarEnv id_subst old_id (DoneId new_id (idOccInfo old_id)) + | otherwise + = delVarEnv id_subst old_id +\end{code} + + +\begin{code} +seqTyVar :: TyVar -> () +seqTyVar b = b `seq` () + +seqId :: Id -> () +seqId id = seqType (idType id) `seq` + idInfo id `seq` + () + +seqIds :: [Id] -> () +seqIds [] = () +seqIds (id:ids) = seqId id `seq` seqIds ids +\end{code} + + +%************************************************************************ +%* * + Let bindings +%* * +%************************************************************************ + +Simplifying let binders +~~~~~~~~~~~~~~~~~~~~~~~ +Rename the binders if necessary, + +\begin{code} +simplNonRecBndr :: SimplEnv -> InBinder -> SimplM (SimplEnv, OutBinder) +simplNonRecBndr env id + = do { let (env1, id1) = substLetIdBndr env id + ; seqId id1 `seq` return (env1, id1) } + +--------------- +simplRecBndrs :: SimplEnv -> [InBinder] -> SimplM (SimplEnv, [OutBinder]) +simplRecBndrs env@(SimplEnv { seInScope = in_scope, seIdSubst = id_subst }) ids + = do { let (env1, ids1) = mapAccumL substLetIdBndr env ids + ; seqIds ids1 `seq` return (env1, ids1) } + +--------------- +substLetIdBndr :: SimplEnv -> InBinder -- Env and binder to transform + -> (SimplEnv, OutBinder) +-- C.f. 
CoreSubst.substIdBndr +-- Clone Id if necessary, substitute its type +-- Return an Id with completely zapped IdInfo +-- [addLetIdInfo, below, will restore its IdInfo] +-- Augment the subtitution +-- if the unique changed, *or* +-- if there's interesting occurrence info + +substLetIdBndr env@(SimplEnv { seInScope = in_scope, seIdSubst = id_subst }) old_id + = (env { seInScope = in_scope `extendInScopeSet` new_id, + seIdSubst = new_subst }, new_id) + where + id1 = uniqAway in_scope old_id + id2 = substIdType env id1 + new_id = setIdInfo id2 vanillaIdInfo + + -- Extend the substitution if the unique has changed, + -- or there's some useful occurrence information + -- See the notes with substTyVarBndr for the delSubstEnv + occ_info = occInfo (idInfo old_id) + new_subst | new_id /= old_id || isFragileOcc occ_info + = extendVarEnv id_subst old_id (DoneId new_id occ_info) + | otherwise + = delVarEnv id_subst old_id +\end{code} + +Add IdInfo back onto a let-bound Id +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +We must transfer the IdInfo of the original binder to the new binder. +This is crucial, to preserve + strictness + rules + worker info +etc. To do this we must apply the current substitution, +which incorporates earlier substitutions in this very letrec group. + +NB 1. We do this *before* processing the RHS of the binder, so that +its substituted rules are visible in its own RHS. +This is important. Manuel found cases where he really, really +wanted a RULE for a recursive function to apply in that function's +own right-hand side. + +NB 2: We do not transfer the arity (see Subst.substIdInfo) +The arity of an Id should not be visible +in its own RHS, else we eta-reduce + f = \x -> f x +to + f = f +which isn't sound. And it makes the arity in f's IdInfo greater than +the manifest arity, which isn't good. +The arity will get added later. + +NB 3: It's important that we *do* transer the loop-breaker OccInfo, +because that's what stops the Id getting inlined infinitely, in the body +of the letrec. + +NB 4: does no harm for non-recursive bindings + +NB 5: we can't do the addLetIdInfo part before *all* the RHSs because + rec { f = g + h = ... + RULE h Int = f + } +Here, we'll do postInlineUnconditionally on f, and we must "see" that +when substituting in h's RULE. 
+ +\begin{code} +addLetIdInfo :: SimplEnv -> InBinder -> OutBinder -> (SimplEnv, OutBinder) +addLetIdInfo env in_id out_id + = (modifyInScope env out_id out_id, final_id) + where + final_id = out_id `setIdInfo` new_info + subst = mkCoreSubst env + old_info = idInfo in_id + new_info = case substIdInfo subst old_info of + Nothing -> old_info + Just new_info -> new_info + +substIdInfo :: CoreSubst.Subst -> IdInfo -> Maybe IdInfo +-- Substitute the +-- rules +-- worker info +-- Zap the unfolding +-- Keep only 'robust' OccInfo +-- Zap Arity +-- +-- Seq'ing on the returned IdInfo is enough to cause all the +-- substitutions to happen completely + +substIdInfo subst info + | nothing_to_do = Nothing + | otherwise = Just (info `setOccInfo` (if keep_occ then old_occ else NoOccInfo) + `setArityInfo` (if keep_arity then old_arity else unknownArity) + `setSpecInfo` CoreSubst.substSpec subst old_rules + `setWorkerInfo` CoreSubst.substWorker subst old_wrkr + `setUnfoldingInfo` noUnfolding) + -- setSpecInfo does a seq + -- setWorkerInfo does a seq + where + nothing_to_do = keep_occ && keep_arity && + isEmptySpecInfo old_rules && + not (workerExists old_wrkr) && + not (hasUnfolding (unfoldingInfo info)) + + keep_occ = not (isFragileOcc old_occ) + keep_arity = old_arity == unknownArity + old_arity = arityInfo info + old_occ = occInfo info + old_rules = specInfo info + old_wrkr = workerInfo info + +------------------ +substIdType :: SimplEnv -> Id -> Id +substIdType env@(SimplEnv { seInScope = in_scope, seTvSubst = tv_env}) id + | isEmptyVarEnv tv_env || isEmptyVarSet (tyVarsOfType old_ty) = id + | otherwise = setIdType id (Type.substTy (TvSubst in_scope tv_env) old_ty) + -- The tyVarsOfType is cheaper than it looks + -- because we cache the free tyvars of the type + -- in a Note in the id's type itself + where + old_ty = idType id + +------------------ +substUnfolding env NoUnfolding = NoUnfolding +substUnfolding env (OtherCon cons) = OtherCon cons +substUnfolding env (CompulsoryUnfolding rhs) = CompulsoryUnfolding (substExpr env rhs) +substUnfolding env (CoreUnfolding rhs t v w g) = CoreUnfolding (substExpr env rhs) t v w g +\end{code} + + +%************************************************************************ +%* * + Impedence matching to type substitution +%* * +%************************************************************************ + +\begin{code} +substTy :: SimplEnv -> Type -> Type +substTy (SimplEnv { seInScope = in_scope, seTvSubst = tv_env }) ty + = Type.substTy (TvSubst in_scope tv_env) ty + +substTyVarBndr :: SimplEnv -> TyVar -> (SimplEnv, TyVar) +substTyVarBndr env@(SimplEnv { seInScope = in_scope, seTvSubst = tv_env }) tv + = case Type.substTyVarBndr (TvSubst in_scope tv_env) tv of + (TvSubst in_scope' tv_env', tv') + -> (env { seInScope = in_scope', seTvSubst = tv_env'}, tv') + +-- When substituting in rules etc we can get CoreSubst to do the work +-- But CoreSubst uses a simpler form of IdSubstEnv, so we must impedence-match +-- here. I think the this will not usually result in a lot of work; +-- the substitutions are typically small, and laziness will avoid work in many cases. 
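+-- (In mkCoreSubst below, 'fiddle' turns each SimplSR into the plain OutExpr
+-- that CoreSubst expects: a DoneEx already is one, a DoneId becomes a Var,
+-- and a suspended ContEx is substituted with a substitution built the same way.)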
+ +mkCoreSubst :: SimplEnv -> CoreSubst.Subst +mkCoreSubst (SimplEnv { seInScope = in_scope, seTvSubst = tv_env, seIdSubst = id_env }) + = mk_subst tv_env id_env + where + mk_subst tv_env id_env = CoreSubst.mkSubst in_scope tv_env (mapVarEnv fiddle id_env) + + fiddle (DoneEx e) = e + fiddle (DoneId v occ) = Var v + fiddle (ContEx tv id e) = CoreSubst.substExpr (mk_subst tv id) e + +substExpr :: SimplEnv -> CoreExpr -> CoreExpr +substExpr env expr + | isEmptySimplSubst env = expr + | otherwise = CoreSubst.substExpr (mkCoreSubst env) expr +\end{code} + + +%************************************************************************ +%* * +\subsection{Floats} +%* * +%************************************************************************ + +\begin{code} +type FloatsWithExpr = FloatsWith OutExpr +type FloatsWith a = (Floats, a) + -- We return something equivalent to (let b in e), but + -- in pieces to avoid the quadratic blowup when floating + -- incrementally. Comments just before simplExprB in Simplify.lhs + +data Floats = Floats (OrdList OutBind) + InScopeSet -- Environment "inside" all the floats + Bool -- True <=> All bindings are lifted + +allLifted :: Floats -> Bool +allLifted (Floats _ _ is_lifted) = is_lifted + +wrapFloats :: Floats -> OutExpr -> OutExpr +wrapFloats (Floats bs _ _) body = foldrOL Let body bs + +isEmptyFloats :: Floats -> Bool +isEmptyFloats (Floats bs _ _) = isNilOL bs + +floatBinds :: Floats -> [OutBind] +floatBinds (Floats bs _ _) = fromOL bs + +flattenFloats :: Floats -> Floats +-- Flattens into a single Rec group +flattenFloats (Floats bs is is_lifted) + = ASSERT2( is_lifted, ppr (fromOL bs) ) + Floats (unitOL (Rec (flattenBinds (fromOL bs)))) is is_lifted +\end{code} + +\begin{code} +emptyFloats :: SimplEnv -> Floats +emptyFloats env = Floats nilOL (getInScope env) True + +unitFloat :: SimplEnv -> OutId -> OutExpr -> Floats +-- A single non-rec float; extend the in-scope set +unitFloat env var rhs = Floats (unitOL (NonRec var rhs)) + (extendInScopeSet (getInScope env) var) + (not (isUnLiftedType (idType var))) + +addFloats :: SimplEnv -> Floats + -> (SimplEnv -> SimplM (FloatsWith a)) + -> SimplM (FloatsWith a) +addFloats env (Floats b1 is1 l1) thing_inside + | isNilOL b1 + = thing_inside env + | otherwise + = thing_inside (setInScopeSet env is1) `thenSmpl` \ (Floats b2 is2 l2, res) -> + returnSmpl (Floats (b1 `appOL` b2) is2 (l1 && l2), res) + +addLetBind :: OutBind -> Floats -> Floats +addLetBind bind (Floats binds in_scope lifted) + = Floats (bind `consOL` binds) in_scope (lifted && is_lifted_bind bind) + +is_lifted_bind (Rec _) = True +is_lifted_bind (NonRec b r) = not (isUnLiftedType (idType b)) + +-- addAuxiliaryBind * takes already-simplified things (bndr and rhs) +-- * extends the in-scope env +-- * assumes it's a let-bindable thing +addAuxiliaryBind :: SimplEnv -> OutBind + -> (SimplEnv -> SimplM (FloatsWith a)) + -> SimplM (FloatsWith a) + -- Extends the in-scope environment as well as wrapping the bindings +addAuxiliaryBind env bind thing_inside + = ASSERT( case bind of { NonRec b r -> not (needsCaseBinding (idType b) r) ; Rec _ -> True } ) + thing_inside (addNewInScopeIds env (bindersOf bind)) `thenSmpl` \ (floats, x) -> + returnSmpl (addLetBind bind floats, x) +\end{code} + + diff --git a/compiler/simplCore/SimplMonad.lhs b/compiler/simplCore/SimplMonad.lhs new file mode 100644 index 0000000000..bc09e1128c --- /dev/null +++ b/compiler/simplCore/SimplMonad.lhs @@ -0,0 +1,526 @@ +% +% (c) The AQUA Project, Glasgow University, 1993-1998 +% 
+\section[SimplMonad]{The simplifier Monad} + +\begin{code} +module SimplMonad ( + -- The monad + SimplM, + initSmpl, returnSmpl, thenSmpl, thenSmpl_, + mapSmpl, mapAndUnzipSmpl, mapAccumLSmpl, + getDOptsSmpl, + + -- Unique supply + getUniqueSmpl, getUniquesSmpl, getUniqSupplySmpl, newId, + + -- Counting + SimplCount, Tick(..), + tick, freeTick, + getSimplCount, zeroSimplCount, pprSimplCount, + plusSimplCount, isZeroSimplCount, + + -- Switch checker + SwitchChecker, SwitchResult(..), getSimplIntSwitch, + isAmongSimpl, intSwitchSet, switchIsOn + ) where + +#include "HsVersions.h" + +import Id ( Id, mkSysLocal ) +import Type ( Type ) +import UniqSupply ( uniqsFromSupply, uniqFromSupply, splitUniqSupply, + UniqSupply + ) +import DynFlags ( SimplifierSwitch(..), DynFlags, DynFlag(..), dopt ) +import StaticFlags ( opt_PprStyle_Debug, opt_HistorySize ) +import Unique ( Unique ) +import Maybes ( expectJust ) +import FiniteMap ( FiniteMap, emptyFM, isEmptyFM, lookupFM, addToFM, plusFM_C, fmToList ) +import FastString ( FastString ) +import Outputable +import FastTypes + +import GLAEXTS ( indexArray# ) + +#if __GLASGOW_HASKELL__ < 503 +import PrelArr ( Array(..) ) +#else +import GHC.Arr ( Array(..) ) +#endif + +import Array ( array, (//) ) + +infixr 0 `thenSmpl`, `thenSmpl_` +\end{code} + +%************************************************************************ +%* * +\subsection{Monad plumbing} +%* * +%************************************************************************ + +For the simplifier monad, we want to {\em thread} a unique supply and a counter. +(Command-line switches move around through the explicitly-passed SimplEnv.) + +\begin{code} +newtype SimplM result + = SM { unSM :: DynFlags -- We thread the unique supply because + -> UniqSupply -- constantly splitting it is rather expensive + -> SimplCount + -> (result, UniqSupply, SimplCount)} +\end{code} + +\begin{code} +initSmpl :: DynFlags + -> UniqSupply -- No init count; set to 0 + -> SimplM a + -> (a, SimplCount) + +initSmpl dflags us m + = case unSM m dflags us (zeroSimplCount dflags) of + (result, _, count) -> (result, count) + + +{-# INLINE thenSmpl #-} +{-# INLINE thenSmpl_ #-} +{-# INLINE returnSmpl #-} + +instance Monad SimplM where + (>>) = thenSmpl_ + (>>=) = thenSmpl + return = returnSmpl + +returnSmpl :: a -> SimplM a +returnSmpl e = SM (\ dflags us sc -> (e, us, sc)) + +thenSmpl :: SimplM a -> (a -> SimplM b) -> SimplM b +thenSmpl_ :: SimplM a -> SimplM b -> SimplM b + +thenSmpl m k + = SM (\ dflags us0 sc0 -> + case (unSM m dflags us0 sc0) of + (m_result, us1, sc1) -> unSM (k m_result) dflags us1 sc1 ) + +thenSmpl_ m k + = SM (\dflags us0 sc0 -> + case (unSM m dflags us0 sc0) of + (_, us1, sc1) -> unSM k dflags us1 sc1) +\end{code} + + +\begin{code} +mapSmpl :: (a -> SimplM b) -> [a] -> SimplM [b] +mapAndUnzipSmpl :: (a -> SimplM (b, c)) -> [a] -> SimplM ([b],[c]) + +mapSmpl f [] = returnSmpl [] +mapSmpl f (x:xs) + = f x `thenSmpl` \ x' -> + mapSmpl f xs `thenSmpl` \ xs' -> + returnSmpl (x':xs') + +mapAndUnzipSmpl f [] = returnSmpl ([],[]) +mapAndUnzipSmpl f (x:xs) + = f x `thenSmpl` \ (r1, r2) -> + mapAndUnzipSmpl f xs `thenSmpl` \ (rs1, rs2) -> + returnSmpl (r1:rs1, r2:rs2) + +mapAccumLSmpl :: (acc -> b -> SimplM (acc,c)) -> acc -> [b] -> SimplM (acc, [c]) +mapAccumLSmpl f acc [] = returnSmpl (acc, []) +mapAccumLSmpl f acc (x:xs) = f acc x `thenSmpl` \ (acc', x') -> + mapAccumLSmpl f acc' xs `thenSmpl` \ (acc'', xs') -> + returnSmpl (acc'', x':xs') +\end{code} + + 
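+By way of illustration only (this sketch is not part of the module, and the
+helpers in it are hypothetical), a small two-step traversal written with these
+combinators might look like the following.  The DynFlags, UniqSupply and
+SimplCount are threaded through each @thenSmpl@ without being mentioned:
+
+\begin{verbatim}
+  -- Hypothetical helper, shown only to illustrate the plumbing
+  doTwice :: (a -> SimplM a) -> a -> SimplM a
+  doTwice f x = f x   `thenSmpl` \ x1 ->
+                f x1  `thenSmpl` \ x2 ->
+                returnSmpl x2
+
+  -- Running it: initSmpl supplies the DynFlags and UniqSupply, and
+  -- returns the result paired with the accumulated SimplCount
+  runTwice dflags us pass x = initSmpl dflags us (doTwice pass x)
+\end{verbatim}
+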
+%************************************************************************ +%* * +\subsection{The unique supply} +%* * +%************************************************************************ + +\begin{code} +getUniqSupplySmpl :: SimplM UniqSupply +getUniqSupplySmpl + = SM (\dflags us sc -> case splitUniqSupply us of + (us1, us2) -> (us1, us2, sc)) + +getUniqueSmpl :: SimplM Unique +getUniqueSmpl + = SM (\dflags us sc -> case splitUniqSupply us of + (us1, us2) -> (uniqFromSupply us1, us2, sc)) + +getUniquesSmpl :: SimplM [Unique] +getUniquesSmpl + = SM (\dflags us sc -> case splitUniqSupply us of + (us1, us2) -> (uniqsFromSupply us1, us2, sc)) + +getDOptsSmpl :: SimplM DynFlags +getDOptsSmpl + = SM (\dflags us sc -> (dflags, us, sc)) + +newId :: FastString -> Type -> SimplM Id +newId fs ty = getUniqueSmpl `thenSmpl` \ uniq -> + returnSmpl (mkSysLocal fs uniq ty) +\end{code} + + +%************************************************************************ +%* * +\subsection{Counting up what we've done} +%* * +%************************************************************************ + +\begin{code} +getSimplCount :: SimplM SimplCount +getSimplCount = SM (\dflags us sc -> (sc, us, sc)) + +tick :: Tick -> SimplM () +tick t + = SM (\dflags us sc -> let sc' = doTick t sc + in sc' `seq` ((), us, sc')) + +freeTick :: Tick -> SimplM () +-- Record a tick, but don't add to the total tick count, which is +-- used to decide when nothing further has happened +freeTick t + = SM (\dflags us sc -> let sc' = doFreeTick t sc + in sc' `seq` ((), us, sc')) +\end{code} + +\begin{code} +verboseSimplStats = opt_PprStyle_Debug -- For now, anyway + +zeroSimplCount :: DynFlags -> SimplCount +isZeroSimplCount :: SimplCount -> Bool +pprSimplCount :: SimplCount -> SDoc +doTick, doFreeTick :: Tick -> SimplCount -> SimplCount +plusSimplCount :: SimplCount -> SimplCount -> SimplCount +\end{code} + +\begin{code} +data SimplCount = VerySimplZero -- These two are used when + | VerySimplNonZero -- we are only interested in + -- termination info + + | SimplCount { + ticks :: !Int, -- Total ticks + details :: !TickCounts, -- How many of each type + n_log :: !Int, -- N + log1 :: [Tick], -- Last N events; <= opt_HistorySize + log2 :: [Tick] -- Last opt_HistorySize events before that + } + +type TickCounts = FiniteMap Tick Int + +zeroSimplCount dflags + -- This is where we decide whether to do + -- the VerySimpl version or the full-stats version + | dopt Opt_D_dump_simpl_stats dflags + = SimplCount {ticks = 0, details = emptyFM, + n_log = 0, log1 = [], log2 = []} + | otherwise + = VerySimplZero + +isZeroSimplCount VerySimplZero = True +isZeroSimplCount (SimplCount { ticks = 0 }) = True +isZeroSimplCount other = False + +doFreeTick tick sc@SimplCount { details = dts } + = dts' `seqFM` sc { details = dts' } + where + dts' = dts `addTick` tick +doFreeTick tick sc = sc + +-- Gross hack to persuade GHC 3.03 to do this important seq +seqFM fm x | isEmptyFM fm = x + | otherwise = x + +doTick tick sc@SimplCount { ticks = tks, details = dts, n_log = nl, log1 = l1, log2 = l2 } + | nl >= opt_HistorySize = sc1 { n_log = 1, log1 = [tick], log2 = l1 } + | otherwise = sc1 { n_log = nl+1, log1 = tick : l1 } + where + sc1 = sc { ticks = tks+1, details = dts `addTick` tick } + +doTick tick sc = VerySimplNonZero -- The very simple case + + +-- Don't use plusFM_C because that's lazy, and we want to +-- be pretty strict here! 
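+-- (addTick looks the Tick up, forces the incremented count with seq,
+-- and writes it back.)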
+addTick :: TickCounts -> Tick -> TickCounts +addTick fm tick = case lookupFM fm tick of + Nothing -> addToFM fm tick 1 + Just n -> n1 `seq` addToFM fm tick n1 + where + n1 = n+1 + + +plusSimplCount sc1@(SimplCount { ticks = tks1, details = dts1 }) + sc2@(SimplCount { ticks = tks2, details = dts2 }) + = log_base { ticks = tks1 + tks2, details = plusFM_C (+) dts1 dts2 } + where + -- A hackish way of getting recent log info + log_base | null (log1 sc2) = sc1 -- Nothing at all in sc2 + | null (log2 sc2) = sc2 { log2 = log1 sc1 } + | otherwise = sc2 + +plusSimplCount VerySimplZero VerySimplZero = VerySimplZero +plusSimplCount sc1 sc2 = VerySimplNonZero + +pprSimplCount VerySimplZero = ptext SLIT("Total ticks: ZERO!") +pprSimplCount VerySimplNonZero = ptext SLIT("Total ticks: NON-ZERO!") +pprSimplCount (SimplCount { ticks = tks, details = dts, log1 = l1, log2 = l2 }) + = vcat [ptext SLIT("Total ticks: ") <+> int tks, + text "", + pprTickCounts (fmToList dts), + if verboseSimplStats then + vcat [text "", + ptext SLIT("Log (most recent first)"), + nest 4 (vcat (map ppr l1) $$ vcat (map ppr l2))] + else empty + ] + +pprTickCounts :: [(Tick,Int)] -> SDoc +pprTickCounts [] = empty +pprTickCounts ((tick1,n1):ticks) + = vcat [int tot_n <+> text (tickString tick1), + pprTCDetails real_these, + pprTickCounts others + ] + where + tick1_tag = tickToTag tick1 + (these, others) = span same_tick ticks + real_these = (tick1,n1):these + same_tick (tick2,_) = tickToTag tick2 == tick1_tag + tot_n = sum [n | (_,n) <- real_these] + +pprTCDetails ticks@((tick,_):_) + | verboseSimplStats || isRuleFired tick + = nest 4 (vcat [int n <+> pprTickCts tick | (tick,n) <- ticks]) + | otherwise + = empty +\end{code} + +%************************************************************************ +%* * +\subsection{Ticks} +%* * +%************************************************************************ + +\begin{code} +data Tick + = PreInlineUnconditionally Id + | PostInlineUnconditionally Id + + | UnfoldingDone Id + | RuleFired FastString -- Rule name + + | LetFloatFromLet + | EtaExpansion Id -- LHS binder + | EtaReduction Id -- Binder on outer lambda + | BetaReduction Id -- Lambda binder + + + | CaseOfCase Id -- Bndr on *inner* case + | KnownBranch Id -- Case binder + | CaseMerge Id -- Binder on outer case + | AltMerge Id -- Case binder + | CaseElim Id -- Case binder + | CaseIdentity Id -- Case binder + | FillInCaseDefault Id -- Case binder + + | BottomFound + | SimplifierDone -- Ticked at each iteration of the simplifier + +isRuleFired (RuleFired _) = True +isRuleFired other = False + +instance Outputable Tick where + ppr tick = text (tickString tick) <+> pprTickCts tick + +instance Eq Tick where + a == b = case a `cmpTick` b of { EQ -> True; other -> False } + +instance Ord Tick where + compare = cmpTick + +tickToTag :: Tick -> Int +tickToTag (PreInlineUnconditionally _) = 0 +tickToTag (PostInlineUnconditionally _) = 1 +tickToTag (UnfoldingDone _) = 2 +tickToTag (RuleFired _) = 3 +tickToTag LetFloatFromLet = 4 +tickToTag (EtaExpansion _) = 5 +tickToTag (EtaReduction _) = 6 +tickToTag (BetaReduction _) = 7 +tickToTag (CaseOfCase _) = 8 +tickToTag (KnownBranch _) = 9 +tickToTag (CaseMerge _) = 10 +tickToTag (CaseElim _) = 11 +tickToTag (CaseIdentity _) = 12 +tickToTag (FillInCaseDefault _) = 13 +tickToTag BottomFound = 14 +tickToTag SimplifierDone = 16 +tickToTag (AltMerge _) = 17 + +tickString :: Tick -> String +tickString (PreInlineUnconditionally _) = "PreInlineUnconditionally" +tickString (PostInlineUnconditionally _)= 
"PostInlineUnconditionally" +tickString (UnfoldingDone _) = "UnfoldingDone" +tickString (RuleFired _) = "RuleFired" +tickString LetFloatFromLet = "LetFloatFromLet" +tickString (EtaExpansion _) = "EtaExpansion" +tickString (EtaReduction _) = "EtaReduction" +tickString (BetaReduction _) = "BetaReduction" +tickString (CaseOfCase _) = "CaseOfCase" +tickString (KnownBranch _) = "KnownBranch" +tickString (CaseMerge _) = "CaseMerge" +tickString (AltMerge _) = "AltMerge" +tickString (CaseElim _) = "CaseElim" +tickString (CaseIdentity _) = "CaseIdentity" +tickString (FillInCaseDefault _) = "FillInCaseDefault" +tickString BottomFound = "BottomFound" +tickString SimplifierDone = "SimplifierDone" + +pprTickCts :: Tick -> SDoc +pprTickCts (PreInlineUnconditionally v) = ppr v +pprTickCts (PostInlineUnconditionally v)= ppr v +pprTickCts (UnfoldingDone v) = ppr v +pprTickCts (RuleFired v) = ppr v +pprTickCts LetFloatFromLet = empty +pprTickCts (EtaExpansion v) = ppr v +pprTickCts (EtaReduction v) = ppr v +pprTickCts (BetaReduction v) = ppr v +pprTickCts (CaseOfCase v) = ppr v +pprTickCts (KnownBranch v) = ppr v +pprTickCts (CaseMerge v) = ppr v +pprTickCts (AltMerge v) = ppr v +pprTickCts (CaseElim v) = ppr v +pprTickCts (CaseIdentity v) = ppr v +pprTickCts (FillInCaseDefault v) = ppr v +pprTickCts other = empty + +cmpTick :: Tick -> Tick -> Ordering +cmpTick a b = case (tickToTag a `compare` tickToTag b) of + GT -> GT + EQ | isRuleFired a || verboseSimplStats -> cmpEqTick a b + | otherwise -> EQ + LT -> LT + -- Always distinguish RuleFired, so that the stats + -- can report them even in non-verbose mode + +cmpEqTick :: Tick -> Tick -> Ordering +cmpEqTick (PreInlineUnconditionally a) (PreInlineUnconditionally b) = a `compare` b +cmpEqTick (PostInlineUnconditionally a) (PostInlineUnconditionally b) = a `compare` b +cmpEqTick (UnfoldingDone a) (UnfoldingDone b) = a `compare` b +cmpEqTick (RuleFired a) (RuleFired b) = a `compare` b +cmpEqTick (EtaExpansion a) (EtaExpansion b) = a `compare` b +cmpEqTick (EtaReduction a) (EtaReduction b) = a `compare` b +cmpEqTick (BetaReduction a) (BetaReduction b) = a `compare` b +cmpEqTick (CaseOfCase a) (CaseOfCase b) = a `compare` b +cmpEqTick (KnownBranch a) (KnownBranch b) = a `compare` b +cmpEqTick (CaseMerge a) (CaseMerge b) = a `compare` b +cmpEqTick (AltMerge a) (AltMerge b) = a `compare` b +cmpEqTick (CaseElim a) (CaseElim b) = a `compare` b +cmpEqTick (CaseIdentity a) (CaseIdentity b) = a `compare` b +cmpEqTick (FillInCaseDefault a) (FillInCaseDefault b) = a `compare` b +cmpEqTick other1 other2 = EQ +\end{code} + + +%************************************************************************ +%* * +\subsubsection{Command-line switches} +%* * +%************************************************************************ + +\begin{code} +type SwitchChecker = SimplifierSwitch -> SwitchResult + +data SwitchResult + = SwBool Bool -- on/off + | SwString FastString -- nothing or a String + | SwInt Int -- nothing or an Int + +isAmongSimpl :: [SimplifierSwitch] -> SimplifierSwitch -> SwitchResult +isAmongSimpl on_switches -- Switches mentioned later occur *earlier* + -- in the list; defaults right at the end. + = let + tidied_on_switches = foldl rm_dups [] on_switches + -- The fold*l* ensures that we keep the latest switches; + -- ie the ones that occur earliest in the list. + + sw_tbl :: Array Int SwitchResult + sw_tbl = (array (0, lAST_SIMPL_SWITCH_TAG) -- bounds... + all_undefined) + // defined_elems + + all_undefined = [ (i, SwBool False) | i <- [0 .. 
lAST_SIMPL_SWITCH_TAG ] ] + + defined_elems = map mk_assoc_elem tidied_on_switches + in + -- (avoid some unboxing, bounds checking, and other horrible things:) + case sw_tbl of { Array _ _ stuff -> + \ switch -> + case (indexArray# stuff (tagOf_SimplSwitch switch)) of + (# v #) -> v + } + where + mk_assoc_elem k@(MaxSimplifierIterations lvl) + = (iBox (tagOf_SimplSwitch k), SwInt lvl) + mk_assoc_elem k + = (iBox (tagOf_SimplSwitch k), SwBool True) -- I'm here, Mom! + + -- cannot have duplicates if we are going to use the array thing + rm_dups switches_so_far switch + = if switch `is_elem` switches_so_far + then switches_so_far + else switch : switches_so_far + where + sw `is_elem` [] = False + sw `is_elem` (s:ss) = (tagOf_SimplSwitch sw) ==# (tagOf_SimplSwitch s) + || sw `is_elem` ss +\end{code} + +\begin{code} +getSimplIntSwitch :: SwitchChecker -> (Int-> SimplifierSwitch) -> Int +getSimplIntSwitch chkr switch + = expectJust "getSimplIntSwitch" (intSwitchSet chkr switch) + +switchIsOn :: (switch -> SwitchResult) -> switch -> Bool + +switchIsOn lookup_fn switch + = case (lookup_fn switch) of + SwBool False -> False + _ -> True + +intSwitchSet :: (switch -> SwitchResult) + -> (Int -> switch) + -> Maybe Int + +intSwitchSet lookup_fn switch + = case (lookup_fn (switch (panic "intSwitchSet"))) of + SwInt int -> Just int + _ -> Nothing +\end{code} + + +These things behave just like enumeration types. + +\begin{code} +instance Eq SimplifierSwitch where + a == b = tagOf_SimplSwitch a ==# tagOf_SimplSwitch b + +instance Ord SimplifierSwitch where + a < b = tagOf_SimplSwitch a <# tagOf_SimplSwitch b + a <= b = tagOf_SimplSwitch a <=# tagOf_SimplSwitch b + + +tagOf_SimplSwitch (MaxSimplifierIterations _) = _ILIT(1) +tagOf_SimplSwitch NoCaseOfCase = _ILIT(2) + +-- If you add anything here, be sure to change lAST_SIMPL_SWITCH_TAG, too! 
+ +lAST_SIMPL_SWITCH_TAG = 2 +\end{code} + diff --git a/compiler/simplCore/SimplUtils.lhs b/compiler/simplCore/SimplUtils.lhs new file mode 100644 index 0000000000..9e616b5df1 --- /dev/null +++ b/compiler/simplCore/SimplUtils.lhs @@ -0,0 +1,1592 @@ +% +% (c) The AQUA Project, Glasgow University, 1993-1998 +% +\section[SimplUtils]{The simplifier utilities} + +\begin{code} +module SimplUtils ( + mkLam, prepareAlts, mkCase, + + -- Inlining, + preInlineUnconditionally, postInlineUnconditionally, activeInline, activeRule, + inlineMode, + + -- The continuation type + SimplCont(..), DupFlag(..), LetRhsFlag(..), + contIsDupable, contResultType, + countValArgs, countArgs, pushContArgs, + mkBoringStop, mkRhsStop, contIsRhs, contIsRhsOrArg, + getContArgs, interestingCallContext, interestingArg, isStrictType + + ) where + +#include "HsVersions.h" + +import SimplEnv +import DynFlags ( SimplifierSwitch(..), SimplifierMode(..), + DynFlag(..), dopt ) +import StaticFlags ( opt_UF_UpdateInPlace, opt_SimplNoPreInlining, + opt_RulesOff ) +import CoreSyn +import CoreFVs ( exprFreeVars ) +import CoreUtils ( cheapEqExpr, exprType, exprIsTrivial, exprIsCheap, + etaExpand, exprEtaExpandArity, bindNonRec, mkCoerce2, + findDefault, exprOkForSpeculation, exprIsHNF + ) +import Literal ( mkStringLit ) +import CoreUnfold ( smallEnoughToInline ) +import MkId ( eRROR_ID ) +import Id ( idType, isDataConWorkId, idOccInfo, isDictId, + mkSysLocal, isDeadBinder, idNewDemandInfo, isExportedId, + idUnfolding, idNewStrictness, idInlinePragma, + ) +import NewDemand ( isStrictDmd, isBotRes, splitStrictSig ) +import SimplMonad +import Type ( Type, splitFunTys, dropForAlls, isStrictType, + splitTyConApp_maybe, tyConAppArgs, mkTyVarTys + ) +import Name ( mkSysTvName ) +import TyCon ( tyConDataCons_maybe, isAlgTyCon, isNewTyCon ) +import DataCon ( dataConRepArity, dataConTyVars, dataConInstArgTys, isVanillaDataCon ) +import Var ( tyVarKind, mkTyVar ) +import VarSet +import BasicTypes ( TopLevelFlag(..), isNotTopLevel, OccInfo(..), isLoopBreaker, isOneOcc, + Activation, isAlwaysActive, isActive ) +import Util ( lengthExceeds ) +import Outputable +\end{code} + + +%************************************************************************ +%* * +\subsection{The continuation data type} +%* * +%************************************************************************ + +\begin{code} +data SimplCont -- Strict contexts + = Stop OutType -- Type of the result + LetRhsFlag + Bool -- True <=> This is the RHS of a thunk whose type suggests + -- that update-in-place would be possible + -- (This makes the inliner a little keener.) + + | CoerceIt OutType -- The To-type, simplified + SimplCont + + | InlinePlease -- This continuation makes a function very + SimplCont -- keen to inline itelf + + | ApplyTo DupFlag + InExpr SimplEnv -- The argument, as yet unsimplified, + SimplCont -- and its environment + + | Select DupFlag + InId [InAlt] SimplEnv -- The case binder, alts, and subst-env + SimplCont + + | ArgOf LetRhsFlag -- An arbitrary strict context: the argument + -- of a strict function, or a primitive-arg fn + -- or a PrimOp + -- No DupFlag because we never duplicate it + OutType -- arg_ty: type of the argument itself + OutType -- cont_ty: the type of the expression being sought by the context + -- f (error "foo") ==> coerce t (error "foo") + -- when f is strict + -- We need to know the type t, to which to coerce. 
+ + (SimplEnv -> OutExpr -> SimplM FloatsWithExpr) -- What to do with the result + -- The result expression in the OutExprStuff has type cont_ty + +data LetRhsFlag = AnArg -- It's just an argument not a let RHS + | AnRhs -- It's the RHS of a let (so please float lets out of big lambdas) + +instance Outputable LetRhsFlag where + ppr AnArg = ptext SLIT("arg") + ppr AnRhs = ptext SLIT("rhs") + +instance Outputable SimplCont where + ppr (Stop ty is_rhs _) = ptext SLIT("Stop") <> brackets (ppr is_rhs) <+> ppr ty + ppr (ApplyTo dup arg se cont) = (ptext SLIT("ApplyTo") <+> ppr dup <+> ppr arg) $$ ppr cont + ppr (ArgOf _ _ _ _) = ptext SLIT("ArgOf...") + ppr (Select dup bndr alts se cont) = (ptext SLIT("Select") <+> ppr dup <+> ppr bndr) $$ + (nest 4 (ppr alts)) $$ ppr cont + ppr (CoerceIt ty cont) = (ptext SLIT("CoerceIt") <+> ppr ty) $$ ppr cont + ppr (InlinePlease cont) = ptext SLIT("InlinePlease") $$ ppr cont + +data DupFlag = OkToDup | NoDup + +instance Outputable DupFlag where + ppr OkToDup = ptext SLIT("ok") + ppr NoDup = ptext SLIT("nodup") + + +------------------- +mkBoringStop, mkRhsStop :: OutType -> SimplCont +mkBoringStop ty = Stop ty AnArg (canUpdateInPlace ty) +mkRhsStop ty = Stop ty AnRhs (canUpdateInPlace ty) + +contIsRhs :: SimplCont -> Bool +contIsRhs (Stop _ AnRhs _) = True +contIsRhs (ArgOf AnRhs _ _ _) = True +contIsRhs other = False + +contIsRhsOrArg (Stop _ _ _) = True +contIsRhsOrArg (ArgOf _ _ _ _) = True +contIsRhsOrArg other = False + +------------------- +contIsDupable :: SimplCont -> Bool +contIsDupable (Stop _ _ _) = True +contIsDupable (ApplyTo OkToDup _ _ _) = True +contIsDupable (Select OkToDup _ _ _ _) = True +contIsDupable (CoerceIt _ cont) = contIsDupable cont +contIsDupable (InlinePlease cont) = contIsDupable cont +contIsDupable other = False + +------------------- +discardableCont :: SimplCont -> Bool +discardableCont (Stop _ _ _) = False +discardableCont (CoerceIt _ cont) = discardableCont cont +discardableCont (InlinePlease cont) = discardableCont cont +discardableCont other = True + +discardCont :: SimplCont -- A continuation, expecting + -> SimplCont -- Replace the continuation with a suitable coerce +discardCont cont = case cont of + Stop to_ty is_rhs _ -> cont + other -> CoerceIt to_ty (mkBoringStop to_ty) + where + to_ty = contResultType cont + +------------------- +contResultType :: SimplCont -> OutType +contResultType (Stop to_ty _ _) = to_ty +contResultType (ArgOf _ _ to_ty _) = to_ty +contResultType (ApplyTo _ _ _ cont) = contResultType cont +contResultType (CoerceIt _ cont) = contResultType cont +contResultType (InlinePlease cont) = contResultType cont +contResultType (Select _ _ _ _ cont) = contResultType cont + +------------------- +countValArgs :: SimplCont -> Int +countValArgs (ApplyTo _ (Type ty) se cont) = countValArgs cont +countValArgs (ApplyTo _ val_arg se cont) = 1 + countValArgs cont +countValArgs other = 0 + +countArgs :: SimplCont -> Int +countArgs (ApplyTo _ arg se cont) = 1 + countArgs cont +countArgs other = 0 + +------------------- +pushContArgs :: SimplEnv -> [OutArg] -> SimplCont -> SimplCont +-- Pushes args with the specified environment +pushContArgs env [] cont = cont +pushContArgs env (arg : args) cont = ApplyTo NoDup arg env (pushContArgs env args cont) +\end{code} + + +\begin{code} +getContArgs :: SwitchChecker + -> OutId -> SimplCont + -> ([(InExpr, SimplEnv, Bool)], -- Arguments; the Bool is true for strict args + SimplCont, -- Remaining continuation + Bool) -- Whether we came across an InlineCall +-- getContArgs id k = 
(args, k', inl) +-- args are the leading ApplyTo items in k +-- (i.e. outermost comes first) +-- augmented with demand info from the functionn +getContArgs chkr fun orig_cont + = let + -- Ignore strictness info if the no-case-of-case + -- flag is on. Strictness changes evaluation order + -- and that can change full laziness + stricts | switchIsOn chkr NoCaseOfCase = vanilla_stricts + | otherwise = computed_stricts + in + go [] stricts False orig_cont + where + ---------------------------- + + -- Type argument + go acc ss inl (ApplyTo _ arg@(Type _) se cont) + = go ((arg,se,False) : acc) ss inl cont + -- NB: don't bother to instantiate the function type + + -- Value argument + go acc (s:ss) inl (ApplyTo _ arg se cont) + = go ((arg,se,s) : acc) ss inl cont + + -- An Inline continuation + go acc ss inl (InlinePlease cont) + = go acc ss True cont + + -- We're run out of arguments, or else we've run out of demands + -- The latter only happens if the result is guaranteed bottom + -- This is the case for + -- * case (error "hello") of { ... } + -- * (error "Hello") arg + -- * f (error "Hello") where f is strict + -- etc + -- Then, especially in the first of these cases, we'd like to discard + -- the continuation, leaving just the bottoming expression. But the + -- type might not be right, so we may have to add a coerce. + go acc ss inl cont + | null ss && discardableCont cont = (reverse acc, discardCont cont, inl) + | otherwise = (reverse acc, cont, inl) + + ---------------------------- + vanilla_stricts, computed_stricts :: [Bool] + vanilla_stricts = repeat False + computed_stricts = zipWith (||) fun_stricts arg_stricts + + ---------------------------- + (val_arg_tys, _) = splitFunTys (dropForAlls (idType fun)) + arg_stricts = map isStrictType val_arg_tys ++ repeat False + -- These argument types are used as a cheap and cheerful way to find + -- unboxed arguments, which must be strict. But it's an InType + -- and so there might be a type variable where we expect a function + -- type (the substitution hasn't happened yet). And we don't bother + -- doing the type applications for a polymorphic function. + -- Hence the splitFunTys*IgnoringForAlls* + + ---------------------------- + -- If fun_stricts is finite, it means the function returns bottom + -- after that number of value args have been consumed + -- Otherwise it's infinite, extended with False + fun_stricts + = case splitStrictSig (idNewStrictness fun) of + (demands, result_info) + | not (demands `lengthExceeds` countValArgs orig_cont) + -> -- Enough args, use the strictness given. + -- For bottoming functions we used to pretend that the arg + -- is lazy, so that we don't treat the arg as an + -- interesting context. This avoids substituting + -- top-level bindings for (say) strings into + -- calls to error. But now we are more careful about + -- inlining lone variables, so its ok (see SimplUtils.analyseCont) + if isBotRes result_info then + map isStrictDmd demands -- Finite => result is bottom + else + map isStrictDmd demands ++ vanilla_stricts + + other -> vanilla_stricts -- Not enough args, or no strictness + +------------------- +interestingArg :: OutExpr -> Bool + -- An argument is interesting if it has *some* structure + -- We are here trying to avoid unfolding a function that + -- is applied only to variables that have no unfolding + -- (i.e. they are probably lambda bound): f x y z + -- There is little point in inlining f here. 
+interestingArg (Var v) = hasSomeUnfolding (idUnfolding v) + -- Was: isValueUnfolding (idUnfolding v') + -- But that seems over-pessimistic + || isDataConWorkId v + -- This accounts for an argument like + -- () or [], which is definitely interesting +interestingArg (Type _) = False +interestingArg (App fn (Type _)) = interestingArg fn +interestingArg (Note _ a) = interestingArg a +interestingArg other = True + -- Consider let x = 3 in f x + -- The substitution will contain (x -> ContEx 3), and we want to + -- to say that x is an interesting argument. + -- But consider also (\x. f x y) y + -- The substitution will contain (x -> ContEx y), and we want to say + -- that x is not interesting (assuming y has no unfolding) +\end{code} + +Comment about interestingCallContext +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +We want to avoid inlining an expression where there can't possibly be +any gain, such as in an argument position. Hence, if the continuation +is interesting (eg. a case scrutinee, application etc.) then we +inline, otherwise we don't. + +Previously some_benefit used to return True only if the variable was +applied to some value arguments. This didn't work: + + let x = _coerce_ (T Int) Int (I# 3) in + case _coerce_ Int (T Int) x of + I# y -> .... + +we want to inline x, but can't see that it's a constructor in a case +scrutinee position, and some_benefit is False. + +Another example: + +dMonadST = _/\_ t -> :Monad (g1 _@_ t, g2 _@_ t, g3 _@_ t) + +.... case dMonadST _@_ x0 of (a,b,c) -> .... + +we'd really like to inline dMonadST here, but we *don't* want to +inline if the case expression is just + + case x of y { DEFAULT -> ... } + +since we can just eliminate this case instead (x is in WHNF). Similar +applies when x is bound to a lambda expression. Hence +contIsInteresting looks for case expressions with just a single +default case. + +\begin{code} +interestingCallContext :: Bool -- False <=> no args at all + -> Bool -- False <=> no value args + -> SimplCont -> Bool + -- The "lone-variable" case is important. I spent ages + -- messing about with unsatisfactory varaints, but this is nice. + -- The idea is that if a variable appear all alone + -- as an arg of lazy fn, or rhs Stop + -- as scrutinee of a case Select + -- as arg of a strict fn ArgOf + -- then we should not inline it (unless there is some other reason, + -- e.g. is is the sole occurrence). We achieve this by making + -- interestingCallContext return False for a lone variable. + -- + -- Why? At least in the case-scrutinee situation, turning + -- let x = (a,b) in case x of y -> ... + -- into + -- let x = (a,b) in case (a,b) of y -> ... + -- and thence to + -- let x = (a,b) in let y = (a,b) in ... + -- is bad if the binding for x will remain. + -- + -- Another example: I discovered that strings + -- were getting inlined straight back into applications of 'error' + -- because the latter is strict. + -- s = "foo" + -- f = \x -> ...(error s)... + + -- Fundamentally such contexts should not ecourage inlining because + -- the context can ``see'' the unfolding of the variable (e.g. case or a RULE) + -- so there's no gain. + -- + -- However, even a type application or coercion isn't a lone variable. + -- Consider + -- case $fMonadST @ RealWorld of { :DMonad a b c -> c } + -- We had better inline that sucker! The case won't see through it. + -- + -- For now, I'm treating treating a variable applied to types + -- in a *lazy* context "lone". The motivating example was + -- f = /\a. \x. BIG + -- g = /\a. \y. 
h (f a) + -- There's no advantage in inlining f here, and perhaps + -- a significant disadvantage. Hence some_val_args in the Stop case + +interestingCallContext some_args some_val_args cont + = interesting cont + where + interesting (InlinePlease _) = True + interesting (Select _ _ _ _ _) = some_args + interesting (ApplyTo _ _ _ _) = True -- Can happen if we have (coerce t (f x)) y + -- Perhaps True is a bit over-keen, but I've + -- seen (coerce f) x, where f has an INLINE prag, + -- So we have to give some motivaiton for inlining it + interesting (ArgOf _ _ _ _) = some_val_args + interesting (Stop ty _ upd_in_place) = some_val_args && upd_in_place + interesting (CoerceIt _ cont) = interesting cont + -- If this call is the arg of a strict function, the context + -- is a bit interesting. If we inline here, we may get useful + -- evaluation information to avoid repeated evals: e.g. + -- x + (y * z) + -- Here the contIsInteresting makes the '*' keener to inline, + -- which in turn exposes a constructor which makes the '+' inline. + -- Assuming that +,* aren't small enough to inline regardless. + -- + -- It's also very important to inline in a strict context for things + -- like + -- foldr k z (f x) + -- Here, the context of (f x) is strict, and if f's unfolding is + -- a build it's *great* to inline it here. So we must ensure that + -- the context for (f x) is not totally uninteresting. + + +------------------- +canUpdateInPlace :: Type -> Bool +-- Consider let x = <wurble> in ... +-- If <wurble> returns an explicit constructor, we might be able +-- to do update in place. So we treat even a thunk RHS context +-- as interesting if update in place is possible. We approximate +-- this by seeing if the type has a single constructor with a +-- small arity. But arity zero isn't good -- we share the single copy +-- for that case, so no point in sharing. + +canUpdateInPlace ty + | not opt_UF_UpdateInPlace = False + | otherwise + = case splitTyConApp_maybe ty of + Nothing -> False + Just (tycon, _) -> case tyConDataCons_maybe tycon of + Just [dc] -> arity == 1 || arity == 2 + where + arity = dataConRepArity dc + other -> False +\end{code} + + + +%************************************************************************ +%* * +\subsection{Decisions about inlining} +%* * +%************************************************************************ + +Inlining is controlled partly by the SimplifierMode switch. This has two +settings: + + SimplGently (a) Simplifying before specialiser/full laziness + (b) Simplifiying inside INLINE pragma + (c) Simplifying the LHS of a rule + (d) Simplifying a GHCi expression or Template + Haskell splice + + SimplPhase n Used at all other times + +The key thing about SimplGently is that it does no call-site inlining. +Before full laziness we must be careful not to inline wrappers, +because doing so inhibits floating + e.g. ...(case f x of ...)... + ==> ...(case (case x of I# x# -> fw x#) of ...)... + ==> ...(case x of I# x# -> case fw x# of ...)... +and now the redex (f x) isn't floatable any more. + +The no-inling thing is also important for Template Haskell. You might be +compiling in one-shot mode with -O2; but when TH compiles a splice before +running it, we don't want to use -O2. Indeed, we don't want to inline +anything, because the byte-code interpreter might get confused about +unboxed tuples and suchlike. + +INLINE pragmas +~~~~~~~~~~~~~~ +SimplGently is also used as the mode to simplify inside an InlineMe note. 
+ +\begin{code} +inlineMode :: SimplifierMode +inlineMode = SimplGently +\end{code} + +It really is important to switch off inlinings inside such +expressions. Consider the following example + + let f = \pq -> BIG + in + let g = \y -> f y y + {-# INLINE g #-} + in ...g...g...g...g...g... + +Now, if that's the ONLY occurrence of f, it will be inlined inside g, +and thence copied multiple times when g is inlined. + + +This function may be inlinined in other modules, so we +don't want to remove (by inlining) calls to functions that have +specialisations, or that may have transformation rules in an importing +scope. + +E.g. {-# INLINE f #-} + f x = ...g... + +and suppose that g is strict *and* has specialisations. If we inline +g's wrapper, we deny f the chance of getting the specialised version +of g when f is inlined at some call site (perhaps in some other +module). + +It's also important not to inline a worker back into a wrapper. +A wrapper looks like + wraper = inline_me (\x -> ...worker... ) +Normally, the inline_me prevents the worker getting inlined into +the wrapper (initially, the worker's only call site!). But, +if the wrapper is sure to be called, the strictness analyser will +mark it 'demanded', so when the RHS is simplified, it'll get an ArgOf +continuation. That's why the keep_inline predicate returns True for +ArgOf continuations. It shouldn't do any harm not to dissolve the +inline-me note under these circumstances. + +Note that the result is that we do very little simplification +inside an InlineMe. + + all xs = foldr (&&) True xs + any p = all . map p {-# INLINE any #-} + +Problem: any won't get deforested, and so if it's exported and the +importer doesn't use the inlining, (eg passes it as an arg) then we +won't get deforestation at all. We havn't solved this problem yet! + + +preInlineUnconditionally +~~~~~~~~~~~~~~~~~~~~~~~~ +@preInlineUnconditionally@ examines a bndr to see if it is used just +once in a completely safe way, so that it is safe to discard the +binding inline its RHS at the (unique) usage site, REGARDLESS of how +big the RHS might be. If this is the case we don't simplify the RHS +first, but just inline it un-simplified. + +This is much better than first simplifying a perhaps-huge RHS and then +inlining and re-simplifying it. Indeed, it can be at least quadratically +better. Consider + + x1 = e1 + x2 = e2[x1] + x3 = e3[x2] + ...etc... + xN = eN[xN-1] + +We may end up simplifying e1 N times, e2 N-1 times, e3 N-3 times etc. +This can happen with cascades of functions too: + + f1 = \x1.e1 + f2 = \xs.e2[f1] + f3 = \xs.e3[f3] + ...etc... + +THE MAIN INVARIANT is this: + + ---- preInlineUnconditionally invariant ----- + IF preInlineUnconditionally chooses to inline x = <rhs> + THEN doing the inlining should not change the occurrence + info for the free vars of <rhs> + ---------------------------------------------- + +For example, it's tempting to look at trivial binding like + x = y +and inline it unconditionally. But suppose x is used many times, +but this is the unique occurrence of y. Then inlining x would change +y's occurrence info, which breaks the invariant. It matters: y +might have a BIG rhs, which will now be dup'd at every occurrenc of x. + + +Evne RHSs labelled InlineMe aren't caught here, because there might be +no benefit from inlining at the call site. + +[Sept 01] Don't unconditionally inline a top-level thing, because that +can simply make a static thing into something built dynamically. E.g. 
+ x = (a,b) + main = \s -> h x + +[Remember that we treat \s as a one-shot lambda.] No point in +inlining x unless there is something interesting about the call site. + +But watch out: if you aren't careful, some useful foldr/build fusion +can be lost (most notably in spectral/hartel/parstof) because the +foldr didn't see the build. Doing the dynamic allocation isn't a big +deal, in fact, but losing the fusion can be. But the right thing here +seems to be to do a callSiteInline based on the fact that there is +something interesting about the call site (it's strict). Hmm. That +seems a bit fragile. + +Conclusion: inline top level things gaily until Phase 0 (the last +phase), at which point don't. + +\begin{code} +preInlineUnconditionally :: SimplEnv -> TopLevelFlag -> InId -> InExpr -> Bool +preInlineUnconditionally env top_lvl bndr rhs + | not active = False + | opt_SimplNoPreInlining = False + | otherwise = case idOccInfo bndr of + IAmDead -> True -- Happens in ((\x.1) v) + OneOcc in_lam True int_cxt -> try_once in_lam int_cxt + other -> False + where + phase = getMode env + active = case phase of + SimplGently -> isAlwaysActive prag + SimplPhase n -> isActive n prag + prag = idInlinePragma bndr + + try_once in_lam int_cxt -- There's one textual occurrence + | not in_lam = isNotTopLevel top_lvl || early_phase + | otherwise = int_cxt && canInlineInLam rhs + +-- Be very careful before inlining inside a lambda, becuase (a) we must not +-- invalidate occurrence information, and (b) we want to avoid pushing a +-- single allocation (here) into multiple allocations (inside lambda). +-- Inlining a *function* with a single *saturated* call would be ok, mind you. +-- || (if is_cheap && not (canInlineInLam rhs) then pprTrace "preinline" (ppr bndr <+> ppr rhs) ok else ok) +-- where +-- is_cheap = exprIsCheap rhs +-- ok = is_cheap && int_cxt + + -- int_cxt The context isn't totally boring + -- E.g. let f = \ab.BIG in \y. map f xs + -- Don't want to substitute for f, because then we allocate + -- its closure every time the \y is called + -- But: let f = \ab.BIG in \y. map (f y) xs + -- Now we do want to substitute for f, even though it's not + -- saturated, because we're going to allocate a closure for + -- (f y) every time round the loop anyhow. + + -- canInlineInLam => free vars of rhs are (Once in_lam) or Many, + -- so substituting rhs inside a lambda doesn't change the occ info. + -- Sadly, not quite the same as exprIsHNF. + canInlineInLam (Lit l) = True + canInlineInLam (Lam b e) = isRuntimeVar b || canInlineInLam e + canInlineInLam (Note _ e) = canInlineInLam e + canInlineInLam _ = False + + early_phase = case phase of + SimplPhase 0 -> False + other -> True +-- If we don't have this early_phase test, consider +-- x = length [1,2,3] +-- The full laziness pass carefully floats all the cons cells to +-- top level, and preInlineUnconditionally floats them all back in. +-- Result is (a) static allocation replaced by dynamic allocation +-- (b) many simplifier iterations because this tickles +-- a related problem; only one inlining per pass +-- +-- On the other hand, I have seen cases where top-level fusion is +-- lost if we don't inline top level thing (e.g. string constants) +-- Hence the test for phase zero (which is the phase for all the final +-- simplifications). Until phase zero we take no special notice of +-- top level things, but then we become more leery about inlining +-- them. 
+ +\end{code} + +postInlineUnconditionally +~~~~~~~~~~~~~~~~~~~~~~~~~ +@postInlineUnconditionally@ decides whether to unconditionally inline +a thing based on the form of its RHS; in particular if it has a +trivial RHS. If so, we can inline and discard the binding altogether. + +NB: a loop breaker has must_keep_binding = True and non-loop-breakers +only have *forward* references Hence, it's safe to discard the binding + +NOTE: This isn't our last opportunity to inline. We're at the binding +site right now, and we'll get another opportunity when we get to the +ocurrence(s) + +Note that we do this unconditional inlining only for trival RHSs. +Don't inline even WHNFs inside lambdas; doing so may simply increase +allocation when the function is called. This isn't the last chance; see +NOTE above. + +NB: Even inline pragmas (e.g. IMustBeINLINEd) are ignored here Why? +Because we don't even want to inline them into the RHS of constructor +arguments. See NOTE above + +NB: At one time even NOINLINE was ignored here: if the rhs is trivial +it's best to inline it anyway. We often get a=E; b=a from desugaring, +with both a and b marked NOINLINE. But that seems incompatible with +our new view that inlining is like a RULE, so I'm sticking to the 'active' +story for now. + +\begin{code} +postInlineUnconditionally :: SimplEnv -> TopLevelFlag -> OutId -> OccInfo -> OutExpr -> Unfolding -> Bool +postInlineUnconditionally env top_lvl bndr occ_info rhs unfolding + | not active = False + | isLoopBreaker occ_info = False + | isExportedId bndr = False + | exprIsTrivial rhs = True + | otherwise + = case occ_info of + OneOcc in_lam one_br int_cxt + -> (one_br || smallEnoughToInline unfolding) -- Small enough to dup + -- ToDo: consider discount on smallEnoughToInline if int_cxt is true + -- + -- NB: Do we want to inline arbitrarily big things becuase + -- one_br is True? that can lead to inline cascades. But + -- preInlineUnconditionlly has dealt with all the common cases + -- so perhaps it's worth the risk. Here's an example + -- let f = if b then Left (\x.BIG) else Right (\y.BIG) + -- in \y. ....f.... + -- We can't preInlineUnconditionally because that woud invalidate + -- the occ info for b. Yet f is used just once, and duplicating + -- the case work is fine (exprIsCheap). + + && ((isNotTopLevel top_lvl && not in_lam) || + -- But outside a lambda, we want to be reasonably aggressive + -- about inlining into multiple branches of case + -- e.g. let x = <non-value> + -- in case y of { C1 -> ..x..; C2 -> ..x..; C3 -> ... } + -- Inlining can be a big win if C3 is the hot-spot, even if + -- the uses in C1, C2 are not 'interesting' + -- An example that gets worse if you add int_cxt here is 'clausify' + + (isCheapUnfolding unfolding && int_cxt)) + -- isCheap => acceptable work duplication; in_lam may be true + -- int_cxt to prevent us inlining inside a lambda without some + -- good reason. See the notes on int_cxt in preInlineUnconditionally + + other -> False + -- The point here is that for *non-values* that occur + -- outside a lambda, the call-site inliner won't have + -- a chance (becuase it doesn't know that the thing + -- only occurs once). The pre-inliner won't have gotten + -- it either, if the thing occurs in more than one branch + -- So the main target is things like + -- let x = f y in + -- case v of + -- True -> case x of ... + -- False -> case x of ... 
+ -- I'm not sure how important this is in practice + where + active = case getMode env of + SimplGently -> isAlwaysActive prag + SimplPhase n -> isActive n prag + prag = idInlinePragma bndr + +activeInline :: SimplEnv -> OutId -> OccInfo -> Bool +activeInline env id occ + = case getMode env of + SimplGently -> isOneOcc occ && isAlwaysActive prag + -- No inlining at all when doing gentle stuff, + -- except for local things that occur once + -- The reason is that too little clean-up happens if you + -- don't inline use-once things. Also a bit of inlining is *good* for + -- full laziness; it can expose constant sub-expressions. + -- Example in spectral/mandel/Mandel.hs, where the mandelset + -- function gets a useful let-float if you inline windowToViewport + + -- NB: we used to have a second exception, for data con wrappers. + -- On the grounds that we use gentle mode for rule LHSs, and + -- they match better when data con wrappers are inlined. + -- But that only really applies to the trivial wrappers (like (:)), + -- and they are now constructed as Compulsory unfoldings (in MkId) + -- so they'll happen anyway. + + SimplPhase n -> isActive n prag + where + prag = idInlinePragma id + +activeRule :: SimplEnv -> Maybe (Activation -> Bool) +-- Nothing => No rules at all +activeRule env + | opt_RulesOff = Nothing + | otherwise + = case getMode env of + SimplGently -> Just isAlwaysActive + -- Used to be Nothing (no rules in gentle mode) + -- Main motivation for changing is that I wanted + -- lift String ===> ... + -- to work in Template Haskell when simplifying + -- splices, so we get simpler code for literal strings + SimplPhase n -> Just (isActive n) +\end{code} + + +%************************************************************************ +%* * +\subsection{Rebuilding a lambda} +%* * +%************************************************************************ + +\begin{code} +mkLam :: SimplEnv -> [OutBinder] -> OutExpr -> SimplCont -> SimplM FloatsWithExpr +\end{code} + +Try three things + a) eta reduction, if that gives a trivial expression + b) eta expansion [only if there are some value lambdas] + c) floating lets out through big lambdas + [only if all tyvar lambdas, and only if this lambda + is the RHS of a let] + +\begin{code} +mkLam env bndrs body cont + = getDOptsSmpl `thenSmpl` \dflags -> + mkLam' dflags env bndrs body cont + where + mkLam' dflags env bndrs body cont + | dopt Opt_DoEtaReduction dflags, + Just etad_lam <- tryEtaReduce bndrs body + = tick (EtaReduction (head bndrs)) `thenSmpl_` + returnSmpl (emptyFloats env, etad_lam) + + | dopt Opt_DoLambdaEtaExpansion dflags, + any isRuntimeVar bndrs + = tryEtaExpansion body `thenSmpl` \ body' -> + returnSmpl (emptyFloats env, mkLams bndrs body') + +{- Sept 01: I'm experimenting with getting the + full laziness pass to float out past big lambdsa + | all isTyVar bndrs, -- Only for big lambdas + contIsRhs cont -- Only try the rhs type-lambda floating + -- if this is indeed a right-hand side; otherwise + -- we end up floating the thing out, only for float-in + -- to float it right back in again! 
+ = tryRhsTyLam env bndrs body `thenSmpl` \ (floats, body') -> + returnSmpl (floats, mkLams bndrs body') +-} + + | otherwise + = returnSmpl (emptyFloats env, mkLams bndrs body) +\end{code} + + +%************************************************************************ +%* * +\subsection{Eta expansion and reduction} +%* * +%************************************************************************ + +We try for eta reduction here, but *only* if we get all the +way to an exprIsTrivial expression. +We don't want to remove extra lambdas unless we are going +to avoid allocating this thing altogether + +\begin{code} +tryEtaReduce :: [OutBinder] -> OutExpr -> Maybe OutExpr +tryEtaReduce bndrs body + -- We don't use CoreUtils.etaReduce, because we can be more + -- efficient here: + -- (a) we already have the binders + -- (b) we can do the triviality test before computing the free vars + = go (reverse bndrs) body + where + go (b : bs) (App fun arg) | ok_arg b arg = go bs fun -- Loop round + go [] fun | ok_fun fun = Just fun -- Success! + go _ _ = Nothing -- Failure! + + ok_fun fun = exprIsTrivial fun + && not (any (`elemVarSet` (exprFreeVars fun)) bndrs) + && (exprIsHNF fun || all ok_lam bndrs) + ok_lam v = isTyVar v || isDictId v + -- The exprIsHNF is because eta reduction is not + -- valid in general: \x. bot /= bot + -- So we need to be sure that the "fun" is a value. + -- + -- However, we always want to reduce (/\a -> f a) to f + -- This came up in a RULE: foldr (build (/\a -> g a)) + -- did not match foldr (build (/\b -> ...something complex...)) + -- The type checker can insert these eta-expanded versions, + -- with both type and dictionary lambdas; hence the slightly + -- ad-hoc isDictTy + + ok_arg b arg = varToCoreExpr b `cheapEqExpr` arg +\end{code} + + + Try eta expansion for RHSs + +We go for: + f = \x1..xn -> N ==> f = \x1..xn y1..ym -> N y1..ym + (n >= 0) + +where (in both cases) + + * The xi can include type variables + + * The yi are all value variables + + * N is a NORMAL FORM (i.e. no redexes anywhere) + wanting a suitable number of extra args. + +We may have to sandwich some coerces between the lambdas +to make the types work. exprEtaExpandArity looks through coerces +when computing arity; and etaExpand adds the coerces as necessary when +actually computing the expansion. + +\begin{code} +tryEtaExpansion :: OutExpr -> SimplM OutExpr +-- There is at least one runtime binder in the binders +tryEtaExpansion body + = getUniquesSmpl `thenSmpl` \ us -> + returnSmpl (etaExpand fun_arity us body (exprType body)) + where + fun_arity = exprEtaExpandArity body +\end{code} + + +%************************************************************************ +%* * +\subsection{Floating lets out of big lambdas} +%* * +%************************************************************************ + +tryRhsTyLam tries this transformation, when the big lambda appears as +the RHS of a let(rec) binding: + + /\abc -> let(rec) x = e in b + ==> + let(rec) x' = /\abc -> let x = x' a b c in e + in + /\abc -> let x = x' a b c in b + +This is good because it can turn things like: + + let f = /\a -> letrec g = ... g ... in g +into + letrec g' = /\a -> ... g' a ... + in + let f = /\ a -> g' a + +which is better. In effect, it means that big lambdas don't impede +let-floating. + +This optimisation is CRUCIAL in eliminating the junk introduced by +desugaring mutually recursive definitions. Don't eliminate it lightly! 
+ +So far as the implementation is concerned: + + Invariant: go F e = /\tvs -> F e + + Equalities: + go F (Let x=e in b) + = Let x' = /\tvs -> F e + in + go G b + where + G = F . Let x = x' tvs + + go F (Letrec xi=ei in b) + = Letrec {xi' = /\tvs -> G ei} + in + go G b + where + G = F . Let {xi = xi' tvs} + +[May 1999] If we do this transformation *regardless* then we can +end up with some pretty silly stuff. For example, + + let + st = /\ s -> let { x1=r1 ; x2=r2 } in ... + in .. +becomes + let y1 = /\s -> r1 + y2 = /\s -> r2 + st = /\s -> ...[y1 s/x1, y2 s/x2] + in .. + +Unless the "..." is a WHNF there is really no point in doing this. +Indeed it can make things worse. Suppose x1 is used strictly, +and is of the form + + x1* = case f y of { (a,b) -> e } + +If we abstract this wrt the tyvar we then can't do the case inline +as we would normally do. + + +\begin{code} +{- Trying to do this in full laziness + +tryRhsTyLam :: SimplEnv -> [OutTyVar] -> OutExpr -> SimplM FloatsWithExpr +-- Call ensures that all the binders are type variables + +tryRhsTyLam env tyvars body -- Only does something if there's a let + | not (all isTyVar tyvars) + || not (worth_it body) -- inside a type lambda, + = returnSmpl (emptyFloats env, body) -- and a WHNF inside that + + | otherwise + = go env (\x -> x) body + + where + worth_it e@(Let _ _) = whnf_in_middle e + worth_it e = False + + whnf_in_middle (Let (NonRec x rhs) e) | isUnLiftedType (idType x) = False + whnf_in_middle (Let _ e) = whnf_in_middle e + whnf_in_middle e = exprIsCheap e + + main_tyvar_set = mkVarSet tyvars + + go env fn (Let bind@(NonRec var rhs) body) + | exprIsTrivial rhs + = go env (fn . Let bind) body + + go env fn (Let (NonRec var rhs) body) + = mk_poly tyvars_here var `thenSmpl` \ (var', rhs') -> + addAuxiliaryBind env (NonRec var' (mkLams tyvars_here (fn rhs))) $ \ env -> + go env (fn . Let (mk_silly_bind var rhs')) body + + where + + tyvars_here = varSetElems (main_tyvar_set `intersectVarSet` exprSomeFreeVars isTyVar rhs) + -- Abstract only over the type variables free in the rhs + -- wrt which the new binding is abstracted. But the naive + -- approach of abstract wrt the tyvars free in the Id's type + -- fails. Consider: + -- /\ a b -> let t :: (a,b) = (e1, e2) + -- x :: a = fst t + -- in ... + -- Here, b isn't free in x's type, but we must nevertheless + -- abstract wrt b as well, because t's type mentions b. + -- Since t is floated too, we'd end up with the bogus: + -- poly_t = /\ a b -> (e1, e2) + -- poly_x = /\ a -> fst (poly_t a *b*) + -- So for now we adopt the even more naive approach of + -- abstracting wrt *all* the tyvars. We'll see if that + -- gives rise to problems. 
SLPJ June 98 + + go env fn (Let (Rec prs) body) + = mapAndUnzipSmpl (mk_poly tyvars_here) vars `thenSmpl` \ (vars', rhss') -> + let + gn body = fn (foldr Let body (zipWith mk_silly_bind vars rhss')) + pairs = vars' `zip` [mkLams tyvars_here (gn rhs) | rhs <- rhss] + in + addAuxiliaryBind env (Rec pairs) $ \ env -> + go env gn body + where + (vars,rhss) = unzip prs + tyvars_here = varSetElems (main_tyvar_set `intersectVarSet` exprsSomeFreeVars isTyVar (map snd prs)) + -- See notes with tyvars_here above + + go env fn body = returnSmpl (emptyFloats env, fn body) + + mk_poly tyvars_here var + = getUniqueSmpl `thenSmpl` \ uniq -> + let + poly_name = setNameUnique (idName var) uniq -- Keep same name + poly_ty = mkForAllTys tyvars_here (idType var) -- But new type of course + poly_id = mkLocalId poly_name poly_ty + + -- In the olden days, it was crucial to copy the occInfo of the original var, + -- because we were looking at occurrence-analysed but as yet unsimplified code! + -- In particular, we mustn't lose the loop breakers. BUT NOW we are looking + -- at already simplified code, so it doesn't matter + -- + -- It's even right to retain single-occurrence or dead-var info: + -- Suppose we started with /\a -> let x = E in B + -- where x occurs once in B. Then we transform to: + -- let x' = /\a -> E in /\a -> let x* = x' a in B + -- where x* has an INLINE prag on it. Now, once x* is inlined, + -- the occurrences of x' will be just the occurrences originally + -- pinned on x. + in + returnSmpl (poly_id, mkTyApps (Var poly_id) (mkTyVarTys tyvars_here)) + + mk_silly_bind var rhs = NonRec var (Note InlineMe rhs) + -- Suppose we start with: + -- + -- x = /\ a -> let g = G in E + -- + -- Then we'll float to get + -- + -- x = let poly_g = /\ a -> G + -- in /\ a -> let g = poly_g a in E + -- + -- But now the occurrence analyser will see just one occurrence + -- of poly_g, not inside a lambda, so the simplifier will + -- PreInlineUnconditionally poly_g back into g! Badk to square 1! + -- (I used to think that the "don't inline lone occurrences" stuff + -- would stop this happening, but since it's the *only* occurrence, + -- PreInlineUnconditionally kicks in first!) + -- + -- Solution: put an INLINE note on g's RHS, so that poly_g seems + -- to appear many times. (NB: mkInlineMe eliminates + -- such notes on trivial RHSs, so do it manually.) +-} +\end{code} + +%************************************************************************ +%* * +\subsection{Case alternative filtering +%* * +%************************************************************************ + +prepareAlts does two things: + +1. Eliminate alternatives that cannot match, including the + DEFAULT alternative. + +2. If the DEFAULT alternative can match only one possible constructor, + then make that constructor explicit. + e.g. + case e of x { DEFAULT -> rhs } + ===> + case e of x { (a,b) -> rhs } + where the type is a single constructor type. This gives better code + when rhs also scrutinises x or e. + +It's a good idea do do this stuff before simplifying the alternatives, to +avoid simplifying alternatives we know can't happen, and to come up with +the list of constructors that are handled, to put into the IdInfo of the +case binder, for use when simplifying the alternatives. + +Eliminating the default alternative in (1) isn't so obvious, but it can +happen: + +data Colour = Red | Green | Blue + +f x = case x of + Red -> .. + Green -> .. + DEFAULT -> h x + +h y = case y of + Blue -> .. + DEFAULT -> [ case y of ... 
] + +If we inline h into f, the default case of the inlined h can't happen. +If we don't notice this, we may end up filtering out *all* the cases +of the inner case y, which give us nowhere to go! + + +\begin{code} +prepareAlts :: OutExpr -- Scrutinee + -> InId -- Case binder (passed only to use in statistics) + -> [InAlt] -- Increasing order + -> SimplM ([InAlt], -- Better alternatives, still incresaing order + [AltCon]) -- These cases are handled + +prepareAlts scrut case_bndr alts + = let + (alts_wo_default, maybe_deflt) = findDefault alts + + impossible_cons = case scrut of + Var v -> otherCons (idUnfolding v) + other -> [] + + -- Filter out alternatives that can't possibly match + better_alts | null impossible_cons = alts_wo_default + | otherwise = [alt | alt@(con,_,_) <- alts_wo_default, + not (con `elem` impossible_cons)] + + -- "handled_cons" are handled either by the context, + -- or by a branch in this case expression + -- (Don't add DEFAULT to the handled_cons!!) + handled_cons = impossible_cons ++ [con | (con,_,_) <- better_alts] + in + -- Filter out the default, if it can't happen, + -- or replace it with "proper" alternative if there + -- is only one constructor left + prepareDefault scrut case_bndr handled_cons maybe_deflt `thenSmpl` \ deflt_alt -> + + returnSmpl (mergeAlts better_alts deflt_alt, handled_cons) + -- We need the mergeAlts in case the new default_alt + -- has turned into a constructor alternative. + +prepareDefault scrut case_bndr handled_cons (Just rhs) + | Just (tycon, inst_tys) <- splitTyConApp_maybe (exprType scrut), + -- Use exprType scrut here, rather than idType case_bndr, because + -- case_bndr is an InId, so exprType scrut may have more information + -- Test simpl013 is an example + isAlgTyCon tycon, -- It's a data type, tuple, or unboxed tuples. + not (isNewTyCon tycon), -- We can have a newtype, if we are just doing an eval: + -- case x of { DEFAULT -> e } + -- and we don't want to fill in a default for them! + Just all_cons <- tyConDataCons_maybe tycon, + not (null all_cons), -- This is a tricky corner case. If the data type has no constructors, + -- which GHC allows, then the case expression will have at most a default + -- alternative. We don't want to eliminate that alternative, because the + -- invariant is that there's always one alternative. It's more convenient + -- to leave + -- case x of { DEFAULT -> e } + -- as it is, rather than transform it to + -- error "case cant match" + -- which would be quite legitmate. But it's a really obscure corner, and + -- not worth wasting code on. 
+ let handled_data_cons = [data_con | DataAlt data_con <- handled_cons], + let missing_cons = [con | con <- all_cons, + not (con `elem` handled_data_cons)] + = case missing_cons of + [] -> returnSmpl [] -- Eliminate the default alternative + -- if it can't match + + [con] -> -- It matches exactly one constructor, so fill it in + tick (FillInCaseDefault case_bndr) `thenSmpl_` + mk_args con inst_tys `thenSmpl` \ args -> + returnSmpl [(DataAlt con, args, rhs)] + + two_or_more -> returnSmpl [(DEFAULT, [], rhs)] + + | otherwise + = returnSmpl [(DEFAULT, [], rhs)] + +prepareDefault scrut case_bndr handled_cons Nothing + = returnSmpl [] + +mk_args missing_con inst_tys + = mk_tv_bndrs missing_con inst_tys `thenSmpl` \ (tv_bndrs, inst_tys') -> + getUniquesSmpl `thenSmpl` \ id_uniqs -> + let arg_tys = dataConInstArgTys missing_con inst_tys' + arg_ids = zipWith (mkSysLocal FSLIT("a")) id_uniqs arg_tys + in + returnSmpl (tv_bndrs ++ arg_ids) + +mk_tv_bndrs missing_con inst_tys + | isVanillaDataCon missing_con + = returnSmpl ([], inst_tys) + | otherwise + = getUniquesSmpl `thenSmpl` \ tv_uniqs -> + let new_tvs = zipWith mk tv_uniqs (dataConTyVars missing_con) + mk uniq tv = mkTyVar (mkSysTvName uniq FSLIT("t")) (tyVarKind tv) + in + returnSmpl (new_tvs, mkTyVarTys new_tvs) +\end{code} + + +%************************************************************************ +%* * +\subsection{Case absorption and identity-case elimination} +%* * +%************************************************************************ + +mkCase puts a case expression back together, trying various transformations first. + +\begin{code} +mkCase :: OutExpr -> OutId -> OutType + -> [OutAlt] -- Increasing order + -> SimplM OutExpr + +mkCase scrut case_bndr ty alts + = getDOptsSmpl `thenSmpl` \dflags -> + mkAlts dflags scrut case_bndr alts `thenSmpl` \ better_alts -> + mkCase1 scrut case_bndr ty better_alts +\end{code} + + +mkAlts tries these things: + +1. If several alternatives are identical, merge them into + a single DEFAULT alternative. I've occasionally seen this + making a big difference: + + case e of =====> case e of + C _ -> f x D v -> ....v.... + D v -> ....v.... DEFAULT -> f x + DEFAULT -> f x + + The point is that we merge common RHSs, at least for the DEFAULT case. + [One could do something more elaborate but I've never seen it needed.] + To avoid an expensive test, we just merge branches equal to the *first* + alternative; this picks up the common cases + a) all branches equal + b) some branches equal to the DEFAULT (which occurs first) + +2. Case merging: + case e of b { ==> case e of b { + p1 -> rhs1 p1 -> rhs1 + ... ... + pm -> rhsm pm -> rhsm + _ -> case b of b' { pn -> let b'=b in rhsn + pn -> rhsn ... + ... po -> let b'=b in rhso + po -> rhso _ -> let b'=b in rhsd + _ -> rhsd + } + + which merges two cases in one case when -- the default alternative of + the outer case scrutises the same variable as the outer case This + transformation is called Case Merging. It avoids that the same + variable is scrutinised multiple times. + + +The case where transformation (1) showed up was like this (lib/std/PrelCError.lhs): + + x | p `is` 1 -> e1 + | p `is` 2 -> e2 + ...etc... + +where @is@ was something like + + p `is` n = p /= (-1) && p == n + +This gave rise to a horrible sequence of cases + + case p of + (-1) -> $j p + 1 -> e1 + DEFAULT -> $j p + +and similarly in cascade for all the join points! + + + +\begin{code} +-------------------------------------------------- +-- 1. 
Merge identical branches +-------------------------------------------------- +mkAlts dflags scrut case_bndr alts@((con1,bndrs1,rhs1) : con_alts) + | all isDeadBinder bndrs1, -- Remember the default + length filtered_alts < length con_alts -- alternative comes first + = tick (AltMerge case_bndr) `thenSmpl_` + returnSmpl better_alts + where + filtered_alts = filter keep con_alts + keep (con,bndrs,rhs) = not (all isDeadBinder bndrs && rhs `cheapEqExpr` rhs1) + better_alts = (DEFAULT, [], rhs1) : filtered_alts + + +-------------------------------------------------- +-- 2. Merge nested cases +-------------------------------------------------- + +mkAlts dflags scrut outer_bndr outer_alts + | dopt Opt_CaseMerge dflags, + (outer_alts_without_deflt, maybe_outer_deflt) <- findDefault outer_alts, + Just (Case (Var scrut_var) inner_bndr _ inner_alts) <- maybe_outer_deflt, + scruting_same_var scrut_var + = let + munged_inner_alts = [(con, args, munge_rhs rhs) | (con, args, rhs) <- inner_alts] + munge_rhs rhs = bindCaseBndr inner_bndr (Var outer_bndr) rhs + + new_alts = mergeAlts outer_alts_without_deflt munged_inner_alts + -- The merge keeps the inner DEFAULT at the front, if there is one + -- and eliminates any inner_alts that are shadowed by the outer_alts + in + tick (CaseMerge outer_bndr) `thenSmpl_` + returnSmpl new_alts + -- Warning: don't call mkAlts recursively! + -- Firstly, there's no point, because inner alts have already had + -- mkCase applied to them, so they won't have a case in their default + -- Secondly, if you do, you get an infinite loop, because the bindCaseBndr + -- in munge_rhs may put a case into the DEFAULT branch! + where + -- We are scrutinising the same variable if it's + -- the outer case-binder, or if the outer case scrutinises a variable + -- (and it's the same). Testing both allows us not to replace the + -- outer scrut-var with the outer case-binder (Simplify.simplCaseBinder). + scruting_same_var = case scrut of + Var outer_scrut -> \ v -> v == outer_bndr || v == outer_scrut + other -> \ v -> v == outer_bndr + +------------------------------------------------ +-- Catch-all +------------------------------------------------ + +mkAlts dflags scrut case_bndr other_alts = returnSmpl other_alts + + +--------------------------------- +mergeAlts :: [OutAlt] -> [OutAlt] -> [OutAlt] +-- Merge preserving order; alternatives in the first arg +-- shadow ones in the second +mergeAlts [] as2 = as2 +mergeAlts as1 [] = as1 +mergeAlts (a1:as1) (a2:as2) + = case a1 `cmpAlt` a2 of + LT -> a1 : mergeAlts as1 (a2:as2) + EQ -> a1 : mergeAlts as1 as2 -- Discard a2 + GT -> a2 : mergeAlts (a1:as1) as2 +\end{code} + + + +================================================================================= + +mkCase1 tries these things + +1. Eliminate the case altogether if possible + +2. Case-identity: + + case e of ===> e + True -> True; + False -> False + + and similar friends. + + +Start with a simple situation: + + case x# of ===> e[x#/y#] + y# -> e + +(when x#, y# are of primitive type, of course). We can't (in general) +do this for algebraic cases, because we might turn bottom into +non-bottom! + +Actually, we generalise this idea to look for a case where we're +scrutinising a variable, and we know that only the default case can +match. For example: +\begin{verbatim} + case x of + 0# -> ... + other -> ...(case x of + 0# -> ... + other -> ...) ... +\end{code} +Here the inner case can be eliminated. This really only shows up in +eliminating error-checking code. 
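That inner-case elimination can be rendered as a tiny standalone function (invented names, not GHC's code): drop the alternatives the enclosing context has already ruled out, and if only the DEFAULT survives, the whole case collapses to its right-hand side.

\begin{code}
-- Illustrative only: prune impossible alternatives and collapse the case
-- when nothing but the DEFAULT remains.
data ToyAltCon = ToyLitAlt Int | ToyDEFAULT
  deriving (Eq, Show)

type ToyAlt rhs = (ToyAltCon, rhs)

elimInnerCase :: [ToyAltCon]      -- patterns the enclosing context rules out
              -> [ToyAlt rhs]     -- alternatives of the inner case
              -> Either rhs [ToyAlt rhs]
                 -- Left rhs   : case eliminated, use this right-hand side
                 -- Right alts : keep the case, with pruned alternatives
elimInnerCase impossible alts
  = case [ alt | alt@(con, _) <- alts, con `notElem` impossible ] of
      [(ToyDEFAULT, rhs)] -> Left rhs      -- only the default can match
      pruned              -> Right pruned
\end{code}

For instance, with 0# already matched by the outer case, elimInnerCase [ToyLitAlt 0] [(ToyLitAlt 0, "e1"), (ToyDEFAULT, "e2")] returns Left "e2", eliminating the inner case just as described above.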
+ +We also make sure that we deal with this very common case: + + case e of + x -> ...x... + +Here we are using the case as a strict let; if x is used only once +then we want to inline it. We have to be careful that this doesn't +make the program terminate when it would have diverged before, so we +check that + - x is used strictly, or + - e is already evaluated (it may so if e is a variable) + +Lastly, we generalise the transformation to handle this: + + case e of ===> r + True -> r + False -> r + +We only do this for very cheaply compared r's (constructors, literals +and variables). If pedantic bottoms is on, we only do it when the +scrutinee is a PrimOp which can't fail. + +We do it *here*, looking at un-simplified alternatives, because we +have to check that r doesn't mention the variables bound by the +pattern in each alternative, so the binder-info is rather useful. + +So the case-elimination algorithm is: + + 1. Eliminate alternatives which can't match + + 2. Check whether all the remaining alternatives + (a) do not mention in their rhs any of the variables bound in their pattern + and (b) have equal rhss + + 3. Check we can safely ditch the case: + * PedanticBottoms is off, + or * the scrutinee is an already-evaluated variable + or * the scrutinee is a primop which is ok for speculation + -- ie we want to preserve divide-by-zero errors, and + -- calls to error itself! + + or * [Prim cases] the scrutinee is a primitive variable + + or * [Alg cases] the scrutinee is a variable and + either * the rhs is the same variable + (eg case x of C a b -> x ===> x) + or * there is only one alternative, the default alternative, + and the binder is used strictly in its scope. + [NB this is helped by the "use default binder where + possible" transformation; see below.] + + +If so, then we can replace the case with one of the rhss. + +Further notes about case elimination +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Consider: test :: Integer -> IO () + test = print + +Turns out that this compiles to: + Print.test + = \ eta :: Integer + eta1 :: State# RealWorld -> + case PrelNum.< eta PrelNum.zeroInteger of wild { __DEFAULT -> + case hPutStr stdout + (PrelNum.jtos eta ($w[] @ Char)) + eta1 + of wild1 { (# new_s, a4 #) -> PrelIO.lvl23 new_s }} + +Notice the strange '<' which has no effect at all. This is a funny one. +It started like this: + +f x y = if x < 0 then jtos x + else if y==0 then "" else jtos x + +At a particular call site we have (f v 1). So we inline to get + + if v < 0 then jtos x + else if 1==0 then "" else jtos x + +Now simplify the 1==0 conditional: + + if v<0 then jtos v else jtos v + +Now common-up the two branches of the case: + + case (v<0) of DEFAULT -> jtos v + +Why don't we drop the case? Because it's strict in v. It's technically +wrong to drop even unnecessary evaluations, and in practice they +may be a result of 'seq' so we *definitely* don't want to drop those. +I don't really know how to improve this situation. + + +\begin{code} +-------------------------------------------------- +-- 0. Check for empty alternatives +-------------------------------------------------- + +-- This isn't strictly an error. It's possible that the simplifer might "see" +-- that an inner case has no accessible alternatives before it "sees" that the +-- entire branch of an outer case is inaccessible. 
So we simply +-- put an error case here insteadd +mkCase1 scrut case_bndr ty [] + = pprTrace "mkCase1: null alts" (ppr case_bndr <+> ppr scrut) $ + return (mkApps (Var eRROR_ID) + [Type ty, Lit (mkStringLit "Impossible alternative")]) + +-------------------------------------------------- +-- 1. Eliminate the case altogether if poss +-------------------------------------------------- + +mkCase1 scrut case_bndr ty [(con,bndrs,rhs)] + -- See if we can get rid of the case altogether + -- See the extensive notes on case-elimination above + -- mkCase made sure that if all the alternatives are equal, + -- then there is now only one (DEFAULT) rhs + | all isDeadBinder bndrs, + + -- Check that the scrutinee can be let-bound instead of case-bound + exprOkForSpeculation scrut + -- OK not to evaluate it + -- This includes things like (==# a# b#)::Bool + -- so that we simplify + -- case ==# a# b# of { True -> x; False -> x } + -- to just + -- x + -- This particular example shows up in default methods for + -- comparision operations (e.g. in (>=) for Int.Int32) + || exprIsHNF scrut -- It's already evaluated + || var_demanded_later scrut -- It'll be demanded later + +-- || not opt_SimplPedanticBottoms) -- Or we don't care! +-- We used to allow improving termination by discarding cases, unless -fpedantic-bottoms was on, +-- but that breaks badly for the dataToTag# primop, which relies on a case to evaluate +-- its argument: case x of { y -> dataToTag# y } +-- Here we must *not* discard the case, because dataToTag# just fetches the tag from +-- the info pointer. So we'll be pedantic all the time, and see if that gives any +-- other problems +-- Also we don't want to discard 'seq's + = tick (CaseElim case_bndr) `thenSmpl_` + returnSmpl (bindCaseBndr case_bndr scrut rhs) + + where + -- The case binder is going to be evaluated later, + -- and the scrutinee is a simple variable + var_demanded_later (Var v) = isStrictDmd (idNewDemandInfo case_bndr) + var_demanded_later other = False + + +-------------------------------------------------- +-- 2. Identity case +-------------------------------------------------- + +mkCase1 scrut case_bndr ty alts -- Identity case + | all identity_alt alts + = tick (CaseIdentity case_bndr) `thenSmpl_` + returnSmpl (re_note scrut) + where + identity_alt (con, args, rhs) = de_note rhs `cheapEqExpr` identity_rhs con args + + identity_rhs (DataAlt con) args = mkConApp con (arg_tys ++ map varToCoreExpr args) + identity_rhs (LitAlt lit) _ = Lit lit + identity_rhs DEFAULT _ = Var case_bndr + + arg_tys = map Type (tyConAppArgs (idType case_bndr)) + + -- We've seen this: + -- case coerce T e of x { _ -> coerce T' x } + -- And we definitely want to eliminate this case! + -- So we throw away notes from the RHS, and reconstruct + -- (at least an approximation) at the other end + de_note (Note _ e) = de_note e + de_note e = e + + -- re_note wraps a coerce if it might be necessary + re_note scrut = case head alts of + (_,_,rhs1@(Note _ _)) -> mkCoerce2 (exprType rhs1) (idType case_bndr) scrut + other -> scrut + + +-------------------------------------------------- +-- Catch-all +-------------------------------------------------- +mkCase1 scrut bndr ty alts = returnSmpl (Case scrut bndr ty alts) +\end{code} + + +When adding auxiliary bindings for the case binder, it's worth checking if +its dead, because it often is, and occasionally these mkCase transformations +cascade rather nicely. 
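The dead-binder check amounts to asking whether the body mentions the case binder at all. Here is a self-contained sketch with invented names (MiniExpr, miniBindCaseBndr), using a free-variable computation in place of GHC's occurrence information.

\begin{code}
-- Illustrative only: introduce the let for the case binder only when the
-- body actually mentions it.
data MiniExpr
  = MiniVar String
  | MiniLet String MiniExpr MiniExpr   -- let x = rhs in body
  deriving Show

miniFreeVars :: MiniExpr -> [String]
miniFreeVars (MiniVar v)          = [v]
miniFreeVars (MiniLet b rhs body) = miniFreeVars rhs
                                    ++ filter (/= b) (miniFreeVars body)

miniBindCaseBndr :: String -> MiniExpr -> MiniExpr -> MiniExpr
miniBindCaseBndr bndr rhs body
  | bndr `elem` miniFreeVars body = MiniLet bndr rhs body
  | otherwise                     = body   -- binder is dead: no binding needed
\end{code}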
+ +\begin{code} +bindCaseBndr bndr rhs body + | isDeadBinder bndr = body + | otherwise = bindNonRec bndr rhs body +\end{code} diff --git a/compiler/simplCore/Simplify.lhs b/compiler/simplCore/Simplify.lhs new file mode 100644 index 0000000000..5ea0a91007 --- /dev/null +++ b/compiler/simplCore/Simplify.lhs @@ -0,0 +1,1894 @@ +% +% (c) The AQUA Project, Glasgow University, 1993-1998 +% +\section[Simplify]{The main module of the simplifier} + +\begin{code} +module Simplify ( simplTopBinds, simplExpr ) where + +#include "HsVersions.h" + +import DynFlags ( dopt, DynFlag(Opt_D_dump_inlinings), + SimplifierSwitch(..) + ) +import SimplMonad +import SimplEnv +import SimplUtils ( mkCase, mkLam, prepareAlts, + SimplCont(..), DupFlag(..), LetRhsFlag(..), + mkRhsStop, mkBoringStop, pushContArgs, + contResultType, countArgs, contIsDupable, contIsRhsOrArg, + getContArgs, interestingCallContext, interestingArg, isStrictType, + preInlineUnconditionally, postInlineUnconditionally, + inlineMode, activeInline, activeRule + ) +import Id ( Id, idType, idInfo, idArity, isDataConWorkId, + setIdUnfolding, isDeadBinder, + idNewDemandInfo, setIdInfo, + setIdOccInfo, zapLamIdInfo, setOneShotLambda + ) +import MkId ( eRROR_ID ) +import Literal ( mkStringLit ) +import IdInfo ( OccInfo(..), isLoopBreaker, + setArityInfo, zapDemandInfo, + setUnfoldingInfo, + occInfo + ) +import NewDemand ( isStrictDmd ) +import Unify ( coreRefineTys ) +import DataCon ( dataConTyCon, dataConRepStrictness, isVanillaDataCon ) +import TyCon ( tyConArity ) +import CoreSyn +import PprCore ( pprParendExpr, pprCoreExpr ) +import CoreUnfold ( mkUnfolding, callSiteInline ) +import CoreUtils ( exprIsDupable, exprIsTrivial, needsCaseBinding, + exprIsConApp_maybe, mkPiTypes, findAlt, + exprType, exprIsHNF, + exprOkForSpeculation, exprArity, + mkCoerce, mkCoerce2, mkSCC, mkInlineMe, applyTypeToArg + ) +import Rules ( lookupRule ) +import BasicTypes ( isMarkedStrict ) +import CostCentre ( currentCCS ) +import Type ( TvSubstEnv, isUnLiftedType, seqType, tyConAppArgs, funArgTy, + splitFunTy_maybe, splitFunTy, coreEqType + ) +import VarEnv ( elemVarEnv, emptyVarEnv ) +import TysPrim ( realWorldStatePrimTy ) +import PrelInfo ( realWorldPrimId ) +import BasicTypes ( TopLevelFlag(..), isTopLevel, + RecFlag(..), isNonRec + ) +import StaticFlags ( opt_PprStyle_Debug ) +import OrdList +import Maybes ( orElse ) +import Outputable +import Util ( notNull ) +\end{code} + + +The guts of the simplifier is in this module, but the driver loop for +the simplifier is in SimplCore.lhs. + + +----------------------------------------- + *** IMPORTANT NOTE *** +----------------------------------------- +The simplifier used to guarantee that the output had no shadowing, but +it does not do so any more. (Actually, it never did!) The reason is +documented with simplifyArgs. + + +----------------------------------------- + *** IMPORTANT NOTE *** +----------------------------------------- +Many parts of the simplifier return a bunch of "floats" as well as an +expression. This is wrapped as a datatype SimplUtils.FloatsWith. + +All "floats" are let-binds, not case-binds, but some non-rec lets may +be unlifted (with RHS ok-for-speculation). 
+ + + +----------------------------------------- + ORGANISATION OF FUNCTIONS +----------------------------------------- +simplTopBinds + - simplify all top-level binders + - for NonRec, call simplRecOrTopPair + - for Rec, call simplRecBind + + + ------------------------------ +simplExpr (applied lambda) ==> simplNonRecBind +simplExpr (Let (NonRec ...) ..) ==> simplNonRecBind +simplExpr (Let (Rec ...) ..) ==> simplify binders; simplRecBind + + ------------------------------ +simplRecBind [binders already simplfied] + - use simplRecOrTopPair on each pair in turn + +simplRecOrTopPair [binder already simplified] + Used for: recursive bindings (top level and nested) + top-level non-recursive bindings + Returns: + - check for PreInlineUnconditionally + - simplLazyBind + +simplNonRecBind + Used for: non-top-level non-recursive bindings + beta reductions (which amount to the same thing) + Because it can deal with strict arts, it takes a + "thing-inside" and returns an expression + + - check for PreInlineUnconditionally + - simplify binder, including its IdInfo + - if strict binding + simplStrictArg + mkAtomicArgs + completeNonRecX + else + simplLazyBind + addFloats + +simplNonRecX: [given a *simplified* RHS, but an *unsimplified* binder] + Used for: binding case-binder and constr args in a known-constructor case + - check for PreInLineUnconditionally + - simplify binder + - completeNonRecX + + ------------------------------ +simplLazyBind: [binder already simplified, RHS not] + Used for: recursive bindings (top level and nested) + top-level non-recursive bindings + non-top-level, but *lazy* non-recursive bindings + [must not be strict or unboxed] + Returns floats + an augmented environment, not an expression + - substituteIdInfo and add result to in-scope + [so that rules are available in rec rhs] + - simplify rhs + - mkAtomicArgs + - float if exposes constructor or PAP + - completeLazyBind + + +completeNonRecX: [binder and rhs both simplified] + - if the the thing needs case binding (unlifted and not ok-for-spec) + build a Case + else + completeLazyBind + addFloats + +completeLazyBind: [given a simplified RHS] + [used for both rec and non-rec bindings, top level and not] + - try PostInlineUnconditionally + - add unfolding [this is the only place we add an unfolding] + - add arity + + + +Right hand sides and arguments +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +In many ways we want to treat + (a) the right hand side of a let(rec), and + (b) a function argument +in the same way. But not always! In particular, we would +like to leave these arguments exactly as they are, so they +will match a RULE more easily. + + f (g x, h x) + g (+ x) + +It's harder to make the rule match if we ANF-ise the constructor, +or eta-expand the PAP: + + f (let { a = g x; b = h x } in (a,b)) + g (\y. + x y) + +On the other hand if we see the let-defns + + p = (g x, h x) + q = + x + +then we *do* want to ANF-ise and eta-expand, so that p and q +can be safely inlined. + +Even floating lets out is a bit dubious. For let RHS's we float lets +out if that exposes a value, so that the value can be inlined more vigorously. +For example + + r = let x = e in (x,x) + +Here, if we float the let out we'll expose a nice constructor. We did experiments +that showed this to be a generally good thing. But it was a bad thing to float +lets out unconditionally, because that meant they got allocated more often. + +For function arguments, there's less reason to expose a constructor (it won't +get inlined). 
Just possibly it might make a rule match, but I'm pretty skeptical. +So for the moment we don't float lets out of function arguments either. + + +Eta expansion +~~~~~~~~~~~~~~ +For eta expansion, we want to catch things like + + case e of (a,b) -> \x -> case a of (p,q) -> \y -> r + +If the \x was on the RHS of a let, we'd eta expand to bring the two +lambdas together. And in general that's a good thing to do. Perhaps +we should eta expand wherever we find a (value) lambda? Then the eta +expansion at a let RHS can concentrate solely on the PAP case. + + +%************************************************************************ +%* * +\subsection{Bindings} +%* * +%************************************************************************ + +\begin{code} +simplTopBinds :: SimplEnv -> [InBind] -> SimplM [OutBind] + +simplTopBinds env binds + = -- Put all the top-level binders into scope at the start + -- so that if a transformation rule has unexpectedly brought + -- anything into scope, then we don't get a complaint about that. + -- It's rather as if the top-level binders were imported. + simplRecBndrs env (bindersOfBinds binds) `thenSmpl` \ (env, bndrs') -> + simpl_binds env binds bndrs' `thenSmpl` \ (floats, _) -> + freeTick SimplifierDone `thenSmpl_` + returnSmpl (floatBinds floats) + where + -- We need to track the zapped top-level binders, because + -- they should have their fragile IdInfo zapped (notably occurrence info) + -- That's why we run down binds and bndrs' simultaneously. + simpl_binds :: SimplEnv -> [InBind] -> [OutId] -> SimplM (FloatsWith ()) + simpl_binds env [] bs = ASSERT( null bs ) returnSmpl (emptyFloats env, ()) + simpl_binds env (bind:binds) bs = simpl_bind env bind bs `thenSmpl` \ (floats,env) -> + addFloats env floats $ \env -> + simpl_binds env binds (drop_bs bind bs) + + drop_bs (NonRec _ _) (_ : bs) = bs + drop_bs (Rec prs) bs = drop (length prs) bs + + simpl_bind env bind bs + = getDOptsSmpl `thenSmpl` \ dflags -> + if dopt Opt_D_dump_inlinings dflags then + pprTrace "SimplBind" (ppr (bindersOf bind)) $ simpl_bind1 env bind bs + else + simpl_bind1 env bind bs + + simpl_bind1 env (NonRec b r) (b':_) = simplRecOrTopPair env TopLevel b b' r + simpl_bind1 env (Rec pairs) bs' = simplRecBind env TopLevel pairs bs' +\end{code} + + +%************************************************************************ +%* * +\subsection{simplNonRec} +%* * +%************************************************************************ + +simplNonRecBind is used for + * non-top-level non-recursive lets in expressions + * beta reduction + +It takes + * An unsimplified (binder, rhs) pair + * The env for the RHS. It may not be the same as the + current env because the bind might occur via (\x.E) arg + +It uses the CPS form because the binding might be strict, in which +case we might discard the continuation: + let x* = error "foo" in (...x...) + +It needs to turn unlifted bindings into a @case@. 
They can arise +from, say: (\x -> e) (4# + 3#) + +\begin{code} +simplNonRecBind :: SimplEnv + -> InId -- Binder + -> InExpr -> SimplEnv -- Arg, with its subst-env + -> OutType -- Type of thing computed by the context + -> (SimplEnv -> SimplM FloatsWithExpr) -- The body + -> SimplM FloatsWithExpr +#ifdef DEBUG +simplNonRecBind env bndr rhs rhs_se cont_ty thing_inside + | isTyVar bndr + = pprPanic "simplNonRecBind" (ppr bndr <+> ppr rhs) +#endif + +simplNonRecBind env bndr rhs rhs_se cont_ty thing_inside + = simplNonRecBind' env bndr rhs rhs_se cont_ty thing_inside + +simplNonRecBind' env bndr rhs rhs_se cont_ty thing_inside + | preInlineUnconditionally env NotTopLevel bndr rhs + = tick (PreInlineUnconditionally bndr) `thenSmpl_` + thing_inside (extendIdSubst env bndr (mkContEx rhs_se rhs)) + + | isStrictDmd (idNewDemandInfo bndr) || isStrictType bndr_ty -- A strict let + = -- Don't use simplBinder because that doesn't keep + -- fragile occurrence info in the substitution + simplNonRecBndr env bndr `thenSmpl` \ (env, bndr1) -> + simplStrictArg AnRhs env rhs rhs_se (idType bndr1) cont_ty $ \ env1 rhs1 -> + + -- Now complete the binding and simplify the body + let + (env2,bndr2) = addLetIdInfo env1 bndr bndr1 + in + if needsCaseBinding bndr_ty rhs1 + then + thing_inside env2 `thenSmpl` \ (floats, body) -> + returnSmpl (emptyFloats env2, Case rhs1 bndr2 (exprType body) + [(DEFAULT, [], wrapFloats floats body)]) + else + completeNonRecX env2 True {- strict -} bndr bndr2 rhs1 thing_inside + + | otherwise -- Normal, lazy case + = -- Don't use simplBinder because that doesn't keep + -- fragile occurrence info in the substitution + simplNonRecBndr env bndr `thenSmpl` \ (env, bndr') -> + simplLazyBind env NotTopLevel NonRecursive + bndr bndr' rhs rhs_se `thenSmpl` \ (floats, env) -> + addFloats env floats thing_inside + + where + bndr_ty = idType bndr +\end{code} + +A specialised variant of simplNonRec used when the RHS is already simplified, notably +in knownCon. It uses case-binding where necessary. + +\begin{code} +simplNonRecX :: SimplEnv + -> InId -- Old binder + -> OutExpr -- Simplified RHS + -> (SimplEnv -> SimplM FloatsWithExpr) + -> SimplM FloatsWithExpr + +simplNonRecX env bndr new_rhs thing_inside + | needsCaseBinding (idType bndr) new_rhs + -- Make this test *before* the preInlineUnconditionally + -- Consider case I# (quotInt# x y) of + -- I# v -> let w = J# v in ... + -- If we gaily inline (quotInt# x y) for v, we end up building an + -- extra thunk: + -- let w = J# (quotInt# x y) in ... + -- because quotInt# can fail. + = simplBinder env bndr `thenSmpl` \ (env, bndr') -> + thing_inside env `thenSmpl` \ (floats, body) -> + let body' = wrapFloats floats body in + returnSmpl (emptyFloats env, Case new_rhs bndr' (exprType body') [(DEFAULT, [], body')]) + + | preInlineUnconditionally env NotTopLevel bndr new_rhs + -- This happens; for example, the case_bndr during case of + -- known constructor: case (a,b) of x { (p,q) -> ... } + -- Here x isn't mentioned in the RHS, so we don't want to + -- create the (dead) let-binding let x = (a,b) in ... + -- + -- Similarly, single occurrences can be inlined vigourously + -- e.g. case (f x, g y) of (a,b) -> .... + -- If a,b occur once we can avoid constructing the let binding for them. 
+ = thing_inside (extendIdSubst env bndr (DoneEx new_rhs)) + + | otherwise + = simplBinder env bndr `thenSmpl` \ (env, bndr') -> + completeNonRecX env False {- Non-strict; pessimistic -} + bndr bndr' new_rhs thing_inside + +completeNonRecX env is_strict old_bndr new_bndr new_rhs thing_inside + = mkAtomicArgs is_strict + True {- OK to float unlifted -} + new_rhs `thenSmpl` \ (aux_binds, rhs2) -> + + -- Make the arguments atomic if necessary, + -- adding suitable bindings + addAtomicBindsE env (fromOL aux_binds) $ \ env -> + completeLazyBind env NotTopLevel + old_bndr new_bndr rhs2 `thenSmpl` \ (floats, env) -> + addFloats env floats thing_inside +\end{code} + + +%************************************************************************ +%* * +\subsection{Lazy bindings} +%* * +%************************************************************************ + +simplRecBind is used for + * recursive bindings only + +\begin{code} +simplRecBind :: SimplEnv -> TopLevelFlag + -> [(InId, InExpr)] -> [OutId] + -> SimplM (FloatsWith SimplEnv) +simplRecBind env top_lvl pairs bndrs' + = go env pairs bndrs' `thenSmpl` \ (floats, env) -> + returnSmpl (flattenFloats floats, env) + where + go env [] _ = returnSmpl (emptyFloats env, env) + + go env ((bndr, rhs) : pairs) (bndr' : bndrs') + = simplRecOrTopPair env top_lvl bndr bndr' rhs `thenSmpl` \ (floats, env) -> + addFloats env floats (\env -> go env pairs bndrs') +\end{code} + + +simplRecOrTopPair is used for + * recursive bindings (whether top level or not) + * top-level non-recursive bindings + +It assumes the binder has already been simplified, but not its IdInfo. + +\begin{code} +simplRecOrTopPair :: SimplEnv + -> TopLevelFlag + -> InId -> OutId -- Binder, both pre-and post simpl + -> InExpr -- The RHS and its environment + -> SimplM (FloatsWith SimplEnv) + +simplRecOrTopPair env top_lvl bndr bndr' rhs + | preInlineUnconditionally env top_lvl bndr rhs -- Check for unconditional inline + = tick (PreInlineUnconditionally bndr) `thenSmpl_` + returnSmpl (emptyFloats env, extendIdSubst env bndr (mkContEx env rhs)) + + | otherwise + = simplLazyBind env top_lvl Recursive bndr bndr' rhs env + -- May not actually be recursive, but it doesn't matter +\end{code} + + +simplLazyBind is used for + * recursive bindings (whether top level or not) + * top-level non-recursive bindings + * non-top-level *lazy* non-recursive bindings + +[Thus it deals with the lazy cases from simplNonRecBind, and all cases +from SimplRecOrTopBind] + +Nota bene: + 1. It assumes that the binder is *already* simplified, + and is in scope, but not its IdInfo + + 2. It assumes that the binder type is lifted. + + 3. It does not check for pre-inline-unconditionallly; + that should have been done already. + +\begin{code} +simplLazyBind :: SimplEnv + -> TopLevelFlag -> RecFlag + -> InId -> OutId -- Binder, both pre-and post simpl + -> InExpr -> SimplEnv -- The RHS and its environment + -> SimplM (FloatsWith SimplEnv) + +simplLazyBind env top_lvl is_rec bndr bndr1 rhs rhs_se + = let + (env1,bndr2) = addLetIdInfo env bndr bndr1 + rhs_env = setInScope rhs_se env1 + is_top_level = isTopLevel top_lvl + ok_float_unlifted = not is_top_level && isNonRec is_rec + rhs_cont = mkRhsStop (idType bndr2) + in + -- Simplify the RHS; note the mkRhsStop, which tells + -- the simplifier that this is the RHS of a let. + simplExprF rhs_env rhs rhs_cont `thenSmpl` \ (floats, rhs1) -> + + -- If any of the floats can't be floated, give up now + -- (The allLifted predicate says True for empty floats.) 
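	-- ("Giving up" just means wrapping the floats back round rhs1 and
	--  completing the binding with that; all we lose is the chance to
	--  ANF-ise the RHS and expose a constructor or PAP.)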
+ if (not ok_float_unlifted && not (allLifted floats)) then + completeLazyBind env1 top_lvl bndr bndr2 + (wrapFloats floats rhs1) + else + + -- ANF-ise a constructor or PAP rhs + mkAtomicArgs False {- Not strict -} + ok_float_unlifted rhs1 `thenSmpl` \ (aux_binds, rhs2) -> + + -- If the result is a PAP, float the floats out, else wrap them + -- By this time it's already been ANF-ised (if necessary) + if isEmptyFloats floats && isNilOL aux_binds then -- Shortcut a common case + completeLazyBind env1 top_lvl bndr bndr2 rhs2 + + else if is_top_level || exprIsTrivial rhs2 || exprIsHNF rhs2 then + -- WARNING: long dodgy argument coming up + -- WANTED: a better way to do this + -- + -- We can't use "exprIsCheap" instead of exprIsHNF, + -- because that causes a strictness bug. + -- x = let y* = E in case (scc y) of { T -> F; F -> T} + -- The case expression is 'cheap', but it's wrong to transform to + -- y* = E; x = case (scc y) of {...} + -- Either we must be careful not to float demanded non-values, or + -- we must use exprIsHNF for the test, which ensures that the + -- thing is non-strict. So exprIsHNF => bindings are non-strict + -- I think. The WARN below tests for this. + -- + -- We use exprIsTrivial here because we want to reveal lone variables. + -- E.g. let { x = letrec { y = E } in y } in ... + -- Here we definitely want to float the y=E defn. + -- exprIsHNF definitely isn't right for that. + -- + -- Again, the floated binding can't be strict; if it's recursive it'll + -- be non-strict; if it's non-recursive it'd be inlined. + -- + -- Note [SCC-and-exprIsTrivial] + -- If we have + -- y = let { x* = E } in scc "foo" x + -- then we do *not* want to float out the x binding, because + -- it's strict! Fortunately, exprIsTrivial replies False to + -- (scc "foo" x). + + -- There's a subtlety here. There may be a binding (x* = e) in the + -- floats, where the '*' means 'will be demanded'. So is it safe + -- to float it out? Answer no, but it won't matter because + -- we only float if (a) arg' is a WHNF, or (b) it's going to top level + -- and so there can't be any 'will be demanded' bindings in the floats. + -- Hence the warning + ASSERT2( is_top_level || not (any demanded_float (floatBinds floats)), + ppr (filter demanded_float (floatBinds floats)) ) + + tick LetFloatFromLet `thenSmpl_` ( + addFloats env1 floats $ \ env2 -> + addAtomicBinds env2 (fromOL aux_binds) $ \ env3 -> + completeLazyBind env3 top_lvl bndr bndr2 rhs2) + + else + completeLazyBind env1 top_lvl bndr bndr2 (wrapFloats floats rhs1) + +#ifdef DEBUG +demanded_float (NonRec b r) = isStrictDmd (idNewDemandInfo b) && not (isUnLiftedType (idType b)) + -- Unlifted-type (cheap-eagerness) lets may well have a demanded flag on them +demanded_float (Rec _) = False +#endif +\end{code} + + +%************************************************************************ +%* * +\subsection{Completing a lazy binding} +%* * +%************************************************************************ + +completeLazyBind + * deals only with Ids, not TyVars + * takes an already-simplified binder and RHS + * is used for both recursive and non-recursive bindings + * is used for both top-level and non-top-level bindings + +It does the following: + - tries discarding a dead binding + - tries PostInlineUnconditionally + - add unfolding [this is the only place we add an unfolding] + - add arity + +It does *not* attempt to do let-to-case. Why? 
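
To make the two outcomes concrete, here is a worked instance of the code above
(an illustration only, not an additional transformation).  Given a nested
binding

	x = let y = e in Just y

the RHS simplifies to (Just y) together with the float y = e; since (Just y)
is a value, the float is taken out past x, giving

	y = e
	x = Just y

and x is now an obviously inlinable constructor binding.  Had the RHS been,
say, (let y = e in f y), with (f y) neither trivial nor a value and the
binding not at top level, the float would simply be wrapped back around the
RHS and the binding left alone.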
Because it is used for + - top-level bindings (when let-to-case is impossible) + - many situations where the "rhs" is known to be a WHNF + (so let-to-case is inappropriate). + +\begin{code} +completeLazyBind :: SimplEnv + -> TopLevelFlag -- Flag stuck into unfolding + -> InId -- Old binder + -> OutId -- New binder + -> OutExpr -- Simplified RHS + -> SimplM (FloatsWith SimplEnv) +-- We return a new SimplEnv, because completeLazyBind may choose to do its work +-- by extending the substitution (e.g. let x = y in ...) +-- The new binding (if any) is returned as part of the floats. +-- NB: the returned SimplEnv has the right SubstEnv, but you should +-- (as usual) use the in-scope-env from the floats + +completeLazyBind env top_lvl old_bndr new_bndr new_rhs + | postInlineUnconditionally env top_lvl new_bndr occ_info new_rhs unfolding + = -- Drop the binding + tick (PostInlineUnconditionally old_bndr) `thenSmpl_` + returnSmpl (emptyFloats env, extendIdSubst env old_bndr (DoneEx new_rhs)) + -- Use the substitution to make quite, quite sure that the substitution + -- will happen, since we are going to discard the binding + + | otherwise + = let + -- Add arity info + new_bndr_info = idInfo new_bndr `setArityInfo` exprArity new_rhs + + -- Add the unfolding *only* for non-loop-breakers + -- Making loop breakers not have an unfolding at all + -- means that we can avoid tests in exprIsConApp, for example. + -- This is important: if exprIsConApp says 'yes' for a recursive + -- thing, then we can get into an infinite loop + + -- If the unfolding is a value, the demand info may + -- go pear-shaped, so we nuke it. Example: + -- let x = (a,b) in + -- case x of (p,q) -> h p q x + -- Here x is certainly demanded. But after we've nuked + -- the case, we'll get just + -- let x = (a,b) in h a b x + -- and now x is not demanded (I'm assuming h is lazy) + -- This really happens. Similarly + -- let f = \x -> e in ...f..f... + -- After inling f at some of its call sites the original binding may + -- (for example) be no longer strictly demanded. + -- The solution here is a bit ad hoc... + info_w_unf = new_bndr_info `setUnfoldingInfo` unfolding + final_info | loop_breaker = new_bndr_info + | isEvaldUnfolding unfolding = zapDemandInfo info_w_unf `orElse` info_w_unf + | otherwise = info_w_unf + + final_id = new_bndr `setIdInfo` final_info + in + -- These seqs forces the Id, and hence its IdInfo, + -- and hence any inner substitutions + final_id `seq` + returnSmpl (unitFloat env final_id new_rhs, env) + + where + unfolding = mkUnfolding (isTopLevel top_lvl) new_rhs + loop_breaker = isLoopBreaker occ_info + old_info = idInfo old_bndr + occ_info = occInfo old_info +\end{code} + + + +%************************************************************************ +%* * +\subsection[Simplify-simplExpr]{The main function: simplExpr} +%* * +%************************************************************************ + +The reason for this OutExprStuff stuff is that we want to float *after* +simplifying a RHS, not before. If we do so naively we get quadratic +behaviour as things float out. + +To see why it's important to do it after, consider this (real) example: + + let t = f x + in fst t +==> + let t = let a = e1 + b = e2 + in (a,b) + in fst t +==> + let a = e1 + b = e2 + t = (a,b) + in + a -- Can't inline a this round, cos it appears twice +==> + e1 + +Each of the ==> steps is a round of simplification. We'd save a +whole round if we float first. This can cascade. Consider + + let f = g d + in \x -> ...f... 
+==> + let f = let d1 = ..d.. in \y -> e + in \x -> ...f... +==> + let d1 = ..d.. + in \x -> ...(\y ->e)... + +Only in this second round can the \y be applied, and it +might do the same again. + + +\begin{code} +simplExpr :: SimplEnv -> CoreExpr -> SimplM CoreExpr +simplExpr env expr = simplExprC env expr (mkBoringStop expr_ty') + where + expr_ty' = substTy env (exprType expr) + -- The type in the Stop continuation, expr_ty', is usually not used + -- It's only needed when discarding continuations after finding + -- a function that returns bottom. + -- Hence the lazy substitution + + +simplExprC :: SimplEnv -> CoreExpr -> SimplCont -> SimplM CoreExpr + -- Simplify an expression, given a continuation +simplExprC env expr cont + = simplExprF env expr cont `thenSmpl` \ (floats, expr) -> + returnSmpl (wrapFloats floats expr) + +simplExprF :: SimplEnv -> InExpr -> SimplCont -> SimplM FloatsWithExpr + -- Simplify an expression, returning floated binds + +simplExprF env (Var v) cont = simplVar env v cont +simplExprF env (Lit lit) cont = rebuild env (Lit lit) cont +simplExprF env expr@(Lam _ _) cont = simplLam env expr cont +simplExprF env (Note note expr) cont = simplNote env note expr cont +simplExprF env (App fun arg) cont = simplExprF env fun (ApplyTo NoDup arg env cont) + +simplExprF env (Type ty) cont + = ASSERT( contIsRhsOrArg cont ) + simplType env ty `thenSmpl` \ ty' -> + rebuild env (Type ty') cont + +simplExprF env (Case scrut bndr case_ty alts) cont + | not (switchIsOn (getSwitchChecker env) NoCaseOfCase) + = -- Simplify the scrutinee with a Select continuation + simplExprF env scrut (Select NoDup bndr alts env cont) + + | otherwise + = -- If case-of-case is off, simply simplify the case expression + -- in a vanilla Stop context, and rebuild the result around it + simplExprC env scrut case_cont `thenSmpl` \ case_expr' -> + rebuild env case_expr' cont + where + case_cont = Select NoDup bndr alts env (mkBoringStop case_ty') + case_ty' = substTy env case_ty -- c.f. 
defn of simplExpr + +simplExprF env (Let (Rec pairs) body) cont + = simplRecBndrs env (map fst pairs) `thenSmpl` \ (env, bndrs') -> + -- NB: bndrs' don't have unfoldings or rules + -- We add them as we go down + + simplRecBind env NotTopLevel pairs bndrs' `thenSmpl` \ (floats, env) -> + addFloats env floats $ \ env -> + simplExprF env body cont + +-- A non-recursive let is dealt with by simplNonRecBind +simplExprF env (Let (NonRec bndr rhs) body) cont + = simplNonRecBind env bndr rhs env (contResultType cont) $ \ env -> + simplExprF env body cont + + +--------------------------------- +simplType :: SimplEnv -> InType -> SimplM OutType + -- Kept monadic just so we can do the seqType +simplType env ty + = seqType new_ty `seq` returnSmpl new_ty + where + new_ty = substTy env ty +\end{code} + + +%************************************************************************ +%* * +\subsection{Lambdas} +%* * +%************************************************************************ + +\begin{code} +simplLam env fun cont + = go env fun cont + where + zap_it = mkLamBndrZapper fun (countArgs cont) + cont_ty = contResultType cont + + -- Type-beta reduction + go env (Lam bndr body) (ApplyTo _ (Type ty_arg) arg_se body_cont) + = ASSERT( isTyVar bndr ) + tick (BetaReduction bndr) `thenSmpl_` + simplType (setInScope arg_se env) ty_arg `thenSmpl` \ ty_arg' -> + go (extendTvSubst env bndr ty_arg') body body_cont + + -- Ordinary beta reduction + go env (Lam bndr body) cont@(ApplyTo _ arg arg_se body_cont) + = tick (BetaReduction bndr) `thenSmpl_` + simplNonRecBind env (zap_it bndr) arg arg_se cont_ty $ \ env -> + go env body body_cont + + -- Not enough args, so there are real lambdas left to put in the result + go env lam@(Lam _ _) cont + = simplLamBndrs env bndrs `thenSmpl` \ (env, bndrs') -> + simplExpr env body `thenSmpl` \ body' -> + mkLam env bndrs' body' cont `thenSmpl` \ (floats, new_lam) -> + addFloats env floats $ \ env -> + rebuild env new_lam cont + where + (bndrs,body) = collectBinders lam + + -- Exactly enough args + go env expr cont = simplExprF env expr cont + +mkLamBndrZapper :: CoreExpr -- Function + -> Int -- Number of args supplied, *including* type args + -> Id -> Id -- Use this to zap the binders +mkLamBndrZapper fun n_args + | n_args >= n_params fun = \b -> b -- Enough args + | otherwise = \b -> zapLamIdInfo b + where + -- NB: we count all the args incl type args + -- so we must count all the binders (incl type lambdas) + n_params (Note _ e) = n_params e + n_params (Lam b e) = 1 + n_params e + n_params other = 0::Int +\end{code} + + +%************************************************************************ +%* * +\subsection{Notes} +%* * +%************************************************************************ + +\begin{code} +simplNote env (Coerce to from) body cont + = let + addCoerce s1 k1 cont -- Drop redundant coerces. This can happen if a polymoprhic + -- (coerce a b e) is instantiated with a=ty1 b=ty2 and the + -- two are the same. 
This happens a lot in Happy-generated parsers + | s1 `coreEqType` k1 = cont + + addCoerce s1 k1 (CoerceIt t1 cont) + -- coerce T1 S1 (coerce S1 K1 e) + -- ==> + -- e, if T1=K1 + -- coerce T1 K1 e, otherwise + -- + -- For example, in the initial form of a worker + -- we may find (coerce T (coerce S (\x.e))) y + -- and we'd like it to simplify to e[y/x] in one round + -- of simplification + | t1 `coreEqType` k1 = cont -- The coerces cancel out + | otherwise = CoerceIt t1 cont -- They don't cancel, but + -- the inner one is redundant + + addCoerce t1t2 s1s2 (ApplyTo dup arg arg_se cont) + | not (isTypeArg arg), -- This whole case only works for value args + -- Could upgrade to have equiv thing for type apps too + Just (s1, s2) <- splitFunTy_maybe s1s2 + -- (coerce (T1->T2) (S1->S2) F) E + -- ===> + -- coerce T2 S2 (F (coerce S1 T1 E)) + -- + -- t1t2 must be a function type, T1->T2, because it's applied to something + -- but s1s2 might conceivably not be + -- + -- When we build the ApplyTo we can't mix the out-types + -- with the InExpr in the argument, so we simply substitute + -- to make it all consistent. It's a bit messy. + -- But it isn't a common case. + = let + (t1,t2) = splitFunTy t1t2 + new_arg = mkCoerce2 s1 t1 (substExpr arg_env arg) + arg_env = setInScope arg_se env + in + ApplyTo dup new_arg (zapSubstEnv env) (addCoerce t2 s2 cont) + + addCoerce to' _ cont = CoerceIt to' cont + in + simplType env to `thenSmpl` \ to' -> + simplType env from `thenSmpl` \ from' -> + simplExprF env body (addCoerce to' from' cont) + + +-- Hack: we only distinguish subsumed cost centre stacks for the purposes of +-- inlining. All other CCCSs are mapped to currentCCS. +simplNote env (SCC cc) e cont + = simplExpr (setEnclosingCC env currentCCS) e `thenSmpl` \ e' -> + rebuild env (mkSCC cc e') cont + +simplNote env InlineCall e cont + = simplExprF env e (InlinePlease cont) + +-- See notes with SimplMonad.inlineMode +simplNote env InlineMe e cont + | contIsRhsOrArg cont -- Totally boring continuation; see notes above + = -- Don't inline inside an INLINE expression + simplExpr (setMode inlineMode env ) e `thenSmpl` \ e' -> + rebuild env (mkInlineMe e') cont + + | otherwise -- Dissolve the InlineMe note if there's + -- an interesting context of any kind to combine with + -- (even a type application -- anything except Stop) + = simplExprF env e cont + +simplNote env (CoreNote s) e cont + = simplExpr env e `thenSmpl` \ e' -> + rebuild env (Note (CoreNote s) e') cont +\end{code} + + +%************************************************************************ +%* * +\subsection{Dealing with calls} +%* * +%************************************************************************ + +\begin{code} +simplVar env var cont + = case substId env var of + DoneEx e -> simplExprF (zapSubstEnv env) e cont + ContEx tvs ids e -> simplExprF (setSubstEnv env tvs ids) e cont + DoneId var1 occ -> completeCall (zapSubstEnv env) var1 occ cont + -- Note [zapSubstEnv] + -- The template is already simplified, so don't re-substitute. + -- This is VITAL. Consider + -- let x = e in + -- let y = \z -> ...x... in + -- \ x -> ...y... + -- We'll clone the inner \x, adding x->x' in the id_subst + -- Then when we inline y, we must *not* replace x by x' in + -- the inlined copy!! 
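	-- (Gloss, for orientation: DoneEx means the substitution already
	-- carries a simplified replacement for the variable; ContEx is a
	-- suspended substitution, i.e. an unsimplified expression plus the
	-- type/id substitutions to simplify it under; DoneId means no
	-- substitution at all, just the possibly-renamed in-scope Id
	-- together with its occurrence info.)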
+ +--------------------------------------------------------- +-- Dealing with a call site + +completeCall env var occ_info cont + = -- Simplify the arguments + getDOptsSmpl `thenSmpl` \ dflags -> + let + chkr = getSwitchChecker env + (args, call_cont, inline_call) = getContArgs chkr var cont + fn_ty = idType var + in + simplifyArgs env fn_ty args (contResultType call_cont) $ \ env args -> + + -- Next, look for rules or specialisations that match + -- + -- It's important to simplify the args first, because the rule-matcher + -- doesn't do substitution as it goes. We don't want to use subst_args + -- (defined in the 'where') because that throws away useful occurrence info, + -- and perhaps-very-important specialisations. + -- + -- Some functions have specialisations *and* are strict; in this case, + -- we don't want to inline the wrapper of the non-specialised thing; better + -- to call the specialised thing instead. + -- We used to use the black-listing mechanism to ensure that inlining of + -- the wrapper didn't occur for things that have specialisations till a + -- later phase, so but now we just try RULES first + -- + -- You might think that we shouldn't apply rules for a loop breaker: + -- doing so might give rise to an infinite loop, because a RULE is + -- rather like an extra equation for the function: + -- RULE: f (g x) y = x+y + -- Eqn: f a y = a-y + -- + -- But it's too drastic to disable rules for loop breakers. + -- Even the foldr/build rule would be disabled, because foldr + -- is recursive, and hence a loop breaker: + -- foldr k z (build g) = g k z + -- So it's up to the programmer: rules can cause divergence + + let + in_scope = getInScope env + rules = getRules env + maybe_rule = case activeRule env of + Nothing -> Nothing -- No rules apply + Just act_fn -> lookupRule act_fn in_scope rules var args + in + case maybe_rule of { + Just (rule_name, rule_rhs) -> + tick (RuleFired rule_name) `thenSmpl_` + (if dopt Opt_D_dump_inlinings dflags then + pprTrace "Rule fired" (vcat [ + text "Rule:" <+> ftext rule_name, + text "Before:" <+> ppr var <+> sep (map pprParendExpr args), + text "After: " <+> pprCoreExpr rule_rhs, + text "Cont: " <+> ppr call_cont]) + else + id) $ + simplExprF env rule_rhs call_cont ; + + Nothing -> -- No rules + + -- Next, look for an inlining + let + arg_infos = [ interestingArg arg | arg <- args, isValArg arg] + + interesting_cont = interestingCallContext (notNull args) + (notNull arg_infos) + call_cont + + active_inline = activeInline env var occ_info + maybe_inline = callSiteInline dflags active_inline inline_call occ_info + var arg_infos interesting_cont + in + case maybe_inline of { + Just unfolding -- There is an inlining! + -> tick (UnfoldingDone var) `thenSmpl_` + (if dopt Opt_D_dump_inlinings dflags then + pprTrace "Inlining done" (vcat [ + text "Before:" <+> ppr var <+> sep (map pprParendExpr args), + text "Inlined fn: " <+> ppr unfolding, + text "Cont: " <+> ppr call_cont]) + else + id) $ + makeThatCall env var unfolding args call_cont + + ; + Nothing -> -- No inlining! 
+ + -- Done + rebuild env (mkApps (Var var) args) call_cont + }} + +makeThatCall :: SimplEnv + -> Id + -> InExpr -- Inlined function rhs + -> [OutExpr] -- Arguments, already simplified + -> SimplCont -- After the call + -> SimplM FloatsWithExpr +-- Similar to simplLam, but this time +-- the arguments are already simplified +makeThatCall orig_env var fun@(Lam _ _) args cont + = go orig_env fun args + where + zap_it = mkLamBndrZapper fun (length args) + + -- Type-beta reduction + go env (Lam bndr body) (Type ty_arg : args) + = ASSERT( isTyVar bndr ) + tick (BetaReduction bndr) `thenSmpl_` + go (extendTvSubst env bndr ty_arg) body args + + -- Ordinary beta reduction + go env (Lam bndr body) (arg : args) + = tick (BetaReduction bndr) `thenSmpl_` + simplNonRecX env (zap_it bndr) arg $ \ env -> + go env body args + + -- Not enough args, so there are real lambdas left to put in the result + go env fun args + = simplExprF env fun (pushContArgs orig_env args cont) + -- NB: orig_env; the correct environment to capture with + -- the arguments.... env has been augmented with substitutions + -- from the beta reductions. + +makeThatCall env var fun args cont + = simplExprF env fun (pushContArgs env args cont) +\end{code} + + +%************************************************************************ +%* * +\subsection{Arguments} +%* * +%************************************************************************ + +\begin{code} +--------------------------------------------------------- +-- Simplifying the arguments of a call + +simplifyArgs :: SimplEnv + -> OutType -- Type of the function + -> [(InExpr, SimplEnv, Bool)] -- Details of the arguments + -> OutType -- Type of the continuation + -> (SimplEnv -> [OutExpr] -> SimplM FloatsWithExpr) + -> SimplM FloatsWithExpr + +-- [CPS-like because of strict arguments] + +-- Simplify the arguments to a call. +-- This part of the simplifier may break the no-shadowing invariant +-- Consider +-- f (...(\a -> e)...) (case y of (a,b) -> e') +-- where f is strict in its second arg +-- If we simplify the innermost one first we get (...(\a -> e)...) +-- Simplifying the second arg makes us float the case out, so we end up with +-- case y of (a,b) -> f (...(\a -> e)...) e' +-- So the output does not have the no-shadowing invariant. However, there is +-- no danger of getting name-capture, because when the first arg was simplified +-- we used an in-scope set that at least mentioned all the variables free in its +-- static environment, and that is enough. +-- +-- We can't just do innermost first, or we'd end up with a dual problem: +-- case x of (a,b) -> f e (...(\a -> e')...) +-- +-- I spent hours trying to recover the no-shadowing invariant, but I just could +-- not think of an elegant way to do it. The simplifier is already knee-deep in +-- continuations. We have to keep the right in-scope set around; AND we have +-- to get the effect that finding (error "foo") in a strict arg position will +-- discard the entire application and replace it with (error "foo"). Getting +-- all this at once is TOO HARD! 
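-- (In each argument triple the Bool flags a strict argument position, as
-- computed by getContArgs at the call site: strict arguments go through
-- simplStrictArg below, while lazy ones are simplified with a boring Stop
-- continuation and never have floats taken out of them.)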
+ +simplifyArgs env fn_ty args cont_ty thing_inside + = go env fn_ty args thing_inside + where + go env fn_ty [] thing_inside = thing_inside env [] + go env fn_ty (arg:args) thing_inside = simplifyArg env fn_ty arg cont_ty $ \ env arg' -> + go env (applyTypeToArg fn_ty arg') args $ \ env args' -> + thing_inside env (arg':args') + +simplifyArg env fn_ty (Type ty_arg, se, _) cont_ty thing_inside + = simplType (setInScope se env) ty_arg `thenSmpl` \ new_ty_arg -> + thing_inside env (Type new_ty_arg) + +simplifyArg env fn_ty (val_arg, arg_se, is_strict) cont_ty thing_inside + | is_strict + = simplStrictArg AnArg env val_arg arg_se arg_ty cont_ty thing_inside + + | otherwise -- Lazy argument + -- DO NOT float anything outside, hence simplExprC + -- There is no benefit (unlike in a let-binding), and we'd + -- have to be very careful about bogus strictness through + -- floating a demanded let. + = simplExprC (setInScope arg_se env) val_arg + (mkBoringStop arg_ty) `thenSmpl` \ arg1 -> + thing_inside env arg1 + where + arg_ty = funArgTy fn_ty + + +simplStrictArg :: LetRhsFlag + -> SimplEnv -- The env of the call + -> InExpr -> SimplEnv -- The arg plus its env + -> OutType -- arg_ty: type of the argument + -> OutType -- cont_ty: Type of thing computed by the context + -> (SimplEnv -> OutExpr -> SimplM FloatsWithExpr) + -- Takes an expression of type rhs_ty, + -- returns an expression of type cont_ty + -- The env passed to this continuation is the + -- env of the call, plus any new in-scope variables + -> SimplM FloatsWithExpr -- An expression of type cont_ty + +simplStrictArg is_rhs call_env arg arg_env arg_ty cont_ty thing_inside + = simplExprF (setInScope arg_env call_env) arg + (ArgOf is_rhs arg_ty cont_ty (\ new_env -> thing_inside (setInScope call_env new_env))) + -- Notice the way we use arg_env (augmented with in-scope vars from call_env) + -- to simplify the argument + -- and call-env (augmented with in-scope vars from the arg) to pass to the continuation +\end{code} + + +%************************************************************************ +%* * +\subsection{mkAtomicArgs} +%* * +%************************************************************************ + +mkAtomicArgs takes a putative RHS, checks whether it's a PAP or +constructor application and, if so, converts it to ANF, so that the +resulting thing can be inlined more easily. Thus + x = (f a, g b) +becomes + t1 = f a + t2 = g b + x = (t1,t2) + +There are three sorts of binding context, specified by the two +boolean arguments + +Strict + OK-unlifted + +N N Top-level or recursive Only bind args of lifted type + +N Y Non-top-level and non-recursive, Bind args of lifted type, or + but lazy unlifted-and-ok-for-speculation + +Y Y Non-top-level, non-recursive, Bind all args + and strict (demanded) + + +For example, given + + x = MkC (y div# z) + +there is no point in transforming to + + x = case (y div# z) of r -> MkC r + +because the (y div# z) can't float out of the let. But if it was +a *strict* let, then it would be a good thing to do. Hence the +context information. 
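
Purely as a toy illustration of the ANF step (the real mkAtomicArgs just below
works over Core, also handles PAPs and unlifted arguments, and draws its fresh
names from SimplM; the Expr type and the names here are made up for the
sketch):

    import Control.Monad.State

    -- Hypothetical mini-language, only for this sketch.
    data Expr = V String | L Int | C String [Expr]
      deriving (Eq, Show)

    trivial :: Expr -> Bool
    trivial (V _) = True
    trivial (L _) = True
    trivial _     = False

    -- Give each non-trivial constructor argument a name of its own,
    -- returning the new let-bindings plus the atomised application.
    anf :: Expr -> State Int ([(String, Expr)], Expr)
    anf (C con args) = do
        pairs <- mapM anfArg args
        let (bss, args') = unzip pairs
        return (concat bss, C con args')
      where
        anfArg a
          | trivial a = return ([], a)
          | otherwise = do
              (bs, a') <- anf a          -- recurse, cf.  x = a:b:c:[]
              n <- get
              put (n + 1)
              let v = 't' : show n
              return (bs ++ [(v, a')], V v)
    anf e = return ([], e)

    -- For example
    --   evalState (anf (C "Pair" [C "F" [V "a"], C "G" [V "b"]])) 0
    -- gives
    --   ( [("t0", C "F" [V "a"]), ("t1", C "G" [V "b"])]
    --   , C "Pair" [V "t0", V "t1"] )

So x = (f a, g b) comes out as the bindings t0 = f a and t1 = g b plus the
atomic application (t0,t1), which is just the shape shown above with different
names.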
+ +\begin{code} +mkAtomicArgs :: Bool -- A strict binding + -> Bool -- OK to float unlifted args + -> OutExpr + -> SimplM (OrdList (OutId,OutExpr), -- The floats (unusually) may include + OutExpr) -- things that need case-binding, + -- if the strict-binding flag is on + +mkAtomicArgs is_strict ok_float_unlifted rhs + | (Var fun, args) <- collectArgs rhs, -- It's an application + isDataConWorkId fun || valArgCount args < idArity fun -- And it's a constructor or PAP + = go fun nilOL [] args -- Have a go + + | otherwise = bale_out -- Give up + + where + bale_out = returnSmpl (nilOL, rhs) + + go fun binds rev_args [] + = returnSmpl (binds, mkApps (Var fun) (reverse rev_args)) + + go fun binds rev_args (arg : args) + | exprIsTrivial arg -- Easy case + = go fun binds (arg:rev_args) args + + | not can_float_arg -- Can't make this arg atomic + = bale_out -- ... so give up + + | otherwise -- Don't forget to do it recursively + -- E.g. x = a:b:c:[] + = mkAtomicArgs is_strict ok_float_unlifted arg `thenSmpl` \ (arg_binds, arg') -> + newId FSLIT("a") arg_ty `thenSmpl` \ arg_id -> + go fun ((arg_binds `snocOL` (arg_id,arg')) `appOL` binds) + (Var arg_id : rev_args) args + where + arg_ty = exprType arg + can_float_arg = is_strict + || not (isUnLiftedType arg_ty) + || (ok_float_unlifted && exprOkForSpeculation arg) + + +addAtomicBinds :: SimplEnv -> [(OutId,OutExpr)] + -> (SimplEnv -> SimplM (FloatsWith a)) + -> SimplM (FloatsWith a) +addAtomicBinds env [] thing_inside = thing_inside env +addAtomicBinds env ((v,r):bs) thing_inside = addAuxiliaryBind env (NonRec v r) $ \ env -> + addAtomicBinds env bs thing_inside + +addAtomicBindsE :: SimplEnv -> [(OutId,OutExpr)] + -> (SimplEnv -> SimplM FloatsWithExpr) + -> SimplM FloatsWithExpr +-- Same again, but this time we're in an expression context, +-- and may need to do some case bindings + +addAtomicBindsE env [] thing_inside + = thing_inside env +addAtomicBindsE env ((v,r):bs) thing_inside + | needsCaseBinding (idType v) r + = addAtomicBindsE (addNewInScopeIds env [v]) bs thing_inside `thenSmpl` \ (floats, expr) -> + WARN( exprIsTrivial expr, ppr v <+> pprCoreExpr expr ) + (let body = wrapFloats floats expr in + returnSmpl (emptyFloats env, Case r v (exprType body) [(DEFAULT,[],body)])) + + | otherwise + = addAuxiliaryBind env (NonRec v r) $ \ env -> + addAtomicBindsE env bs thing_inside +\end{code} + + +%************************************************************************ +%* * +\subsection{The main rebuilder} +%* * +%************************************************************************ + +\begin{code} +rebuild :: SimplEnv -> OutExpr -> SimplCont -> SimplM FloatsWithExpr + +rebuild env expr (Stop _ _ _) = rebuildDone env expr +rebuild env expr (ArgOf _ _ _ cont_fn) = cont_fn env expr +rebuild env expr (CoerceIt to_ty cont) = rebuild env (mkCoerce to_ty expr) cont +rebuild env expr (InlinePlease cont) = rebuild env (Note InlineCall expr) cont +rebuild env expr (Select _ bndr alts se cont) = rebuildCase (setInScope se env) expr bndr alts cont +rebuild env expr (ApplyTo _ arg se cont) = rebuildApp (setInScope se env) expr arg cont + +rebuildApp env fun arg cont + = simplExpr env arg `thenSmpl` \ arg' -> + rebuild env (App fun arg') cont + +rebuildDone env expr = returnSmpl (emptyFloats env, expr) +\end{code} + + +%************************************************************************ +%* * +\subsection{Functions dealing with a case} +%* * +%************************************************************************ + +Blob of helper functions for the 
"case-of-something-else" situation. + +\begin{code} +--------------------------------------------------------- +-- Eliminate the case if possible + +rebuildCase :: SimplEnv + -> OutExpr -- Scrutinee + -> InId -- Case binder + -> [InAlt] -- Alternatives (inceasing order) + -> SimplCont + -> SimplM FloatsWithExpr + +rebuildCase env scrut case_bndr alts cont + | Just (con,args) <- exprIsConApp_maybe scrut + -- Works when the scrutinee is a variable with a known unfolding + -- as well as when it's an explicit constructor application + = knownCon env (DataAlt con) args case_bndr alts cont + + | Lit lit <- scrut -- No need for same treatment as constructors + -- because literals are inlined more vigorously + = knownCon env (LitAlt lit) [] case_bndr alts cont + + | otherwise + = -- Prepare the alternatives. + prepareAlts scrut case_bndr alts `thenSmpl` \ (better_alts, handled_cons) -> + + -- Prepare the continuation; + -- The new subst_env is in place + prepareCaseCont env better_alts cont `thenSmpl` \ (floats, (dup_cont, nondup_cont)) -> + addFloats env floats $ \ env -> + + let + -- The case expression is annotated with the result type of the continuation + -- This may differ from the type originally on the case. For example + -- case(T) (case(Int#) a of { True -> 1#; False -> 0# }) of + -- a# -> <blob> + -- ===> + -- let j a# = <blob> + -- in case(T) a of { True -> j 1#; False -> j 0# } + -- Note that the case that scrutinises a now returns a T not an Int# + res_ty' = contResultType dup_cont + in + + -- Deal with case binder + simplCaseBinder env scrut case_bndr `thenSmpl` \ (alt_env, case_bndr') -> + + -- Deal with the case alternatives + simplAlts alt_env handled_cons + case_bndr' better_alts dup_cont `thenSmpl` \ alts' -> + + -- Put the case back together + mkCase scrut case_bndr' res_ty' alts' `thenSmpl` \ case_expr -> + + -- Notice that rebuildDone returns the in-scope set from env, not alt_env + -- The case binder *not* scope over the whole returned case-expression + rebuild env case_expr nondup_cont +\end{code} + +simplCaseBinder checks whether the scrutinee is a variable, v. If so, +try to eliminate uses of v in the RHSs in favour of case_bndr; that +way, there's a chance that v will now only be used once, and hence +inlined. + +Note 1 +~~~~~~ +There is a time we *don't* want to do that, namely when +-fno-case-of-case is on. This happens in the first simplifier pass, +and enhances full laziness. Here's the bad case: + f = \ y -> ...(case x of I# v -> ...(case x of ...) ... ) +If we eliminate the inner case, we trap it inside the I# v -> arm, +which might prevent some full laziness happening. I've seen this +in action in spectral/cichelli/Prog.hs: + [(m,n) | m <- [1..max], n <- [1..max]] +Hence the check for NoCaseOfCase. + +Note 2 +~~~~~~ +There is another situation when we don't want to do it. If we have + + case x of w1 { DEFAULT -> case x of w2 { A -> e1; B -> e2 } + ...other cases .... } + +We'll perform the binder-swap for the outer case, giving + + case x of w1 { DEFAULT -> case w1 of w2 { A -> e1; B -> e2 } + ...other cases .... } + +But there is no point in doing it for the inner case, because w1 can't +be inlined anyway. Furthermore, doing the case-swapping involves +zapping w2's occurrence info (see paragraphs that follow), and that +forces us to bind w2 when doing case merging. So we get + + case x of w1 { A -> let w2 = w1 in e1 + B -> let w2 = w1 in e2 + ...other cases .... } + +This is plain silly in the common case where w2 is dead. 
+ +Even so, I can't see a good way to implement this idea. I tried +not doing the binder-swap if the scrutinee was already evaluated +but that failed big-time: + + data T = MkT !Int + + case v of w { MkT x -> + case x of x1 { I# y1 -> + case x of x2 { I# y2 -> ... + +Notice that because MkT is strict, x is marked "evaluated". But to +eliminate the last case, we must either make sure that x (as well as +x1) has unfolding MkT y1. THe straightforward thing to do is to do +the binder-swap. So this whole note is a no-op. + +Note 3 +~~~~~~ +If we replace the scrutinee, v, by tbe case binder, then we have to nuke +any occurrence info (eg IAmDead) in the case binder, because the +case-binder now effectively occurs whenever v does. AND we have to do +the same for the pattern-bound variables! Example: + + (case x of { (a,b) -> a }) (case x of { (p,q) -> q }) + +Here, b and p are dead. But when we move the argment inside the first +case RHS, and eliminate the second case, we get + + case x of { (a,b) -> a b } + +Urk! b is alive! Reason: the scrutinee was a variable, and case elimination +happened. + +Indeed, this can happen anytime the case binder isn't dead: + case <any> of x { (a,b) -> + case x of { (p,q) -> p } } +Here (a,b) both look dead, but come alive after the inner case is eliminated. +The point is that we bring into the envt a binding + let x = (a,b) +after the outer case, and that makes (a,b) alive. At least we do unless +the case binder is guaranteed dead. + +\begin{code} +simplCaseBinder env (Var v) case_bndr + | not (switchIsOn (getSwitchChecker env) NoCaseOfCase) + +-- Failed try [see Note 2 above] +-- not (isEvaldUnfolding (idUnfolding v)) + + = simplBinder env (zap case_bndr) `thenSmpl` \ (env, case_bndr') -> + returnSmpl (modifyInScope env v case_bndr', case_bndr') + -- We could extend the substitution instead, but it would be + -- a hack because then the substitution wouldn't be idempotent + -- any more (v is an OutId). And this does just as well. + where + zap b = b `setIdOccInfo` NoOccInfo + +simplCaseBinder env other_scrut case_bndr + = simplBinder env case_bndr `thenSmpl` \ (env, case_bndr') -> + returnSmpl (env, case_bndr') +\end{code} + + + +\begin{code} +simplAlts :: SimplEnv + -> [AltCon] -- Alternatives the scrutinee can't be + -- in the default case + -> OutId -- Case binder + -> [InAlt] -> SimplCont + -> SimplM [OutAlt] -- Includes the continuation + +simplAlts env handled_cons case_bndr' alts cont' + = do { mb_alts <- mapSmpl simpl_alt alts + ; return [alt' | Just (_, alt') <- mb_alts] } + -- Filter out the alternatives that are inaccessible + where + simpl_alt alt = simplAlt env handled_cons case_bndr' alt cont' + +simplAlt :: SimplEnv -> [AltCon] -> OutId -> InAlt -> SimplCont + -> SimplM (Maybe (TvSubstEnv, OutAlt)) +-- Simplify an alternative, returning the type refinement for the +-- alternative, if the alternative does any refinement at all +-- Nothing => the alternative is inaccessible + +simplAlt env handled_cons case_bndr' (DEFAULT, bndrs, rhs) cont' + = ASSERT( null bndrs ) + simplExprC env' rhs cont' `thenSmpl` \ rhs' -> + returnSmpl (Just (emptyVarEnv, (DEFAULT, [], rhs'))) + where + env' = mk_rhs_env env case_bndr' (mkOtherCon handled_cons) + -- Record the constructors that the case-binder *can't* be. 
+ +simplAlt env handled_cons case_bndr' (LitAlt lit, bndrs, rhs) cont' + = ASSERT( null bndrs ) + simplExprC env' rhs cont' `thenSmpl` \ rhs' -> + returnSmpl (Just (emptyVarEnv, (LitAlt lit, [], rhs'))) + where + env' = mk_rhs_env env case_bndr' (mkUnfolding False (Lit lit)) + +simplAlt env handled_cons case_bndr' (DataAlt con, vs, rhs) cont' + | isVanillaDataCon con + = -- Deal with the pattern-bound variables + -- Mark the ones that are in ! positions in the data constructor + -- as certainly-evaluated. + -- NB: it happens that simplBinders does *not* erase the OtherCon + -- form of unfolding, so it's ok to add this info before + -- doing simplBinders + simplBinders env (add_evals con vs) `thenSmpl` \ (env, vs') -> + + -- Bind the case-binder to (con args) + let unf = mkUnfolding False (mkConApp con con_args) + inst_tys' = tyConAppArgs (idType case_bndr') + con_args = map Type inst_tys' ++ map varToCoreExpr vs' + env' = mk_rhs_env env case_bndr' unf + in + simplExprC env' rhs cont' `thenSmpl` \ rhs' -> + returnSmpl (Just (emptyVarEnv, (DataAlt con, vs', rhs'))) + + | otherwise -- GADT case + = let + (tvs,ids) = span isTyVar vs + in + simplBinders env tvs `thenSmpl` \ (env1, tvs') -> + case coreRefineTys con tvs' (idType case_bndr') of { + Nothing -- Inaccessible + | opt_PprStyle_Debug -- Hack: if debugging is on, generate an error case + -- so we can see it + -> let rhs' = mkApps (Var eRROR_ID) + [Type (substTy env (exprType rhs)), + Lit (mkStringLit "Impossible alternative (GADT)")] + in + simplBinders env1 ids `thenSmpl` \ (env2, ids') -> + returnSmpl (Just (emptyVarEnv, (DataAlt con, tvs' ++ ids', rhs'))) + + | otherwise -- Filter out the inaccessible branch + -> return Nothing ; + + Just refine@(tv_subst_env, _) -> -- The normal case + + let + env2 = refineSimplEnv env1 refine + -- Simplify the Ids in the refined environment, so their types + -- reflect the refinement. Usually this doesn't matter, but it helps + -- in mkDupableAlt, when we want to float a lambda that uses these binders + -- Furthermore, it means the binders contain maximal type information + in + simplBinders env2 (add_evals con ids) `thenSmpl` \ (env3, ids') -> + let unf = mkUnfolding False con_app + con_app = mkConApp con con_args + con_args = map varToCoreExpr vs' -- NB: no inst_tys' + env_w_unf = mk_rhs_env env3 case_bndr' unf + vs' = tvs' ++ ids' + in + simplExprC env_w_unf rhs cont' `thenSmpl` \ rhs' -> + returnSmpl (Just (tv_subst_env, (DataAlt con, vs', rhs'))) } + + where + -- add_evals records the evaluated-ness of the bound variables of + -- a case pattern. This is *important*. Consider + -- data T = T !Int !Int + -- + -- case x of { T a b -> T (a+1) b } + -- + -- We really must record that b is already evaluated so that we don't + -- go and re-evaluate it when constructing the result. 
+ add_evals dc vs = cat_evals dc vs (dataConRepStrictness dc) + + cat_evals dc vs strs + = go vs strs + where + go [] [] = [] + go (v:vs) strs | isTyVar v = v : go vs strs + go (v:vs) (str:strs) + | isMarkedStrict str = evald_v : go vs strs + | otherwise = zapped_v : go vs strs + where + zapped_v = zap_occ_info v + evald_v = zapped_v `setIdUnfolding` evaldUnfolding + go _ _ = pprPanic "cat_evals" (ppr dc $$ ppr vs $$ ppr strs) + + -- If the case binder is alive, then we add the unfolding + -- case_bndr = C vs + -- to the envt; so vs are now very much alive + zap_occ_info | isDeadBinder case_bndr' = \id -> id + | otherwise = \id -> id `setIdOccInfo` NoOccInfo + +mk_rhs_env env case_bndr' case_bndr_unf + = modifyInScope env case_bndr' (case_bndr' `setIdUnfolding` case_bndr_unf) +\end{code} + + +%************************************************************************ +%* * +\subsection{Known constructor} +%* * +%************************************************************************ + +We are a bit careful with occurrence info. Here's an example + + (\x* -> case x of (a*, b) -> f a) (h v, e) + +where the * means "occurs once". This effectively becomes + case (h v, e) of (a*, b) -> f a) +and then + let a* = h v; b = e in f a +and then + f (h v) + +All this should happen in one sweep. + +\begin{code} +knownCon :: SimplEnv -> AltCon -> [OutExpr] + -> InId -> [InAlt] -> SimplCont + -> SimplM FloatsWithExpr + +knownCon env con args bndr alts cont + = tick (KnownBranch bndr) `thenSmpl_` + case findAlt con alts of + (DEFAULT, bs, rhs) -> ASSERT( null bs ) + simplNonRecX env bndr scrut $ \ env -> + -- This might give rise to a binding with non-atomic args + -- like x = Node (f x) (g x) + -- but no harm will be done + simplExprF env rhs cont + where + scrut = case con of + LitAlt lit -> Lit lit + DataAlt dc -> mkConApp dc args + + (LitAlt lit, bs, rhs) -> ASSERT( null bs ) + simplNonRecX env bndr (Lit lit) $ \ env -> + simplExprF env rhs cont + + (DataAlt dc, bs, rhs) + -> ASSERT( n_drop_tys + length bs == length args ) + bind_args env bs (drop n_drop_tys args) $ \ env -> + let + con_app = mkConApp dc (take n_drop_tys args ++ con_args) + con_args = [substExpr env (varToCoreExpr b) | b <- bs] + -- args are aready OutExprs, but bs are InIds + in + simplNonRecX env bndr con_app $ \ env -> + simplExprF env rhs cont + where + n_drop_tys | isVanillaDataCon dc = tyConArity (dataConTyCon dc) + | otherwise = 0 + -- Vanilla data constructors lack type arguments in the pattern + +-- Ugh! 
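-- bind_args walks the pattern binders and the (already simplified) constructor
-- arguments together: a type binder just extends the type substitution, while
-- a value binder is bound with simplNonRecX, which case-binds the argument if
-- it is unlifted and not ok-for-speculation.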
+bind_args env [] _ thing_inside = thing_inside env + +bind_args env (b:bs) (Type ty : args) thing_inside + = ASSERT( isTyVar b ) + bind_args (extendTvSubst env b ty) bs args thing_inside + +bind_args env (b:bs) (arg : args) thing_inside + = ASSERT( isId b ) + simplNonRecX env b arg $ \ env -> + bind_args env bs args thing_inside +\end{code} + + +%************************************************************************ +%* * +\subsection{Duplicating continuations} +%* * +%************************************************************************ + +\begin{code} +prepareCaseCont :: SimplEnv + -> [InAlt] -> SimplCont + -> SimplM (FloatsWith (SimplCont,SimplCont)) + -- Return a duplicatable continuation, a non-duplicable part + -- plus some extra bindings + + -- No need to make it duplicatable if there's only one alternative +prepareCaseCont env [alt] cont = returnSmpl (emptyFloats env, (cont, mkBoringStop (contResultType cont))) +prepareCaseCont env alts cont = mkDupableCont env cont +\end{code} + +\begin{code} +mkDupableCont :: SimplEnv -> SimplCont + -> SimplM (FloatsWith (SimplCont, SimplCont)) + +mkDupableCont env cont + | contIsDupable cont + = returnSmpl (emptyFloats env, (cont, mkBoringStop (contResultType cont))) + +mkDupableCont env (CoerceIt ty cont) + = mkDupableCont env cont `thenSmpl` \ (floats, (dup_cont, nondup_cont)) -> + returnSmpl (floats, (CoerceIt ty dup_cont, nondup_cont)) + +mkDupableCont env (InlinePlease cont) + = mkDupableCont env cont `thenSmpl` \ (floats, (dup_cont, nondup_cont)) -> + returnSmpl (floats, (InlinePlease dup_cont, nondup_cont)) + +mkDupableCont env cont@(ArgOf _ arg_ty _ _) + = returnSmpl (emptyFloats env, (mkBoringStop arg_ty, cont)) + -- Do *not* duplicate an ArgOf continuation + -- Because ArgOf continuations are opaque, we gain nothing by + -- propagating them into the expressions, and we do lose a lot. + -- Here's an example: + -- && (case x of { T -> F; F -> T }) E + -- Now, && is strict so we end up simplifying the case with + -- an ArgOf continuation. If we let-bind it, we get + -- + -- let $j = \v -> && v E + -- in simplExpr (case x of { T -> F; F -> T }) + -- (ArgOf (\r -> $j r) + -- And after simplifying more we get + -- + -- let $j = \v -> && v E + -- in case of { T -> $j F; F -> $j T } + -- Which is a Very Bad Thing + -- + -- The desire not to duplicate is the entire reason that + -- mkDupableCont returns a pair of continuations. + -- + -- The original plan had: + -- e.g. (...strict-fn...) [...hole...] + -- ==> + -- let $j = \a -> ...strict-fn... + -- in $j [...hole...] + +mkDupableCont env (ApplyTo _ arg se cont) + = -- e.g. [...hole...] (...arg...) + -- ==> + -- let a = ...arg... + -- in [...hole...] a + simplExpr (setInScope se env) arg `thenSmpl` \ arg' -> + + mkDupableCont env cont `thenSmpl` \ (floats, (dup_cont, nondup_cont)) -> + addFloats env floats $ \ env -> + + if exprIsDupable arg' then + returnSmpl (emptyFloats env, (ApplyTo OkToDup arg' (zapSubstEnv se) dup_cont, nondup_cont)) + else + newId FSLIT("a") (exprType arg') `thenSmpl` \ arg_id -> + + tick (CaseOfCase arg_id) `thenSmpl_` + -- Want to tick here so that we go round again, + -- and maybe copy or inline the code. + -- Not strictly CaseOfCase, but never mind + + returnSmpl (unitFloat env arg_id arg', + (ApplyTo OkToDup (Var arg_id) (zapSubstEnv se) dup_cont, + nondup_cont)) + -- But what if the arg should be case-bound? + -- This has been this way for a long time, so I'll leave it, + -- but I can't convince myself that it's right. 
+ +mkDupableCont env (Select _ case_bndr alts se cont) + = -- e.g. (case [...hole...] of { pi -> ei }) + -- ===> + -- let ji = \xij -> ei + -- in case [...hole...] of { pi -> ji xij } + tick (CaseOfCase case_bndr) `thenSmpl_` + let + alt_env = setInScope se env + in + prepareCaseCont alt_env alts cont `thenSmpl` \ (floats1, (dup_cont, nondup_cont)) -> + addFloats alt_env floats1 $ \ alt_env -> + + simplBinder alt_env case_bndr `thenSmpl` \ (alt_env, case_bndr') -> + -- NB: simplBinder does not zap deadness occ-info, so + -- a dead case_bndr' will still advertise its deadness + -- This is really important because in + -- case e of b { (# a,b #) -> ... } + -- b is always dead, and indeed we are not allowed to bind b to (# a,b #), + -- which might happen if e was an explicit unboxed pair and b wasn't marked dead. + -- In the new alts we build, we have the new case binder, so it must retain + -- its deadness. + + mkDupableAlts alt_env case_bndr' alts dup_cont `thenSmpl` \ (floats2, alts') -> + addFloats alt_env floats2 $ \ alt_env -> + returnSmpl (emptyFloats alt_env, + (Select OkToDup case_bndr' alts' (zapSubstEnv se) + (mkBoringStop (contResultType dup_cont)), + nondup_cont)) + +mkDupableAlts :: SimplEnv -> OutId -> [InAlt] -> SimplCont + -> SimplM (FloatsWith [InAlt]) +-- Absorbs the continuation into the new alternatives + +mkDupableAlts env case_bndr' alts dupable_cont + = go env alts + where + go env [] = returnSmpl (emptyFloats env, []) + go env (alt:alts) + = do { (floats1, mb_alt') <- mkDupableAlt env case_bndr' dupable_cont alt + ; addFloats env floats1 $ \ env -> do + { (floats2, alts') <- go env alts + ; returnSmpl (floats2, case mb_alt' of + Just alt' -> alt' : alts' + Nothing -> alts' + )}} + +mkDupableAlt env case_bndr' cont alt + = simplAlt env [] case_bndr' alt cont `thenSmpl` \ mb_stuff -> + case mb_stuff of { + Nothing -> returnSmpl (emptyFloats env, Nothing) ; + + Just (reft, (con, bndrs', rhs')) -> + -- Safe to say that there are no handled-cons for the DEFAULT case + + if exprIsDupable rhs' then + returnSmpl (emptyFloats env, Just (con, bndrs', rhs')) + -- It is worth checking for a small RHS because otherwise we + -- get extra let bindings that may cause an extra iteration of the simplifier to + -- inline back in place. Quite often the rhs is just a variable or constructor. + -- The Ord instance of Maybe in PrelMaybe.lhs, for example, took several extra + -- iterations because the version with the let bindings looked big, and so wasn't + -- inlined, but after the join points had been inlined it looked smaller, and so + -- was inlined. + -- + -- NB: we have to check the size of rhs', not rhs. + -- Duplicating a small InAlt might invalidate occurrence information + -- However, if it *is* dupable, we return the *un* simplified alternative, + -- because otherwise we'd need to pair it up with an empty subst-env.... + -- but we only have one env shared between all the alts. + -- (Remember we must zap the subst-env before re-simplifying something). + -- Rather than do this we simply agree to re-simplify the original (small) thing later. 
+ + else + let + rhs_ty' = exprType rhs' + used_bndrs' = filter abstract_over (case_bndr' : bndrs') + abstract_over bndr + | isTyVar bndr = not (bndr `elemVarEnv` reft) + -- Don't abstract over tyvar binders which are refined away + -- See Note [Refinement] below + | otherwise = not (isDeadBinder bndr) + -- The deadness info on the new Ids is preserved by simplBinders + in + -- If we try to lift a primitive-typed something out + -- for let-binding-purposes, we will *caseify* it (!), + -- with potentially-disastrous strictness results. So + -- instead we turn it into a function: \v -> e + -- where v::State# RealWorld#. The value passed to this function + -- is realworld#, which generates (almost) no code. + + -- There's a slight infelicity here: we pass the overall + -- case_bndr to all the join points if it's used in *any* RHS, + -- because we don't know its usage in each RHS separately + + -- We used to say "&& isUnLiftedType rhs_ty'" here, but now + -- we make the join point into a function whenever used_bndrs' + -- is empty. This makes the join-point more CPR friendly. + -- Consider: let j = if .. then I# 3 else I# 4 + -- in case .. of { A -> j; B -> j; C -> ... } + -- + -- Now CPR doesn't w/w j because it's a thunk, so + -- that means that the enclosing function can't w/w either, + -- which is a lose. Here's the example that happened in practice: + -- kgmod :: Int -> Int -> Int + -- kgmod x y = if x > 0 && y < 0 || x < 0 && y > 0 + -- then 78 + -- else 5 + -- + -- I have seen a case alternative like this: + -- True -> \v -> ... + -- It's a bit silly to add the realWorld dummy arg in this case, making + -- $j = \s v -> ... + -- True -> $j s + -- (the \v alone is enough to make CPR happy) but I think it's rare + + ( if not (any isId used_bndrs') + then newId FSLIT("w") realWorldStatePrimTy `thenSmpl` \ rw_id -> + returnSmpl ([rw_id], [Var realWorldPrimId]) + else + returnSmpl (used_bndrs', map varToCoreExpr used_bndrs') + ) `thenSmpl` \ (final_bndrs', final_args) -> + + -- See comment about "$j" name above + newId FSLIT("$j") (mkPiTypes final_bndrs' rhs_ty') `thenSmpl` \ join_bndr -> + -- Notice the funky mkPiTypes. If the contructor has existentials + -- it's possible that the join point will be abstracted over + -- type varaibles as well as term variables. + -- Example: Suppose we have + -- data T = forall t. C [t] + -- Then faced with + -- case (case e of ...) of + -- C t xs::[t] -> rhs + -- We get the join point + -- let j :: forall t. [t] -> ... + -- j = /\t \xs::[t] -> rhs + -- in + -- case (case e of ...) of + -- C t xs::[t] -> j t xs + let + -- We make the lambdas into one-shot-lambdas. The + -- join point is sure to be applied at most once, and doing so + -- prevents the body of the join point being floated out by + -- the full laziness pass + really_final_bndrs = map one_shot final_bndrs' + one_shot v | isId v = setOneShotLambda v + | otherwise = v + join_rhs = mkLams really_final_bndrs rhs' + join_call = mkApps (Var join_bndr) final_args + in + returnSmpl (unitFloat env join_bndr join_rhs, Just (con, bndrs', join_call)) } +\end{code} + +Note [Refinement] +~~~~~~~~~~~~~~~~~ +Consider + data T a where + MkT :: a -> b -> T a + + f = /\a. \(w::a). + case (case ...) of + MkT a' b (p::a') (q::b) -> [p,w] + +The danger is that we'll make a join point + + j a' p = [p,w] + +and that's ill-typed, because (p::a') but (w::a). + +Solution so far: don't abstract over a', because the type refinement +maps [a' -> a] . Ultimately that won't work when real refinement goes on. 
+
+Then we must abstract over any refined free variables. Hmm. Maybe we
+could just abstract over *all* free variables, thereby lambda-lifting
+the join point? We should try this.
diff --git a/compiler/simplCore/simplifier.tib b/compiler/simplCore/simplifier.tib new file mode 100644 index 0000000000..18acd27943 --- /dev/null +++ b/compiler/simplCore/simplifier.tib
@@ -0,0 +1,771 @@
+% Andre:
+%
+% - I'd like the transformation rules to appear clearly-identified in
+% a box of some kind, so they can be distinguished from the examples.
+%
+
+
+
+\documentstyle[slpj,11pt]{article}
+
+\renewcommand{\textfraction}{0.2}
+\renewcommand{\floatpagefraction}{0.7}
+
+\begin{document}
+
+\title{How to simplify matters}
+
+\author{Simon Peyton Jones and Andre Santos\\
+Department of Computing Science, University of Glasgow, G12 8QQ \\
+ @simonpj@@dcs.gla.ac.uk@
+}
+
+\maketitle
+
+
+\section{Motivation}
+
+Quite a few compilers use the {\em compilation by transformation} idiom.
+The idea is that as much as possible of the compilation process is
+expressed as correctness-preserving transformations, each of which
+transforms a program into a semantically-equivalent
+program that (hopefully) executes more quickly or in less space.
+Functional languages are particularly amenable to this approach because
+they have a particularly rich family of possible transformations.
+Examples of transformation-based compilers
+include the Orbit compiler,[.kranz orbit thesis.]
+Kelsey's compilers,[.kelsey thesis, hudak kelsey principles 1989.]
+the New Jersey SML compiler,[.appel compiling with continuations.]
+and the Glasgow Haskell compiler.[.ghc JFIT.] Of course many, perhaps most,
+other compilers also use transformation to some degree.
+
+Compilation by transformation uses automatic transformations; that is, those
+which can safely be applied automatically by a compiler. There
+is also a whole approach to programming, which we might call {\em programming by transformation},
+in which the programmer manually transforms an inefficient specification into
+an efficient program. This development process might be supported by
+a programming environment which does the book-keeping, but the key steps
+are guided by the programmer. We focus exclusively on automatic transformations
+in this paper.
+
+Automatic program transformations seem to fall into two broad categories:
+\begin{itemize}
+\item {\bf Glamorous transformations} are global, sophisticated,
+intellectually satisfying transformations, sometimes guided by some
+interesting kind of analysis.
+Examples include:
+lambda lifting,[.johnsson lambda lifting.]
+full laziness,[.hughes thesis, lester spe.]
+closure conversion,[.appel jim 1989.]
+deforestation,[.wadler 1990 deforestation, marlow wadler deforestation Glasgow92, chin phd 1990 march, gill launchbury.]
+transformations based on strictness analysis,[.peyton launchbury unboxed.]
+and so on. It is easy to write papers about these sorts of transformations.
+
+\item {\bf Humble transformations} are small, simple, local transformations,
+which individually look pretty trivial. Here are two simple examples\footnote{
+The notation @E[]@ stands for an arbitrary expression with zero or more holes.
+The notation @E[e]@ denotes @E[]@ with the holes filled in by the expression @e@.
+We implicitly assume that no name-capture happens --- it's just
+a short-hand, not an algorithm.
+}: +@ + let x = y in E[x] ===> E[y] + + case (x:xs) of ===> E1[x,xs] + (y:ys) -> E1[y,ys] + [] -> E2 +@ +Transformations of this kind are almost embarassingly simple. How could +anyone write a paper about them? +\end{itemize} +This paper is about humble transformations, and how to implement them. +Although each individual +transformation is simple enough, there is a scaling issue: +there are a large number of candidate transformations to consider, and +there are a very large number of opportunities to apply them. + +In the Glasgow Haskell compiler, all humble transformations +are performed by the so-called {\em simplifier}. +Our goal in this paper is to give an overview of how the simplifier works, what +transformations it applies, and what issues arose in its design. + +\section{The language} + +Mutter mutter. Important points: +\begin{itemize} +\item Second order lambda calculus. +\item Arguments are variables. +\item Unboxed data types, and unboxed cases. +\end{itemize} +Less important points: +\begin{itemize} +\item Constructors and primitives are saturated. +\item if-then-else desugared to @case@ +\end{itemize} + +Give data type. + +\section{Transformations} + +This section lists all the transformations implemented by the simplifier. +Because it is a complete list, it is a long one. +We content ourselves with a brief statement of each transformation, +augmented with forward references to Section~\ref{sect:composing} +which gives examples of the ways in which the transformations can compose together. + +\subsection{Beta reduction} + +If a lambda abstraction is applied to an argument, we can simply +beta-reduce. This applies equally to ordinary lambda abstractions and +type abstractions: +@ + (\x -> E[x]) arg ===> E[arg] + (/\a -> E[a]) ty ===> E[ty] +@ +There is no danger of duplicating work because the argument is +guaranteed to be a simple variable or literal. + +\subsubsection{Floating applications inward} + +Applications can be floated inside a @let(rec)@ or @case@ expression. +This is a good idea, because they might find a lambda abstraction inside +to beta-reduce with: +@ + (let(rec) Bind in E) arg ===> let(rec) Bind in (E arg) + + (case E of {P1 -> E1;...; Pn -> En}) arg + ===> + case E of {P1 -> E1 arg; ...; Pn -> En arg} +@ + + + +\subsection{Transformations concerning @let(rec)@} + +\subsubsection{Floating @let@ out of @let@} + +It is sometimes useful to float a @let(rec)@ out of a @let(rec)@ right-hand +side: +@ + let x = let(rec) Bind in B1 ===> let(rec) Bind in + in B2 let x = B1 + in B2 + + + letrec x = let(rec) Bind in B1 ===> let(rec) Bind + in B2 x = B1 + in B2 +@ + +\subsubsection{Floating @case@ out of @let@} + + +\subsubsection{@let@ to @case@} + + +\subsection{Transformations concerning @case@} + +\subsubsection{Case of known constructor} + +If a @case@ expression scrutinises a constructor, +the @case@ can be eliminated. This transformation is a real +win: it eliminates a whole @case@ expression. +@ + case (C a1 .. an) of ===> E[a1..an] + ... + C b1 .. bn -> E[b1..bn] + ... +@ +If none of the constructors in the alternatives match, then +the default is taken: +@ + case (C a1 .. an) of ===> let y = C a1 .. an + ...[no alt matches C]... in E + y -> E +@ +There is an important variant of this transformation when +the @case@ expression scrutinises a {\em variable} +which is known to be bound to a constructor. +This situation can +arise for two reasons: +\begin{itemize} +\item An enclosing @let(rec)@ binding binds the variable to a constructor. 
+For example: +@ + let x = C p q in ... (case x of ...) ... +@ +\item An enclosing @case@ expression scrutinises the same variable. +For example: +@ + case x of + ... + C p q -> ... (case x of ...) ... + ... +@ +This situation is particularly common, as we discuss in Section~\ref{sect:repeated-evals}. +\end{itemize} +In each of these examples, @x@ is known to be bound to @C p q@ +at the inner @case@. The general rules are: +@ + case x of {...; C b1 .. bn -> E[b1..bn]; ...} +===> {x bound to C a1 .. an} + E[a1..an] + + case x of {...[no alts match C]...; y -> E[y]} +===> {x bound to C a1 .. an} + E[x] +@ + +\subsubsection{Dead alternative elimination} +@ + case x of + C a .. z -> E + ...[other alts]... +===> x *not* bound to C + case x of + ...[other alts]... +@ +We might know that @x@ is not bound to a particular constructor +because of an enclosing case: +@ + case x of + C a .. z -> E1 + other -> E2 +@ +Inside @E1@ we know that @x@ is bound to @C@. +However, if the type has more than two constructors, +inside @E2@ all we know is that @x@ is {\em not} bound to @C@. + +This applies to unboxed cases also, in the obvious way. + +\subsubsection{Case elimination} + +If we can prove that @x@ is not bottom, then this rule applies. +@ + case x of ===> E[x] + y -> E[y] +@ +We might know that @x@ is non-bottom because: +\begin{itemize} +\item @x@ has an unboxed type. +\item There's an enclosing case which scrutinises @x@. +\item It is bound to an expression which provably terminates. +\end{itemize} +Since this transformation can only improve termination, even if we apply it +when @x@ is not provably non-bottom, we provide a compiler flag to +enable it all the time. + +\subsubsection{Case of error} + +@ + case (error ty E) of Alts ===> error ty' E + where + ty' is type of whole case expression +@ + +Mutter about types. Mutter about variables bound to error. +Mutter about disguised forms of error. + +\subsubsection{Floating @let(rec)@ out of @case@} + +A @let(rec)@ binding can be floated out of a @case@ scrutinee: +@ + case (let(rec) Bind in E) of Alts ===> let(rec) Bind in + case E of Alts +@ +This increases the likelihood of a case-of-known-constructor transformation, +because @E@ is not hidden from the @case@ by the @let(rec)@. + +\subsubsection{Floating @case@ out of @case@} + +Analogous to floating a @let(rec)@ from a @case@ scrutinee is +floating a @case@ from a @case@ scrutinee. We have to be +careful, though, about code size. If there's only one alternative +in the inner case, things are easy: +@ + case (case E of {P -> R}) of ===> case E of {P -> case R of + Q1 -> S1 Q1 -> S1 + ... ... + Qm -> Sm Qm -> Sm} +@ +If there's more than one alternative there's a danger +that we'll duplicate @S1@...@Sm@, which might be a lot of code. +Our solution is to create a new local definition for each +alternative: +@ + case (case E of {P1 -> R1; ...; Pn -> Rn}) of + Q1 -> S1 + ... + Qm -> Sm +===> + let s1 = \x1 ... z1 -> S1 + ... + sm = \xm ... zm -> Sm + in + case E of + P1 -> case R1 of {Q1 -> s1 x1 ... z1; ...; Qm -> sm xm ... zm} + ... + Pn -> case Rn of {Q1 -> s1 x1 ... z1; ...; Qm -> sm xm ... zm} +@ +Here, @x1 ... z1@ are that subset of +variables bound by the pattern @Q1@ which are free in @S1@, and +similarly for the other @si@. + +Is this transformation a win? After all, we have introduced @m@ new +functions! Section~\ref{sect:join-points} discusses this point. + +\subsubsection{Case merging} + +@ + case x of + ...[some alts]... + other -> case x of + ...[more alts]... 
+===> + case x of + ...[some alts]... + ...[more alts]... +@ +Any alternatives in @[more alts]@ which are already covered by @[some alts]@ +should first be eliminated by the dead-alternative transformation. + + +\subsection{Constructor reuse} + + +\subsection{Inlining} + +The inlining transformtion is simple enough: +@ + let x = R in B[x] ===> B[R] +@ +Inlining is more conventionally used to describe the instantiation of a function +body at its call site, with arguments substituted for formal parameters. We treat +this as a two-stage process: inlining followed by beta reduction. Since we are +working with a higher-order language, not all the arguments may be available at every +call site, so separating inlining from beta reduction allows us to concentrate on +one problem at a time. + +The choice of exactly {\em which} bindings to inline has a major impact on efficiency. +Specifically, we need to consider the following factors: +\begin{itemize} +\item +Inlining a function at its call site, followed by some beta reduction, +very often exposes opportunities for further transformations. +We inline many simple arithmetic and boolean operators for this reason. +\item +Inlining can increase code size. +\item +Inlining can duplicate work, for example if a redex is inlined at more than one site. +Duplicating a single expensive redex can ruin a program's efficiency. +\end{itemize} + + +Our inlining strategy depends on the form of @R@: + +Mutter mutter. + + +\subsubsection{Dead code removal} + +If a @let@-bound variable is not used the binding can be dropped: +@ + let x = E in B ===> B + x not free in B +@ +A similar transformation applies for @letrec@-bound variables. +Programmers seldom write dead code, of course, but bindings often become dead when they +are inlined. + + + + +\section{Composing transformations} +\label{sect:composing} + +The really interesting thing about humble transformations is the way in which +they compose together to carry out substantial and useful transformations. +This section gives a collection of motivating examples, all of which have +shown up in real application programs. + +\subsection{Repeated evals} +\label{sect:repeated-evals} + +Example: x+x, as in unboxed paper. + + +\subsection{Lazy pattern matching} + +Lazy pattern matching is pretty inefficient. Consider: +@ + let (x,y) = E in B +@ +which desugars to: +@ + let t = E + x = case t of (x,y) -> x + y = case t of (x,y) -> y + in B +@ +This code allocates three thunks! 
However, if @B@ is strict in {\em either}
+@x@ {\em or} @y@, then the strictness analyser will easily spot that
+the binding for @t@ is strict, so we can do a @let@-to-@case@ transformation:
+@
+  case E of
+    (x,y) -> let t = (x,y) in
+             let x = case t of (x,y) -> x
+                 y = case t of (x,y) -> y
+             in B
+@
+whereupon the case-of-known-constructor transformation
+eliminates the @case@ expressions in the right-hand side of @x@ and @y@,
+and @t@ is then spotted as being dead, so we get
+@
+  case E of
+    (x,y) -> B
+@
+
+\subsection{Join points}
+\label{sect:join-points}
+
+One motivating example is this:
+@
+  if (not x) then E1 else E2
+@
+After desugaring the conditional, and inlining the definition of
+@not@, we get
+@
+  case (case x of {True -> False; False -> True}) of
+	True  -> E1
+	False -> E2
+@
+Now, if we apply our case-of-case transformation we get:
+@
+  let e1 = E1
+      e2 = E2
+  in
+  case x of
+     True  -> case False of {True -> e1; False -> e2}
+     False -> case True  of {True -> e1; False -> e2}
+@
+Now the case-of-known-constructor transformation applies:
+@
+  let e1 = E1
+      e2 = E2
+  in
+  case x of
+     True  -> e2
+     False -> e1
+@
+Since there is now only one occurrence of @e1@ and @e2@ we can
+inline them, giving just what we hoped for:
+@
+  case x of {True -> E2; False -> E1}
+@
+The point is that the local definitions will often disappear again.
+
+\subsubsection{How join points occur}
+
+But what if they don't disappear?  Then the definitions @s1@ ... @sm@
+play the role of ``join points''; they represent the places where
+execution joins up again, having forked at the @case x@.  The
+``calls'' to the @si@ should really be just jumps.  To see this more clearly
+consider the expression
+@
+  if (x || y) then E1 else E2
+@
+A C compiler will ``short-circuit'' the evaluation of the condition
+if @x@ turns out to be true, generating code something like this:
+@
+      if (x) goto l1;
+      if (y) goto l1;
+      ...code for E2...
+      goto l2;
+  l1: ...code for E1...
+  l2: ...
+@
+In our setting, here's what will happen.  First we desugar the
+conditional, and inline the definition of @||@:
+@
+  case (case x of {True -> True; False -> y}) of
+    True -> E1
+    False -> E2
+@
+Now apply the case-of-case transformation:
+@
+  let e1 = E1
+      e2 = E2
+  in
+  case x of
+     True  -> case True of {True -> e1; False -> e2}
+     False -> case y    of {True -> e1; False -> e2}
+@
+Unlike the @not@ example, only one of the two inner cases
+simplifies, and we can therefore only inline @e2@, because
+@e1@ is still mentioned twice\footnote{Unless the
+inlining strategy decides that @E1@ is small enough to duplicate;
+it is used in separate @case@ branches so there's no concern about duplicating
+work.  Here's another example of the way in which we make one part of the
+simplifier (the inlining strategy) help with the work of another (@case@-expression
+simplification).}:
+@
+  let e1 = E1
+  in
+  case x of
+     True  -> e1
+     False -> case y of {True -> e1; False -> e2}
+@
+The code generator produces essentially the same code as
+the C code given above.  The binding for @e1@ turns into
+just a label, which is jumped to from the two occurrences of @e1@.
+
+\subsubsection{Case of @error@}
+
+The case-of-error transformation is often exposed by the case-of-case
+transformation.  Consider
+@
+  case (hd xs) of
+	True  -> E1
+	False -> E2
+@
+After inlining @hd@, we get
+@
+  case (case xs of [] -> error "hd"; (x:_) -> x) of
+	True  -> E1
+	False -> E2
+@
+(I've omitted the type argument of @error@ to save clutter.)
+Now doing case-of-case gives +@ + let e1 = E1 + e2 = E2 + in + case xs of + [] -> case (error "hd") of { True -> e1; False -> e2 } + (x:_) -> case x of { True -> e1; False -> e2 } +@ +Now the case-of-error transformation springs to life, after which +we can inline @e1@ and @e2@: +@ + case xs of + [] -> error "hd" + (x:_) -> case x of {True -> E1; False -> E2} +@ + +\subsection{Nested conditionals combined} + +Sometimes programmers write something which should be done +by a single @case@ as a sequence of tests: +@ + if x==0::Int then E0 else + if x==1 then E1 else + E2 +@ +After eliminating some redundant evals and doing the case-of-case +transformation we get +@ + case x of I# x# -> + case x# of + 0# -> E0 + other -> case x# of + 1# -> E1 + other -> E2 +@ +The case-merging transformation puts these together to get +@ + case x of I# x# -> + case x# of + 0# -> E0 + 1# -> E1 + other -> E2 +@ +Sometimes the sequence of tests cannot be eliminated from the source +code because of overloading: +@ + f :: Num a => a -> Bool + f 0 = True + f 3 = True + f n = False +@ +If we specialise @f@ to @Int@ we'll get the previous example again. + +\subsection{Error tests eliminated} + +The elimination of redundant alternatives, and then of redundant cases, +arises when we inline functions which do error checking. A typical +example is this: +@ + if (x `rem` y) == 0 then (x `div` y) else y +@ +Here, both @rem@ and @div@ do an error-check for @y@ being zero. +The second check is eliminated by the transformations. +After transformation the code becomes: +@ + case x of I# x# -> + case y of I# y# -> + case y of + 0# -> error "rem: zero divisor" + _ -> case x# rem# y# of + 0# -> case x# div# y# of + r# -> I# r# + _ -> y +@ + +\subsection{Atomic arguments} + +At this point it is possible to appreciate the usefulness of +the Core-language syntax requirement that arguments are atomic. +For example, suppose that arguments could be arbitrary expressions. +Here is a possible transformation: +@ + f (case x of (p,q) -> p) +===> f strict in its second argument + case x of (p,q) -> f (p,p) +@ +Doing this transformation would be useful, because now the +argument to @f@ is a simple variable rather than a thunk. +However, if arguments are atomic, this transformation becomes +just a special case of floating a @case@ out of a strict @let@: +@ + let a = case x of (p,q) -> p + in f a +===> (f a) strict in a + case x of (p,q) -> let a=p in f a +===> + case x of (p,q) -> f p +@ +There are many examples of this kind. For almost any transformation +involving @let@ there is a corresponding one involving a function +argument. The same effect is achieved with much less complexity +by restricting function arguments to be atomic. + +\section{Design} + +Dependency analysis +Occurrence analysis + +\subsection{Renaming and cloning} + +Every program-transformation system has to worry about name capture. +For example, here is an erroneous transformation: +@ + let y = E + in + (\x -> \y -> x + y) (y+3) +===> WRONG! + let y = E + in + (\y -> (y+3) + y) +@ +The transformation fails because the originally free-occurrence +of @y@ in the argument @y+3@ has been ``captured'' by the @\y@-abstraction. +There are various sophisticated solutions to this difficulty, but +we adopted a very simple one: we uniquely rename every locally-bound identifier +on every pass of the simplifier. +Since we are in any case producing an entirely new program (rather than side-effecting +an existing one) it costs very little extra to rename the identifiers as we go. 
+
+So our example would become
+@
+  let y = E
+  in
+  (\x -> \y -> x + y) (y+3)
+===>
+  let y1 = E
+  in
+  (\y2 -> (y1+3) + y2)
+@
+The simplifier accepts as input a program which has arbitrary bound
+variable names, including ``shadowing'' (where a binding hides an
+outer binding for the same identifier), but it produces a program in
+which every bound identifier has a distinct name.
+
+Both the ``old'' and ``new'' identifiers have type @Id@, but when writing
+type signatures for functions in the simplifier we use the types @InId@, for
+identifiers from the input program, and @OutId@ for identifiers from the output program:
+@
+  type InId  = Id
+  type OutId = Id
+@
+This nomenclature extends naturally to expressions: a value of type @InExpr@ is an
+expression whose identifiers are from the input-program name-space, and similarly
+@OutExpr@.
+
+
+\section{The simplifier}
+
+The basic algorithm followed by the simplifier is:
+\begin{enumerate}
+\item Analyse: perform occurrence analysis and dependency analysis.
+\item Simplify: apply as many transformations as possible.
+\item Iterate: perform the above two steps repeatedly until no further transformations are possible.
+(A compiler flag allows the programmer to bound the maximum number of iterations.)
+\end{enumerate}
+We make an effort to apply as many transformations as possible in Step
+2.  To see why this is a good idea, just consider a sequence of
+transformations in which each transformation enables the next.  If
+each iteration of Step 2 only performs one transformation, then the
+entire program will have to be re-analysed by Step 1, and re-traversed by
+Step 2, for each transformation of the sequence.  Sometimes this is
+unavoidable, but it is often possible to perform a sequence of
+transformations in a single pass.
+
+The key function, which simplifies expressions, has the following type:
+@
+  simplExpr :: SimplEnv
+	    -> InExpr -> [OutArg]
+	    -> SmplM OutExpr
+@
+The monad, @SmplM@, can quickly be disposed of.  It has only two purposes:
+\begin{itemize}
+\item It plumbs around a supply of unique names, so that the simplifier can
+easily invent new names.
+\item It gathers together counts of how many of each kind of transformation
+has been applied, for statistical purposes.  These counts are also used
+in Step 3 to decide when the simplification process has terminated.
+\end{itemize}
+
+The signature can be understood like this:
+\begin{itemize}
+\item The environment, of type @SimplEnv@, provides information about
+identifiers bound by the enclosing context.
+\item The second and third arguments together specify the expression to be simplified.
+\item The result is the simplified expression, wrapped up by the monad.
+\end{itemize}
+The simplifier's invariant is this:
+$$
+@simplExpr@~env~expr~[a_1,\ldots,a_n] = expr[env]~a_1~\ldots~a_n
+$$
+That is, the expression returned by $@simplExpr@~env~expr~[a_1,\ldots,a_n]$
+is semantically equal to (although hopefully more efficient than)
+$expr$, with the renamings in $env$ applied to it, applied to the arguments
+$a_1,\ldots,a_n$.
+
+\subsection{Application and beta reduction}
+
+The arguments are carried ``inwards'' by @simplExpr@, as an accumulating parameter.
+This is a convenient way of implementing the transformations which float
+arguments inside a @let@ and @case@.
This list of pending arguments +requires a new data type, @CoreArg@, along with its ``in'' and ``out'' synonyms, +because an argument might be a type or an atom: +@ +data CoreArg bindee = TypeArg UniType + | ValArg (CoreAtom bindee) + +type InArg = CoreArg InId +type OutArg = CoreArg OutId +@ +The equations for applications simply apply +the environment to the argument (to handle renaming) and put the result +on the argument stack, tagged to say whether it is a type argument or value argument: +@ + simplExpr env (CoApp fun arg) args + = simplExpr env fun (ValArg (simplAtom env arg) : args) + simplExpr env (CoTyApp fun ty) args + = simplExpr env fun (TypeArg (simplTy env ty) : args) +@ + + + + + + +\end{document} diff --git a/compiler/simplStg/SRT.lhs b/compiler/simplStg/SRT.lhs new file mode 100644 index 0000000000..cd118d7092 --- /dev/null +++ b/compiler/simplStg/SRT.lhs @@ -0,0 +1,165 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1998 +% + +Run through the STG code and compute the Static Reference Table for +each let-binding. At the same time, we figure out which top-level +bindings have no CAF references, and record the fact in their IdInfo. + +\begin{code} +module SRT( computeSRTs ) where + +#include "HsVersions.h" + +import StgSyn +import Id ( Id ) +import VarSet +import VarEnv +import Util ( sortLe ) +import Maybes ( orElse ) +import Maybes ( expectJust ) +import Bitmap ( intsToBitmap ) + +#ifdef DEBUG +import Outputable +#endif + +import List + +import Util +import Outputable +\end{code} + +\begin{code} +computeSRTs :: [StgBinding] -> [(StgBinding,[(Id,[Id])])] + -- The incoming bindingd are filled with SRTEntries in their SRT slots + -- the outgoing ones have NoSRT/SRT values instead + +computeSRTs binds = srtTopBinds emptyVarEnv binds + +-- -------------------------------------------------------------------------- +-- Top-level Bindings + +srtTopBinds :: IdEnv Id -> [StgBinding] -> [(StgBinding, [(Id,[Id])])] + +srtTopBinds env [] = [] +srtTopBinds env (StgNonRec b rhs : binds) = + (StgNonRec b rhs', [(b,srt')]) : srtTopBinds env' binds + where + (rhs', srt) = srtTopRhs b rhs + env' = maybeExtendEnv env b rhs + srt' = applyEnvList env srt +srtTopBinds env (StgRec bs : binds) = + (StgRec (zip bndrs rhss), zip bndrs srts') : srtTopBinds env binds + where + (rhss, srts) = unzip [ srtTopRhs b r | (b,r) <- bs ] + bndrs = map fst bs + srts' = map (applyEnvList env) srts + +-- Shorting out indirections in SRTs: if a binding has an SRT with a single +-- element in it, we just inline it with that element everywhere it occurs +-- in other SRTs. +-- +-- This is in a way a generalisation of the CafInfo. CafInfo says +-- whether a top-level binding has *zero* CAF references, allowing us +-- to omit it from SRTs. Here, we pick up bindings with *one* CAF +-- reference, and inline its SRT everywhere it occurs. We could pass +-- this information across module boundaries too, but we currently +-- don't. 
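+--
+-- A purely illustrative (hypothetical) example: suppose
+--	x  has SRT {c}		-- exactly one CAF reference
+--	f  has SRT {x, d}
+-- Then maybeExtendEnv records x :-> c, and applyEnvList rewrites f's SRT
+-- to {c, d}, so later SRTs refer to c directly instead of indirecting
+-- through x.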
+ +maybeExtendEnv env bndr (StgRhsClosure _ _ _ ReEntrant (SRTEntries cafs) _ _) + | [one] <- varSetElems cafs + = extendVarEnv env bndr (applyEnv env one) +maybeExtendEnv env bndr _ = env + +applyEnvList :: IdEnv Id -> [Id] -> [Id] +applyEnvList env = map (applyEnv env) + +applyEnv env id = lookupVarEnv env id `orElse` id + +-- ---- Top-level right hand sides: + +srtTopRhs :: Id -> StgRhs -> (StgRhs, [Id]) + +srtTopRhs binder rhs@(StgRhsCon _ _ _) = (rhs, []) +srtTopRhs binder rhs@(StgRhsClosure _ _ _ _ (SRTEntries cafs) _ _) + = (srtRhs table rhs, elems) + where + elems = varSetElems cafs + table = mkVarEnv (zip elems [0..]) + +-- ---- Binds: + +srtBind :: IdEnv Int -> StgBinding -> StgBinding + +srtBind table (StgNonRec binder rhs) = StgNonRec binder (srtRhs table rhs) +srtBind table (StgRec pairs) = StgRec [ (b, srtRhs table r) | (b,r) <- pairs ] + +-- ---- Right Hand Sides: + +srtRhs :: IdEnv Int -> StgRhs -> StgRhs + +srtRhs table e@(StgRhsCon cc con args) = e +srtRhs table (StgRhsClosure cc bi free_vars u srt args body) + = StgRhsClosure cc bi free_vars u (constructSRT table srt) args + $! (srtExpr table body) + +-- --------------------------------------------------------------------------- +-- Expressions + +srtExpr :: IdEnv Int -> StgExpr -> StgExpr + +srtExpr table e@(StgApp f args) = e +srtExpr table e@(StgLit l) = e +srtExpr table e@(StgConApp con args) = e +srtExpr table e@(StgOpApp op args ty) = e + +srtExpr table (StgSCC cc expr) = StgSCC cc $! srtExpr table expr + +srtExpr table (StgCase scrut live1 live2 uniq srt alt_type alts) + = StgCase expr' live1 live2 uniq srt' alt_type alts' + where + expr' = srtExpr table scrut + srt' = constructSRT table srt + alts' = map (srtAlt table) alts + +srtExpr table (StgLet bind body) + = srtBind table bind =: \ bind' -> + srtExpr table body =: \ body' -> + StgLet bind' body' + +srtExpr table (StgLetNoEscape live1 live2 bind body) + = srtBind table bind =: \ bind' -> + srtExpr table body =: \ body' -> + StgLetNoEscape live1 live2 bind' body' + +#ifdef DEBUG +srtExpr table expr = pprPanic "srtExpr" (ppr expr) +#endif + +srtAlt :: IdEnv Int -> StgAlt -> StgAlt +srtAlt table (con,args,used,rhs) + = (,,,) con args used $! srtExpr table rhs + +----------------------------------------------------------------------------- +-- Construct an SRT bitmap. + +constructSRT :: IdEnv Int -> SRT -> SRT +constructSRT table (SRTEntries entries) + | isEmptyVarSet entries = NoSRT + | otherwise = SRT offset len bitmap + where + ints = map (expectJust "constructSRT" . 
lookupVarEnv table) + (varSetElems entries) + sorted_ints = sortLe (<=) ints + offset = head sorted_ints + bitmap_entries = map (subtract offset) sorted_ints + len = last bitmap_entries + 1 + bitmap = intsToBitmap len bitmap_entries + +-- --------------------------------------------------------------------------- +-- Misc stuff + +a =: k = k a + +\end{code} diff --git a/compiler/simplStg/SimplStg.lhs b/compiler/simplStg/SimplStg.lhs new file mode 100644 index 0000000000..e87877cb4c --- /dev/null +++ b/compiler/simplStg/SimplStg.lhs @@ -0,0 +1,96 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-1998 +% +\section[SimplStg]{Driver for simplifying @STG@ programs} + +\begin{code} +module SimplStg ( stg2stg ) where + +#include "HsVersions.h" + +import StgSyn + +import CostCentre ( CollectedCCs ) +import SCCfinal ( stgMassageForProfiling ) +import StgLint ( lintStgBindings ) +import StgStats ( showStgStats ) +import SRT ( computeSRTs ) + +import Packages ( HomeModules ) +import DynFlags ( DynFlags(..), DynFlag(..), dopt, StgToDo(..), + getStgToDo ) +import Id ( Id ) +import Module ( Module ) +import ErrUtils ( doIfSet_dyn, dumpIfSet_dyn, showPass ) +import UniqSupply ( mkSplitUniqSupply, splitUniqSupply ) +import Outputable +\end{code} + +\begin{code} +stg2stg :: DynFlags -- includes spec of what stg-to-stg passes to do + -> HomeModules + -> Module -- module name (profiling only) + -> [StgBinding] -- input... + -> IO ( [(StgBinding,[(Id,[Id])])] -- output program... + , CollectedCCs) -- cost centre information (declared and used) + +stg2stg dflags pkg_deps module_name binds + = do { showPass dflags "Stg2Stg" + ; us <- mkSplitUniqSupply 'g' + + ; doIfSet_dyn dflags Opt_D_verbose_stg2stg + (printDump (text "VERBOSE STG-TO-STG:")) + + ; (binds', us', ccs) <- end_pass us "Stg2Stg" ([],[],[]) binds + + -- Do the main business! + ; (processed_binds, _, cost_centres) + <- foldl_mn do_stg_pass (binds', us', ccs) (getStgToDo dflags) + + ; let srt_binds = computeSRTs processed_binds + + ; dumpIfSet_dyn dflags Opt_D_dump_stg "STG syntax:" + (pprStgBindingsWithSRTs srt_binds) + + ; return (srt_binds, cost_centres) + } + + where + stg_linter = if dopt Opt_DoStgLinting dflags + then lintStgBindings + else ( \ whodunnit binds -> binds ) + + ------------------------------------------- + do_stg_pass (binds, us, ccs) to_do + = let + (us1, us2) = splitUniqSupply us + in + case to_do of + D_stg_stats -> + trace (showStgStats binds) + end_pass us2 "StgStats" ccs binds + + StgDoMassageForProfiling -> + {-# SCC "ProfMassage" #-} + let + (collected_CCs, binds3) + = stgMassageForProfiling pkg_deps module_name us1 binds + in + end_pass us2 "ProfMassage" collected_CCs binds3 + + end_pass us2 what ccs binds2 + = do -- report verbosely, if required + dumpIfSet_dyn dflags Opt_D_verbose_stg2stg what + (vcat (map ppr binds2)) + let linted_binds = stg_linter what binds2 + return (linted_binds, us2, ccs) + -- return: processed binds + -- UniqueSupply for the next guy to use + -- cost-centres to be declared/registered (specialised) + -- add to description of what's happened (reverse order) + +-- here so it can be inlined... 
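+-- foldl_mn is a small monadic left fold: stg2stg uses it to thread the
+-- accumulated (bindings, unique supply, cost-centre info) triple through
+-- each STG-to-STG pass in turn.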
+foldl_mn f z [] = return z +foldl_mn f z (x:xs) = f z x >>= \ zz -> + foldl_mn f zz xs +\end{code} diff --git a/compiler/simplStg/StgStats.lhs b/compiler/simplStg/StgStats.lhs new file mode 100644 index 0000000000..a91873971c --- /dev/null +++ b/compiler/simplStg/StgStats.lhs @@ -0,0 +1,172 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[StgStats]{Gathers statistical information about programs} + + +The program gather statistics about +\begin{enumerate} +\item number of boxed cases +\item number of unboxed cases +\item number of let-no-escapes +\item number of non-updatable lets +\item number of updatable lets +\item number of applications +\item number of primitive applications +\item number of closures (does not include lets bound to constructors) +\item number of free variables in closures +%\item number of top-level functions +%\item number of top-level CAFs +\item number of constructors +\end{enumerate} + +\begin{code} +module StgStats ( showStgStats ) where + +#include "HsVersions.h" + +import StgSyn + +import FiniteMap ( emptyFM, plusFM_C, unitFM, fmToList, FiniteMap ) +import Id (Id) +\end{code} + +\begin{code} +data CounterType + = Literals + | Applications + | ConstructorApps + | PrimitiveApps + | LetNoEscapes + | StgCases + | FreeVariables + | ConstructorBinds Bool{-True<=>top-level-} + | ReEntrantBinds Bool{-ditto-} + | SingleEntryBinds Bool{-ditto-} + | UpdatableBinds Bool{-ditto-} + deriving (Eq, Ord) + +type Count = Int +type StatEnv = FiniteMap CounterType Count +\end{code} + +\begin{code} +emptySE :: StatEnv +emptySE = emptyFM + +combineSE :: StatEnv -> StatEnv -> StatEnv +combineSE = plusFM_C (+) + +combineSEs :: [StatEnv] -> StatEnv +combineSEs = foldr combineSE emptySE + +countOne :: CounterType -> StatEnv +countOne c = unitFM c 1 + +countN :: CounterType -> Int -> StatEnv +countN = unitFM +\end{code} + +%************************************************************************ +%* * +\subsection{Top-level list of bindings (a ``program'')} +%* * +%************************************************************************ + +\begin{code} +showStgStats :: [StgBinding] -> String + +showStgStats prog + = "STG Statistics:\n\n" + ++ concat (map showc (fmToList (gatherStgStats prog))) + where + showc (x,n) = (showString (s x) . 
shows n) "\n" + + s Literals = "Literals " + s Applications = "Applications " + s ConstructorApps = "ConstructorApps " + s PrimitiveApps = "PrimitiveApps " + s LetNoEscapes = "LetNoEscapes " + s StgCases = "StgCases " + s FreeVariables = "FreeVariables " + s (ConstructorBinds True) = "ConstructorBinds_Top " + s (ReEntrantBinds True) = "ReEntrantBinds_Top " + s (SingleEntryBinds True) = "SingleEntryBinds_Top " + s (UpdatableBinds True) = "UpdatableBinds_Top " + s (ConstructorBinds _) = "ConstructorBinds_Nested " + s (ReEntrantBinds _) = "ReEntrantBindsBinds_Nested " + s (SingleEntryBinds _) = "SingleEntryBinds_Nested " + s (UpdatableBinds _) = "UpdatableBinds_Nested " + +gatherStgStats :: [StgBinding] -> StatEnv + +gatherStgStats binds + = combineSEs (map (statBinding True{-top-level-}) binds) +\end{code} + +%************************************************************************ +%* * +\subsection{Bindings} +%* * +%************************************************************************ + +\begin{code} +statBinding :: Bool -- True <=> top-level; False <=> nested + -> StgBinding + -> StatEnv + +statBinding top (StgNonRec b rhs) + = statRhs top (b, rhs) + +statBinding top (StgRec pairs) + = combineSEs (map (statRhs top) pairs) + +statRhs :: Bool -> (Id, StgRhs) -> StatEnv + +statRhs top (b, StgRhsCon cc con args) + = countOne (ConstructorBinds top) + +statRhs top (b, StgRhsClosure cc bi fv u _srt args body) + = statExpr body `combineSE` + countN FreeVariables (length fv) `combineSE` + countOne ( + case u of + ReEntrant -> ReEntrantBinds top + Updatable -> UpdatableBinds top + SingleEntry -> SingleEntryBinds top + ) +\end{code} + +%************************************************************************ +%* * +\subsection{Expressions} +%* * +%************************************************************************ + +\begin{code} +statExpr :: StgExpr -> StatEnv + +statExpr (StgApp _ _) = countOne Applications +statExpr (StgLit _) = countOne Literals +statExpr (StgConApp _ _) = countOne ConstructorApps +statExpr (StgOpApp _ _ _) = countOne PrimitiveApps +statExpr (StgSCC l e) = statExpr e + +statExpr (StgLetNoEscape lvs_whole lvs_rhss binds body) + = statBinding False{-not top-level-} binds `combineSE` + statExpr body `combineSE` + countOne LetNoEscapes + +statExpr (StgLet binds body) + = statBinding False{-not top-level-} binds `combineSE` + statExpr body + +statExpr (StgCase expr lve lva bndr srt alt_type alts) + = statExpr expr `combineSE` + stat_alts alts `combineSE` + countOne StgCases + where + stat_alts alts + = combineSEs (map statExpr [ e | (_,_,_,e) <- alts ]) +\end{code} + diff --git a/compiler/specialise/Rules.lhs b/compiler/specialise/Rules.lhs new file mode 100644 index 0000000000..4d743140ea --- /dev/null +++ b/compiler/specialise/Rules.lhs @@ -0,0 +1,633 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[CoreRules]{Transformation rules} + +\begin{code} +module Rules ( + RuleBase, emptyRuleBase, mkRuleBase, extendRuleBaseList, + unionRuleBase, pprRuleBase, ruleCheckProgram, + + mkSpecInfo, extendSpecInfo, addSpecInfo, + rulesOfBinds, addIdSpecialisations, + + lookupRule, mkLocalRule, roughTopNames + ) where + +#include "HsVersions.h" + +import CoreSyn -- All of it +import OccurAnal ( occurAnalyseExpr ) +import CoreFVs ( exprFreeVars, exprsFreeVars, rulesRhsFreeVars ) +import CoreUnfold ( isCheapUnfolding, unfoldingTemplate ) +import CoreUtils ( tcEqExprX ) +import PprCore ( pprRules ) +import Type ( TvSubstEnv ) +import TcType ( 
tcSplitTyConApp_maybe ) +import CoreTidy ( tidyRules ) +import Id ( Id, idUnfolding, isLocalId, isGlobalId, idName, + idSpecialisation, idCoreRules, setIdSpecialisation ) +import IdInfo ( SpecInfo( SpecInfo ) ) +import Var ( Var ) +import VarEnv ( IdEnv, InScopeSet, emptyTidyEnv, + emptyInScopeSet, mkInScopeSet, extendInScopeSetList, + emptyVarEnv, lookupVarEnv, extendVarEnv, + nukeRnEnvL, mkRnEnv2, rnOccR, rnOccL, inRnEnvR, + rnBndrR, rnBndr2, rnBndrL, rnBndrs2 ) +import VarSet +import Name ( Name, NamedThing(..), nameOccName ) +import NameEnv +import Unify ( ruleMatchTyX, MatchEnv(..) ) +import BasicTypes ( Activation, CompilerPhase, isActive ) +import Outputable +import FastString +import Maybes ( isJust, orElse ) +import Bag +import Util ( singleton ) +import List ( isPrefixOf ) +\end{code} + + +%************************************************************************ +%* * +\subsection[specialisation-IdInfo]{Specialisation info about an @Id@} +%* * +%************************************************************************ + +A @CoreRule@ holds details of one rule for an @Id@, which +includes its specialisations. + +For example, if a rule for @f@ contains the mapping: +\begin{verbatim} + forall a b d. [Type (List a), Type b, Var d] ===> f' a b +\end{verbatim} +then when we find an application of f to matching types, we simply replace +it by the matching RHS: +\begin{verbatim} + f (List Int) Bool dict ===> f' Int Bool +\end{verbatim} +All the stuff about how many dictionaries to discard, and what types +to apply the specialised function to, are handled by the fact that the +Rule contains a template for the result of the specialisation. + +There is one more exciting case, which is dealt with in exactly the same +way. If the specialised value is unboxed then it is lifted at its +definition site and unlifted at its uses. For example: + + pi :: forall a. Num a => a + +might have a specialisation + + [Int#] ===> (case pi' of Lift pi# -> pi#) + +where pi' :: Lift Int# is the specialised version of pi. + +\begin{code} +mkLocalRule :: RuleName -> Activation + -> Name -> [CoreBndr] -> [CoreExpr] -> CoreExpr -> CoreRule +-- Used to make CoreRule for an Id defined in this module +mkLocalRule name act fn bndrs args rhs + = Rule { ru_name = name, ru_fn = fn, ru_act = act, + ru_bndrs = bndrs, ru_args = args, + ru_rhs = rhs, ru_rough = roughTopNames args, + ru_orph = Just (nameOccName fn), ru_local = True } + +-------------- +roughTopNames :: [CoreExpr] -> [Maybe Name] +roughTopNames args = map roughTopName args + +roughTopName :: CoreExpr -> Maybe Name +-- Find the "top" free name of an expression +-- a) the function in an App chain (if a GlobalId) +-- b) the TyCon in a type +-- This is used for the fast-match-check for rules; +-- if the top names don't match, the rest can't +roughTopName (Type ty) = case tcSplitTyConApp_maybe ty of + Just (tc,_) -> Just (getName tc) + Nothing -> Nothing +roughTopName (App f a) = roughTopName f +roughTopName (Var f) | isGlobalId f = Just (idName f) + | otherwise = Nothing +roughTopName other = Nothing + +ruleCantMatch :: [Maybe Name] -> [Maybe Name] -> Bool +-- (ruleCantMatch tpl actual) returns True only if 'actual' +-- definitely can't match 'tpl' by instantiating 'tpl'. 
+-- It's only a one-way match; unlike instance matching we +-- don't consider unification +ruleCantMatch (Just n1 : ts) (Just n2 : as) = n1 /= n2 || ruleCantMatch ts as +ruleCantMatch (Just n1 : ts) (Nothing : as) = True +ruleCantMatch (t : ts) (a : as) = ruleCantMatch ts as +ruleCantMatch ts as = False +\end{code} + + +%************************************************************************ +%* * + SpecInfo: the rules in an IdInfo +%* * +%************************************************************************ + +\begin{code} +mkSpecInfo :: [CoreRule] -> SpecInfo +mkSpecInfo rules = SpecInfo rules (rulesRhsFreeVars rules) + +extendSpecInfo :: SpecInfo -> [CoreRule] -> SpecInfo +extendSpecInfo (SpecInfo rs1 fvs1) rs2 + = SpecInfo (rs2 ++ rs1) (rulesRhsFreeVars rs2 `unionVarSet` fvs1) + +addSpecInfo :: SpecInfo -> SpecInfo -> SpecInfo +addSpecInfo (SpecInfo rs1 fvs1) (SpecInfo rs2 fvs2) + = SpecInfo (rs1 ++ rs2) (fvs1 `unionVarSet` fvs2) + +addIdSpecialisations :: Id -> [CoreRule] -> Id +addIdSpecialisations id rules + = setIdSpecialisation id $ + extendSpecInfo (idSpecialisation id) rules + +rulesOfBinds :: [CoreBind] -> [CoreRule] +rulesOfBinds binds = concatMap (concatMap idCoreRules . bindersOf) binds +\end{code} + + +%************************************************************************ +%* * + RuleBase +%* * +%************************************************************************ + +\begin{code} +type RuleBase = NameEnv [CoreRule] + -- Maps (the name of) an Id to its rules + -- The rules are are unordered; + -- we sort out any overlaps on lookup + +emptyRuleBase = emptyNameEnv + +mkRuleBase :: [CoreRule] -> RuleBase +mkRuleBase rules = extendRuleBaseList emptyRuleBase rules + +extendRuleBaseList :: RuleBase -> [CoreRule] -> RuleBase +extendRuleBaseList rule_base new_guys + = foldl extendRuleBase rule_base new_guys + +unionRuleBase :: RuleBase -> RuleBase -> RuleBase +unionRuleBase rb1 rb2 = plusNameEnv_C (++) rb1 rb2 + +extendRuleBase :: RuleBase -> CoreRule -> RuleBase +extendRuleBase rule_base rule + = extendNameEnv_Acc (:) singleton rule_base (ruleIdName rule) rule + +pprRuleBase :: RuleBase -> SDoc +pprRuleBase rules = vcat [ pprRules (tidyRules emptyTidyEnv rs) + | rs <- nameEnvElts rules ] +\end{code} + + +%************************************************************************ +%* * +\subsection{Matching} +%* * +%************************************************************************ + +\begin{code} +lookupRule :: (Activation -> Bool) -> InScopeSet + -> RuleBase -- Imported rules + -> Id -> [CoreExpr] -> Maybe (RuleName, CoreExpr) +lookupRule is_active in_scope rule_base fn args + = matchRules is_active in_scope fn args rules + where + -- The rules for an Id come from two places: + -- (a) the ones it is born with (idCoreRules fn) + -- (b) rules added in subsequent modules (extra_rules) + -- PrimOps, for example, are born with a bunch of rules under (a) + rules = extra_rules ++ idCoreRules fn + extra_rules | isLocalId fn = [] + | otherwise = lookupNameEnv rule_base (idName fn) `orElse` [] + +matchRules :: (Activation -> Bool) -> InScopeSet + -> Id -> [CoreExpr] + -> [CoreRule] -> Maybe (RuleName, CoreExpr) +-- See comments on matchRule +matchRules is_active in_scope fn args rules + = case go [] rules of + [] -> Nothing + (m:ms) -> Just (case findBest (fn,args) m ms of + (rule, ans) -> (ru_name rule, ans)) + where + rough_args = map roughTopName args + + go :: [(CoreRule,CoreExpr)] -> [CoreRule] -> [(CoreRule,CoreExpr)] + go ms [] = ms + go ms (r:rs) = case (matchRule 
is_active in_scope args rough_args r) of + Just e -> go ((r,e):ms) rs + Nothing -> go ms rs + +findBest :: (Id, [CoreExpr]) + -> (CoreRule,CoreExpr) -> [(CoreRule,CoreExpr)] -> (CoreRule,CoreExpr) +-- All these pairs matched the expression +-- Return the pair the the most specific rule +-- The (fn,args) is just for overlap reporting + +findBest target (rule,ans) [] = (rule,ans) +findBest target (rule1,ans1) ((rule2,ans2):prs) + | rule1 `isMoreSpecific` rule2 = findBest target (rule1,ans1) prs + | rule2 `isMoreSpecific` rule1 = findBest target (rule1,ans1) prs +#ifdef DEBUG + | otherwise = pprTrace "Rules.findBest: rule overlap (Rule 1 wins)" + (vcat [ptext SLIT("Expression to match:") <+> ppr fn <+> sep (map ppr args), + ptext SLIT("Rule 1:") <+> ppr rule1, + ptext SLIT("Rule 2:") <+> ppr rule2]) $ + findBest target (rule1,ans1) prs +#else + | otherwise = findBest target (rule1,ans1) prs +#endif + where + (fn,args) = target + +isMoreSpecific :: CoreRule -> CoreRule -> Bool +isMoreSpecific (BuiltinRule {}) r2 = True +isMoreSpecific r1 (BuiltinRule {}) = False +isMoreSpecific (Rule { ru_bndrs = bndrs1, ru_args = args1 }) + (Rule { ru_bndrs = bndrs2, ru_args = args2 }) + = isJust (matchN in_scope bndrs2 args2 args1) + where + in_scope = mkInScopeSet (mkVarSet bndrs1) + -- Actually we should probably include the free vars + -- of rule1's args, but I can't be bothered + +noBlackList :: Activation -> Bool +noBlackList act = False -- Nothing is black listed + +matchRule :: (Activation -> Bool) -> InScopeSet + -> [CoreExpr] -> [Maybe Name] + -> CoreRule -> Maybe CoreExpr + +-- If (matchRule rule args) returns Just (name,rhs) +-- then (f args) matches the rule, and the corresponding +-- rewritten RHS is rhs +-- +-- The bndrs and rhs is occurrence-analysed +-- +-- Example +-- +-- The rule +-- forall f g x. map f (map g x) ==> map (f . g) x +-- is stored +-- CoreRule "map/map" +-- [f,g,x] -- tpl_vars +-- [f,map g x] -- tpl_args +-- map (f.g) x) -- rhs +-- +-- Then the call: matchRule the_rule [e1,map e2 e3] +-- = Just ("map/map", (\f,g,x -> rhs) e1 e2 e3) +-- +-- Any 'surplus' arguments in the input are simply put on the end +-- of the output. 
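+--
+-- For instance (an illustrative extension of the example above), the call
+--	matchRule the_rule [e1, map e2 e3, e4]
+--	  = Just ("map/map", (\f,g,x -> rhs) e1 e2 e3 e4)
+-- i.e. the surplus argument e4 is simply appended to the rewritten call.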
+ +matchRule is_active in_scope args rough_args + (BuiltinRule { ru_name = name, ru_try = match_fn }) + = case match_fn args of + Just expr -> Just expr + Nothing -> Nothing + +matchRule is_active in_scope args rough_args + (Rule { ru_name = rn, ru_act = act, ru_rough = tpl_tops, + ru_bndrs = tpl_vars, ru_args = tpl_args, + ru_rhs = rhs }) + | not (is_active act) = Nothing + | ruleCantMatch tpl_tops rough_args = Nothing + | otherwise + = case matchN in_scope tpl_vars tpl_args args of + Nothing -> Nothing + Just (tpl_vals, leftovers) -> Just (rule_fn + `mkApps` tpl_vals + `mkApps` leftovers) + where + rule_fn = occurAnalyseExpr (mkLams tpl_vars rhs) + -- We could do this when putting things into the rulebase, I guess +\end{code} + +\begin{code} +matchN :: InScopeSet + -> [Var] -- Template tyvars + -> [CoreExpr] -- Template + -> [CoreExpr] -- Target; can have more elts than template + -> Maybe ([CoreExpr], -- What is substituted for each template var + [CoreExpr]) -- Leftover target exprs + +matchN in_scope tmpl_vars tmpl_es target_es + = do { (subst, leftover_es) <- go init_menv emptySubstEnv tmpl_es target_es + ; return (map (lookup_tmpl subst) tmpl_vars, leftover_es) } + where + init_menv = ME { me_tmpls = mkVarSet tmpl_vars, me_env = init_rn_env } + init_rn_env = mkRnEnv2 (extendInScopeSetList in_scope tmpl_vars) + + go menv subst [] es = Just (subst, es) + go menv subst ts [] = Nothing -- Fail if too few actual args + go menv subst (t:ts) (e:es) = do { subst1 <- match menv subst t e + ; go menv subst1 ts es } + + lookup_tmpl :: (TvSubstEnv, IdSubstEnv) -> Var -> CoreExpr + lookup_tmpl (tv_subst, id_subst) tmpl_var + | isTyVar tmpl_var = case lookupVarEnv tv_subst tmpl_var of + Just ty -> Type ty + Nothing -> unbound tmpl_var + | otherwise = case lookupVarEnv id_subst tmpl_var of + Just e -> e + other -> unbound tmpl_var + + unbound var = pprPanic "Template variable unbound in rewrite rule" (ppr var) +\end{code} + + + --------------------------------------------- + The inner workings of matching + --------------------------------------------- + +\begin{code} +-- These two definitions are not the same as in Subst, +-- but they simple and direct, and purely local to this module +-- The third, for TvSubstEnv, is the same as in VarEnv, but repeated here +-- for uniformity with IdSubstEnv +type SubstEnv = (TvSubstEnv, IdSubstEnv) +type IdSubstEnv = IdEnv CoreExpr + +emptySubstEnv :: SubstEnv +emptySubstEnv = (emptyVarEnv, emptyVarEnv) + + +-- At one stage I tried to match even if there are more +-- template args than real args. + +-- I now think this is probably a bad idea. +-- Should the template (map f xs) match (map g)? I think not. +-- For a start, in general eta expansion wastes work. +-- SLPJ July 99 + + +match :: MatchEnv + -> SubstEnv + -> CoreExpr -- Template + -> CoreExpr -- Target + -> Maybe SubstEnv + +-- See the notes with Unify.match, which matches types +-- Everything is very similar for terms + +-- Interesting examples: +-- Consider matching +-- \x->f against \f->f +-- When we meet the lambdas we must remember to rename f to f' in the +-- second expresion. The RnEnv2 does that. +-- +-- Consider matching +-- forall a. \b->b against \a->3 +-- We must rename the \a. Otherwise when we meet the lambdas we +-- might substitute [a/b] in the template, and then erroneously +-- succeed in matching what looks like the template variable 'a' against 3. 
+ +-- The Var case follows closely what happens in Unify.match +match menv subst@(tv_subst, id_subst) (Var v1) e2 + | v1 `elemVarSet` me_tmpls menv + = case lookupVarEnv id_subst v1' of + Nothing | any (inRnEnvR rn_env) (varSetElems (exprFreeVars e2)) + -> Nothing -- Occurs check failure + -- e.g. match forall a. (\x-> a x) against (\y. y y) + + | otherwise + -> Just (tv_subst, extendVarEnv id_subst v1 e2) + + Just e2' | tcEqExprX (nukeRnEnvL rn_env) e2' e2 + -> Just subst + + other -> Nothing + + | otherwise -- v1 is not a template variable + = case e2 of + Var v2 | v1' == rnOccR rn_env v2 -> Just subst + other -> Nothing + where + rn_env = me_env menv + v1' = rnOccL rn_env v1 + +-- Here is another important rule: if the term being matched is a +-- variable, we expand it so long as its unfolding is a WHNF +-- (Its occurrence information is not necessarily up to date, +-- so we don't use it.) +match menv subst e1 (Var v2) + | isCheapUnfolding unfolding + = match menv subst e1 (unfoldingTemplate unfolding) + where + unfolding = idUnfolding v2 + +match menv subst (Lit lit1) (Lit lit2) + | lit1 == lit2 + = Just subst + +match menv subst (App f1 a1) (App f2 a2) + = do { subst' <- match menv subst f1 f2 + ; match menv subst' a1 a2 } + +match menv subst (Lam x1 e1) (Lam x2 e2) + = match menv' subst e1 e2 + where + menv' = menv { me_env = rnBndr2 (me_env menv) x1 x2 } + +-- This rule does eta expansion +-- (\x.M) ~ N iff M ~ N x +match menv subst (Lam x1 e1) e2 + = match menv' subst e1 (App e2 (varToCoreExpr new_x)) + where + (rn_env', new_x) = rnBndrL (me_env menv) x1 + menv' = menv { me_env = rn_env' } + +-- Eta expansion the other way +-- M ~ (\y.N) iff M y ~ N +match menv subst e1 (Lam x2 e2) + = match menv' subst (App e1 (varToCoreExpr new_x)) e2 + where + (rn_env', new_x) = rnBndrR (me_env menv) x2 + menv' = menv { me_env = rn_env' } + +match menv subst (Case e1 x1 ty1 alts1) (Case e2 x2 ty2 alts2) + = do { subst1 <- match_ty menv subst ty1 ty2 + ; subst2 <- match menv subst1 e1 e2 + ; let menv' = menv { me_env = rnBndr2 (me_env menv) x2 x2 } + ; match_alts menv' subst2 alts1 alts2 -- Alts are both sorted + } + +match menv subst (Type ty1) (Type ty2) + = match_ty menv subst ty1 ty2 + +match menv subst (Note (Coerce to1 from1) e1) (Note (Coerce to2 from2) e2) + = do { subst1 <- match_ty menv subst to1 to2 + ; subst2 <- match_ty menv subst1 from1 from2 + ; match menv subst2 e1 e2 } + +-- This is an interesting rule: we simply ignore lets in the +-- term being matched against! The unfolding inside it is (by assumption) +-- already inside any occurrences of the bound variables, so we'll expand +-- them when we encounter them. +match menv subst e1 (Let (NonRec x2 r2) e2) + = match menv' subst e1 e2 + where + menv' = menv { me_env = fst (rnBndrR (me_env menv) x2) } + -- It's important to do this renaming. For example: + -- Matching + -- forall f,x,xs. f (x:xs) + -- against + -- f (let y = e in (y:[])) + -- We must not get success with x->y! Instead, we + -- need an occurs check. 
+ +-- Everything else fails +match menv subst e1 e2 = Nothing + +------------------------------------------ +match_alts :: MatchEnv + -> SubstEnv + -> [CoreAlt] -- Template + -> [CoreAlt] -- Target + -> Maybe SubstEnv +match_alts menv subst [] [] + = return subst +match_alts menv subst ((c1,vs1,r1):alts1) ((c2,vs2,r2):alts2) + | c1 == c2 + = do { subst1 <- match menv' subst r1 r2 + ; match_alts menv subst1 alts1 alts2 } + where + menv' :: MatchEnv + menv' = menv { me_env = rnBndrs2 (me_env menv) vs1 vs2 } + +match_alts menv subst alts1 alts2 + = Nothing +\end{code} + +Matching Core types: use the matcher in TcType. +Notice that we treat newtypes as opaque. For example, suppose +we have a specialised version of a function at a newtype, say + newtype T = MkT Int +We only want to replace (f T) with f', not (f Int). + +\begin{code} +------------------------------------------ +match_ty menv (tv_subst, id_subst) ty1 ty2 + = do { tv_subst' <- Unify.ruleMatchTyX menv tv_subst ty1 ty2 + ; return (tv_subst', id_subst) } +\end{code} + + +%************************************************************************ +%* * +\subsection{Checking a program for failing rule applications} +%* * +%************************************************************************ + +----------------------------------------------------- + Game plan +----------------------------------------------------- + +We want to know what sites have rules that could have fired but didn't. +This pass runs over the tree (without changing it) and reports such. + +NB: we assume that this follows a run of the simplifier, so every Id +occurrence (including occurrences of imported Ids) is decorated with +all its (active) rules. No need to construct a rule base or anything +like that. + +\begin{code} +ruleCheckProgram :: CompilerPhase -> String -> [CoreBind] -> SDoc +-- Report partial matches for rules beginning +-- with the specified string +ruleCheckProgram phase rule_pat binds + | isEmptyBag results + = text "Rule check results: no rule application sites" + | otherwise + = vcat [text "Rule check results:", + line, + vcat [ p $$ line | p <- bagToList results ] + ] + where + results = unionManyBags (map (ruleCheckBind (phase, rule_pat)) binds) + line = text (replicate 20 '-') + +type RuleCheckEnv = (CompilerPhase, String) -- Phase and Pattern + +ruleCheckBind :: RuleCheckEnv -> CoreBind -> Bag SDoc + -- The Bag returned has one SDoc for each call site found +ruleCheckBind env (NonRec b r) = ruleCheck env r +ruleCheckBind env (Rec prs) = unionManyBags [ruleCheck env r | (b,r) <- prs] + +ruleCheck :: RuleCheckEnv -> CoreExpr -> Bag SDoc +ruleCheck env (Var v) = emptyBag +ruleCheck env (Lit l) = emptyBag +ruleCheck env (Type ty) = emptyBag +ruleCheck env (App f a) = ruleCheckApp env (App f a) [] +ruleCheck env (Note n e) = ruleCheck env e +ruleCheck env (Let bd e) = ruleCheckBind env bd `unionBags` ruleCheck env e +ruleCheck env (Lam b e) = ruleCheck env e +ruleCheck env (Case e _ _ as) = ruleCheck env e `unionBags` + unionManyBags [ruleCheck env r | (_,_,r) <- as] + +ruleCheckApp env (App f a) as = ruleCheck env a `unionBags` ruleCheckApp env f (a:as) +ruleCheckApp env (Var f) as = ruleCheckFun env f as +ruleCheckApp env other as = ruleCheck env other +\end{code} + +\begin{code} +ruleCheckFun :: RuleCheckEnv -> Id -> [CoreExpr] -> Bag SDoc +-- Produce a report for all rules matching the predicate +-- saying why it doesn't match the specified application + +ruleCheckFun (phase, pat) fn args + | null name_match_rules = emptyBag + | otherwise = 
unitBag (ruleAppCheck_help phase fn args name_match_rules) + where + name_match_rules = filter match (idCoreRules fn) + match rule = pat `isPrefixOf` unpackFS (ruleName rule) + +ruleAppCheck_help :: CompilerPhase -> Id -> [CoreExpr] -> [CoreRule] -> SDoc +ruleAppCheck_help phase fn args rules + = -- The rules match the pattern, so we want to print something + vcat [text "Expression:" <+> ppr (mkApps (Var fn) args), + vcat (map check_rule rules)] + where + n_args = length args + i_args = args `zip` [1::Int ..] + rough_args = map roughTopName args + + check_rule rule = rule_herald rule <> colon <+> rule_info rule + + rule_herald (BuiltinRule { ru_name = name }) + = ptext SLIT("Builtin rule") <+> doubleQuotes (ftext name) + rule_herald (Rule { ru_name = name }) + = ptext SLIT("Rule") <+> doubleQuotes (ftext name) + + rule_info rule + | Just _ <- matchRule noBlackList emptyInScopeSet args rough_args rule + = text "matches (which is very peculiar!)" + + rule_info (BuiltinRule {}) = text "does not match" + + rule_info (Rule { ru_name = name, ru_act = act, + ru_bndrs = rule_bndrs, ru_args = rule_args}) + | not (isActive phase act) = text "active only in later phase" + | n_args < n_rule_args = text "too few arguments" + | n_mismatches == n_rule_args = text "no arguments match" + | n_mismatches == 0 = text "all arguments match (considered individually), but rule as a whole does not" + | otherwise = text "arguments" <+> ppr mismatches <+> text "do not match (1-indexing)" + where + n_rule_args = length rule_args + n_mismatches = length mismatches + mismatches = [i | (rule_arg, (arg,i)) <- rule_args `zip` i_args, + not (isJust (match_fn rule_arg arg))] + + lhs_fvs = exprsFreeVars rule_args -- Includes template tyvars + match_fn rule_arg arg = match menv emptySubstEnv rule_arg arg + where + in_scope = lhs_fvs `unionVarSet` exprFreeVars arg + menv = ME { me_env = mkRnEnv2 (mkInScopeSet in_scope) + , me_tmpls = mkVarSet rule_bndrs } +\end{code} + diff --git a/compiler/specialise/SpecConstr.lhs b/compiler/specialise/SpecConstr.lhs new file mode 100644 index 0000000000..74944da983 --- /dev/null +++ b/compiler/specialise/SpecConstr.lhs @@ -0,0 +1,625 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[SpecConstr]{Specialise over constructors} + +\begin{code} +module SpecConstr( + specConstrProgram + ) where + +#include "HsVersions.h" + +import CoreSyn +import CoreLint ( showPass, endPass ) +import CoreUtils ( exprType, tcEqExpr, mkPiTypes ) +import CoreFVs ( exprsFreeVars ) +import CoreSubst ( Subst, mkSubst, substExpr ) +import CoreTidy ( tidyRules ) +import PprCore ( pprRules ) +import WwLib ( mkWorkerArgs ) +import DataCon ( dataConRepArity, isVanillaDataCon ) +import Type ( tyConAppArgs, tyVarsOfTypes ) +import Unify ( coreRefineTys ) +import Id ( Id, idName, idType, isDataConWorkId_maybe, + mkUserLocal, mkSysLocal ) +import Var ( Var ) +import VarEnv +import VarSet +import Name ( nameOccName, nameSrcLoc ) +import Rules ( addIdSpecialisations, mkLocalRule, rulesOfBinds ) +import OccName ( mkSpecOcc ) +import ErrUtils ( dumpIfSet_dyn ) +import DynFlags ( DynFlags, DynFlag(..) ) +import BasicTypes ( Activation(..) 
) +import Maybes ( orElse ) +import Util ( mapAccumL, lengthAtLeast, notNull ) +import List ( nubBy, partition ) +import UniqSupply +import Outputable +import FastString +\end{code} + +----------------------------------------------------- + Game plan +----------------------------------------------------- + +Consider + drop n [] = [] + drop 0 xs = [] + drop n (x:xs) = drop (n-1) xs + +After the first time round, we could pass n unboxed. This happens in +numerical code too. Here's what it looks like in Core: + + drop n xs = case xs of + [] -> [] + (y:ys) -> case n of + I# n# -> case n# of + 0 -> [] + _ -> drop (I# (n# -# 1#)) xs + +Notice that the recursive call has an explicit constructor as argument. +Noticing this, we can make a specialised version of drop + + RULE: drop (I# n#) xs ==> drop' n# xs + + drop' n# xs = let n = I# n# in ...orig RHS... + +Now the simplifier will apply the specialisation in the rhs of drop', giving + + drop' n# xs = case xs of + [] -> [] + (y:ys) -> case n# of + 0 -> [] + _ -> drop (n# -# 1#) xs + +Much better! + +We'd also like to catch cases where a parameter is carried along unchanged, +but evaluated each time round the loop: + + f i n = if i>0 || i>n then i else f (i*2) n + +Here f isn't strict in n, but we'd like to avoid evaluating it each iteration. +In Core, by the time we've w/wd (f is strict in i) we get + + f i# n = case i# ># 0 of + False -> I# i# + True -> case n of n' { I# n# -> + case i# ># n# of + False -> I# i# + True -> f (i# *# 2#) n' + +At the call to f, we see that the argument, n is know to be (I# n#), +and n is evaluated elsewhere in the body of f, so we can play the same +trick as above. However we don't want to do that if the boxed version +of n is needed (else we'd avoid the eval but pay more for re-boxing n). +So in this case we want that the *only* uses of n are in case statements. + + +So we look for + +* A self-recursive function. Ignore mutual recursion for now, + because it's less common, and the code is simpler for self-recursion. + +* EITHER + + a) At a recursive call, one or more parameters is an explicit + constructor application + AND + That same parameter is scrutinised by a case somewhere in + the RHS of the function + + OR + + b) At a recursive call, one or more parameters has an unfolding + that is an explicit constructor application + AND + That same parameter is scrutinised by a case somewhere in + the RHS of the function + AND + Those are the only uses of the parameter + + +There's a bit of a complication with type arguments. If the call +site looks like + + f p = ...f ((:) [a] x xs)... + +then our specialised function look like + + f_spec x xs = let p = (:) [a] x xs in ....as before.... + +This only makes sense if either + a) the type variable 'a' is in scope at the top of f, or + b) the type variable 'a' is an argument to f (and hence fs) + +Actually, (a) may hold for value arguments too, in which case +we may not want to pass them. Supose 'x' is in scope at f's +defn, but xs is not. Then we'd like + + f_spec xs = let p = (:) [a] x xs in ....as before.... + +Similarly (b) may hold too. If x is already an argument at the +call, no need to pass it again. + +Finally, if 'a' is not in scope at the call site, we could abstract +it as we do the term variables: + + f_spec a x xs = let p = (:) [a] x xs in ...as before... + +So the grand plan is: + + * abstract the call site to a constructor-only pattern + e.g. 
C x (D (f p) (g q)) ==> C s1 (D s2 s3) + + * Find the free variables of the abstracted pattern + + * Pass these variables, less any that are in scope at + the fn defn. + + +NOTICE that we only abstract over variables that are not in scope, +so we're in no danger of shadowing variables used in "higher up" +in f_spec's RHS. + + +%************************************************************************ +%* * +\subsection{Top level wrapper stuff} +%* * +%************************************************************************ + +\begin{code} +specConstrProgram :: DynFlags -> UniqSupply -> [CoreBind] -> IO [CoreBind] +specConstrProgram dflags us binds + = do + showPass dflags "SpecConstr" + + let (binds', _) = initUs us (go emptyScEnv binds) + + endPass dflags "SpecConstr" Opt_D_dump_spec binds' + + dumpIfSet_dyn dflags Opt_D_dump_rules "Top-level specialisations" + (pprRules (tidyRules emptyTidyEnv (rulesOfBinds binds'))) + + return binds' + where + go env [] = returnUs [] + go env (bind:binds) = scBind env bind `thenUs` \ (env', _, bind') -> + go env' binds `thenUs` \ binds' -> + returnUs (bind' : binds') +\end{code} + + +%************************************************************************ +%* * +\subsection{Environment: goes downwards} +%* * +%************************************************************************ + +\begin{code} +data ScEnv = SCE { scope :: VarEnv HowBound, + -- Binds all non-top-level variables in scope + + cons :: ConstrEnv + } + +type ConstrEnv = IdEnv ConValue +data ConValue = CV AltCon [CoreArg] + -- Variables known to be bound to a constructor + -- in a particular case alternative + +refineConstrEnv :: Subst -> ConstrEnv -> ConstrEnv +-- The substitution is a type substitution only +refineConstrEnv subst env = mapVarEnv refine_con_value env + where + refine_con_value (CV con args) = CV con (map (substExpr subst) args) + +emptyScEnv = SCE { scope = emptyVarEnv, cons = emptyVarEnv } + +data HowBound = RecFun -- These are the recursive functions for which + -- we seek interesting call patterns + + | RecArg -- These are those functions' arguments; we are + -- interested to see if those arguments are scrutinised + + | Other -- We track all others so we know what's in scope + -- This is used in spec_one to check what needs to be + -- passed as a parameter and what is in scope at the + -- function definition site + +instance Outputable HowBound where + ppr RecFun = text "RecFun" + ppr RecArg = text "RecArg" + ppr Other = text "Other" + +lookupScopeEnv env v = lookupVarEnv (scope env) v + +extendBndrs env bndrs = env { scope = extendVarEnvList (scope env) [(b,Other) | b <- bndrs] } +extendBndr env bndr = env { scope = extendVarEnv (scope env) bndr Other } + + -- When we encounter + -- case scrut of b + -- C x y -> ... + -- we want to bind b, and perhaps scrut too, to (C x y) +extendCaseBndrs :: ScEnv -> Id -> CoreExpr -> AltCon -> [Var] -> ScEnv +extendCaseBndrs env case_bndr scrut DEFAULT alt_bndrs + = extendBndrs env (case_bndr : alt_bndrs) + +extendCaseBndrs env case_bndr scrut con@(LitAlt lit) alt_bndrs + = ASSERT( null alt_bndrs ) extendAlt env case_bndr scrut (CV con []) [] + +extendCaseBndrs env case_bndr scrut con@(DataAlt data_con) alt_bndrs + | isVanillaDataCon data_con + = extendAlt env case_bndr scrut (CV con vanilla_args) alt_bndrs + + | otherwise -- GADT + = extendAlt env1 case_bndr scrut (CV con gadt_args) alt_bndrs + where + vanilla_args = map Type (tyConAppArgs (idType case_bndr)) ++ + map varToCoreExpr alt_bndrs + + gadt_args = map (substExpr subst . 
varToCoreExpr) alt_bndrs + + (alt_tvs, _) = span isTyVar alt_bndrs + Just (tv_subst, is_local) = coreRefineTys data_con alt_tvs (idType case_bndr) + subst = mkSubst in_scope tv_subst emptyVarEnv -- No Id substitition + in_scope = mkInScopeSet (tyVarsOfTypes (varEnvElts tv_subst)) + + env1 | is_local = env + | otherwise = env { cons = refineConstrEnv subst (cons env) } + + + +extendAlt :: ScEnv -> Id -> CoreExpr -> ConValue -> [Var] -> ScEnv +extendAlt env case_bndr scrut val alt_bndrs + = let + env1 = SCE { scope = extendVarEnvList (scope env) [(b,Other) | b <- case_bndr : alt_bndrs], + cons = extendVarEnv (cons env) case_bndr val } + in + case scrut of + Var v -> -- Bind the scrutinee in the ConstrEnv if it's a variable + -- Also forget if the scrutinee is a RecArg, because we're + -- now in the branch of a case, and we don't want to + -- record a non-scrutinee use of v if we have + -- case v of { (a,b) -> ...(f v)... } + SCE { scope = extendVarEnv (scope env1) v Other, + cons = extendVarEnv (cons env1) v val } + other -> env1 + + -- When we encounter a recursive function binding + -- f = \x y -> ... + -- we want to extend the scope env with bindings + -- that record that f is a RecFn and x,y are RecArgs +extendRecBndr env fn bndrs + = env { scope = scope env `extendVarEnvList` + ((fn,RecFun): [(bndr,RecArg) | bndr <- bndrs]) } +\end{code} + + +%************************************************************************ +%* * +\subsection{Usage information: flows upwards} +%* * +%************************************************************************ + +\begin{code} +data ScUsage + = SCU { + calls :: !(IdEnv ([Call])), -- Calls + -- The functions are a subset of the + -- RecFuns in the ScEnv + + occs :: !(IdEnv ArgOcc) -- Information on argument occurrences + } -- The variables are a subset of the + -- RecArg in the ScEnv + +type Call = (ConstrEnv, [CoreArg]) + -- The arguments of the call, together with the + -- env giving the constructor bindings at the call site + +nullUsage = SCU { calls = emptyVarEnv, occs = emptyVarEnv } + +combineUsage u1 u2 = SCU { calls = plusVarEnv_C (++) (calls u1) (calls u2), + occs = plusVarEnv_C combineOcc (occs u1) (occs u2) } + +combineUsages [] = nullUsage +combineUsages us = foldr1 combineUsage us + +data ArgOcc = CaseScrut + | OtherOcc + | Both + +instance Outputable ArgOcc where + ppr CaseScrut = ptext SLIT("case-scrut") + ppr OtherOcc = ptext SLIT("other-occ") + ppr Both = ptext SLIT("case-scrut and other") + +combineOcc CaseScrut CaseScrut = CaseScrut +combineOcc OtherOcc OtherOcc = OtherOcc +combineOcc _ _ = Both +\end{code} + + +%************************************************************************ +%* * +\subsection{The main recursive function} +%* * +%************************************************************************ + +The main recursive function gathers up usage information, and +creates specialised versions of functions. 
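
(For orientation, here is a small, self-contained sketch of how the per-argument
occurrence information above is joined as usage flows upwards.  It is an
illustration only, not part of this patch: the module and the names ArgOcc',
combineOcc' and combineOccs are invented, and Data.Map stands in for VarEnv.)

    module ArgOccDemo where

    import qualified Data.Map.Strict as M

    -- Mirrors ArgOcc: how an argument has been seen so far.
    data ArgOcc' = CaseScrut' | OtherOcc' | Both'
      deriving (Eq, Show)

    -- Mirrors combineOcc: two different kinds of occurrence collapse to Both'.
    combineOcc' :: ArgOcc' -> ArgOcc' -> ArgOcc'
    combineOcc' CaseScrut' CaseScrut' = CaseScrut'
    combineOcc' OtherOcc'  OtherOcc'  = OtherOcc'
    combineOcc' _          _          = Both'

    -- Mirrors the plusVarEnv_C call in combineUsage: join the occurrence
    -- maps coming from two sub-expressions.
    combineOccs :: M.Map String ArgOcc' -> M.Map String ArgOcc' -> M.Map String ArgOcc'
    combineOccs = M.unionWith combineOcc'

For example, a parameter scrutinised by a case in one sub-expression and used
some other way in another ends up as Both', which is exactly the situation in
which bndr_usg_ok below insists on seeing an explicit constructor application.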
+ +\begin{code} +scExpr :: ScEnv -> CoreExpr -> UniqSM (ScUsage, CoreExpr) + -- The unique supply is needed when we invent + -- a new name for the specialised function and its args + +scExpr env e@(Type t) = returnUs (nullUsage, e) +scExpr env e@(Lit l) = returnUs (nullUsage, e) +scExpr env e@(Var v) = returnUs (varUsage env v OtherOcc, e) +scExpr env (Note n e) = scExpr env e `thenUs` \ (usg,e') -> + returnUs (usg, Note n e') +scExpr env (Lam b e) = scExpr (extendBndr env b) e `thenUs` \ (usg,e') -> + returnUs (usg, Lam b e') + +scExpr env (Case scrut b ty alts) + = sc_scrut scrut `thenUs` \ (scrut_usg, scrut') -> + mapAndUnzipUs sc_alt alts `thenUs` \ (alts_usgs, alts') -> + returnUs (combineUsages alts_usgs `combineUsage` scrut_usg, + Case scrut' b ty alts') + where + sc_scrut e@(Var v) = returnUs (varUsage env v CaseScrut, e) + sc_scrut e = scExpr env e + + sc_alt (con,bs,rhs) = scExpr env1 rhs `thenUs` \ (usg,rhs') -> + returnUs (usg, (con,bs,rhs')) + where + env1 = extendCaseBndrs env b scrut con bs + +scExpr env (Let bind body) + = scBind env bind `thenUs` \ (env', bind_usg, bind') -> + scExpr env' body `thenUs` \ (body_usg, body') -> + returnUs (bind_usg `combineUsage` body_usg, Let bind' body') + +scExpr env e@(App _ _) + = let + (fn, args) = collectArgs e + in + mapAndUnzipUs (scExpr env) args `thenUs` \ (usgs, args') -> + let + arg_usg = combineUsages usgs + fn_usg | Var f <- fn, + Just RecFun <- lookupScopeEnv env f + = SCU { calls = unitVarEnv f [(cons env, args)], + occs = emptyVarEnv } + | otherwise + = nullUsage + in + returnUs (arg_usg `combineUsage` fn_usg, mkApps fn args') + -- Don't bother to look inside fn; + -- it's almost always a variable + +---------------------- +scBind :: ScEnv -> CoreBind -> UniqSM (ScEnv, ScUsage, CoreBind) +scBind env (Rec [(fn,rhs)]) + | notNull val_bndrs + = scExpr env_fn_body body `thenUs` \ (usg, body') -> + let + SCU { calls = calls, occs = occs } = usg + in + specialise env fn bndrs body usg `thenUs` \ (rules, spec_prs) -> + returnUs (extendBndr env fn, -- For the body of the letrec, just + -- extend the env with Other to record + -- that it's in scope; no funny RecFun business + SCU { calls = calls `delVarEnv` fn, occs = occs `delVarEnvList` val_bndrs}, + Rec ((fn `addIdSpecialisations` rules, mkLams bndrs body') : spec_prs)) + where + (bndrs,body) = collectBinders rhs + val_bndrs = filter isId bndrs + env_fn_body = extendRecBndr env fn bndrs + +scBind env (Rec prs) + = mapAndUnzipUs do_one prs `thenUs` \ (usgs, prs') -> + returnUs (extendBndrs env (map fst prs), combineUsages usgs, Rec prs') + where + do_one (bndr,rhs) = scExpr env rhs `thenUs` \ (usg, rhs') -> + returnUs (usg, (bndr,rhs')) + +scBind env (NonRec bndr rhs) + = scExpr env rhs `thenUs` \ (usg, rhs') -> + returnUs (extendBndr env bndr, usg, NonRec bndr rhs') + +---------------------- +varUsage env v use + | Just RecArg <- lookupScopeEnv env v = SCU { calls = emptyVarEnv, + occs = unitVarEnv v use } + | otherwise = nullUsage +\end{code} + + +%************************************************************************ +%* * +\subsection{The specialiser} +%* * +%************************************************************************ + +\begin{code} +specialise :: ScEnv + -> Id -- Functionn + -> [CoreBndr] -> CoreExpr -- Its RHS + -> ScUsage -- Info on usage + -> UniqSM ([CoreRule], -- Rules + [(Id,CoreExpr)]) -- Bindings + +specialise env fn bndrs body (SCU {calls=calls, occs=occs}) + = getUs `thenUs` \ us -> + let + all_calls = lookupVarEnv calls fn `orElse` [] + + good_calls 
:: [[CoreArg]] + good_calls = [ pats + | (con_env, call_args) <- all_calls, + call_args `lengthAtLeast` n_bndrs, -- App is saturated + let call = (bndrs `zip` call_args), + any (good_arg con_env occs) call, -- At least one arg is a constr app + let (_, pats) = argsToPats con_env us call_args + ] + in + mapAndUnzipUs (spec_one env fn (mkLams bndrs body)) + (nubBy same_call good_calls `zip` [1..]) + where + n_bndrs = length bndrs + same_call as1 as2 = and (zipWith tcEqExpr as1 as2) + +--------------------- +good_arg :: ConstrEnv -> IdEnv ArgOcc -> (CoreBndr, CoreArg) -> Bool +good_arg con_env arg_occs (bndr, arg) + = case is_con_app_maybe con_env arg of + Just _ -> bndr_usg_ok arg_occs bndr arg + other -> False + +bndr_usg_ok :: IdEnv ArgOcc -> Var -> CoreArg -> Bool +bndr_usg_ok arg_occs bndr arg + = case lookupVarEnv arg_occs bndr of + Just CaseScrut -> True -- Used only by case scrutiny + Just Both -> case arg of -- Used by case and elsewhere + App _ _ -> True -- so the arg should be an explicit con app + other -> False + other -> False -- Not used, or used wonkily + + +--------------------- +spec_one :: ScEnv + -> Id -- Function + -> CoreExpr -- Rhs of the original function + -> ([CoreArg], Int) + -> UniqSM (CoreRule, (Id,CoreExpr)) -- Rule and binding + +-- spec_one creates a specialised copy of the function, together +-- with a rule for using it. I'm very proud of how short this +-- function is, considering what it does :-). + +{- + Example + + In-scope: a, x::a + f = /\b \y::[(a,b)] -> ....f (b,c) ((:) (a,(b,c)) (x,v) (h w))... + [c::*, v::(b,c) are presumably bound by the (...) part] + ==> + f_spec = /\ b c \ v::(b,c) hw::[(a,(b,c))] -> + (...entire RHS of f...) (b,c) ((:) (a,(b,c)) (x,v) hw) + + RULE: forall b::* c::*, -- Note, *not* forall a, x + v::(b,c), + hw::[(a,(b,c))] . + + f (b,c) ((:) (a,(b,c)) (x,v) hw) = f_spec b c v hw +-} + +spec_one env fn rhs (pats, rule_number) + = getUniqueUs `thenUs` \ spec_uniq -> + let + fn_name = idName fn + fn_loc = nameSrcLoc fn_name + spec_occ = mkSpecOcc (nameOccName fn_name) + pat_fvs = varSetElems (exprsFreeVars pats) + vars_to_bind = filter not_avail pat_fvs + not_avail v = not (v `elemVarEnv` scope env) + -- Put the type variables first; the type of a term + -- variable may mention a type variable + (tvs, ids) = partition isTyVar vars_to_bind + bndrs = tvs ++ ids + spec_body = mkApps rhs pats + body_ty = exprType spec_body + + (spec_lam_args, spec_call_args) = mkWorkerArgs bndrs body_ty + -- Usual w/w hack to avoid generating + -- a spec_rhs of unlifted type and no args + + rule_name = mkFastString ("SC:" ++ showSDoc (ppr fn <> int rule_number)) + spec_rhs = mkLams spec_lam_args spec_body + spec_id = mkUserLocal spec_occ spec_uniq (mkPiTypes spec_lam_args body_ty) fn_loc + rule_rhs = mkVarApps (Var spec_id) spec_call_args + rule = mkLocalRule rule_name specConstrActivation fn_name bndrs pats rule_rhs + in + returnUs (rule, (spec_id, spec_rhs)) + +-- In which phase should the specialise-constructor rules be active? +-- Originally I made them always-active, but Manuel found that +-- this defeated some clever user-written rules. So Plan B +-- is to make them active only in Phase 0; after all, currently, +-- the specConstr transformation is only run after the simplifier +-- has reached Phase 0. 
In general one would want it to be +-- flag-controllable, but for now I'm leaving it baked in +-- [SLPJ Oct 01] +specConstrActivation :: Activation +specConstrActivation = ActiveAfter 0 -- Baked in; see comments above +\end{code} + +%************************************************************************ +%* * +\subsection{Argument analysis} +%* * +%************************************************************************ + +This code deals with analysing call-site arguments to see whether +they are constructor applications. + +\begin{code} + -- argToPat takes an actual argument, and returns an abstracted + -- version, consisting of just the "constructor skeleton" of the + -- argument, with non-constructor sub-expression replaced by new + -- placeholder variables. For example: + -- C a (D (f x) (g y)) ==> C p1 (D p2 p3) + +argToPat :: ConstrEnv -> UniqSupply -> CoreArg -> (UniqSupply, CoreExpr) +argToPat env us (Type ty) + = (us, Type ty) + +argToPat env us arg + | Just (CV dc args) <- is_con_app_maybe env arg + = let + (us',args') = argsToPats env us args + in + (us', mk_con_app dc args') + +argToPat env us (Var v) -- Don't uniqify existing vars, + = (us, Var v) -- so that we can spot when we pass them twice + +argToPat env us arg + = (us1, Var (mkSysLocal FSLIT("sc") (uniqFromSupply us2) (exprType arg))) + where + (us1,us2) = splitUniqSupply us + +argsToPats :: ConstrEnv -> UniqSupply -> [CoreArg] -> (UniqSupply, [CoreExpr]) +argsToPats env us args = mapAccumL (argToPat env) us args +\end{code} + + +\begin{code} +is_con_app_maybe :: ConstrEnv -> CoreExpr -> Maybe ConValue +is_con_app_maybe env (Var v) + = lookupVarEnv env v + -- You might think we could look in the idUnfolding here + -- but that doesn't take account of which branch of a + -- case we are in, which is the whole point + +is_con_app_maybe env (Lit lit) + = Just (CV (LitAlt lit) []) + +is_con_app_maybe env expr + = case collectArgs expr of + (Var fun, args) | Just con <- isDataConWorkId_maybe fun, + args `lengthAtLeast` dataConRepArity con + -- Might be > because the arity excludes type args + -> Just (CV (DataAlt con) args) + + other -> Nothing + +mk_con_app :: AltCon -> [CoreArg] -> CoreExpr +mk_con_app (LitAlt lit) [] = Lit lit +mk_con_app (DataAlt con) args = mkConApp con args +\end{code} diff --git a/compiler/specialise/Specialise.lhs b/compiler/specialise/Specialise.lhs new file mode 100644 index 0000000000..0e66b0bc78 --- /dev/null +++ b/compiler/specialise/Specialise.lhs @@ -0,0 +1,1236 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-1998 +% +\section[Specialise]{Stamping out overloading, and (optionally) polymorphism} + +\begin{code} +module Specialise ( specProgram ) where + +#include "HsVersions.h" + +import DynFlags ( DynFlags, DynFlag(..) 
) +import Id ( Id, idName, idType, mkUserLocal ) +import TcType ( Type, mkTyVarTy, tcSplitSigmaTy, + tyVarsOfTypes, tyVarsOfTheta, isClassPred, + tcCmpType, isUnLiftedType + ) +import CoreSubst ( Subst, mkEmptySubst, extendTvSubstList, lookupIdSubst, + substBndr, substBndrs, substTy, substInScope, + cloneIdBndr, cloneIdBndrs, cloneRecIdBndrs + ) +import VarSet +import VarEnv +import CoreSyn +import CoreUtils ( applyTypeToArgs, mkPiTypes ) +import CoreFVs ( exprFreeVars, exprsFreeVars, idRuleVars ) +import CoreTidy ( tidyRules ) +import CoreLint ( showPass, endPass ) +import Rules ( addIdSpecialisations, mkLocalRule, lookupRule, emptyRuleBase, rulesOfBinds ) +import PprCore ( pprRules ) +import UniqSupply ( UniqSupply, + UniqSM, initUs_, thenUs, returnUs, getUniqueUs, + getUs, mapUs + ) +import Name ( nameOccName, mkSpecOcc, getSrcLoc ) +import MkId ( voidArgId, realWorldPrimId ) +import FiniteMap +import Maybes ( catMaybes, maybeToBool ) +import ErrUtils ( dumpIfSet_dyn ) +import BasicTypes ( Activation( AlwaysActive ) ) +import Bag +import List ( partition ) +import Util ( zipEqual, zipWithEqual, cmpList, lengthIs, + equalLength, lengthAtLeast, notNull ) +import Outputable +import FastString + +infixr 9 `thenSM` +\end{code} + +%************************************************************************ +%* * +\subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]} +%* * +%************************************************************************ + +These notes describe how we implement specialisation to eliminate +overloading. + +The specialisation pass works on Core +syntax, complete with all the explicit dictionary application, +abstraction and construction as added by the type checker. The +existing type checker remains largely as it is. + +One important thought: the {\em types} passed to an overloaded +function, and the {\em dictionaries} passed are mutually redundant. +If the same function is applied to the same type(s) then it is sure to +be applied to the same dictionary(s)---or rather to the same {\em +values}. (The arguments might look different but they will evaluate +to the same value.) + +Second important thought: we know that we can make progress by +treating dictionary arguments as static and worth specialising on. So +we can do without binding-time analysis, and instead specialise on +dictionary arguments and no others. + +The basic idea +~~~~~~~~~~~~~~ +Suppose we have + + let f = <f_rhs> + in <body> + +and suppose f is overloaded. + +STEP 1: CALL-INSTANCE COLLECTION + +We traverse <body>, accumulating all applications of f to types and +dictionaries. + +(Might there be partial applications, to just some of its types and +dictionaries? In principle yes, but in practice the type checker only +builds applications of f to all its types and dictionaries, so partial +applications could only arise as a result of transformation, and even +then I think it's unlikely. In any case, we simply don't accumulate such +partial applications.) + + +STEP 2: EQUIVALENCES + +So now we have a collection of calls to f: + f t1 t2 d1 d2 + f t3 t4 d3 d4 + ... +Notice that f may take several type arguments. To avoid ambiguity, we +say that f is called at type t1/t2 and t3/t4. + +We take equivalence classes using equality of the *types* (ignoring +the dictionary args, which as mentioned previously are redundant). 
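
(As a concrete illustration of STEP 2, here is a toy, runnable sketch of
grouping calls into equivalence classes keyed only on their type arguments.
It is an illustration only, not part of this patch: the module and names are
invented, and strings stand in for Types and dictionary expressions, which is
why plain Ord suffices here where the real pass needs tcCmpType.)

    module CallEquivDemo where

    import qualified Data.Map.Strict as M

    type TyArg   = String        -- stand-in for a type argument
    type DictArg = String        -- stand-in for a dictionary argument

    -- One recorded call of the overloaded function f.
    data Call = Call { callTys :: [TyArg], callDicts :: [DictArg] }

    -- Equivalence classes keyed on the type arguments alone; the dictionary
    -- arguments within a class are interchangeable, so any representative
    -- will do (here, the first call seen).
    classifyCalls :: [Call] -> M.Map [TyArg] Call
    classifyCalls = M.fromListWith (\_new old -> old)
                  . map (\c -> (callTys c, c))

So the two calls (f t1 t2 d1 d2) and (f t1 t2 d3 d4) land in the same class,
and only one specialisation of f at t1/t2 is generated in STEP 3.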
+ +STEP 3: SPECIALISATION + +For each equivalence class, choose a representative (f t1 t2 d1 d2), +and create a local instance of f, defined thus: + + f@t1/t2 = <f_rhs> t1 t2 d1 d2 + +f_rhs presumably has some big lambdas and dictionary lambdas, so lots +of simplification will now result. However we don't actually *do* that +simplification. Rather, we leave it for the simplifier to do. If we +*did* do it, though, we'd get more call instances from the specialised +RHS. We can work out what they are by instantiating the call-instance +set from f's RHS with the types t1, t2. + +Add this new id to f's IdInfo, to record that f has a specialised version. + +Before doing any of this, check that f's IdInfo doesn't already +tell us about an existing instance of f at the required type/s. +(This might happen if specialisation was applied more than once, or +it might arise from user SPECIALIZE pragmas.) + +Recursion +~~~~~~~~~ +Wait a minute! What if f is recursive? Then we can't just plug in +its right-hand side, can we? + +But it's ok. The type checker *always* creates non-recursive definitions +for overloaded recursive functions. For example: + + f x = f (x+x) -- Yes I know its silly + +becomes + + f a (d::Num a) = let p = +.sel a d + in + letrec fl (y::a) = fl (p y y) + in + fl + +We still have recusion for non-overloaded functions which we +speciailise, but the recursive call should get specialised to the +same recursive version. + + +Polymorphism 1 +~~~~~~~~~~~~~~ + +All this is crystal clear when the function is applied to *constant +types*; that is, types which have no type variables inside. But what if +it is applied to non-constant types? Suppose we find a call of f at type +t1/t2. There are two possibilities: + +(a) The free type variables of t1, t2 are in scope at the definition point +of f. In this case there's no problem, we proceed just as before. A common +example is as follows. Here's the Haskell: + + g y = let f x = x+x + in f y + f y + +After typechecking we have + + g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x + in +.sel a d (f a d y) (f a d y) + +Notice that the call to f is at type type "a"; a non-constant type. +Both calls to f are at the same type, so we can specialise to give: + + g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x + in +.sel a d (f@a y) (f@a y) + + +(b) The other case is when the type variables in the instance types +are *not* in scope at the definition point of f. The example we are +working with above is a good case. There are two instances of (+.sel a d), +but "a" is not in scope at the definition of +.sel. Can we do anything? +Yes, we can "common them up", a sort of limited common sub-expression deal. +This would give: + + g a (d::Num a) (y::a) = let +.sel@a = +.sel a d + f@a (x::a) = +.sel@a x x + in +.sel@a (f@a y) (f@a y) + +This can save work, and can't be spotted by the type checker, because +the two instances of +.sel weren't originally at the same type. + +Further notes on (b) + +* There are quite a few variations here. For example, the defn of + +.sel could be floated ouside the \y, to attempt to gain laziness. + It certainly mustn't be floated outside the \d because the d has to + be in scope too. + +* We don't want to inline f_rhs in this case, because +that will duplicate code. Just commoning up the call is the point. + +* Nothing gets added to +.sel's IdInfo. + +* Don't bother unless the equivalence class has more than one item! + +Not clear whether this is all worth it. 
It is of course OK to +simply discard call-instances when passing a big lambda. + +Polymorphism 2 -- Overloading +~~~~~~~~~~~~~~ +Consider a function whose most general type is + + f :: forall a b. Ord a => [a] -> b -> b + +There is really no point in making a version of g at Int/Int and another +at Int/Bool, because it's only instancing the type variable "a" which +buys us any efficiency. Since g is completely polymorphic in b there +ain't much point in making separate versions of g for the different +b types. + +That suggests that we should identify which of g's type variables +are constrained (like "a") and which are unconstrained (like "b"). +Then when taking equivalence classes in STEP 2, we ignore the type args +corresponding to unconstrained type variable. In STEP 3 we make +polymorphic versions. Thus: + + f@t1/ = /\b -> <f_rhs> t1 b d1 d2 + +We do this. + + +Dictionary floating +~~~~~~~~~~~~~~~~~~~ +Consider this + + f a (d::Num a) = let g = ... + in + ...(let d1::Ord a = Num.Ord.sel a d in g a d1)... + +Here, g is only called at one type, but the dictionary isn't in scope at the +definition point for g. Usually the type checker would build a +definition for d1 which enclosed g, but the transformation system +might have moved d1's defn inward. Solution: float dictionary bindings +outwards along with call instances. + +Consider + + f x = let g p q = p==q + h r s = (r+s, g r s) + in + h x x + + +Before specialisation, leaving out type abstractions we have + + f df x = let g :: Eq a => a -> a -> Bool + g dg p q = == dg p q + h :: Num a => a -> a -> (a, Bool) + h dh r s = let deq = eqFromNum dh + in (+ dh r s, g deq r s) + in + h df x x + +After specialising h we get a specialised version of h, like this: + + h' r s = let deq = eqFromNum df + in (+ df r s, g deq r s) + +But we can't naively make an instance for g from this, because deq is not in scope +at the defn of g. Instead, we have to float out the (new) defn of deq +to widen its scope. Notice that this floating can't be done in advance -- it only +shows up when specialisation is done. + +User SPECIALIZE pragmas +~~~~~~~~~~~~~~~~~~~~~~~ +Specialisation pragmas can be digested by the type checker, and implemented +by adding extra definitions along with that of f, in the same way as before + + f@t1/t2 = <f_rhs> t1 t2 d1 d2 + +Indeed the pragmas *have* to be dealt with by the type checker, because +only it knows how to build the dictionaries d1 and d2! For example + + g :: Ord a => [a] -> [a] + {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-} + +Here, the specialised version of g is an application of g's rhs to the +Ord dictionary for (Tree Int), which only the type checker can conjure +up. There might not even *be* one, if (Tree Int) is not an instance of +Ord! (All the other specialision has suitable dictionaries to hand +from actual calls.) + +Problem. The type checker doesn't have to hand a convenient <f_rhs>, because +it is buried in a complex (as-yet-un-desugared) binding group. +Maybe we should say + + f@t1/t2 = f* t1 t2 d1 d2 + +where f* is the Id f with an IdInfo which says "inline me regardless!". +Indeed all the specialisation could be done in this way. +That in turn means that the simplifier has to be prepared to inline absolutely +any in-scope let-bound thing. 
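
(For concreteness, here is what such a pragma looks like at the user level,
mirroring the Tree Int example above.  It is an invented, self-contained
example, not part of this patch; the derived Ord instance is what allows the
type checker to conjure up the Ord (Tree Int) dictionary that the specialised
copy needs.)

    module SpecPragmaDemo where

    import Data.List (sort)

    data Tree a = Leaf | Node (Tree a) a (Tree a)
      deriving (Eq, Ord, Show)

    -- The pragma asks for a copy of sortTrees specialised to a = Tree Int;
    -- the dictionary for Ord (Tree Int) is built by the type checker, not
    -- by the specialiser.
    {-# SPECIALIZE sortTrees :: [Tree Int] -> [Tree Int] #-}
    sortTrees :: Ord a => [a] -> [a]
    sortTrees = sort

Had Tree not derived Ord, there would be no Ord (Tree Int) instance and the
pragma would be rejected, which is the point being made above.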
+ + +Again, the pragma should permit polymorphism in unconstrained variables: + + h :: Ord a => [a] -> b -> b + {-# SPECIALIZE h :: [Int] -> b -> b #-} + +We *insist* that all overloaded type variables are specialised to ground types, +(and hence there can be no context inside a SPECIALIZE pragma). +We *permit* unconstrained type variables to be specialised to + - a ground type + - or left as a polymorphic type variable +but nothing in between. So + + {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-} + +is *illegal*. (It can be handled, but it adds complication, and gains the +programmer nothing.) + + +SPECIALISING INSTANCE DECLARATIONS +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Consider + + instance Foo a => Foo [a] where + ... + {-# SPECIALIZE instance Foo [Int] #-} + +The original instance decl creates a dictionary-function +definition: + + dfun.Foo.List :: forall a. Foo a -> Foo [a] + +The SPECIALIZE pragma just makes a specialised copy, just as for +ordinary function definitions: + + dfun.Foo.List@Int :: Foo [Int] + dfun.Foo.List@Int = dfun.Foo.List Int dFooInt + +The information about what instance of the dfun exist gets added to +the dfun's IdInfo in the same way as a user-defined function too. + + +Automatic instance decl specialisation? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Can instance decls be specialised automatically? It's tricky. +We could collect call-instance information for each dfun, but +then when we specialised their bodies we'd get new call-instances +for ordinary functions; and when we specialised their bodies, we might get +new call-instances of the dfuns, and so on. This all arises because of +the unrestricted mutual recursion between instance decls and value decls. + +Still, there's no actual problem; it just means that we may not do all +the specialisation we could theoretically do. + +Furthermore, instance decls are usually exported and used non-locally, +so we'll want to compile enough to get those specialisations done. + +Lastly, there's no such thing as a local instance decl, so we can +survive solely by spitting out *usage* information, and then reading that +back in as a pragma when next compiling the file. So for now, +we only specialise instance decls in response to pragmas. + + +SPITTING OUT USAGE INFORMATION +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To spit out usage information we need to traverse the code collecting +call-instance information for all imported (non-prelude?) functions +and data types. Then we equivalence-class it and spit it out. + +This is done at the top-level when all the call instances which escape +must be for imported functions and data types. + +*** Not currently done *** + + +Partial specialisation by pragmas +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +What about partial specialisation: + + k :: (Ord a, Eq b) => [a] -> b -> b -> [a] + {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-} + +or even + + {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-} + +Seems quite reasonable. Similar things could be done with instance decls: + + instance (Foo a, Foo b) => Foo (a,b) where + ... + {-# SPECIALIZE instance Foo a => Foo (a,Int) #-} + {-# SPECIALIZE instance Foo b => Foo (Int,b) #-} + +Ho hum. Things are complex enough without this. I pass. + + +Requirements for the simplifer +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +The simplifier has to be able to take advantage of the specialisation. + +* When the simplifier finds an application of a polymorphic f, it looks in +f's IdInfo in case there is a suitable instance to call instead. 
This converts + + f t1 t2 d1 d2 ===> f_t1_t2 + +Note that the dictionaries get eaten up too! + +* Dictionary selection operations on constant dictionaries must be + short-circuited: + + +.sel Int d ===> +Int + +The obvious way to do this is in the same way as other specialised +calls: +.sel has inside it some IdInfo which tells that if it's applied +to the type Int then it should eat a dictionary and transform to +Int. + +In short, dictionary selectors need IdInfo inside them for constant +methods. + +* Exactly the same applies if a superclass dictionary is being + extracted: + + Eq.sel Int d ===> dEqInt + +* Something similar applies to dictionary construction too. Suppose +dfun.Eq.List is the function taking a dictionary for (Eq a) to +one for (Eq [a]). Then we want + + dfun.Eq.List Int d ===> dEq.List_Int + +Where does the Eq [Int] dictionary come from? It is built in +response to a SPECIALIZE pragma on the Eq [a] instance decl. + +In short, dfun Ids need IdInfo with a specialisation for each +constant instance of their instance declaration. + +All this uses a single mechanism: the SpecEnv inside an Id + + +What does the specialisation IdInfo look like? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The SpecEnv of an Id maps a list of types (the template) to an expression + + [Type] |-> Expr + +For example, if f has this SpecInfo: + + [Int, a] -> \d:Ord Int. f' a + +it means that we can replace the call + + f Int t ===> (\d. f' t) + +This chucks one dictionary away and proceeds with the +specialised version of f, namely f'. + + +What can't be done this way? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +There is no way, post-typechecker, to get a dictionary for (say) +Eq a from a dictionary for Eq [a]. So if we find + + ==.sel [t] d + +we can't transform to + + eqList (==.sel t d') + +where + eqList :: (a->a->Bool) -> [a] -> [a] -> Bool + +Of course, we currently have no way to automatically derive +eqList, nor to connect it to the Eq [a] instance decl, but you +can imagine that it might somehow be possible. Taking advantage +of this is permanently ruled out. + +Still, this is no great hardship, because we intend to eliminate +overloading altogether anyway! + + + +A note about non-tyvar dictionaries +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Some Ids have types like + + forall a,b,c. Eq a -> Ord [a] -> tau + +This seems curious at first, because we usually only have dictionary +args whose types are of the form (C a) where a is a type variable. +But this doesn't hold for the functions arising from instance decls, +which sometimes get arguements with types of form (C (T a)) for some +type constructor T. + +Should we specialise wrt this compound-type dictionary? We used to say +"no", saying: + "This is a heuristic judgement, as indeed is the fact that we + specialise wrt only dictionaries. We choose *not* to specialise + wrt compound dictionaries because at the moment the only place + they show up is in instance decls, where they are simply plugged + into a returned dictionary. So nothing is gained by specialising + wrt them." + +But it is simpler and more uniform to specialise wrt these dicts too; +and in future GHC is likely to support full fledged type signatures +like + f ;: Eq [(a,b)] => ... + + +%************************************************************************ +%* * +\subsubsection{The new specialiser} +%* * +%************************************************************************ + +Our basic game plan is this. 
For let(rec) bound function + f :: (C a, D c) => (a,b,c,d) -> Bool + +* Find any specialised calls of f, (f ts ds), where + ts are the type arguments t1 .. t4, and + ds are the dictionary arguments d1 .. d2. + +* Add a new definition for f1 (say): + + f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2 + + Note that we abstract over the unconstrained type arguments. + +* Add the mapping + + [t1,b,t3,d] |-> \d1 d2 -> f1 b d + + to the specialisations of f. This will be used by the + simplifier to replace calls + (f t1 t2 t3 t4) da db + by + (\d1 d1 -> f1 t2 t4) da db + + All the stuff about how many dictionaries to discard, and what types + to apply the specialised function to, are handled by the fact that the + SpecEnv contains a template for the result of the specialisation. + +We don't build *partial* specialisations for f. For example: + + f :: Eq a => a -> a -> Bool + {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-} + +Here, little is gained by making a specialised copy of f. +There's a distinct danger that the specialised version would +first build a dictionary for (Eq b, Eq c), and then select the (==) +method from it! Even if it didn't, not a great deal is saved. + +We do, however, generate polymorphic, but not overloaded, specialisations: + + f :: Eq a => [a] -> b -> b -> b + {#- SPECIALISE f :: [Int] -> b -> b -> b #-} + +Hence, the invariant is this: + + *** no specialised version is overloaded *** + + +%************************************************************************ +%* * +\subsubsection{The exported function} +%* * +%************************************************************************ + +\begin{code} +specProgram :: DynFlags -> UniqSupply -> [CoreBind] -> IO [CoreBind] +specProgram dflags us binds + = do + showPass dflags "Specialise" + + let binds' = initSM us (go binds `thenSM` \ (binds', uds') -> + returnSM (dumpAllDictBinds uds' binds')) + + endPass dflags "Specialise" Opt_D_dump_spec binds' + + dumpIfSet_dyn dflags Opt_D_dump_rules "Top-level specialisations" + (pprRules (tidyRules emptyTidyEnv (rulesOfBinds binds'))) + + return binds' + where + -- We need to start with a Subst that knows all the things + -- that are in scope, so that the substitution engine doesn't + -- accidentally re-use a unique that's already in use + -- Easiest thing is to do it all at once, as if all the top-level + -- decls were mutually recursive + top_subst = mkEmptySubst (mkInScopeSet (mkVarSet (bindersOfBinds binds))) + + go [] = returnSM ([], emptyUDs) + go (bind:binds) = go binds `thenSM` \ (binds', uds) -> + specBind top_subst bind uds `thenSM` \ (bind', uds') -> + returnSM (bind' ++ binds', uds') +\end{code} + +%************************************************************************ +%* * +\subsubsection{@specExpr@: the main function} +%* * +%************************************************************************ + +\begin{code} +specVar :: Subst -> Id -> CoreExpr +specVar subst v = lookupIdSubst subst v + +specExpr :: Subst -> CoreExpr -> SpecM (CoreExpr, UsageDetails) +-- We carry a substitution down: +-- a) we must clone any binding that might flaot outwards, +-- to avoid name clashes +-- b) we carry a type substitution to use when analysing +-- the RHS of specialised bindings (no type-let!) 
+ +---------------- First the easy cases -------------------- +specExpr subst (Type ty) = returnSM (Type (substTy subst ty), emptyUDs) +specExpr subst (Var v) = returnSM (specVar subst v, emptyUDs) +specExpr subst (Lit lit) = returnSM (Lit lit, emptyUDs) + +specExpr subst (Note note body) + = specExpr subst body `thenSM` \ (body', uds) -> + returnSM (Note (specNote subst note) body', uds) + + +---------------- Applications might generate a call instance -------------------- +specExpr subst expr@(App fun arg) + = go expr [] + where + go (App fun arg) args = specExpr subst arg `thenSM` \ (arg', uds_arg) -> + go fun (arg':args) `thenSM` \ (fun', uds_app) -> + returnSM (App fun' arg', uds_arg `plusUDs` uds_app) + + go (Var f) args = case specVar subst f of + Var f' -> returnSM (Var f', mkCallUDs subst f' args) + e' -> returnSM (e', emptyUDs) -- I don't expect this! + go other args = specExpr subst other + +---------------- Lambda/case require dumping of usage details -------------------- +specExpr subst e@(Lam _ _) + = specExpr subst' body `thenSM` \ (body', uds) -> + let + (filtered_uds, body'') = dumpUDs bndrs' uds body' + in + returnSM (mkLams bndrs' body'', filtered_uds) + where + (bndrs, body) = collectBinders e + (subst', bndrs') = substBndrs subst bndrs + -- More efficient to collect a group of binders together all at once + -- and we don't want to split a lambda group with dumped bindings + +specExpr subst (Case scrut case_bndr ty alts) + = specExpr subst scrut `thenSM` \ (scrut', uds_scrut) -> + mapAndCombineSM spec_alt alts `thenSM` \ (alts', uds_alts) -> + returnSM (Case scrut' case_bndr' (substTy subst ty) alts', uds_scrut `plusUDs` uds_alts) + where + (subst_alt, case_bndr') = substBndr subst case_bndr + -- No need to clone case binder; it can't float like a let(rec) + + spec_alt (con, args, rhs) + = specExpr subst_rhs rhs `thenSM` \ (rhs', uds) -> + let + (uds', rhs'') = dumpUDs args uds rhs' + in + returnSM ((con, args', rhs''), uds') + where + (subst_rhs, args') = substBndrs subst_alt args + +---------------- Finally, let is the interesting case -------------------- +specExpr subst (Let bind body) + = -- Clone binders + cloneBindSM subst bind `thenSM` \ (rhs_subst, body_subst, bind') -> + + -- Deal with the body + specExpr body_subst body `thenSM` \ (body', body_uds) -> + + -- Deal with the bindings + specBind rhs_subst bind' body_uds `thenSM` \ (binds', uds) -> + + -- All done + returnSM (foldr Let body' binds', uds) + +-- Must apply the type substitution to coerceions +specNote subst (Coerce t1 t2) = Coerce (substTy subst t1) (substTy subst t2) +specNote subst note = note +\end{code} + +%************************************************************************ +%* * +\subsubsection{Dealing with a binding} +%* * +%************************************************************************ + +\begin{code} +specBind :: Subst -- Use this for RHSs + -> CoreBind + -> UsageDetails -- Info on how the scope of the binding + -> SpecM ([CoreBind], -- New bindings + UsageDetails) -- And info to pass upstream + +specBind rhs_subst bind body_uds + = specBindItself rhs_subst bind (calls body_uds) `thenSM` \ (bind', bind_uds) -> + let + bndrs = bindersOf bind + all_uds = zapCalls bndrs (body_uds `plusUDs` bind_uds) + -- It's important that the `plusUDs` is this way round, + -- because body_uds may bind dictionaries that are + -- used in the calls passed to specDefn. So the + -- dictionary bindings in bind_uds may mention + -- dictionaries bound in body_uds. 
+ in + case splitUDs bndrs all_uds of + + (_, ([],[])) -- This binding doesn't bind anything needed + -- in the UDs, so put the binding here + -- This is the case for most non-dict bindings, except + -- for the few that are mentioned in a dict binding + -- that is floating upwards in body_uds + -> returnSM ([bind'], all_uds) + + (float_uds, (dict_binds, calls)) -- This binding is needed in the UDs, so float it out + -> returnSM ([], float_uds `plusUDs` mkBigUD bind' dict_binds calls) + + +-- A truly gruesome function +mkBigUD bind@(NonRec _ _) dbs calls + = -- Common case: non-recursive and no specialisations + -- (if there were any specialistions it would have been made recursive) + MkUD { dict_binds = listToBag (mkDB bind : dbs), + calls = listToCallDetails calls } + +mkBigUD bind dbs calls + = -- General case + MkUD { dict_binds = unitBag (mkDB (Rec (bind_prs bind ++ dbsToPairs dbs))), + -- Make a huge Rec + calls = listToCallDetails calls } + where + bind_prs (NonRec b r) = [(b,r)] + bind_prs (Rec prs) = prs + + dbsToPairs [] = [] + dbsToPairs ((bind,_):dbs) = bind_prs bind ++ dbsToPairs dbs + +-- specBindItself deals with the RHS, specialising it according +-- to the calls found in the body (if any) +specBindItself rhs_subst (NonRec bndr rhs) call_info + = specDefn rhs_subst call_info (bndr,rhs) `thenSM` \ ((bndr',rhs'), spec_defns, spec_uds) -> + let + new_bind | null spec_defns = NonRec bndr' rhs' + | otherwise = Rec ((bndr',rhs'):spec_defns) + -- bndr' mentions the spec_defns in its SpecEnv + -- Not sure why we couln't just put the spec_defns first + in + returnSM (new_bind, spec_uds) + +specBindItself rhs_subst (Rec pairs) call_info + = mapSM (specDefn rhs_subst call_info) pairs `thenSM` \ stuff -> + let + (pairs', spec_defns_s, spec_uds_s) = unzip3 stuff + spec_defns = concat spec_defns_s + spec_uds = plusUDList spec_uds_s + new_bind = Rec (spec_defns ++ pairs') + in + returnSM (new_bind, spec_uds) + + +specDefn :: Subst -- Subst to use for RHS + -> CallDetails -- Info on how it is used in its scope + -> (Id, CoreExpr) -- The thing being bound and its un-processed RHS + -> SpecM ((Id, CoreExpr), -- The thing and its processed RHS + -- the Id may now have specialisations attached + [(Id,CoreExpr)], -- Extra, specialised bindings + UsageDetails -- Stuff to fling upwards from the RHS and its + ) -- specialised versions + +specDefn subst calls (fn, rhs) + -- The first case is the interesting one + | rhs_tyvars `lengthIs` n_tyvars -- Rhs of fn's defn has right number of big lambdas + && rhs_bndrs `lengthAtLeast` n_dicts -- and enough dict args + && notNull calls_for_me -- And there are some calls to specialise + +-- At one time I tried not specialising small functions +-- but sometimes there are big functions marked INLINE +-- that we'd like to specialise. 
In particular, dictionary +-- functions, which Marcin is keen to inline +-- && not (certainlyWillInline fn) -- And it's not small + -- If it's small, it's better just to inline + -- it than to construct lots of specialisations + = -- Specialise the body of the function + specExpr subst rhs `thenSM` \ (rhs', rhs_uds) -> + + -- Make a specialised version for each call in calls_for_me + mapSM spec_call calls_for_me `thenSM` \ stuff -> + let + (spec_defns, spec_uds, spec_rules) = unzip3 stuff + + fn' = addIdSpecialisations fn spec_rules + in + returnSM ((fn',rhs'), + spec_defns, + rhs_uds `plusUDs` plusUDList spec_uds) + + | otherwise -- No calls or RHS doesn't fit our preconceptions + = specExpr subst rhs `thenSM` \ (rhs', rhs_uds) -> + returnSM ((fn, rhs'), [], rhs_uds) + + where + fn_type = idType fn + (tyvars, theta, _) = tcSplitSigmaTy fn_type + n_tyvars = length tyvars + n_dicts = length theta + + (rhs_tyvars, rhs_ids, rhs_body) + = collectTyAndValBinders (dropInline rhs) + -- It's important that we "see past" any INLINE pragma + -- else we'll fail to specialise an INLINE thing + + rhs_dicts = take n_dicts rhs_ids + rhs_bndrs = rhs_tyvars ++ rhs_dicts + body = mkLams (drop n_dicts rhs_ids) rhs_body + -- Glue back on the non-dict lambdas + + calls_for_me = case lookupFM calls fn of + Nothing -> [] + Just cs -> fmToList cs + + ---------------------------------------------------------- + -- Specialise to one particular call pattern + spec_call :: (CallKey, ([DictExpr], VarSet)) -- Call instance + -> SpecM ((Id,CoreExpr), -- Specialised definition + UsageDetails, -- Usage details from specialised body + CoreRule) -- Info for the Id's SpecEnv + spec_call (CallKey call_ts, (call_ds, call_fvs)) + = ASSERT( call_ts `lengthIs` n_tyvars && call_ds `lengthIs` n_dicts ) + -- Calls are only recorded for properly-saturated applications + + -- Suppose f's defn is f = /\ a b c d -> \ d1 d2 -> rhs + -- Supppose the call is for f [Just t1, Nothing, Just t3, Nothing] [dx1, dx2] + + -- Construct the new binding + -- f1 = SUBST[a->t1,c->t3, d1->d1', d2->d2'] (/\ b d -> rhs) + -- PLUS the usage-details + -- { d1' = dx1; d2' = dx2 } + -- where d1', d2' are cloned versions of d1,d2, with the type substitution applied. + -- + -- Note that the substitution is applied to the whole thing. + -- This is convenient, but just slightly fragile. Notably: + -- * There had better be no name clashes in a/b/c/d + -- + let + -- poly_tyvars = [b,d] in the example above + -- spec_tyvars = [a,c] + -- ty_args = [t1,b,t3,d] + poly_tyvars = [tv | (tv, Nothing) <- rhs_tyvars `zip` call_ts] + spec_tyvars = [tv | (tv, Just _) <- rhs_tyvars `zip` call_ts] + ty_args = zipWithEqual "spec_call" mk_ty_arg rhs_tyvars call_ts + where + mk_ty_arg rhs_tyvar Nothing = Type (mkTyVarTy rhs_tyvar) + mk_ty_arg rhs_tyvar (Just ty) = Type ty + rhs_subst = extendTvSubstList subst (spec_tyvars `zip` [ty | Just ty <- call_ts]) + in + cloneBinders rhs_subst rhs_dicts `thenSM` \ (rhs_subst', rhs_dicts') -> + let + inst_args = ty_args ++ map Var rhs_dicts' + + -- Figure out the type of the specialised function + body_ty = applyTypeToArgs rhs fn_type inst_args + (lam_args, app_args) -- Add a dummy argument if body_ty is unlifted + | isUnLiftedType body_ty -- C.f. 
WwLib.mkWorkerArgs + = (poly_tyvars ++ [voidArgId], poly_tyvars ++ [realWorldPrimId]) + | otherwise = (poly_tyvars, poly_tyvars) + spec_id_ty = mkPiTypes lam_args body_ty + in + newIdSM fn spec_id_ty `thenSM` \ spec_f -> + specExpr rhs_subst' (mkLams lam_args body) `thenSM` \ (spec_rhs, rhs_uds) -> + let + -- The rule to put in the function's specialisation is: + -- forall b,d, d1',d2'. f t1 b t3 d d1' d2' = f1 b d + spec_env_rule = mkLocalRule (mkFastString ("SPEC " ++ showSDoc (ppr fn))) + AlwaysActive (idName fn) + (poly_tyvars ++ rhs_dicts') + inst_args + (mkVarApps (Var spec_f) app_args) + + -- Add the { d1' = dx1; d2' = dx2 } usage stuff + final_uds = foldr addDictBind rhs_uds (my_zipEqual "spec_call" rhs_dicts' call_ds) + + -- NOTE: we don't add back in any INLINE pragma on the RHS, so even if + -- the original function said INLINE, the specialised copies won't. + -- The idea is that the point of inlining was precisely to specialise + -- the function at its call site, and that's not so important for the + -- specialised copies. But it still smells like an ad hoc decision. + + in + returnSM ((spec_f, spec_rhs), + final_uds, + spec_env_rule) + + where + my_zipEqual doc xs ys + | not (equalLength xs ys) = pprPanic "my_zipEqual" (ppr xs $$ ppr ys $$ (ppr fn <+> ppr call_ts) $$ ppr rhs) + | otherwise = zipEqual doc xs ys + +dropInline :: CoreExpr -> CoreExpr +dropInline (Note InlineMe rhs) = rhs +dropInline rhs = rhs +\end{code} + +%************************************************************************ +%* * +\subsubsection{UsageDetails and suchlike} +%* * +%************************************************************************ + +\begin{code} +data UsageDetails + = MkUD { + dict_binds :: !(Bag DictBind), + -- Floated dictionary bindings + -- The order is important; + -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1 + -- (Remember, Bags preserve order in GHC.) + + calls :: !CallDetails + } + +type DictBind = (CoreBind, VarSet) + -- The set is the free vars of the binding + -- both tyvars and dicts + +type DictExpr = CoreExpr + +emptyUDs = MkUD { dict_binds = emptyBag, calls = emptyFM } + +type ProtoUsageDetails = ([DictBind], + [(Id, CallKey, ([DictExpr], VarSet))] + ) + +------------------------------------------------------------ +type CallDetails = FiniteMap Id CallInfo +newtype CallKey = CallKey [Maybe Type] -- Nothing => unconstrained type argument +type CallInfo = FiniteMap CallKey + ([DictExpr], VarSet) -- Dict args and the vars of the whole + -- call (including tyvars) + -- [*not* include the main id itself, of course] + -- The finite maps eliminate duplicates + -- The list of types and dictionaries is guaranteed to + -- match the type of f + +-- Type isn't an instance of Ord, so that we can control which +-- instance we use. That's tiresome here. 
Oh well +instance Eq CallKey where + k1 == k2 = case k1 `compare` k2 of { EQ -> True; other -> False } + +instance Ord CallKey where + compare (CallKey k1) (CallKey k2) = cmpList cmp k1 k2 + where + cmp Nothing Nothing = EQ + cmp Nothing (Just t2) = LT + cmp (Just t1) Nothing = GT + cmp (Just t1) (Just t2) = tcCmpType t1 t2 + +unionCalls :: CallDetails -> CallDetails -> CallDetails +unionCalls c1 c2 = plusFM_C plusFM c1 c2 + +singleCall :: Id -> [Maybe Type] -> [DictExpr] -> CallDetails +singleCall id tys dicts + = unitFM id (unitFM (CallKey tys) (dicts, call_fvs)) + where + call_fvs = exprsFreeVars dicts `unionVarSet` tys_fvs + tys_fvs = tyVarsOfTypes (catMaybes tys) + -- The type args (tys) are guaranteed to be part of the dictionary + -- types, because they are just the constrained types, + -- and the dictionary is therefore sure to be bound + -- inside the binding for any type variables free in the type; + -- hence it's safe to neglect tyvars free in tys when making + -- the free-var set for this call + -- BUT I don't trust this reasoning; play safe and include tys_fvs + -- + -- We don't include the 'id' itself. + +listToCallDetails calls + = foldr (unionCalls . mk_call) emptyFM calls + where + mk_call (id, tys, dicts_w_fvs) = unitFM id (unitFM tys dicts_w_fvs) + -- NB: the free vars of the call are provided + +callDetailsToList calls = [ (id,tys,dicts) + | (id,fm) <- fmToList calls, + (tys, dicts) <- fmToList fm + ] + +mkCallUDs subst f args + | null theta + || not (all isClassPred theta) + -- Only specialise if all overloading is on class params. + -- In ptic, with implicit params, the type args + -- *don't* say what the value of the implicit param is! + || not (spec_tys `lengthIs` n_tyvars) + || not ( dicts `lengthIs` n_dicts) + || maybeToBool (lookupRule (\act -> True) (substInScope subst) emptyRuleBase f args) + -- There's already a rule covering this call. A typical case + -- is where there's an explicit user-provided rule. Then + -- we don't want to create a specialised version + -- of the function that overlaps. + = emptyUDs -- Not overloaded, or no specialisation wanted + + | otherwise + = MkUD {dict_binds = emptyBag, + calls = singleCall f spec_tys dicts + } + where + (tyvars, theta, _) = tcSplitSigmaTy (idType f) + constrained_tyvars = tyVarsOfTheta theta + n_tyvars = length tyvars + n_dicts = length theta + + spec_tys = [mk_spec_ty tv ty | (tv, Type ty) <- tyvars `zip` args] + dicts = [dict_expr | (_, dict_expr) <- theta `zip` (drop n_tyvars args)] + + mk_spec_ty tyvar ty + | tyvar `elemVarSet` constrained_tyvars = Just ty + | otherwise = Nothing + +------------------------------------------------------------ +plusUDs :: UsageDetails -> UsageDetails -> UsageDetails +plusUDs (MkUD {dict_binds = db1, calls = calls1}) + (MkUD {dict_binds = db2, calls = calls2}) + = MkUD {dict_binds = d, calls = c} + where + d = db1 `unionBags` db2 + c = calls1 `unionCalls` calls2 + +plusUDList = foldr plusUDs emptyUDs + +-- zapCalls deletes calls to ids from uds +zapCalls ids uds = uds {calls = delListFromFM (calls uds) ids} + +mkDB bind = (bind, bind_fvs bind) + +bind_fvs (NonRec bndr rhs) = pair_fvs (bndr,rhs) +bind_fvs (Rec prs) = foldl delVarSet rhs_fvs bndrs + where + bndrs = map fst prs + rhs_fvs = unionVarSets (map pair_fvs prs) + +pair_fvs (bndr, rhs) = exprFreeVars rhs `unionVarSet` idRuleVars bndr + -- Don't forget variables mentioned in the + -- rules of the bndr. C.f. 
OccAnal.addRuleUsage + + +addDictBind (dict,rhs) uds = uds { dict_binds = mkDB (NonRec dict rhs) `consBag` dict_binds uds } + +dumpAllDictBinds (MkUD {dict_binds = dbs}) binds + = foldrBag add binds dbs + where + add (bind,_) binds = bind : binds + +dumpUDs :: [CoreBndr] + -> UsageDetails -> CoreExpr + -> (UsageDetails, CoreExpr) +dumpUDs bndrs uds body + = (free_uds, foldr add_let body dict_binds) + where + (free_uds, (dict_binds, _)) = splitUDs bndrs uds + add_let (bind,_) body = Let bind body + +splitUDs :: [CoreBndr] + -> UsageDetails + -> (UsageDetails, -- These don't mention the binders + ProtoUsageDetails) -- These do + +splitUDs bndrs uds@(MkUD {dict_binds = orig_dbs, + calls = orig_calls}) + + = if isEmptyBag dump_dbs && null dump_calls then + -- Common case: binder doesn't affect floats + (uds, ([],[])) + + else + -- Binders bind some of the fvs of the floats + (MkUD {dict_binds = free_dbs, + calls = listToCallDetails free_calls}, + (bagToList dump_dbs, dump_calls) + ) + + where + bndr_set = mkVarSet bndrs + + (free_dbs, dump_dbs, dump_idset) + = foldlBag dump_db (emptyBag, emptyBag, bndr_set) orig_dbs + -- Important that it's foldl not foldr; + -- we're accumulating the set of dumped ids in dump_set + + -- Filter out any calls that mention things that are being dumped + orig_call_list = callDetailsToList orig_calls + (dump_calls, free_calls) = partition captured orig_call_list + captured (id,tys,(dicts, fvs)) = fvs `intersectsVarSet` dump_idset + || id `elemVarSet` dump_idset + + dump_db (free_dbs, dump_dbs, dump_idset) db@(bind, fvs) + | dump_idset `intersectsVarSet` fvs -- Dump it + = (free_dbs, dump_dbs `snocBag` db, + extendVarSetList dump_idset (bindersOf bind)) + + | otherwise -- Don't dump it + = (free_dbs `snocBag` db, dump_dbs, dump_idset) +\end{code} + + +%************************************************************************ +%* * +\subsubsection{Boring helper functions} +%* * +%************************************************************************ + +\begin{code} +type SpecM a = UniqSM a + +thenSM = thenUs +returnSM = returnUs +getUniqSM = getUniqueUs +mapSM = mapUs +initSM = initUs_ + +mapAndCombineSM f [] = returnSM ([], emptyUDs) +mapAndCombineSM f (x:xs) = f x `thenSM` \ (y, uds1) -> + mapAndCombineSM f xs `thenSM` \ (ys, uds2) -> + returnSM (y:ys, uds1 `plusUDs` uds2) + +cloneBindSM :: Subst -> CoreBind -> SpecM (Subst, Subst, CoreBind) +-- Clone the binders of the bind; return new bind with the cloned binders +-- Return the substitution to use for RHSs, and the one to use for the body +cloneBindSM subst (NonRec bndr rhs) + = getUs `thenUs` \ us -> + let + (subst', bndr') = cloneIdBndr subst us bndr + in + returnUs (subst, subst', NonRec bndr' rhs) + +cloneBindSM subst (Rec pairs) + = getUs `thenUs` \ us -> + let + (subst', bndrs') = cloneRecIdBndrs subst us (map fst pairs) + in + returnUs (subst', subst', Rec (bndrs' `zip` map snd pairs)) + +cloneBinders subst bndrs + = getUs `thenUs` \ us -> + returnUs (cloneIdBndrs subst us bndrs) + +newIdSM old_id new_ty + = getUniqSM `thenSM` \ uniq -> + let + -- Give the new Id a similar occurrence name to the old one + name = idName old_id + new_id = mkUserLocal (mkSpecOcc (nameOccName name)) uniq new_ty (getSrcLoc name) + in + returnSM new_id +\end{code} + + + Old (but interesting) stuff about unboxed bindings + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +What should we do when a value is specialised to a *strict* unboxed value? 
+
+  map_*_* f (x:xs) = let h = f x
+                         t = map f xs
+                     in h:t
+
+Could convert let to case:
+
+  map_*_Int# f (x:xs) = case f x of h# ->
+                            let t = map f xs
+                            in h#:t
+
+This may be undesirable since it forces evaluation here, but the value
+may not be used in all branches of the body. In the general case this
+transformation is impossible since the mutual recursion in a letrec
+cannot be expressed as a case.
+
+There is also a problem with top-level unboxed values, since our
+implementation cannot handle unboxed values at the top level.
+
+Solution: Lift the binding of the unboxed value and extract it when it
+is used:
+
+  map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
+                            t = map f xs
+                        in case h of
+                           _Lift h# -> h#:t
+
+Now give it to the simplifier and the _Lifting will be optimised away.
+
+The benefit is that we have given the specialised "unboxed" values a
+very simple lifted semantics and then leave it up to the simplifier to
+optimise it --- knowing that the overheads will be removed in nearly
+all cases.
+
+In particular, the value will only be evaluated in the branches of the
+program which use it, rather than being forced at the point where the
+value is bound. For example:
+
+	filtermap_*_* p f (x:xs)
+	  = let h = f x
+	        t = ...
+	    in case p x of
+	        True  -> h:t
+	        False -> t
+   ==>
+	filtermap_*_Int# p f (x:xs)
+	  = let h = case (f x) of h# -> _Lift h#
+	        t = ...
+	    in case p x of
+	        True  -> case h of _Lift h#
+	                   -> h#:t
+	        False -> t
+
+The binding for h can still be inlined in the one branch and the
+_Lifting eliminated.
+
+
+Question: When won't the _Lifting be eliminated?
+
+Answer: When they are at the top-level (where it is necessary) or when
+inlining would duplicate work (or possibly code depending on
+options). However, the _Lifting will still be eliminated if the
+strictness analyser deems the lifted binding strict.
+
diff --git a/compiler/stgSyn/CoreToStg.lhs b/compiler/stgSyn/CoreToStg.lhs
new file mode 100644
index 0000000000..824cabaacb
--- /dev/null
+++ b/compiler/stgSyn/CoreToStg.lhs
@@ -0,0 +1,1107 @@
+%
+% (c) The GRASP/AQUA Project, Glasgow University, 1993-1998
+%
+\section[CoreToStg]{Converts Core to STG Syntax}
+
+And, as we have the info in hand, we may convert some lets to
+let-no-escapes.
+
+\begin{code}
+module CoreToStg ( coreToStg, coreExprToStg ) where
+
+#include "HsVersions.h"
+
+import CoreSyn
+import CoreUtils ( rhsIsStatic, manifestArity, exprType, findDefault )
+import StgSyn
+
+import Type
+import TyCon ( isAlgTyCon )
+import Id
+import Var ( Var, globalIdDetails, idType )
+import TyCon ( isUnboxedTupleTyCon, isPrimTyCon, isFunTyCon, isHiBootTyCon )
+#ifdef ILX
+import MkId ( unsafeCoerceId )
+#endif
+import IdInfo
+import DataCon
+import CostCentre ( noCCS )
+import VarSet
+import VarEnv
+import Maybes ( maybeToBool )
+import Name ( getOccName, isExternalName, nameOccName )
+import OccName ( occNameString, occNameFS )
+import BasicTypes ( Arity )
+import Packages ( HomeModules )
+import StaticFlags ( opt_RuntimeTypes )
+import Outputable
+
+infixr 9 `thenLne`
+\end{code}
+
+%************************************************************************
+%* *
+\subsection[live-vs-free-doc]{Documentation}
+%* *
+%************************************************************************
+
+(There is other relevant documentation in codeGen/CgLetNoEscape.)
+
+The actual Stg datatype is decorated with {\em live variable}
+information, as well as {\em free variable} information. The two are
+{\em not} the same.
Liveness is an operational property rather than a +semantic one. A variable is live at a particular execution point if +it can be referred to {\em directly} again. In particular, a dead +variable's stack slot (if it has one): +\begin{enumerate} +\item +should be stubbed to avoid space leaks, and +\item +may be reused for something else. +\end{enumerate} + +There ought to be a better way to say this. Here are some examples: +\begin{verbatim} + let v = [q] \[x] -> e + in + ...v... (but no q's) +\end{verbatim} + +Just after the `in', v is live, but q is dead. If the whole of that +let expression was enclosed in a case expression, thus: +\begin{verbatim} + case (let v = [q] \[x] -> e in ...v...) of + alts[...q...] +\end{verbatim} +(ie @alts@ mention @q@), then @q@ is live even after the `in'; because +we'll return later to the @alts@ and need it. + +Let-no-escapes make this a bit more interesting: +\begin{verbatim} + let-no-escape v = [q] \ [x] -> e + in + ...v... +\end{verbatim} +Here, @q@ is still live at the `in', because @v@ is represented not by +a closure but by the current stack state. In other words, if @v@ is +live then so is @q@. Furthermore, if @e@ mentions an enclosing +let-no-escaped variable, then {\em its} free variables are also live +if @v@ is. + +%************************************************************************ +%* * +\subsection[caf-info]{Collecting live CAF info} +%* * +%************************************************************************ + +In this pass we also collect information on which CAFs are live for +constructing SRTs (see SRT.lhs). + +A top-level Id has CafInfo, which is + + - MayHaveCafRefs, if it may refer indirectly to + one or more CAFs, or + - NoCafRefs if it definitely doesn't + +The CafInfo has already been calculated during the CoreTidy pass. + +During CoreToStg, we then pin onto each binding and case expression, a +list of Ids which represents the "live" CAFs at that point. The meaning +of "live" here is the same as for live variables, see above (which is +why it's convenient to collect CAF information here rather than elsewhere). + +The later SRT pass takes these lists of Ids and uses them to construct +the actual nested SRTs, and replaces the lists of Ids with (offset,length) +pairs. + + +Interaction of let-no-escape with SRTs [Sept 01] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Consider + + let-no-escape x = ...caf1...caf2... + in + ...x...x...x... + +where caf1,caf2 are CAFs. Since x doesn't have a closure, we +build SRTs just as if x's defn was inlined at each call site, and +that means that x's CAF refs get duplicated in the overall SRT. + +This is unlike ordinary lets, in which the CAF refs are not duplicated. + +We could fix this loss of (static) sharing by making a sort of pseudo-closure +for x, solely to put in the SRTs lower down. 
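As a concrete (hypothetical) illustration of that duplication, suppose caf1 and caf2 are top-level CAFs and j is a small binding that is only tail-called, so CoreToStg can turn it into a let-no-escape (all names here are made up for the example):

\begin{verbatim}
  f n = let j y = caf1 + caf2 + y
        in case n of
             0 -> j 1
             _ -> j n
\end{verbatim}

No closure is built for j, so its CAF references (caf1, caf2) go straight into the SRTs built for the body, just as if j's definition had been inlined at each call site; with an ordinary let they could instead be reached through j's closure and would not be duplicated.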
+ + +%************************************************************************ +%* * +\subsection[binds-StgVarInfo]{Setting variable info: top-level, binds, RHSs} +%* * +%************************************************************************ + +\begin{code} +coreToStg :: HomeModules -> [CoreBind] -> IO [StgBinding] +coreToStg hmods pgm + = return pgm' + where (_, _, pgm') = coreTopBindsToStg hmods emptyVarEnv pgm + +coreExprToStg :: CoreExpr -> StgExpr +coreExprToStg expr + = new_expr where (new_expr,_,_) = initLne emptyVarEnv (coreToStgExpr expr) + + +coreTopBindsToStg + :: HomeModules + -> IdEnv HowBound -- environment for the bindings + -> [CoreBind] + -> (IdEnv HowBound, FreeVarsInfo, [StgBinding]) + +coreTopBindsToStg hmods env [] = (env, emptyFVInfo, []) +coreTopBindsToStg hmods env (b:bs) + = (env2, fvs2, b':bs') + where + -- env accumulates down the list of binds, fvs accumulates upwards + (env1, fvs2, b' ) = coreTopBindToStg hmods env fvs1 b + (env2, fvs1, bs') = coreTopBindsToStg hmods env1 bs + + +coreTopBindToStg + :: HomeModules + -> IdEnv HowBound + -> FreeVarsInfo -- Info about the body + -> CoreBind + -> (IdEnv HowBound, FreeVarsInfo, StgBinding) + +coreTopBindToStg hmods env body_fvs (NonRec id rhs) + = let + env' = extendVarEnv env id how_bound + how_bound = LetBound TopLet $! manifestArity rhs + + (stg_rhs, fvs') = + initLne env ( + coreToTopStgRhs hmods body_fvs (id,rhs) `thenLne` \ (stg_rhs, fvs') -> + returnLne (stg_rhs, fvs') + ) + + bind = StgNonRec id stg_rhs + in + ASSERT2(manifestArity rhs == stgRhsArity stg_rhs, ppr id) + ASSERT2(consistentCafInfo id bind, ppr id) +-- WARN(not (consistent caf_info bind), ppr id <+> ppr cafs <+> ppCafInfo caf_info) + (env', fvs' `unionFVInfo` body_fvs, bind) + +coreTopBindToStg hmods env body_fvs (Rec pairs) + = let + (binders, rhss) = unzip pairs + + extra_env' = [ (b, LetBound TopLet $! manifestArity rhs) + | (b, rhs) <- pairs ] + env' = extendVarEnvList env extra_env' + + (stg_rhss, fvs') + = initLne env' ( + mapAndUnzipLne (coreToTopStgRhs hmods body_fvs) pairs + `thenLne` \ (stg_rhss, fvss') -> + let fvs' = unionFVInfos fvss' in + returnLne (stg_rhss, fvs') + ) + + bind = StgRec (zip binders stg_rhss) + in + ASSERT2(and [manifestArity rhs == stgRhsArity stg_rhs | (rhs,stg_rhs) <- rhss `zip` stg_rhss], ppr binders) + ASSERT2(consistentCafInfo (head binders) bind, ppr binders) + (env', fvs' `unionFVInfo` body_fvs, bind) + +#ifdef DEBUG +-- Assertion helper: this checks that the CafInfo on the Id matches +-- what CoreToStg has figured out about the binding's SRT. The +-- CafInfo will be exact in all cases except when CorePrep has +-- floated out a binding, in which case it will be approximate. 
+consistentCafInfo id bind + | occNameFS (nameOccName (idName id)) == FSLIT("sat") + = safe + | otherwise + = WARN (not exact, ppr id) safe + where + safe = id_marked_caffy || not binding_is_caffy + exact = id_marked_caffy == binding_is_caffy + id_marked_caffy = mayHaveCafRefs (idCafInfo id) + binding_is_caffy = stgBindHasCafRefs bind +#endif +\end{code} + +\begin{code} +coreToTopStgRhs + :: HomeModules + -> FreeVarsInfo -- Free var info for the scope of the binding + -> (Id,CoreExpr) + -> LneM (StgRhs, FreeVarsInfo) + +coreToTopStgRhs hmods scope_fv_info (bndr, rhs) + = coreToStgExpr rhs `thenLne` \ (new_rhs, rhs_fvs, _) -> + freeVarsToLiveVars rhs_fvs `thenLne` \ lv_info -> + returnLne (mkTopStgRhs is_static rhs_fvs (mkSRT lv_info) bndr_info new_rhs, rhs_fvs) + where + bndr_info = lookupFVInfo scope_fv_info bndr + is_static = rhsIsStatic hmods rhs + +mkTopStgRhs :: Bool -> FreeVarsInfo -> SRT -> StgBinderInfo -> StgExpr + -> StgRhs + +mkTopStgRhs is_static rhs_fvs srt binder_info (StgLam _ bndrs body) + = ASSERT( is_static ) + StgRhsClosure noCCS binder_info + (getFVs rhs_fvs) + ReEntrant + srt + bndrs body + +mkTopStgRhs is_static rhs_fvs srt binder_info (StgConApp con args) + | is_static -- StgConApps can be updatable (see isCrossDllConApp) + = StgRhsCon noCCS con args + +mkTopStgRhs is_static rhs_fvs srt binder_info rhs + = ASSERT2( not is_static, ppr rhs ) + StgRhsClosure noCCS binder_info + (getFVs rhs_fvs) + Updatable + srt + [] rhs +\end{code} + + +-- --------------------------------------------------------------------------- +-- Expressions +-- --------------------------------------------------------------------------- + +\begin{code} +coreToStgExpr + :: CoreExpr + -> LneM (StgExpr, -- Decorated STG expr + FreeVarsInfo, -- Its free vars (NB free, not live) + EscVarsSet) -- Its escapees, a subset of its free vars; + -- also a subset of the domain of the envt + -- because we are only interested in the escapees + -- for vars which might be turned into + -- let-no-escaped ones. +\end{code} + +The second and third components can be derived in a simple bottom up pass, not +dependent on any decisions about which variables will be let-no-escaped or +not. The first component, that is, the decorated expression, may then depend +on these components, but it in turn is not scrutinised as the basis for any +decisions. Hence no black holes. 
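For intuition, here is a small informal illustration of the second and third components (v, g and x are hypothetical, with v letrec-bound at arity 1; the rules applied are the ones in coreToStgApp later in this file):

\begin{verbatim}
  v x    -- free vars {v,x}; v does not escape (exactly saturated call)
  g v    -- v escapes: free variables of arguments are disqualified
         --   from being let-no-escaped
  v      -- v escapes: unsaturated occurrence
\end{verbatim}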
+ +\begin{code} +coreToStgExpr (Lit l) = returnLne (StgLit l, emptyFVInfo, emptyVarSet) +coreToStgExpr (Var v) = coreToStgApp Nothing v [] + +coreToStgExpr expr@(App _ _) + = coreToStgApp Nothing f args + where + (f, args) = myCollectArgs expr + +coreToStgExpr expr@(Lam _ _) + = let + (args, body) = myCollectBinders expr + args' = filterStgBinders args + in + extendVarEnvLne [ (a, LambdaBound) | a <- args' ] $ + coreToStgExpr body `thenLne` \ (body, body_fvs, body_escs) -> + let + fvs = args' `minusFVBinders` body_fvs + escs = body_escs `delVarSetList` args' + result_expr | null args' = body + | otherwise = StgLam (exprType expr) args' body + in + returnLne (result_expr, fvs, escs) + +coreToStgExpr (Note (SCC cc) expr) + = coreToStgExpr expr `thenLne` ( \ (expr2, fvs, escs) -> + returnLne (StgSCC cc expr2, fvs, escs) ) + +#ifdef ILX +-- For ILX, convert (__coerce__ to_ty from_ty e) +-- into (coerce to_ty from_ty e) +-- where coerce is real function +coreToStgExpr (Note (Coerce to_ty from_ty) expr) + = coreToStgExpr (mkApps (Var unsafeCoerceId) + [Type from_ty, Type to_ty, expr]) +#endif + +coreToStgExpr (Note other_note expr) + = coreToStgExpr expr + +-- Cases require a little more real work. + +coreToStgExpr (Case scrut bndr _ alts) + = extendVarEnvLne [(bndr, LambdaBound)] ( + mapAndUnzip3Lne vars_alt alts `thenLne` \ (alts2, fvs_s, escs_s) -> + returnLne ( alts2, + unionFVInfos fvs_s, + unionVarSets escs_s ) + ) `thenLne` \ (alts2, alts_fvs, alts_escs) -> + let + -- Determine whether the default binder is dead or not + -- This helps the code generator to avoid generating an assignment + -- for the case binder (is extremely rare cases) ToDo: remove. + bndr' | bndr `elementOfFVInfo` alts_fvs = bndr + | otherwise = bndr `setIdOccInfo` IAmDead + + -- Don't consider the default binder as being 'live in alts', + -- since this is from the point of view of the case expr, where + -- the default binder is not free. + alts_fvs_wo_bndr = bndr `minusFVBinder` alts_fvs + alts_escs_wo_bndr = alts_escs `delVarSet` bndr + in + + freeVarsToLiveVars alts_fvs_wo_bndr `thenLne` \ alts_lv_info -> + + -- We tell the scrutinee that everything + -- live in the alts is live in it, too. + setVarsLiveInCont alts_lv_info ( + coreToStgExpr scrut `thenLne` \ (scrut2, scrut_fvs, scrut_escs) -> + freeVarsToLiveVars scrut_fvs `thenLne` \ scrut_lv_info -> + returnLne (scrut2, scrut_fvs, scrut_escs, scrut_lv_info) + ) + `thenLne` \ (scrut2, scrut_fvs, scrut_escs, scrut_lv_info) -> + + returnLne ( + StgCase scrut2 (getLiveVars scrut_lv_info) + (getLiveVars alts_lv_info) + bndr' + (mkSRT alts_lv_info) + (mkStgAltType (idType bndr) alts) + alts2, + scrut_fvs `unionFVInfo` alts_fvs_wo_bndr, + alts_escs_wo_bndr `unionVarSet` getFVSet scrut_fvs + -- You might think we should have scrut_escs, not + -- (getFVSet scrut_fvs), but actually we can't call, and + -- then return from, a let-no-escape thing. 
+ ) + where + vars_alt (con, binders, rhs) + = let -- Remove type variables + binders' = filterStgBinders binders + in + extendVarEnvLne [(b, LambdaBound) | b <- binders'] $ + coreToStgExpr rhs `thenLne` \ (rhs2, rhs_fvs, rhs_escs) -> + let + -- Records whether each param is used in the RHS + good_use_mask = [ b `elementOfFVInfo` rhs_fvs | b <- binders' ] + in + returnLne ( (con, binders', good_use_mask, rhs2), + binders' `minusFVBinders` rhs_fvs, + rhs_escs `delVarSetList` binders' ) + -- ToDo: remove the delVarSet; + -- since escs won't include any of these binders +\end{code} + +Lets not only take quite a bit of work, but this is where we convert +then to let-no-escapes, if we wish. + +(Meanwhile, we don't expect to see let-no-escapes...) +\begin{code} +coreToStgExpr (Let bind body) + = fixLne (\ ~(_, _, _, no_binder_escapes) -> + coreToStgLet no_binder_escapes bind body + ) `thenLne` \ (new_let, fvs, escs, _) -> + + returnLne (new_let, fvs, escs) +\end{code} + +\begin{code} +mkStgAltType scrut_ty alts + = case splitTyConApp_maybe (repType scrut_ty) of + Just (tc,_) | isUnboxedTupleTyCon tc -> UbxTupAlt tc + | isPrimTyCon tc -> PrimAlt tc + | isHiBootTyCon tc -> look_for_better_tycon + | isAlgTyCon tc -> AlgAlt tc + | isFunTyCon tc -> PolyAlt + | otherwise -> pprPanic "mkStgAlts" (ppr tc) + Nothing -> PolyAlt + + where + -- Sometimes, the TyCon in the type of the scrutinee is an HiBootTyCon, + -- which may not have any constructors inside it. If so, then we + -- can get a better TyCon by grabbing the one from a constructor alternative + -- if one exists. + look_for_better_tycon + | ((DataAlt con, _, _) : _) <- data_alts = + AlgAlt (dataConTyCon con) + | otherwise = + ASSERT(null data_alts) + PolyAlt + where + (data_alts, _deflt) = findDefault alts +\end{code} + + +-- --------------------------------------------------------------------------- +-- Applications +-- --------------------------------------------------------------------------- + +\begin{code} +coreToStgApp + :: Maybe UpdateFlag -- Just upd <=> this application is + -- the rhs of a thunk binding + -- x = [...] \upd [] -> the_app + -- with specified update flag + -> Id -- Function + -> [CoreArg] -- Arguments + -> LneM (StgExpr, FreeVarsInfo, EscVarsSet) + +coreToStgApp maybe_thunk_body f args + = coreToStgArgs args `thenLne` \ (args', args_fvs) -> + lookupVarLne f `thenLne` \ how_bound -> + + let + n_val_args = valArgCount args + not_letrec_bound = not (isLetBound how_bound) + fun_fvs + = let fvs = singletonFVInfo f how_bound fun_occ in + -- e.g. (f :: a -> int) (x :: a) + -- Here the free variables are "f", "x" AND the type variable "a" + -- coreToStgArgs will deal with the arguments recursively + if opt_RuntimeTypes then + fvs `unionFVInfo` tyvarFVInfo (tyVarsOfType (idType f)) + else fvs + + -- Mostly, the arity info of a function is in the fn's IdInfo + -- But new bindings introduced by CoreSat may not have no + -- arity info; it would do us no good anyway. For example: + -- let f = \ab -> e in f + -- No point in having correct arity info for f! + -- Hence the hasArity stuff below. 
+ -- NB: f_arity is only consulted for LetBound things + f_arity = stgArity f how_bound + saturated = f_arity <= n_val_args + + fun_occ + | not_letrec_bound = noBinderInfo -- Uninteresting variable + | f_arity > 0 && saturated = stgSatOcc -- Saturated or over-saturated function call + | otherwise = stgUnsatOcc -- Unsaturated function or thunk + + fun_escs + | not_letrec_bound = emptyVarSet -- Only letrec-bound escapees are interesting + | f_arity == n_val_args = emptyVarSet -- A function *or thunk* with an exactly + -- saturated call doesn't escape + -- (let-no-escape applies to 'thunks' too) + + | otherwise = unitVarSet f -- Inexact application; it does escape + + -- At the moment of the call: + + -- either the function is *not* let-no-escaped, in which case + -- nothing is live except live_in_cont + -- or the function *is* let-no-escaped in which case the + -- variables it uses are live, but still the function + -- itself is not. PS. In this case, the function's + -- live vars should already include those of the + -- continuation, but it does no harm to just union the + -- two regardless. + + res_ty = exprType (mkApps (Var f) args) + app = case globalIdDetails f of + DataConWorkId dc | saturated -> StgConApp dc args' + PrimOpId op -> ASSERT( saturated ) + StgOpApp (StgPrimOp op) args' res_ty + FCallId call -> ASSERT( saturated ) + StgOpApp (StgFCallOp call (idUnique f)) args' res_ty + _other -> StgApp f args' + + in + returnLne ( + app, + fun_fvs `unionFVInfo` args_fvs, + fun_escs `unionVarSet` (getFVSet args_fvs) + -- All the free vars of the args are disqualified + -- from being let-no-escaped. + ) + + + +-- --------------------------------------------------------------------------- +-- Argument lists +-- This is the guy that turns applications into A-normal form +-- --------------------------------------------------------------------------- + +coreToStgArgs :: [CoreArg] -> LneM ([StgArg], FreeVarsInfo) +coreToStgArgs [] + = returnLne ([], emptyFVInfo) + +coreToStgArgs (Type ty : args) -- Type argument + = coreToStgArgs args `thenLne` \ (args', fvs) -> + if opt_RuntimeTypes then + returnLne (StgTypeArg ty : args', fvs `unionFVInfo` tyvarFVInfo (tyVarsOfType ty)) + else + returnLne (args', fvs) + +coreToStgArgs (arg : args) -- Non-type argument + = coreToStgArgs args `thenLne` \ (stg_args, args_fvs) -> + coreToStgExpr arg `thenLne` \ (arg', arg_fvs, escs) -> + let + fvs = args_fvs `unionFVInfo` arg_fvs + stg_arg = case arg' of + StgApp v [] -> StgVarArg v + StgConApp con [] -> StgVarArg (dataConWorkId con) + StgLit lit -> StgLitArg lit + _ -> pprPanic "coreToStgArgs" (ppr arg) + in + returnLne (stg_arg : stg_args, fvs) + + +-- --------------------------------------------------------------------------- +-- The magic for lets: +-- --------------------------------------------------------------------------- + +coreToStgLet + :: Bool -- True <=> yes, we are let-no-escaping this let + -> CoreBind -- bindings + -> CoreExpr -- body + -> LneM (StgExpr, -- new let + FreeVarsInfo, -- variables free in the whole let + EscVarsSet, -- variables that escape from the whole let + Bool) -- True <=> none of the binders in the bindings + -- is among the escaping vars + +coreToStgLet let_no_escape bind body + = fixLne (\ ~(_, _, _, _, _, rec_body_fvs, _, _) -> + + -- Do the bindings, setting live_in_cont to empty if + -- we ain't in a let-no-escape world + getVarsLiveInCont `thenLne` \ live_in_cont -> + setVarsLiveInCont (if let_no_escape + then live_in_cont + else emptyLiveInfo) + (vars_bind rec_body_fvs 
bind) + `thenLne` \ ( bind2, bind_fvs, bind_escs, bind_lv_info, env_ext) -> + + -- Do the body + extendVarEnvLne env_ext ( + coreToStgExpr body `thenLne` \(body2, body_fvs, body_escs) -> + freeVarsToLiveVars body_fvs `thenLne` \ body_lv_info -> + + returnLne (bind2, bind_fvs, bind_escs, getLiveVars bind_lv_info, + body2, body_fvs, body_escs, getLiveVars body_lv_info) + ) + + ) `thenLne` (\ (bind2, bind_fvs, bind_escs, bind_lvs, + body2, body_fvs, body_escs, body_lvs) -> + + + -- Compute the new let-expression + let + new_let | let_no_escape = StgLetNoEscape live_in_whole_let bind_lvs bind2 body2 + | otherwise = StgLet bind2 body2 + + free_in_whole_let + = binders `minusFVBinders` (bind_fvs `unionFVInfo` body_fvs) + + live_in_whole_let + = bind_lvs `unionVarSet` (body_lvs `delVarSetList` binders) + + real_bind_escs = if let_no_escape then + bind_escs + else + getFVSet bind_fvs + -- Everything escapes which is free in the bindings + + let_escs = (real_bind_escs `unionVarSet` body_escs) `delVarSetList` binders + + all_escs = bind_escs `unionVarSet` body_escs -- Still includes binders of + -- this let(rec) + + no_binder_escapes = isEmptyVarSet (set_of_binders `intersectVarSet` all_escs) + +#ifdef DEBUG + -- Debugging code as requested by Andrew Kennedy + checked_no_binder_escapes + | not no_binder_escapes && any is_join_var binders + = pprTrace "Interesting! A join var that isn't let-no-escaped" (ppr binders) + False + | otherwise = no_binder_escapes +#else + checked_no_binder_escapes = no_binder_escapes +#endif + + -- Mustn't depend on the passed-in let_no_escape flag, since + -- no_binder_escapes is used by the caller to derive the flag! + in + returnLne ( + new_let, + free_in_whole_let, + let_escs, + checked_no_binder_escapes + )) + where + set_of_binders = mkVarSet binders + binders = bindersOf bind + + mk_binding bind_lv_info binder rhs + = (binder, LetBound (NestedLet live_vars) (manifestArity rhs)) + where + live_vars | let_no_escape = addLiveVar bind_lv_info binder + | otherwise = unitLiveVar binder + -- c.f. 
the invariant on NestedLet + + vars_bind :: FreeVarsInfo -- Free var info for body of binding + -> CoreBind + -> LneM (StgBinding, + FreeVarsInfo, + EscVarsSet, -- free vars; escapee vars + LiveInfo, -- Vars and CAFs live in binding + [(Id, HowBound)]) -- extension to environment + + + vars_bind body_fvs (NonRec binder rhs) + = coreToStgRhs body_fvs [] (binder,rhs) + `thenLne` \ (rhs2, bind_fvs, bind_lv_info, escs) -> + let + env_ext_item = mk_binding bind_lv_info binder rhs + in + returnLne (StgNonRec binder rhs2, + bind_fvs, escs, bind_lv_info, [env_ext_item]) + + + vars_bind body_fvs (Rec pairs) + = fixLne (\ ~(_, rec_rhs_fvs, _, bind_lv_info, _) -> + let + rec_scope_fvs = unionFVInfo body_fvs rec_rhs_fvs + binders = map fst pairs + env_ext = [ mk_binding bind_lv_info b rhs + | (b,rhs) <- pairs ] + in + extendVarEnvLne env_ext ( + mapAndUnzip4Lne (coreToStgRhs rec_scope_fvs binders) pairs + `thenLne` \ (rhss2, fvss, lv_infos, escss) -> + let + bind_fvs = unionFVInfos fvss + bind_lv_info = foldr unionLiveInfo emptyLiveInfo lv_infos + escs = unionVarSets escss + in + returnLne (StgRec (binders `zip` rhss2), + bind_fvs, escs, bind_lv_info, env_ext) + ) + ) + +is_join_var :: Id -> Bool +-- A hack (used only for compiler debuggging) to tell if +-- a variable started life as a join point ($j) +is_join_var j = occNameString (getOccName j) == "$j" +\end{code} + +\begin{code} +coreToStgRhs :: FreeVarsInfo -- Free var info for the scope of the binding + -> [Id] + -> (Id,CoreExpr) + -> LneM (StgRhs, FreeVarsInfo, LiveInfo, EscVarsSet) + +coreToStgRhs scope_fv_info binders (bndr, rhs) + = coreToStgExpr rhs `thenLne` \ (new_rhs, rhs_fvs, rhs_escs) -> + getEnvLne `thenLne` \ env -> + freeVarsToLiveVars (binders `minusFVBinders` rhs_fvs) `thenLne` \ lv_info -> + returnLne (mkStgRhs rhs_fvs (mkSRT lv_info) bndr_info new_rhs, + rhs_fvs, lv_info, rhs_escs) + where + bndr_info = lookupFVInfo scope_fv_info bndr + +mkStgRhs :: FreeVarsInfo -> SRT -> StgBinderInfo -> StgExpr -> StgRhs + +mkStgRhs rhs_fvs srt binder_info (StgConApp con args) + = StgRhsCon noCCS con args + +mkStgRhs rhs_fvs srt binder_info (StgLam _ bndrs body) + = StgRhsClosure noCCS binder_info + (getFVs rhs_fvs) + ReEntrant + srt bndrs body + +mkStgRhs rhs_fvs srt binder_info rhs + = StgRhsClosure noCCS binder_info + (getFVs rhs_fvs) + upd_flag srt [] rhs + where + upd_flag = Updatable + {- + SDM: disabled. Eval/Apply can't handle functions with arity zero very + well; and making these into simple non-updatable thunks breaks other + assumptions (namely that they will be entered only once). + + upd_flag | isPAP env rhs = ReEntrant + | otherwise = Updatable + -} + +{- ToDo: + upd = if isOnceDem dem + then (if isNotTop toplev + then SingleEntry -- HA! Paydirt for "dem" + else +#ifdef DEBUG + trace "WARNING: SE CAFs unsupported, forcing UPD instead" $ +#endif + Updatable) + else Updatable + -- For now we forbid SingleEntry CAFs; they tickle the + -- ASSERT in rts/Storage.c line 215 at newCAF() re mut_link, + -- and I don't understand why. There's only one SE_CAF (well, + -- only one that tickled a great gaping bug in an earlier attempt + -- at ClosureInfo.getEntryConvention) in the whole of nofib, + -- specifically Main.lvl6 in spectral/cryptarithm2. + -- So no great loss. KSW 2000-07. +-} +\end{code} + +Detect thunks which will reduce immediately to PAPs, and make them +non-updatable. 
This has several advantages:
+
+   - the non-updatable thunk behaves exactly like the PAP,
+
+   - the thunk is more efficient to enter, because it is
+     specialised to the task.
+
+   - we save one update frame, one stg_update_PAP, one update
+     and lots of PAP_enters.
+
+   - in the case where the thunk is top-level, we save building
+     a black hole and furthermore the thunk isn't considered to
+     be a CAF any more, so it doesn't appear in any SRTs.
+
+We do it here, because the arity information is accurate, and we need
+to do it before the SRT pass to save the SRT entries associated with
+any top-level PAPs.
+
+isPAP env (StgApp f args) = listLengthCmp args arity == LT -- idArity f > length args
+   where
+      arity = stgArity f (lookupBinding env f)
+isPAP env _ = False
+
+
+%************************************************************************
+%* *
+\subsection[LNE-monad]{A little monad for this let-no-escaping pass}
+%* *
+%************************************************************************
+
+There's a lot of stuff to pass around, so we use this @LneM@ monad to
+help. All the stuff here is only passed *down*.
+
+\begin{code}
+type LneM a = IdEnv HowBound
+            -> LiveInfo -- Vars and CAFs live in continuation
+            -> a
+
+type LiveInfo = (StgLiveVars, -- Dynamic live variables;
+                              -- i.e. ones with a nested (non-top-level) binding
+                 CafSet)      -- Static live variables;
+                              -- i.e. top-level variables that are CAFs or refer to them
+
+type EscVarsSet = IdSet
+type CafSet = IdSet
+
+data HowBound
+  = ImportBound         -- Used only as a response to lookupBinding; never
+                        -- exists in the range of the (IdEnv HowBound)
+
+  | LetBound            -- A let(rec) in this module
+        LetInfo         -- Whether top level or nested
+        Arity           -- Its arity (local Ids don't have arity info at this point)
+
+  | LambdaBound         -- Used for both lambda and case
+
+data LetInfo
+  = TopLet              -- top level things
+  | NestedLet LiveInfo  -- For nested things, what is live if this
+                        -- thing is live? Invariant: the binder
+                        -- itself is always a member of
+                        -- the dynamic set of its own LiveInfo
+
+isLetBound (LetBound _ _) = True
+isLetBound other = False
+
+topLevelBound ImportBound         = True
+topLevelBound (LetBound TopLet _) = True
+topLevelBound other               = False
+\end{code}
+
+For a let(rec)-bound variable, x, we record LiveInfo, the set of
+variables that are live if x is live. This LiveInfo comprises
+	(a) dynamic live variables (ones with a non-top-level binding)
+	(b) static live variables (CAFs or things that refer to CAFs)
+
+For "normal" variables (a) is just x alone. If x is a let-no-escaped
+variable then x is represented by a code pointer and a stack pointer
+(well, one for each stack). So all of the variables needed in the
+execution of x are live if x is, and are therefore recorded in the
+LetBound constructor; x itself *is* included.
+
+The set of dynamic live variables is guaranteed to have no further let-no-escaped
+variables in it.
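As a small worked example (hypothetical names: j is a let-no-escaped Id, y a nested variable free in its RHS, c a top-level CAF it mentions), the LiveInfo recorded in j's NestedLet could be built with the helpers defined in the code just below:

\begin{verbatim}
  live_for_j = (unitLiveVar j `addLiveVar` y) `unionLiveInfo` unitLiveCaf c
  -- i.e. ({j, y}, {c}): j sits in its own dynamic set (the NestedLet
  -- invariant), y is live whenever j is, and c will end up in an SRT
  -- via mkSRT.
\end{verbatim}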
+ +\begin{code} +emptyLiveInfo :: LiveInfo +emptyLiveInfo = (emptyVarSet,emptyVarSet) + +unitLiveVar :: Id -> LiveInfo +unitLiveVar lv = (unitVarSet lv, emptyVarSet) + +unitLiveCaf :: Id -> LiveInfo +unitLiveCaf caf = (emptyVarSet, unitVarSet caf) + +addLiveVar :: LiveInfo -> Id -> LiveInfo +addLiveVar (lvs, cafs) id = (lvs `extendVarSet` id, cafs) + +unionLiveInfo :: LiveInfo -> LiveInfo -> LiveInfo +unionLiveInfo (lv1,caf1) (lv2,caf2) = (lv1 `unionVarSet` lv2, caf1 `unionVarSet` caf2) + +mkSRT :: LiveInfo -> SRT +mkSRT (_, cafs) = SRTEntries cafs + +getLiveVars :: LiveInfo -> StgLiveVars +getLiveVars (lvs, _) = lvs +\end{code} + + +The std monad functions: +\begin{code} +initLne :: IdEnv HowBound -> LneM a -> a +initLne env m = m env emptyLiveInfo + + + +{-# INLINE thenLne #-} +{-# INLINE returnLne #-} + +returnLne :: a -> LneM a +returnLne e env lvs_cont = e + +thenLne :: LneM a -> (a -> LneM b) -> LneM b +thenLne m k env lvs_cont + = k (m env lvs_cont) env lvs_cont + +mapAndUnzipLne :: (a -> LneM (b,c)) -> [a] -> LneM ([b],[c]) +mapAndUnzipLne f [] = returnLne ([],[]) +mapAndUnzipLne f (x:xs) + = f x `thenLne` \ (r1, r2) -> + mapAndUnzipLne f xs `thenLne` \ (rs1, rs2) -> + returnLne (r1:rs1, r2:rs2) + +mapAndUnzip3Lne :: (a -> LneM (b,c,d)) -> [a] -> LneM ([b],[c],[d]) +mapAndUnzip3Lne f [] = returnLne ([],[],[]) +mapAndUnzip3Lne f (x:xs) + = f x `thenLne` \ (r1, r2, r3) -> + mapAndUnzip3Lne f xs `thenLne` \ (rs1, rs2, rs3) -> + returnLne (r1:rs1, r2:rs2, r3:rs3) + +mapAndUnzip4Lne :: (a -> LneM (b,c,d,e)) -> [a] -> LneM ([b],[c],[d],[e]) +mapAndUnzip4Lne f [] = returnLne ([],[],[],[]) +mapAndUnzip4Lne f (x:xs) + = f x `thenLne` \ (r1, r2, r3, r4) -> + mapAndUnzip4Lne f xs `thenLne` \ (rs1, rs2, rs3, rs4) -> + returnLne (r1:rs1, r2:rs2, r3:rs3, r4:rs4) + +fixLne :: (a -> LneM a) -> LneM a +fixLne expr env lvs_cont + = result + where + result = expr result env lvs_cont +\end{code} + +Functions specific to this monad: + +\begin{code} +getVarsLiveInCont :: LneM LiveInfo +getVarsLiveInCont env lvs_cont = lvs_cont + +setVarsLiveInCont :: LiveInfo -> LneM a -> LneM a +setVarsLiveInCont new_lvs_cont expr env lvs_cont + = expr env new_lvs_cont + +extendVarEnvLne :: [(Id, HowBound)] -> LneM a -> LneM a +extendVarEnvLne ids_w_howbound expr env lvs_cont + = expr (extendVarEnvList env ids_w_howbound) lvs_cont + +lookupVarLne :: Id -> LneM HowBound +lookupVarLne v env lvs_cont = returnLne (lookupBinding env v) env lvs_cont + +getEnvLne :: LneM (IdEnv HowBound) +getEnvLne env lvs_cont = returnLne env env lvs_cont + +lookupBinding :: IdEnv HowBound -> Id -> HowBound +lookupBinding env v = case lookupVarEnv env v of + Just xx -> xx + Nothing -> ASSERT2( isGlobalId v, ppr v ) ImportBound + + +-- The result of lookupLiveVarsForSet, a set of live variables, is +-- only ever tacked onto a decorated expression. It is never used as +-- the basis of a control decision, which might give a black hole. 
+ +freeVarsToLiveVars :: FreeVarsInfo -> LneM LiveInfo +freeVarsToLiveVars fvs env live_in_cont + = returnLne live_info env live_in_cont + where + live_info = foldr unionLiveInfo live_in_cont lvs_from_fvs + lvs_from_fvs = map do_one (allFreeIds fvs) + + do_one (v, how_bound) + = case how_bound of + ImportBound -> unitLiveCaf v -- Only CAF imports are + -- recorded in fvs + LetBound TopLet _ + | mayHaveCafRefs (idCafInfo v) -> unitLiveCaf v + | otherwise -> emptyLiveInfo + + LetBound (NestedLet lvs) _ -> lvs -- lvs already contains v + -- (see the invariant on NestedLet) + + _lambda_or_case_binding -> unitLiveVar v -- Bound by lambda or case +\end{code} + +%************************************************************************ +%* * +\subsection[Free-var info]{Free variable information} +%* * +%************************************************************************ + +\begin{code} +type FreeVarsInfo = VarEnv (Var, HowBound, StgBinderInfo) + -- The Var is so we can gather up the free variables + -- as a set. + -- + -- The HowBound info just saves repeated lookups; + -- we look up just once when we encounter the occurrence. + -- INVARIANT: Any ImportBound Ids are HaveCafRef Ids + -- Imported Ids without CAF refs are simply + -- not put in the FreeVarsInfo for an expression. + -- See singletonFVInfo and freeVarsToLiveVars + -- + -- StgBinderInfo records how it occurs; notably, we + -- are interested in whether it only occurs in saturated + -- applications, because then we don't need to build a + -- curried version. + -- If f is mapped to noBinderInfo, that means + -- that f *is* mentioned (else it wouldn't be in the + -- IdEnv at all), but perhaps in an unsaturated applications. + -- + -- All case/lambda-bound things are also mapped to + -- noBinderInfo, since we aren't interested in their + -- occurence info. + -- + -- For ILX we track free var info for type variables too; + -- hence VarEnv not IdEnv +\end{code} + +\begin{code} +emptyFVInfo :: FreeVarsInfo +emptyFVInfo = emptyVarEnv + +singletonFVInfo :: Id -> HowBound -> StgBinderInfo -> FreeVarsInfo +-- Don't record non-CAF imports at all, to keep free-var sets small +singletonFVInfo id ImportBound info + | mayHaveCafRefs (idCafInfo id) = unitVarEnv id (id, ImportBound, info) + | otherwise = emptyVarEnv +singletonFVInfo id how_bound info = unitVarEnv id (id, how_bound, info) + +tyvarFVInfo :: TyVarSet -> FreeVarsInfo +tyvarFVInfo tvs = foldVarSet add emptyFVInfo tvs + where + add tv fvs = extendVarEnv fvs tv (tv, LambdaBound, noBinderInfo) + -- Type variables must be lambda-bound + +unionFVInfo :: FreeVarsInfo -> FreeVarsInfo -> FreeVarsInfo +unionFVInfo fv1 fv2 = plusVarEnv_C plusFVInfo fv1 fv2 + +unionFVInfos :: [FreeVarsInfo] -> FreeVarsInfo +unionFVInfos fvs = foldr unionFVInfo emptyFVInfo fvs + +minusFVBinders :: [Id] -> FreeVarsInfo -> FreeVarsInfo +minusFVBinders vs fv = foldr minusFVBinder fv vs + +minusFVBinder :: Id -> FreeVarsInfo -> FreeVarsInfo +minusFVBinder v fv | isId v && opt_RuntimeTypes + = (fv `delVarEnv` v) `unionFVInfo` + tyvarFVInfo (tyVarsOfType (idType v)) + | otherwise = fv `delVarEnv` v + -- When removing a binder, remember to add its type variables + -- c.f. CoreFVs.delBinderFV + +elementOfFVInfo :: Id -> FreeVarsInfo -> Bool +elementOfFVInfo id fvs = maybeToBool (lookupVarEnv fvs id) + +lookupFVInfo :: FreeVarsInfo -> Id -> StgBinderInfo +-- Find how the given Id is used. 
+-- Externally visible things may be used any old how +lookupFVInfo fvs id + | isExternalName (idName id) = noBinderInfo + | otherwise = case lookupVarEnv fvs id of + Nothing -> noBinderInfo + Just (_,_,info) -> info + +allFreeIds :: FreeVarsInfo -> [(Id,HowBound)] -- Both top level and non-top-level Ids +allFreeIds fvs = [(id,how_bound) | (id,how_bound,_) <- varEnvElts fvs, isId id] + +-- Non-top-level things only, both type variables and ids +-- (type variables only if opt_RuntimeTypes) +getFVs :: FreeVarsInfo -> [Var] +getFVs fvs = [id | (id, how_bound, _) <- varEnvElts fvs, + not (topLevelBound how_bound) ] + +getFVSet :: FreeVarsInfo -> VarSet +getFVSet fvs = mkVarSet (getFVs fvs) + +plusFVInfo (id1,hb1,info1) (id2,hb2,info2) + = ASSERT (id1 == id2 && hb1 `check_eq_how_bound` hb2) + (id1, hb1, combineStgBinderInfo info1 info2) + +#ifdef DEBUG +-- The HowBound info for a variable in the FVInfo should be consistent +check_eq_how_bound ImportBound ImportBound = True +check_eq_how_bound LambdaBound LambdaBound = True +check_eq_how_bound (LetBound li1 ar1) (LetBound li2 ar2) = ar1 == ar2 && check_eq_li li1 li2 +check_eq_how_bound hb1 hb2 = False + +check_eq_li (NestedLet _) (NestedLet _) = True +check_eq_li TopLet TopLet = True +check_eq_li li1 li2 = False +#endif +\end{code} + +Misc. +\begin{code} +filterStgBinders :: [Var] -> [Var] +filterStgBinders bndrs + | opt_RuntimeTypes = bndrs + | otherwise = filter isId bndrs +\end{code} + + +\begin{code} + -- Ignore all notes except SCC +myCollectBinders expr + = go [] expr + where + go bs (Lam b e) = go (b:bs) e + go bs e@(Note (SCC _) _) = (reverse bs, e) + go bs (Note _ e) = go bs e + go bs e = (reverse bs, e) + +myCollectArgs :: CoreExpr -> (Id, [CoreArg]) + -- We assume that we only have variables + -- in the function position by now +myCollectArgs expr + = go expr [] + where + go (Var v) as = (v, as) + go (App f a) as = go f (a:as) + go (Note (SCC _) e) as = pprPanic "CoreToStg.myCollectArgs" (ppr expr) + go (Note n e) as = go e as + go _ as = pprPanic "CoreToStg.myCollectArgs" (ppr expr) +\end{code} + +\begin{code} +stgArity :: Id -> HowBound -> Arity +stgArity f (LetBound _ arity) = arity +stgArity f ImportBound = idArity f +stgArity f LambdaBound = 0 +\end{code} diff --git a/compiler/stgSyn/StgLint.lhs b/compiler/stgSyn/StgLint.lhs new file mode 100644 index 0000000000..326cd44578 --- /dev/null +++ b/compiler/stgSyn/StgLint.lhs @@ -0,0 +1,524 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-1998 +% +\section[StgLint]{A ``lint'' pass to check for Stg correctness} + +\begin{code} +module StgLint ( lintStgBindings ) where + +#include "HsVersions.h" + +import StgSyn + +import Bag ( Bag, emptyBag, isEmptyBag, snocBag, bagToList ) +import Id ( Id, idType, isLocalId ) +import VarSet +import DataCon ( DataCon, dataConInstArgTys, dataConRepType ) +import CoreSyn ( AltCon(..) ) +import PrimOp ( primOpType ) +import Literal ( literalType ) +import Maybes ( catMaybes ) +import Name ( getSrcLoc ) +import ErrUtils ( Message, mkLocMessage ) +import Type ( mkFunTys, splitFunTys, splitTyConApp_maybe, + isUnLiftedType, isTyVarTy, dropForAlls, Type + ) +import TyCon ( isAlgTyCon, isNewTyCon, tyConDataCons ) +import Util ( zipEqual, equalLength ) +import SrcLoc ( srcLocSpan ) +import Outputable + +infixr 9 `thenL`, `thenL_`, `thenMaybeL` +\end{code} + +Checks for + (a) *some* type errors + (b) locally-defined variables used but not defined + + +Note: unless -dverbose-stg is on, display of lint errors will result +in "panic: bOGUS_LVs". 
+ +WARNING: +~~~~~~~~ + +This module has suffered bit-rot; it is likely to yield lint errors +for Stg code that is currently perfectly acceptable for code +generation. Solution: don't use it! (KSW 2000-05). + + +%************************************************************************ +%* * +\subsection{``lint'' for various constructs} +%* * +%************************************************************************ + +@lintStgBindings@ is the top-level interface function. + +\begin{code} +lintStgBindings :: String -> [StgBinding] -> [StgBinding] + +lintStgBindings whodunnit binds + = {-# SCC "StgLint" #-} + case (initL (lint_binds binds)) of + Nothing -> binds + Just msg -> pprPanic "" (vcat [ + ptext SLIT("*** Stg Lint ErrMsgs: in") <+> text whodunnit <+> ptext SLIT("***"), + msg, + ptext SLIT("*** Offending Program ***"), + pprStgBindings binds, + ptext SLIT("*** End of Offense ***")]) + where + lint_binds :: [StgBinding] -> LintM () + + lint_binds [] = returnL () + lint_binds (bind:binds) + = lintStgBinds bind `thenL` \ binders -> + addInScopeVars binders ( + lint_binds binds + ) +\end{code} + + +\begin{code} +lintStgArg :: StgArg -> LintM (Maybe Type) +lintStgArg (StgLitArg lit) = returnL (Just (literalType lit)) +lintStgArg (StgVarArg v) = lintStgVar v + +lintStgVar v = checkInScope v `thenL_` + returnL (Just (idType v)) +\end{code} + +\begin{code} +lintStgBinds :: StgBinding -> LintM [Id] -- Returns the binders +lintStgBinds (StgNonRec binder rhs) + = lint_binds_help (binder,rhs) `thenL_` + returnL [binder] + +lintStgBinds (StgRec pairs) + = addInScopeVars binders ( + mapL lint_binds_help pairs `thenL_` + returnL binders + ) + where + binders = [b | (b,_) <- pairs] + +lint_binds_help (binder, rhs) + = addLoc (RhsOf binder) ( + -- Check the rhs + lintStgRhs rhs `thenL` \ maybe_rhs_ty -> + + -- Check binder doesn't have unlifted type + checkL (not (isUnLiftedType binder_ty)) + (mkUnLiftedTyMsg binder rhs) `thenL_` + + -- Check match to RHS type + (case maybe_rhs_ty of + Nothing -> returnL () + Just rhs_ty -> checkTys binder_ty + rhs_ty + (mkRhsMsg binder rhs_ty) + ) `thenL_` + + returnL () + ) + where + binder_ty = idType binder +\end{code} + +\begin{code} +lintStgRhs :: StgRhs -> LintM (Maybe Type) + +lintStgRhs (StgRhsClosure _ _ _ _ _ [] expr) + = lintStgExpr expr + +lintStgRhs (StgRhsClosure _ _ _ _ _ binders expr) + = addLoc (LambdaBodyOf binders) ( + addInScopeVars binders ( + lintStgExpr expr `thenMaybeL` \ body_ty -> + returnL (Just (mkFunTys (map idType binders) body_ty)) + )) + +lintStgRhs (StgRhsCon _ con args) + = mapMaybeL lintStgArg args `thenL` \ maybe_arg_tys -> + case maybe_arg_tys of + Nothing -> returnL Nothing + Just arg_tys -> checkFunApp con_ty arg_tys (mkRhsConMsg con_ty arg_tys) + where + con_ty = dataConRepType con +\end{code} + +\begin{code} +lintStgExpr :: StgExpr -> LintM (Maybe Type) -- Nothing if error found + +lintStgExpr (StgLit l) = returnL (Just (literalType l)) + +lintStgExpr e@(StgApp fun args) + = lintStgVar fun `thenMaybeL` \ fun_ty -> + mapMaybeL lintStgArg args `thenL` \ maybe_arg_tys -> + case maybe_arg_tys of + Nothing -> returnL Nothing + Just arg_tys -> checkFunApp fun_ty arg_tys (mkFunAppMsg fun_ty arg_tys e) + +lintStgExpr e@(StgConApp con args) + = mapMaybeL lintStgArg args `thenL` \ maybe_arg_tys -> + case maybe_arg_tys of + Nothing -> returnL Nothing + Just arg_tys -> checkFunApp con_ty arg_tys (mkFunAppMsg con_ty arg_tys e) + where + con_ty = dataConRepType con + +lintStgExpr e@(StgOpApp (StgFCallOp _ _) args res_ty) + = -- We don't 
have enough type information to check + -- the application; ToDo + mapMaybeL lintStgArg args `thenL` \ maybe_arg_tys -> + returnL (Just res_ty) + +lintStgExpr e@(StgOpApp (StgPrimOp op) args _) + = mapMaybeL lintStgArg args `thenL` \ maybe_arg_tys -> + case maybe_arg_tys of + Nothing -> returnL Nothing + Just arg_tys -> checkFunApp op_ty arg_tys (mkFunAppMsg op_ty arg_tys e) + where + op_ty = primOpType op + +lintStgExpr (StgLam _ bndrs _) + = addErrL (ptext SLIT("Unexpected StgLam") <+> ppr bndrs) `thenL_` + returnL Nothing + +lintStgExpr (StgLet binds body) + = lintStgBinds binds `thenL` \ binders -> + addLoc (BodyOfLetRec binders) ( + addInScopeVars binders ( + lintStgExpr body + )) + +lintStgExpr (StgLetNoEscape _ _ binds body) + = lintStgBinds binds `thenL` \ binders -> + addLoc (BodyOfLetRec binders) ( + addInScopeVars binders ( + lintStgExpr body + )) + +lintStgExpr (StgSCC _ expr) = lintStgExpr expr + +lintStgExpr e@(StgCase scrut _ _ bndr _ alts_type alts) + = lintStgExpr scrut `thenMaybeL` \ _ -> + + (case alts_type of + AlgAlt tc -> check_bndr tc + PrimAlt tc -> check_bndr tc + UbxTupAlt tc -> check_bndr tc + PolyAlt -> returnL () + ) `thenL_` + + (trace (showSDoc (ppr e)) $ + -- we only allow case of tail-call or primop. + (case scrut of + StgApp _ _ -> returnL () + StgConApp _ _ -> returnL () + StgOpApp _ _ _ -> returnL () + other -> addErrL (mkCaseOfCaseMsg e)) `thenL_` + + addInScopeVars [bndr] (lintStgAlts alts scrut_ty) + ) + where + scrut_ty = idType bndr + bad_bndr = mkDefltMsg bndr + check_bndr tc = case splitTyConApp_maybe scrut_ty of + Just (bndr_tc, _) -> checkL (tc == bndr_tc) bad_bndr + Nothing -> addErrL bad_bndr + + +lintStgAlts :: [StgAlt] + -> Type -- Type of scrutinee + -> LintM (Maybe Type) -- Type of alternatives + +lintStgAlts alts scrut_ty + = mapL (lintAlt scrut_ty) alts `thenL` \ maybe_result_tys -> + + -- Check the result types + case catMaybes (maybe_result_tys) of + [] -> returnL Nothing + + (first_ty:tys) -> mapL check tys `thenL_` + returnL (Just first_ty) + where + check ty = checkTys first_ty ty (mkCaseAltMsg alts) + +lintAlt scrut_ty (DEFAULT, _, _, rhs) + = lintStgExpr rhs + +lintAlt scrut_ty (LitAlt lit, _, _, rhs) + = checkTys (literalType lit) scrut_ty (mkAltMsg1 scrut_ty) `thenL_` + lintStgExpr rhs + +lintAlt scrut_ty (DataAlt con, args, _, rhs) + = (case splitTyConApp_maybe scrut_ty of + Just (tycon, tys_applied) | isAlgTyCon tycon && + not (isNewTyCon tycon) -> + let + cons = tyConDataCons tycon + arg_tys = dataConInstArgTys con tys_applied + -- This almost certainly does not work for existential constructors + in + checkL (con `elem` cons) (mkAlgAltMsg2 scrut_ty con) `thenL_` + checkL (equalLength arg_tys args) (mkAlgAltMsg3 con args) + `thenL_` + mapL check (zipEqual "lintAlgAlt:stg" arg_tys args) `thenL_` + returnL () + other -> + addErrL (mkAltMsg1 scrut_ty) + ) `thenL_` + addInScopeVars args ( + lintStgExpr rhs + ) + where + check (ty, arg) = checkTys ty (idType arg) (mkAlgAltMsg4 ty arg) + + -- elem: yes, the elem-list here can sometimes be long-ish, + -- but as it's use-once, probably not worth doing anything different + -- We give it its own copy, so it isn't overloaded. 
+ elem _ [] = False + elem x (y:ys) = x==y || elem x ys +\end{code} + + +%************************************************************************ +%* * +\subsection[lint-monad]{The Lint monad} +%* * +%************************************************************************ + +\begin{code} +type LintM a = [LintLocInfo] -- Locations + -> IdSet -- Local vars in scope + -> Bag Message -- Error messages so far + -> (a, Bag Message) -- Result and error messages (if any) + +data LintLocInfo + = RhsOf Id -- The variable bound + | LambdaBodyOf [Id] -- The lambda-binder + | BodyOfLetRec [Id] -- One of the binders + +dumpLoc (RhsOf v) = + (srcLocSpan (getSrcLoc v), ptext SLIT(" [RHS of ") <> pp_binders [v] <> char ']' ) +dumpLoc (LambdaBodyOf bs) = + (srcLocSpan (getSrcLoc (head bs)), ptext SLIT(" [in body of lambda with binders ") <> pp_binders bs <> char ']' ) + +dumpLoc (BodyOfLetRec bs) = + (srcLocSpan (getSrcLoc (head bs)), ptext SLIT(" [in body of letrec with binders ") <> pp_binders bs <> char ']' ) + + +pp_binders :: [Id] -> SDoc +pp_binders bs + = sep (punctuate comma (map pp_binder bs)) + where + pp_binder b + = hsep [ppr b, dcolon, ppr (idType b)] +\end{code} + +\begin{code} +initL :: LintM a -> Maybe Message +initL m + = case (m [] emptyVarSet emptyBag) of { (_, errs) -> + if isEmptyBag errs then + Nothing + else + Just (vcat (punctuate (text "") (bagToList errs))) + } + +returnL :: a -> LintM a +returnL r loc scope errs = (r, errs) + +thenL :: LintM a -> (a -> LintM b) -> LintM b +thenL m k loc scope errs + = case m loc scope errs of + (r, errs') -> k r loc scope errs' + +thenL_ :: LintM a -> LintM b -> LintM b +thenL_ m k loc scope errs + = case m loc scope errs of + (_, errs') -> k loc scope errs' + +thenMaybeL :: LintM (Maybe a) -> (a -> LintM (Maybe b)) -> LintM (Maybe b) +thenMaybeL m k loc scope errs + = case m loc scope errs of + (Nothing, errs2) -> (Nothing, errs2) + (Just r, errs2) -> k r loc scope errs2 + +mapL :: (a -> LintM b) -> [a] -> LintM [b] +mapL f [] = returnL [] +mapL f (x:xs) + = f x `thenL` \ r -> + mapL f xs `thenL` \ rs -> + returnL (r:rs) + +mapMaybeL :: (a -> LintM (Maybe b)) -> [a] -> LintM (Maybe [b]) + -- Returns Nothing if anything fails +mapMaybeL f [] = returnL (Just []) +mapMaybeL f (x:xs) + = f x `thenMaybeL` \ r -> + mapMaybeL f xs `thenMaybeL` \ rs -> + returnL (Just (r:rs)) +\end{code} + +\begin{code} +checkL :: Bool -> Message -> LintM () +checkL True msg loc scope errs = ((), errs) +checkL False msg loc scope errs = ((), addErr errs msg loc) + +addErrL :: Message -> LintM () +addErrL msg loc scope errs = ((), addErr errs msg loc) + +addErr :: Bag Message -> Message -> [LintLocInfo] -> Bag Message + +addErr errs_so_far msg locs + = errs_so_far `snocBag` mk_msg locs + where + mk_msg (loc:_) = let (l,hdr) = dumpLoc loc + in mkLocMessage l (hdr $$ msg) + mk_msg [] = msg + +addLoc :: LintLocInfo -> LintM a -> LintM a +addLoc extra_loc m loc scope errs + = m (extra_loc:loc) scope errs + +addInScopeVars :: [Id] -> LintM a -> LintM a +addInScopeVars ids m loc scope errs + = -- We check if these "new" ids are already + -- in scope, i.e., we have *shadowing* going on. + -- For now, it's just a "trace"; we may make + -- a real error out of it... + let + new_set = mkVarSet ids + in +-- After adding -fliberate-case, Simon decided he likes shadowed +-- names after all. 
WDP 94/07 +-- (if isEmptyVarSet shadowed +-- then id +-- else pprTrace "Shadowed vars:" (ppr (varSetElems shadowed))) $ + m loc (scope `unionVarSet` new_set) errs +\end{code} + +Checking function applications: we only check that the type has the +right *number* of arrows, we don't actually compare the types. This +is because we can't expect the types to be equal - the type +applications and type lambdas that we use to calculate accurate types +have long since disappeared. + +\begin{code} +checkFunApp :: Type -- The function type + -> [Type] -- The arg type(s) + -> Message -- Error messgae + -> LintM (Maybe Type) -- The result type + +checkFunApp fun_ty arg_tys msg loc scope errs + = cfa res_ty expected_arg_tys arg_tys + where + (expected_arg_tys, res_ty) = splitFunTys (dropForAlls fun_ty) + + cfa res_ty expected [] -- Args have run out; that's fine + = (Just (mkFunTys expected res_ty), errs) + + cfa res_ty [] arg_tys -- Expected arg tys ran out first; + -- first see if res_ty is a tyvar template; + -- otherwise, maybe res_ty is a + -- dictionary type which is actually a function? + | isTyVarTy res_ty + = (Just res_ty, errs) + | otherwise + = case splitFunTys res_ty of + ([], _) -> (Nothing, addErr errs msg loc) -- Too many args + (new_expected, new_res) -> cfa new_res new_expected arg_tys + + cfa res_ty (expected_arg_ty:expected_arg_tys) (arg_ty:arg_tys) + = cfa res_ty expected_arg_tys arg_tys +\end{code} + +\begin{code} +checkInScope :: Id -> LintM () +checkInScope id loc scope errs + = if isLocalId id && not (id `elemVarSet` scope) then + ((), addErr errs (hsep [ppr id, ptext SLIT("is out of scope")]) loc) + else + ((), errs) + +checkTys :: Type -> Type -> Message -> LintM () +checkTys ty1 ty2 msg loc scope errs + = -- if (ty1 == ty2) then + ((), errs) + -- else ((), addErr errs msg loc) +\end{code} + +\begin{code} +mkCaseAltMsg :: [StgAlt] -> Message +mkCaseAltMsg alts + = ($$) (text "In some case alternatives, type of alternatives not all same:") + (empty) -- LATER: ppr alts + +mkDefltMsg :: Id -> Message +mkDefltMsg bndr + = ($$) (ptext SLIT("Binder of a case expression doesn't match type of scrutinee:")) + (panic "mkDefltMsg") + +mkFunAppMsg :: Type -> [Type] -> StgExpr -> Message +mkFunAppMsg fun_ty arg_tys expr + = vcat [text "In a function application, function type doesn't match arg types:", + hang (ptext SLIT("Function type:")) 4 (ppr fun_ty), + hang (ptext SLIT("Arg types:")) 4 (vcat (map (ppr) arg_tys)), + hang (ptext SLIT("Expression:")) 4 (ppr expr)] + +mkRhsConMsg :: Type -> [Type] -> Message +mkRhsConMsg fun_ty arg_tys + = vcat [text "In a RHS constructor application, con type doesn't match arg types:", + hang (ptext SLIT("Constructor type:")) 4 (ppr fun_ty), + hang (ptext SLIT("Arg types:")) 4 (vcat (map (ppr) arg_tys))] + +mkAltMsg1 :: Type -> Message +mkAltMsg1 ty + = ($$) (text "In a case expression, type of scrutinee does not match patterns") + (ppr ty) + +mkAlgAltMsg2 :: Type -> DataCon -> Message +mkAlgAltMsg2 ty con + = vcat [ + text "In some algebraic case alternative, constructor is not a constructor of scrutinee type:", + ppr ty, + ppr con + ] + +mkAlgAltMsg3 :: DataCon -> [Id] -> Message +mkAlgAltMsg3 con alts + = vcat [ + text "In some algebraic case alternative, number of arguments doesn't match constructor:", + ppr con, + ppr alts + ] + +mkAlgAltMsg4 :: Type -> Id -> Message +mkAlgAltMsg4 ty arg + = vcat [ + text "In some algebraic case alternative, type of argument doesn't match data constructor:", + ppr ty, + ppr arg + ] + +mkCaseOfCaseMsg :: StgExpr -> 
Message +mkCaseOfCaseMsg e + = text "Case of non-tail-call:" $$ ppr e + +mkRhsMsg :: Id -> Type -> Message +mkRhsMsg binder ty + = vcat [hsep [ptext SLIT("The type of this binder doesn't match the type of its RHS:"), + ppr binder], + hsep [ptext SLIT("Binder's type:"), ppr (idType binder)], + hsep [ptext SLIT("Rhs type:"), ppr ty] + ] + +mkUnLiftedTyMsg binder rhs + = (ptext SLIT("Let(rec) binder") <+> quotes (ppr binder) <+> + ptext SLIT("has unlifted type") <+> quotes (ppr (idType binder))) + $$ + (ptext SLIT("RHS:") <+> ppr rhs) +\end{code} diff --git a/compiler/stgSyn/StgSyn.lhs b/compiler/stgSyn/StgSyn.lhs new file mode 100644 index 0000000000..f1c50cc8fd --- /dev/null +++ b/compiler/stgSyn/StgSyn.lhs @@ -0,0 +1,786 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[StgSyn]{Shared term graph (STG) syntax for spineless-tagless code generation} + +This data type represents programs just before code generation +(conversion to @AbstractC@): basically, what we have is a stylised +form of @CoreSyntax@, the style being one that happens to be ideally +suited to spineless tagless code generation. + +\begin{code} +module StgSyn ( + GenStgArg(..), + GenStgLiveVars, + + GenStgBinding(..), GenStgExpr(..), GenStgRhs(..), + GenStgAlt, AltType(..), + + UpdateFlag(..), isUpdatable, + + StgBinderInfo, + noBinderInfo, stgSatOcc, stgUnsatOcc, satCallsOnly, + combineStgBinderInfo, + + -- a set of synonyms for the most common (only :-) parameterisation + StgArg, StgLiveVars, + StgBinding, StgExpr, StgRhs, StgAlt, + + -- StgOp + StgOp(..), + + -- SRTs + SRT(..), + + -- utils + stgBindHasCafRefs, stgArgHasCafRefs, stgRhsArity, + isDllConApp, isStgTypeArg, + stgArgType, + + pprStgBinding, pprStgBindings, pprStgBindingsWithSRTs + +#ifdef DEBUG + , pprStgLVs +#endif + ) where + +#include "HsVersions.h" + +import CostCentre ( CostCentreStack, CostCentre ) +import VarSet ( IdSet, isEmptyVarSet ) +import Var ( isId ) +import Id ( Id, idName, idType, idCafInfo ) +import IdInfo ( mayHaveCafRefs ) +import Packages ( isDllName ) +import Literal ( Literal, literalType ) +import ForeignCall ( ForeignCall ) +import DataCon ( DataCon, dataConName ) +import CoreSyn ( AltCon ) +import PprCore ( {- instances -} ) +import PrimOp ( PrimOp ) +import Outputable +import Util ( count ) +import Type ( Type ) +import TyCon ( TyCon ) +import UniqSet ( isEmptyUniqSet, uniqSetToList, UniqSet ) +import Unique ( Unique ) +import Bitmap +import DynFlags ( DynFlags ) +import Packages ( HomeModules ) +import StaticFlags ( opt_SccProfilingOn ) +\end{code} + +%************************************************************************ +%* * +\subsection{@GenStgBinding@} +%* * +%************************************************************************ + +As usual, expressions are interesting; other things are boring. Here +are the boring things [except note the @GenStgRhs@], parameterised +with respect to binder and occurrence information (just as in +@CoreSyn@): + +There is one SRT for each group of bindings. 
+ +\begin{code} +data GenStgBinding bndr occ + = StgNonRec bndr (GenStgRhs bndr occ) + | StgRec [(bndr, GenStgRhs bndr occ)] +\end{code} + +%************************************************************************ +%* * +\subsection{@GenStgArg@} +%* * +%************************************************************************ + +\begin{code} +data GenStgArg occ + = StgVarArg occ + | StgLitArg Literal + | StgTypeArg Type -- For when we want to preserve all type info +\end{code} + +\begin{code} +isStgTypeArg (StgTypeArg _) = True +isStgTypeArg other = False + +isDllArg :: HomeModules -> StgArg -> Bool + -- Does this argument refer to something in a different DLL? +isDllArg hmods (StgTypeArg v) = False +isDllArg hmods (StgVarArg v) = isDllName hmods (idName v) +isDllArg hmods (StgLitArg lit) = False + +isDllConApp :: HomeModules -> DataCon -> [StgArg] -> Bool + -- Does this constructor application refer to + -- anything in a different DLL? + -- If so, we can't allocate it statically +isDllConApp hmods con args + = isDllName hmods (dataConName con) || any (isDllArg hmods) args + +stgArgType :: StgArg -> Type + -- Very half baked becase we have lost the type arguments +stgArgType (StgVarArg v) = idType v +stgArgType (StgLitArg lit) = literalType lit +stgArgType (StgTypeArg lit) = panic "stgArgType called on stgTypeArg" +\end{code} + +%************************************************************************ +%* * +\subsection{STG expressions} +%* * +%************************************************************************ + +The @GenStgExpr@ data type is parameterised on binder and occurrence +info, as before. + +%************************************************************************ +%* * +\subsubsection{@GenStgExpr@ application} +%* * +%************************************************************************ + +An application is of a function to a list of atoms [not expressions]. +Operationally, we want to push the arguments on the stack and call the +function. (If the arguments were expressions, we would have to build +their closures first.) + +There is no constructor for a lone variable; it would appear as +@StgApp var [] _@. +\begin{code} +type GenStgLiveVars occ = UniqSet occ + +data GenStgExpr bndr occ + = StgApp + occ -- function + [GenStgArg occ] -- arguments; may be empty +\end{code} + +%************************************************************************ +%* * +\subsubsection{@StgConApp@ and @StgPrimApp@---saturated applications} +%* * +%************************************************************************ + +There are a specialised forms of application, for +constructors, primitives, and literals. +\begin{code} + | StgLit Literal + + | StgConApp DataCon + [GenStgArg occ] -- Saturated + + | StgOpApp StgOp -- Primitive op or foreign call + [GenStgArg occ] -- Saturated + Type -- Result type; we need to know the result type + -- so that we can assign result registers. +\end{code} + +%************************************************************************ +%* * +\subsubsection{@StgLam@} +%* * +%************************************************************************ + +StgLam is used *only* during CoreToStg's work. 
Before CoreToStg has finished +it encodes (\x -> e) as (let f = \x -> e in f) + +\begin{code} + | StgLam + Type -- Type of whole lambda (useful when making a binder for it) + [bndr] + StgExpr -- Body of lambda +\end{code} + + +%************************************************************************ +%* * +\subsubsection{@GenStgExpr@: case-expressions} +%* * +%************************************************************************ + +This has the same boxed/unboxed business as Core case expressions. +\begin{code} + | StgCase + (GenStgExpr bndr occ) + -- the thing to examine + + (GenStgLiveVars occ) -- Live vars of whole case expression, + -- plus everything that happens after the case + -- i.e., those which mustn't be overwritten + + (GenStgLiveVars occ) -- Live vars of RHSs (plus what happens afterwards) + -- i.e., those which must be saved before eval. + -- + -- note that an alt's constructor's + -- binder-variables are NOT counted in the + -- free vars for the alt's RHS + + bndr -- binds the result of evaluating the scrutinee + + SRT -- The SRT for the continuation + + AltType + + [GenStgAlt bndr occ] -- The DEFAULT case is always *first* + -- if it is there at all +\end{code} + +%************************************************************************ +%* * +\subsubsection{@GenStgExpr@: @let(rec)@-expressions} +%* * +%************************************************************************ + +The various forms of let(rec)-expression encode most of the +interesting things we want to do. +\begin{enumerate} +\item +\begin{verbatim} +let-closure x = [free-vars] expr [args] +in e +\end{verbatim} +is equivalent to +\begin{verbatim} +let x = (\free-vars -> \args -> expr) free-vars +\end{verbatim} +\tr{args} may be empty (and is for most closures). It isn't under +circumstances like this: +\begin{verbatim} +let x = (\y -> y+z) +\end{verbatim} +This gets mangled to +\begin{verbatim} +let-closure x = [z] [y] (y+z) +\end{verbatim} +The idea is that we compile code for @(y+z)@ in an environment in which +@z@ is bound to an offset from \tr{Node}, and @y@ is bound to an +offset from the stack pointer. + +(A let-closure is an @StgLet@ with a @StgRhsClosure@ RHS.) + +\item +\begin{verbatim} +let-constructor x = Constructor [args] +in e +\end{verbatim} + +(A let-constructor is an @StgLet@ with a @StgRhsCon@ RHS.) + +\item +Letrec-expressions are essentially the same deal as +let-closure/let-constructor, so we use a common structure and +distinguish between them with an @is_recursive@ boolean flag. + +\item +\begin{verbatim} +let-unboxed u = an arbitrary arithmetic expression in unboxed values +in e +\end{verbatim} +All the stuff on the RHS must be fully evaluated. No function calls either! + +(We've backed away from this toward case-expressions with +suitably-magical alts ...) + +\item +~[Advanced stuff here! Not to start with, but makes pattern matching +generate more efficient code.] + +\begin{verbatim} +let-escapes-not fail = expr +in e' +\end{verbatim} +Here the idea is that @e'@ guarantees not to put @fail@ in a data structure, +or pass it to another function. All @e'@ will ever do is tail-call @fail@. +Rather than build a closure for @fail@, all we need do is to record the stack +level at the moment of the @let-escapes-not@; then entering @fail@ is just +a matter of adjusting the stack pointer back down to that point and entering +the code for it. 
+ +Another example: +\begin{verbatim} +f x y = let z = huge-expression in + if y==1 then z else + if y==2 then z else + 1 +\end{verbatim} + +(A let-escapes-not is an @StgLetNoEscape@.) + +\item +We may eventually want: +\begin{verbatim} +let-literal x = Literal +in e +\end{verbatim} + +(ToDo: is this obsolete?) +\end{enumerate} + +And so the code for let(rec)-things: +\begin{code} + | StgLet + (GenStgBinding bndr occ) -- right hand sides (see below) + (GenStgExpr bndr occ) -- body + + | StgLetNoEscape -- remember: ``advanced stuff'' + (GenStgLiveVars occ) -- Live in the whole let-expression + -- Mustn't overwrite these stack slots + -- *Doesn't* include binders of the let(rec). + + (GenStgLiveVars occ) -- Live in the right hand sides (only) + -- These are the ones which must be saved on + -- the stack if they aren't there already + -- *Does* include binders of the let(rec) if recursive. + + (GenStgBinding bndr occ) -- right hand sides (see below) + (GenStgExpr bndr occ) -- body +\end{code} + +%************************************************************************ +%* * +\subsubsection{@GenStgExpr@: @scc@ expressions} +%* * +%************************************************************************ + +Finally for @scc@ expressions we introduce a new STG construct. + +\begin{code} + | StgSCC + CostCentre -- label of SCC expression + (GenStgExpr bndr occ) -- scc expression + -- end of GenStgExpr +\end{code} + +%************************************************************************ +%* * +\subsection{STG right-hand sides} +%* * +%************************************************************************ + +Here's the rest of the interesting stuff for @StgLet@s; the first +flavour is for closures: +\begin{code} +data GenStgRhs bndr occ + = StgRhsClosure + CostCentreStack -- CCS to be attached (default is CurrentCCS) + StgBinderInfo -- Info about how this binder is used (see below) + [occ] -- non-global free vars; a list, rather than + -- a set, because order is important + !UpdateFlag -- ReEntrant | Updatable | SingleEntry + SRT -- The SRT reference + [bndr] -- arguments; if empty, then not a function; + -- as above, order is important. + (GenStgExpr bndr occ) -- body +\end{code} +An example may be in order. Consider: +\begin{verbatim} +let t = \x -> \y -> ... x ... y ... p ... q in e +\end{verbatim} +Pulling out the free vars and stylising somewhat, we get the equivalent: +\begin{verbatim} +let t = (\[p,q] -> \[x,y] -> ... x ... y ... p ...q) p q +\end{verbatim} +Stg-operationally, the @[x,y]@ are on the stack, the @[p,q]@ are +offsets from @Node@ into the closure, and the code ptr for the closure +will be exactly that in parentheses above. + +The second flavour of right-hand-side is for constructors (simple but important): +\begin{code} + | StgRhsCon + CostCentreStack -- CCS to be attached (default is CurrentCCS). + -- Top-level (static) ones will end up with + -- DontCareCCS, because we don't count static + -- data in heap profiles, and we don't set CCCS + -- from static closure. + DataCon -- constructor + [GenStgArg occ] -- args +\end{code} + +\begin{code} +stgRhsArity :: StgRhs -> Int +stgRhsArity (StgRhsClosure _ _ _ _ _ bndrs _) = count isId bndrs + -- The arity never includes type parameters, so + -- when keeping type arguments and binders in the Stg syntax + -- (opt_RuntimeTypes) we have to fliter out the type binders. 
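+  -- For example (an illustrative sketch): a closure whose binders are
+  -- [a, x, y], with 'a' a type binder and 'x', 'y' value binders, has
+  -- arity 2, because 'count isId' skips the type binder.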
+stgRhsArity (StgRhsCon _ _ _) = 0 +\end{code} + +\begin{code} +stgBindHasCafRefs :: GenStgBinding bndr Id -> Bool +stgBindHasCafRefs (StgNonRec _ rhs) = rhsHasCafRefs rhs +stgBindHasCafRefs (StgRec binds) = any rhsHasCafRefs (map snd binds) + +rhsHasCafRefs (StgRhsClosure _ _ _ upd srt _ _) + = isUpdatable upd || nonEmptySRT srt +rhsHasCafRefs (StgRhsCon _ _ args) + = any stgArgHasCafRefs args + +stgArgHasCafRefs (StgVarArg id) = mayHaveCafRefs (idCafInfo id) +stgArgHasCafRefs _ = False +\end{code} + +Here's the @StgBinderInfo@ type, and its combining op: +\begin{code} +data StgBinderInfo + = NoStgBinderInfo + | SatCallsOnly -- All occurrences are *saturated* *function* calls + -- This means we don't need to build an info table and + -- slow entry code for the thing + -- Thunks never get this value + +noBinderInfo = NoStgBinderInfo +stgUnsatOcc = NoStgBinderInfo +stgSatOcc = SatCallsOnly + +satCallsOnly :: StgBinderInfo -> Bool +satCallsOnly SatCallsOnly = True +satCallsOnly NoStgBinderInfo = False + +combineStgBinderInfo :: StgBinderInfo -> StgBinderInfo -> StgBinderInfo +combineStgBinderInfo SatCallsOnly SatCallsOnly = SatCallsOnly +combineStgBinderInfo info1 info2 = NoStgBinderInfo + +-------------- +pp_binder_info NoStgBinderInfo = empty +pp_binder_info SatCallsOnly = ptext SLIT("sat-only") +\end{code} + +%************************************************************************ +%* * +\subsection[Stg-case-alternatives]{STG case alternatives} +%* * +%************************************************************************ + +Very like in @CoreSyntax@ (except no type-world stuff). + +The type constructor is guaranteed not to be abstract; that is, we can +see its representation. This is important because the code generator +uses it to determine return conventions etc. But it's not trivial +where there's a moduule loop involved, because some versions of a type +constructor might not have all the constructors visible. So +mkStgAlgAlts (in CoreToStg) ensures that it gets the TyCon from the +constructors or literals (which are guaranteed to have the Real McCoy) +rather than from the scrutinee type. + +\begin{code} +type GenStgAlt bndr occ + = (AltCon, -- alts: data constructor, + [bndr], -- constructor's parameters, + [Bool], -- "use mask", same length as + -- parameters; a True in a + -- param's position if it is + -- used in the ... + GenStgExpr bndr occ) -- ...right-hand side. + +data AltType + = PolyAlt -- Polymorphic (a type variable) + | UbxTupAlt TyCon -- Unboxed tuple + | AlgAlt TyCon -- Algebraic data type; the AltCons will be DataAlts + | PrimAlt TyCon -- Primitive data type; the AltCons will be LitAlts +\end{code} + +%************************************************************************ +%* * +\subsection[Stg]{The Plain STG parameterisation} +%* * +%************************************************************************ + +This happens to be the only one we use at the moment. + +\begin{code} +type StgBinding = GenStgBinding Id Id +type StgArg = GenStgArg Id +type StgLiveVars = GenStgLiveVars Id +type StgExpr = GenStgExpr Id Id +type StgRhs = GenStgRhs Id Id +type StgAlt = GenStgAlt Id Id +\end{code} + +%************************************************************************ +%* * +\subsubsection[UpdateFlag-datatype]{@UpdateFlag@} +%* * +%************************************************************************ + +This is also used in @LambdaFormInfo@ in the @ClosureInfo@ module. + +A @ReEntrant@ closure may be entered multiple times, but should not be +updated or blackholed. 
An @Updatable@ closure should be updated after +evaluation (and may be blackholed during evaluation). A @SingleEntry@ +closure will only be entered once, and so need not be updated but may +safely be blackholed. + +\begin{code} +data UpdateFlag = ReEntrant | Updatable | SingleEntry + +instance Outputable UpdateFlag where + ppr u + = char (case u of { ReEntrant -> 'r'; Updatable -> 'u'; SingleEntry -> 's' }) + +isUpdatable ReEntrant = False +isUpdatable SingleEntry = False +isUpdatable Updatable = True +\end{code} + +%************************************************************************ +%* * +\subsubsection{StgOp} +%* * +%************************************************************************ + +An StgOp allows us to group together PrimOps and ForeignCalls. +It's quite useful to move these around together, notably +in StgOpApp and COpStmt. + +\begin{code} +data StgOp = StgPrimOp PrimOp + + | StgFCallOp ForeignCall Unique + -- The Unique is occasionally needed by the C pretty-printer + -- (which lacks a unique supply), notably when generating a + -- typedef for foreign-export-dynamic +\end{code} + + +%************************************************************************ +%* * +\subsubsection[Static Reference Tables]{@SRT@} +%* * +%************************************************************************ + +There is one SRT per top-level function group. Each local binding and +case expression within this binding group has a subrange of the whole +SRT, expressed as an offset and length. + +In CoreToStg we collect the list of CafRefs at each SRT site, which is later +converted into the length and offset form by the SRT pass. + +\begin{code} +data SRT = NoSRT + | SRTEntries IdSet + -- generated by CoreToStg + | SRT !Int{-offset-} !Int{-length-} !Bitmap{-bitmap-} + -- generated by computeSRTs + +noSRT :: SRT +noSRT = NoSRT + +nonEmptySRT NoSRT = False +nonEmptySRT (SRTEntries vs) = not (isEmptyVarSet vs) +nonEmptySRT _ = True + +pprSRT (NoSRT) = ptext SLIT("_no_srt_") +pprSRT (SRTEntries ids) = text "SRT:" <> ppr ids +pprSRT (SRT off length bitmap) = parens (ppr off <> comma <> text "*bitmap*") +\end{code} + +%************************************************************************ +%* * +\subsection[Stg-pretty-printing]{Pretty-printing} +%* * +%************************************************************************ + +Robin Popplestone asked for semi-colon separators on STG binds; here's +hoping he likes terminators instead... Ditto for case alternatives. 
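+
+To give a feel for the output, a binding such as @f = \x -> g x y@ comes
+out roughly as follows (an approximate rendering: the cost-centre,
+binder-info and SRT annotations are omitted, and the exact layout is
+whatever the combinators below produce):
+\begin{verbatim}
+  f = \r [x] g x y;
+\end{verbatim}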
+ +\begin{code} +pprGenStgBinding :: (Outputable bndr, Outputable bdee, Ord bdee) + => GenStgBinding bndr bdee -> SDoc + +pprGenStgBinding (StgNonRec bndr rhs) + = hang (hsep [ppr bndr, equals]) + 4 ((<>) (ppr rhs) semi) + +pprGenStgBinding (StgRec pairs) + = vcat ((ifPprDebug (ptext SLIT("{- StgRec (begin) -}"))) : + (map (ppr_bind) pairs) ++ [(ifPprDebug (ptext SLIT("{- StgRec (end) -}")))]) + where + ppr_bind (bndr, expr) + = hang (hsep [ppr bndr, equals]) + 4 ((<>) (ppr expr) semi) + +pprStgBinding :: StgBinding -> SDoc +pprStgBinding bind = pprGenStgBinding bind + +pprStgBindings :: [StgBinding] -> SDoc +pprStgBindings binds = vcat (map pprGenStgBinding binds) + +pprGenStgBindingWithSRT + :: (Outputable bndr, Outputable bdee, Ord bdee) + => (GenStgBinding bndr bdee,[(Id,[Id])]) -> SDoc + +pprGenStgBindingWithSRT (bind,srts) + = vcat (pprGenStgBinding bind : map pprSRT srts) + where pprSRT (id,srt) = + ptext SLIT("SRT") <> parens (ppr id) <> ptext SLIT(": ") <> ppr srt + +pprStgBindingsWithSRTs :: [(StgBinding,[(Id,[Id])])] -> SDoc +pprStgBindingsWithSRTs binds = vcat (map pprGenStgBindingWithSRT binds) +\end{code} + +\begin{code} +instance (Outputable bdee) => Outputable (GenStgArg bdee) where + ppr = pprStgArg + +instance (Outputable bndr, Outputable bdee, Ord bdee) + => Outputable (GenStgBinding bndr bdee) where + ppr = pprGenStgBinding + +instance (Outputable bndr, Outputable bdee, Ord bdee) + => Outputable (GenStgExpr bndr bdee) where + ppr = pprStgExpr + +instance (Outputable bndr, Outputable bdee, Ord bdee) + => Outputable (GenStgRhs bndr bdee) where + ppr rhs = pprStgRhs rhs +\end{code} + +\begin{code} +pprStgArg :: (Outputable bdee) => GenStgArg bdee -> SDoc + +pprStgArg (StgVarArg var) = ppr var +pprStgArg (StgLitArg con) = ppr con +pprStgArg (StgTypeArg ty) = char '@' <+> ppr ty +\end{code} + +\begin{code} +pprStgExpr :: (Outputable bndr, Outputable bdee, Ord bdee) + => GenStgExpr bndr bdee -> SDoc +-- special case +pprStgExpr (StgLit lit) = ppr lit + +-- general case +pprStgExpr (StgApp func args) + = hang (ppr func) + 4 (sep (map (ppr) args)) +\end{code} + +\begin{code} +pprStgExpr (StgConApp con args) + = hsep [ ppr con, brackets (interppSP args)] + +pprStgExpr (StgOpApp op args _) + = hsep [ pprStgOp op, brackets (interppSP args)] + +pprStgExpr (StgLam _ bndrs body) + =sep [ char '\\' <+> ppr bndrs <+> ptext SLIT("->"), + pprStgExpr body ] +\end{code} + +\begin{code} +-- special case: let v = <very specific thing> +-- in +-- let ... +-- in +-- ... +-- +-- Very special! Suspicious! (SLPJ) + +{- +pprStgExpr (StgLet srt (StgNonRec bndr (StgRhsClosure cc bi free_vars upd_flag args rhs)) + expr@(StgLet _ _)) + = ($$) + (hang (hcat [ptext SLIT("let { "), ppr bndr, ptext SLIT(" = "), + ppr cc, + pp_binder_info bi, + ptext SLIT(" ["), ifPprDebug (interppSP free_vars), ptext SLIT("] \\"), + ppr upd_flag, ptext SLIT(" ["), + interppSP args, char ']']) + 8 (sep [hsep [ppr rhs, ptext SLIT("} in")]])) + (ppr expr) +-} + +-- special case: let ... in let ... 
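+-- rendered roughly as (an approximate sketch of the intended layout):
+--      let { x = rhs1 } in
+--      let { y = rhs2 } in
+--      body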
+ +pprStgExpr (StgLet bind expr@(StgLet _ _)) + = ($$) + (sep [hang (ptext SLIT("let {")) + 2 (hsep [pprGenStgBinding bind, ptext SLIT("} in")])]) + (ppr expr) + +-- general case +pprStgExpr (StgLet bind expr) + = sep [hang (ptext SLIT("let {")) 2 (pprGenStgBinding bind), + hang (ptext SLIT("} in ")) 2 (ppr expr)] + +pprStgExpr (StgLetNoEscape lvs_whole lvs_rhss bind expr) + = sep [hang (ptext SLIT("let-no-escape {")) + 2 (pprGenStgBinding bind), + hang ((<>) (ptext SLIT("} in ")) + (ifPprDebug ( + nest 4 ( + hcat [ptext SLIT("-- lvs: ["), interppSP (uniqSetToList lvs_whole), + ptext SLIT("]; rhs lvs: ["), interppSP (uniqSetToList lvs_rhss), + char ']'])))) + 2 (ppr expr)] + +pprStgExpr (StgSCC cc expr) + = sep [ hsep [ptext SLIT("_scc_"), ppr cc], + pprStgExpr expr ] + +pprStgExpr (StgCase expr lvs_whole lvs_rhss bndr srt alt_type alts) + = sep [sep [ptext SLIT("case"), + nest 4 (hsep [pprStgExpr expr, + ifPprDebug (dcolon <+> ppr alt_type)]), + ptext SLIT("of"), ppr bndr, char '{'], + ifPprDebug ( + nest 4 ( + hcat [ptext SLIT("-- lvs: ["), interppSP (uniqSetToList lvs_whole), + ptext SLIT("]; rhs lvs: ["), interppSP (uniqSetToList lvs_rhss), + ptext SLIT("]; "), + pprMaybeSRT srt])), + nest 2 (vcat (map pprStgAlt alts)), + char '}'] + +pprStgAlt (con, params, use_mask, expr) + = hang (hsep [ppr con, interppSP params, ptext SLIT("->")]) + 4 (ppr expr <> semi) + +pprStgOp (StgPrimOp op) = ppr op +pprStgOp (StgFCallOp op _) = ppr op + +instance Outputable AltType where + ppr PolyAlt = ptext SLIT("Polymorphic") + ppr (UbxTupAlt tc) = ptext SLIT("UbxTup") <+> ppr tc + ppr (AlgAlt tc) = ptext SLIT("Alg") <+> ppr tc + ppr (PrimAlt tc) = ptext SLIT("Prim") <+> ppr tc +\end{code} + +\begin{code} +pprStgLVs :: Outputable occ => GenStgLiveVars occ -> SDoc +pprStgLVs lvs + = getPprStyle $ \ sty -> + if userStyle sty || isEmptyUniqSet lvs then + empty + else + hcat [text "{-lvs:", interpp'SP (uniqSetToList lvs), text "-}"] +\end{code} + +\begin{code} +pprStgRhs :: (Outputable bndr, Outputable bdee, Ord bdee) + => GenStgRhs bndr bdee -> SDoc + +-- special case +pprStgRhs (StgRhsClosure cc bi [free_var] upd_flag srt [{-no args-}] (StgApp func [])) + = hcat [ ppr cc, + pp_binder_info bi, + brackets (ifPprDebug (ppr free_var)), + ptext SLIT(" \\"), ppr upd_flag, pprMaybeSRT srt, ptext SLIT(" [] "), ppr func ] + +-- general case +pprStgRhs (StgRhsClosure cc bi free_vars upd_flag srt args body) + = hang (hsep [if opt_SccProfilingOn then ppr cc else empty, + pp_binder_info bi, + ifPprDebug (brackets (interppSP free_vars)), + char '\\' <> ppr upd_flag, pprMaybeSRT srt, brackets (interppSP args)]) + 4 (ppr body) + +pprStgRhs (StgRhsCon cc con args) + = hcat [ ppr cc, + space, ppr con, ptext SLIT("! "), brackets (interppSP args)] + +pprMaybeSRT (NoSRT) = empty +pprMaybeSRT srt = ptext SLIT("srt:") <> pprSRT srt +\end{code} diff --git a/compiler/stranal/DmdAnal.lhs b/compiler/stranal/DmdAnal.lhs new file mode 100644 index 0000000000..c5cfb7b4bd --- /dev/null +++ b/compiler/stranal/DmdAnal.lhs @@ -0,0 +1,1185 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-1998 +% + + ----------------- + A demand analysis + ----------------- + +\begin{code} +module DmdAnal ( dmdAnalPgm, dmdAnalTopRhs, + both {- needed by WwLib -} + ) where + +#include "HsVersions.h" + +import DynFlags ( DynFlags, DynFlag(..) 
) +import StaticFlags ( opt_MaxWorkerArgs ) +import NewDemand -- All of it +import CoreSyn +import PprCore +import CoreUtils ( exprIsHNF, exprIsTrivial, exprArity ) +import DataCon ( dataConTyCon ) +import TyCon ( isProductTyCon, isRecursiveTyCon ) +import Id ( Id, idType, idInlinePragma, + isDataConWorkId, isGlobalId, idArity, +#ifdef OLD_STRICTNESS + idDemandInfo, idStrictness, idCprInfo, idName, +#endif + idNewStrictness, idNewStrictness_maybe, + setIdNewStrictness, idNewDemandInfo, + idNewDemandInfo_maybe, + setIdNewDemandInfo + ) +#ifdef OLD_STRICTNESS +import IdInfo ( newStrictnessFromOld, newDemand ) +#endif +import Var ( Var ) +import VarEnv +import TysWiredIn ( unboxedPairDataCon ) +import TysPrim ( realWorldStatePrimTy ) +import UniqFM ( plusUFM_C, addToUFM_Directly, lookupUFM_Directly, + keysUFM, minusUFM, ufmToList, filterUFM ) +import Type ( isUnLiftedType, coreEqType ) +import CoreLint ( showPass, endPass ) +import Util ( mapAndUnzip, mapAccumL, mapAccumR, lengthIs ) +import BasicTypes ( Arity, TopLevelFlag(..), isTopLevel, isNeverActive, + RecFlag(..), isRec ) +import Maybes ( orElse, expectJust ) +import Outputable +\end{code} + +To think about + +* set a noinline pragma on bottoming Ids + +* Consider f x = x+1 `fatbar` error (show x) + We'd like to unbox x, even if that means reboxing it in the error case. + + +%************************************************************************ +%* * +\subsection{Top level stuff} +%* * +%************************************************************************ + +\begin{code} +dmdAnalPgm :: DynFlags -> [CoreBind] -> IO [CoreBind] +dmdAnalPgm dflags binds + = do { + showPass dflags "Demand analysis" ; + let { binds_plus_dmds = do_prog binds } ; + + endPass dflags "Demand analysis" + Opt_D_dump_stranal binds_plus_dmds ; +#ifdef OLD_STRICTNESS + -- Only if OLD_STRICTNESS is on, because only then is the old + -- strictness analyser run + let { dmd_changes = get_changes binds_plus_dmds } ; + printDump (text "Changes in demands" $$ dmd_changes) ; +#endif + return binds_plus_dmds + } + where + do_prog :: [CoreBind] -> [CoreBind] + do_prog binds = snd $ mapAccumL dmdAnalTopBind emptySigEnv binds + +dmdAnalTopBind :: SigEnv + -> CoreBind + -> (SigEnv, CoreBind) +dmdAnalTopBind sigs (NonRec id rhs) + = let + ( _, _, (_, rhs1)) = dmdAnalRhs TopLevel NonRecursive sigs (id, rhs) + (sigs2, _, (id2, rhs2)) = dmdAnalRhs TopLevel NonRecursive sigs (id, rhs1) + -- Do two passes to improve CPR information + -- See comments with ignore_cpr_info in mk_sig_ty + -- and with extendSigsWithLam + in + (sigs2, NonRec id2 rhs2) + +dmdAnalTopBind sigs (Rec pairs) + = let + (sigs', _, pairs') = dmdFix TopLevel sigs pairs + -- We get two iterations automatically + -- c.f. the NonRec case above + in + (sigs', Rec pairs') +\end{code} + +\begin{code} +dmdAnalTopRhs :: CoreExpr -> (StrictSig, CoreExpr) +-- Analyse the RHS and return +-- a) appropriate strictness info +-- b) the unfolding (decorated with stricntess info) +dmdAnalTopRhs rhs + = (sig, rhs2) + where + call_dmd = vanillaCall (exprArity rhs) + (_, rhs1) = dmdAnal emptySigEnv call_dmd rhs + (rhs_ty, rhs2) = dmdAnal emptySigEnv call_dmd rhs1 + sig = mkTopSigTy rhs rhs_ty + -- Do two passes; see notes with extendSigsWithLam + -- Otherwise we get bogus CPR info for constructors like + -- newtype T a = MkT a + -- The constructor looks like (\x::T a -> x), modulo the coerce + -- extendSigsWithLam will optimistically give x a CPR tag the + -- first time, which is wrong in the end. 
+\end{code} + +%************************************************************************ +%* * +\subsection{The analyser itself} +%* * +%************************************************************************ + +\begin{code} +dmdAnal :: SigEnv -> Demand -> CoreExpr -> (DmdType, CoreExpr) + +dmdAnal sigs Abs e = (topDmdType, e) + +dmdAnal sigs dmd e + | not (isStrictDmd dmd) + = let + (res_ty, e') = dmdAnal sigs evalDmd e + in + (deferType res_ty, e') + -- It's important not to analyse e with a lazy demand because + -- a) When we encounter case s of (a,b) -> + -- we demand s with U(d1d2)... but if the overall demand is lazy + -- that is wrong, and we'd need to reduce the demand on s, + -- which is inconvenient + -- b) More important, consider + -- f (let x = R in x+x), where f is lazy + -- We still want to mark x as demanded, because it will be when we + -- enter the let. If we analyse f's arg with a Lazy demand, we'll + -- just mark x as Lazy + -- c) The application rule wouldn't be right either + -- Evaluating (f x) in a L demand does *not* cause + -- evaluation of f in a C(L) demand! + + +dmdAnal sigs dmd (Lit lit) + = (topDmdType, Lit lit) + +dmdAnal sigs dmd (Var var) + = (dmdTransform sigs var dmd, Var var) + +dmdAnal sigs dmd (Note n e) + = (dmd_ty, Note n e') + where + (dmd_ty, e') = dmdAnal sigs dmd' e + dmd' = case n of + Coerce _ _ -> evalDmd -- This coerce usually arises from a recursive + other -> dmd -- newtype, and we don't want to look inside them + -- for exactly the same reason that we don't look + -- inside recursive products -- we might not reach + -- a fixpoint. So revert to a vanilla Eval demand + +dmdAnal sigs dmd (App fun (Type ty)) + = (fun_ty, App fun' (Type ty)) + where + (fun_ty, fun') = dmdAnal sigs dmd fun + +-- Lots of the other code is there to make this +-- beautiful, compositional, application rule :-) +dmdAnal sigs dmd e@(App fun arg) -- Non-type arguments + = let -- [Type arg handled above] + (fun_ty, fun') = dmdAnal sigs (Call dmd) fun + (arg_ty, arg') = dmdAnal sigs arg_dmd arg + (arg_dmd, res_ty) = splitDmdTy fun_ty + in + (res_ty `bothType` arg_ty, App fun' arg') + +dmdAnal sigs dmd (Lam var body) + | isTyVar var + = let + (body_ty, body') = dmdAnal sigs dmd body + in + (body_ty, Lam var body') + + | Call body_dmd <- dmd -- A call demand: good! + = let + sigs' = extendSigsWithLam sigs var + (body_ty, body') = dmdAnal sigs' body_dmd body + (lam_ty, var') = annotateLamIdBndr body_ty var + in + (lam_ty, Lam var' body') + + | otherwise -- Not enough demand on the lambda; but do the body + = let -- anyway to annotate it and gather free var info + (body_ty, body') = dmdAnal sigs evalDmd body + (lam_ty, var') = annotateLamIdBndr body_ty var + in + (deferType lam_ty, Lam var' body') + +dmdAnal sigs dmd (Case scrut case_bndr ty [alt@(DataAlt dc,bndrs,rhs)]) + | let tycon = dataConTyCon dc, + isProductTyCon tycon, + not (isRecursiveTyCon tycon) + = let + sigs_alt = extendSigEnv NotTopLevel sigs case_bndr case_bndr_sig + (alt_ty, alt') = dmdAnalAlt sigs_alt dmd alt + (alt_ty1, case_bndr') = annotateBndr alt_ty case_bndr + (_, bndrs', _) = alt' + case_bndr_sig = cprSig + -- Inside the alternative, the case binder has the CPR property. + -- Meaning that a case on it will successfully cancel. 
+ -- Example: + -- f True x = case x of y { I# x' -> if x' ==# 3 then y else I# 8 } + -- f False x = I# 3 + -- + -- We want f to have the CPR property: + -- f b x = case fw b x of { r -> I# r } + -- fw True x = case x of y { I# x' -> if x' ==# 3 then x' else 8 } + -- fw False x = 3 + + -- Figure out whether the demand on the case binder is used, and use + -- that to set the scrut_dmd. This is utterly essential. + -- Consider f x = case x of y { (a,b) -> k y a } + -- If we just take scrut_demand = U(L,A), then we won't pass x to the + -- worker, so the worker will rebuild + -- x = (a, absent-error) + -- and that'll crash. + -- So at one stage I had: + -- dead_case_bndr = isAbsentDmd (idNewDemandInfo case_bndr') + -- keepity | dead_case_bndr = Drop + -- | otherwise = Keep + -- + -- But then consider + -- case x of y { (a,b) -> h y + a } + -- where h : U(LL) -> T + -- The above code would compute a Keep for x, since y is not Abs, which is silly + -- The insight is, of course, that a demand on y is a demand on the + -- scrutinee, so we need to `both` it with the scrut demand + + scrut_dmd = Eval (Prod [idNewDemandInfo b | b <- bndrs', isId b]) + `both` + idNewDemandInfo case_bndr' + + (scrut_ty, scrut') = dmdAnal sigs scrut_dmd scrut + in + (alt_ty1 `bothType` scrut_ty, Case scrut' case_bndr' ty [alt']) + +dmdAnal sigs dmd (Case scrut case_bndr ty alts) + = let + (alt_tys, alts') = mapAndUnzip (dmdAnalAlt sigs dmd) alts + (scrut_ty, scrut') = dmdAnal sigs evalDmd scrut + (alt_ty, case_bndr') = annotateBndr (foldr1 lubType alt_tys) case_bndr + in +-- pprTrace "dmdAnal:Case" (ppr alts $$ ppr alt_tys) + (alt_ty `bothType` scrut_ty, Case scrut' case_bndr' ty alts') + +dmdAnal sigs dmd (Let (NonRec id rhs) body) + = let + (sigs', lazy_fv, (id1, rhs')) = dmdAnalRhs NotTopLevel NonRecursive sigs (id, rhs) + (body_ty, body') = dmdAnal sigs' dmd body + (body_ty1, id2) = annotateBndr body_ty id1 + body_ty2 = addLazyFVs body_ty1 lazy_fv + in + -- If the actual demand is better than the vanilla call + -- demand, you might think that we might do better to re-analyse + -- the RHS with the stronger demand. + -- But (a) That seldom happens, because it means that *every* path in + -- the body of the let has to use that stronger demand + -- (b) It often happens temporarily in when fixpointing, because + -- the recursive function at first seems to place a massive demand. + -- But we don't want to go to extra work when the function will + -- probably iterate to something less demanding. + -- In practice, all the times the actual demand on id2 is more than + -- the vanilla call demand seem to be due to (b). So we don't + -- bother to re-analyse the RHS. + (body_ty2, Let (NonRec id2 rhs') body') + +dmdAnal sigs dmd (Let (Rec pairs) body) + = let + bndrs = map fst pairs + (sigs', lazy_fv, pairs') = dmdFix NotTopLevel sigs pairs + (body_ty, body') = dmdAnal sigs' dmd body + body_ty1 = addLazyFVs body_ty lazy_fv + in + sigs' `seq` body_ty `seq` + let + (body_ty2, _) = annotateBndrs body_ty1 bndrs + -- Don't bother to add demand info to recursive + -- binders as annotateBndr does; + -- being recursive, we can't treat them strictly. + -- But we do need to remove the binders from the result demand env + in + (body_ty2, Let (Rec pairs') body') + + +dmdAnalAlt sigs dmd (con,bndrs,rhs) + = let + (rhs_ty, rhs') = dmdAnal sigs dmd rhs + (alt_ty, bndrs') = annotateBndrs rhs_ty bndrs + final_alt_ty | io_hack_reqd = alt_ty `lubType` topDmdType + | otherwise = alt_ty + + -- There's a hack here for I/O operations. 
Consider + -- case foo x s of { (# s, r #) -> y } + -- Is this strict in 'y'. Normally yes, but what if 'foo' is an I/O + -- operation that simply terminates the program (not in an erroneous way)? + -- In that case we should not evaluate y before the call to 'foo'. + -- Hackish solution: spot the IO-like situation and add a virtual branch, + -- as if we had + -- case foo x s of + -- (# s, r #) -> y + -- other -> return () + -- So the 'y' isn't necessarily going to be evaluated + -- + -- A more complete example where this shows up is: + -- do { let len = <expensive> ; + -- ; when (...) (exitWith ExitSuccess) + -- ; print len } + + io_hack_reqd = con == DataAlt unboxedPairDataCon && + idType (head bndrs) `coreEqType` realWorldStatePrimTy + in + (final_alt_ty, (con, bndrs', rhs')) +\end{code} + +%************************************************************************ +%* * +\subsection{Bindings} +%* * +%************************************************************************ + +\begin{code} +dmdFix :: TopLevelFlag + -> SigEnv -- Does not include bindings for this binding + -> [(Id,CoreExpr)] + -> (SigEnv, DmdEnv, + [(Id,CoreExpr)]) -- Binders annotated with stricness info + +dmdFix top_lvl sigs orig_pairs + = loop 1 initial_sigs orig_pairs + where + bndrs = map fst orig_pairs + initial_sigs = extendSigEnvList sigs [(id, (initialSig id, top_lvl)) | id <- bndrs] + + loop :: Int + -> SigEnv -- Already contains the current sigs + -> [(Id,CoreExpr)] + -> (SigEnv, DmdEnv, [(Id,CoreExpr)]) + loop n sigs pairs + | found_fixpoint + = (sigs', lazy_fv, pairs') + -- Note: use pairs', not pairs. pairs' is the result of + -- processing the RHSs with sigs (= sigs'), whereas pairs + -- is the result of processing the RHSs with the *previous* + -- iteration of sigs. + + | n >= 10 = pprTrace "dmdFix loop" (ppr n <+> (vcat + [ text "Sigs:" <+> ppr [(id,lookup sigs id, lookup sigs' id) | (id,_) <- pairs], + text "env:" <+> ppr (ufmToList sigs), + text "binds:" <+> pprCoreBinding (Rec pairs)])) + (emptySigEnv, lazy_fv, orig_pairs) -- Safe output + -- The lazy_fv part is really important! orig_pairs has no strictness + -- info, including nothing about free vars. But if we have + -- letrec f = ....y..... in ...f... + -- where 'y' is free in f, we must record that y is mentioned, + -- otherwise y will get recorded as absent altogether + + | otherwise = loop (n+1) sigs' pairs' + where + found_fixpoint = all (same_sig sigs sigs') bndrs + -- Use the new signature to do the next pair + -- The occurrence analyser has arranged them in a good order + -- so this can significantly reduce the number of iterations needed + ((sigs',lazy_fv), pairs') = mapAccumL (my_downRhs top_lvl) (sigs, emptyDmdEnv) pairs + + my_downRhs top_lvl (sigs,lazy_fv) (id,rhs) + = -- pprTrace "downRhs {" (ppr id <+> (ppr old_sig)) + -- (new_sig `seq` + -- pprTrace "downRhsEnd" (ppr id <+> ppr new_sig <+> char '}' ) + ((sigs', lazy_fv'), pair') + -- ) + where + (sigs', lazy_fv1, pair') = dmdAnalRhs top_lvl Recursive sigs (id,rhs) + lazy_fv' = plusUFM_C both lazy_fv lazy_fv1 + -- old_sig = lookup sigs id + -- new_sig = lookup sigs' id + + same_sig sigs sigs' var = lookup sigs var == lookup sigs' var + lookup sigs var = case lookupVarEnv sigs var of + Just (sig,_) -> sig + + -- Get an initial strictness signature from the Id + -- itself. That way we make use of earlier iterations + -- of the fixpoint algorithm. (Cunning plan.) 
+ -- Note that the cunning plan extends to the DmdEnv too, + -- since it is part of the strictness signature +initialSig id = idNewStrictness_maybe id `orElse` botSig + +dmdAnalRhs :: TopLevelFlag -> RecFlag + -> SigEnv -> (Id, CoreExpr) + -> (SigEnv, DmdEnv, (Id, CoreExpr)) +-- Process the RHS of the binding, add the strictness signature +-- to the Id, and augment the environment with the signature as well. + +dmdAnalRhs top_lvl rec_flag sigs (id, rhs) + = (sigs', lazy_fv, (id', rhs')) + where + arity = idArity id -- The idArity should be up to date + -- The simplifier was run just beforehand + (rhs_dmd_ty, rhs') = dmdAnal sigs (vanillaCall arity) rhs + (lazy_fv, sig_ty) = WARN( arity /= dmdTypeDepth rhs_dmd_ty && not (exprIsTrivial rhs), ppr id ) + -- The RHS can be eta-reduced to just a variable, + -- in which case we should not complain. + mkSigTy top_lvl rec_flag id rhs rhs_dmd_ty + id' = id `setIdNewStrictness` sig_ty + sigs' = extendSigEnv top_lvl sigs id sig_ty +\end{code} + +%************************************************************************ +%* * +\subsection{Strictness signatures and types} +%* * +%************************************************************************ + +\begin{code} +mkTopSigTy :: CoreExpr -> DmdType -> StrictSig + -- Take a DmdType and turn it into a StrictSig + -- NB: not used for never-inline things; hence False +mkTopSigTy rhs dmd_ty = snd (mk_sig_ty False False rhs dmd_ty) + +mkSigTy :: TopLevelFlag -> RecFlag -> Id -> CoreExpr -> DmdType -> (DmdEnv, StrictSig) +mkSigTy top_lvl rec_flag id rhs dmd_ty + = mk_sig_ty never_inline thunk_cpr_ok rhs dmd_ty + where + never_inline = isNeverActive (idInlinePragma id) + maybe_id_dmd = idNewDemandInfo_maybe id + -- Is Nothing the first time round + + thunk_cpr_ok + | isTopLevel top_lvl = False -- Top level things don't get + -- their demandInfo set at all + | isRec rec_flag = False -- Ditto recursive things + | Just dmd <- maybe_id_dmd = isStrictDmd dmd + | otherwise = True -- Optimistic, first time round + -- See notes below +\end{code} + +The thunk_cpr_ok stuff [CPR-AND-STRICTNESS] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +If the rhs is a thunk, we usually forget the CPR info, because +it is presumably shared (else it would have been inlined, and +so we'd lose sharing if w/w'd it into a function. + +However, if the strictness analyser has figured out (in a previous +iteration) that it's strict, then we DON'T need to forget the CPR info. +Instead we can retain the CPR info and do the thunk-splitting transform +(see WorkWrap.splitThunk). + +This made a big difference to PrelBase.modInt, which had something like + modInt = \ x -> let r = ... -> I# v in + ...body strict in r... +r's RHS isn't a value yet; but modInt returns r in various branches, so +if r doesn't have the CPR property then neither does modInt +Another case I found in practice (in Complex.magnitude), looks like this: + let k = if ... then I# a else I# b + in ... body strict in k .... +(For this example, it doesn't matter whether k is returned as part of +the overall result; but it does matter that k's RHS has the CPR property.) +Left to itself, the simplifier will make a join point thus: + let $j k = ...body strict in k... + if ... then $j (I# a) else $j (I# b) +With thunk-splitting, we get instead + let $j x = let k = I#x in ...body strict in k... + in if ... then $j a else $j b +This is much better; there's a good chance the I# won't get allocated. + +The difficulty with this is that we need the strictness type to +look at the body... 
but we now need the body to calculate the demand +on the variable, so we can decide whether its strictness type should +have a CPR in it or not. Simple solution: + a) use strictness info from the previous iteration + b) make sure we do at least 2 iterations, by doing a second + round for top-level non-recs. Top level recs will get at + least 2 iterations except for totally-bottom functions + which aren't very interesting anyway. + +NB: strictly_demanded is never true of a top-level Id, or of a recursive Id. + +The Nothing case in thunk_cpr_ok [CPR-AND-STRICTNESS] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Demand info now has a 'Nothing' state, just like strictness info. +The analysis works from 'dangerous' towards a 'safe' state; so we +start with botSig for 'Nothing' strictness infos, and we start with +"yes, it's demanded" for 'Nothing' in the demand info. The +fixpoint iteration will sort it all out. + +We can't start with 'not-demanded' because then consider + f x = let + t = ... I# x + in + if ... then t else I# y else f x' + +In the first iteration we'd have no demand info for x, so assume +not-demanded; then we'd get TopRes for f's CPR info. Next iteration +we'd see that t was demanded, and so give it the CPR property, but by +now f has TopRes, so it will stay TopRes. Instead, with the Nothing +setting the first time round, we say 'yes t is demanded' the first +time. + +However, this does mean that for non-recursive bindings we must +iterate twice to be sure of not getting over-optimistic CPR info, +in the case where t turns out to be not-demanded. This is handled +by dmdAnalTopBind. + + +\begin{code} +mk_sig_ty never_inline thunk_cpr_ok rhs (DmdType fv dmds res) + | never_inline && not (isBotRes res) + -- HACK ALERT + -- Don't strictness-analyse NOINLINE things. Why not? Because + -- the NOINLINE says "don't expose any of the inner workings at the call + -- site" and the strictness is certainly an inner working. + -- + -- More concretely, the demand analyser discovers the following strictness + -- for unsafePerformIO: C(U(AV)) + -- But then consider + -- unsafePerformIO (\s -> let r = f x in + -- case writeIORef v r s of (# s1, _ #) -> + -- (# s1, r #) + -- The strictness analyser will find that the binding for r is strict, + -- (becuase of uPIO's strictness sig), and so it'll evaluate it before + -- doing the writeIORef. This actually makes tests/lib/should_run/memo002 + -- get a deadlock! + -- + -- Solution: don't expose the strictness of unsafePerformIO. + -- + -- But we do want to expose the strictness of error functions, + -- which are also often marked NOINLINE + -- {-# NOINLINE foo #-} + -- foo x = error ("wubble buggle" ++ x) + -- So (hack, hack) we only drop the strictness for non-bottom things + -- This is all very unsatisfactory. + = (deferEnv fv, topSig) + + | otherwise + = (lazy_fv, mkStrictSig dmd_ty) + where + dmd_ty = DmdType strict_fv final_dmds res' + + lazy_fv = filterUFM (not . isStrictDmd) fv + strict_fv = filterUFM isStrictDmd fv + -- We put the strict FVs in the DmdType of the Id, so + -- that at its call sites we unleash demands on its strict fvs. + -- An example is 'roll' in imaginary/wheel-sieve2 + -- Something like this: + -- roll x = letrec + -- go y = if ... then roll (x-1) else x+1 + -- in + -- go ms + -- We want to see that roll is strict in x, which is because + -- go is called. So we put the DmdEnv for x in go's DmdType. 
+ -- + -- Another example: + -- f :: Int -> Int -> Int + -- f x y = let t = x+1 + -- h z = if z==0 then t else + -- if z==1 then x+1 else + -- x + h (z-1) + -- in + -- h y + -- Calling h does indeed evaluate x, but we can only see + -- that if we unleash a demand on x at the call site for t. + -- + -- Incidentally, here's a place where lambda-lifting h would + -- lose the cigar --- we couldn't see the joint strictness in t/x + -- + -- ON THE OTHER HAND + -- We don't want to put *all* the fv's from the RHS into the + -- DmdType, because that makes fixpointing very slow --- the + -- DmdType gets full of lazy demands that are slow to converge. + + final_dmds = setUnpackStrategy dmds + -- Set the unpacking strategy + + res' = case res of + RetCPR | ignore_cpr_info -> TopRes + other -> res + ignore_cpr_info = not (exprIsHNF rhs || thunk_cpr_ok) +\end{code} + +The unpack strategy determines whether we'll *really* unpack the argument, +or whether we'll just remember its strictness. If unpacking would give +rise to a *lot* of worker args, we may decide not to unpack after all. + +\begin{code} +setUnpackStrategy :: [Demand] -> [Demand] +setUnpackStrategy ds + = snd (go (opt_MaxWorkerArgs - nonAbsentArgs ds) ds) + where + go :: Int -- Max number of args available for sub-components of [Demand] + -> [Demand] + -> (Int, [Demand]) -- Args remaining after subcomponents of [Demand] are unpacked + + go n (Eval (Prod cs) : ds) + | n' >= 0 = Eval (Prod cs') `cons` go n'' ds + | otherwise = Box (Eval (Prod cs)) `cons` go n ds + where + (n'',cs') = go n' cs + n' = n + 1 - non_abs_args + -- Add one to the budget 'cos we drop the top-level arg + non_abs_args = nonAbsentArgs cs + -- Delete # of non-absent args to which we'll now be committed + + go n (d:ds) = d `cons` go n ds + go n [] = (n,[]) + + cons d (n,ds) = (n, d:ds) + +nonAbsentArgs :: [Demand] -> Int +nonAbsentArgs [] = 0 +nonAbsentArgs (Abs : ds) = nonAbsentArgs ds +nonAbsentArgs (d : ds) = 1 + nonAbsentArgs ds +\end{code} + + +%************************************************************************ +%* * +\subsection{Strictness signatures and types} +%* * +%************************************************************************ + +\begin{code} +splitDmdTy :: DmdType -> (Demand, DmdType) +-- Split off one function argument +-- We already have a suitable demand on all +-- free vars, so no need to add more! +splitDmdTy (DmdType fv (dmd:dmds) res_ty) = (dmd, DmdType fv dmds res_ty) +splitDmdTy ty@(DmdType fv [] res_ty) = (resTypeArgDmd res_ty, ty) +\end{code} + +\begin{code} +unitVarDmd var dmd = DmdType (unitVarEnv var dmd) [] TopRes + +addVarDmd top_lvl dmd_ty@(DmdType fv ds res) var dmd + | isTopLevel top_lvl = dmd_ty -- Don't record top level things + | otherwise = DmdType (extendVarEnv fv var dmd) ds res + +addLazyFVs (DmdType fv ds res) lazy_fvs + = DmdType both_fv1 ds res + where + both_fv = (plusUFM_C both fv lazy_fvs) + both_fv1 = modifyEnv (isBotRes res) (`both` Bot) lazy_fvs fv both_fv + -- This modifyEnv is vital. Consider + -- let f = \x -> (x,y) + -- in error (f 3) + -- Here, y is treated as a lazy-fv of f, but we must `both` that L + -- demand with the bottom coming up from 'error' + -- + -- I got a loop in the fixpointer without this, due to an interaction + -- with the lazy_fv filtering in mkSigTy. Roughly, it was + -- letrec f n x + -- = letrec g y = x `fatbar` + -- letrec h z = z + ...g... + -- in h (f (n-1) x) + -- in ... 
+ -- In the initial iteration for f, f=Bot + -- Suppose h is found to be strict in z, but the occurrence of g in its RHS + -- is lazy. Now consider the fixpoint iteration for g, esp the demands it + -- places on its free variables. Suppose it places none. Then the + -- x `fatbar` ...call to h... + -- will give a x->V demand for x. That turns into a L demand for x, + -- which floats out of the defn for h. Without the modifyEnv, that + -- L demand doesn't get both'd with the Bot coming up from the inner + -- call to f. So we just get an L demand for x for g. + -- + -- A better way to say this is that the lazy-fv filtering should give the + -- same answer as putting the lazy fv demands in the function's type. + +annotateBndr :: DmdType -> Var -> (DmdType, Var) +-- The returned env has the var deleted +-- The returned var is annotated with demand info +-- No effect on the argument demands +annotateBndr dmd_ty@(DmdType fv ds res) var + | isTyVar var = (dmd_ty, var) + | otherwise = (DmdType fv' ds res, setIdNewDemandInfo var dmd) + where + (fv', dmd) = removeFV fv var res + +annotateBndrs = mapAccumR annotateBndr + +annotateLamIdBndr dmd_ty@(DmdType fv ds res) id +-- For lambdas we add the demand to the argument demands +-- Only called for Ids + = ASSERT( isId id ) + (DmdType fv' (hacked_dmd:ds) res, setIdNewDemandInfo id hacked_dmd) + where + (fv', dmd) = removeFV fv id res + hacked_dmd = argDemand dmd + -- This call to argDemand is vital, because otherwise we label + -- a lambda binder with demand 'B'. But in terms of calling + -- conventions that's Abs, because we don't pass it. But + -- when we do a w/w split we get + -- fw x = (\x y:B -> ...) x (error "oops") + -- And then the simplifier things the 'B' is a strict demand + -- and evaluates the (error "oops"). Sigh + +removeFV fv id res = (fv', zapUnlifted id dmd) + where + fv' = fv `delVarEnv` id + dmd = lookupVarEnv fv id `orElse` deflt + deflt | isBotRes res = Bot + | otherwise = Abs + +-- For unlifted-type variables, we are only +-- interested in Bot/Abs/Box Abs +zapUnlifted is Bot = Bot +zapUnlifted id Abs = Abs +zapUnlifted id dmd | isUnLiftedType (idType id) = lazyDmd + | otherwise = dmd +\end{code} + +%************************************************************************ +%* * +\subsection{Strictness signatures} +%* * +%************************************************************************ + +\begin{code} +type SigEnv = VarEnv (StrictSig, TopLevelFlag) + -- We use the SigEnv to tell us whether to + -- record info about a variable in the DmdEnv + -- We do so if it's a LocalId, but not top-level + -- + -- The DmdEnv gives the demand on the free vars of the function + -- when it is given enough args to satisfy the strictness signature + +emptySigEnv = emptyVarEnv + +extendSigEnv :: TopLevelFlag -> SigEnv -> Id -> StrictSig -> SigEnv +extendSigEnv top_lvl env var sig = extendVarEnv env var (sig, top_lvl) + +extendSigEnvList = extendVarEnvList + +extendSigsWithLam :: SigEnv -> Id -> SigEnv +-- Extend the SigEnv when we meet a lambda binder +-- If the binder is marked demanded with a product demand, then give it a CPR +-- signature, because in the likely event that this is a lambda on a fn defn +-- [we only use this when the lambda is being consumed with a call demand], +-- it'll be w/w'd and so it will be CPR-ish. E.g. +-- f = \x::(Int,Int). if ...strict in x... 
then +-- x +-- else +-- (a,b) +-- We want f to have the CPR property because x does, by the time f has been w/w'd +-- +-- Also note that we only want to do this for something that +-- definitely has product type, else we may get over-optimistic +-- CPR results (e.g. from \x -> x!). + +extendSigsWithLam sigs id + = case idNewDemandInfo_maybe id of + Nothing -> extendVarEnv sigs id (cprSig, NotTopLevel) + -- Optimistic in the Nothing case; + -- See notes [CPR-AND-STRICTNESS] + Just (Eval (Prod ds)) -> extendVarEnv sigs id (cprSig, NotTopLevel) + other -> sigs + + +dmdTransform :: SigEnv -- The strictness environment + -> Id -- The function + -> Demand -- The demand on the function + -> DmdType -- The demand type of the function in this context + -- Returned DmdEnv includes the demand on + -- this function plus demand on its free variables + +dmdTransform sigs var dmd + +------ DATA CONSTRUCTOR + | isDataConWorkId var -- Data constructor + = let + StrictSig dmd_ty = idNewStrictness var -- It must have a strictness sig + DmdType _ _ con_res = dmd_ty + arity = idArity var + in + if arity == call_depth then -- Saturated, so unleash the demand + let + -- Important! If we Keep the constructor application, then + -- we need the demands the constructor places (always lazy) + -- If not, we don't need to. For example: + -- f p@(x,y) = (p,y) -- S(AL) + -- g a b = f (a,b) + -- It's vital that we don't calculate Absent for a! + dmd_ds = case res_dmd of + Box (Eval ds) -> mapDmds box ds + Eval ds -> ds + other -> Poly Top + + -- ds can be empty, when we are just seq'ing the thing + -- If so we must make up a suitable bunch of demands + arg_ds = case dmd_ds of + Poly d -> replicate arity d + Prod ds -> ASSERT( ds `lengthIs` arity ) ds + + in + mkDmdType emptyDmdEnv arg_ds con_res + -- Must remember whether it's a product, hence con_res, not TopRes + else + topDmdType + +------ IMPORTED FUNCTION + | isGlobalId var, -- Imported function + let StrictSig dmd_ty = idNewStrictness var + = if dmdTypeDepth dmd_ty <= call_depth then -- Saturated, so unleash the demand + dmd_ty + else + topDmdType + +------ LOCAL LET/REC BOUND THING + | Just (StrictSig dmd_ty, top_lvl) <- lookupVarEnv sigs var + = let + fn_ty | dmdTypeDepth dmd_ty <= call_depth = dmd_ty + | otherwise = deferType dmd_ty + -- NB: it's important to use deferType, and not just return topDmdType + -- Consider let { f x y = p + x } in f 1 + -- The application isn't saturated, but we must nevertheless propagate + -- a lazy demand for p! + in + addVarDmd top_lvl fn_ty var dmd + +------ LOCAL NON-LET/REC BOUND THING + | otherwise -- Default case + = unitVarDmd var dmd + + where + (call_depth, res_dmd) = splitCallDmd dmd +\end{code} + + +%************************************************************************ +%* * +\subsection{Demands} +%* * +%************************************************************************ + +\begin{code} +splitCallDmd :: Demand -> (Int, Demand) +splitCallDmd (Call d) = case splitCallDmd d of + (n, r) -> (n+1, r) +splitCallDmd d = (0, d) + +vanillaCall :: Arity -> Demand +vanillaCall 0 = evalDmd +vanillaCall n = Call (vanillaCall (n-1)) + +deferType :: DmdType -> DmdType +deferType (DmdType fv _ _) = DmdType (deferEnv fv) [] TopRes + -- Notice that we throw away info about both arguments and results + -- For example, f = let ... in \x -> x + -- We don't want to get a stricness type V->T for f. + -- Peter?? 
+ +deferEnv :: DmdEnv -> DmdEnv +deferEnv fv = mapVarEnv defer fv + + +---------------- +argDemand :: Demand -> Demand +-- The 'Defer' demands are just Lazy at function boundaries +-- Ugly! Ask John how to improve it. +argDemand Top = lazyDmd +argDemand (Defer d) = lazyDmd +argDemand (Eval ds) = Eval (mapDmds argDemand ds) +argDemand (Box Bot) = evalDmd +argDemand (Box d) = box (argDemand d) +argDemand Bot = Abs -- Don't pass args that are consumed (only) by bottom +argDemand d = d +\end{code} + +\begin{code} +------------------------- +-- Consider (if x then y else []) with demand V +-- Then the first branch gives {y->V} and the second +-- *implicitly* has {y->A}. So we must put {y->(V `lub` A)} +-- in the result env. +lubType (DmdType fv1 ds1 r1) (DmdType fv2 ds2 r2) + = DmdType lub_fv2 (lub_ds ds1 ds2) (r1 `lubRes` r2) + where + lub_fv = plusUFM_C lub fv1 fv2 + lub_fv1 = modifyEnv (not (isBotRes r1)) absLub fv2 fv1 lub_fv + lub_fv2 = modifyEnv (not (isBotRes r2)) absLub fv1 fv2 lub_fv1 + -- lub is the identity for Bot + + -- Extend the shorter argument list to match the longer + lub_ds (d1:ds1) (d2:ds2) = lub d1 d2 : lub_ds ds1 ds2 + lub_ds [] [] = [] + lub_ds ds1 [] = map (`lub` resTypeArgDmd r2) ds1 + lub_ds [] ds2 = map (resTypeArgDmd r1 `lub`) ds2 + +----------------------------------- +-- (t1 `bothType` t2) takes the argument/result info from t1, +-- using t2 just for its free-var info +-- NB: Don't forget about r2! It might be BotRes, which is +-- a bottom demand on all the in-scope variables. +-- Peter: can this be done more neatly? +bothType (DmdType fv1 ds1 r1) (DmdType fv2 ds2 r2) + = DmdType both_fv2 ds1 (r1 `bothRes` r2) + where + both_fv = plusUFM_C both fv1 fv2 + both_fv1 = modifyEnv (isBotRes r1) (`both` Bot) fv2 fv1 both_fv + both_fv2 = modifyEnv (isBotRes r2) (`both` Bot) fv1 fv2 both_fv1 + -- both is the identity for Abs +\end{code} + + +\begin{code} +lubRes BotRes r = r +lubRes r BotRes = r +lubRes RetCPR RetCPR = RetCPR +lubRes r1 r2 = TopRes + +-- If either diverges, the whole thing does +-- Otherwise take CPR info from the first +bothRes r1 BotRes = BotRes +bothRes r1 r2 = r1 +\end{code} + +\begin{code} +modifyEnv :: Bool -- No-op if False + -> (Demand -> Demand) -- The zapper + -> DmdEnv -> DmdEnv -- Env1 and Env2 + -> DmdEnv -> DmdEnv -- Transform this env + -- Zap anything in Env1 but not in Env2 + -- Assume: dom(env) includes dom(Env1) and dom(Env2) + +modifyEnv need_to_modify zapper env1 env2 env + | need_to_modify = foldr zap env (keysUFM (env1 `minusUFM` env2)) + | otherwise = env + where + zap uniq env = addToUFM_Directly env uniq (zapper current_val) + where + current_val = expectJust "modifyEnv" (lookupUFM_Directly env uniq) +\end{code} + + +%************************************************************************ +%* * +\subsection{LUB and BOTH} +%* * +%************************************************************************ + +\begin{code} +lub :: Demand -> Demand -> Demand + +lub Bot d2 = d2 +lub Abs d2 = absLub d2 +lub Top d2 = Top +lub (Defer ds1) d2 = defer (Eval ds1 `lub` d2) + +lub (Call d1) (Call d2) = Call (d1 `lub` d2) +lub d1@(Call _) (Box d2) = d1 `lub` d2 -- Just strip the box +lub d1@(Call _) d2@(Eval _) = d2 -- Presumably seq or vanilla eval +lub d1@(Call _) d2 = d2 `lub` d1 -- Bot, Abs, Top + +-- For the Eval case, we use these approximation rules +-- Box Bot <= Eval (Box Bot ...) +-- Box Top <= Defer (Box Bot ...) 
+-- Box (Eval ds) <= Eval (map Box ds) +lub (Eval ds1) (Eval ds2) = Eval (ds1 `lubs` ds2) +lub (Eval ds1) (Box Bot) = Eval (mapDmds (`lub` Box Bot) ds1) +lub (Eval ds1) (Box (Eval ds2)) = Eval (ds1 `lubs` mapDmds box ds2) +lub (Eval ds1) (Box Abs) = deferEval (mapDmds (`lub` Box Bot) ds1) +lub d1@(Eval _) d2 = d2 `lub` d1 -- Bot,Abs,Top,Call,Defer + +lub (Box d1) (Box d2) = box (d1 `lub` d2) +lub d1@(Box _) d2 = d2 `lub` d1 + +lubs = zipWithDmds lub + +--------------------- +-- box is the smart constructor for Box +-- It computes <B,bot> & d +-- INVARIANT: (Box d) => d = Bot, Abs, Eval +-- Seems to be no point in allowing (Box (Call d)) +box (Call d) = Call d -- The odd man out. Why? +box (Box d) = Box d +box (Defer _) = lazyDmd +box Top = lazyDmd -- Box Abs and Box Top +box Abs = lazyDmd -- are the same <B,L> +box d = Box d -- Bot, Eval + +--------------- +defer :: Demand -> Demand + +-- defer is the smart constructor for Defer +-- The idea is that (Defer ds) = <U(ds), L> +-- +-- It specifies what happens at a lazy function argument +-- or a lambda; the L* operator +-- Set the strictness part to L, but leave +-- the boxity side unaffected +-- It also ensures that Defer (Eval [LLLL]) = L + +defer Bot = Abs +defer Abs = Abs +defer Top = Top +defer (Call _) = lazyDmd -- Approximation here? +defer (Box _) = lazyDmd +defer (Defer ds) = Defer ds +defer (Eval ds) = deferEval ds + +-- deferEval ds = defer (Eval ds) +deferEval ds | allTop ds = Top + | otherwise = Defer ds + +--------------------- +absLub :: Demand -> Demand +-- Computes (Abs `lub` d) +-- For the Bot case consider +-- f x y = if ... then x else error x +-- Then for y we get Abs `lub` Bot, and we really +-- want Abs overall +absLub Bot = Abs +absLub Abs = Abs +absLub Top = Top +absLub (Call _) = Top +absLub (Box _) = Top +absLub (Eval ds) = Defer (absLubs ds) -- Or (Defer ds)? +absLub (Defer ds) = Defer (absLubs ds) -- Or (Defer ds)? + +absLubs = mapDmds absLub + +--------------- +both :: Demand -> Demand -> Demand + +both Abs d2 = d2 + +both Bot Bot = Bot +both Bot Abs = Bot +both Bot (Eval ds) = Eval (mapDmds (`both` Bot) ds) + -- Consider + -- f x = error x + -- From 'error' itself we get demand Bot on x + -- From the arg demand on x we get + -- x :-> evalDmd = Box (Eval (Poly Abs)) + -- So we get Bot `both` Box (Eval (Poly Abs)) + -- = Seq Keep (Poly Bot) + -- + -- Consider also + -- f x = if ... then error (fst x) else fst x + -- Then we get (Eval (Box Bot, Bot) `lub` Eval (SA)) + -- = Eval (SA) + -- which is what we want. +both Bot d = errDmd + +both Top Bot = errDmd +both Top Abs = Top +both Top Top = Top +both Top (Box d) = Box d +both Top (Call d) = Call d +both Top (Eval ds) = Eval (mapDmds (`both` Top) ds) +both Top (Defer ds) -- = defer (Top `both` Eval ds) + -- = defer (Eval (mapDmds (`both` Top) ds)) + = deferEval (mapDmds (`both` Top) ds) + + +both (Box d1) (Box d2) = box (d1 `both` d2) +both (Box d1) d2@(Call _) = box (d1 `both` d2) +both (Box d1) d2@(Eval _) = box (d1 `both` d2) +both (Box d1) (Defer d2) = Box d1 +both d1@(Box _) d2 = d2 `both` d1 + +both (Call d1) (Call d2) = Call (d1 `both` d2) +both (Call d1) (Eval ds2) = Call d1 -- Could do better for (Poly Bot)? 
+both (Call d1) (Defer ds2) = Call d1 -- Ditto +both d1@(Call _) d2 = d1 `both` d1 + +both (Eval ds1) (Eval ds2) = Eval (ds1 `boths` ds2) +both (Eval ds1) (Defer ds2) = Eval (ds1 `boths` mapDmds defer ds2) +both d1@(Eval ds1) d2 = d2 `both` d1 + +both (Defer ds1) (Defer ds2) = deferEval (ds1 `boths` ds2) +both d1@(Defer ds1) d2 = d2 `both` d1 + +boths = zipWithDmds both +\end{code} + + + +%************************************************************************ +%* * +\subsection{Miscellaneous +%* * +%************************************************************************ + + +\begin{code} +#ifdef OLD_STRICTNESS +get_changes binds = vcat (map get_changes_bind binds) + +get_changes_bind (Rec pairs) = vcat (map get_changes_pr pairs) +get_changes_bind (NonRec id rhs) = get_changes_pr (id,rhs) + +get_changes_pr (id,rhs) + = get_changes_var id $$ get_changes_expr rhs + +get_changes_var var + | isId var = get_changes_str var $$ get_changes_dmd var + | otherwise = empty + +get_changes_expr (Type t) = empty +get_changes_expr (Var v) = empty +get_changes_expr (Lit l) = empty +get_changes_expr (Note n e) = get_changes_expr e +get_changes_expr (App e1 e2) = get_changes_expr e1 $$ get_changes_expr e2 +get_changes_expr (Lam b e) = {- get_changes_var b $$ -} get_changes_expr e +get_changes_expr (Let b e) = get_changes_bind b $$ get_changes_expr e +get_changes_expr (Case e b a) = get_changes_expr e $$ {- get_changes_var b $$ -} vcat (map get_changes_alt a) + +get_changes_alt (con,bs,rhs) = {- vcat (map get_changes_var bs) $$ -} get_changes_expr rhs + +get_changes_str id + | new_better && old_better = empty + | new_better = message "BETTER" + | old_better = message "WORSE" + | otherwise = message "INCOMPARABLE" + where + message word = text word <+> text "strictness for" <+> ppr id <+> info + info = (text "Old" <+> ppr old) $$ (text "New" <+> ppr new) + new = squashSig (idNewStrictness id) -- Don't report spurious diffs that the old + -- strictness analyser can't track + old = newStrictnessFromOld (idName id) (idArity id) (idStrictness id) (idCprInfo id) + old_better = old `betterStrictness` new + new_better = new `betterStrictness` old + +get_changes_dmd id + | isUnLiftedType (idType id) = empty -- Not useful + | new_better && old_better = empty + | new_better = message "BETTER" + | old_better = message "WORSE" + | otherwise = message "INCOMPARABLE" + where + message word = text word <+> text "demand for" <+> ppr id <+> info + info = (text "Old" <+> ppr old) $$ (text "New" <+> ppr new) + new = squashDmd (argDemand (idNewDemandInfo id)) -- To avoid spurious improvements + -- A bit of a hack + old = newDemand (idDemandInfo id) + new_better = new `betterDemand` old + old_better = old `betterDemand` new + +betterStrictness :: StrictSig -> StrictSig -> Bool +betterStrictness (StrictSig t1) (StrictSig t2) = betterDmdType t1 t2 + +betterDmdType t1 t2 = (t1 `lubType` t2) == t2 + +betterDemand :: Demand -> Demand -> Bool +-- If d1 `better` d2, and d2 `better` d2, then d1==d2 +betterDemand d1 d2 = (d1 `lub` d2) == d2 + +squashSig (StrictSig (DmdType fv ds res)) + = StrictSig (DmdType emptyDmdEnv (map squashDmd ds) res) + where + -- squash just gets rid of call demands + -- which the old analyser doesn't track +squashDmd (Call d) = evalDmd +squashDmd (Box d) = Box (squashDmd d) +squashDmd (Eval ds) = Eval (mapDmds squashDmd ds) +squashDmd (Defer ds) = Defer (mapDmds squashDmd ds) +squashDmd d = d +#endif +\end{code} diff --git a/compiler/stranal/SaAbsInt.lhs b/compiler/stranal/SaAbsInt.lhs new file mode 100644 
index 0000000000..a6a79ec166 --- /dev/null +++ b/compiler/stranal/SaAbsInt.lhs @@ -0,0 +1,925 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-1998 +% +\section[SaAbsInt]{Abstract interpreter for strictness analysis} + +\begin{code} +#ifndef OLD_STRICTNESS +-- If OLD_STRICTNESS is off, omit all exports +module SaAbsInt () where + +#else +module SaAbsInt ( + findStrictness, + findDemand, findDemandAlts, + absEval, + widen, + fixpoint, + isBot + ) where + +#include "HsVersions.h" + +import StaticFlags ( opt_AllStrict, opt_NumbersStrict ) +import CoreSyn +import CoreUnfold ( maybeUnfoldingTemplate ) +import Id ( Id, idType, idUnfolding, isDataConWorkId_maybe, + idStrictness, + ) +import DataCon ( dataConTyCon, splitProductType_maybe, dataConRepArgTys ) +import IdInfo ( StrictnessInfo(..) ) +import Demand ( Demand(..), wwPrim, wwStrict, wwUnpack, wwLazy, + mkStrictnessInfo, isLazy + ) +import SaLib +import TyCon ( isProductTyCon, isRecursiveTyCon ) +import Type ( splitTyConApp_maybe, + isUnLiftedType, Type ) +import TyCon ( tyConUnique ) +import PrelInfo ( numericTyKeys ) +import Util ( isIn, nOfThem, zipWithEqual, equalLength ) +import Outputable +\end{code} + +%************************************************************************ +%* * +\subsection[AbsVal-ops]{Operations on @AbsVals@} +%* * +%************************************************************************ + +Least upper bound, greatest lower bound. + +\begin{code} +lub, glb :: AbsVal -> AbsVal -> AbsVal + +lub AbsBot val2 = val2 +lub val1 AbsBot = val1 + +lub (AbsProd xs) (AbsProd ys) = AbsProd (zipWithEqual "lub" lub xs ys) + +lub _ _ = AbsTop -- Crude, but conservative + -- The crudity only shows up if there + -- are functions involved + +-- Slightly funny glb; for absence analysis only; +-- AbsBot is the safe answer. +-- +-- Using anyBot rather than just testing for AbsBot is important. +-- Consider: +-- +-- f = \a b -> ... +-- +-- g = \x y z -> case x of +-- [] -> f x +-- (p:ps) -> f p +-- +-- Now, the abstract value of the branches of the case will be an +-- AbsFun, but when testing for z's absence we want to spot that it's +-- an AbsFun which can't possibly return AbsBot. So when glb'ing we +-- mustn't be too keen to bale out and return AbsBot; the anyBot test +-- spots that (f x) can't possibly return AbsBot. + +-- We have also tripped over the following interesting case: +-- case x of +-- [] -> \y -> 1 +-- (p:ps) -> f +-- +-- Now, suppose f is bound to AbsTop. Does this expression mention z? +-- Obviously not. But the case will take the glb of AbsTop (for f) and +-- an AbsFun (for \y->1). We should not bale out and give AbsBot, because +-- that would say that it *does* mention z (or anything else for that matter). +-- Nor can we always return AbsTop, because the AbsFun might be something +-- like (\y->z), which obviously does mention z. The point is that we're +-- glbing two functions, and AbsTop is not actually the top of the function +-- lattice. It is more like (\xyz -> x|y|z); that is, AbsTop returns +-- poison iff any of its arguments do. + +-- Deal with functions specially, because AbsTop isn't the +-- top of their domain. 
+ +glb v1 v2 + | is_fun v1 || is_fun v2 + = if not (anyBot v1) && not (anyBot v2) + then + AbsTop + else + AbsBot + where + is_fun (AbsFun _ _) = True + is_fun (AbsApproxFun _ _) = True -- Not used, but the glb works ok + is_fun other = False + +-- The non-functional cases are quite straightforward + +glb (AbsProd xs) (AbsProd ys) = AbsProd (zipWithEqual "glb" glb xs ys) + +glb AbsTop v2 = v2 +glb v1 AbsTop = v1 + +glb _ _ = AbsBot -- Be pessimistic +\end{code} + +@isBot@ returns True if its argument is (a representation of) bottom. The +``representation'' part is because we need to detect the bottom {\em function} +too. To detect the bottom function, bind its args to top, and see if it +returns bottom. + +Used only in strictness analysis: +\begin{code} +isBot :: AbsVal -> Bool + +isBot AbsBot = True +isBot other = False -- Functions aren't bottom any more +\end{code} + +Used only in absence analysis: + +\begin{code} +anyBot :: AbsVal -> Bool + +anyBot AbsBot = True -- poisoned! +anyBot AbsTop = False +anyBot (AbsProd vals) = any anyBot vals +anyBot (AbsFun bndr_ty abs_fn) = anyBot (abs_fn AbsTop) +anyBot (AbsApproxFun _ val) = anyBot val +\end{code} + +@widen@ takes an @AbsVal@, $val$, and returns and @AbsVal@ which is +approximated by $val$. Furthermore, the result has no @AbsFun@s in +it, so it can be compared for equality by @sameVal@. + +\begin{code} +widen :: AnalysisKind -> AbsVal -> AbsVal + +-- Widening is complicated by the fact that funtions are lifted +widen StrAnal the_fn@(AbsFun bndr_ty _) + = case widened_body of + AbsApproxFun ds val -> AbsApproxFun (d : ds) val + where + d = findRecDemand str_fn abs_fn bndr_ty + str_fn val = isBot (foldl (absApply StrAnal) the_fn + (val : [AbsTop | d <- ds])) + + other -> AbsApproxFun [d] widened_body + where + d = findRecDemand str_fn abs_fn bndr_ty + str_fn val = isBot (absApply StrAnal the_fn val) + where + widened_body = widen StrAnal (absApply StrAnal the_fn AbsTop) + abs_fn val = False -- Always says poison; so it looks as if + -- nothing is absent; safe + +{- OLD comment... + This stuff is now instead handled neatly by the fact that AbsApproxFun + contains an AbsVal inside it. SLPJ Jan 97 + + | isBot abs_body = AbsBot + -- It's worth checking for a function which is unconditionally + -- bottom. Consider + -- + -- f x y = let g y = case x of ... + -- in (g ..) + (g ..) + -- + -- Here, when we are considering strictness of f in x, we'll + -- evaluate the body of f with x bound to bottom. The current + -- strategy is to bind g to its *widened* value; without the isBot + -- (...) test above, we'd bind g to an AbsApproxFun, and deliver + -- Top, not Bot as the value of f's rhs. The test spots the + -- unconditional bottom-ness of g when x is bottom. (Another + -- alternative here would be to bind g to its exact abstract + -- value, but that entails lots of potential re-computation, at + -- every application of g.) +-} + +widen StrAnal (AbsProd vals) = AbsProd (map (widen StrAnal) vals) +widen StrAnal other_val = other_val + + +widen AbsAnal the_fn@(AbsFun bndr_ty _) + | anyBot widened_body = AbsBot + -- In the absence-analysis case it's *essential* to check + -- that the function has no poison in its body. If it does, + -- anywhere, then the whole function is poisonous. 
+ + | otherwise + = case widened_body of + AbsApproxFun ds val -> AbsApproxFun (d : ds) val + where + d = findRecDemand str_fn abs_fn bndr_ty + abs_fn val = not (anyBot (foldl (absApply AbsAnal) the_fn + (val : [AbsTop | d <- ds]))) + + other -> AbsApproxFun [d] widened_body + where + d = findRecDemand str_fn abs_fn bndr_ty + abs_fn val = not (anyBot (absApply AbsAnal the_fn val)) + where + widened_body = widen AbsAnal (absApply AbsAnal the_fn AbsTop) + str_fn val = True -- Always says non-termination; + -- that'll make findRecDemand peer into the + -- structure of the value. + +widen AbsAnal (AbsProd vals) = AbsProd (map (widen AbsAnal) vals) + + -- It's desirable to do a good job of widening for product + -- values. Consider + -- + -- let p = (x,y) + -- in ...(case p of (x,y) -> x)... + -- + -- Now, is y absent in this expression? Currently the + -- analyser widens p before looking at p's scope, to avoid + -- lots of recomputation in the case where p is a function. + -- So if widening doesn't have a case for products, we'll + -- widen p to AbsBot (since when searching for absence in y we + -- bind y to poison ie AbsBot), and now we are lost. + +widen AbsAnal other_val = other_val + +-- WAS: if anyBot val then AbsBot else AbsTop +-- Nowadays widen is doing a better job on functions for absence analysis. +\end{code} + +@crudeAbsWiden@ is used just for absence analysis, and always +returns AbsTop or AbsBot, so it widens to a two-point domain + +\begin{code} +crudeAbsWiden :: AbsVal -> AbsVal +crudeAbsWiden val = if anyBot val then AbsBot else AbsTop +\end{code} + +@sameVal@ compares two abstract values for equality. It can't deal with +@AbsFun@, but that should have been removed earlier in the day by @widen@. + +\begin{code} +sameVal :: AbsVal -> AbsVal -> Bool -- Can't handle AbsFun! + +#ifdef DEBUG +sameVal (AbsFun _ _) _ = panic "sameVal: AbsFun: arg1" +sameVal _ (AbsFun _ _) = panic "sameVal: AbsFun: arg2" +#endif + +sameVal AbsBot AbsBot = True +sameVal AbsBot other = False -- widen has reduced AbsFun bots to AbsBot + +sameVal AbsTop AbsTop = True +sameVal AbsTop other = False -- Right? + +sameVal (AbsProd vals1) (AbsProd vals2) = and (zipWithEqual "sameVal" sameVal vals1 vals2) +sameVal (AbsProd _) AbsTop = False +sameVal (AbsProd _) AbsBot = False + +sameVal (AbsApproxFun str1 v1) (AbsApproxFun str2 v2) = str1 == str2 && sameVal v1 v2 +sameVal (AbsApproxFun _ _) AbsTop = False +sameVal (AbsApproxFun _ _) AbsBot = False + +sameVal val1 val2 = panic "sameVal: type mismatch or AbsFun encountered" +\end{code} + + +@evalStrictness@ compares a @Demand@ with an abstract value, returning +@True@ iff the abstract value is {\em less defined} than the demand. +(@True@ is the exciting answer; @False@ is always safe.) + +\begin{code} +evalStrictness :: Demand + -> AbsVal + -> Bool -- True iff the value is sure + -- to be less defined than the Demand + +evalStrictness (WwLazy _) _ = False +evalStrictness WwStrict val = isBot val +evalStrictness WwEnum val = isBot val + +evalStrictness (WwUnpack _ demand_info) val + = case val of + AbsTop -> False + AbsBot -> True + AbsProd vals + | not (equalLength vals demand_info) -> pprTrace "TELL SIMON: evalStrictness" (ppr demand_info $$ ppr val) + False + | otherwise -> or (zipWithEqual "evalStrictness" evalStrictness demand_info vals) + + _ -> pprTrace "evalStrictness?" 
empty False + +evalStrictness WwPrim val + = case val of + AbsTop -> False + AbsBot -> True -- Can happen: consider f (g x), where g is a + -- recursive function returning an Int# that diverges + + other -> pprPanic "evalStrictness: WwPrim:" (ppr other) +\end{code} + +For absence analysis, we're interested in whether "poison" in the +argument (ie a bottom therein) can propagate to the result of the +function call; that is, whether the specified demand can {\em +possibly} hit poison. + +\begin{code} +evalAbsence (WwLazy True) _ = False -- Can't possibly hit poison + -- with Absent demand + +evalAbsence (WwUnpack _ demand_info) val + = case val of + AbsTop -> False -- No poison in here + AbsBot -> True -- Pure poison + AbsProd vals + | not (equalLength vals demand_info) -> pprTrace "TELL SIMON: evalAbsence" (ppr demand_info $$ ppr val) + True + | otherwise -> or (zipWithEqual "evalAbsence" evalAbsence demand_info vals) + _ -> pprTrace "TELL SIMON: evalAbsence" + (ppr demand_info $$ ppr val) + True + +evalAbsence other val = anyBot val + -- The demand is conservative; even "Lazy" *might* evaluate the + -- argument arbitrarily so we have to look everywhere for poison +\end{code} + +%************************************************************************ +%* * +\subsection[absEval]{Evaluate an expression in the abstract domain} +%* * +%************************************************************************ + +\begin{code} +-- The isBottomingId stuf is now dealt with via the Id's strictness info +-- absId anal var env | isBottomingId var +-- = case anal of +-- StrAnal -> AbsBot -- See discussion below +-- AbsAnal -> AbsTop -- Just want to see if there's any poison in + -- error's arg + +absId anal var env + = case (lookupAbsValEnv env var, + isDataConWorkId_maybe var, + idStrictness var, + maybeUnfoldingTemplate (idUnfolding var)) of + + (Just abs_val, _, _, _) -> + abs_val -- Bound in the environment + + (_, Just data_con, _, _) | isProductTyCon tycon && + not (isRecursiveTyCon tycon) + -> -- A product. We get infinite loops if we don't + -- check for recursive products! + -- The strictness info on the constructor + -- isn't expressive enough to contain its abstract value + productAbsVal (dataConRepArgTys data_con) [] + where + tycon = dataConTyCon data_con + + (_, _, NoStrictnessInfo, Just unfolding) -> + -- We have an unfolding for the expr + -- Assume the unfolding has no free variables since it + -- came from inside the Id + absEval anal unfolding env + -- Notice here that we only look in the unfolding if we don't + -- have strictness info (an unusual situation). + -- We could have chosen to look in the unfolding if it exists, + -- and only try the strictness info if it doesn't, and that would + -- give more accurate results, at the cost of re-abstract-interpreting + -- the unfolding every time. + -- We found only one place where the look-at-unfolding-first + -- method gave better results, which is in the definition of + -- showInt in the Prelude. In its defintion, fromIntegral is + -- not inlined (it's big) but ab-interp-ing its unfolding gave + -- a better result than looking at its strictness only. + -- showInt :: Integral a => a -> [Char] -> [Char] + -- ! {-# GHC_PRAGMA _A_ 1 _U_ 122 _S_ + -- "U(U(U(U(SA)AAAAAAAAL)AA)AAAAASAAASA)" {...} _N_ _N_ #-} + -- --- 42,44 ---- + -- showInt :: Integral a => a -> [Char] -> [Char] + -- ! 
{-# GHC_PRAGMA _A_ 1 _U_ 122 _S_ + -- "U(U(U(U(SL)LLLLLLLLL)LL)LLLLLSLLLLL)" _N_ _N_ #-} + + + (_, _, strictness_info, _) -> + -- Includes NoUnfolding + -- Try the strictness info + absValFromStrictness anal strictness_info + +productAbsVal [] rev_abs_args = AbsProd (reverse rev_abs_args) +productAbsVal (arg_ty : arg_tys) rev_abs_args = AbsFun arg_ty (\ abs_arg -> productAbsVal arg_tys (abs_arg : rev_abs_args)) +\end{code} + +\begin{code} +absEval :: AnalysisKind -> CoreExpr -> AbsValEnv -> AbsVal + +absEval anal (Type ty) env = AbsTop +absEval anal (Var var) env = absId anal var env +\end{code} + +Discussion about error (following/quoting Lennart): Any expression +'error e' is regarded as bottom (with HBC, with the -ffail-strict +flag, on with -O). + +Regarding it as bottom gives much better strictness properties for +some functions. E.g. + + f [x] y = x+y + f (x:xs) y = f xs (x+y) +i.e. + f [] _ = error "no match" + f [x] y = x+y + f (x:xs) y = f xs (x+y) + +is strict in y, which you really want. But, it may lead to +transformations that turn a call to \tr{error} into non-termination. +(The odds of this happening aren't good.) + +Things are a little different for absence analysis, because we want +to make sure that any poison (?????) + +\begin{code} +absEval anal (Lit _) env = AbsTop + -- Literals terminate (strictness) and are not poison (absence) +\end{code} + +\begin{code} +absEval anal (Lam bndr body) env + | isTyVar bndr = absEval anal body env -- Type lambda + | otherwise = AbsFun (idType bndr) abs_fn -- Value lambda + where + abs_fn arg = absEval anal body (addOneToAbsValEnv env bndr arg) + +absEval anal (App expr (Type ty)) env + = absEval anal expr env -- Type appplication +absEval anal (App f val_arg) env + = absApply anal (absEval anal f env) -- Value applicationn + (absEval anal val_arg env) +\end{code} + +\begin{code} +absEval anal expr@(Case scrut case_bndr alts) env + = let + scrut_val = absEval anal scrut env + alts_env = addOneToAbsValEnv env case_bndr scrut_val + in + case (scrut_val, alts) of + (AbsBot, _) -> AbsBot + + (AbsProd arg_vals, [(con, bndrs, rhs)]) + | con /= DEFAULT -> + -- The scrutinee is a product value, so it must be of a single-constr + -- type; so the constructor in this alternative must be the right one + -- so we can go ahead and bind the constructor args to the components + -- of the product value. + ASSERT(equalLength arg_vals val_bndrs) + absEval anal rhs rhs_env + where + val_bndrs = filter isId bndrs + rhs_env = growAbsValEnvList alts_env (val_bndrs `zip` arg_vals) + + other -> absEvalAlts anal alts alts_env +\end{code} + +For @Lets@ we widen the value we get. This is nothing to +do with fixpointing. The reason is so that we don't get an explosion +in the amount of computation. For example, consider: +\begin{verbatim} + let + g a = case a of + q1 -> ... + q2 -> ... + f x = case x of + p1 -> ...g r... + p2 -> ...g s... + in + f e +\end{verbatim} +If we bind @f@ and @g@ to their exact abstract value, then we'll +``execute'' one call to @f@ and {\em two} calls to @g@. This can blow +up exponentially. Widening cuts it off by making a fixed +approximation to @f@ and @g@, so that the bodies of @f@ and @g@ are +not evaluated again at all when they are called. + +Of course, this can lose useful joint strictness, which is sad. An +alternative approach would be to try with a certain amount of ``fuel'' +and be prepared to bale out. 
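[Editor's illustration, not part of this module.] The paragraph above argues for widening let-bound values so the abstract bodies of @f@ and @g@ are not re-executed at every call. A minimal standalone sketch of that trade-off, with made-up names (ToyAbs, toyApply, toyWiden standing in for AbsVal, absApply, widen) and only a two-point result domain:

\begin{code}
module ToyWiden where

data ToyAbs
  = ToyTop                      -- no information
  | ToyBot                      -- certain divergence
  | ToyFun (ToyAbs -> ToyAbs)   -- exact abstract function

-- Applying an exact abstract function re-runs its abstract body on
-- every call ...
toyApply :: ToyAbs -> ToyAbs -> ToyAbs
toyApply (ToyFun f) arg = f arg
toyApply ToyBot     _   = ToyBot
toyApply ToyTop     _   = ToyTop

-- ... whereas widening probes the body once (with ToyTop) and then
-- commits to a fixed, argument-independent approximation, so later
-- applications cost nothing.  This loses joint strictness, as the
-- text above says, but bounds the work.
toyWiden :: ToyAbs -> ToyAbs
toyWiden (ToyFun f) = case f ToyTop of
                        ToyBot -> ToyBot   -- diverges whatever the argument
                        _      -> ToyTop   -- otherwise give up: no information
toyWiden v          = v
\end{code}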
+ +\begin{code} +absEval anal (Let (NonRec binder e1) e2) env + = let + new_env = addOneToAbsValEnv env binder (widen anal (absEval anal e1 env)) + in + -- The binder of a NonRec should *not* be of unboxed type, + -- hence no need to strictly evaluate the Rhs. + absEval anal e2 new_env + +absEval anal (Let (Rec pairs) body) env + = let + (binders,rhss) = unzip pairs + rhs_vals = cheapFixpoint anal binders rhss env -- Returns widened values + new_env = growAbsValEnvList env (binders `zip` rhs_vals) + in + absEval anal body new_env + +absEval anal (Note (Coerce _ _) expr) env = AbsTop + -- Don't look inside coerces, becuase they + -- are usually recursive newtypes + -- (Could improve, for the error case, but we're about + -- to kill this analyser anyway.) +absEval anal (Note note expr) env = absEval anal expr env +\end{code} + +\begin{code} +absEvalAlts :: AnalysisKind -> [CoreAlt] -> AbsValEnv -> AbsVal +absEvalAlts anal alts env + = combine anal (map go alts) + where + combine StrAnal = foldr1 lub -- Diverge only if all diverge + combine AbsAnal = foldr1 glb -- Find any poison + + go (con, bndrs, rhs) + = absEval anal rhs rhs_env + where + rhs_env = growAbsValEnvList env (filter isId bndrs `zip` repeat AbsTop) +\end{code} + +%************************************************************************ +%* * +\subsection[absApply]{Apply an abstract function to an abstract argument} +%* * +%************************************************************************ + +Easy ones first: + +\begin{code} +absApply :: AnalysisKind -> AbsVal -> AbsVal -> AbsVal + +absApply anal AbsBot arg = AbsBot + -- AbsBot represents the abstract bottom *function* too + +absApply StrAnal AbsTop arg = AbsTop +absApply AbsAnal AbsTop arg = if anyBot arg + then AbsBot + else AbsTop + -- To be conservative, we have to assume that a function about + -- which we know nothing (AbsTop) might look at some part of + -- its argument +\end{code} + +An @AbsFun@ with only one more argument needed---bind it and eval the +result. A @Lam@ with two or more args: return another @AbsFun@ with +an augmented environment. + +\begin{code} +absApply anal (AbsFun bndr_ty abs_fn) arg = abs_fn arg +\end{code} + +\begin{code} +absApply StrAnal (AbsApproxFun (d:ds) val) arg + = case ds of + [] -> val' + other -> AbsApproxFun ds val' -- Result is non-bot if there are still args + where + val' | evalStrictness d arg = AbsBot + | otherwise = val + +absApply AbsAnal (AbsApproxFun (d:ds) val) arg + = if evalAbsence d arg + then AbsBot -- Poison in arg means poison in the application + else case ds of + [] -> val + other -> AbsApproxFun ds val + +#ifdef DEBUG +absApply anal f@(AbsProd _) arg + = pprPanic ("absApply: Duff function: AbsProd." ++ show anal) ((ppr f) <+> (ppr arg)) +#endif +\end{code} + + + + +%************************************************************************ +%* * +\subsection[findStrictness]{Determine some binders' strictness} +%* * +%************************************************************************ + +\begin{code} +findStrictness :: Id + -> AbsVal -- Abstract strictness value of function + -> AbsVal -- Abstract absence value of function + -> StrictnessInfo -- Resulting strictness annotation + +findStrictness id (AbsApproxFun str_ds str_res) (AbsApproxFun abs_ds _) + -- You might think there's really no point in describing detailed + -- strictness for a divergent function; + -- If it's fully applied we get bottom regardless of the + -- argument. If it's not fully applied we don't get bottom. 
+ -- Finally, we don't want to regard the args of a divergent function + -- as 'interesting' for inlining purposes (see Simplify.prepareArgs) + -- + -- HOWEVER, if we make diverging functions appear lazy, they + -- don't get wrappers, and then we get dreadful reboxing. + -- See notes with WwLib.worthSplitting + = find_strictness id str_ds str_res abs_ds + +findStrictness id str_val abs_val + | isBot str_val = mkStrictnessInfo ([], True) + | otherwise = NoStrictnessInfo + +-- The list of absence demands passed to combineDemands +-- can be shorter than the list of absence demands +-- +-- lookup = \ dEq -> letrec { +-- lookup = \ key ds -> ...lookup... +-- } +-- in lookup +-- Here the strictness value takes three args, but the absence value +-- takes only one, for reasons I don't quite understand (see cheapFixpoint) + +find_strictness id orig_str_ds orig_str_res orig_abs_ds + = mkStrictnessInfo (go orig_str_ds orig_abs_ds, res_bot) + where + res_bot = isBot orig_str_res + + go str_ds abs_ds = zipWith mk_dmd str_ds (abs_ds ++ repeat wwLazy) + + mk_dmd str_dmd (WwLazy True) + = WARN( not (res_bot || isLazy str_dmd), + ppr id <+> ppr orig_str_ds <+> ppr orig_abs_ds ) + -- If the arg isn't used we jolly well don't expect the function + -- to be strict in it. Unless the function diverges. + WwLazy True -- Best of all + + mk_dmd (WwUnpack u str_ds) + (WwUnpack _ abs_ds) = WwUnpack u (go str_ds abs_ds) + + mk_dmd str_dmd abs_dmd = str_dmd +\end{code} + + +\begin{code} +findDemand dmd str_env abs_env expr binder + = findRecDemand str_fn abs_fn (idType binder) + where + str_fn val = evalStrictness dmd (absEval StrAnal expr (addOneToAbsValEnv str_env binder val)) + abs_fn val = not (evalAbsence dmd (absEval AbsAnal expr (addOneToAbsValEnv abs_env binder val))) + +findDemandAlts dmd str_env abs_env alts binder + = findRecDemand str_fn abs_fn (idType binder) + where + str_fn val = evalStrictness dmd (absEvalAlts StrAnal alts (addOneToAbsValEnv str_env binder val)) + abs_fn val = not (evalAbsence dmd (absEvalAlts AbsAnal alts (addOneToAbsValEnv abs_env binder val))) +\end{code} + +@findRecDemand@ is where we finally convert strictness/absence info +into ``Demands'' which we can pin on Ids (etc.). + +NOTE: What do we do if something is {\em both} strict and absent? +Should \tr{f x y z = error "foo"} says that \tr{f}'s arguments are all +strict (because of bottoming effect of \tr{error}) or all absent +(because they're not used)? + +Well, for practical reasons, we prefer absence over strictness. In +particular, it makes the ``default defaults'' for class methods (the +ones that say \tr{defm.foo dict = error "I don't exist"}) come out +nicely [saying ``the dict isn't used''], rather than saying it is +strict in every component of the dictionary [massive gratuitious +casing to take the dict apart]. + +But you could have examples where going for strictness would be better +than absence. Consider: +\begin{verbatim} + let x = something big + in + f x y z + g x +\end{verbatim} + +If \tr{x} is marked absent in \tr{f}, but not strict, and \tr{g} is +lazy, then the thunk for \tr{x} will be built. If \tr{f} was strict, +then we'd let-to-case it: +\begin{verbatim} + case something big of + x -> f x y z + g x +\end{verbatim} +Ho hum. 
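[Editor's illustration, not part of this module.] The NOTE above says that when an argument is both strict and absent, absence is preferred. A tiny sketch of that preference order, with hypothetical names (the real decision is made inside findRecDemand below):

\begin{code}
module ToyDemandChoice where

data ToyDemand = ToyAbsent | ToyStrict | ToyLazy
  deriving (Eq, Show)

chooseDemand :: Bool       -- True <=> the argument is never used (absent)
             -> Bool       -- True <=> applying the function to a diverging
                           --          argument diverges (strict)
             -> ToyDemand
chooseDemand is_absent is_strict
  | is_absent = ToyAbsent   -- absence wins over strictness, as in the NOTE
  | is_strict = ToyStrict
  | otherwise = ToyLazy
\end{code}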
+ +\begin{code} +findRecDemand :: (AbsVal -> Bool) -- True => function applied to this value yields Bot + -> (AbsVal -> Bool) -- True => function applied to this value yields no poison + -> Type -- The type of the argument + -> Demand + +findRecDemand str_fn abs_fn ty + = if isUnLiftedType ty then -- It's a primitive type! + wwPrim + + else if abs_fn AbsBot then -- It's absent + -- We prefer absence over strictness: see NOTE above. + WwLazy True + + else if not (opt_AllStrict || + (opt_NumbersStrict && is_numeric_type ty) || + str_fn AbsBot) then + WwLazy False -- It's not strict and we're not pretending + + else -- It's strict (or we're pretending it is)! + + case splitProductType_maybe ty of + + Nothing -> wwStrict -- Could have a test for wwEnum, but + -- we don't exploit it yet, so don't bother + + Just (tycon,_,data_con,cmpnt_tys) -- Single constructor case + | isRecursiveTyCon tycon -- Recursive data type; don't unpack + -> wwStrict -- (this applies to newtypes too: + -- e.g. data Void = MkVoid Void) + + | null compt_strict_infos -- A nullary data type + -> wwStrict + + | otherwise -- Some other data type + -> wwUnpack compt_strict_infos + + where + prod_len = length cmpnt_tys + compt_strict_infos + = [ findRecDemand + (\ cmpnt_val -> + str_fn (mkMainlyTopProd prod_len i cmpnt_val) + ) + (\ cmpnt_val -> + abs_fn (mkMainlyTopProd prod_len i cmpnt_val) + ) + cmpnt_ty + | (cmpnt_ty, i) <- cmpnt_tys `zip` [1..] ] + + where + is_numeric_type ty + = case (splitTyConApp_maybe ty) of -- NB: duplicates stuff done above + Nothing -> False + Just (tycon, _) -> tyConUnique tycon `is_elem` numericTyKeys + where + is_elem = isIn "is_numeric_type" + + -- mkMainlyTopProd: make an AbsProd that is all AbsTops ("n"-1 of + -- them) except for a given value in the "i"th position. + + mkMainlyTopProd :: Int -> Int -> AbsVal -> AbsVal + + mkMainlyTopProd n i val + = let + befores = nOfThem (i-1) AbsTop + afters = nOfThem (n-i) AbsTop + in + AbsProd (befores ++ (val : afters)) +\end{code} + +%************************************************************************ +%* * +\subsection[fixpoint]{Fixpointer for the strictness analyser} +%* * +%************************************************************************ + +The @fixpoint@ functions take a list of \tr{(binder, expr)} pairs, an +environment, and returns the abstract value of each binder. + +The @cheapFixpoint@ function makes a conservative approximation, +by binding each of the variables to Top in their own right hand sides. +That allows us to make rapid progress, at the cost of a less-than-wonderful +approximation. + +\begin{code} +cheapFixpoint :: AnalysisKind -> [Id] -> [CoreExpr] -> AbsValEnv -> [AbsVal] + +cheapFixpoint AbsAnal [id] [rhs] env + = [crudeAbsWiden (absEval AbsAnal rhs new_env)] + where + new_env = addOneToAbsValEnv env id AbsTop -- Unsafe starting point! + -- In the just-one-binding case, we guarantee to + -- find a fixed point in just one iteration, + -- because we are using only a two-point domain. + -- This improves matters in cases like: + -- + -- f x y = letrec g = ...g... + -- in g x + -- + -- Here, y isn't used at all, but if g is bound to + -- AbsBot we simply get AbsBot as the next + -- iteration too. + +cheapFixpoint anal ids rhss env + = [widen anal (absEval anal rhs new_env) | rhs <- rhss] + -- We do just one iteration, starting from a safe + -- approximation. This won't do a good job in situations + -- like: + -- \x -> letrec f = ...g... + -- g = ...f...x... + -- in + -- ...f... 
+ -- Here, f will end up bound to Top after one iteration, + -- and hence we won't spot the strictness in x. + -- (A second iteration would solve this. ToDo: try the effect of + -- really searching for a fixed point.) + where + new_env = growAbsValEnvList env [(id,safe_val) | id <- ids] + + safe_val + = case anal of -- The safe starting point + StrAnal -> AbsTop + AbsAnal -> AbsBot +\end{code} + +\begin{code} +fixpoint :: AnalysisKind -> [Id] -> [CoreExpr] -> AbsValEnv -> [AbsVal] + +fixpoint anal [] _ env = [] + +fixpoint anal ids rhss env + = fix_loop initial_vals + where + initial_val id + = case anal of -- The (unsafe) starting point + AbsAnal -> AbsTop + StrAnal -> AbsBot + -- At one stage for StrAnal we said: + -- if (returnsRealWorld (idType id)) + -- then AbsTop -- this is a massively horrible hack (SLPJ 95/05) + -- but no one has the foggiest idea what this hack did, + -- and returnsRealWorld was a stub that always returned False + -- So this comment is all that is left of the hack! + + initial_vals = [ initial_val id | id <- ids ] + + fix_loop :: [AbsVal] -> [AbsVal] + + fix_loop current_widened_vals + = let + new_env = growAbsValEnvList env (ids `zip` current_widened_vals) + new_vals = [ absEval anal rhs new_env | rhs <- rhss ] + new_widened_vals = map (widen anal) new_vals + in + if (and (zipWith sameVal current_widened_vals new_widened_vals)) then + current_widened_vals + + -- NB: I was too chicken to make that a zipWithEqual, + -- lest I jump into a black hole. WDP 96/02 + + -- Return the widened values. We might get a slightly + -- better value by returning new_vals (which we used to + -- do, see below), but alas that means that whenever the + -- function is called we have to re-execute it, which is + -- expensive. + + -- OLD VERSION + -- new_vals + -- Return the un-widened values which may be a bit better + -- than the widened ones, and are guaranteed safe, since + -- they are one iteration beyond current_widened_vals, + -- which itself is a fixed point. + else + fix_loop new_widened_vals +\end{code} + +For absence analysis, we make do with a very very simple approach: +look for convergence in a two-point domain. + +We used to use just one iteration, starting with the variables bound +to @AbsBot@, which is safe. + +Prior to that, we used one iteration starting from @AbsTop@ (which +isn't safe). Why isn't @AbsTop@ safe? Consider: +\begin{verbatim} + letrec + x = ...p..d... + d = (x,y) + in + ... +\end{verbatim} +Here, if p is @AbsBot@, then we'd better {\em not} end up with a ``fixed +point'' of @d@ being @(AbsTop, AbsTop)@! An @AbsBot@ initial value is +safe because it gives poison more often than really necessary, and +thus may miss some absence, but will never claim absence when it ain't +so. + +Anyway, one iteration starting with everything bound to @AbsBot@ give +bad results for + + f = \ x -> ...f... + +Here, f would always end up bound to @AbsBot@, which ain't very +clever, because then it would introduce poison whenever it was +applied. Much better to start with f bound to @AbsTop@, and widen it +to @AbsBot@ if any poison shows up. In effect we look for convergence +in the two-point @AbsTop@/@AbsBot@ domain. + +What we miss (compared with the cleverer strictness analysis) is +spotting that in this case + + f = \ x y -> ...y...(f x y')... + +\tr{x} is actually absent, since it is only passed round the loop, never +used. But who cares about missing that? 
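[Editor's illustration, not part of this module.] The discussion above describes starting the absence fixpoint at @AbsTop@ and widening to @AbsBot@ only when poison shows up, i.e. looking for convergence in a two-point domain. A standalone sketch of that loop, with made-up names (the real loop is fix_loop above), assuming the abstract body is monotone as in the real analyser, so a binding can only move from top to bottom, never back:

\begin{code}
module ToyTwoPointFix where

data ToyTwoPoint = ToyTop2 | ToyBot2
  deriving (Eq, Show)

-- Start optimistically at ToyTop2 and re-run the abstract body until
-- the value stops changing.  With a monotone body each binding can
-- change at most once (ToyTop2 -> ToyBot2).
toyFix :: (ToyTwoPoint -> ToyTwoPoint) -> ToyTwoPoint
toyFix toyBody = go ToyTop2
  where
    go v | v' == v   = v        -- converged
         | otherwise = go v'
      where v' = toyBody v
\end{code}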
+ +NB: despite only having a two-point domain, we may still have many +iterations, because there are several variables involved at once. + +\begin{code} +#endif /* OLD_STRICTNESS */ +\end{code} diff --git a/compiler/stranal/SaLib.lhs b/compiler/stranal/SaLib.lhs new file mode 100644 index 0000000000..338a351530 --- /dev/null +++ b/compiler/stranal/SaLib.lhs @@ -0,0 +1,130 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-1998 +% +\section[SaLib]{Basic datatypes, functions for the strictness analyser} + +See also: the ``library'' for the ``back end'' (@SaBackLib@). + +\begin{code} +#ifndef OLD_STRICTNESS +module SaLib () where +#else + +module SaLib ( + AbsVal(..), + AnalysisKind(..), + AbsValEnv{-abstract-}, StrictEnv, AbsenceEnv, + mkAbsApproxFun, + nullAbsValEnv, addOneToAbsValEnv, growAbsValEnvList, + lookupAbsValEnv, + absValFromStrictness + ) where + +#include "HsVersions.h" + +import Type ( Type ) +import VarEnv +import IdInfo ( StrictnessInfo(..) ) +import Demand ( Demand ) +import Outputable +\end{code} + +%************************************************************************ +%* * +\subsection[AbsVal-datatype]{@AbsVal@: abstract values (and @AbsValEnv@)} +%* * +%************************************************************************ + +@AnalysisKind@ tells what kind of analysis is being done. + +\begin{code} +data AnalysisKind + = StrAnal -- We're doing strictness analysis + | AbsAnal -- We're doing absence analysis + deriving Show +\end{code} + +@AbsVal@ is the data type of HNF abstract values. + +\begin{code} +data AbsVal + = AbsTop -- AbsTop is the completely uninformative + -- value + + | AbsBot -- An expression whose abstract value is + -- AbsBot is sure to fail to terminate. + -- AbsBot represents the abstract + -- *function* bottom too. + + | AbsProd [AbsVal] -- (Lifted) product of abstract values + -- "Lifted" means that AbsBot is *different* from + -- AbsProd [AbsBot, ..., AbsBot] + + | AbsFun -- An abstract function, with the given: + Type -- Type of the *argument* to the function + (AbsVal -> AbsVal) -- The function + + | AbsApproxFun -- This is used to represent a coarse + [Demand] -- approximation to a function value. It's an + AbsVal -- abstract function which is strict in its + -- arguments if the Demand so indicates. + -- INVARIANT: the [Demand] is non-empty + + -- AbsApproxFun has to take a *list* of demands, no just one, + -- because function spaces are now lifted. Hence, (f bot top) + -- might be bot, but the partial application (f bot) is a *function*, + -- not bot. + +mkAbsApproxFun :: Demand -> AbsVal -> AbsVal +mkAbsApproxFun d (AbsApproxFun ds val) = AbsApproxFun (d:ds) val +mkAbsApproxFun d val = AbsApproxFun [d] val + +instance Outputable AbsVal where + ppr AbsTop = ptext SLIT("AbsTop") + ppr AbsBot = ptext SLIT("AbsBot") + ppr (AbsProd prod) = hsep [ptext SLIT("AbsProd"), ppr prod] + ppr (AbsFun bndr_ty body) = ptext SLIT("AbsFun") + ppr (AbsApproxFun demands val) + = ptext SLIT("AbsApprox") <+> brackets (interpp'SP demands) <+> ppr val +\end{code} + +%----------- + +An @AbsValEnv@ maps @Ids@ to @AbsVals@. Any unbound @Ids@ are +implicitly bound to @AbsTop@, the completely uninformative, +pessimistic value---see @absEval@ of a @Var@. 
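[Editor's illustration, not part of this module.] The paragraph above says unbound @Ids@ are implicitly bound to @AbsTop@. A sketch of that lookup convention using Data.Map and String keys in place of the real IdEnv/Id machinery (ToyAbsVal and toyLookup are made-up names):

\begin{code}
module ToyAbsValEnv where

import qualified Data.Map as Map

data ToyAbsVal = ToyTopV | ToyBotV
  deriving (Eq, Show)

type ToyAbsValEnv = Map.Map String ToyAbsVal

-- Unbound variables come back as the pessimistic, uninformative value.
toyLookup :: ToyAbsValEnv -> String -> ToyAbsVal
toyLookup env v = Map.findWithDefault ToyTopV v env

-- For example:
--   toyLookup (Map.fromList [("x", ToyBotV)]) "y"  ==  ToyTopV
\end{code}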
+ +\begin{code} +newtype AbsValEnv = AbsValEnv (IdEnv AbsVal) + +type StrictEnv = AbsValEnv -- Environment for strictness analysis +type AbsenceEnv = AbsValEnv -- Environment for absence analysis + +nullAbsValEnv -- this is the one and only way to create AbsValEnvs + = AbsValEnv emptyVarEnv + +addOneToAbsValEnv (AbsValEnv idenv) y z = AbsValEnv (extendVarEnv idenv y z) +growAbsValEnvList (AbsValEnv idenv) ys = AbsValEnv (extendVarEnvList idenv ys) + +lookupAbsValEnv (AbsValEnv idenv) y + = lookupVarEnv idenv y +\end{code} + +\begin{code} +absValFromStrictness :: AnalysisKind -> StrictnessInfo -> AbsVal + +absValFromStrictness anal NoStrictnessInfo = AbsTop +absValFromStrictness anal (StrictnessInfo args_info bot_result) + = case args_info of -- Check the invariant that the arg list on + [] -> res -- AbsApproxFun is non-empty + _ -> AbsApproxFun args_info res + where + res | not bot_result = AbsTop + | otherwise = case anal of + StrAnal -> AbsBot + AbsAnal -> AbsTop +\end{code} + +\begin{code} +#endif /* OLD_STRICTNESS */ +\end{code} diff --git a/compiler/stranal/StrictAnal.lhs b/compiler/stranal/StrictAnal.lhs new file mode 100644 index 0000000000..242a947074 --- /dev/null +++ b/compiler/stranal/StrictAnal.lhs @@ -0,0 +1,494 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-1998 +% +\section[StrictAnal]{``Simple'' Mycroft-style strictness analyser} + +The original version(s) of all strictness-analyser code (except the +Semantique analyser) was written by Andy Gill. + +\begin{code} +#ifndef OLD_STRICTNESS +module StrictAnal ( ) where + +#else + +module StrictAnal ( saBinds ) where + +#include "HsVersions.h" + +import DynFlags ( DynFlags, DynFlag(..) ) +import CoreSyn +import Id ( setIdStrictness, setInlinePragma, + idDemandInfo, setIdDemandInfo, isBottomingId, + Id + ) +import CoreLint ( showPass, endPass ) +import ErrUtils ( dumpIfSet_dyn ) +import SaAbsInt +import SaLib +import Demand ( Demand, wwStrict, isStrict, isLazy ) +import Util ( zipWith3Equal, stretchZipWith, compareLength ) +import BasicTypes ( Activation( NeverActive ) ) +import Outputable +import FastTypes +\end{code} + +%************************************************************************ +%* * +\subsection[Thoughts]{Random thoughts} +%* * +%************************************************************************ + +A note about worker-wrappering. If we have + + f :: Int -> Int + f = let v = <expensive> + in \x -> <body> + +and we deduce that f is strict, it is nevertheless NOT safe to worker-wapper to + + f = \x -> case x of Int x# -> fw x# + fw = \x# -> let x = Int x# + in + let v = <expensive> + in <body> + +because this obviously loses laziness, since now <expensive> +is done each time. Alas. + +WATCH OUT! This can mean that something is unboxed only to be +boxed again. For example + + g x y = f x + +Here g is strict, and *will* split into worker-wrapper. A call to +g, with the wrapper inlined will then be + + case arg of Int a# -> gw a# + +Now g calls f, which has no wrapper, so it has to box it. + + gw = \a# -> f (Int a#) + +Alas and alack. + + +%************************************************************************ +%* * +\subsection[iface-StrictAnal]{Interface to the outside world} +%* * +%************************************************************************ + +@saBinds@ decorates bindings with strictness info. A later +worker-wrapper pass can use this info to create wrappers and +strict workers. 
+ +\begin{code} +saBinds :: DynFlags -> [CoreBind] -> IO [CoreBind] +saBinds dflags binds + = do { + showPass dflags "Strictness analysis"; + + -- Mark each binder with its strictness +#ifndef OMIT_STRANAL_STATS + let { (binds_w_strictness, sa_stats) = saTopBinds binds nullSaStats }; + dumpIfSet_dyn dflags Opt_D_dump_simpl_stats "Strictness analysis statistics" + (pp_stats sa_stats); +#else + let { binds_w_strictness = saTopBindsBinds binds }; +#endif + + endPass dflags "Strictness analysis" Opt_D_dump_stranal + binds_w_strictness + } +\end{code} + +%************************************************************************ +%* * +\subsection[saBinds]{Strictness analysis of bindings} +%* * +%************************************************************************ + +[Some of the documentation about types, etc., in \tr{SaLib} may be +helpful for understanding this module.] + +@saTopBinds@ tags each binder in the program with its @Demand@. +That tells how each binder is {\em used}; if @Strict@, then the binder +is sure to be evaluated to HNF; if @NonStrict@ it may or may not be; +if @Absent@, then it certainly is not used. [DATED; ToDo: update] + +(The above info is actually recorded for posterity in each binder's +IdInfo, notably its @DemandInfo@.) + +We proceed by analysing the bindings top-to-bottom, building up an +environment which maps @Id@s to their abstract values (i.e., an +@AbsValEnv@ maps an @Id@ to its @AbsVal@). + +\begin{code} +saTopBinds :: [CoreBind] -> SaM [CoreBind] -- not exported + +saTopBinds binds + = let + starting_abs_env = nullAbsValEnv + in + do_it starting_abs_env starting_abs_env binds + where + do_it _ _ [] = returnSa [] + do_it senv aenv (b:bs) + = saTopBind senv aenv b `thenSa` \ (senv2, aenv2, new_b) -> + do_it senv2 aenv2 bs `thenSa` \ new_bs -> + returnSa (new_b : new_bs) +\end{code} + +@saTopBind@ is only used for the top level. We don't add any demand +info to these ids because we can't work it out. In any case, it +doesn't do us any good to know whether top-level binders are sure to +be used; we can't turn top-level @let@s into @case@s. + +\begin{code} +saTopBind :: StrictEnv -> AbsenceEnv + -> CoreBind + -> SaM (StrictEnv, AbsenceEnv, CoreBind) + +saTopBind str_env abs_env (NonRec binder rhs) + = saExpr minDemand str_env abs_env rhs `thenSa` \ new_rhs -> + let + str_rhs = absEval StrAnal rhs str_env + abs_rhs = absEval AbsAnal rhs abs_env + + widened_str_rhs = widen StrAnal str_rhs + widened_abs_rhs = widen AbsAnal abs_rhs + -- The widening above is done for efficiency reasons. 
+ -- See notes on Let case in SaAbsInt.lhs + + new_binder + = addStrictnessInfoToTopId + widened_str_rhs widened_abs_rhs + binder + + -- Augment environments with a mapping of the + -- binder to its abstract values, computed by absEval + new_str_env = addOneToAbsValEnv str_env binder widened_str_rhs + new_abs_env = addOneToAbsValEnv abs_env binder widened_abs_rhs + in + returnSa (new_str_env, new_abs_env, NonRec new_binder new_rhs) + +saTopBind str_env abs_env (Rec pairs) + = let + (binders,rhss) = unzip pairs + str_rhss = fixpoint StrAnal binders rhss str_env + abs_rhss = fixpoint AbsAnal binders rhss abs_env + -- fixpoint returns widened values + new_str_env = growAbsValEnvList str_env (binders `zip` str_rhss) + new_abs_env = growAbsValEnvList abs_env (binders `zip` abs_rhss) + new_binders = zipWith3Equal "saTopBind" addStrictnessInfoToTopId + str_rhss abs_rhss binders + in + mapSa (saExpr minDemand new_str_env new_abs_env) rhss `thenSa` \ new_rhss -> + let + new_pairs = new_binders `zip` new_rhss + in + returnSa (new_str_env, new_abs_env, Rec new_pairs) + +-- Hack alert! +-- Top level divergent bindings are marked NOINLINE +-- This avoids fruitless inlining of top level error functions +addStrictnessInfoToTopId str_val abs_val bndr + = if isBottomingId new_id then + new_id `setInlinePragma` NeverActive + else + new_id + where + new_id = addStrictnessInfoToId str_val abs_val bndr +\end{code} + +%************************************************************************ +%* * +\subsection[saExpr]{Strictness analysis of an expression} +%* * +%************************************************************************ + +@saExpr@ computes the strictness of an expression within a given +environment. + +\begin{code} +saExpr :: Demand -> StrictEnv -> AbsenceEnv -> CoreExpr -> SaM CoreExpr + -- The demand is the least demand we expect on the + -- expression. WwStrict is the least, because we're only + -- interested in the expression at all if it's being evaluated, + -- but the demand may be more. E.g. + -- f E + -- where f has strictness u(LL), will evaluate E with demand u(LL) + +minDemand = wwStrict +minDemands = repeat minDemand + +-- When we find an application, do the arguments +-- with demands gotten from the function +saApp str_env abs_env (fun, args) + = sequenceSa sa_args `thenSa` \ args' -> + saExpr minDemand str_env abs_env fun `thenSa` \ fun' -> + returnSa (mkApps fun' args') + where + arg_dmds = case fun of + Var var -> case lookupAbsValEnv str_env var of + Just (AbsApproxFun ds _) + | compareLength ds args /= LT + -- 'ds' is at least as long as 'args'. 
+ -> ds ++ minDemands + other -> minDemands + other -> minDemands + + sa_args = stretchZipWith isTypeArg (error "saApp:dmd") + sa_arg args arg_dmds + -- The arg_dmds are for value args only, we need to skip + -- over the type args when pairing up with the demands + -- Hence the stretchZipWith + + sa_arg arg dmd = saExpr dmd' str_env abs_env arg + where + -- Bring arg demand up to minDemand + dmd' | isLazy dmd = minDemand + | otherwise = dmd + +saExpr _ _ _ e@(Var _) = returnSa e +saExpr _ _ _ e@(Lit _) = returnSa e +saExpr _ _ _ e@(Type _) = returnSa e + +saExpr dmd str_env abs_env (Lam bndr body) + = -- Don't bother to set the demand-info on a lambda binder + -- We do that only for let(rec)-bound functions + saExpr minDemand str_env abs_env body `thenSa` \ new_body -> + returnSa (Lam bndr new_body) + +saExpr dmd str_env abs_env e@(App fun arg) + = saApp str_env abs_env (collectArgs e) + +saExpr dmd str_env abs_env (Note note expr) + = saExpr dmd str_env abs_env expr `thenSa` \ new_expr -> + returnSa (Note note new_expr) + +saExpr dmd str_env abs_env (Case expr case_bndr alts) + = saExpr minDemand str_env abs_env expr `thenSa` \ new_expr -> + mapSa sa_alt alts `thenSa` \ new_alts -> + let + new_case_bndr = addDemandInfoToCaseBndr dmd str_env abs_env alts case_bndr + in + returnSa (Case new_expr new_case_bndr new_alts) + where + sa_alt (con, binders, rhs) + = saExpr dmd str_env abs_env rhs `thenSa` \ new_rhs -> + let + new_binders = map add_demand_info binders + add_demand_info bndr | isTyVar bndr = bndr + | otherwise = addDemandInfoToId dmd str_env abs_env rhs bndr + in + tickCases new_binders `thenSa_` -- stats + returnSa (con, new_binders, new_rhs) + +saExpr dmd str_env abs_env (Let (NonRec binder rhs) body) + = -- Analyse the RHS in the environment at hand + let + -- Find the demand on the RHS + rhs_dmd = findDemand dmd str_env abs_env body binder + + -- Bind this binder to the abstract value of the RHS; analyse + -- the body of the `let' in the extended environment. + str_rhs_val = absEval StrAnal rhs str_env + abs_rhs_val = absEval AbsAnal rhs abs_env + + widened_str_rhs = widen StrAnal str_rhs_val + widened_abs_rhs = widen AbsAnal abs_rhs_val + -- The widening above is done for efficiency reasons. + -- See notes on Let case in SaAbsInt.lhs + + new_str_env = addOneToAbsValEnv str_env binder widened_str_rhs + new_abs_env = addOneToAbsValEnv abs_env binder widened_abs_rhs + + -- Now determine the strictness of this binder; use that info + -- to record DemandInfo/StrictnessInfo in the binder. + new_binder = addStrictnessInfoToId + widened_str_rhs widened_abs_rhs + (binder `setIdDemandInfo` rhs_dmd) + in + tickLet new_binder `thenSa_` -- stats + saExpr rhs_dmd str_env abs_env rhs `thenSa` \ new_rhs -> + saExpr dmd new_str_env new_abs_env body `thenSa` \ new_body -> + returnSa (Let (NonRec new_binder new_rhs) new_body) + +saExpr dmd str_env abs_env (Let (Rec pairs) body) + = let + (binders,rhss) = unzip pairs + str_vals = fixpoint StrAnal binders rhss str_env + abs_vals = fixpoint AbsAnal binders rhss abs_env + -- fixpoint returns widened values + new_str_env = growAbsValEnvList str_env (binders `zip` str_vals) + new_abs_env = growAbsValEnvList abs_env (binders `zip` abs_vals) + in + saExpr dmd new_str_env new_abs_env body `thenSa` \ new_body -> + mapSa (saExpr minDemand new_str_env new_abs_env) rhss `thenSa` \ new_rhss -> + let +-- DON'T add demand info in a Rec! +-- a) it's useless: we can't do let-to-case +-- b) it's incorrect. Consider +-- letrec x = ...y... +-- y = ...x... 
+-- in ...x... +-- When we ask whether y is demanded we'll bind y to bottom and +-- evaluate the body of the letrec. But that will result in our +-- deciding that y is absent, which is plain wrong! +-- It's much easier simply not to do this. + + improved_binders = zipWith3Equal "saExpr" addStrictnessInfoToId + str_vals abs_vals binders + + new_pairs = improved_binders `zip` new_rhss + in + returnSa (Let (Rec new_pairs) new_body) +\end{code} + + +%************************************************************************ +%* * +\subsection[computeInfos]{Add computed info to binders} +%* * +%************************************************************************ + +Important note (Sept 93). @addStrictnessInfoToId@ is used only for +let(rec) bound variables, and is use to attach the strictness (not +demand) info to the binder. We are careful to restrict this +strictness info to the lambda-bound arguments which are actually +visible, at the top level, lest we accidentally lose laziness by +eagerly looking for an "extra" argument. So we "dig for lambdas" in a +rather syntactic way. + +A better idea might be to have some kind of arity analysis to +tell how many args could safely be grabbed. + +\begin{code} +addStrictnessInfoToId + :: AbsVal -- Abstract strictness value + -> AbsVal -- Ditto absence + -> Id -- The id + -> Id -- Augmented with strictness + +addStrictnessInfoToId str_val abs_val binder + = binder `setIdStrictness` findStrictness binder str_val abs_val +\end{code} + +\begin{code} +addDemandInfoToId :: Demand -> StrictEnv -> AbsenceEnv + -> CoreExpr -- The scope of the id + -> Id + -> Id -- Id augmented with Demand info + +addDemandInfoToId dmd str_env abs_env expr binder + = binder `setIdDemandInfo` (findDemand dmd str_env abs_env expr binder) + +addDemandInfoToCaseBndr dmd str_env abs_env alts binder + = binder `setIdDemandInfo` (findDemandAlts dmd str_env abs_env alts binder) +\end{code} + +%************************************************************************ +%* * +\subsection{Monad used herein for stats} +%* * +%************************************************************************ + +\begin{code} +data SaStats + = SaStats FastInt FastInt -- total/marked-demanded lambda-bound + FastInt FastInt -- total/marked-demanded case-bound + FastInt FastInt -- total/marked-demanded let-bound + -- (excl. top-level; excl. 
letrecs) + +nullSaStats = SaStats (_ILIT 0) (_ILIT 0) (_ILIT 0) (_ILIT 0) (_ILIT 0) (_ILIT 0) + +thenSa :: SaM a -> (a -> SaM b) -> SaM b +thenSa_ :: SaM a -> SaM b -> SaM b +returnSa :: a -> SaM a + +{-# INLINE thenSa #-} +{-# INLINE thenSa_ #-} +{-# INLINE returnSa #-} + +tickLambda :: Id -> SaM () +tickCases :: [CoreBndr] -> SaM () +tickLet :: Id -> SaM () + +#ifndef OMIT_STRANAL_STATS +type SaM a = SaStats -> (a, SaStats) + +thenSa expr cont stats + = case (expr stats) of { (result, stats1) -> + cont result stats1 } + +thenSa_ expr cont stats + = case (expr stats) of { (_, stats1) -> + cont stats1 } + +returnSa x stats = (x, stats) + +tickLambda var (SaStats tlam dlam tc dc tlet dlet) + = case (tick_demanded var (0,0)) of { (totB, demandedB) -> + let tot = iUnbox totB ; demanded = iUnbox demandedB + in + ((), SaStats (tlam +# tot) (dlam +# demanded) tc dc tlet dlet) } + +tickCases vars (SaStats tlam dlam tc dc tlet dlet) + = case (foldr tick_demanded (0,0) vars) of { (totB, demandedB) -> + let tot = iUnbox totB ; demanded = iUnbox demandedB + in + ((), SaStats tlam dlam (tc +# tot) (dc +# demanded) tlet dlet) } + +tickLet var (SaStats tlam dlam tc dc tlet dlet) + = case (tick_demanded var (0,0)) of { (totB, demandedB) -> + let tot = iUnbox totB ; demanded = iUnbox demandedB + in + ((), SaStats tlam dlam tc dc (tlet +# tot) (dlet +# demanded)) } + +tick_demanded var (tot, demanded) + | isTyVar var = (tot, demanded) + | otherwise + = (tot + 1, + if (isStrict (idDemandInfo var)) + then demanded + 1 + else demanded) + +pp_stats (SaStats tlam dlam tc dc tlet dlet) + = hcat [ptext SLIT("Lambda vars: "), int (iBox dlam), char '/', int (iBox tlam), + ptext SLIT("; Case vars: "), int (iBox dc), char '/', int (iBox tc), + ptext SLIT("; Let vars: "), int (iBox dlet), char '/', int (iBox tlet) + ] + +#else /* OMIT_STRANAL_STATS */ +-- identity monad +type SaM a = a + +thenSa expr cont = cont expr + +thenSa_ expr cont = cont + +returnSa x = x + +tickLambda var = panic "OMIT_STRANAL_STATS: tickLambda" +tickCases vars = panic "OMIT_STRANAL_STATS: tickCases" +tickLet var = panic "OMIT_STRANAL_STATS: tickLet" + +#endif /* OMIT_STRANAL_STATS */ + +mapSa :: (a -> SaM b) -> [a] -> SaM [b] + +mapSa f [] = returnSa [] +mapSa f (x:xs) = f x `thenSa` \ r -> + mapSa f xs `thenSa` \ rs -> + returnSa (r:rs) + +sequenceSa :: [SaM a] -> SaM [a] +sequenceSa [] = returnSa [] +sequenceSa (m:ms) = m `thenSa` \ r -> + sequenceSa ms `thenSa` \ rs -> + returnSa (r:rs) + +#endif /* OLD_STRICTNESS */ +\end{code} diff --git a/compiler/stranal/WorkWrap.lhs b/compiler/stranal/WorkWrap.lhs new file mode 100644 index 0000000000..64eba89273 --- /dev/null +++ b/compiler/stranal/WorkWrap.lhs @@ -0,0 +1,403 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-1998 +% +\section[WorkWrap]{Worker/wrapper-generating back-end of strictness analyser} + +\begin{code} +module WorkWrap ( wwTopBinds, mkWrapper ) where + +#include "HsVersions.h" + +import CoreSyn +import CoreUnfold ( certainlyWillInline ) +import CoreLint ( showPass, endPass ) +import CoreUtils ( exprType, exprIsHNF ) +import Id ( Id, idType, isOneShotLambda, + setIdNewStrictness, mkWorkerId, + setIdWorkerInfo, setInlinePragma, + idInfo ) +import MkId ( lazyIdKey, lazyIdUnfolding ) +import Type ( Type ) +import IdInfo ( WorkerInfo(..), arityInfo, + newDemandInfo, newStrictnessInfo, unfoldingInfo, inlinePragInfo + ) +import NewDemand ( Demand(..), StrictSig(..), DmdType(..), DmdResult(..), + Demands(..), mkTopDmdType, isBotRes, returnsCPR, topSig, isAbsent + ) +import 
UniqSupply ( UniqSupply, initUs_, returnUs, thenUs, mapUs, getUniqueUs, UniqSM ) +import Unique ( hasKey ) +import BasicTypes ( RecFlag(..), isNonRec, Activation(..) ) +import VarEnv ( isEmptyVarEnv ) +import Maybes ( orElse ) +import DynFlags +import WwLib +import Util ( lengthIs, notNull ) +import Outputable +\end{code} + +We take Core bindings whose binders have: + +\begin{enumerate} + +\item Strictness attached (by the front-end of the strictness +analyser), and / or + +\item Constructed Product Result information attached by the CPR +analysis pass. + +\end{enumerate} + +and we return some ``plain'' bindings which have been +worker/wrapper-ified, meaning: + +\begin{enumerate} + +\item Functions have been split into workers and wrappers where +appropriate. If a function has both strictness and CPR properties +then only one worker/wrapper doing both transformations is produced; + +\item Binders' @IdInfos@ have been updated to reflect the existence of +these workers/wrappers (this is where we get STRICTNESS and CPR pragma +info for exported values). +\end{enumerate} + +\begin{code} + +wwTopBinds :: DynFlags + -> UniqSupply + -> [CoreBind] + -> IO [CoreBind] + +wwTopBinds dflags us binds + = do { + showPass dflags "Worker Wrapper binds"; + + -- Create worker/wrappers, and mark binders with their + -- "strictness info" [which encodes their worker/wrapper-ness] + let { binds' = workersAndWrappers us binds }; + + endPass dflags "Worker Wrapper binds" + Opt_D_dump_worker_wrapper binds' + } +\end{code} + + +\begin{code} +workersAndWrappers :: UniqSupply -> [CoreBind] -> [CoreBind] + +workersAndWrappers us top_binds + = initUs_ us $ + mapUs wwBind top_binds `thenUs` \ top_binds' -> + returnUs (concat top_binds') +\end{code} + +%************************************************************************ +%* * +\subsection[wwBind-wwExpr]{@wwBind@ and @wwExpr@} +%* * +%************************************************************************ + +@wwBind@ works on a binding, trying each \tr{(binder, expr)} pair in +turn. Non-recursive case first, then recursive... + +\begin{code} +wwBind :: CoreBind + -> UniqSM [CoreBind] -- returns a WwBinding intermediate form; + -- the caller will convert to Expr/Binding, + -- as appropriate. + +wwBind (NonRec binder rhs) + = wwExpr rhs `thenUs` \ new_rhs -> + tryWW NonRecursive binder new_rhs `thenUs` \ new_pairs -> + returnUs [NonRec b e | (b,e) <- new_pairs] + -- Generated bindings must be non-recursive + -- because the original binding was. + +wwBind (Rec pairs) + = mapUs do_one pairs `thenUs` \ new_pairs -> + returnUs [Rec (concat new_pairs)] + where + do_one (binder, rhs) = wwExpr rhs `thenUs` \ new_rhs -> + tryWW Recursive binder new_rhs +\end{code} + +@wwExpr@ basically just walks the tree, looking for appropriate +annotations that can be used. Remember it is @wwBind@ that does the +matching by looking for strict arguments of the correct type. +@wwExpr@ is a version that just returns the ``Plain'' Tree. 
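[Editor's illustration, not part of this module.] The description above says functions with strictness (and/or CPR) information are split into a worker and an inlined wrapper. A source-level sketch of the shape of that split for a function strict in a pair argument; the real transformation works on Core and is driven by the strictness signature, and the names fDemo, fDemoWorker and fDemoWrapper are made up:

\begin{code}
module ToyWorkerWrapper where

fDemo :: (Int, Int) -> Int
fDemo p = case p of (x, y) -> x + y

-- After the split: the worker takes the unpacked components ...
fDemoWorker :: Int -> Int -> Int
fDemoWorker x y = x + y

-- ... and the small wrapper, inlined at call sites, does the unpacking,
-- so callers that already have x and y in hand never build the pair.
{-# INLINE fDemoWrapper #-}
fDemoWrapper :: (Int, Int) -> Int
fDemoWrapper p = case p of (x, y) -> fDemoWorker x y
\end{code}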
+ +\begin{code} +wwExpr :: CoreExpr -> UniqSM CoreExpr + +wwExpr e@(Type _) = returnUs e +wwExpr e@(Lit _) = returnUs e +wwExpr e@(Note InlineMe expr) = returnUs e + -- Don't w/w inside InlineMe's + +wwExpr e@(Var v) + | v `hasKey` lazyIdKey = returnUs lazyIdUnfolding + | otherwise = returnUs e + -- Inline 'lazy' after strictness analysis + -- (but not inside InlineMe's) + +wwExpr (Lam binder expr) + = wwExpr expr `thenUs` \ new_expr -> + returnUs (Lam binder new_expr) + +wwExpr (App f a) + = wwExpr f `thenUs` \ new_f -> + wwExpr a `thenUs` \ new_a -> + returnUs (App new_f new_a) + +wwExpr (Note note expr) + = wwExpr expr `thenUs` \ new_expr -> + returnUs (Note note new_expr) + +wwExpr (Let bind expr) + = wwBind bind `thenUs` \ intermediate_bind -> + wwExpr expr `thenUs` \ new_expr -> + returnUs (mkLets intermediate_bind new_expr) + +wwExpr (Case expr binder ty alts) + = wwExpr expr `thenUs` \ new_expr -> + mapUs ww_alt alts `thenUs` \ new_alts -> + returnUs (Case new_expr binder ty new_alts) + where + ww_alt (con, binders, rhs) + = wwExpr rhs `thenUs` \ new_rhs -> + returnUs (con, binders, new_rhs) +\end{code} + +%************************************************************************ +%* * +\subsection[tryWW]{@tryWW@: attempt a worker/wrapper pair} +%* * +%************************************************************************ + +@tryWW@ just accumulates arguments, converts strictness info from the +front-end into the proper form, then calls @mkWwBodies@ to do +the business. + +We have to BE CAREFUL that we don't worker-wrapperize an Id that has +already been w-w'd! (You can end up with several liked-named Ids +bouncing around at the same time---absolute mischief.) So the +criterion we use is: if an Id already has an unfolding (for whatever +reason), then we don't w-w it. + +The only reason this is monadised is for the unique supply. + +\begin{code} +tryWW :: RecFlag + -> Id -- The fn binder + -> CoreExpr -- The bound rhs; its innards + -- are already ww'd + -> UniqSM [(Id, CoreExpr)] -- either *one* or *two* pairs; + -- if one, then no worker (only + -- the orig "wrapper" lives on); + -- if two, then a worker and a + -- wrapper. +tryWW is_rec fn_id rhs + | isNonRec is_rec && certainlyWillInline unfolding + -- No point in worker/wrappering a function that is going to be + -- INLINEd wholesale anyway. If the strictness analyser is run + -- twice, this test also prevents wrappers (which are INLINEd) + -- from being re-done. + -- + -- It's very important to refrain from w/w-ing an INLINE function + -- If we do so by mistake we transform + -- f = __inline (\x -> E) + -- into + -- f = __inline (\x -> case x of (a,b) -> fw E) + -- fw = \ab -> (__inline (\x -> E)) (a,b) + -- and the original __inline now vanishes, so E is no longer + -- inside its __inline wrapper. Death! Disaster! 
+ = returnUs [ (new_fn_id, rhs) ] + + | is_thunk && worthSplittingThunk maybe_fn_dmd res_info + = ASSERT2( isNonRec is_rec, ppr new_fn_id ) -- The thunk must be non-recursive + splitThunk new_fn_id rhs + + | is_fun && worthSplittingFun wrap_dmds res_info + = splitFun new_fn_id fn_info wrap_dmds res_info inline_prag rhs + + | otherwise + = returnUs [ (new_fn_id, rhs) ] + + where + fn_info = idInfo fn_id + maybe_fn_dmd = newDemandInfo fn_info + unfolding = unfoldingInfo fn_info + inline_prag = inlinePragInfo fn_info + + -- In practice it always will have a strictness + -- signature, even if it's a uninformative one + strict_sig = newStrictnessInfo fn_info `orElse` topSig + StrictSig (DmdType env wrap_dmds res_info) = strict_sig + + -- new_fn_id has the DmdEnv zapped. + -- (a) it is never used again + -- (b) it wastes space + -- (c) it becomes incorrect as things are cloned, because + -- we don't push the substitution into it + new_fn_id | isEmptyVarEnv env = fn_id + | otherwise = fn_id `setIdNewStrictness` + StrictSig (mkTopDmdType wrap_dmds res_info) + + is_fun = notNull wrap_dmds + is_thunk = not is_fun && not (exprIsHNF rhs) + +--------------------- +splitFun fn_id fn_info wrap_dmds res_info inline_prag rhs + = WARN( not (wrap_dmds `lengthIs` arity), ppr fn_id <+> (ppr arity $$ ppr wrap_dmds $$ ppr res_info) ) + -- The arity should match the signature + mkWwBodies fun_ty wrap_dmds res_info one_shots `thenUs` \ (work_demands, wrap_fn, work_fn) -> + getUniqueUs `thenUs` \ work_uniq -> + let + work_rhs = work_fn rhs + work_id = mkWorkerId work_uniq fn_id (exprType work_rhs) + `setInlinePragma` inline_prag + `setIdNewStrictness` StrictSig (mkTopDmdType work_demands work_res_info) + -- Even though we may not be at top level, + -- it's ok to give it an empty DmdEnv + + wrap_rhs = wrap_fn work_id + wrap_id = fn_id `setIdWorkerInfo` HasWorker work_id arity + `setInlinePragma` AlwaysActive -- Zap any inline pragma; + -- Put it on the worker instead + in + returnUs ([(work_id, work_rhs), (wrap_id, wrap_rhs)]) + -- Worker first, because wrapper mentions it + -- mkWwBodies has already built a wrap_rhs with an INLINE pragma wrapped around it + where + fun_ty = idType fn_id + + arity = arityInfo fn_info -- The arity is set by the simplifier using exprEtaExpandArity + -- So it may be more than the number of top-level-visible lambdas + + work_res_info | isBotRes res_info = BotRes -- Cpr stuff done by wrapper + | otherwise = TopRes + + one_shots = get_one_shots rhs + +-- If the original function has one-shot arguments, it is important to +-- make the wrapper and worker have corresponding one-shot arguments too. +-- Otherwise we spuriously float stuff out of case-expression join points, +-- which is very annoying. +get_one_shots (Lam b e) + | isId b = isOneShotLambda b : get_one_shots e + | otherwise = get_one_shots e +get_one_shots (Note _ e) = get_one_shots e +get_one_shots other = noOneShotInfo +\end{code} + +Thunk splitting +~~~~~~~~~~~~~~~ +Suppose x is used strictly (never mind whether it has the CPR +property). + + let + x* = x-rhs + in body + +splitThunk transforms like this: + + let + x* = case x-rhs of { I# a -> I# a } + in body + +Now simplifier will transform to + + case x-rhs of + I# a -> let x* = I# b + in body + +which is what we want. Now suppose x-rhs is itself a case: + + x-rhs = case e of { T -> I# a; F -> I# b } + +The join point will abstract over a, rather than over (which is +what would have happened before) which is fine. + +Notice that x certainly has the CPR property now! 
+ +In fact, splitThunk uses the function argument w/w splitting +function, so that if x's demand is deeper (say U(U(L,L),L)) +then the splitting will go deeper too. + +\begin{code} +-- splitThunk converts the *non-recursive* binding +-- x = e +-- into +-- x = let x = e +-- in case x of +-- I# y -> let x = I# y in x } +-- See comments above. Is it not beautifully short? + +splitThunk fn_id rhs + = mkWWstr [fn_id] `thenUs` \ (_, wrap_fn, work_fn) -> + returnUs [ (fn_id, Let (NonRec fn_id rhs) (wrap_fn (work_fn (Var fn_id)))) ] +\end{code} + + +%************************************************************************ +%* * +\subsection{Functions over Demands} +%* * +%************************************************************************ + +\begin{code} +worthSplittingFun :: [Demand] -> DmdResult -> Bool + -- True <=> the wrapper would not be an identity function +worthSplittingFun ds res + = any worth_it ds || returnsCPR res + -- worthSplitting returns False for an empty list of demands, + -- and hence do_strict_ww is False if arity is zero and there is no CPR + + -- We used not to split if the result is bottom. + -- [Justification: there's no efficiency to be gained.] + -- But it's sometimes bad not to make a wrapper. Consider + -- fw = \x# -> let x = I# x# in case e of + -- p1 -> error_fn x + -- p2 -> error_fn x + -- p3 -> the real stuff + -- The re-boxing code won't go away unless error_fn gets a wrapper too. + -- [We don't do reboxing now, but in general it's better to pass + -- an unboxed thing to f, and have it reboxed in the error cases....] + where + worth_it Abs = True -- Absent arg + worth_it (Eval (Prod ds)) = True -- Product arg to evaluate + worth_it other = False + +worthSplittingThunk :: Maybe Demand -- Demand on the thunk + -> DmdResult -- CPR info for the thunk + -> Bool +worthSplittingThunk maybe_dmd res + = worth_it maybe_dmd || returnsCPR res + where + -- Split if the thing is unpacked + worth_it (Just (Eval (Prod ds))) = not (all isAbsent ds) + worth_it other = False +\end{code} + + + +%************************************************************************ +%* * +\subsection{The worker wrapper core} +%* * +%************************************************************************ + +@mkWrapper@ is called when importing a function. We have the type of +the function and the name of its worker, and we want to make its body (the wrapper). + +\begin{code} +mkWrapper :: Type -- Wrapper type + -> StrictSig -- Wrapper strictness info + -> UniqSM (Id -> CoreExpr) -- Wrapper body, missing worker Id + +mkWrapper fun_ty (StrictSig (DmdType _ demands res_info)) + = mkWwBodies fun_ty demands res_info noOneShotInfo `thenUs` \ (_, wrap_fn, _) -> + returnUs wrap_fn + +noOneShotInfo = repeat False +\end{code} diff --git a/compiler/stranal/WwLib.lhs b/compiler/stranal/WwLib.lhs new file mode 100644 index 0000000000..e44e521c83 --- /dev/null +++ b/compiler/stranal/WwLib.lhs @@ -0,0 +1,514 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-1998 +% +\section[WwLib]{A library for the ``worker/wrapper'' back-end to the strictness analyser} + +\begin{code} +module WwLib ( mkWwBodies, mkWWstr, mkWorkerArgs ) where + +#include "HsVersions.h" + +import CoreSyn +import CoreUtils ( exprType ) +import Id ( Id, idType, mkSysLocal, idNewDemandInfo, setIdNewDemandInfo, + isOneShotLambda, setOneShotLambda, setIdUnfolding, + setIdInfo + ) +import IdInfo ( vanillaIdInfo ) +import DataCon ( splitProductType_maybe, splitProductType ) +import NewDemand ( Demand(..), DmdResult(..), Demands(..) 
) +import MkId ( realWorldPrimId, voidArgId, mkRuntimeErrorApp, rUNTIME_ERROR_ID ) +import TysWiredIn ( tupleCon ) +import Type ( Type, isUnLiftedType, mkFunTys, + splitForAllTys, splitFunTys, splitRecNewType_maybe, isAlgType + ) +import BasicTypes ( Boxity(..) ) +import Var ( Var, isId ) +import UniqSupply ( returnUs, thenUs, getUniquesUs, UniqSM ) +import Util ( zipWithEqual, notNull ) +import Outputable +import List ( zipWith4 ) +\end{code} + + +%************************************************************************ +%* * +\subsection[mkWrapperAndWorker]{@mkWrapperAndWorker@} +%* * +%************************************************************************ + +Here's an example. The original function is: + +\begin{verbatim} +g :: forall a . Int -> [a] -> a + +g = /\ a -> \ x ys -> + case x of + 0 -> head ys + _ -> head (tail ys) +\end{verbatim} + +From this, we want to produce: +\begin{verbatim} +-- wrapper (an unfolding) +g :: forall a . Int -> [a] -> a + +g = /\ a -> \ x ys -> + case x of + I# x# -> $wg a x# ys + -- call the worker; don't forget the type args! + +-- worker +$wg :: forall a . Int# -> [a] -> a + +$wg = /\ a -> \ x# ys -> + let + x = I# x# + in + case x of -- note: body of g moved intact + 0 -> head ys + _ -> head (tail ys) +\end{verbatim} + +Something we have to be careful about: Here's an example: + +\begin{verbatim} +-- "f" strictness: U(P)U(P) +f (I# a) (I# b) = a +# b + +g = f -- "g" strictness same as "f" +\end{verbatim} + +\tr{f} will get a worker all nice and friendly-like; that's good. +{\em But we don't want a worker for \tr{g}}, even though it has the +same strictness as \tr{f}. Doing so could break laziness, at best. + +Consequently, we insist that the number of strictness-info items is +exactly the same as the number of lambda-bound arguments. (This is +probably slightly paranoid, but OK in practice.) If it isn't the +same, we ``revise'' the strictness info, so that we won't propagate +the unusable strictness-info into the interfaces. + + +%************************************************************************ +%* * +\subsection{The worker wrapper core} +%* * +%************************************************************************ + +@mkWwBodies@ is called when doing the worker/wrapper split inside a module. + +\begin{code} +mkWwBodies :: Type -- Type of original function + -> [Demand] -- Strictness of original function + -> DmdResult -- Info about function result + -> [Bool] -- One-shot-ness of the function + -> UniqSM ([Demand], -- Demands for worker (value) args + Id -> CoreExpr, -- Wrapper body, lacking only the worker Id + CoreExpr -> CoreExpr) -- Worker body, lacking the original function rhs + +-- wrap_fn_args E = \x y -> E +-- work_fn_args E = E x y + +-- wrap_fn_str E = case x of { (a,b) -> +-- case a of { (a1,a2) -> +-- E a1 a2 b y }} +-- work_fn_str E = \a2 a2 b y -> +-- let a = (a1,a2) in +-- let x = (a,b) in +-- E + +mkWwBodies fun_ty demands res_info one_shots + = mkWWargs fun_ty demands one_shots' `thenUs` \ (wrap_args, wrap_fn_args, work_fn_args, res_ty) -> + mkWWstr wrap_args `thenUs` \ (work_args, wrap_fn_str, work_fn_str) -> + let + (work_lam_args, work_call_args) = mkWorkerArgs work_args res_ty + in + -- Don't do CPR if the worker doesn't have any value arguments + -- Then the worker is just a constant, so we don't want to unbox it. 
+ (if any isId work_args then + mkWWcpr res_ty res_info + else + returnUs (id, id, res_ty) + ) `thenUs` \ (wrap_fn_cpr, work_fn_cpr, cpr_res_ty) -> + + returnUs ([idNewDemandInfo v | v <- work_args, isId v], + Note InlineMe . wrap_fn_args . wrap_fn_cpr . wrap_fn_str . applyToVars work_call_args . Var, + mkLams work_lam_args. work_fn_str . work_fn_cpr . work_fn_args) + -- We use an INLINE unconditionally, even if the wrapper turns out to be + -- something trivial like + -- fw = ... + -- f = __inline__ (coerce T fw) + -- The point is to propagate the coerce to f's call sites, so even though + -- f's RHS is now trivial (size 1) we still want the __inline__ to prevent + -- fw from being inlined into f's RHS + where + one_shots' = one_shots ++ repeat False +\end{code} + + +%************************************************************************ +%* * +\subsection{Making wrapper args} +%* * +%************************************************************************ + +During worker-wrapper stuff we may end up with an unlifted thing +which we want to let-bind without losing laziness. So we +add a void argument. E.g. + + f = /\a -> \x y z -> E::Int# -- E does not mention x,y,z +==> + fw = /\ a -> \void -> E + f = /\ a -> \x y z -> fw realworld + +We use the state-token type which generates no code. + +\begin{code} +mkWorkerArgs :: [Var] + -> Type -- Type of body + -> ([Var], -- Lambda bound args + [Var]) -- Args at call site +mkWorkerArgs args res_ty + | any isId args || not (isUnLiftedType res_ty) + = (args, args) + | otherwise + = (args ++ [voidArgId], args ++ [realWorldPrimId]) +\end{code} + + +%************************************************************************ +%* * +\subsection{Coercion stuff} +%* * +%************************************************************************ + + +We really want to "look through" coerces. +Reason: I've seen this situation: + + let f = coerce T (\s -> E) + in \x -> case x of + p -> coerce T' f + q -> \s -> E2 + r -> coerce T' f + +If only we w/w'd f, we'd get + let f = coerce T (\s -> fw s) + fw = \s -> E + in ... + +Now we'll inline f to get + + let fw = \s -> E + in \x -> case x of + p -> fw + q -> \s -> E2 + r -> fw + +Now we'll see that fw has arity 1, and will arity expand +the \x to get what we want. + +\begin{code} +-- mkWWargs is driven off the function type and arity. +-- It chomps bites off foralls, arrows, newtypes +-- and keeps repeating that until it's satisfied the supplied arity + +mkWWargs :: Type + -> [Demand] + -> [Bool] -- True for a one-shot arg; ** may be infinite ** + -> UniqSM ([Var], -- Wrapper args + CoreExpr -> CoreExpr, -- Wrapper fn + CoreExpr -> CoreExpr, -- Worker fn + Type) -- Type of wrapper body + +mkWWargs fun_ty demands one_shots + | Just rep_ty <- splitRecNewType_maybe fun_ty + -- The newtype case is for when the function has + -- a recursive newtype after the arrow (rare) + -- We check for arity >= 0 to avoid looping in the case + -- of a function whose type is, in effect, infinite + -- [Arity is driven by looking at the term, not just the type.] + -- + -- It's also important when we have a function returning (say) a pair + -- wrapped in a recursive newtype, at least if CPR analysis can look + -- through such newtypes, which it probably can since they are + -- simply coerces. + = mkWWargs rep_ty demands one_shots `thenUs` \ (wrap_args, wrap_fn_args, work_fn_args, res_ty) -> + returnUs (wrap_args, + Note (Coerce fun_ty rep_ty) . wrap_fn_args, + work_fn_args . 
Note (Coerce rep_ty fun_ty), + res_ty) + + | notNull demands + = getUniquesUs `thenUs` \ wrap_uniqs -> + let + (tyvars, tau) = splitForAllTys fun_ty + (arg_tys, body_ty) = splitFunTys tau + + n_demands = length demands + n_arg_tys = length arg_tys + n_args = n_demands `min` n_arg_tys + + new_fun_ty = mkFunTys (drop n_demands arg_tys) body_ty + new_demands = drop n_arg_tys demands + new_one_shots = drop n_args one_shots + + val_args = zipWith4 mk_wrap_arg wrap_uniqs arg_tys demands one_shots + wrap_args = tyvars ++ val_args + in +{- ASSERT( notNull tyvars || notNull arg_tys ) -} + if (null tyvars) && (null arg_tys) then + pprTrace "mkWWargs" (ppr fun_ty $$ ppr demands) + returnUs ([], id, id, fun_ty) + else + + mkWWargs new_fun_ty + new_demands + new_one_shots `thenUs` \ (more_wrap_args, wrap_fn_args, work_fn_args, res_ty) -> + + returnUs (wrap_args ++ more_wrap_args, + mkLams wrap_args . wrap_fn_args, + work_fn_args . applyToVars wrap_args, + res_ty) + + | otherwise + = returnUs ([], id, id, fun_ty) + + +applyToVars :: [Var] -> CoreExpr -> CoreExpr +applyToVars vars fn = mkVarApps fn vars + +mk_wrap_arg uniq ty dmd one_shot + = set_one_shot one_shot (setIdNewDemandInfo (mkSysLocal FSLIT("w") uniq ty) dmd) + where + set_one_shot True id = setOneShotLambda id + set_one_shot False id = id +\end{code} + + +%************************************************************************ +%* * +\subsection{Strictness stuff} +%* * +%************************************************************************ + +\begin{code} +mkWWstr :: [Var] -- Wrapper args; have their demand info on them + -- *Includes type variables* + -> UniqSM ([Var], -- Worker args + CoreExpr -> CoreExpr, -- Wrapper body, lacking the worker call + -- and without its lambdas + -- This fn adds the unboxing + + CoreExpr -> CoreExpr) -- Worker body, lacking the original body of the function, + -- and lacking its lambdas. + -- This fn does the reboxing + +---------------------- +nop_fn body = body + +---------------------- +mkWWstr [] + = returnUs ([], nop_fn, nop_fn) + +mkWWstr (arg : args) + = mkWWstr_one arg `thenUs` \ (args1, wrap_fn1, work_fn1) -> + mkWWstr args `thenUs` \ (args2, wrap_fn2, work_fn2) -> + returnUs (args1 ++ args2, wrap_fn1 . wrap_fn2, work_fn1 . work_fn2) + + +---------------------- +-- mkWWstr_one wrap_arg = (work_args, wrap_fn, work_fn) +-- * wrap_fn assumes wrap_arg is in scope, +-- brings into scope work_args (via cases) +-- * work_fn assumes work_args are in scope, a +-- brings into scope wrap_arg (via lets) + +mkWWstr_one arg + | isTyVar arg + = returnUs ([arg], nop_fn, nop_fn) + + | otherwise + = case idNewDemandInfo arg of + + -- Absent case. We don't deal with absence for unlifted types, + -- though, because it's not so easy to manufacture a placeholder + -- We'll see if this turns out to be a problem + Abs | not (isUnLiftedType (idType arg)) -> + returnUs ([], nop_fn, mk_absent_let arg) + + -- Unpack case + Eval (Prod cs) + | Just (arg_tycon, tycon_arg_tys, data_con, inst_con_arg_tys) + <- splitProductType_maybe (idType arg) + -> getUniquesUs `thenUs` \ uniqs -> + let + unpk_args = zipWith mk_ww_local uniqs inst_con_arg_tys + unpk_args_w_ds = zipWithEqual "mkWWstr" set_worker_arg_info unpk_args cs + unbox_fn = mk_unpk_case arg unpk_args data_con arg_tycon + rebox_fn = Let (NonRec arg con_app) + con_app = mkConApp data_con (map Type tycon_arg_tys ++ map Var unpk_args) + in + mkWWstr unpk_args_w_ds `thenUs` \ (worker_args, wrap_fn, work_fn) -> + returnUs (worker_args, unbox_fn . wrap_fn, work_fn . 
rebox_fn) + -- Don't pass the arg, rebox instead + + -- `seq` demand; evaluate in wrapper in the hope + -- of dropping seqs in the worker + Eval (Poly Abs) + -> let + arg_w_unf = arg `setIdUnfolding` evaldUnfolding + -- Tell the worker arg that it's sure to be evaluated + -- so that internal seqs can be dropped + in + returnUs ([arg_w_unf], mk_seq_case arg, nop_fn) + -- Pass the arg, anyway, even if it is in theory discarded + -- Consider + -- f x y = x `seq` y + -- x gets a (Eval (Poly Abs)) demand, but if we fail to pass it to the worker + -- we ABSOLUTELY MUST record that x is evaluated in the wrapper. + -- Something like: + -- f x y = x `seq` fw y + -- fw y = let x{Evald} = error "oops" in (x `seq` y) + -- If we don't pin on the "Evald" flag, the seq doesn't disappear, and + -- we end up evaluating the absent thunk. + -- But the Evald flag is pretty weird, and I worry that it might disappear + -- during simplification, so for now I've just nuked this whole case + + -- Other cases + other_demand -> returnUs ([arg], nop_fn, nop_fn) + + where + -- If the wrapper argument is a one-shot lambda, then + -- so should (all) the corresponding worker arguments be + -- This bites when we do w/w on a case join point + set_worker_arg_info worker_arg demand = set_one_shot (setIdNewDemandInfo worker_arg demand) + + set_one_shot | isOneShotLambda arg = setOneShotLambda + | otherwise = \x -> x +\end{code} + + +%************************************************************************ +%* * +\subsection{CPR stuff} +%* * +%************************************************************************ + + +@mkWWcpr@ takes the worker/wrapper pair produced from the strictness +info and adds in the CPR transformation. The worker returns an +unboxed tuple containing non-CPR components. The wrapper takes this +tuple and re-produces the correct structured output. + +The non-CPR results appear ordered in the unboxed tuple as if by a +left-to-right traversal of the result structure. + + +\begin{code} +mkWWcpr :: Type -- function body type + -> DmdResult -- CPR analysis results + -> UniqSM (CoreExpr -> CoreExpr, -- New wrapper + CoreExpr -> CoreExpr, -- New worker + Type) -- Type of worker's body + +mkWWcpr body_ty RetCPR + | not (isAlgType body_ty) + = WARN( True, text "mkWWcpr: non-algebraic body type" <+> ppr body_ty ) + returnUs (id, id, body_ty) + + | n_con_args == 1 && isUnLiftedType con_arg_ty1 + -- Special case when there is a single result of unlifted type + -- + -- Wrapper: case (..call worker..) of x -> C x + -- Worker: case ( ..body.. ) of C x -> x + = getUniquesUs `thenUs` \ (work_uniq : arg_uniq : _) -> + let + work_wild = mk_ww_local work_uniq body_ty + arg = mk_ww_local arg_uniq con_arg_ty1 + con_app = mkConApp data_con (map Type tycon_arg_tys ++ [Var arg]) + in + returnUs (\ wkr_call -> Case wkr_call arg (exprType con_app) [(DEFAULT, [], con_app)], + \ body -> workerCase body work_wild con_arg_ty1 [(DataAlt data_con, [arg], Var arg)], + con_arg_ty1) + + | otherwise -- The general case + -- Wrapper: case (..call worker..) of (# a, b #) -> C a b + -- Worker: case ( ...body... 
) of C a b -> (# a, b #) + = getUniquesUs `thenUs` \ uniqs -> + let + (wrap_wild : work_wild : args) = zipWith mk_ww_local uniqs (ubx_tup_ty : body_ty : con_arg_tys) + arg_vars = map Var args + ubx_tup_con = tupleCon Unboxed n_con_args + ubx_tup_ty = exprType ubx_tup_app + ubx_tup_app = mkConApp ubx_tup_con (map Type con_arg_tys ++ arg_vars) + con_app = mkConApp data_con (map Type tycon_arg_tys ++ arg_vars) + in + returnUs (\ wkr_call -> Case wkr_call wrap_wild (exprType con_app) [(DataAlt ubx_tup_con, args, con_app)], + \ body -> workerCase body work_wild ubx_tup_ty [(DataAlt data_con, args, ubx_tup_app)], + ubx_tup_ty) + where + (_, tycon_arg_tys, data_con, con_arg_tys) = splitProductType "mkWWcpr" body_ty + n_con_args = length con_arg_tys + con_arg_ty1 = head con_arg_tys + +mkWWcpr body_ty other -- No CPR info + = returnUs (id, id, body_ty) + +-- If the original function looked like +-- f = \ x -> _scc_ "foo" E +-- +-- then we want the CPR'd worker to look like +-- \ x -> _scc_ "foo" (case E of I# x -> x) +-- and definitely not +-- \ x -> case (_scc_ "foo" E) of I# x -> x) +-- +-- This transform doesn't move work or allocation +-- from one cost centre to another + +workerCase (Note (SCC cc) e) arg ty alts = Note (SCC cc) (Case e arg ty alts) +workerCase e arg ty alts = Case e arg ty alts +\end{code} + + +%************************************************************************ +%* * +\subsection{Utilities} +%* * +%************************************************************************ + + +\begin{code} +mk_absent_let arg body + | not (isUnLiftedType arg_ty) + = Let (NonRec arg abs_rhs) body + | otherwise + = panic "WwLib: haven't done mk_absent_let for primitives yet" + where + arg_ty = idType arg + abs_rhs = mkRuntimeErrorApp rUNTIME_ERROR_ID arg_ty msg + msg = "Oops! Entered absent arg " ++ showSDocDebug (ppr arg <+> ppr (idType arg)) + +mk_unpk_case arg unpk_args boxing_con boxing_tycon body + -- A data type + = Case (Var arg) + (sanitiseCaseBndr arg) + (exprType body) + [(DataAlt boxing_con, unpk_args, body)] + +mk_seq_case arg body = Case (Var arg) (sanitiseCaseBndr arg) (exprType body) [(DEFAULT, [], body)] + +sanitiseCaseBndr :: Id -> Id +-- The argument we are scrutinising has the right type to be +-- a case binder, so it's convenient to re-use it for that purpose. +-- But we *must* throw away all its IdInfo. In particular, the argument +-- will have demand info on it, and that demand info may be incorrect for +-- the case binder. e.g. case ww_arg of ww_arg { I# x -> ... } +-- Quite likely ww_arg isn't used in '...'. The case may get discarded +-- if the case binder says "I'm demanded". This happened in a situation +-- like (x+y) `seq` .... 
+sanitiseCaseBndr id = id `setIdInfo` vanillaIdInfo + +mk_ww_local uniq ty = mkSysLocal FSLIT("ww") uniq ty +\end{code} diff --git a/compiler/typecheck/Inst.lhs b/compiler/typecheck/Inst.lhs new file mode 100644 index 0000000000..8768e20250 --- /dev/null +++ b/compiler/typecheck/Inst.lhs @@ -0,0 +1,790 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[Inst]{The @Inst@ type: dictionaries or method instances} + +\begin{code} +module Inst ( + Inst, + + pprInstances, pprDictsTheta, pprDictsInFull, -- User error messages + showLIE, pprInst, pprInsts, pprInstInFull, -- Debugging messages + + tidyInsts, tidyMoreInsts, + + newDicts, newDictAtLoc, newDictsAtLoc, cloneDict, + shortCutFracLit, shortCutIntLit, newIPDict, + newMethod, newMethodFromName, newMethodWithGivenTy, + tcInstClassOp, tcInstStupidTheta, + tcSyntaxName, isHsVar, + + tyVarsOfInst, tyVarsOfInsts, tyVarsOfLIE, + ipNamesOfInst, ipNamesOfInsts, fdPredsOfInst, fdPredsOfInsts, + instLoc, getDictClassTys, dictPred, + + lookupInst, LookupInstResult(..), lookupPred, + tcExtendLocalInstEnv, tcGetInstEnvs, getOverlapFlag, + + isDict, isClassDict, isMethod, + isLinearInst, linearInstType, isIPDict, isInheritableInst, + isTyVarDict, isMethodFor, + + zonkInst, zonkInsts, + instToId, instName, + + InstOrigin(..), InstLoc(..), pprInstLoc + ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} TcExpr( tcPolyExpr ) + +import HsSyn ( HsLit(..), HsOverLit(..), HsExpr(..), LHsExpr, mkHsApp, + nlHsLit, nlHsVar ) +import TcHsSyn ( mkHsTyApp, mkHsDictApp, zonkId ) +import TcRnMonad +import TcEnv ( tcLookupId, checkWellStaged, topIdLvl, tcMetaTy ) +import InstEnv ( DFunId, InstEnv, Instance(..), OverlapFlag(..), + lookupInstEnv, extendInstEnv, pprInstances, + instanceHead, instanceDFunId, setInstanceDFunId ) +import FunDeps ( checkFunDeps ) +import TcMType ( zonkTcType, zonkTcTypes, zonkTcPredType, zonkTcThetaType, + tcInstTyVar, tcInstSkolType + ) +import TcType ( Type, TcType, TcThetaType, TcTyVarSet, TcPredType, + BoxyRhoType, + PredType(..), SkolemInfo(..), typeKind, mkSigmaTy, + tcSplitForAllTys, applyTys, + tcSplitPhiTy, tcSplitDFunHead, + isIntTy,isFloatTy, isIntegerTy, isDoubleTy, + mkPredTy, mkTyVarTys, + tyVarsOfType, tyVarsOfTypes, tyVarsOfPred, tidyPred, + isClassPred, isTyVarClassPred, isLinearPred, + getClassPredTys, mkPredName, + isInheritablePred, isIPPred, + tidyType, tidyTypes, tidyFreeTyVars, tcSplitSigmaTy, + pprPred, pprParendType, pprTheta + ) +import Type ( TvSubst, substTy, substTyVar, substTyWith, substTheta, zipTopTvSubst, + notElemTvSubst, extendTvSubstList ) +import Unify ( tcMatchTys ) +import Kind ( isSubKind ) +import Packages ( isHomeModule ) +import HscTypes ( ExternalPackageState(..) ) +import CoreFVs ( idFreeTyVars ) +import DataCon ( DataCon, dataConTyVars, dataConStupidTheta, dataConName, dataConWrapId ) +import Id ( Id, idName, idType, mkUserLocal, mkLocalId ) +import Name ( Name, mkMethodOcc, getOccName, getSrcLoc, nameModule, + isInternalName, setNameUnique ) +import NameSet ( addOneToNameSet ) +import Literal ( inIntRange ) +import Var ( TyVar, tyVarKind, setIdType ) +import VarEnv ( TidyEnv, emptyTidyEnv ) +import VarSet ( elemVarSet, emptyVarSet, unionVarSet, mkVarSet ) +import TysWiredIn ( floatDataCon, doubleDataCon ) +import PrelNames ( integerTyConName, fromIntegerName, fromRationalName, rationalTyConName ) +import BasicTypes( IPName(..), mapIPName, ipNameName ) +import UniqSupply( uniqsFromSupply ) +import SrcLoc ( mkSrcSpan, noLoc, unLoc, Located(..) 
) +import DynFlags ( DynFlag(..), dopt ) +import Maybes ( isJust ) +import Outputable +\end{code} + + +Selection +~~~~~~~~~ +\begin{code} +instName :: Inst -> Name +instName inst = idName (instToId inst) + +instToId :: Inst -> TcId +instToId (LitInst nm _ ty _) = mkLocalId nm ty +instToId (Dict nm pred _) = mkLocalId nm (mkPredTy pred) +instToId (Method id _ _ _ _) = id + +instLoc (Dict _ _ loc) = loc +instLoc (Method _ _ _ _ loc) = loc +instLoc (LitInst _ _ _ loc) = loc + +dictPred (Dict _ pred _ ) = pred +dictPred inst = pprPanic "dictPred" (ppr inst) + +getDictClassTys (Dict _ pred _) = getClassPredTys pred + +-- fdPredsOfInst is used to get predicates that contain functional +-- dependencies *or* might do so. The "might do" part is because +-- a constraint (C a b) might have a superclass with FDs +-- Leaving these in is really important for the call to fdPredsOfInsts +-- in TcSimplify.inferLoop, because the result is fed to 'grow', +-- which is supposed to be conservative +fdPredsOfInst (Dict _ pred _) = [pred] +fdPredsOfInst (Method _ _ _ theta _) = theta +fdPredsOfInst other = [] -- LitInsts etc + +fdPredsOfInsts :: [Inst] -> [PredType] +fdPredsOfInsts insts = concatMap fdPredsOfInst insts + +isInheritableInst (Dict _ pred _) = isInheritablePred pred +isInheritableInst (Method _ _ _ theta _) = all isInheritablePred theta +isInheritableInst other = True + + +ipNamesOfInsts :: [Inst] -> [Name] +ipNamesOfInst :: Inst -> [Name] +-- Get the implicit parameters mentioned by these Insts +-- NB: ?x and %x get different Names +ipNamesOfInsts insts = [n | inst <- insts, n <- ipNamesOfInst inst] + +ipNamesOfInst (Dict _ (IParam n _) _) = [ipNameName n] +ipNamesOfInst (Method _ _ _ theta _) = [ipNameName n | IParam n _ <- theta] +ipNamesOfInst other = [] + +tyVarsOfInst :: Inst -> TcTyVarSet +tyVarsOfInst (LitInst _ _ ty _) = tyVarsOfType ty +tyVarsOfInst (Dict _ pred _) = tyVarsOfPred pred +tyVarsOfInst (Method _ id tys _ _) = tyVarsOfTypes tys `unionVarSet` idFreeTyVars id + -- The id might have free type variables; in the case of + -- locally-overloaded class methods, for example + + +tyVarsOfInsts insts = foldr (unionVarSet . tyVarsOfInst) emptyVarSet insts +tyVarsOfLIE lie = tyVarsOfInsts (lieToList lie) +\end{code} + +Predicates +~~~~~~~~~~ +\begin{code} +isDict :: Inst -> Bool +isDict (Dict _ _ _) = True +isDict other = False + +isClassDict :: Inst -> Bool +isClassDict (Dict _ pred _) = isClassPred pred +isClassDict other = False + +isTyVarDict :: Inst -> Bool +isTyVarDict (Dict _ pred _) = isTyVarClassPred pred +isTyVarDict other = False + +isIPDict :: Inst -> Bool +isIPDict (Dict _ pred _) = isIPPred pred +isIPDict other = False + +isMethod :: Inst -> Bool +isMethod (Method {}) = True +isMethod other = False + +isMethodFor :: TcIdSet -> Inst -> Bool +isMethodFor ids (Method uniq id tys _ loc) = id `elemVarSet` ids +isMethodFor ids inst = False + +isLinearInst :: Inst -> Bool +isLinearInst (Dict _ pred _) = isLinearPred pred +isLinearInst other = False + -- We never build Method Insts that have + -- linear implicit paramters in them. 
+ -- Hence no need to look for Methods + -- See TcExpr.tcId + +linearInstType :: Inst -> TcType -- %x::t --> t +linearInstType (Dict _ (IParam _ ty) _) = ty +\end{code} + + + +%************************************************************************ +%* * +\subsection{Building dictionaries} +%* * +%************************************************************************ + +\begin{code} +newDicts :: InstOrigin + -> TcThetaType + -> TcM [Inst] +newDicts orig theta + = getInstLoc orig `thenM` \ loc -> + newDictsAtLoc loc theta + +cloneDict :: Inst -> TcM Inst +cloneDict (Dict nm ty loc) = newUnique `thenM` \ uniq -> + returnM (Dict (setNameUnique nm uniq) ty loc) + +newDictAtLoc :: InstLoc -> TcPredType -> TcM Inst +newDictAtLoc inst_loc pred + = do { uniq <- newUnique + ; return (mkDict inst_loc uniq pred) } + +newDictsAtLoc :: InstLoc -> TcThetaType -> TcM [Inst] +newDictsAtLoc inst_loc theta + = newUniqueSupply `thenM` \ us -> + returnM (zipWith (mkDict inst_loc) (uniqsFromSupply us) theta) + +mkDict inst_loc uniq pred + = Dict name pred inst_loc + where + name = mkPredName uniq (instLocSrcLoc inst_loc) pred + +-- For vanilla implicit parameters, there is only one in scope +-- at any time, so we used to use the name of the implicit parameter itself +-- But with splittable implicit parameters there may be many in +-- scope, so we make up a new name. +newIPDict :: InstOrigin -> IPName Name -> Type + -> TcM (IPName Id, Inst) +newIPDict orig ip_name ty + = getInstLoc orig `thenM` \ inst_loc -> + newUnique `thenM` \ uniq -> + let + pred = IParam ip_name ty + name = mkPredName uniq (instLocSrcLoc inst_loc) pred + dict = Dict name pred inst_loc + in + returnM (mapIPName (\n -> instToId dict) ip_name, dict) +\end{code} + + + +%************************************************************************ +%* * +\subsection{Building methods (calls of overloaded functions)} +%* * +%************************************************************************ + + +\begin{code} +tcInstStupidTheta :: DataCon -> [TcType] -> TcM () +-- Instantiate the "stupid theta" of the data con, and throw +-- the constraints into the constraint set +tcInstStupidTheta data_con inst_tys + | null stupid_theta + = return () + | otherwise + = do { stupid_dicts <- newDicts (OccurrenceOf (dataConName data_con)) + (substTheta tenv stupid_theta) + ; extendLIEs stupid_dicts } + where + stupid_theta = dataConStupidTheta data_con + tenv = zipTopTvSubst (dataConTyVars data_con) inst_tys + +newMethodFromName :: InstOrigin -> BoxyRhoType -> Name -> TcM TcId +newMethodFromName origin ty name + = tcLookupId name `thenM` \ id -> + -- Use tcLookupId not tcLookupGlobalId; the method is almost + -- always a class op, but with -fno-implicit-prelude GHC is + -- meant to find whatever thing is in scope, and that may + -- be an ordinary function. + getInstLoc origin `thenM` \ loc -> + tcInstClassOp loc id [ty] `thenM` \ inst -> + extendLIE inst `thenM_` + returnM (instToId inst) + +newMethodWithGivenTy orig id tys + = getInstLoc orig `thenM` \ loc -> + newMethod loc id tys `thenM` \ inst -> + extendLIE inst `thenM_` + returnM (instToId inst) + +-------------------------------------------- +-- tcInstClassOp, and newMethod do *not* drop the +-- Inst into the LIE; they just returns the Inst +-- This is important because they are used by TcSimplify +-- to simplify Insts + +-- NB: the kind of the type variable to be instantiated +-- might be a sub-kind of the type to which it is applied, +-- notably when the latter is a type variable of kind ?? 
+-- Hence the call to checkKind +-- A worry: is this needed anywhere else? +tcInstClassOp :: InstLoc -> Id -> [TcType] -> TcM Inst +tcInstClassOp inst_loc sel_id tys + = let + (tyvars, _rho) = tcSplitForAllTys (idType sel_id) + in + zipWithM_ checkKind tyvars tys `thenM_` + newMethod inst_loc sel_id tys + +checkKind :: TyVar -> TcType -> TcM () +-- Ensure that the type has a sub-kind of the tyvar +checkKind tv ty + = do { let ty1 = ty + -- ty1 <- zonkTcType ty + ; if typeKind ty1 `isSubKind` tyVarKind tv + then return () + else + + pprPanic "checkKind: adding kind constraint" + (vcat [ppr tv <+> ppr (tyVarKind tv), + ppr ty <+> ppr ty1 <+> ppr (typeKind ty1)]) + } +-- do { tv1 <- tcInstTyVar tv +-- ; unifyType ty1 (mkTyVarTy tv1) } } + + +--------------------------- +newMethod inst_loc id tys + = newUnique `thenM` \ new_uniq -> + let + (theta,tau) = tcSplitPhiTy (applyTys (idType id) tys) + meth_id = mkUserLocal (mkMethodOcc (getOccName id)) new_uniq tau loc + inst = Method meth_id id tys theta inst_loc + loc = instLocSrcLoc inst_loc + in + returnM inst +\end{code} + +\begin{code} +shortCutIntLit :: Integer -> TcType -> Maybe (HsExpr TcId) +shortCutIntLit i ty + | isIntTy ty && inIntRange i -- Short cut for Int + = Just (HsLit (HsInt i)) + | isIntegerTy ty -- Short cut for Integer + = Just (HsLit (HsInteger i ty)) + | otherwise = Nothing + +shortCutFracLit :: Rational -> TcType -> Maybe (HsExpr TcId) +shortCutFracLit f ty + | isFloatTy ty + = Just (mk_lit floatDataCon (HsFloatPrim f)) + | isDoubleTy ty + = Just (mk_lit doubleDataCon (HsDoublePrim f)) + | otherwise = Nothing + where + mk_lit con lit = HsApp (nlHsVar (dataConWrapId con)) (nlHsLit lit) + +mkIntegerLit :: Integer -> TcM (LHsExpr TcId) +mkIntegerLit i + = tcMetaTy integerTyConName `thenM` \ integer_ty -> + getSrcSpanM `thenM` \ span -> + returnM (L span $ HsLit (HsInteger i integer_ty)) + +mkRatLit :: Rational -> TcM (LHsExpr TcId) +mkRatLit r + = tcMetaTy rationalTyConName `thenM` \ rat_ty -> + getSrcSpanM `thenM` \ span -> + returnM (L span $ HsLit (HsRat r rat_ty)) + +isHsVar :: HsExpr Name -> Name -> Bool +isHsVar (HsVar f) g = f==g +isHsVar other g = False +\end{code} + + +%************************************************************************ +%* * +\subsection{Zonking} +%* * +%************************************************************************ + +Zonking makes sure that the instance types are fully zonked. + +\begin{code} +zonkInst :: Inst -> TcM Inst +zonkInst (Dict name pred loc) + = zonkTcPredType pred `thenM` \ new_pred -> + returnM (Dict name new_pred loc) + +zonkInst (Method m id tys theta loc) + = zonkId id `thenM` \ new_id -> + -- Essential to zonk the id in case it's a local variable + -- Can't use zonkIdOcc because the id might itself be + -- an InstId, in which case it won't be in scope + + zonkTcTypes tys `thenM` \ new_tys -> + zonkTcThetaType theta `thenM` \ new_theta -> + returnM (Method m new_id new_tys new_theta loc) + +zonkInst (LitInst nm lit ty loc) + = zonkTcType ty `thenM` \ new_ty -> + returnM (LitInst nm lit new_ty loc) + +zonkInsts insts = mappM zonkInst insts +\end{code} + + +%************************************************************************ +%* * +\subsection{Printing} +%* * +%************************************************************************ + +ToDo: improve these pretty-printing things. The ``origin'' is really only +relevant in error messages. 
+ +\begin{code} +instance Outputable Inst where + ppr inst = pprInst inst + +pprDictsTheta :: [Inst] -> SDoc +-- Print in type-like fashion (Eq a, Show b) +pprDictsTheta dicts = pprTheta (map dictPred dicts) + +pprDictsInFull :: [Inst] -> SDoc +-- Print in type-like fashion, but with source location +pprDictsInFull dicts + = vcat (map go dicts) + where + go dict = sep [quotes (ppr (dictPred dict)), nest 2 (pprInstLoc (instLoc dict))] + +pprInsts :: [Inst] -> SDoc +-- Debugging: print the evidence :: type +pprInsts insts = brackets (interpp'SP insts) + +pprInst, pprInstInFull :: Inst -> SDoc +-- Debugging: print the evidence :: type +pprInst (LitInst nm lit ty loc) = ppr nm <+> dcolon <+> ppr ty +pprInst (Dict nm pred loc) = ppr nm <+> dcolon <+> pprPred pred + +pprInst m@(Method inst_id id tys theta loc) + = ppr inst_id <+> dcolon <+> + braces (sep [ppr id <+> ptext SLIT("at"), + brackets (sep (map pprParendType tys))]) + +pprInstInFull inst + = sep [quotes (pprInst inst), nest 2 (pprInstLoc (instLoc inst))] + +tidyInst :: TidyEnv -> Inst -> Inst +tidyInst env (LitInst nm lit ty loc) = LitInst nm lit (tidyType env ty) loc +tidyInst env (Dict nm pred loc) = Dict nm (tidyPred env pred) loc +tidyInst env (Method u id tys theta loc) = Method u id (tidyTypes env tys) theta loc + +tidyMoreInsts :: TidyEnv -> [Inst] -> (TidyEnv, [Inst]) +-- This function doesn't assume that the tyvars are in scope +-- so it works like tidyOpenType, returning a TidyEnv +tidyMoreInsts env insts + = (env', map (tidyInst env') insts) + where + env' = tidyFreeTyVars env (tyVarsOfInsts insts) + +tidyInsts :: [Inst] -> (TidyEnv, [Inst]) +tidyInsts insts = tidyMoreInsts emptyTidyEnv insts + +showLIE :: SDoc -> TcM () -- Debugging +showLIE str + = do { lie_var <- getLIEVar ; + lie <- readMutVar lie_var ; + traceTc (str <+> vcat (map pprInstInFull (lieToList lie))) } +\end{code} + + +%************************************************************************ +%* * + Extending the instance environment +%* * +%************************************************************************ + +\begin{code} +tcExtendLocalInstEnv :: [Instance] -> TcM a -> TcM a + -- Add new locally-defined instances +tcExtendLocalInstEnv dfuns thing_inside + = do { traceDFuns dfuns + ; env <- getGblEnv + ; inst_env' <- foldlM addLocalInst (tcg_inst_env env) dfuns + ; let env' = env { tcg_insts = dfuns ++ tcg_insts env, + tcg_inst_env = inst_env' } + ; setGblEnv env' thing_inside } + +addLocalInst :: InstEnv -> Instance -> TcM InstEnv +-- Check that the proposed new instance is OK, +-- and then add it to the home inst env +addLocalInst home_ie ispec + = do { -- Instantiate the dfun type so that we extend the instance + -- envt with completely fresh template variables + -- This is important because the template variables must + -- not overlap with anything in the things being looked up + -- (since we do unification). + -- We use tcInstSkolType because we don't want to allocate fresh + -- *meta* type variables. 
+ let dfun = instanceDFunId ispec + ; (tvs', theta', tau') <- tcInstSkolType (InstSkol dfun) (idType dfun) + ; let (cls, tys') = tcSplitDFunHead tau' + dfun' = setIdType dfun (mkSigmaTy tvs' theta' tau') + ispec' = setInstanceDFunId ispec dfun' + + -- Load imported instances, so that we report + -- duplicates correctly + ; eps <- getEps + ; let inst_envs = (eps_inst_env eps, home_ie) + + -- Check functional dependencies + ; case checkFunDeps inst_envs ispec' of + Just specs -> funDepErr ispec' specs + Nothing -> return () + + -- Check for duplicate instance decls + ; let { (matches, _) = lookupInstEnv inst_envs cls tys' + ; dup_ispecs = [ dup_ispec + | (_, dup_ispec) <- matches + , let (_,_,_,dup_tys) = instanceHead dup_ispec + , isJust (tcMatchTys (mkVarSet tvs') tys' dup_tys)] } + -- Find memebers of the match list which ispec itself matches. + -- If the match is 2-way, it's a duplicate + ; case dup_ispecs of + dup_ispec : _ -> dupInstErr ispec' dup_ispec + [] -> return () + + -- OK, now extend the envt + ; return (extendInstEnv home_ie ispec') } + +getOverlapFlag :: TcM OverlapFlag +getOverlapFlag + = do { dflags <- getDOpts + ; let overlap_ok = dopt Opt_AllowOverlappingInstances dflags + incoherent_ok = dopt Opt_AllowIncoherentInstances dflags + overlap_flag | incoherent_ok = Incoherent + | overlap_ok = OverlapOk + | otherwise = NoOverlap + + ; return overlap_flag } + +traceDFuns ispecs + = traceTc (hang (text "Adding instances:") 2 (vcat (map pp ispecs))) + where + pp ispec = ppr (instanceDFunId ispec) <+> colon <+> ppr ispec + -- Print the dfun name itself too + +funDepErr ispec ispecs + = addDictLoc ispec $ + addErr (hang (ptext SLIT("Functional dependencies conflict between instance declarations:")) + 2 (pprInstances (ispec:ispecs))) +dupInstErr ispec dup_ispec + = addDictLoc ispec $ + addErr (hang (ptext SLIT("Duplicate instance declarations:")) + 2 (pprInstances [ispec, dup_ispec])) + +addDictLoc ispec thing_inside + = setSrcSpan (mkSrcSpan loc loc) thing_inside + where + loc = getSrcLoc ispec +\end{code} + + +%************************************************************************ +%* * +\subsection{Looking up Insts} +%* * +%************************************************************************ + +\begin{code} +data LookupInstResult + = NoInstance + | SimpleInst (LHsExpr TcId) -- Just a variable, type application, or literal + | GenInst [Inst] (LHsExpr TcId) -- The expression and its needed insts + +lookupInst :: Inst -> TcM LookupInstResult +-- It's important that lookupInst does not put any new stuff into +-- the LIE. Instead, any Insts needed by the lookup are returned in +-- the LookupInstResult, where they can be further processed by tcSimplify + + +-- Methods + +lookupInst inst@(Method _ id tys theta loc) + = newDictsAtLoc loc theta `thenM` \ dicts -> + returnM (GenInst dicts (mkHsDictApp (mkHsTyApp (L span (HsVar id)) tys) (map instToId dicts))) + where + span = instLocSrcSpan loc + +-- Literals + +-- Look for short cuts first: if the literal is *definitely* a +-- int, integer, float or a double, generate the real thing here. +-- This is essential (see nofib/spectral/nucleic). 
+-- [Same shortcut as in newOverloadedLit, but we +-- may have done some unification by now] + +lookupInst inst@(LitInst _nm (HsIntegral i from_integer_name) ty loc) + | Just expr <- shortCutIntLit i ty + = returnM (GenInst [] (noLoc expr)) -- GenInst, not SimpleInst, because + -- expr may be a constructor application + | otherwise + = ASSERT( from_integer_name `isHsVar` fromIntegerName ) -- A LitInst invariant + tcLookupId fromIntegerName `thenM` \ from_integer -> + tcInstClassOp loc from_integer [ty] `thenM` \ method_inst -> + mkIntegerLit i `thenM` \ integer_lit -> + returnM (GenInst [method_inst] + (mkHsApp (L (instLocSrcSpan loc) + (HsVar (instToId method_inst))) integer_lit)) + +lookupInst inst@(LitInst _nm (HsFractional f from_rat_name) ty loc) + | Just expr <- shortCutFracLit f ty + = returnM (GenInst [] (noLoc expr)) + + | otherwise + = ASSERT( from_rat_name `isHsVar` fromRationalName ) -- A LitInst invariant + tcLookupId fromRationalName `thenM` \ from_rational -> + tcInstClassOp loc from_rational [ty] `thenM` \ method_inst -> + mkRatLit f `thenM` \ rat_lit -> + returnM (GenInst [method_inst] (mkHsApp (L (instLocSrcSpan loc) + (HsVar (instToId method_inst))) rat_lit)) + +-- Dictionaries +lookupInst (Dict _ pred loc) + = do { mb_result <- lookupPred pred + ; case mb_result of { + Nothing -> return NoInstance ; + Just (tenv, dfun_id) -> do + + -- tenv is a substitution that instantiates the dfun_id + -- to match the requested result type. + -- + -- We ASSUME that the dfun is quantified over the very same tyvars + -- that are bound by the tenv. + -- + -- However, the dfun + -- might have some tyvars that *only* appear in arguments + -- dfun :: forall a b. C a b, Ord b => D [a] + -- We instantiate b to a flexi type variable -- it'll presumably + -- become fixed later via functional dependencies + { use_stage <- getStage + ; checkWellStaged (ptext SLIT("instance for") <+> quotes (ppr pred)) + (topIdLvl dfun_id) use_stage + + -- It's possible that not all the tyvars are in + -- the substitution, tenv. For example: + -- instance C X a => D X where ... + -- (presumably there's a functional dependency in class C) + -- Hence the open_tvs to instantiate any un-substituted tyvars. + ; let (tyvars, rho) = tcSplitForAllTys (idType dfun_id) + open_tvs = filter (`notElemTvSubst` tenv) tyvars + ; open_tvs' <- mappM tcInstTyVar open_tvs + ; let + tenv' = extendTvSubstList tenv open_tvs (mkTyVarTys open_tvs') + -- Since the open_tvs' are freshly made, they cannot possibly be captured by + -- any nested for-alls in rho. 
So the in-scope set is unchanged + dfun_rho = substTy tenv' rho + (theta, _) = tcSplitPhiTy dfun_rho + ty_app = mkHsTyApp (L (instLocSrcSpan loc) (HsVar dfun_id)) + (map (substTyVar tenv') tyvars) + ; if null theta then + returnM (SimpleInst ty_app) + else do + { dicts <- newDictsAtLoc loc theta + ; let rhs = mkHsDictApp ty_app (map instToId dicts) + ; returnM (GenInst dicts rhs) + }}}} + +--------------- +lookupPred :: TcPredType -> TcM (Maybe (TvSubst, DFunId)) +-- Look up a class constraint in the instance environment +lookupPred pred@(ClassP clas tys) + = do { eps <- getEps + ; tcg_env <- getGblEnv + ; let inst_envs = (eps_inst_env eps, tcg_inst_env tcg_env) + ; case lookupInstEnv inst_envs clas tys of { + ([(tenv, ispec)], []) + -> do { let dfun_id = is_dfun ispec + ; traceTc (text "lookupInst success" <+> + vcat [text "dict" <+> ppr pred, + text "witness" <+> ppr dfun_id + <+> ppr (idType dfun_id) ]) + -- Record that this dfun is needed + ; record_dfun_usage dfun_id + ; return (Just (tenv, dfun_id)) } ; + + (matches, unifs) + -> do { traceTc (text "lookupInst fail" <+> + vcat [text "dict" <+> ppr pred, + text "matches" <+> ppr matches, + text "unifs" <+> ppr unifs]) + -- In the case of overlap (multiple matches) we report + -- NoInstance here. That has the effect of making the + -- context-simplifier return the dict as an irreducible one. + -- Then it'll be given to addNoInstanceErrs, which will do another + -- lookupInstEnv to get the detailed info about what went wrong. + ; return Nothing } + }} + +lookupPred ip_pred = return Nothing + +record_dfun_usage dfun_id + = do { gbl <- getGblEnv + ; let dfun_name = idName dfun_id + dfun_mod = nameModule dfun_name + ; if isInternalName dfun_name || -- Internal name => defined in this module + not (isHomeModule (tcg_home_mods gbl) dfun_mod) + then return () -- internal, or in another package + else do { tcg_env <- getGblEnv + ; updMutVar (tcg_inst_uses tcg_env) + (`addOneToNameSet` idName dfun_id) }} + + +tcGetInstEnvs :: TcM (InstEnv, InstEnv) +-- Gets both the external-package inst-env +-- and the home-pkg inst env (includes module being compiled) +tcGetInstEnvs = do { eps <- getEps; env <- getGblEnv; + return (eps_inst_env eps, tcg_inst_env env) } +\end{code} + + + +%************************************************************************ +%* * + Re-mappable syntax +%* * +%************************************************************************ + +Suppose we are doing the -fno-implicit-prelude thing, and we encounter +a do-expression. We have to find (>>) in the current environment, which is +done by the rename. Then we have to check that it has the same type as +Control.Monad.(>>). Or, more precisely, a compatible type. One 'customer' had +this: + + (>>) :: HB m n mn => m a -> n b -> mn b + +So the idea is to generate a local binding for (>>), thus: + + let then72 :: forall a b. m a -> m b -> m b + then72 = ...something involving the user's (>>)... + in + ...the do-expression... + +Now the do-expression can proceed using then72, which has exactly +the expected type. + +In fact tcSyntaxName just generates the RHS for then72, because we only +want an actual binding in the do-expression case. For literals, we can +just use the expression inline. 
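+
+As a rough, self-contained illustration of the trick (ordinary Haskell, using
+the HB class from the example above, with thenU standing in for the user's
+(>>) so the sketch stands alone; this shows only the shape of the thing, not
+the code that tcSyntaxName actually generates):
+
+    {-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleContexts #-}
+
+    -- Illustrative only: the user's generalised sequencing operator
+    class HB m n mn | m n -> mn where
+      thenU :: m a -> n b -> mn b
+
+    -- The local binding the typechecker makes up ("then72" above):
+    -- thenU pinned down at a single monad m, so the rest of the
+    -- do-expression sees the standard shape  m a -> m b -> m b
+    then72 :: HB m m m => m a -> m b -> m b
+    then72 = thenU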
+ +\begin{code} +tcSyntaxName :: InstOrigin + -> TcType -- Type to instantiate it at + -> (Name, HsExpr Name) -- (Standard name, user name) + -> TcM (Name, HsExpr TcId) -- (Standard name, suitable expression) +-- *** NOW USED ONLY FOR CmdTop (sigh) *** +-- NB: tcSyntaxName calls tcExpr, and hence can do unification. +-- So we do not call it from lookupInst, which is called from tcSimplify + +tcSyntaxName orig ty (std_nm, HsVar user_nm) + | std_nm == user_nm + = newMethodFromName orig ty std_nm `thenM` \ id -> + returnM (std_nm, HsVar id) + +tcSyntaxName orig ty (std_nm, user_nm_expr) + = tcLookupId std_nm `thenM` \ std_id -> + let + -- C.f. newMethodAtLoc + ([tv], _, tau) = tcSplitSigmaTy (idType std_id) + sigma1 = substTyWith [tv] [ty] tau + -- Actually, the "tau-type" might be a sigma-type in the + -- case of locally-polymorphic methods. + in + addErrCtxtM (syntaxNameCtxt user_nm_expr orig sigma1) $ + + -- Check that the user-supplied thing has the + -- same type as the standard one. + -- Tiresome jiggling because tcCheckSigma takes a located expression + getSrcSpanM `thenM` \ span -> + tcPolyExpr (L span user_nm_expr) sigma1 `thenM` \ expr -> + returnM (std_nm, unLoc expr) + +syntaxNameCtxt name orig ty tidy_env + = getInstLoc orig `thenM` \ inst_loc -> + let + msg = vcat [ptext SLIT("When checking that") <+> quotes (ppr name) <+> + ptext SLIT("(needed by a syntactic construct)"), + nest 2 (ptext SLIT("has the required type:") <+> ppr (tidyType tidy_env ty)), + nest 2 (pprInstLoc inst_loc)] + in + returnM (tidy_env, msg) +\end{code} diff --git a/compiler/typecheck/TcArrows.lhs b/compiler/typecheck/TcArrows.lhs new file mode 100644 index 0000000000..3bfa9b4757 --- /dev/null +++ b/compiler/typecheck/TcArrows.lhs @@ -0,0 +1,350 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section{Typecheck arrow notation} + +\begin{code} +module TcArrows ( tcProc ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} TcExpr( tcMonoExpr, tcInferRho ) + +import HsSyn +import TcHsSyn ( mkHsDictLet ) + +import TcMatches ( matchCtxt, tcStmts, tcMDoStmt, tcGuardStmt, + TcMatchCtxt(..), tcMatchesCase ) + +import TcType ( TcType, TcTauType, BoxyRhoType, mkFunTys, mkTyConApp, + mkTyVarTy, mkAppTys, tcSplitTyConApp_maybe, tcEqType, + SkolemInfo(..) ) +import TcMType ( newFlexiTyVarTy, tcInstSkolTyVars, zonkTcType ) +import TcBinds ( tcLocalBinds ) +import TcSimplify ( tcSimplifyCheck ) +import TcPat ( tcPat, tcPats, PatCtxt(..) ) +import TcUnify ( checkSigTyVarsWrt, boxySplitAppTy ) +import TcRnMonad +import Inst ( tcSyntaxName ) +import Name ( Name ) +import TysWiredIn ( boolTy, pairTyCon ) +import VarSet +import TysPrim ( alphaTyVar ) +import Type ( Kind, mkArrowKinds, liftedTypeKind, openTypeKind, tyVarsOfTypes ) + +import SrcLoc ( Located(..) 
) +import Outputable +import Util ( lengthAtLeast ) + +\end{code} + +%************************************************************************ +%* * + Proc +%* * +%************************************************************************ + +\begin{code} +tcProc :: InPat Name -> LHsCmdTop Name -- proc pat -> expr + -> BoxyRhoType -- Expected type of whole proc expression + -> TcM (OutPat TcId, LHsCmdTop TcId) + +tcProc pat cmd exp_ty + = newArrowScope $ + do { (exp_ty1, res_ty) <- boxySplitAppTy exp_ty + ; (arr_ty, arg_ty) <- boxySplitAppTy exp_ty1 + ; let cmd_env = CmdEnv { cmd_arr = arr_ty } + ; (pat', cmd') <- tcPat LamPat pat arg_ty res_ty $ \ res_ty' -> + tcCmdTop cmd_env cmd ([], res_ty') + ; return (pat', cmd') } +\end{code} + + +%************************************************************************ +%* * + Commands +%* * +%************************************************************************ + +\begin{code} +type CmdStack = [TcTauType] +data CmdEnv + = CmdEnv { + cmd_arr :: TcType -- arrow type constructor, of kind *->*->* + } + +mkCmdArrTy :: CmdEnv -> TcTauType -> TcTauType -> TcTauType +mkCmdArrTy env t1 t2 = mkAppTys (cmd_arr env) [t1, t2] + +--------------------------------------- +tcCmdTop :: CmdEnv + -> LHsCmdTop Name + -> (CmdStack, TcTauType) -- Expected result type; always a monotype + -- We know exactly how many cmd args are expected, + -- albeit perhaps not their types; so we can pass + -- in a CmdStack + -> TcM (LHsCmdTop TcId) + +tcCmdTop env (L loc (HsCmdTop cmd _ _ names)) (cmd_stk, res_ty) + = setSrcSpan loc $ + do { cmd' <- tcCmd env cmd (cmd_stk, res_ty) + ; names' <- mapM (tcSyntaxName ProcOrigin (cmd_arr env)) names + ; return (L loc $ HsCmdTop cmd' cmd_stk res_ty names') } + + +---------------------------------------- +tcCmd :: CmdEnv -> LHsExpr Name -> (CmdStack, TcTauType) -> TcM (LHsExpr TcId) + -- The main recursive function +tcCmd env (L loc expr) res_ty + = setSrcSpan loc $ do + { expr' <- tc_cmd env expr res_ty + ; return (L loc expr') } + +tc_cmd env (HsPar cmd) res_ty + = do { cmd' <- tcCmd env cmd res_ty + ; return (HsPar cmd') } + +tc_cmd env (HsLet binds (L body_loc body)) res_ty + = do { (binds', body') <- tcLocalBinds binds $ + setSrcSpan body_loc $ + tc_cmd env body res_ty + ; return (HsLet binds' (L body_loc body')) } + +tc_cmd env in_cmd@(HsCase scrut matches) (stk, res_ty) + = addErrCtxt (cmdCtxt in_cmd) $ + addErrCtxt (caseScrutCtxt scrut) ( + tcInferRho scrut + ) `thenM` \ (scrut', scrut_ty) -> + tcMatchesCase match_ctxt scrut_ty matches res_ty `thenM` \ matches' -> + returnM (HsCase scrut' matches') + where + match_ctxt = MC { mc_what = CaseAlt, + mc_body = mc_body } + mc_body body res_ty' = tcCmd env body (stk, res_ty') + +tc_cmd env (HsIf pred b1 b2) res_ty + = do { pred' <- tcMonoExpr pred boolTy + ; b1' <- tcCmd env b1 res_ty + ; b2' <- tcCmd env b2 res_ty + ; return (HsIf pred' b1' b2') + } + +------------------------------------------- +-- Arrow application +-- (f -< a) or (f -<< a) + +tc_cmd env cmd@(HsArrApp fun arg _ ho_app lr) (cmd_stk, res_ty) + = addErrCtxt (cmdCtxt cmd) $ + do { arg_ty <- newFlexiTyVarTy openTypeKind + ; let fun_ty = mkCmdArrTy env (foldl mkPairTy arg_ty cmd_stk) res_ty + + ; fun' <- select_arrow_scope (tcMonoExpr fun fun_ty) + + ; arg' <- tcMonoExpr arg arg_ty + + ; return (HsArrApp fun' arg' fun_ty ho_app lr) } + where + -- Before type-checking f, use the environment of the enclosing + -- proc for the (-<) case. + -- Local bindings, inside the enclosing proc, are not in scope + -- inside f. 
In the higher-order case (-<<), they are. + select_arrow_scope tc = case ho_app of + HsHigherOrderApp -> tc + HsFirstOrderApp -> escapeArrowScope tc + +------------------------------------------- +-- Command application + +tc_cmd env cmd@(HsApp fun arg) (cmd_stk, res_ty) + = addErrCtxt (cmdCtxt cmd) $ +-- gaw 2004 FIX? + do { arg_ty <- newFlexiTyVarTy openTypeKind + + ; fun' <- tcCmd env fun (arg_ty:cmd_stk, res_ty) + + ; arg' <- tcMonoExpr arg arg_ty + + ; return (HsApp fun' arg') } + +------------------------------------------- +-- Lambda + +-- gaw 2004 +tc_cmd env cmd@(HsLam (MatchGroup [L mtch_loc (match@(Match pats maybe_rhs_sig grhss))] _)) + (cmd_stk, res_ty) + = addErrCtxt (matchCtxt match_ctxt match) $ + + do { -- Check the cmd stack is big enough + ; checkTc (lengthAtLeast cmd_stk n_pats) + (kappaUnderflow cmd) + + -- Check the patterns, and the GRHSs inside + ; (pats', grhss') <- setSrcSpan mtch_loc $ + tcPats LamPat pats cmd_stk res_ty $ + tc_grhss grhss + + ; let match' = L mtch_loc (Match pats' Nothing grhss') + ; return (HsLam (MatchGroup [match'] res_ty)) + } + + where + n_pats = length pats + stk' = drop n_pats cmd_stk + match_ctxt = LambdaExpr -- Maybe KappaExpr? + pg_ctxt = PatGuard match_ctxt + + tc_grhss (GRHSs grhss binds) res_ty + = do { (binds', grhss') <- tcLocalBinds binds $ + mapM (wrapLocM (tc_grhs res_ty)) grhss + ; return (GRHSs grhss' binds') } + + tc_grhs res_ty (GRHS guards body) + = do { (guards', rhs') <- tcStmts pg_ctxt tcGuardStmt + guards res_ty + (\res_ty' -> tcCmd env body (stk', res_ty')) + ; return (GRHS guards' rhs') } + +------------------------------------------- +-- Do notation + +tc_cmd env cmd@(HsDo do_or_lc stmts body ty) (cmd_stk, res_ty) + = do { checkTc (null cmd_stk) (nonEmptyCmdStkErr cmd) + ; (stmts', body') <- tcStmts do_or_lc tc_stmt stmts res_ty $ \ res_ty' -> + tcCmd env body ([], res_ty') + ; return (HsDo do_or_lc stmts' body' res_ty) } + where + tc_stmt = tcMDoStmt tc_rhs + tc_rhs rhs = do { ty <- newFlexiTyVarTy liftedTypeKind + ; rhs' <- tcCmd env rhs ([], ty) + ; return (rhs', ty) } + + +----------------------------------------------------------------- +-- Arrow ``forms'' (| e c1 .. cn |) +-- +-- G |-b c : [s1 .. sm] s +-- pop(G) |- e : forall w. b ((w,s1) .. sm) s +-- -> a ((w,t1) .. tn) t +-- e \not\in (s, s1..sm, t, t1..tn) +-- ---------------------------------------------- +-- G |-a (| e c |) : [t1 .. tn] t + +tc_cmd env cmd@(HsArrForm expr fixity cmd_args) (cmd_stk, res_ty) + = addErrCtxt (cmdCtxt cmd) $ + do { cmds_w_tys <- zipWithM new_cmd_ty cmd_args [1..] + ; span <- getSrcSpanM + ; [w_tv] <- tcInstSkolTyVars (ArrowSkol span) [alphaTyVar] + ; let w_ty = mkTyVarTy w_tv -- Just a convenient starting point + + -- a ((w,t1) .. tn) t + ; let e_res_ty = mkCmdArrTy env (foldl mkPairTy w_ty cmd_stk) res_ty + + -- b ((w,s1) .. sm) s + -- -> a ((w,t1) .. tn) t + ; let e_ty = mkFunTys [mkAppTys b [tup,s] | (_,_,b,tup,s) <- cmds_w_tys] + e_res_ty + + -- Check expr + ; (expr', lie) <- escapeArrowScope (getLIE (tcMonoExpr expr e_ty)) + ; inst_binds <- tcSimplifyCheck sig_msg [w_tv] [] lie + + -- Check that the polymorphic variable hasn't been unified with anything + -- and is not free in res_ty or the cmd_stk (i.e. 
t, t1..tn) + ; checkSigTyVarsWrt (tyVarsOfTypes (res_ty:cmd_stk)) [w_tv] + + -- OK, now we are in a position to unscramble + -- the s1..sm and check each cmd + ; cmds' <- mapM (tc_cmd w_tv) cmds_w_tys + + ; returnM (HsArrForm (mkHsTyLam [w_tv] (mkHsDictLet inst_binds expr')) fixity cmds') + } + where + -- Make the types + -- b, ((e,s1) .. sm), s + new_cmd_ty :: LHsCmdTop Name -> Int + -> TcM (LHsCmdTop Name, Int, TcType, TcType, TcType) + new_cmd_ty cmd i +-- gaw 2004 FIX? + = do { b_ty <- newFlexiTyVarTy arrowTyConKind + ; tup_ty <- newFlexiTyVarTy liftedTypeKind + -- We actually make a type variable for the tuple + -- because we don't know how deeply nested it is yet + ; s_ty <- newFlexiTyVarTy liftedTypeKind + ; return (cmd, i, b_ty, tup_ty, s_ty) + } + + tc_cmd w_tv (cmd, i, b, tup_ty, s) + = do { tup_ty' <- zonkTcType tup_ty + ; let (corner_ty, arg_tys) = unscramble tup_ty' + + -- Check that it has the right shape: + -- ((w,s1) .. sn) + -- where the si do not mention w + ; checkTc (corner_ty `tcEqType` mkTyVarTy w_tv && + not (w_tv `elemVarSet` tyVarsOfTypes arg_tys)) + (badFormFun i tup_ty') + + ; tcCmdTop (env { cmd_arr = b }) cmd (arg_tys, s) } + + unscramble :: TcType -> (TcType, [TcType]) + -- unscramble ((w,s1) .. sn) = (w, [s1..sn]) + unscramble ty + = case tcSplitTyConApp_maybe ty of + Just (tc, [t,s]) | tc == pairTyCon + -> let + (w,ss) = unscramble t + in (w, s:ss) + + other -> (ty, []) + + sig_msg = ptext SLIT("expected type of a command form") + +----------------------------------------------------------------- +-- Base case for illegal commands +-- This is where expressions that aren't commands get rejected + +tc_cmd env cmd _ + = failWithTc (vcat [ptext SLIT("The expression"), nest 2 (ppr cmd), + ptext SLIT("was found where an arrow command was expected")]) +\end{code} + + +%************************************************************************ +%* * + Helpers +%* * +%************************************************************************ + + +\begin{code} +mkPairTy t1 t2 = mkTyConApp pairTyCon [t1,t2] + +arrowTyConKind :: Kind -- *->*->* +arrowTyConKind = mkArrowKinds [liftedTypeKind, liftedTypeKind] liftedTypeKind +\end{code} + + +%************************************************************************ +%* * + Errors +%* * +%************************************************************************ + +\begin{code} +cmdCtxt cmd = ptext SLIT("In the command:") <+> ppr cmd + +caseScrutCtxt cmd + = hang (ptext SLIT("In the scrutinee of a case command:")) 4 (ppr cmd) + +nonEmptyCmdStkErr cmd + = hang (ptext SLIT("Non-empty command stack at command:")) + 4 (ppr cmd) + +kappaUnderflow cmd + = hang (ptext SLIT("Command stack underflow at command:")) + 4 (ppr cmd) + +badFormFun i tup_ty' + = hang (ptext SLIT("The type of the") <+> speakNth i <+> ptext SLIT("argument of a command form has the wrong shape")) + 4 (ptext SLIT("Argument type:") <+> ppr tup_ty') +\end{code} diff --git a/compiler/typecheck/TcBinds.lhs b/compiler/typecheck/TcBinds.lhs new file mode 100644 index 0000000000..cffcb9cfb9 --- /dev/null +++ b/compiler/typecheck/TcBinds.lhs @@ -0,0 +1,1117 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[TcBinds]{TcBinds} + +\begin{code} +module TcBinds ( tcLocalBinds, tcTopBinds, + tcHsBootSigs, tcMonoBinds, + TcPragFun, tcSpecPrag, tcPrags, mkPragFun, + TcSigInfo(..), + badBootDeclErr ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} TcMatches ( tcGRHSsPat, tcMatchesFun ) +import {-# SOURCE #-} TcExpr ( tcMonoExpr ) + +import 
DynFlags ( DynFlag(Opt_MonomorphismRestriction, Opt_GlasgowExts) ) +import HsSyn ( HsExpr(..), HsBind(..), LHsBinds, LHsBind, Sig(..), + HsLocalBinds(..), HsValBinds(..), HsIPBinds(..), + LSig, Match(..), IPBind(..), Prag(..), + HsType(..), LHsType, HsExplicitForAll(..), hsLTyVarNames, + isVanillaLSig, sigName, placeHolderNames, isPragLSig, + LPat, GRHSs, MatchGroup(..), pprLHsBinds, mkHsCoerce, + collectHsBindBinders, collectPatBinders, pprPatBind, isBangHsBind + ) +import TcHsSyn ( zonkId ) + +import TcRnMonad +import Inst ( newDictsAtLoc, newIPDict, instToId ) +import TcEnv ( tcExtendIdEnv, tcExtendIdEnv2, tcExtendTyVarEnv2, + pprBinders, tcLookupLocalId_maybe, tcLookupId, + tcGetGlobalTyVars ) +import TcUnify ( tcInfer, tcSubExp, unifyTheta, + bleatEscapedTvs, sigCtxt ) +import TcSimplify ( tcSimplifyInfer, tcSimplifyInferCheck, + tcSimplifyRestricted, tcSimplifyIPs ) +import TcHsType ( tcHsSigType, UserTypeCtxt(..) ) +import TcPat ( tcPat, PatCtxt(..) ) +import TcSimplify ( bindInstsOfLocalFuns ) +import TcMType ( newFlexiTyVarTy, zonkQuantifiedTyVar, zonkSigTyVar, + tcInstSigTyVars, tcInstSkolTyVars, tcInstType, + zonkTcType, zonkTcTypes, zonkTcTyVars ) +import TcType ( TcType, TcTyVar, TcThetaType, + SkolemInfo(SigSkol), UserTypeCtxt(FunSigCtxt), + TcTauType, TcSigmaType, isUnboxedTupleType, + mkTyVarTy, mkForAllTys, mkFunTys, exactTyVarsOfType, + mkForAllTy, isUnLiftedType, tcGetTyVar, + mkTyVarTys, tidyOpenTyVar ) +import Kind ( argTypeKind ) +import VarEnv ( TyVarEnv, emptyVarEnv, lookupVarEnv, extendVarEnv ) +import TysWiredIn ( unitTy ) +import TysPrim ( alphaTyVar ) +import Id ( Id, mkLocalId, mkVanillaGlobal ) +import IdInfo ( vanillaIdInfo ) +import Var ( TyVar, idType, idName ) +import Name ( Name ) +import NameSet +import NameEnv +import VarSet +import SrcLoc ( Located(..), unLoc, getLoc ) +import Bag +import ErrUtils ( Message ) +import Digraph ( SCC(..), stronglyConnComp ) +import Maybes ( expectJust, isJust, isNothing, orElse ) +import Util ( singleton ) +import BasicTypes ( TopLevelFlag(..), isTopLevel, isNotTopLevel, + RecFlag(..), isNonRec, InlineSpec, defaultInlineSpec ) +import Outputable +\end{code} + + +%************************************************************************ +%* * +\subsection{Type-checking bindings} +%* * +%************************************************************************ + +@tcBindsAndThen@ typechecks a @HsBinds@. The "and then" part is because +it needs to know something about the {\em usage} of the things bound, +so that it can create specialisations of them. So @tcBindsAndThen@ +takes a function which, given an extended environment, E, typechecks +the scope of the bindings returning a typechecked thing and (most +important) an LIE. It is this LIE which is then used as the basis for +specialising the things bound. + +@tcBindsAndThen@ also takes a "combiner" which glues together the +bindings and the "thing" to make a new "thing". + +The real work is done by @tcBindWithSigsAndThen@. + +Recursive and non-recursive binds are handled in essentially the same +way: because of uniques there are no scoping issues left. The only +difference is that non-recursive bindings can bind primitive values. + +Even for non-recursive binding groups we add typings for each binder +to the LVE for the following reason. When each individual binding is +checked the type of its LHS is unified with that of its RHS; and +type-checking the LHS of course requires that the binder is in scope. 
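Before going on, a rough picture of what ends up in the LIE may help; the little program below is hypothetical, and the hand-written NumDict type merely mimics what the compiler does internally, so none of these names come from this patch. Type-checking the right-hand side of

    f x = x + 1

emits a wanted (Num t) constraint into the LIE, where t is the type invented for x. Generalisation then discharges that constraint by turning it into an ordinary dictionary argument, roughly as if one had written

    data NumDict a = NumDict { plusD        :: a -> a -> a
                             , fromIntegerD :: Integer -> a }

    fD :: NumDict a -> a -> a
    fD d x = plusD d x (fromIntegerD d 1)

A call at a known type can later be specialised simply by supplying the corresponding dictionary value, which is why the LIE is the basis for specialising the things bound.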
+ +At the top-level the LIE is sure to contain nothing but constant +dictionaries, which we resolve at the module level. + +\begin{code} +tcTopBinds :: HsValBinds Name -> TcM (LHsBinds TcId, TcLclEnv) + -- Note: returning the TcLclEnv is more than we really + -- want. The bit we care about is the local bindings + -- and the free type variables thereof +tcTopBinds binds + = do { (ValBindsOut prs _, env) <- tcValBinds TopLevel binds getLclEnv + ; return (foldr (unionBags . snd) emptyBag prs, env) } + -- The top level bindings are flattened into a giant + -- implicitly-mutually-recursive LHsBinds + +tcHsBootSigs :: HsValBinds Name -> TcM [Id] +-- A hs-boot file has only one BindGroup, and it only has type +-- signatures in it. The renamer checked all this +tcHsBootSigs (ValBindsOut binds sigs) + = do { checkTc (null binds) badBootDeclErr + ; mapM (addLocM tc_boot_sig) (filter isVanillaLSig sigs) } + where + tc_boot_sig (TypeSig (L _ name) ty) + = do { sigma_ty <- tcHsSigType (FunSigCtxt name) ty + ; return (mkVanillaGlobal name sigma_ty vanillaIdInfo) } + -- Notice that we make GlobalIds, not LocalIds +tcHsBootSigs groups = pprPanic "tcHsBootSigs" (ppr groups) + +badBootDeclErr :: Message +badBootDeclErr = ptext SLIT("Illegal declarations in an hs-boot file") + +------------------------ +tcLocalBinds :: HsLocalBinds Name -> TcM thing + -> TcM (HsLocalBinds TcId, thing) + +tcLocalBinds EmptyLocalBinds thing_inside + = do { thing <- thing_inside + ; return (EmptyLocalBinds, thing) } + +tcLocalBinds (HsValBinds binds) thing_inside + = do { (binds', thing) <- tcValBinds NotTopLevel binds thing_inside + ; return (HsValBinds binds', thing) } + +tcLocalBinds (HsIPBinds (IPBinds ip_binds _)) thing_inside + = do { (thing, lie) <- getLIE thing_inside + ; (avail_ips, ip_binds') <- mapAndUnzipM (wrapLocSndM tc_ip_bind) ip_binds + + -- If the binding binds ?x = E, we must now + -- discharge any ?x constraints in expr_lie + ; dict_binds <- tcSimplifyIPs avail_ips lie + ; return (HsIPBinds (IPBinds ip_binds' dict_binds), thing) } + where + -- I wonder if we should do these one at at time + -- Consider ?x = 4 + -- ?y = ?x + 1 + tc_ip_bind (IPBind ip expr) + = newFlexiTyVarTy argTypeKind `thenM` \ ty -> + newIPDict (IPBindOrigin ip) ip ty `thenM` \ (ip', ip_inst) -> + tcMonoExpr expr ty `thenM` \ expr' -> + returnM (ip_inst, (IPBind ip' expr')) + +------------------------ +tcValBinds :: TopLevelFlag + -> HsValBinds Name -> TcM thing + -> TcM (HsValBinds TcId, thing) + +tcValBinds top_lvl (ValBindsIn binds sigs) thing_inside + = pprPanic "tcValBinds" (ppr binds) + +tcValBinds top_lvl (ValBindsOut binds sigs) thing_inside + = do { -- Typecheck the signature + ; let { prag_fn = mkPragFun sigs + ; ty_sigs = filter isVanillaLSig sigs + ; sig_fn = mkSigFun ty_sigs } + + ; poly_ids <- mapM tcTySig ty_sigs + + -- Extend the envt right away with all + -- the Ids declared with type signatures + ; (binds', thing) <- tcExtendIdEnv poly_ids $ + tc_val_binds top_lvl sig_fn prag_fn + binds thing_inside + + ; return (ValBindsOut binds' sigs, thing) } + +------------------------ +tc_val_binds :: TopLevelFlag -> TcSigFun -> TcPragFun + -> [(RecFlag, LHsBinds Name)] -> TcM thing + -> TcM ([(RecFlag, LHsBinds TcId)], thing) +-- Typecheck a whole lot of value bindings, +-- one strongly-connected component at a time + +tc_val_binds top_lvl sig_fn prag_fn [] thing_inside + = do { thing <- thing_inside + ; return ([], thing) } + +tc_val_binds top_lvl sig_fn prag_fn (group : groups) thing_inside + = do { (group', (groups', thing)) + 
<- tc_group top_lvl sig_fn prag_fn group $ + tc_val_binds top_lvl sig_fn prag_fn groups thing_inside + ; return (group' ++ groups', thing) } + +------------------------ +tc_group :: TopLevelFlag -> TcSigFun -> TcPragFun + -> (RecFlag, LHsBinds Name) -> TcM thing + -> TcM ([(RecFlag, LHsBinds TcId)], thing) + +-- Typecheck one strongly-connected component of the original program. +-- We get a list of groups back, because there may +-- be specialisations etc as well + +tc_group top_lvl sig_fn prag_fn (NonRecursive, binds) thing_inside + = -- A single non-recursive binding + -- We want to keep non-recursive things non-recursive + -- so that we desugar unlifted bindings correctly + do { (binds, thing) <- tcPolyBinds top_lvl NonRecursive NonRecursive + sig_fn prag_fn binds thing_inside + ; return ([(NonRecursive, b) | b <- binds], thing) } + +tc_group top_lvl sig_fn prag_fn (Recursive, binds) thing_inside + = -- A recursive strongly-connected component + -- To maximise polymorphism (with -fglasgow-exts), we do a new + -- strongly-connected-component analysis, this time omitting + -- any references to variables with type signatures. + -- + -- Then we bring into scope all the variables with type signatures + do { traceTc (text "tc_group rec" <+> pprLHsBinds binds) + ; gla_exts <- doptM Opt_GlasgowExts + ; (binds,thing) <- if gla_exts + then go new_sccs + else tc_binds Recursive binds thing_inside + ; return ([(Recursive, unionManyBags binds)], thing) } + -- Rec them all together + where + new_sccs :: [SCC (LHsBind Name)] + new_sccs = stronglyConnComp (mkEdges sig_fn binds) + +-- go :: SCC (LHsBind Name) -> TcM ([LHsBind TcId], thing) + go (scc:sccs) = do { (binds1, (binds2, thing)) <- go1 scc (go sccs) + ; return (binds1 ++ binds2, thing) } + go [] = do { thing <- thing_inside; return ([], thing) } + + go1 (AcyclicSCC bind) = tc_binds NonRecursive (unitBag bind) + go1 (CyclicSCC binds) = tc_binds Recursive (listToBag binds) + + tc_binds rec_tc binds = tcPolyBinds top_lvl Recursive rec_tc sig_fn prag_fn binds + +------------------------ +mkEdges :: TcSigFun -> LHsBinds Name + -> [(LHsBind Name, BKey, [BKey])] + +type BKey = Int -- Just number off the bindings + +mkEdges sig_fn binds + = [ (bind, key, [key | n <- nameSetToList (bind_fvs (unLoc bind)), + Just key <- [lookupNameEnv key_map n], no_sig n ]) + | (bind, key) <- keyd_binds + ] + where + no_sig :: Name -> Bool + no_sig n = isNothing (sig_fn n) + + keyd_binds = bagToList binds `zip` [0::BKey ..] + + key_map :: NameEnv BKey -- Which binding it comes from + key_map = mkNameEnv [(bndr, key) | (L _ bind, key) <- keyd_binds + , bndr <- bindersOfHsBind bind ] + +bindersOfHsBind :: HsBind Name -> [Name] +bindersOfHsBind (PatBind { pat_lhs = pat }) = collectPatBinders pat +bindersOfHsBind (FunBind { fun_id = L _ f }) = [f] + +------------------------ +tcPolyBinds :: TopLevelFlag + -> RecFlag -- Whether the group is really recursive + -> RecFlag -- Whether it's recursive for typechecking purposes + -> TcSigFun -> TcPragFun + -> LHsBinds Name + -> TcM thing + -> TcM ([LHsBinds TcId], thing) + +-- Typechecks a single bunch of bindings all together, +-- and generalises them. The bunch may be only part of a recursive +-- group, because we use type signatures to maximise polymorphism +-- +-- Deals with the bindInstsOfLocalFuns thing too +-- +-- Returns a list because the input may be a single non-recursive binding, +-- in which case the dependency order of the resulting bindings is +-- important. 
+ +tcPolyBinds top_lvl rec_group rec_tc sig_fn prag_fn scc thing_inside + = -- NB: polymorphic recursion means that a function + -- may use an instance of itself, we must look at the LIE arising + -- from the function's own right hand side. Hence the getLIE + -- encloses the tc_poly_binds. + do { traceTc (text "tcPolyBinds" <+> ppr scc) + ; ((binds1, poly_ids, thing), lie) <- getLIE $ + do { (binds1, poly_ids) <- tc_poly_binds top_lvl rec_group rec_tc + sig_fn prag_fn scc + ; thing <- tcExtendIdEnv poly_ids thing_inside + ; return (binds1, poly_ids, thing) } + + ; if isTopLevel top_lvl + then -- For the top level don't bother will all this + -- bindInstsOfLocalFuns stuff. All the top level + -- things are rec'd together anyway, so it's fine to + -- leave them to the tcSimplifyTop, + -- and quite a bit faster too + do { extendLIEs lie; return (binds1, thing) } + + else do -- Nested case + { lie_binds <- bindInstsOfLocalFuns lie poly_ids + ; return (binds1 ++ [lie_binds], thing) }} + +------------------------ +tc_poly_binds :: TopLevelFlag -- See comments on tcPolyBinds + -> RecFlag -> RecFlag + -> TcSigFun -> TcPragFun + -> LHsBinds Name + -> TcM ([LHsBinds TcId], [TcId]) +-- Typechecks the bindings themselves +-- Knows nothing about the scope of the bindings + +tc_poly_binds top_lvl rec_group rec_tc sig_fn prag_fn binds + = let + binder_names = collectHsBindBinders binds + bind_list = bagToList binds + + loc = getLoc (head bind_list) + -- TODO: location a bit awkward, but the mbinds have been + -- dependency analysed and may no longer be adjacent + in + -- SET UP THE MAIN RECOVERY; take advantage of any type sigs + setSrcSpan loc $ + recoverM (recoveryCode binder_names) $ do + + { traceTc (ptext SLIT("------------------------------------------------")) + ; traceTc (ptext SLIT("Bindings for") <+> ppr binder_names) + + -- TYPECHECK THE BINDINGS + ; ((binds', mono_bind_infos), lie_req) + <- getLIE (tcMonoBinds bind_list sig_fn rec_tc) + + -- CHECK FOR UNLIFTED BINDINGS + -- These must be non-recursive etc, and are not generalised + -- They desugar to a case expression in the end + ; zonked_mono_tys <- zonkTcTypes (map getMonoType mono_bind_infos) + ; is_strict <- checkStrictBinds top_lvl rec_group binds' + zonked_mono_tys mono_bind_infos + ; if is_strict then + do { extendLIEs lie_req + ; let exports = zipWith mk_export mono_bind_infos zonked_mono_tys + mk_export (name, Nothing, mono_id) mono_ty = ([], mkLocalId name mono_ty, mono_id, []) + mk_export (name, Just sig, mono_id) mono_ty = ([], sig_id sig, mono_id, []) + -- ToDo: prags for unlifted bindings + + ; return ( [unitBag $ L loc $ AbsBinds [] [] exports binds'], + [poly_id | (_, poly_id, _, _) <- exports]) } -- Guaranteed zonked + + else do -- The normal lifted case: GENERALISE + { is_unres <- isUnRestrictedGroup bind_list sig_fn + ; (tyvars_to_gen, dict_binds, dict_ids) + <- addErrCtxt (genCtxt (bndrNames mono_bind_infos)) $ + generalise top_lvl is_unres mono_bind_infos lie_req + + -- FINALISE THE QUANTIFIED TYPE VARIABLES + -- The quantified type variables often include meta type variables + -- we want to freeze them into ordinary type variables, and + -- default their kind (e.g. 
from OpenTypeKind to TypeKind) + ; tyvars_to_gen' <- mappM zonkQuantifiedTyVar tyvars_to_gen + + -- BUILD THE POLYMORPHIC RESULT IDs + ; exports <- mapM (mkExport prag_fn tyvars_to_gen' (map idType dict_ids)) + mono_bind_infos + + -- ZONK THE poly_ids, because they are used to extend the type + -- environment; see the invariant on TcEnv.tcExtendIdEnv + ; let poly_ids = [poly_id | (_, poly_id, _, _) <- exports] + ; zonked_poly_ids <- mappM zonkId poly_ids + + ; traceTc (text "binding:" <+> ppr (zonked_poly_ids `zip` map idType zonked_poly_ids)) + + ; let abs_bind = L loc $ AbsBinds tyvars_to_gen' + dict_ids exports + (dict_binds `unionBags` binds') + + ; return ([unitBag abs_bind], zonked_poly_ids) + } } + + +-------------- +mkExport :: TcPragFun -> [TyVar] -> [TcType] -> MonoBindInfo + -> TcM ([TyVar], Id, Id, [Prag]) +mkExport prag_fn inferred_tvs dict_tys (poly_name, mb_sig, mono_id) + = case mb_sig of + Nothing -> do { prags <- tcPrags poly_id (prag_fn poly_name) + ; return (inferred_tvs, poly_id, mono_id, prags) } + where + poly_id = mkLocalId poly_name poly_ty + poly_ty = mkForAllTys inferred_tvs + $ mkFunTys dict_tys + $ idType mono_id + + Just sig -> do { let poly_id = sig_id sig + ; prags <- tcPrags poly_id (prag_fn poly_name) + ; sig_tys <- zonkTcTyVars (sig_tvs sig) + ; let sig_tvs' = map (tcGetTyVar "mkExport") sig_tys + ; return (sig_tvs', poly_id, mono_id, prags) } + -- We zonk the sig_tvs here so that the export triple + -- always has zonked type variables; + -- a convenient invariant + + +------------------------ +type TcPragFun = Name -> [LSig Name] + +mkPragFun :: [LSig Name] -> TcPragFun +mkPragFun sigs = \n -> lookupNameEnv env n `orElse` [] + where + prs = [(expectJust "mkPragFun" (sigName sig), sig) + | sig <- sigs, isPragLSig sig] + env = foldl add emptyNameEnv prs + add env (n,p) = extendNameEnv_Acc (:) singleton env n p + +tcPrags :: Id -> [LSig Name] -> TcM [Prag] +tcPrags poly_id prags = mapM tc_prag prags + where + tc_prag (L loc prag) = setSrcSpan loc $ + addErrCtxt (pragSigCtxt prag) $ + tcPrag poly_id prag + +pragSigCtxt prag = hang (ptext SLIT("In the pragma")) 2 (ppr prag) + +tcPrag :: TcId -> Sig Name -> TcM Prag +tcPrag poly_id (SpecSig orig_name hs_ty inl) = tcSpecPrag poly_id hs_ty inl +tcPrag poly_id (SpecInstSig hs_ty) = tcSpecPrag poly_id hs_ty defaultInlineSpec +tcPrag poly_id (InlineSig v inl) = return (InlinePrag inl) + + +tcSpecPrag :: TcId -> LHsType Name -> InlineSpec -> TcM Prag +tcSpecPrag poly_id hs_ty inl + = do { spec_ty <- tcHsSigType (FunSigCtxt (idName poly_id)) hs_ty + ; (co_fn, lie) <- getLIE (tcSubExp (idType poly_id) spec_ty) + ; extendLIEs lie + ; let const_dicts = map instToId lie + ; return (SpecPrag (mkHsCoerce co_fn (HsVar poly_id)) spec_ty const_dicts inl) } + +-------------- +-- If typechecking the binds fails, then return with each +-- signature-less binder given type (forall a.a), to minimise +-- subsequent error messages +recoveryCode binder_names + = do { traceTc (text "tcBindsWithSigs: error recovery" <+> ppr binder_names) + ; poly_ids <- mapM mk_dummy binder_names + ; return ([], poly_ids) } + where + mk_dummy name = do { mb_id <- tcLookupLocalId_maybe name + ; case mb_id of + Just id -> return id -- Had signature, was in envt + Nothing -> return (mkLocalId name forall_a_a) } -- No signature + +forall_a_a :: TcType +forall_a_a = mkForAllTy alphaTyVar (mkTyVarTy alphaTyVar) + + +-- Check that non-overloaded unlifted bindings are +-- a) non-recursive, +-- b) not top level, +-- c) not a multiple-binding group (more or less 
implied by (a)) + +checkStrictBinds :: TopLevelFlag -> RecFlag + -> LHsBinds TcId -> [TcType] -> [MonoBindInfo] + -> TcM Bool +checkStrictBinds top_lvl rec_group mbind mono_tys infos + | unlifted || bang_pat + = do { checkTc (isNotTopLevel top_lvl) + (strictBindErr "Top-level" unlifted mbind) + ; checkTc (isNonRec rec_group) + (strictBindErr "Recursive" unlifted mbind) + ; checkTc (isSingletonBag mbind) + (strictBindErr "Multiple" unlifted mbind) + ; mapM_ check_sig infos + ; return True } + | otherwise + = return False + where + unlifted = any isUnLiftedType mono_tys + bang_pat = anyBag (isBangHsBind . unLoc) mbind + check_sig (_, Just sig, _) = checkTc (null (sig_tvs sig) && null (sig_theta sig)) + (badStrictSig unlifted sig) + check_sig other = return () + +strictBindErr flavour unlifted mbind + = hang (text flavour <+> msg <+> ptext SLIT("aren't allowed:")) 4 (ppr mbind) + where + msg | unlifted = ptext SLIT("bindings for unlifted types") + | otherwise = ptext SLIT("bang-pattern bindings") + +badStrictSig unlifted sig + = hang (ptext SLIT("Illegal polymorphic signature in") <+> msg) + 4 (ppr sig) + where + msg | unlifted = ptext SLIT("an unlifted binding") + | otherwise = ptext SLIT("a bang-pattern binding") +\end{code} + + +%************************************************************************ +%* * +\subsection{tcMonoBind} +%* * +%************************************************************************ + +@tcMonoBinds@ deals with a perhaps-recursive group of HsBinds. +The signatures have been dealt with already. + +\begin{code} +tcMonoBinds :: [LHsBind Name] + -> TcSigFun + -> RecFlag -- Whether the binding is recursive for typechecking purposes + -- i.e. the binders are mentioned in their RHSs, and + -- we are not resuced by a type signature + -> TcM (LHsBinds TcId, [MonoBindInfo]) + +tcMonoBinds [L b_loc (FunBind { fun_id = L nm_loc name, fun_infix = inf, + fun_matches = matches, bind_fvs = fvs })] + sig_fn -- Single function binding, + NonRecursive -- binder isn't mentioned in RHS, + | Nothing <- sig_fn name -- ...with no type signature + = -- In this very special case we infer the type of the + -- right hand side first (it may have a higher-rank type) + -- and *then* make the monomorphic Id for the LHS + -- e.g. f = \(x::forall a. a->a) -> <body> + -- We want to infer a higher-rank type for f + setSrcSpan b_loc $ + do { ((co_fn, matches'), rhs_ty) <- tcInfer (tcMatchesFun name matches) + + -- Check for an unboxed tuple type + -- f = (# True, False #) + -- Zonk first just in case it's hidden inside a meta type variable + -- (This shows up as a (more obscure) kind error + -- in the 'otherwise' case of tcMonoBinds.) 
+ ; zonked_rhs_ty <- zonkTcType rhs_ty + ; checkTc (not (isUnboxedTupleType zonked_rhs_ty)) + (unboxedTupleErr name zonked_rhs_ty) + + ; mono_name <- newLocalName name + ; let mono_id = mkLocalId mono_name zonked_rhs_ty + ; return (unitBag (L b_loc (FunBind { fun_id = L nm_loc mono_id, fun_infix = inf, + fun_matches = matches', bind_fvs = fvs, + fun_co_fn = co_fn })), + [(name, Nothing, mono_id)]) } + +tcMonoBinds [L b_loc (FunBind { fun_id = L nm_loc name, fun_infix = inf, + fun_matches = matches, bind_fvs = fvs })] + sig_fn -- Single function binding + non_rec + | Just sig <- sig_fn name -- ...with a type signature + = -- When we have a single function binding, with a type signature + -- we can (a) use genuine, rigid skolem constants for the type variables + -- (b) bring (rigid) scoped type variables into scope + setSrcSpan b_loc $ + do { tc_sig <- tcInstSig True sig + ; mono_name <- newLocalName name + ; let mono_ty = sig_tau tc_sig + mono_id = mkLocalId mono_name mono_ty + rhs_tvs = [ (name, mkTyVarTy tv) + | (name, tv) <- sig_scoped tc_sig `zip` sig_tvs tc_sig ] + + ; (co_fn, matches') <- tcExtendTyVarEnv2 rhs_tvs $ + tcMatchesFun mono_name matches mono_ty + + ; let fun_bind' = FunBind { fun_id = L nm_loc mono_id, + fun_infix = inf, fun_matches = matches', + bind_fvs = placeHolderNames, fun_co_fn = co_fn } + ; return (unitBag (L b_loc fun_bind'), + [(name, Just tc_sig, mono_id)]) } + +tcMonoBinds binds sig_fn non_rec + = do { tc_binds <- mapM (wrapLocM (tcLhs sig_fn)) binds + + -- Bring the monomorphic Ids, into scope for the RHSs + ; let mono_info = getMonoBindInfo tc_binds + rhs_id_env = [(name,mono_id) | (name, Nothing, mono_id) <- mono_info] + -- A monomorphic binding for each term variable that lacks + -- a type sig. (Ones with a sig are already in scope.) + + ; binds' <- tcExtendIdEnv2 rhs_id_env $ + traceTc (text "tcMonoBinds" <+> vcat [ ppr n <+> ppr id <+> ppr (idType id) + | (n,id) <- rhs_id_env]) `thenM_` + mapM (wrapLocM tcRhs) tc_binds + ; return (listToBag binds', mono_info) } + +------------------------ +-- tcLhs typechecks the LHS of the bindings, to construct the environment in which +-- we typecheck the RHSs. Basically what we are doing is this: for each binder: +-- if there's a signature for it, use the instantiated signature type +-- otherwise invent a type variable +-- You see that quite directly in the FunBind case. +-- +-- But there's a complication for pattern bindings: +-- data T = MkT (forall a. a->a) +-- MkT f = e +-- Here we can guess a type variable for the entire LHS (which will be refined to T) +-- but we want to get (f::forall a. a->a) as the RHS environment. +-- The simplest way to do this is to typecheck the pattern, and then look up the +-- bound mono-ids. 
Then we want to retain the typechecked pattern to avoid re-doing +-- it; hence the TcMonoBind data type in which the LHS is done but the RHS isn't + +data TcMonoBind -- Half completed; LHS done, RHS not done + = TcFunBind MonoBindInfo (Located TcId) Bool (MatchGroup Name) + | TcPatBind [MonoBindInfo] (LPat TcId) (GRHSs Name) TcSigmaType + +type MonoBindInfo = (Name, Maybe TcSigInfo, TcId) + -- Type signature (if any), and + -- the monomorphic bound things + +bndrNames :: [MonoBindInfo] -> [Name] +bndrNames mbi = [n | (n,_,_) <- mbi] + +getMonoType :: MonoBindInfo -> TcTauType +getMonoType (_,_,mono_id) = idType mono_id + +tcLhs :: TcSigFun -> HsBind Name -> TcM TcMonoBind +tcLhs sig_fn (FunBind { fun_id = L nm_loc name, fun_infix = inf, fun_matches = matches }) + = do { mb_sig <- tcInstSig_maybe (sig_fn name) + ; mono_name <- newLocalName name + ; mono_ty <- mk_mono_ty mb_sig + ; let mono_id = mkLocalId mono_name mono_ty + ; return (TcFunBind (name, mb_sig, mono_id) (L nm_loc mono_id) inf matches) } + where + mk_mono_ty (Just sig) = return (sig_tau sig) + mk_mono_ty Nothing = newFlexiTyVarTy argTypeKind + +tcLhs sig_fn bind@(PatBind { pat_lhs = pat, pat_rhs = grhss }) + = do { mb_sigs <- mapM (tcInstSig_maybe . sig_fn) names + + ; let nm_sig_prs = names `zip` mb_sigs + tau_sig_env = mkNameEnv [ (name, sig_tau sig) | (name, Just sig) <- nm_sig_prs] + sig_tau_fn = lookupNameEnv tau_sig_env + + tc_pat exp_ty = tcPat (LetPat sig_tau_fn) pat exp_ty unitTy $ \ _ -> + mapM lookup_info nm_sig_prs + -- The unitTy is a bit bogus; it's the "result type" for lookup_info. + + -- After typechecking the pattern, look up the binder + -- names, which the pattern has brought into scope. + lookup_info :: (Name, Maybe TcSigInfo) -> TcM MonoBindInfo + lookup_info (name, mb_sig) = do { mono_id <- tcLookupId name + ; return (name, mb_sig, mono_id) } + + ; ((pat', infos), pat_ty) <- addErrCtxt (patMonoBindsCtxt pat grhss) $ + tcInfer tc_pat + + ; return (TcPatBind infos pat' grhss pat_ty) } + where + names = collectPatBinders pat + + +tcLhs sig_fn other_bind = pprPanic "tcLhs" (ppr other_bind) + -- AbsBind, VarBind impossible + +------------------- +tcRhs :: TcMonoBind -> TcM (HsBind TcId) +tcRhs (TcFunBind info fun'@(L _ mono_id) inf matches) + = do { (co_fn, matches') <- tcMatchesFun (idName mono_id) matches + (idType mono_id) + ; return (FunBind { fun_id = fun', fun_infix = inf, fun_matches = matches', + bind_fvs = placeHolderNames, fun_co_fn = co_fn }) } + +tcRhs bind@(TcPatBind _ pat' grhss pat_ty) + = do { grhss' <- addErrCtxt (patMonoBindsCtxt pat' grhss) $ + tcGRHSsPat grhss pat_ty + ; return (PatBind { pat_lhs = pat', pat_rhs = grhss', pat_rhs_ty = pat_ty, + bind_fvs = placeHolderNames }) } + + +--------------------- +getMonoBindInfo :: [Located TcMonoBind] -> [MonoBindInfo] +getMonoBindInfo tc_binds + = foldr (get_info . 
unLoc) [] tc_binds + where + get_info (TcFunBind info _ _ _) rest = info : rest + get_info (TcPatBind infos _ _ _) rest = infos ++ rest +\end{code} + + +%************************************************************************ +%* * + Generalisation +%* * +%************************************************************************ + +\begin{code} +generalise :: TopLevelFlag -> Bool + -> [MonoBindInfo] -> [Inst] + -> TcM ([TcTyVar], TcDictBinds, [TcId]) +generalise top_lvl is_unrestricted mono_infos lie_req + | not is_unrestricted -- RESTRICTED CASE + = -- Check signature contexts are empty + do { checkTc (all is_mono_sig sigs) + (restrictedBindCtxtErr bndrs) + + -- Now simplify with exactly that set of tyvars + -- We have to squash those Methods + ; (qtvs, binds) <- tcSimplifyRestricted doc top_lvl bndrs + tau_tvs lie_req + + -- Check that signature type variables are OK + ; final_qtvs <- checkSigsTyVars qtvs sigs + + ; return (final_qtvs, binds, []) } + + | null sigs -- UNRESTRICTED CASE, NO TYPE SIGS + = tcSimplifyInfer doc tau_tvs lie_req + + | otherwise -- UNRESTRICTED CASE, WITH TYPE SIGS + = do { sig_lie <- unifyCtxts sigs -- sigs is non-empty + ; let -- The "sig_avails" is the stuff available. We get that from + -- the context of the type signature, BUT ALSO the lie_avail + -- so that polymorphic recursion works right (see Note [Polymorphic recursion]) + local_meths = [mkMethInst sig mono_id | (_, Just sig, mono_id) <- mono_infos] + sig_avails = sig_lie ++ local_meths + + -- Check that the needed dicts can be + -- expressed in terms of the signature ones + ; (forall_tvs, dict_binds) <- tcSimplifyInferCheck doc tau_tvs sig_avails lie_req + + -- Check that signature type variables are OK + ; final_qtvs <- checkSigsTyVars forall_tvs sigs + + ; returnM (final_qtvs, dict_binds, map instToId sig_lie) } + where + bndrs = bndrNames mono_infos + sigs = [sig | (_, Just sig, _) <- mono_infos] + tau_tvs = foldr (unionVarSet . exactTyVarsOfType . getMonoType) emptyVarSet mono_infos + -- NB: exactTyVarsOfType; see Note [Silly type synonym] + -- near defn of TcType.exactTyVarsOfType + is_mono_sig sig = null (sig_theta sig) + doc = ptext SLIT("type signature(s) for") <+> pprBinders bndrs + + mkMethInst (TcSigInfo { sig_id = poly_id, sig_tvs = tvs, + sig_theta = theta, sig_loc = loc }) mono_id + = Method mono_id poly_id (mkTyVarTys tvs) theta loc +\end{code} + +unifyCtxts checks that all the signature contexts are the same +The type signatures on a mutually-recursive group of definitions +must all have the same context (or none). + +The trick here is that all the signatures should have the same +context, and we want to share type variables for that context, so that +all the right hand sides agree a common vocabulary for their type +constraints + +We unify them because, with polymorphic recursion, their types +might not otherwise be related. This is a rather subtle issue. + +\begin{code} +unifyCtxts :: [TcSigInfo] -> TcM [Inst] +unifyCtxts (sig1 : sigs) -- Argument is always non-empty + = do { mapM unify_ctxt sigs + ; newDictsAtLoc (sig_loc sig1) (sig_theta sig1) } + where + theta1 = sig_theta sig1 + unify_ctxt :: TcSigInfo -> TcM () + unify_ctxt sig@(TcSigInfo { sig_theta = theta }) + = setSrcSpan (instLocSrcSpan (sig_loc sig)) $ + addErrCtxt (sigContextsCtxt sig1 sig) $ + unifyTheta theta1 theta + +checkSigsTyVars :: [TcTyVar] -> [TcSigInfo] -> TcM [TcTyVar] +checkSigsTyVars qtvs sigs + = do { gbl_tvs <- tcGetGlobalTyVars + ; sig_tvs_s <- mappM (check_sig gbl_tvs) sigs + + ; let -- Sigh. 
Make sure that all the tyvars in the type sigs + -- appear in the returned ty var list, which is what we are + -- going to generalise over. Reason: we occasionally get + -- silly types like + -- type T a = () -> () + -- f :: T a + -- f () = () + -- Here, 'a' won't appear in qtvs, so we have to add it + sig_tvs = foldl extendVarSetList emptyVarSet sig_tvs_s + all_tvs = varSetElems (extendVarSetList sig_tvs qtvs) + ; returnM all_tvs } + where + check_sig gbl_tvs (TcSigInfo {sig_id = id, sig_tvs = tvs, + sig_theta = theta, sig_tau = tau}) + = addErrCtxt (ptext SLIT("In the type signature for") <+> quotes (ppr id)) $ + addErrCtxtM (sigCtxt id tvs theta tau) $ + do { tvs' <- checkDistinctTyVars tvs + ; ifM (any (`elemVarSet` gbl_tvs) tvs') + (bleatEscapedTvs gbl_tvs tvs tvs') + ; return tvs' } + +checkDistinctTyVars :: [TcTyVar] -> TcM [TcTyVar] +-- (checkDistinctTyVars tvs) checks that the tvs from one type signature +-- are still all type variables, and all distinct from each other. +-- It returns a zonked set of type variables. +-- For example, if the type sig is +-- f :: forall a b. a -> b -> b +-- we want to check that 'a' and 'b' haven't +-- (a) been unified with a non-tyvar type +-- (b) been unified with each other (all distinct) + +checkDistinctTyVars sig_tvs + = do { zonked_tvs <- mapM zonkSigTyVar sig_tvs + ; foldlM check_dup emptyVarEnv (sig_tvs `zip` zonked_tvs) + ; return zonked_tvs } + where + check_dup :: TyVarEnv TcTyVar -> (TcTyVar, TcTyVar) -> TcM (TyVarEnv TcTyVar) + -- The TyVarEnv maps each zonked type variable back to its + -- corresponding user-written signature type variable + check_dup acc (sig_tv, zonked_tv) + = case lookupVarEnv acc zonked_tv of + Just sig_tv' -> bomb_out sig_tv sig_tv' + + Nothing -> return (extendVarEnv acc zonked_tv sig_tv) + + bomb_out sig_tv1 sig_tv2 + = do { env0 <- tcInitTidyEnv + ; let (env1, tidy_tv1) = tidyOpenTyVar env0 sig_tv1 + (env2, tidy_tv2) = tidyOpenTyVar env1 sig_tv2 + msg = ptext SLIT("Quantified type variable") <+> quotes (ppr tidy_tv1) + <+> ptext SLIT("is unified with another quantified type variable") + <+> quotes (ppr tidy_tv2) + ; failWithTcM (env2, msg) } + where +\end{code} + + +@getTyVarsToGen@ decides what type variables to generalise over. + +For a "restricted group" -- see the monomorphism restriction +for a definition -- we bind no dictionaries, and +remove from tyvars_to_gen any constrained type variables + +*Don't* simplify dicts at this point, because we aren't going +to generalise over these dicts. By the time we do simplify them +we may well know more. For example (this actually came up) + f :: Array Int Int + f x = array ... xs where xs = [1,2,3,4,5] +We don't want to generate lots of (fromInt Int 1), (fromInt Int 2) +stuff. If we simplify only at the f-binding (not the xs-binding) +we'll know that the literals are all Ints, and we can just produce +Int literals! + +Find all the type variables involved in overloading, the +"constrained_tyvars". These are the ones we *aren't* going to +generalise. We must be careful about doing this: + + (a) If we fail to generalise a tyvar which is not actually + constrained, then it will never, ever get bound, and lands + up printed out in interface files! Notorious example: + instance Eq a => Eq (Foo a b) where .. + Here, b is not constrained, even though it looks as if it is. + Another, more common, example is when there's a Method inst in + the LIE, whose type might very well involve non-overloaded + type variables. 
+ [NOTE: Jan 2001: I don't understand the problem here so I'm doing + the simple thing instead] + + (b) On the other hand, we mustn't generalise tyvars which are constrained, + because we are going to pass on out the unmodified LIE, with those + tyvars in it. They won't be in scope if we've generalised them. + +So we are careful, and do a complete simplification just to find the +constrained tyvars. We don't use any of the results, except to +find which tyvars are constrained. + +Note [Polymorphic recursion] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +The game plan for polymorphic recursion in the code above is + + * Bind any variable for which we have a type signature + to an Id with a polymorphic type. Then when type-checking + the RHSs we'll make a full polymorphic call. + +This fine, but if you aren't a bit careful you end up with a horrendous +amount of partial application and (worse) a huge space leak. For example: + + f :: Eq a => [a] -> [a] + f xs = ...f... + +If we don't take care, after typechecking we get + + f = /\a -> \d::Eq a -> let f' = f a d + in + \ys:[a] -> ...f'... + +Notice the the stupid construction of (f a d), which is of course +identical to the function we're executing. In this case, the +polymorphic recursion isn't being used (but that's a very common case). +This can lead to a massive space leak, from the following top-level defn +(post-typechecking) + + ff :: [Int] -> [Int] + ff = f Int dEqInt + +Now (f dEqInt) evaluates to a lambda that has f' as a free variable; but +f' is another thunk which evaluates to the same thing... and you end +up with a chain of identical values all hung onto by the CAF ff. + + ff = f Int dEqInt + + = let f' = f Int dEqInt in \ys. ...f'... + + = let f' = let f' = f Int dEqInt in \ys. ...f'... + in \ys. ...f'... + +Etc. + +NOTE: a bit of arity anaysis would push the (f a d) inside the (\ys...), +which would make the space leak go away in this case + +Solution: when typechecking the RHSs we always have in hand the +*monomorphic* Ids for each binding. So we just need to make sure that +if (Method f a d) shows up in the constraints emerging from (...f...) +we just use the monomorphic Id. We achieve this by adding monomorphic Ids +to the "givens" when simplifying constraints. That's what the "lies_avail" +is doing. + +Then we get + + f = /\a -> \d::Eq a -> letrec + fm = \ys:[a] -> ...fm... + in + fm + + + +%************************************************************************ +%* * + Signatures +%* * +%************************************************************************ + +Type signatures are tricky. See Note [Signature skolems] in TcType + +@tcSigs@ checks the signatures for validity, and returns a list of +{\em freshly-instantiated} signatures. That is, the types are already +split up, and have fresh type variables installed. All non-type-signature +"RenamedSigs" are ignored. + +The @TcSigInfo@ contains @TcTypes@ because they are unified with +the variable's type, and after that checked to see whether they've +been instantiated. + +\begin{code} +type TcSigFun = Name -> Maybe (LSig Name) + +mkSigFun :: [LSig Name] -> TcSigFun +-- Search for a particular type signature +-- Precondition: the sigs are all type sigs +-- Precondition: no duplicates +mkSigFun sigs = lookupNameEnv env + where + env = mkNameEnv [(expectJust "mkSigFun" (sigName sig), sig) | sig <- sigs] + +--------------- +data TcSigInfo + = TcSigInfo { + sig_id :: TcId, -- *Polymorphic* binder for this value... 
+ + sig_scoped :: [Name], -- Names for any scoped type variables + -- Invariant: correspond 1-1 with an initial + -- segment of sig_tvs (see Note [Scoped]) + + sig_tvs :: [TcTyVar], -- Instantiated type variables + -- See Note [Instantiate sig] + + sig_theta :: TcThetaType, -- Instantiated theta + sig_tau :: TcTauType, -- Instantiated tau + sig_loc :: InstLoc -- The location of the signature + } + +-- Note [Scoped] +-- There may be more instantiated type variables than scoped +-- ones. For example: +-- type T a = forall b. b -> (a,b) +-- f :: forall c. T c +-- Here, the signature for f will have one scoped type variable, c, +-- but two instantiated type variables, c' and b'. +-- +-- We assume that the scoped ones are at the *front* of sig_tvs, +-- and remember the names from the original HsForAllTy in sig_scoped + +-- Note [Instantiate sig] +-- It's vital to instantiate a type signature with fresh variable. +-- For example: +-- type S = forall a. a->a +-- f,g :: S +-- f = ... +-- g = ... +-- Here, we must use distinct type variables when checking f,g's right hand sides. +-- (Instantiation is only necessary because of type synonyms. Otherwise, +-- it's all cool; each signature has distinct type variables from the renamer.) + +instance Outputable TcSigInfo where + ppr (TcSigInfo { sig_id = id, sig_tvs = tyvars, sig_theta = theta, sig_tau = tau}) + = ppr id <+> ptext SLIT("::") <+> ppr tyvars <+> ppr theta <+> ptext SLIT("=>") <+> ppr tau +\end{code} + +\begin{code} +tcTySig :: LSig Name -> TcM TcId +tcTySig (L span (TypeSig (L _ name) ty)) + = setSrcSpan span $ + do { sigma_ty <- tcHsSigType (FunSigCtxt name) ty + ; return (mkLocalId name sigma_ty) } + +------------------- +tcInstSig_maybe :: Maybe (LSig Name) -> TcM (Maybe TcSigInfo) +-- Instantiate with *meta* type variables; +-- this signature is part of a multi-signature group +tcInstSig_maybe Nothing = return Nothing +tcInstSig_maybe (Just sig) = do { tc_sig <- tcInstSig False sig + ; return (Just tc_sig) } + +tcInstSig :: Bool -> LSig Name -> TcM TcSigInfo +-- Instantiate the signature, with either skolems or meta-type variables +-- depending on the use_skols boolean +-- +-- We always instantiate with freshs uniques, +-- although we keep the same print-name +-- +-- type T = forall a. [a] -> [a] +-- f :: T; +-- f = g where { g :: T; g = <rhs> } +-- +-- We must not use the same 'a' from the defn of T at both places!! + +tcInstSig use_skols (L loc (TypeSig (L _ name) hs_ty)) + = setSrcSpan loc $ + do { poly_id <- tcLookupId name -- Cannot fail; the poly ids are put into + -- scope when starting the binding group + ; let skol_info = SigSkol (FunSigCtxt name) + inst_tyvars | use_skols = tcInstSkolTyVars skol_info + | otherwise = tcInstSigTyVars skol_info + ; (tvs, theta, tau) <- tcInstType inst_tyvars (idType poly_id) + ; loc <- getInstLoc (SigOrigin skol_info) + ; return (TcSigInfo { sig_id = poly_id, + sig_tvs = tvs, sig_theta = theta, sig_tau = tau, + sig_scoped = scoped_names, sig_loc = loc }) } + -- Note that the scoped_names and the sig_tvs will have + -- different Names. That's quite ok; when we bring the + -- scoped_names into scope, we just bind them to the sig_tvs + where + -- The scoped names are the ones explicitly mentioned + -- in the HsForAll. (There may be more in sigma_ty, because + -- of nested type synonyms. See Note [Scoped] with TcSigInfo.) 
+ -- We also only have scoped type variables when we are instantiating + -- with true skolems + scoped_names = case (use_skols, hs_ty) of + (True, L _ (HsForAllTy Explicit tvs _ _)) -> hsLTyVarNames tvs + other -> [] + +------------------- +isUnRestrictedGroup :: [LHsBind Name] -> TcSigFun -> TcM Bool +isUnRestrictedGroup binds sig_fn + = do { mono_restriction <- doptM Opt_MonomorphismRestriction + ; return (not mono_restriction || all_unrestricted) } + where + all_unrestricted = all (unrestricted . unLoc) binds + has_sig n = isJust (sig_fn n) + + unrestricted (PatBind {}) = False + unrestricted (VarBind { var_id = v }) = has_sig v + unrestricted (FunBind { fun_id = v, fun_matches = matches }) = unrestricted_match matches + || has_sig (unLoc v) + + unrestricted_match (MatchGroup (L _ (Match [] _ _) : _) _) = False + -- No args => like a pattern binding + unrestricted_match other = True + -- Some args => a function binding +\end{code} + + +%************************************************************************ +%* * +\subsection[TcBinds-errors]{Error contexts and messages} +%* * +%************************************************************************ + + +\begin{code} +-- This one is called on LHS, when pat and grhss are both Name +-- and on RHS, when pat is TcId and grhss is still Name +patMonoBindsCtxt pat grhss + = hang (ptext SLIT("In a pattern binding:")) 4 (pprPatBind pat grhss) + +----------------------------------------------- +sigContextsCtxt sig1 sig2 + = vcat [ptext SLIT("When matching the contexts of the signatures for"), + nest 2 (vcat [ppr id1 <+> dcolon <+> ppr (idType id1), + ppr id2 <+> dcolon <+> ppr (idType id2)]), + ptext SLIT("The signature contexts in a mutually recursive group should all be identical")] + where + id1 = sig_id sig1 + id2 = sig_id sig2 + + +----------------------------------------------- +unboxedTupleErr name ty + = hang (ptext SLIT("Illegal binding of unboxed tuple")) + 4 (ppr name <+> dcolon <+> ppr ty) + +----------------------------------------------- +restrictedBindCtxtErr binder_names + = hang (ptext SLIT("Illegal overloaded type signature(s)")) + 4 (vcat [ptext SLIT("in a binding group for") <+> pprBinders binder_names, + ptext SLIT("that falls under the monomorphism restriction")]) + +genCtxt binder_names + = ptext SLIT("When generalising the type(s) for") <+> pprBinders binder_names +\end{code} diff --git a/compiler/typecheck/TcClassDcl.lhs b/compiler/typecheck/TcClassDcl.lhs new file mode 100644 index 0000000000..14682a295d --- /dev/null +++ b/compiler/typecheck/TcClassDcl.lhs @@ -0,0 +1,790 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[TcClassDcl]{Typechecking class declarations} + +\begin{code} +module TcClassDcl ( tcClassSigs, tcClassDecl2, + getGenericInstances, + MethodSpec, tcMethodBind, mkMethodBind, + tcAddDeclCtxt, badMethodErr + ) where + +#include "HsVersions.h" + +import HsSyn +import RnHsSyn ( maybeGenericMatch, extractHsTyVars ) +import RnExpr ( rnLExpr ) +import RnEnv ( lookupTopBndrRn, lookupImportedName ) +import Inst ( instToId, newDicts, newDictsAtLoc, newMethod, getOverlapFlag ) +import InstEnv ( mkLocalInstance ) +import TcEnv ( tcLookupLocatedClass, + tcExtendTyVarEnv, tcExtendIdEnv, + InstInfo(..), pprInstInfoDetails, + simpleInstInfoTyCon, simpleInstInfoTy, + InstBindings(..), newDFunName + ) +import TcBinds ( TcPragFun, tcMonoBinds, tcPrags, mkPragFun, TcSigInfo(..) 
) +import TcHsType ( tcHsKindedType, tcHsSigType ) +import TcSimplify ( tcSimplifyCheck ) +import TcUnify ( checkSigTyVars, sigCtxt ) +import TcMType ( tcSkolSigTyVars ) +import TcType ( Type, SkolemInfo(ClsSkol, InstSkol), UserTypeCtxt( GenPatCtxt ), + TcType, TcThetaType, TcTyVar, mkTyVarTys, + mkClassPred, tcSplitSigmaTy, tcSplitFunTys, + tcIsTyVarTy, tcSplitTyConApp_maybe, tcSplitForAllTys, tcSplitPhiTy, + getClassPredTys_maybe, mkPhiTy, mkTyVarTy + ) +import TcRnMonad +import Generics ( mkGenericRhs, validGenericInstanceType ) +import PrelInfo ( nO_METHOD_BINDING_ERROR_ID ) +import Class ( classTyVars, classBigSig, + Class, ClassOpItem, DefMeth (..) ) +import TyCon ( TyCon, tyConName, tyConHasGenerics ) +import Type ( substTyWith ) +import MkId ( mkDefaultMethodId, mkDictFunId ) +import Id ( Id, idType, idName, mkUserLocal ) +import Name ( Name, NamedThing(..) ) +import NameEnv ( NameEnv, lookupNameEnv, mkNameEnv ) +import NameSet ( nameSetToList ) +import OccName ( reportIfUnused, mkDefaultMethodOcc ) +import RdrName ( RdrName, mkDerivedRdrName ) +import Outputable +import PrelNames ( genericTyConNames ) +import DynFlags +import ErrUtils ( dumpIfSet_dyn ) +import Util ( count, lengthIs, isSingleton, lengthExceeds ) +import Unique ( Uniquable(..) ) +import ListSetOps ( equivClassesByUniq, minusList ) +import SrcLoc ( Located(..), srcSpanStart, unLoc, noLoc ) +import Maybes ( seqMaybe, isJust, mapCatMaybes ) +import List ( partition ) +import BasicTypes ( RecFlag(..), Boxity(..) ) +import Bag +import FastString +\end{code} + + + +Dictionary handling +~~~~~~~~~~~~~~~~~~~ +Every class implicitly declares a new data type, corresponding to dictionaries +of that class. So, for example: + + class (D a) => C a where + op1 :: a -> a + op2 :: forall b. Ord b => a -> b -> b + +would implicitly declare + + data CDict a = CDict (D a) + (a -> a) + (forall b. Ord b => a -> b -> b) + +(We could use a record decl, but that means changing more of the existing apparatus. +One step at at time!) + +For classes with just one superclass+method, we use a newtype decl instead: + + class C a where + op :: forallb. a -> b -> b + +generates + + newtype CDict a = CDict (forall b. a -> b -> b) + +Now DictTy in Type is just a form of type synomym: + DictTy c t = TyConTy CDict `AppTy` t + +Death to "ExpandingDicts". 
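To make the dictionary view concrete, here is a small sketch in source Haskell; it follows the informal CDict convention used above, but the names opD and twiceD are invented so that the example is legal Haskell (in the compiler the selector simply reuses the name op), and nothing below comes from this patch. For the single-method class

    class C a where
      op :: a -> a

the implicit dictionary type is a newtype, and the selector just unwraps it:

    newtype CDict a = CDict (a -> a)

    opD :: CDict a -> a -> a
    opD (CDict f) = f

An overloaded function such as

    twice :: C a => a -> a
    twice x = op (op x)

then corresponds to one that takes the dictionary as an ordinary argument:

    twiceD :: CDict a -> a -> a
    twiceD d x = opD d (opD d x)

An instance declaration merely builds a suitable CDict value, and superclasses (the (D a) field in the multi-method example above) are just further components of the dictionary.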
+ + +%************************************************************************ +%* * + Type-checking the class op signatures +%* * +%************************************************************************ + +\begin{code} +tcClassSigs :: Name -- Name of the class + -> [LSig Name] + -> LHsBinds Name + -> TcM [TcMethInfo] + +type TcMethInfo = (Name, DefMeth, Type) -- A temporary intermediate, to communicate + -- between tcClassSigs and buildClass +tcClassSigs clas sigs def_methods + = do { dm_env <- checkDefaultBinds clas op_names def_methods + ; mappM (tcClassSig dm_env) op_sigs } + where + op_sigs = [sig | sig@(L _ (TypeSig _ _)) <- sigs] + op_names = [n | sig@(L _ (TypeSig (L _ n) _)) <- op_sigs] + + +checkDefaultBinds :: Name -> [Name] -> LHsBinds Name -> TcM (NameEnv Bool) + -- Check default bindings + -- a) must be for a class op for this class + -- b) must be all generic or all non-generic + -- and return a mapping from class-op to Bool + -- where True <=> it's a generic default method +checkDefaultBinds clas ops binds + = do dm_infos <- mapM (addLocM (checkDefaultBind clas ops)) (bagToList binds) + return (mkNameEnv dm_infos) + +checkDefaultBind clas ops (FunBind {fun_id = L _ op, fun_matches = MatchGroup matches _ }) + = do { -- Check that the op is from this class + checkTc (op `elem` ops) (badMethodErr clas op) + + -- Check that all the defns ar generic, or none are + ; checkTc (all_generic || none_generic) (mixedGenericErr op) + + ; returnM (op, all_generic) + } + where + n_generic = count (isJust . maybeGenericMatch) matches + none_generic = n_generic == 0 + all_generic = matches `lengthIs` n_generic + + +tcClassSig :: NameEnv Bool -- Info about default methods; + -> LSig Name + -> TcM TcMethInfo + +tcClassSig dm_env (L loc (TypeSig (L _ op_name) op_hs_ty)) + = setSrcSpan loc $ do + { op_ty <- tcHsKindedType op_hs_ty -- Class tyvars already in scope + ; let dm = case lookupNameEnv dm_env op_name of + Nothing -> NoDefMeth + Just False -> DefMeth + Just True -> GenDefMeth + ; returnM (op_name, dm, op_ty) } +\end{code} + + +%************************************************************************ +%* * +\subsection[Default methods]{Default methods} +%* * +%************************************************************************ + +The default methods for a class are each passed a dictionary for the +class, so that they get access to the other methods at the same type. +So, given the class decl +\begin{verbatim} +class Foo a where + op1 :: a -> Bool + op2 :: Ord b => a -> b -> b -> b + + op1 x = True + op2 x y z = if (op1 x) && (y < z) then y else z +\end{verbatim} +we get the default methods: +\begin{verbatim} +defm.Foo.op1 :: forall a. Foo a => a -> Bool +defm.Foo.op1 = /\a -> \dfoo -> \x -> True + +defm.Foo.op2 :: forall a. Foo a => forall b. Ord b => a -> b -> b -> b +defm.Foo.op2 = /\ a -> \ dfoo -> /\ b -> \ dord -> \x y z -> + if (op1 a dfoo x) && (< b dord y z) then y else z +\end{verbatim} + +When we come across an instance decl, we may need to use the default +methods: +\begin{verbatim} +instance Foo Int where {} +\end{verbatim} +gives +\begin{verbatim} +const.Foo.Int.op1 :: Int -> Bool +const.Foo.Int.op1 = defm.Foo.op1 Int dfun.Foo.Int + +const.Foo.Int.op2 :: forall b. 
Ord b => Int -> b -> b -> b +const.Foo.Int.op2 = defm.Foo.op2 Int dfun.Foo.Int + +dfun.Foo.Int :: Foo Int +dfun.Foo.Int = (const.Foo.Int.op1, const.Foo.Int.op2) +\end{verbatim} +Notice that, as with method selectors above, we assume that dictionary +application is curried, so there's no need to mention the Ord dictionary +in const.Foo.Int.op2 (or the type variable). + +\begin{verbatim} +instance Foo a => Foo [a] where {} + +dfun.Foo.List :: forall a. Foo a -> Foo [a] +dfun.Foo.List + = /\ a -> \ dfoo_a -> + let rec + op1 = defm.Foo.op1 [a] dfoo_list + op2 = defm.Foo.op2 [a] dfoo_list + dfoo_list = (op1, op2) + in + dfoo_list +\end{verbatim} + +@tcClassDecls2@ generates bindings for polymorphic default methods +(generic default methods have by now turned into instance declarations) + +\begin{code} +tcClassDecl2 :: LTyClDecl Name -- The class declaration + -> TcM (LHsBinds Id, [Id]) + +tcClassDecl2 (L loc (ClassDecl {tcdLName = class_name, tcdSigs = sigs, + tcdMeths = default_binds})) + = recoverM (returnM (emptyLHsBinds, [])) $ + setSrcSpan loc $ + tcLookupLocatedClass class_name `thenM` \ clas -> + + -- We make a separate binding for each default method. + -- At one time I used a single AbsBinds for all of them, thus + -- AbsBind [d] [dm1, dm2, dm3] { dm1 = ...; dm2 = ...; dm3 = ... } + -- But that desugars into + -- ds = \d -> (..., ..., ...) + -- dm1 = \d -> case ds d of (a,b,c) -> a + -- And since ds is big, it doesn't get inlined, so we don't get good + -- default methods. Better to make separate AbsBinds for each + let + (tyvars, _, _, op_items) = classBigSig clas + prag_fn = mkPragFun sigs + tc_dm = tcDefMeth clas tyvars default_binds prag_fn + + dm_sel_ids = [sel_id | (sel_id, DefMeth) <- op_items] + -- Generate code for polymorphic default methods only + -- (Generic default methods have turned into instance decls by now.) + -- This is incompatible with Hugs, which expects a polymorphic + -- default method for every class op, regardless of whether or not + -- the programmer supplied an explicit default decl for the class. + -- (If necessary we can fix that, but we don't have a convenient Id to hand.) + in + mapAndUnzipM tc_dm dm_sel_ids `thenM` \ (defm_binds, dm_ids_s) -> + returnM (listToBag defm_binds, concat dm_ids_s) + +tcDefMeth clas tyvars binds_in prag_fn sel_id + = do { dm_name <- lookupTopBndrRn (mkDefMethRdrName sel_id) + ; let rigid_info = ClsSkol clas + clas_tyvars = tcSkolSigTyVars rigid_info tyvars + inst_tys = mkTyVarTys clas_tyvars + dm_ty = idType sel_id -- Same as dict selector! 
+ theta = [mkClassPred clas inst_tys] + local_dm_id = mkDefaultMethodId dm_name dm_ty + origin = SigOrigin rigid_info + + ; (_, meth_info) <- mkMethodBind origin clas inst_tys binds_in (sel_id, DefMeth) + ; [this_dict] <- newDicts origin theta + ; (defm_bind, insts_needed) <- getLIE (tcMethodBind clas_tyvars theta + [this_dict] prag_fn meth_info) + + ; addErrCtxt (defltMethCtxt clas) $ do + + -- Check the context + { dict_binds <- tcSimplifyCheck + (ptext SLIT("class") <+> ppr clas) + clas_tyvars + [this_dict] + insts_needed + + -- Simplification can do unification + ; checkSigTyVars clas_tyvars + + -- Inline pragmas + -- We'll have an inline pragma on the local binding, made by tcMethodBind + -- but that's not enough; we want one on the global default method too + -- Specialisations, on the other hand, belong on the thing inside only, I think + ; let (_,dm_inst_id,_) = meth_info + sel_name = idName sel_id + inline_prags = filter isInlineLSig (prag_fn sel_name) + ; prags <- tcPrags dm_inst_id inline_prags + + ; let full_bind = AbsBinds clas_tyvars + [instToId this_dict] + [(clas_tyvars, local_dm_id, dm_inst_id, prags)] + (dict_binds `unionBags` defm_bind) + ; returnM (noLoc full_bind, [local_dm_id]) }} + +mkDefMethRdrName :: Id -> RdrName +mkDefMethRdrName sel_id = mkDerivedRdrName (idName sel_id) mkDefaultMethodOcc +\end{code} + + +%************************************************************************ +%* * +\subsection{Typechecking a method} +%* * +%************************************************************************ + +@tcMethodBind@ is used to type-check both default-method and +instance-decl method declarations. We must type-check methods one at a +time, because their signatures may have different contexts and +tyvar sets. + +\begin{code} +type MethodSpec = (Id, -- Global selector Id + Id, -- Local Id (class tyvars instantiated) + LHsBind Name) -- Binding for the method + +tcMethodBind + :: [TcTyVar] -- Skolemised type variables for the + -- enclosing class/instance decl. + -- They'll be signature tyvars, and we + -- want to check that they don't get bound + -- Also they are scoped, so we bring them into scope + -- Always equal the range of the type envt + -> TcThetaType -- Available theta; it's just used for the error message + -> [Inst] -- Available from context, used to simplify constraints + -- from the method body + -> TcPragFun -- Pragmas (e.g. inline pragmas) + -> MethodSpec -- Details of this method + -> TcM (LHsBinds Id) + +tcMethodBind inst_tyvars inst_theta avail_insts prag_fn + (sel_id, meth_id, meth_bind) + = recoverM (returnM emptyLHsBinds) $ + -- If anything fails, recover returning no bindings. + -- This is particularly useful when checking the default-method binding of + -- a class decl. If we don't recover, we don't add the default method to + -- the type enviroment, and we get a tcLookup failure on $dmeth later. + + -- Check the bindings; first adding inst_tyvars to the envt + -- so that we don't quantify over them in nested places + + + let meth_sig = noLoc (TypeSig (noLoc (idName meth_id)) (noLoc bogus_ty)) + bogus_ty = HsTupleTy Boxed [] -- *Only* used to extract scoped type + -- variables... and there aren't any + lookup_sig name = ASSERT( name == idName meth_id ) + Just meth_sig + in + tcExtendTyVarEnv inst_tyvars ( + tcExtendIdEnv [meth_id] $ -- In scope for tcInstSig + addErrCtxt (methodCtxt sel_id) $ + getLIE $ + tcMonoBinds [meth_bind] lookup_sig Recursive + ) `thenM` \ ((meth_bind, mono_bind_infos), meth_lie) -> + + -- Now do context reduction. 
We simplify wrt both the local tyvars + -- and the ones of the class/instance decl, so that there is + -- no problem with + -- class C a where + -- op :: Eq a => a -> b -> a + -- + -- We do this for each method independently to localise error messages + + let + [(_, Just sig, local_meth_id)] = mono_bind_infos + in + + addErrCtxtM (sigCtxt sel_id inst_tyvars inst_theta (idType meth_id)) $ + newDictsAtLoc (sig_loc sig) (sig_theta sig) `thenM` \ meth_dicts -> + let + meth_tvs = sig_tvs sig + all_tyvars = meth_tvs ++ inst_tyvars + all_insts = avail_insts ++ meth_dicts + sel_name = idName sel_id + in + tcSimplifyCheck + (ptext SLIT("class or instance method") <+> quotes (ppr sel_id)) + all_tyvars all_insts meth_lie `thenM` \ lie_binds -> + + checkSigTyVars all_tyvars `thenM_` + + tcPrags meth_id (prag_fn sel_name) `thenM` \ prags -> + let + poly_meth_bind = noLoc $ AbsBinds meth_tvs + (map instToId meth_dicts) + [(meth_tvs, meth_id, local_meth_id, prags)] + (lie_binds `unionBags` meth_bind) + in + returnM (unitBag poly_meth_bind) + + +mkMethodBind :: InstOrigin + -> Class -> [TcType] -- Class and instance types + -> LHsBinds Name -- Method binding (pick the right one from in here) + -> ClassOpItem + -> TcM (Maybe Inst, -- Method inst + MethodSpec) +-- Find the binding for the specified method, or make +-- up a suitable default method if it isn't there + +mkMethodBind origin clas inst_tys meth_binds (sel_id, dm_info) + = mkMethId origin clas sel_id inst_tys `thenM` \ (mb_inst, meth_id) -> + let + meth_name = idName meth_id + in + -- Figure out what method binding to use + -- If the user suppplied one, use it, else construct a default one + getSrcSpanM `thenM` \ loc -> + (case find_bind (idName sel_id) meth_name meth_binds of + Just user_bind -> returnM user_bind + Nothing -> + mkDefMethRhs origin clas inst_tys sel_id loc dm_info `thenM` \ rhs -> + -- Not infix decl + returnM (noLoc $ mkFunBind (noLoc meth_name) [mkSimpleMatch [] rhs]) + ) `thenM` \ meth_bind -> + + returnM (mb_inst, (sel_id, meth_id, meth_bind)) + +mkMethId :: InstOrigin -> Class + -> Id -> [TcType] -- Selector, and instance types + -> TcM (Maybe Inst, Id) + +-- mkMethId instantiates the selector Id at the specified types +mkMethId origin clas sel_id inst_tys + = let + (tyvars,rho) = tcSplitForAllTys (idType sel_id) + rho_ty = ASSERT( length tyvars == length inst_tys ) + substTyWith tyvars inst_tys rho + (preds,tau) = tcSplitPhiTy rho_ty + first_pred = head preds + in + -- The first predicate should be of form (C a b) + -- where C is the class in question + ASSERT( not (null preds) && + case getClassPredTys_maybe first_pred of + { Just (clas1,tys) -> clas == clas1 ; Nothing -> False } + ) + if isSingleton preds then + -- If it's the only one, make a 'method' + getInstLoc origin `thenM` \ inst_loc -> + newMethod inst_loc sel_id inst_tys `thenM` \ meth_inst -> + returnM (Just meth_inst, instToId meth_inst) + else + -- If it's not the only one we need to be careful + -- For example, given 'op' defined thus: + -- class Foo a where + -- op :: (?x :: String) => a -> a + -- (mkMethId op T) should return an Inst with type + -- (?x :: String) => T -> T + -- That is, the class-op's context is still there. + -- BUT: it can't be a Method any more, because it breaks + -- INVARIANT 2 of methods. (See the data decl for Inst.) 
+ newUnique `thenM` \ uniq -> + getSrcSpanM `thenM` \ loc -> + let + real_tau = mkPhiTy (tail preds) tau + meth_id = mkUserLocal (getOccName sel_id) uniq real_tau + (srcSpanStart loc) --TODO + in + returnM (Nothing, meth_id) + + -- The user didn't supply a method binding, + -- so we have to make up a default binding + -- The RHS of a default method depends on the default-method info +mkDefMethRhs origin clas inst_tys sel_id loc DefMeth + = -- An polymorphic default method + lookupImportedName (mkDefMethRdrName sel_id) `thenM` \ dm_name -> + -- Might not be imported, but will be an OrigName + traceRn (text "mkDefMeth" <+> ppr dm_name) `thenM_` + returnM (nlHsVar dm_name) + +mkDefMethRhs origin clas inst_tys sel_id loc NoDefMeth + = -- No default method + -- Warn only if -fwarn-missing-methods + doptM Opt_WarnMissingMethods `thenM` \ warn -> + warnTc (isInstDecl origin + && warn + && reportIfUnused (getOccName sel_id)) + (omittedMethodWarn sel_id) `thenM_` + returnM error_rhs + where + error_rhs = noLoc $ HsLam (mkMatchGroup [mkSimpleMatch wild_pats simple_rhs]) + simple_rhs = nlHsApp (nlHsVar (getName nO_METHOD_BINDING_ERROR_ID)) + (nlHsLit (HsStringPrim (mkFastString error_msg))) + error_msg = showSDoc (hcat [ppr loc, text "|", ppr sel_id ]) + + -- When the type is of form t1 -> t2 -> t3 + -- make a default method like (\ _ _ -> noMethBind "blah") + -- rather than simply (noMethBind "blah") + -- Reason: if t1 or t2 are higher-ranked types we get n + -- silly ambiguity messages. + -- Example: f :: (forall a. Eq a => a -> a) -> Int + -- f = error "urk" + -- Here, tcSub tries to force (error "urk") to have the right type, + -- thus: f = \(x::forall a. Eq a => a->a) -> error "urk" (x t) + -- where 't' is fresh ty var. This leads directly to "ambiguous t". + -- + -- NB: technically this changes the meaning of the default-default + -- method slightly, because `seq` can see the lambdas. Oh well. + (_,_,tau1) = tcSplitSigmaTy (idType sel_id) + (_,_,tau2) = tcSplitSigmaTy tau1 + -- Need two splits because the selector can have a type like + -- forall a. Foo a => forall b. Eq b => ... + (arg_tys, _) = tcSplitFunTys tau2 + wild_pats = [nlWildPat | ty <- arg_tys] + +mkDefMethRhs origin clas inst_tys sel_id loc GenDefMeth + = -- A generic default method + -- If the method is defined generically, we can only do the job if the + -- instance declaration is for a single-parameter type class with + -- a type constructor applied to type arguments in the instance decl + -- (checkTc, so False provokes the error) + ASSERT( isInstDecl origin ) -- We never get here from a class decl + do { checkTc (isJust maybe_tycon) + (badGenericInstance sel_id (notSimple inst_tys)) + ; checkTc (tyConHasGenerics tycon) + (badGenericInstance sel_id (notGeneric tycon)) + + ; dflags <- getDOpts + ; ioToTcRn (dumpIfSet_dyn dflags Opt_D_dump_deriv "Filling in method body" + (vcat [ppr clas <+> ppr inst_tys, + nest 2 (ppr sel_id <+> equals <+> ppr rhs)])) + + -- Rename it before returning it + ; (rn_rhs, _) <- rnLExpr rhs + ; returnM rn_rhs } + where + rhs = mkGenericRhs sel_id clas_tyvar tycon + + -- The tycon is only used in the generic case, and in that + -- case we require that the instance decl is for a single-parameter + -- type class with type variable arguments: + -- instance (...) 
=> C (T a b) + clas_tyvar = head (classTyVars clas) + Just tycon = maybe_tycon + maybe_tycon = case inst_tys of + [ty] -> case tcSplitTyConApp_maybe ty of + Just (tycon, arg_tys) | all tcIsTyVarTy arg_tys -> Just tycon + other -> Nothing + other -> Nothing + +isInstDecl (SigOrigin (InstSkol _)) = True +isInstDecl (SigOrigin (ClsSkol _)) = False +\end{code} + + +\begin{code} +-- The renamer just puts the selector ID as the binder in the method binding +-- but we must use the method name; so we substitute it here. Crude but simple. +find_bind sel_name meth_name binds + = foldlBag seqMaybe Nothing (mapBag f binds) + where + f (L loc1 bind@(FunBind { fun_id = L loc2 op_name })) | op_name == sel_name + = Just (L loc1 (bind { fun_id = L loc2 meth_name })) + f _other = Nothing +\end{code} + + +%************************************************************************ +%* * +\subsection{Extracting generic instance declaration from class declarations} +%* * +%************************************************************************ + +@getGenericInstances@ extracts the generic instance declarations from a class +declaration. For exmaple + + class C a where + op :: a -> a + + op{ x+y } (Inl v) = ... + op{ x+y } (Inr v) = ... + op{ x*y } (v :*: w) = ... + op{ 1 } Unit = ... + +gives rise to the instance declarations + + instance C (x+y) where + op (Inl v) = ... + op (Inr v) = ... + + instance C (x*y) where + op (v :*: w) = ... + + instance C 1 where + op Unit = ... + + +\begin{code} +getGenericInstances :: [LTyClDecl Name] -> TcM [InstInfo] +getGenericInstances class_decls + = do { gen_inst_infos <- mappM (addLocM get_generics) class_decls + ; let { gen_inst_info = concat gen_inst_infos } + + -- Return right away if there is no generic stuff + ; if null gen_inst_info then returnM [] + else do + + -- Otherwise print it out + { dflags <- getDOpts + ; ioToTcRn (dumpIfSet_dyn dflags Opt_D_dump_deriv "Generic instances" + (vcat (map pprInstInfoDetails gen_inst_info))) + ; returnM gen_inst_info }} + +get_generics decl@(ClassDecl {tcdLName = class_name, tcdMeths = def_methods}) + | null generic_binds + = returnM [] -- The comon case: no generic default methods + + | otherwise -- A source class decl with generic default methods + = recoverM (returnM []) $ + tcAddDeclCtxt decl $ + tcLookupLocatedClass class_name `thenM` \ clas -> + + -- Group by type, and + -- make an InstInfo out of each group + let + groups = groupWith listToBag generic_binds + in + mappM (mkGenericInstance clas) groups `thenM` \ inst_infos -> + + -- Check that there is only one InstInfo for each type constructor + -- The main way this can fail is if you write + -- f {| a+b |} ... = ... + -- f {| x+y |} ... = ... + -- Then at this point we'll have an InstInfo for each + let + tc_inst_infos :: [(TyCon, InstInfo)] + tc_inst_infos = [(simpleInstInfoTyCon i, i) | i <- inst_infos] + + bad_groups = [group | group <- equivClassesByUniq get_uniq tc_inst_infos, + group `lengthExceeds` 1] + get_uniq (tc,_) = getUnique tc + in + mappM (addErrTc . 
dupGenericInsts) bad_groups `thenM_` + + -- Check that there is an InstInfo for each generic type constructor + let + missing = genericTyConNames `minusList` [tyConName tc | (tc,_) <- tc_inst_infos] + in + checkTc (null missing) (missingGenericInstances missing) `thenM_` + + returnM inst_infos + where + generic_binds :: [(HsType Name, LHsBind Name)] + generic_binds = getGenericBinds def_methods + + +--------------------------------- +getGenericBinds :: LHsBinds Name -> [(HsType Name, LHsBind Name)] + -- Takes a group of method bindings, finds the generic ones, and returns + -- them in finite map indexed by the type parameter in the definition. +getGenericBinds binds = concat (map getGenericBind (bagToList binds)) + +getGenericBind (L loc bind@(FunBind { fun_matches = MatchGroup matches ty })) + = groupWith wrap (mapCatMaybes maybeGenericMatch matches) + where + wrap ms = L loc (bind { fun_matches = MatchGroup ms ty }) +getGenericBind _ + = [] + +groupWith :: ([a] -> b) -> [(HsType Name, a)] -> [(HsType Name, b)] +groupWith op [] = [] +groupWith op ((t,v):prs) = (t, op (v:vs)) : groupWith op rest + where + vs = map snd this + (this,rest) = partition same_t prs + same_t (t',v) = t `eqPatType` t' + +eqPatLType :: LHsType Name -> LHsType Name -> Bool +eqPatLType t1 t2 = unLoc t1 `eqPatType` unLoc t2 + +eqPatType :: HsType Name -> HsType Name -> Bool +-- A very simple equality function, only for +-- type patterns in generic function definitions. +eqPatType (HsTyVar v1) (HsTyVar v2) = v1==v2 +eqPatType (HsAppTy s1 t1) (HsAppTy s2 t2) = s1 `eqPatLType` s2 && t2 `eqPatLType` t2 +eqPatType (HsOpTy s1 op1 t1) (HsOpTy s2 op2 t2) = s1 `eqPatLType` s2 && t2 `eqPatLType` t2 && unLoc op1 == unLoc op2 +eqPatType (HsNumTy n1) (HsNumTy n2) = n1 == n2 +eqPatType (HsParTy t1) t2 = unLoc t1 `eqPatType` t2 +eqPatType t1 (HsParTy t2) = t1 `eqPatType` unLoc t2 +eqPatType _ _ = False + +--------------------------------- +mkGenericInstance :: Class + -> (HsType Name, LHsBinds Name) + -> TcM InstInfo + +mkGenericInstance clas (hs_ty, binds) + -- Make a generic instance declaration + -- For example: instance (C a, C b) => C (a+b) where { binds } + + = -- Extract the universally quantified type variables + -- and wrap them as forall'd tyvars, so that kind inference + -- works in the standard way + let + sig_tvs = map (noLoc.UserTyVar) (nameSetToList (extractHsTyVars (noLoc hs_ty))) + hs_forall_ty = noLoc $ mkExplicitHsForAllTy sig_tvs (noLoc []) (noLoc hs_ty) + in + -- Type-check the instance type, and check its form + tcHsSigType GenPatCtxt hs_forall_ty `thenM` \ forall_inst_ty -> + let + (tyvars, inst_ty) = tcSplitForAllTys forall_inst_ty + in + checkTc (validGenericInstanceType inst_ty) + (badGenericInstanceType binds) `thenM_` + + -- Make the dictionary function. 
+ getSrcSpanM `thenM` \ span -> + getOverlapFlag `thenM` \ overlap_flag -> + newDFunName clas [inst_ty] (srcSpanStart span) `thenM` \ dfun_name -> + let + inst_theta = [mkClassPred clas [mkTyVarTy tv] | tv <- tyvars] + dfun_id = mkDictFunId dfun_name tyvars inst_theta clas [inst_ty] + ispec = mkLocalInstance dfun_id overlap_flag + in + returnM (InstInfo { iSpec = ispec, iBinds = VanillaInst binds [] }) +\end{code} + + +%************************************************************************ +%* * + Error messages +%* * +%************************************************************************ + +\begin{code} +tcAddDeclCtxt decl thing_inside + = addErrCtxt ctxt thing_inside + where + thing = case decl of + ClassDecl {} -> "class" + TySynonym {} -> "type synonym" + TyData {tcdND = NewType} -> "newtype" + TyData {tcdND = DataType} -> "data type" + + ctxt = hsep [ptext SLIT("In the"), text thing, + ptext SLIT("declaration for"), quotes (ppr (tcdName decl))] + +defltMethCtxt clas + = ptext SLIT("When checking the default methods for class") <+> quotes (ppr clas) + +methodCtxt sel_id + = ptext SLIT("In the definition for method") <+> quotes (ppr sel_id) + +badMethodErr clas op + = hsep [ptext SLIT("Class"), quotes (ppr clas), + ptext SLIT("does not have a method"), quotes (ppr op)] + +omittedMethodWarn sel_id + = ptext SLIT("No explicit method nor default method for") <+> quotes (ppr sel_id) + +badGenericInstance sel_id because + = sep [ptext SLIT("Can't derive generic code for") <+> quotes (ppr sel_id), + because] + +notSimple inst_tys + = vcat [ptext SLIT("because the instance type(s)"), + nest 2 (ppr inst_tys), + ptext SLIT("is not a simple type of form (T a b c)")] + +notGeneric tycon + = vcat [ptext SLIT("because the instance type constructor") <+> quotes (ppr tycon) <+> + ptext SLIT("was not compiled with -fgenerics")] + +badGenericInstanceType binds + = vcat [ptext SLIT("Illegal type pattern in the generic bindings"), + nest 4 (ppr binds)] + +missingGenericInstances missing + = ptext SLIT("Missing type patterns for") <+> pprQuotedList missing + +dupGenericInsts tc_inst_infos + = vcat [ptext SLIT("More than one type pattern for a single generic type constructor:"), + nest 4 (vcat (map ppr_inst_ty tc_inst_infos)), + ptext SLIT("All the type patterns for a generic type constructor must be identical") + ] + where + ppr_inst_ty (_,inst) = ppr (simpleInstInfoTy inst) + +mixedGenericErr op + = ptext SLIT("Can't mix generic and non-generic equations for class method") <+> quotes (ppr op) +\end{code} diff --git a/compiler/typecheck/TcDefaults.lhs b/compiler/typecheck/TcDefaults.lhs new file mode 100644 index 0000000000..6c9de36a3c --- /dev/null +++ b/compiler/typecheck/TcDefaults.lhs @@ -0,0 +1,79 @@ +% +% (c) The AQUA Project, Glasgow University, 1993-1998 +% +\section[TcDefaults]{Typechecking \tr{default} declarations} + +\begin{code} +module TcDefaults ( tcDefaults ) where + +#include "HsVersions.h" + +import HsSyn ( DefaultDecl(..), LDefaultDecl ) +import Name ( Name ) +import TcRnMonad +import TcEnv ( tcLookupClass ) +import TcHsType ( tcHsSigType, UserTypeCtxt( DefaultDeclCtxt ) ) +import TcSimplify ( tcSimplifyDefault ) +import TcType ( Type, mkClassPred, isTauTy ) +import PrelNames ( numClassName ) +import SrcLoc ( Located(..) ) +import Outputable +\end{code} + +\begin{code} +tcDefaults :: [LDefaultDecl Name] + -> TcM (Maybe [Type]) -- Defaulting types to heave + -- into Tc monad for later use + -- in Disambig. 
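+				-- A worked example (for illustration only; the
+				-- types shown are hypothetical, not tied to any
+				-- particular module): given
+				--	default (Int, Double)
+				-- each type is checked (and required to be
+				-- monomorphic) by tc_default_ty, tcSimplifyDefault
+				-- checks that (Num Int, Num Double) is satisfiable,
+				-- and we return Just [Int, Double] for later use
+				-- when resolving ambiguous types.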
+ +tcDefaults [] + = getDefaultTys -- No default declaration, so get the + -- default types from the envt; + -- i.e. use the curent ones + -- (the caller will put them back there) + -- It's important not to return defaultDefaultTys here (which + -- we used to do) because in a TH program, tcDefaults [] is called + -- repeatedly, once for each group of declarations between top-level + -- splices. We don't want to carefully set the default types in + -- one group, only for the next group to ignore them and install + -- defaultDefaultTys + +tcDefaults [L locn (DefaultDecl [])] + = returnM (Just []) -- Default declaration specifying no types + +tcDefaults [L locn (DefaultDecl mono_tys)] + = setSrcSpan locn $ + addErrCtxt defaultDeclCtxt $ + tcLookupClass numClassName `thenM` \ num_class -> + mappM tc_default_ty mono_tys `thenM` \ tau_tys -> + + -- Check that all the types are instances of Num + -- We only care about whether it worked or not + tcSimplifyDefault [mkClassPred num_class [ty] | ty <- tau_tys] `thenM_` + + returnM (Just tau_tys) + +tcDefaults decls@(L locn (DefaultDecl _) : _) = + setSrcSpan locn $ + failWithTc (dupDefaultDeclErr decls) + + +tc_default_ty hs_ty + = tcHsSigType DefaultDeclCtxt hs_ty `thenM` \ ty -> + checkTc (isTauTy ty) (polyDefErr hs_ty) `thenM_` + returnM ty + +defaultDeclCtxt = ptext SLIT("when checking that each type in a default declaration") + $$ ptext SLIT("is an instance of class Num") + + +dupDefaultDeclErr (L _ (DefaultDecl _) : dup_things) + = hang (ptext SLIT("Multiple default declarations")) + 4 (vcat (map pp dup_things)) + where + pp (L locn (DefaultDecl _)) = ptext SLIT("here was another default declaration") <+> ppr locn + +polyDefErr ty + = hang (ptext SLIT("Illegal polymorphic type in default declaration") <> colon) 4 (ppr ty) +\end{code} + diff --git a/compiler/typecheck/TcDeriv.lhs b/compiler/typecheck/TcDeriv.lhs new file mode 100644 index 0000000000..472ce6b94d --- /dev/null +++ b/compiler/typecheck/TcDeriv.lhs @@ -0,0 +1,960 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[TcDeriv]{Deriving} + +Handles @deriving@ clauses on @data@ declarations. + +\begin{code} +module TcDeriv ( tcDeriving ) where + +#include "HsVersions.h" + +import HsSyn +import DynFlags ( DynFlag(..) 
) + +import Generics ( mkTyConGenericBinds ) +import TcRnMonad +import TcEnv ( newDFunName, pprInstInfoDetails, + InstInfo(..), InstBindings(..), simpleInstInfoClsTy, + tcLookupClass, tcLookupTyCon, tcExtendTyVarEnv + ) +import TcGenDeriv -- Deriv stuff +import InstEnv ( Instance, OverlapFlag, mkLocalInstance, instanceHead, extendInstEnvList ) +import Inst ( getOverlapFlag ) +import TcHsType ( tcHsDeriv ) +import TcSimplify ( tcSimplifyDeriv ) + +import RnBinds ( rnMethodBinds, rnTopBinds ) +import RnEnv ( bindLocalNames ) +import HscTypes ( FixityEnv ) + +import Class ( className, classArity, classKey, classTyVars, classSCTheta, Class ) +import Type ( zipOpenTvSubst, substTheta ) +import ErrUtils ( dumpIfSet_dyn ) +import MkId ( mkDictFunId ) +import DataCon ( isNullarySrcDataCon, isVanillaDataCon, dataConOrigArgTys ) +import Maybes ( catMaybes ) +import RdrName ( RdrName ) +import Name ( Name, getSrcLoc ) +import NameSet ( duDefs ) +import Kind ( splitKindFunTys ) +import TyCon ( tyConTyVars, tyConDataCons, tyConArity, tyConHasGenerics, + tyConStupidTheta, isProductTyCon, isDataTyCon, newTyConRhs, + isEnumerationTyCon, isRecursiveTyCon, TyCon + ) +import TcType ( TcType, ThetaType, mkTyVarTys, mkTyConApp, tcTyConAppTyCon, + isUnLiftedType, mkClassPred, tyVarsOfTypes, isArgTypeKind, + tcEqTypes, tcSplitAppTys, mkAppTys ) +import Var ( TyVar, tyVarKind, varName ) +import VarSet ( mkVarSet, subVarSet ) +import PrelNames +import SrcLoc ( srcLocSpan, Located(..) ) +import Util ( zipWithEqual, sortLe, notNull ) +import ListSetOps ( removeDups, assocMaybe ) +import Outputable +import Bag +\end{code} + +%************************************************************************ +%* * +\subsection[TcDeriv-intro]{Introduction to how we do deriving} +%* * +%************************************************************************ + +Consider + + data T a b = C1 (Foo a) (Bar b) + | C2 Int (T b a) + | C3 (T a a) + deriving (Eq) + +[NOTE: See end of these comments for what to do with + data (C a, D b) => T a b = ... +] + +We want to come up with an instance declaration of the form + + instance (Ping a, Pong b, ...) => Eq (T a b) where + x == y = ... + +It is pretty easy, albeit tedious, to fill in the code "...". The +trick is to figure out what the context for the instance decl is, +namely @Ping@, @Pong@ and friends. + +Let's call the context reqd for the T instance of class C at types +(a,b, ...) C (T a b). Thus: + + Eq (T a b) = (Ping a, Pong b, ...) + +Now we can get a (recursive) equation from the @data@ decl: + + Eq (T a b) = Eq (Foo a) u Eq (Bar b) -- From C1 + u Eq (T b a) u Eq Int -- From C2 + u Eq (T a a) -- From C3 + +Foo and Bar may have explicit instances for @Eq@, in which case we can +just substitute for them. Alternatively, either or both may have +their @Eq@ instances given by @deriving@ clauses, in which case they +form part of the system of equations. + +Now all we need do is simplify and solve the equations, iterating to +find the least fixpoint. Notice that the order of the arguments can +switch around, as here in the recursive calls to T. + +Let's suppose Eq (Foo a) = Eq a, and Eq (Bar b) = Ping b. 
+ +We start with: + + Eq (T a b) = {} -- The empty set + +Next iteration: + Eq (T a b) = Eq (Foo a) u Eq (Bar b) -- From C1 + u Eq (T b a) u Eq Int -- From C2 + u Eq (T a a) -- From C3 + + After simplification: + = Eq a u Ping b u {} u {} u {} + = Eq a u Ping b + +Next iteration: + + Eq (T a b) = Eq (Foo a) u Eq (Bar b) -- From C1 + u Eq (T b a) u Eq Int -- From C2 + u Eq (T a a) -- From C3 + + After simplification: + = Eq a u Ping b + u (Eq b u Ping a) + u (Eq a u Ping a) + + = Eq a u Ping b u Eq b u Ping a + +The next iteration gives the same result, so this is the fixpoint. We +need to make a canonical form of the RHS to ensure convergence. We do +this by simplifying the RHS to a form in which + + - the classes constrain only tyvars + - the list is sorted by tyvar (major key) and then class (minor key) + - no duplicates, of course + +So, here are the synonyms for the ``equation'' structures: + +\begin{code} +type DerivEqn = (Name, Class, TyCon, [TyVar], DerivRhs) + -- The Name is the name for the DFun we'll build + -- The tyvars bind all the variables in the RHS + +pprDerivEqn (n,c,tc,tvs,rhs) + = parens (hsep [ppr n, ppr c, ppr tc, ppr tvs] <+> equals <+> ppr rhs) + +type DerivRhs = ThetaType +type DerivSoln = DerivRhs +\end{code} + + +[Data decl contexts] A note about contexts on data decls +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Consider + + data (RealFloat a) => Complex a = !a :+ !a deriving( Read ) + +We will need an instance decl like: + + instance (Read a, RealFloat a) => Read (Complex a) where + ... + +The RealFloat in the context is because the read method for Complex is bound +to construct a Complex, and doing that requires that the argument type is +in RealFloat. + +But this ain't true for Show, Eq, Ord, etc, since they don't construct +a Complex; they only take them apart. + +Our approach: identify the offending classes, and add the data type +context to the instance decl. The "offending classes" are + + Read, Enum? + +FURTHER NOTE ADDED March 2002. In fact, Haskell98 now requires that +pattern matching against a constructor from a data type with a context +gives rise to the constraints for that context -- or at least the thinned +version. So now all classes are "offending". + +[Newtype deriving] +~~~~~~~~~~~~~~~~~~ +Consider this: + class C a b + instance C [a] Char + newtype T = T Char deriving( C [a] ) + +Notice the free 'a' in the deriving. We have to fill this out to + newtype T = T Char deriving( forall a. C [a] ) + +And then translate it to: + instance C [a] Char => C [a] T where ... + + + + +%************************************************************************ +%* * +\subsection[TcDeriv-driver]{Top-level function for \tr{derivings}} +%* * +%************************************************************************ + +\begin{code} +tcDeriving :: [LTyClDecl Name] -- All type constructors + -> TcM ([InstInfo], -- The generated "instance decls" + HsValBinds Name) -- Extra generated top-level bindings + +tcDeriving tycl_decls + = recoverM (returnM ([], emptyValBindsOut)) $ + do { -- Fish the "deriving"-related information out of the TcEnv + -- and make the necessary "equations". 
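+		-- Overview of the pipeline: makeDerivEqns produces one equation
+		-- per "deriving" item; newtype-derived instances come back from it
+		-- as ready-made InstInfos, while the ordinary equations are solved
+		-- by solveDerivEqns and turned into bindings by genInst
+		-- (see deriveOrdinaryStuff below).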
+ overlap_flag <- getOverlapFlag + ; (ordinary_eqns, newtype_inst_info) <- makeDerivEqns overlap_flag tycl_decls + + ; (ordinary_inst_info, deriv_binds) + <- extendLocalInstEnv (map iSpec newtype_inst_info) $ + deriveOrdinaryStuff overlap_flag ordinary_eqns + -- Add the newtype-derived instances to the inst env + -- before tacking the "ordinary" ones + + ; let inst_info = newtype_inst_info ++ ordinary_inst_info + + -- If we are compiling a hs-boot file, + -- don't generate any derived bindings + ; is_boot <- tcIsHsBoot + ; if is_boot then + return (inst_info, emptyValBindsOut) + else do + { + + -- Generate the generic to/from functions from each type declaration + ; gen_binds <- mkGenericBinds tycl_decls + + -- Rename these extra bindings, discarding warnings about unused bindings etc + -- Set -fglasgow exts so that we can have type signatures in patterns, + -- which is used in the generic binds + ; rn_binds + <- discardWarnings $ setOptM Opt_GlasgowExts $ do + { (rn_deriv, _dus1) <- rnTopBinds (ValBindsIn deriv_binds []) + ; (rn_gen, dus_gen) <- rnTopBinds (ValBindsIn gen_binds []) + ; keepAliveSetTc (duDefs dus_gen) -- Mark these guys to + -- be kept alive + ; return (rn_deriv `plusHsValBinds` rn_gen) } + + + ; dflags <- getDOpts + ; ioToTcRn (dumpIfSet_dyn dflags Opt_D_dump_deriv "Derived instances" + (ddump_deriving inst_info rn_binds)) + + ; returnM (inst_info, rn_binds) + }} + where + ddump_deriving :: [InstInfo] -> HsValBinds Name -> SDoc + ddump_deriving inst_infos extra_binds + = vcat (map pprInstInfoDetails inst_infos) $$ ppr extra_binds + +----------------------------------------- +deriveOrdinaryStuff overlap_flag [] -- Short cut + = returnM ([], emptyLHsBinds) + +deriveOrdinaryStuff overlap_flag eqns + = do { -- Take the equation list and solve it, to deliver a list of + -- solutions, a.k.a. the contexts for the instance decls + -- required for the corresponding equations. + inst_specs <- solveDerivEqns overlap_flag eqns + + -- Generate the InstInfo for each dfun, + -- plus any auxiliary bindings it needs + ; (inst_infos, aux_binds_s) <- mapAndUnzipM genInst inst_specs + + -- Generate any extra not-one-inst-decl-specific binds, + -- notably "con2tag" and/or "tag2con" functions. + ; extra_binds <- genTaggeryBinds inst_infos + + -- Done + ; returnM (inst_infos, unionManyBags (extra_binds : aux_binds_s)) + } + +----------------------------------------- +mkGenericBinds tycl_decls + = do { tcs <- mapM tcLookupTyCon + [ tc_name | + L _ (TyData { tcdLName = L _ tc_name }) <- tycl_decls] + -- We are only interested in the data type declarations + ; return (unionManyBags [ mkTyConGenericBinds tc | + tc <- tcs, tyConHasGenerics tc ]) } + -- And then only in the ones whose 'has-generics' flag is on +\end{code} + + +%************************************************************************ +%* * +\subsection[TcDeriv-eqns]{Forming the equations} +%* * +%************************************************************************ + +@makeDerivEqns@ fishes around to find the info about needed derived +instances. Complicating factors: +\begin{itemize} +\item +We can only derive @Enum@ if the data type is an enumeration +type (all nullary data constructors). + +\item +We can only derive @Ix@ if the data type is an enumeration {\em +or} has just one data constructor (e.g., tuples). +\end{itemize} + +[See Appendix~E in the Haskell~1.2 report.] This code here deals w/ +all those. 
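+
+For example (purely illustrative; these types are not from any real module):
+
+	data Colour = Red | Green | Blue  deriving( Enum, Ix )
+		-- ok: an enumeration, so both Enum and Ix can be derived
+	data Pair a = P a a  deriving( Ix )
+		-- ok for Ix: just one data constructor
+	data Tree a = Leaf | Node (Tree a) a (Tree a)  deriving( Enum )
+		-- rejected: Node is not nullary, so Tree is not an enumeration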
+ +\begin{code} +makeDerivEqns :: OverlapFlag + -> [LTyClDecl Name] + -> TcM ([DerivEqn], -- Ordinary derivings + [InstInfo]) -- Special newtype derivings + +makeDerivEqns overlap_flag tycl_decls + = mapAndUnzipM mk_eqn derive_these `thenM` \ (maybe_ordinaries, maybe_newtypes) -> + returnM (catMaybes maybe_ordinaries, catMaybes maybe_newtypes) + where + ------------------------------------------------------------------ + derive_these :: [(NewOrData, Name, LHsType Name)] + -- Find the (nd, TyCon, Pred) pairs that must be `derived' + derive_these = [ (nd, tycon, pred) + | L _ (TyData { tcdND = nd, tcdLName = L _ tycon, + tcdDerivs = Just preds }) <- tycl_decls, + pred <- preds ] + + ------------------------------------------------------------------ + mk_eqn :: (NewOrData, Name, LHsType Name) -> TcM (Maybe DerivEqn, Maybe InstInfo) + -- We swizzle the tyvars and datacons out of the tycon + -- to make the rest of the equation + -- + -- The "deriv_ty" is a LHsType to take account of the fact that for newtype derivign + -- we allow deriving (forall a. C [a]). + + mk_eqn (new_or_data, tycon_name, hs_deriv_ty) + = tcLookupTyCon tycon_name `thenM` \ tycon -> + setSrcSpan (srcLocSpan (getSrcLoc tycon)) $ + addErrCtxt (derivCtxt Nothing tycon) $ + tcExtendTyVarEnv (tyConTyVars tycon) $ -- Deriving preds may (now) mention + -- the type variables for the type constructor + tcHsDeriv hs_deriv_ty `thenM` \ (deriv_tvs, clas, tys) -> + doptM Opt_GlasgowExts `thenM` \ gla_exts -> + mk_eqn_help gla_exts new_or_data tycon deriv_tvs clas tys + + ------------------------------------------------------------------ + mk_eqn_help gla_exts DataType tycon deriv_tvs clas tys + | Just err <- checkSideConditions gla_exts tycon deriv_tvs clas tys + = bale_out (derivingThingErr clas tys tycon (tyConTyVars tycon) err) + | otherwise + = do { eqn <- mkDataTypeEqn tycon clas + ; returnM (Just eqn, Nothing) } + + mk_eqn_help gla_exts NewType tycon deriv_tvs clas tys + | can_derive_via_isomorphism && (gla_exts || std_class_via_iso clas) + = -- Go ahead and use the isomorphism + traceTc (text "newtype deriving:" <+> ppr tycon <+> ppr rep_tys) `thenM_` + new_dfun_name clas tycon `thenM` \ dfun_name -> + returnM (Nothing, Just (InstInfo { iSpec = mk_inst_spec dfun_name, + iBinds = NewTypeDerived rep_tys })) + | std_class gla_exts clas + = mk_eqn_help gla_exts DataType tycon deriv_tvs clas tys -- Go via bale-out route + + | otherwise -- Non-standard instance + = bale_out (if gla_exts then + cant_derive_err -- Too hard + else + non_std_err) -- Just complain about being a non-std instance + where + -- Here is the plan for newtype derivings. We see + -- newtype T a1...an = T (t ak...an) deriving (.., C s1 .. sm, ...) + -- where t is a type, + -- ak...an is a suffix of a1..an + -- ak...an do not occur free in t, + -- (C s1 ... sm) is a *partial applications* of class C + -- with the last parameter missing + -- + -- We generate the instances + -- instance C s1 .. sm (t ak...ap) => C s1 .. sm (T a1...ap) + -- where T a1...ap is the partial application of the LHS of the correct kind + -- and p >= k + -- + -- Running example: newtype T s a = MkT (ST s a) deriving( Monad ) + -- instance Monad (ST s) => Monad (T s) where + -- fail = coerce ... (fail @ ST s) + -- (Actually we don't need the coerce, because non-rec newtypes are transparent + + clas_tyvars = classTyVars clas + kind = tyVarKind (last clas_tyvars) + -- Kind of the thing we want to instance + -- e.g. 
argument kind of Monad, *->* + + (arg_kinds, _) = splitKindFunTys kind + n_args_to_drop = length arg_kinds + -- Want to drop 1 arg from (T s a) and (ST s a) + -- to get instance Monad (ST s) => Monad (T s) + + -- Note [newtype representation] + -- Need newTyConRhs *not* newTyConRep to get the representation + -- type, because the latter looks through all intermediate newtypes + -- For example + -- newtype B = MkB Int + -- newtype A = MkA B deriving( Num ) + -- We want the Num instance of B, *not* the Num instance of Int, + -- when making the Num instance of A! + (tc_tvs, rep_ty) = newTyConRhs tycon + (rep_fn, rep_ty_args) = tcSplitAppTys rep_ty + + n_tyvars_to_keep = tyConArity tycon - n_args_to_drop + tyvars_to_drop = drop n_tyvars_to_keep tc_tvs + tyvars_to_keep = take n_tyvars_to_keep tc_tvs + + n_args_to_keep = length rep_ty_args - n_args_to_drop + args_to_drop = drop n_args_to_keep rep_ty_args + args_to_keep = take n_args_to_keep rep_ty_args + + rep_tys = tys ++ [mkAppTys rep_fn args_to_keep] + rep_pred = mkClassPred clas rep_tys + -- rep_pred is the representation dictionary, from where + -- we are gong to get all the methods for the newtype dictionary + + inst_tys = (tys ++ [mkTyConApp tycon (mkTyVarTys tyvars_to_keep)]) + -- The 'tys' here come from the partial application + -- in the deriving clause. The last arg is the new + -- instance type. + + -- We must pass the superclasses; the newtype might be an instance + -- of them in a different way than the representation type + -- E.g. newtype Foo a = Foo a deriving( Show, Num, Eq ) + -- Then the Show instance is not done via isomprphism; it shows + -- Foo 3 as "Foo 3" + -- The Num instance is derived via isomorphism, but the Show superclass + -- dictionary must the Show instance for Foo, *not* the Show dictionary + -- gotten from the Num dictionary. So we must build a whole new dictionary + -- not just use the Num one. The instance we want is something like: + -- instance (Num a, Show (Foo a), Eq (Foo a)) => Num (Foo a) where + -- (+) = ((+)@a) + -- ...etc... + -- There's no 'corece' needed because after the type checker newtypes + -- are transparent. + + sc_theta = substTheta (zipOpenTvSubst clas_tyvars inst_tys) + (classSCTheta clas) + + -- If there are no tyvars, there's no need + -- to abstract over the dictionaries we need + dict_tvs = deriv_tvs ++ tc_tvs + dict_args | null dict_tvs = [] + | otherwise = rep_pred : sc_theta + + -- Finally! Here's where we build the dictionary Id + mk_inst_spec dfun_name + = mkLocalInstance dfun overlap_flag + where + dfun = mkDictFunId dfun_name dict_tvs dict_args clas inst_tys + + ------------------------------------------------------------------- + -- Figuring out whether we can only do this newtype-deriving thing + + right_arity = length tys + 1 == classArity clas + + -- Never derive Read,Show,Typeable,Data this way + non_iso_classes = [readClassKey, showClassKey, typeableClassKey, dataClassKey] + can_derive_via_isomorphism + = not (getUnique clas `elem` non_iso_classes) + && right_arity -- Well kinded; + -- eg not: newtype T ... deriving( ST ) + -- because ST needs *2* type params + && n_tyvars_to_keep >= 0 -- Type constructor has right kind: + -- eg not: newtype T = T Int deriving( Monad ) + && n_args_to_keep >= 0 -- Rep type has right kind: + -- eg not: newtype T a = T Int deriving( Monad ) + && eta_ok -- Eta reduction works + && not (isRecursiveTyCon tycon) -- Does not work for recursive tycons: + -- newtype A = MkA [A] + -- Don't want + -- instance Eq [A] => Eq A !! 
+ -- Here's a recursive newtype that's actually OK + -- newtype S1 = S1 [T1 ()] + -- newtype T1 a = T1 (StateT S1 IO a ) deriving( Monad ) + -- It's currently rejected. Oh well. + -- In fact we generate an instance decl that has method of form + -- meth @ instTy = meth @ repTy + -- (no coerce's). We'd need a coerce if we wanted to handle + -- recursive newtypes too + + -- Check that eta reduction is OK + -- (a) the dropped-off args are identical + -- (b) the remaining type args mention + -- only the remaining type variables + eta_ok = (args_to_drop `tcEqTypes` mkTyVarTys tyvars_to_drop) + && (tyVarsOfTypes args_to_keep `subVarSet` mkVarSet tyvars_to_keep) + + cant_derive_err = derivingThingErr clas tys tycon tyvars_to_keep + (vcat [ptext SLIT("even with cunning newtype deriving:"), + if isRecursiveTyCon tycon then + ptext SLIT("the newtype is recursive") + else empty, + if not right_arity then + quotes (ppr (mkClassPred clas tys)) <+> ptext SLIT("does not have arity 1") + else empty, + if not (n_tyvars_to_keep >= 0) then + ptext SLIT("the type constructor has wrong kind") + else if not (n_args_to_keep >= 0) then + ptext SLIT("the representation type has wrong kind") + else if not eta_ok then + ptext SLIT("the eta-reduction property does not hold") + else empty + ]) + + non_std_err = derivingThingErr clas tys tycon tyvars_to_keep + (vcat [non_std_why clas, + ptext SLIT("Try -fglasgow-exts for GHC's newtype-deriving extension")]) + + bale_out err = addErrTc err `thenM_` returnM (Nothing, Nothing) + +std_class gla_exts clas + = key `elem` derivableClassKeys + || (gla_exts && (key == typeableClassKey || key == dataClassKey)) + where + key = classKey clas + +std_class_via_iso clas -- These standard classes can be derived for a newtype + -- using the isomorphism trick *even if no -fglasgow-exts* + = classKey clas `elem` [eqClassKey, ordClassKey, ixClassKey, boundedClassKey] + -- Not Read/Show because they respect the type + -- Not Enum, becuase newtypes are never in Enum + + +new_dfun_name clas tycon -- Just a simple wrapper + = newDFunName clas [mkTyConApp tycon []] (getSrcLoc tycon) + -- The type passed to newDFunName is only used to generate + -- a suitable string; hence the empty type arg list + +------------------------------------------------------------------ +mkDataTypeEqn :: TyCon -> Class -> TcM DerivEqn +mkDataTypeEqn tycon clas + | clas `hasKey` typeableClassKey + = -- The Typeable class is special in several ways + -- data T a b = ... deriving( Typeable ) + -- gives + -- instance Typeable2 T where ... + -- Notice that: + -- 1. There are no constraints in the instance + -- 2. There are no type variables either + -- 3. The actual class we want to generate isn't necessarily + -- Typeable; it depends on the arity of the type + do { real_clas <- tcLookupClass (typeableClassNames !! tyConArity tycon) + ; dfun_name <- new_dfun_name real_clas tycon + ; return (dfun_name, real_clas, tycon, [], []) } + + | otherwise + = do { dfun_name <- new_dfun_name clas tycon + ; return (dfun_name, clas, tycon, tyvars, constraints) } + where + tyvars = tyConTyVars tycon + constraints = extra_constraints ++ ordinary_constraints + extra_constraints = tyConStupidTheta tycon + -- "extra_constraints": see note [Data decl contexts] above + + ordinary_constraints + = [ mkClassPred clas [arg_ty] + | data_con <- tyConDataCons tycon, + arg_ty <- dataConOrigArgTys data_con, + not (isUnLiftedType arg_ty) -- No constraints for unlifted types? 
+ ] + + +------------------------------------------------------------------ +-- Check side conditions that dis-allow derivability for particular classes +-- This is *apart* from the newtype-deriving mechanism + +checkSideConditions :: Bool -> TyCon -> [TyVar] -> Class -> [TcType] -> Maybe SDoc +checkSideConditions gla_exts tycon deriv_tvs clas tys + | notNull deriv_tvs || notNull tys + = Just ty_args_why -- e.g. deriving( Foo s ) + | otherwise + = case [cond | (key,cond) <- sideConditions, key == getUnique clas] of + [] -> Just (non_std_why clas) + [cond] -> cond (gla_exts, tycon) + other -> pprPanic "checkSideConditions" (ppr clas) + where + ty_args_why = quotes (ppr (mkClassPred clas tys)) <+> ptext SLIT("is not a class") + +non_std_why clas = quotes (ppr clas) <+> ptext SLIT("is not a derivable class") + +sideConditions :: [(Unique, Condition)] +sideConditions + = [ (eqClassKey, cond_std), + (ordClassKey, cond_std), + (readClassKey, cond_std), + (showClassKey, cond_std), + (enumClassKey, cond_std `andCond` cond_isEnumeration), + (ixClassKey, cond_std `andCond` (cond_isEnumeration `orCond` cond_isProduct)), + (boundedClassKey, cond_std `andCond` (cond_isEnumeration `orCond` cond_isProduct)), + (typeableClassKey, cond_glaExts `andCond` cond_typeableOK), + (dataClassKey, cond_glaExts `andCond` cond_std) + ] + +type Condition = (Bool, TyCon) -> Maybe SDoc -- Nothing => OK + +orCond :: Condition -> Condition -> Condition +orCond c1 c2 tc + = case c1 tc of + Nothing -> Nothing -- c1 succeeds + Just x -> case c2 tc of -- c1 fails + Nothing -> Nothing + Just y -> Just (x $$ ptext SLIT(" and") $$ y) + -- Both fail + +andCond c1 c2 tc = case c1 tc of + Nothing -> c2 tc -- c1 succeeds + Just x -> Just x -- c1 fails + +cond_std :: Condition +cond_std (gla_exts, tycon) + | any (not . isVanillaDataCon) data_cons = Just existential_why + | null data_cons = Just no_cons_why + | otherwise = Nothing + where + data_cons = tyConDataCons tycon + no_cons_why = quotes (ppr tycon) <+> ptext SLIT("has no data constructors") + existential_why = quotes (ppr tycon) <+> ptext SLIT("has non-Haskell-98 constructor(s)") + +cond_isEnumeration :: Condition +cond_isEnumeration (gla_exts, tycon) + | isEnumerationTyCon tycon = Nothing + | otherwise = Just why + where + why = quotes (ppr tycon) <+> ptext SLIT("has non-nullary constructors") + +cond_isProduct :: Condition +cond_isProduct (gla_exts, tycon) + | isProductTyCon tycon = Nothing + | otherwise = Just why + where + why = quotes (ppr tycon) <+> ptext SLIT("has more than one constructor") + +cond_typeableOK :: Condition +-- OK for Typeable class +-- Currently: (a) args all of kind * +-- (b) 7 or fewer args +cond_typeableOK (gla_exts, tycon) + | tyConArity tycon > 7 = Just too_many + | not (all (isArgTypeKind . 
tyVarKind) (tyConTyVars tycon)) = Just bad_kind + | otherwise = Nothing + where + too_many = quotes (ppr tycon) <+> ptext SLIT("has too many arguments") + bad_kind = quotes (ppr tycon) <+> ptext SLIT("has arguments of kind other than `*'") + +cond_glaExts :: Condition +cond_glaExts (gla_exts, tycon) | gla_exts = Nothing + | otherwise = Just why + where + why = ptext SLIT("You need -fglasgow-exts to derive an instance for this class") +\end{code} + +%************************************************************************ +%* * +\subsection[TcDeriv-fixpoint]{Finding the fixed point of \tr{deriving} equations} +%* * +%************************************************************************ + +A ``solution'' (to one of the equations) is a list of (k,TyVarTy tv) +terms, which is the final correct RHS for the corresponding original +equation. +\begin{itemize} +\item +Each (k,TyVarTy tv) in a solution constrains only a type +variable, tv. + +\item +The (k,TyVarTy tv) pairs in a solution are canonically +ordered by sorting on type varible, tv, (major key) and then class, k, +(minor key) +\end{itemize} + +\begin{code} +solveDerivEqns :: OverlapFlag + -> [DerivEqn] + -> TcM [Instance]-- Solns in same order as eqns. + -- This bunch is Absolutely minimal... + +solveDerivEqns overlap_flag orig_eqns + = iterateDeriv 1 initial_solutions + where + -- The initial solutions for the equations claim that each + -- instance has an empty context; this solution is certainly + -- in canonical form. + initial_solutions :: [DerivSoln] + initial_solutions = [ [] | _ <- orig_eqns ] + + ------------------------------------------------------------------ + -- iterateDeriv calculates the next batch of solutions, + -- compares it with the current one; finishes if they are the + -- same, otherwise recurses with the new solutions. + -- It fails if any iteration fails + iterateDeriv :: Int -> [DerivSoln] -> TcM [Instance] + iterateDeriv n current_solns + | n > 20 -- Looks as if we are in an infinite loop + -- This can happen if we have -fallow-undecidable-instances + -- (See TcSimplify.tcSimplifyDeriv.) 
+ = pprPanic "solveDerivEqns: probable loop" + (vcat (map pprDerivEqn orig_eqns) $$ ppr current_solns) + | otherwise + = let + inst_specs = zipWithEqual "add_solns" mk_inst_spec + orig_eqns current_solns + in + checkNoErrs ( + -- Extend the inst info from the explicit instance decls + -- with the current set of solutions, and simplify each RHS + extendLocalInstEnv inst_specs $ + mappM gen_soln orig_eqns + ) `thenM` \ new_solns -> + if (current_solns == new_solns) then + returnM inst_specs + else + iterateDeriv (n+1) new_solns + + ------------------------------------------------------------------ + gen_soln (_, clas, tc,tyvars,deriv_rhs) + = setSrcSpan (srcLocSpan (getSrcLoc tc)) $ + addErrCtxt (derivCtxt (Just clas) tc) $ + tcSimplifyDeriv tc tyvars deriv_rhs `thenM` \ theta -> + returnM (sortLe (<=) theta) -- Canonicalise before returning the soluction + + ------------------------------------------------------------------ + mk_inst_spec (dfun_name, clas, tycon, tyvars, _) theta + = mkLocalInstance dfun overlap_flag + where + dfun = mkDictFunId dfun_name tyvars theta clas + [mkTyConApp tycon (mkTyVarTys tyvars)] + +extendLocalInstEnv :: [Instance] -> TcM a -> TcM a +-- Add new locally-defined instances; don't bother to check +-- for functional dependency errors -- that'll happen in TcInstDcls +extendLocalInstEnv dfuns thing_inside + = do { env <- getGblEnv + ; let inst_env' = extendInstEnvList (tcg_inst_env env) dfuns + env' = env { tcg_inst_env = inst_env' } + ; setGblEnv env' thing_inside } +\end{code} + +%************************************************************************ +%* * +\subsection[TcDeriv-normal-binds]{Bindings for the various classes} +%* * +%************************************************************************ + +After all the trouble to figure out the required context for the +derived instance declarations, all that's left is to chug along to +produce them. They will then be shoved into @tcInstDecls2@, which +will do all its usual business. + +There are lots of possibilities for code to generate. Here are +various general remarks. + +PRINCIPLES: +\begin{itemize} +\item +We want derived instances of @Eq@ and @Ord@ (both v common) to be +``you-couldn't-do-better-by-hand'' efficient. + +\item +Deriving @Show@---also pretty common--- should also be reasonable good code. + +\item +Deriving for the other classes isn't that common or that big a deal. +\end{itemize} + +PRAGMATICS: + +\begin{itemize} +\item +Deriving @Ord@ is done mostly with the 1.3 @compare@ method. + +\item +Deriving @Eq@ also uses @compare@, if we're deriving @Ord@, too. + +\item +We {\em normally} generate code only for the non-defaulted methods; +there are some exceptions for @Eq@ and (especially) @Ord@... + +\item +Sometimes we use a @_con2tag_<tycon>@ function, which returns a data +constructor's numeric (@Int#@) tag. These are generated by +@gen_tag_n_con_binds@, and the heuristic for deciding if one of +these is around is given by @hasCon2TagFun@. + +The examples under the different sections below will make this +clearer. + +\item +Much less often (really just for deriving @Ix@), we use a +@_tag2con_<tycon>@ function. See the examples. + +\item +We use the renamer!!! Reason: we're supposed to be +producing @LHsBinds Name@ for the methods, but that means +producing correctly-uniquified code on the fly. This is entirely +possible (the @TcM@ monad has a @UniqueSupply@), but it is painful. +So, instead, we produce @MonoBinds RdrName@ then heave 'em through +the renamer. What a great hack! 
+\end{itemize} + +\begin{code} +-- Generate the InstInfo for the required instance, +-- plus any auxiliary bindings required +genInst :: Instance -> TcM (InstInfo, LHsBinds RdrName) +genInst spec + = do { fix_env <- getFixityEnv + ; let + (tyvars,_,clas,[ty]) = instanceHead spec + clas_nm = className clas + tycon = tcTyConAppTyCon ty + (meth_binds, aux_binds) = genDerivBinds clas fix_env tycon + + -- Bring the right type variables into + -- scope, and rename the method binds + -- It's a bit yukky that we return *renamed* InstInfo, but + -- *non-renamed* auxiliary bindings + ; (rn_meth_binds, _fvs) <- discardWarnings $ + bindLocalNames (map varName tyvars) $ + rnMethodBinds clas_nm [] meth_binds + + -- Build the InstInfo + ; return (InstInfo { iSpec = spec, + iBinds = VanillaInst rn_meth_binds [] }, + aux_binds) + } + +genDerivBinds clas fix_env tycon + | className clas `elem` typeableClassNames + = (gen_Typeable_binds tycon, emptyLHsBinds) + + | otherwise + = case assocMaybe gen_list (getUnique clas) of + Just gen_fn -> gen_fn fix_env tycon + Nothing -> pprPanic "genDerivBinds: bad derived class" (ppr clas) + where + gen_list :: [(Unique, FixityEnv -> TyCon -> (LHsBinds RdrName, LHsBinds RdrName))] + gen_list = [(eqClassKey, no_aux_binds (ignore_fix_env gen_Eq_binds)) + ,(ordClassKey, no_aux_binds (ignore_fix_env gen_Ord_binds)) + ,(enumClassKey, no_aux_binds (ignore_fix_env gen_Enum_binds)) + ,(boundedClassKey, no_aux_binds (ignore_fix_env gen_Bounded_binds)) + ,(ixClassKey, no_aux_binds (ignore_fix_env gen_Ix_binds)) + ,(typeableClassKey,no_aux_binds (ignore_fix_env gen_Typeable_binds)) + ,(showClassKey, no_aux_binds gen_Show_binds) + ,(readClassKey, no_aux_binds gen_Read_binds) + ,(dataClassKey, gen_Data_binds) + ] + + -- no_aux_binds is used for generators that don't + -- need to produce any auxiliary bindings + no_aux_binds f fix_env tc = (f fix_env tc, emptyLHsBinds) + ignore_fix_env f fix_env tc = f tc +\end{code} + + +%************************************************************************ +%* * +\subsection[TcDeriv-taggery-Names]{What con2tag/tag2con functions are available?} +%* * +%************************************************************************ + + +data Foo ... = ... + +con2tag_Foo :: Foo ... -> Int# +tag2con_Foo :: Int -> Foo ... -- easier if Int, not Int# +maxtag_Foo :: Int -- ditto (NB: not unlifted) + + +We have a @con2tag@ function for a tycon if: +\begin{itemize} +\item +We're deriving @Eq@ and the tycon has nullary data constructors. + +\item +Or: we're deriving @Ord@ (unless single-constructor), @Enum@, @Ix@ +(enum type only????) +\end{itemize} + +We have a @tag2con@ function for a tycon if: +\begin{itemize} +\item +We're deriving @Enum@, or @Ix@ (enum type only???) +\end{itemize} + +If we have a @tag2con@ function, we also generate a @maxtag@ constant. 
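+
+For a small enumeration the generated auxiliaries behave as if written by hand
+like this (an illustrative sketch only; the real bindings come from
+@gen_tag_n_con_monobind@ in @TcGenDeriv@):
+
+	data Colour = Red | Green | Blue  deriving( Enum, Ix )
+
+	con2tag_Colour Red   = 0#
+	con2tag_Colour Green = 1#
+	con2tag_Colour Blue  = 2#
+
+	tag2con_Colour 0 = Red
+	tag2con_Colour 1 = Green
+	tag2con_Colour 2 = Blue
+
+	maxtag_Colour = 2	-- the largest tag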
+ +\begin{code} +genTaggeryBinds :: [InstInfo] -> TcM (LHsBinds RdrName) +genTaggeryBinds infos + = do { names_so_far <- foldlM do_con2tag [] tycons_of_interest + ; nm_alist_etc <- foldlM do_tag2con names_so_far tycons_of_interest + ; return (listToBag (map gen_tag_n_con_monobind nm_alist_etc)) } + where + all_CTs = [ (cls, tcTyConAppTyCon ty) + | info <- infos, + let (cls,ty) = simpleInstInfoClsTy info ] + all_tycons = map snd all_CTs + (tycons_of_interest, _) = removeDups compare all_tycons + + do_con2tag acc_Names tycon + | isDataTyCon tycon && + ((we_are_deriving eqClassKey tycon + && any isNullarySrcDataCon (tyConDataCons tycon)) + || (we_are_deriving ordClassKey tycon + && not (isProductTyCon tycon)) + || (we_are_deriving enumClassKey tycon) + || (we_are_deriving ixClassKey tycon)) + + = returnM ((con2tag_RDR tycon, tycon, GenCon2Tag) + : acc_Names) + | otherwise + = returnM acc_Names + + do_tag2con acc_Names tycon + | isDataTyCon tycon && + (we_are_deriving enumClassKey tycon || + we_are_deriving ixClassKey tycon + && isEnumerationTyCon tycon) + = returnM ( (tag2con_RDR tycon, tycon, GenTag2Con) + : (maxtag_RDR tycon, tycon, GenMaxTag) + : acc_Names) + | otherwise + = returnM acc_Names + + we_are_deriving clas_key tycon + = is_in_eqns clas_key tycon all_CTs + where + is_in_eqns clas_key tycon [] = False + is_in_eqns clas_key tycon ((c,t):cts) + = (clas_key == classKey c && tycon == t) + || is_in_eqns clas_key tycon cts +\end{code} + +\begin{code} +derivingThingErr clas tys tycon tyvars why + = sep [hsep [ptext SLIT("Can't make a derived instance of"), quotes (ppr pred)], + parens why] + where + pred = mkClassPred clas (tys ++ [mkTyConApp tycon (mkTyVarTys tyvars)]) + +derivCtxt :: Maybe Class -> TyCon -> SDoc +derivCtxt maybe_cls tycon + = ptext SLIT("When deriving") <+> cls <+> ptext SLIT("for type") <+> quotes (ppr tycon) + where + cls = case maybe_cls of + Nothing -> ptext SLIT("instances") + Just c -> ptext SLIT("the") <+> quotes (ppr c) <+> ptext SLIT("instance") +\end{code} + diff --git a/compiler/typecheck/TcEnv.lhs b/compiler/typecheck/TcEnv.lhs new file mode 100644 index 0000000000..497ba235da --- /dev/null +++ b/compiler/typecheck/TcEnv.lhs @@ -0,0 +1,628 @@ +\begin{code} +module TcEnv( + TyThing(..), TcTyThing(..), TcId, + + -- Instance environment, and InstInfo type + InstInfo(..), iDFunId, pprInstInfo, pprInstInfoDetails, + simpleInstInfoClsTy, simpleInstInfoTy, simpleInstInfoTyCon, + InstBindings(..), + + -- Global environment + tcExtendGlobalEnv, + tcExtendGlobalValEnv, + tcLookupLocatedGlobal, tcLookupGlobal, + tcLookupGlobalId, tcLookupTyCon, tcLookupClass, tcLookupDataCon, + tcLookupLocatedGlobalId, tcLookupLocatedTyCon, + tcLookupLocatedClass, + + -- Local environment + tcExtendKindEnv, tcExtendKindEnvTvs, + tcExtendTyVarEnv, tcExtendTyVarEnv2, + tcExtendIdEnv, tcExtendIdEnv1, tcExtendIdEnv2, + tcLookup, tcLookupLocated, tcLookupLocalIds, tcLookupLocalId_maybe, + tcLookupId, tcLookupTyVar, getScopedTyVarBinds, + lclEnvElts, getInLocalScope, findGlobals, + wrongThingErr, pprBinders, + refineEnvironment, + + tcExtendRecEnv, -- For knot-tying + + -- Rules + tcExtendRules, + + -- Global type variables + tcGetGlobalTyVars, + + -- Template Haskell stuff + checkWellStaged, spliceOK, bracketOK, tcMetaTy, thLevel, + topIdLvl, + + -- New Ids + newLocalName, newDFunName + ) where + +#include "HsVersions.h" + +import HsSyn ( LRuleDecl, LHsBinds, LSig, + LHsTyVarBndr, HsTyVarBndr(..), pprLHsBinds ) +import TcIface ( tcImportDecl ) +import IfaceEnv ( newGlobalBinder ) +import 
TcRnMonad +import TcMType ( zonkTcType, zonkTcTyVarsAndFV ) +import TcType ( Type, TcKind, TcTyVar, TcTyVarSet, TcType, TvSubst, + substTy, substTyVar, tyVarsOfType, tcTyVarsOfTypes, mkTyConApp, + getDFunTyKey, tcTyConAppTyCon, tcGetTyVar, mkTyVarTy, + tidyOpenType, isRefineableTy + ) +import qualified Type ( getTyVar_maybe ) +import Id ( idName, isLocalId, setIdType ) +import Var ( TyVar, Id, idType, tyVarName ) +import VarSet +import VarEnv +import RdrName ( extendLocalRdrEnv ) +import InstEnv ( Instance, DFunId, instanceDFunId, instanceHead ) +import DataCon ( DataCon ) +import TyCon ( TyCon ) +import Class ( Class ) +import Name ( Name, NamedThing(..), getSrcLoc, nameModule, isExternalName ) +import PrelNames ( thFAKE ) +import NameEnv +import OccName ( mkDFunOcc, occNameString ) +import HscTypes ( extendTypeEnvList, lookupType, + TyThing(..), tyThingId, tyThingDataCon, + ExternalPackageState(..) ) + +import SrcLoc ( SrcLoc, Located(..) ) +import Outputable +\end{code} + + +%************************************************************************ +%* * +%* tcLookupGlobal * +%* * +%************************************************************************ + +Using the Located versions (eg. tcLookupLocatedGlobal) is preferred, +unless you know that the SrcSpan in the monad is already set to the +span of the Name. + +\begin{code} +tcLookupLocatedGlobal :: Located Name -> TcM TyThing +-- c.f. IfaceEnvEnv.tcIfaceGlobal +tcLookupLocatedGlobal name + = addLocM tcLookupGlobal name + +tcLookupGlobal :: Name -> TcM TyThing +-- The Name is almost always an ExternalName, but not always +-- In GHCi, we may make command-line bindings (ghci> let x = True) +-- that bind a GlobalId, but with an InternalName +tcLookupGlobal name + = do { env <- getGblEnv + + -- Try local envt + ; case lookupNameEnv (tcg_type_env env) name of { + Just thing -> return thing ; + Nothing -> do + + -- Try global envt + { (eps,hpt) <- getEpsAndHpt + ; case lookupType hpt (eps_PTE eps) name of { + Just thing -> return thing ; + Nothing -> do + + -- Should it have been in the local envt? 
+ { let mod = nameModule name + ; if mod == tcg_mod env || mod == thFAKE then + notFound name -- It should be local, so panic + -- The thFAKE possibility is because it + -- might be in a declaration bracket + else + tcImportDecl name -- Go find it in an interface + }}}}} + +tcLookupGlobalId :: Name -> TcM Id +-- Never used for Haskell-source DataCons, hence no ADataCon case +tcLookupGlobalId name + = tcLookupGlobal name `thenM` \ thing -> + return (tyThingId thing) + +tcLookupDataCon :: Name -> TcM DataCon +tcLookupDataCon con_name + = tcLookupGlobal con_name `thenM` \ thing -> + return (tyThingDataCon thing) + +tcLookupClass :: Name -> TcM Class +tcLookupClass name + = tcLookupGlobal name `thenM` \ thing -> + case thing of + AClass cls -> return cls + other -> wrongThingErr "class" (AGlobal thing) name + +tcLookupTyCon :: Name -> TcM TyCon +tcLookupTyCon name + = tcLookupGlobal name `thenM` \ thing -> + case thing of + ATyCon tc -> return tc + other -> wrongThingErr "type constructor" (AGlobal thing) name + +tcLookupLocatedGlobalId :: Located Name -> TcM Id +tcLookupLocatedGlobalId = addLocM tcLookupId + +tcLookupLocatedClass :: Located Name -> TcM Class +tcLookupLocatedClass = addLocM tcLookupClass + +tcLookupLocatedTyCon :: Located Name -> TcM TyCon +tcLookupLocatedTyCon = addLocM tcLookupTyCon +\end{code} + +%************************************************************************ +%* * + Extending the global environment +%* * +%************************************************************************ + + +\begin{code} +tcExtendGlobalEnv :: [TyThing] -> TcM r -> TcM r + -- Given a mixture of Ids, TyCons, Classes, all from the + -- module being compiled, extend the global environment +tcExtendGlobalEnv things thing_inside + = do { env <- getGblEnv + ; let ge' = extendTypeEnvList (tcg_type_env env) things + ; setGblEnv (env {tcg_type_env = ge'}) thing_inside } + +tcExtendGlobalValEnv :: [Id] -> TcM a -> TcM a + -- Same deal as tcExtendGlobalEnv, but for Ids +tcExtendGlobalValEnv ids thing_inside + = tcExtendGlobalEnv [AnId id | id <- ids] thing_inside +\end{code} + +\begin{code} +tcExtendRecEnv :: [(Name,TyThing)] -> TcM r -> TcM r +-- Extend the global environments for the type/class knot tying game +tcExtendRecEnv gbl_stuff thing_inside + = updGblEnv upd thing_inside + where + upd env = env { tcg_type_env = extend (tcg_type_env env) } + extend env = extendNameEnvList env gbl_stuff +\end{code} + + +%************************************************************************ +%* * +\subsection{The local environment} +%* * +%************************************************************************ + +\begin{code} +tcLookupLocated :: Located Name -> TcM TcTyThing +tcLookupLocated = addLocM tcLookup + +tcLookup :: Name -> TcM TcTyThing +tcLookup name + = getLclEnv `thenM` \ local_env -> + case lookupNameEnv (tcl_env local_env) name of + Just thing -> returnM thing + Nothing -> tcLookupGlobal name `thenM` \ thing -> + returnM (AGlobal thing) + +tcLookupTyVar :: Name -> TcM TcTyVar +tcLookupTyVar name + = tcLookup name `thenM` \ thing -> + case thing of + ATyVar _ ty -> return (tcGetTyVar "tcLookupTyVar" ty) + other -> pprPanic "tcLookupTyVar" (ppr name) + +tcLookupId :: Name -> TcM Id +-- Used when we aren't interested in the binding level +-- Never a DataCon. (Why does that matter? 
see TcExpr.tcId) +tcLookupId name + = tcLookup name `thenM` \ thing -> + case thing of + ATcId tc_id _ _ -> returnM tc_id + AGlobal (AnId id) -> returnM id + other -> pprPanic "tcLookupId" (ppr name) + +tcLookupLocalId_maybe :: Name -> TcM (Maybe Id) +tcLookupLocalId_maybe name + = getLclEnv `thenM` \ local_env -> + case lookupNameEnv (tcl_env local_env) name of + Just (ATcId tc_id _ _) -> return (Just tc_id) + other -> return Nothing + +tcLookupLocalIds :: [Name] -> TcM [TcId] +-- We expect the variables to all be bound, and all at +-- the same level as the lookup. Only used in one place... +tcLookupLocalIds ns + = getLclEnv `thenM` \ env -> + returnM (map (lookup (tcl_env env) (thLevel (tcl_th_ctxt env))) ns) + where + lookup lenv lvl name + = case lookupNameEnv lenv name of + Just (ATcId id lvl1 _) -> ASSERT( lvl == lvl1 ) id + other -> pprPanic "tcLookupLocalIds" (ppr name) + +lclEnvElts :: TcLclEnv -> [TcTyThing] +lclEnvElts env = nameEnvElts (tcl_env env) + +getInLocalScope :: TcM (Name -> Bool) + -- Ids only +getInLocalScope = getLclEnv `thenM` \ env -> + let + lcl_env = tcl_env env + in + return (`elemNameEnv` lcl_env) +\end{code} + +\begin{code} +tcExtendKindEnv :: [(Name, TcKind)] -> TcM r -> TcM r +tcExtendKindEnv things thing_inside + = updLclEnv upd thing_inside + where + upd lcl_env = lcl_env { tcl_env = extend (tcl_env lcl_env) } + extend env = extendNameEnvList env [(n, AThing k) | (n,k) <- things] + +tcExtendKindEnvTvs :: [LHsTyVarBndr Name] -> TcM r -> TcM r +tcExtendKindEnvTvs bndrs thing_inside + = updLclEnv upd thing_inside + where + upd lcl_env = lcl_env { tcl_env = extend (tcl_env lcl_env) } + extend env = extendNameEnvList env pairs + pairs = [(n, AThing k) | L _ (KindedTyVar n k) <- bndrs] + +tcExtendTyVarEnv :: [TyVar] -> TcM r -> TcM r +tcExtendTyVarEnv tvs thing_inside + = tcExtendTyVarEnv2 [(tyVarName tv, mkTyVarTy tv) | tv <- tvs] thing_inside + +tcExtendTyVarEnv2 :: [(Name,TcType)] -> TcM r -> TcM r +tcExtendTyVarEnv2 binds thing_inside + = getLclEnv `thenM` \ env@(TcLclEnv {tcl_env = le, + tcl_tyvars = gtvs, + tcl_rdr = rdr_env}) -> + let + rdr_env' = extendLocalRdrEnv rdr_env (map fst binds) + new_tv_set = tcTyVarsOfTypes (map snd binds) + le' = extendNameEnvList le [(name, ATyVar name ty) | (name, ty) <- binds] + in + -- It's important to add the in-scope tyvars to the global tyvar set + -- as well. Consider + -- f (_::r) = let g y = y::r in ... + -- Here, g mustn't be generalised. This is also important during + -- class and instance decls, when we mustn't generalise the class tyvars + -- when typechecking the methods. + tc_extend_gtvs gtvs new_tv_set `thenM` \ gtvs' -> + setLclEnv (env {tcl_env = le', tcl_tyvars = gtvs', tcl_rdr = rdr_env'}) thing_inside + +getScopedTyVarBinds :: TcM [(Name, TcType)] +getScopedTyVarBinds + = do { lcl_env <- getLclEnv + ; return [(name, ty) | ATyVar name ty <- nameEnvElts (tcl_env lcl_env)] } +\end{code} + + +\begin{code} +tcExtendIdEnv :: [TcId] -> TcM a -> TcM a +-- Invariant: the TcIds are fully zonked. 
Reasons: +-- (a) The kinds of the forall'd type variables are defaulted +-- (see Kind.defaultKind, done in zonkQuantifiedTyVar) +-- (b) There are no via-Indirect occurrences of the bound variables +-- in the types, because instantiation does not look through such things +-- (c) The call to tyVarsOfTypes is ok without looking through refs +tcExtendIdEnv ids thing_inside = tcExtendIdEnv2 [(idName id, id) | id <- ids] thing_inside + +tcExtendIdEnv1 :: Name -> TcId -> TcM a -> TcM a +tcExtendIdEnv1 name id thing_inside = tcExtendIdEnv2 [(name,id)] thing_inside + +tcExtendIdEnv2 :: [(Name,TcId)] -> TcM a -> TcM a +-- Invariant: the TcIds are fully zonked (see tcExtendIdEnv above) +tcExtendIdEnv2 names_w_ids thing_inside + = getLclEnv `thenM` \ env -> + let + extra_global_tyvars = tcTyVarsOfTypes [idType id | (_,id) <- names_w_ids] + th_lvl = thLevel (tcl_th_ctxt env) + extra_env = [ (name, ATcId id th_lvl (isRefineableTy (idType id))) + | (name,id) <- names_w_ids] + le' = extendNameEnvList (tcl_env env) extra_env + rdr_env' = extendLocalRdrEnv (tcl_rdr env) [name | (name,_) <- names_w_ids] + in + traceTc (text "env2") `thenM_` + traceTc (text "env3" <+> ppr extra_env) `thenM_` + tc_extend_gtvs (tcl_tyvars env) extra_global_tyvars `thenM` \ gtvs' -> + (traceTc (text "env4") `thenM_` + setLclEnv (env {tcl_env = le', tcl_tyvars = gtvs', tcl_rdr = rdr_env'}) thing_inside) +\end{code} + + +\begin{code} +----------------------- +-- findGlobals looks at the value environment and finds values +-- whose types mention the offending type variable. It has to be +-- careful to zonk the Id's type first, so it has to be in the monad. +-- We must be careful to pass it a zonked type variable, too. + +findGlobals :: TcTyVarSet + -> TidyEnv + -> TcM (TidyEnv, [SDoc]) + +findGlobals tvs tidy_env + = getLclEnv `thenM` \ lcl_env -> + go tidy_env [] (lclEnvElts lcl_env) + where + go tidy_env acc [] = returnM (tidy_env, acc) + go tidy_env acc (thing : things) + = find_thing ignore_it tidy_env thing `thenM` \ (tidy_env1, maybe_doc) -> + case maybe_doc of + Just d -> go tidy_env1 (d:acc) things + Nothing -> go tidy_env1 acc things + + ignore_it ty = not (tvs `intersectsVarSet` tyVarsOfType ty) + +----------------------- +find_thing ignore_it tidy_env (ATcId id _ _) + = zonkTcType (idType id) `thenM` \ id_ty -> + if ignore_it id_ty then + returnM (tidy_env, Nothing) + else let + (tidy_env', tidy_ty) = tidyOpenType tidy_env id_ty + msg = sep [ppr id <+> dcolon <+> ppr tidy_ty, + nest 2 (parens (ptext SLIT("bound at") <+> + ppr (getSrcLoc id)))] + in + returnM (tidy_env', Just msg) + +find_thing ignore_it tidy_env (ATyVar tv ty) + = zonkTcType ty `thenM` \ tv_ty -> + if ignore_it tv_ty then + returnM (tidy_env, Nothing) + else let + -- The name tv is scoped, so we don't need to tidy it + (tidy_env1, tidy_ty) = tidyOpenType tidy_env tv_ty + msg = sep [ptext SLIT("Scoped type variable") <+> quotes (ppr tv) <+> eq_stuff, nest 2 bound_at] + + eq_stuff | Just tv' <- Type.getTyVar_maybe tv_ty, + getOccName tv == getOccName tv' = empty + | otherwise = equals <+> ppr tidy_ty + -- It's ok to use Type.getTyVar_maybe because ty is zonked by now + bound_at = parens $ ptext SLIT("bound at:") <+> ppr (getSrcLoc tv) + in + returnM (tidy_env1, Just msg) +\end{code} + +\begin{code} +refineEnvironment :: TvSubst -> TcM a -> TcM a +refineEnvironment reft thing_inside + = do { env <- getLclEnv + ; let le' = mapNameEnv refine (tcl_env env) + ; gtvs' <- refineGlobalTyVars reft (tcl_tyvars env) + ; setLclEnv (env {tcl_env = le', tcl_tyvars = 
gtvs'}) thing_inside } + where + refine (ATcId id lvl True) = ATcId (setIdType id (substTy reft (idType id))) lvl True + refine (ATyVar tv ty) = ATyVar tv (substTy reft ty) + refine elt = elt +\end{code} + +%************************************************************************ +%* * +\subsection{The global tyvars} +%* * +%************************************************************************ + +\begin{code} +tc_extend_gtvs gtvs extra_global_tvs + = readMutVar gtvs `thenM` \ global_tvs -> + newMutVar (global_tvs `unionVarSet` extra_global_tvs) + +refineGlobalTyVars :: GadtRefinement -> TcRef TcTyVarSet -> TcM (TcRef TcTyVarSet) +refineGlobalTyVars reft gtv_var + = readMutVar gtv_var `thenM` \ gbl_tvs -> + newMutVar (tcTyVarsOfTypes (map (substTyVar reft) (varSetElems gbl_tvs))) +\end{code} + +@tcGetGlobalTyVars@ returns a fully-zonked set of tyvars free in the environment. +To improve subsequent calls to the same function it writes the zonked set back into +the environment. + +\begin{code} +tcGetGlobalTyVars :: TcM TcTyVarSet +tcGetGlobalTyVars + = getLclEnv `thenM` \ (TcLclEnv {tcl_tyvars = gtv_var}) -> + readMutVar gtv_var `thenM` \ gbl_tvs -> + zonkTcTyVarsAndFV (varSetElems gbl_tvs) `thenM` \ gbl_tvs' -> + writeMutVar gtv_var gbl_tvs' `thenM_` + returnM gbl_tvs' +\end{code} + + +%************************************************************************ +%* * +\subsection{Rules} +%* * +%************************************************************************ + +\begin{code} +tcExtendRules :: [LRuleDecl Id] -> TcM a -> TcM a + -- Just pop the new rules into the EPS and envt resp + -- All the rules come from an interface file, not soruce + -- Nevertheless, some may be for this module, if we read + -- its interface instead of its source code +tcExtendRules lcl_rules thing_inside + = do { env <- getGblEnv + ; let + env' = env { tcg_rules = lcl_rules ++ tcg_rules env } + ; setGblEnv env' thing_inside } +\end{code} + + +%************************************************************************ +%* * + Meta level +%* * +%************************************************************************ + +\begin{code} +instance Outputable ThStage where + ppr Comp = text "Comp" + ppr (Brack l _ _) = text "Brack" <+> int l + ppr (Splice l) = text "Splice" <+> int l + + +thLevel :: ThStage -> ThLevel +thLevel Comp = topLevel +thLevel (Splice l) = l +thLevel (Brack l _ _) = l + + +checkWellStaged :: SDoc -- What the stage check is for + -> ThLevel -- Binding level + -> ThStage -- Use stage + -> TcM () -- Fail if badly staged, adding an error +checkWellStaged pp_thing bind_lvl use_stage + | bind_lvl <= use_lvl -- OK! + = returnM () + + | bind_lvl == topLevel -- GHC restriction on top level splices + = failWithTc $ + sep [ptext SLIT("GHC stage restriction:") <+> pp_thing, + nest 2 (ptext SLIT("is used in a top-level splice, and must be imported, not defined locally"))] + + | otherwise -- Badly staged + = failWithTc $ + ptext SLIT("Stage error:") <+> pp_thing <+> + hsep [ptext SLIT("is bound at stage") <+> ppr bind_lvl, + ptext SLIT("but used at stage") <+> ppr use_lvl] + where + use_lvl = thLevel use_stage + + +topIdLvl :: Id -> ThLevel +-- Globals may either be imported, or may be from an earlier "chunk" +-- (separated by declaration splices) of this module. The former +-- *can* be used inside a top-level splice, but the latter cannot. +-- Hence we give the former impLevel, but the latter topLevel +-- E.g. 
this is bad: +-- x = [| foo |] +-- $( f x ) +-- By the time we are prcessing the $(f x), the binding for "x" +-- will be in the global env, not the local one. +topIdLvl id | isLocalId id = topLevel + | otherwise = impLevel + +-- Indicates the legal transitions on bracket( [| |] ). +bracketOK :: ThStage -> Maybe ThLevel +bracketOK (Brack _ _ _) = Nothing -- Bracket illegal inside a bracket +bracketOK stage = Just (thLevel stage + 1) + +-- Indicates the legal transitions on splice($). +spliceOK :: ThStage -> Maybe ThLevel +spliceOK (Splice _) = Nothing -- Splice illegal inside splice +spliceOK stage = Just (thLevel stage - 1) + +tcMetaTy :: Name -> TcM Type +-- Given the name of a Template Haskell data type, +-- return the type +-- E.g. given the name "Expr" return the type "Expr" +tcMetaTy tc_name + = tcLookupTyCon tc_name `thenM` \ t -> + returnM (mkTyConApp t []) +\end{code} + + +%************************************************************************ +%* * +\subsection{The InstInfo type} +%* * +%************************************************************************ + +The InstInfo type summarises the information in an instance declaration + + instance c => k (t tvs) where b + +It is used just for *local* instance decls (not ones from interface files). +But local instance decls includes + - derived ones + - generic ones +as well as explicit user written ones. + +\begin{code} +data InstInfo + = InstInfo { + iSpec :: Instance, -- Includes the dfun id. Its forall'd type + iBinds :: InstBindings -- variables scope over the stuff in InstBindings! + } + +iDFunId :: InstInfo -> DFunId +iDFunId info = instanceDFunId (iSpec info) + +data InstBindings + = VanillaInst -- The normal case + (LHsBinds Name) -- Bindings + [LSig Name] -- User pragmas recorded for generating + -- specialised instances + + | NewTypeDerived -- Used for deriving instances of newtypes, where the + [Type] -- witness dictionary is identical to the argument + -- dictionary. Hence no bindings, no pragmas + -- The [Type] are the representation types + -- See notes in TcDeriv + +pprInstInfo info = vcat [ptext SLIT("InstInfo:") <+> ppr (idType (iDFunId info))] + +pprInstInfoDetails info = pprInstInfo info $$ nest 2 (details (iBinds info)) + where + details (VanillaInst b _) = pprLHsBinds b + details (NewTypeDerived _) = text "Derived from the representation type" + +simpleInstInfoClsTy :: InstInfo -> (Class, Type) +simpleInstInfoClsTy info = case instanceHead (iSpec info) of + (_, _, cls, [ty]) -> (cls, ty) + +simpleInstInfoTy :: InstInfo -> Type +simpleInstInfoTy info = snd (simpleInstInfoClsTy info) + +simpleInstInfoTyCon :: InstInfo -> TyCon + -- Gets the type constructor for a simple instance declaration, + -- i.e. one of the form instance (...) => C (T a b c) where ... +simpleInstInfoTyCon inst = tcTyConAppTyCon (simpleInstInfoTy inst) +\end{code} + +Make a name for the dict fun for an instance decl. It's an *external* +name, like otber top-level names, and hence must be made with newGlobalBinder. 
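+(Illustrative example only: for an instance like "instance Eq [a]" the
+info string below is built from the class OccName and the key of the
+first argument type, roughly "Eq" ++ "[]"; mkDFunOcc then combines it
+with the per-module index from nextDFunIndex so that otherwise identical
+instances still get distinct dfun names.)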
+ +\begin{code} +newDFunName :: Class -> [Type] -> SrcLoc -> TcM Name +newDFunName clas (ty:_) loc + = do { index <- nextDFunIndex + ; is_boot <- tcIsHsBoot + ; mod <- getModule + ; let info_string = occNameString (getOccName clas) ++ + occNameString (getDFunTyKey ty) + dfun_occ = mkDFunOcc info_string is_boot index + + ; newGlobalBinder mod dfun_occ Nothing loc } + +newDFunName clas [] loc = pprPanic "newDFunName" (ppr clas <+> ppr loc) +\end{code} + + +%************************************************************************ +%* * +\subsection{Errors} +%* * +%************************************************************************ + +\begin{code} +pprBinders :: [Name] -> SDoc +-- Used in error messages +-- Use quotes for a single one; they look a bit "busy" for several +pprBinders [bndr] = quotes (ppr bndr) +pprBinders bndrs = pprWithCommas ppr bndrs + +notFound name + = failWithTc (ptext SLIT("GHC internal error:") <+> quotes (ppr name) <+> + ptext SLIT("is not in scope")) + +wrongThingErr expected thing name + = failWithTc (pprTcTyThingCategory thing <+> quotes (ppr name) <+> + ptext SLIT("used as a") <+> text expected) +\end{code} diff --git a/compiler/typecheck/TcExpr.hi-boot-5 b/compiler/typecheck/TcExpr.hi-boot-5 new file mode 100644 index 0000000000..14714cd2f6 --- /dev/null +++ b/compiler/typecheck/TcExpr.hi-boot-5 @@ -0,0 +1,16 @@ +__interface TcExpr 1 0 where +__export TcExpr tcCheckSigma tcCheckRho tcMonoExpr ; +1 tcCheckSigma :: + HsExpr.LHsExpr Name.Name + -> TcType.TcType + -> TcRnTypes.TcM (HsExpr.LHsExpr Var.Id) ; + +1 tcCheckRho :: + HsExpr.LHsExpr Name.Name + -> TcType.TcType + -> TcRnTypes.TcM (HsExpr.LHsExpr Var.Id) ; + +1 tcMonoExpr :: + HsExpr.LHsExpr Name.Name + -> TcUnify.Expected TcType.TcType + -> TcRnTypes.TcM (HsExpr.LHsExpr Var.Id) ; diff --git a/compiler/typecheck/TcExpr.hi-boot-6 b/compiler/typecheck/TcExpr.hi-boot-6 new file mode 100644 index 0000000000..5a0fa8cd2c --- /dev/null +++ b/compiler/typecheck/TcExpr.hi-boot-6 @@ -0,0 +1,21 @@ +module TcExpr where + +tcPolyExpr :: + HsExpr.LHsExpr Name.Name + -> TcType.BoxySigmaType + -> TcRnTypes.TcM (HsExpr.LHsExpr Var.Id) + +tcMonoExpr :: + HsExpr.LHsExpr Name.Name + -> TcType.BoxyRhoType + -> TcRnTypes.TcM (HsExpr.LHsExpr Var.Id) + +tcInferRho :: + HsExpr.LHsExpr Name.Name + -> TcRnTypes.TcM (HsExpr.LHsExpr Var.Id, TcType.TcType) + +tcSyntaxOp :: + TcRnTypes.InstOrigin + -> HsExpr.HsExpr Name.Name + -> TcType.TcType + -> TcRnTypes.TcM (HsExpr.HsExpr Var.Id) diff --git a/compiler/typecheck/TcExpr.lhs b/compiler/typecheck/TcExpr.lhs new file mode 100644 index 0000000000..a044f43ef2 --- /dev/null +++ b/compiler/typecheck/TcExpr.lhs @@ -0,0 +1,1139 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[TcExpr]{Typecheck an expression} + +\begin{code} +module TcExpr ( tcPolyExpr, tcPolyExprNC, + tcMonoExpr, tcInferRho, tcSyntaxOp ) where + +#include "HsVersions.h" + +#ifdef GHCI /* Only if bootstrapped */ +import {-# SOURCE #-} TcSplice( tcSpliceExpr, tcBracket ) +import HsSyn ( nlHsVar ) +import Id ( Id ) +import Name ( isExternalName ) +import TcType ( isTauTy ) +import TcEnv ( checkWellStaged ) +import HsSyn ( nlHsApp ) +import qualified DsMeta +#endif + +import HsSyn ( HsExpr(..), LHsExpr, ArithSeqInfo(..), recBindFields, + HsMatchContext(..), HsRecordBinds, + mkHsCoerce, mkHsApp, mkHsDictApp, mkHsTyApp ) +import TcHsSyn ( hsLitType ) +import TcRnMonad +import TcUnify ( tcInfer, tcSubExp, tcFunResTy, tcGen, boxyUnify, subFunTys, zapToMonotype, stripBoxyType, + boxySplitListTy, 
boxySplitTyConApp, wrapFunResCoercion, boxySubMatchType, + unBox ) +import BasicTypes ( Arity, isMarkedStrict ) +import Inst ( newMethodFromName, newIPDict, instToId, + newDicts, newMethodWithGivenTy, tcInstStupidTheta ) +import TcBinds ( tcLocalBinds ) +import TcEnv ( tcLookup, tcLookupId, + tcLookupDataCon, tcLookupGlobalId + ) +import TcArrows ( tcProc ) +import TcMatches ( tcMatchesCase, tcMatchLambda, tcDoStmts, TcMatchCtxt(..) ) +import TcHsType ( tcHsSigType, UserTypeCtxt(..) ) +import TcPat ( tcOverloadedLit, badFieldCon ) +import TcMType ( tcInstTyVars, newFlexiTyVarTy, newBoxyTyVars, readFilledBox, + tcInstBoxyTyVar, tcInstTyVar ) +import TcType ( TcType, TcSigmaType, TcRhoType, + BoxySigmaType, BoxyRhoType, ThetaType, + mkTyVarTys, mkFunTys, tcMultiSplitSigmaTy, tcSplitFunTysN, + isSigmaTy, mkFunTy, mkTyConApp, isLinearPred, + exactTyVarsOfType, exactTyVarsOfTypes, mkTyVarTy, + zipTopTvSubst, zipOpenTvSubst, substTys, substTyVar, lookupTyVar + ) +import Kind ( argTypeKind ) + +import Id ( idType, idName, recordSelectorFieldLabel, isRecordSelector, + isNaughtyRecordSelector, isDataConId_maybe ) +import DataCon ( DataCon, dataConFieldLabels, dataConStrictMarks, dataConSourceArity, + dataConWrapId, isVanillaDataCon, dataConTyVars, dataConOrigArgTys ) +import Name ( Name ) +import TyCon ( FieldLabel, tyConStupidTheta, tyConDataCons ) +import Type ( substTheta, substTy ) +import Var ( TyVar, tyVarKind ) +import VarSet ( emptyVarSet, elemVarSet, unionVarSet ) +import TysWiredIn ( boolTy, parrTyCon, tupleTyCon ) +import PrelNames ( enumFromName, enumFromThenName, + enumFromToName, enumFromThenToName, + enumFromToPName, enumFromThenToPName, negateName + ) +import DynFlags +import StaticFlags ( opt_NoMethodSharing ) +import HscTypes ( TyThing(..) ) +import SrcLoc ( Located(..), unLoc, noLoc, getLoc ) +import Util +import ListSetOps ( assocMaybe ) +import Maybes ( catMaybes ) +import Outputable +import FastString + +#ifdef DEBUG +import TyCon ( tyConArity ) +#endif +\end{code} + +%************************************************************************ +%* * +\subsection{Main wrappers} +%* * +%************************************************************************ + +\begin{code} +tcPolyExpr, tcPolyExprNC + :: LHsExpr Name -- Expession to type check + -> BoxySigmaType -- Expected type (could be a polytpye) + -> TcM (LHsExpr TcId) -- Generalised expr with expected type + +-- tcPolyExpr is a convenient place (frequent but not too frequent) place +-- to add context information. +-- The NC version does not do so, usually because the caller wants +-- to do so himself. 
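+-- (For example, tcArg further down pushes its own funAppCtxt with
+-- addErrCtxt and then calls tcPolyExprNC, so an argument is not wrapped
+-- in two layers of context in error messages.)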
+ +tcPolyExpr expr res_ty + = addErrCtxt (exprCtxt (unLoc expr)) $ + tcPolyExprNC expr res_ty + +tcPolyExprNC expr res_ty + | isSigmaTy res_ty + = do { (gen_fn, expr') <- tcGen res_ty emptyVarSet (tcPolyExprNC expr) + -- Note the recursive call to tcPolyExpr, because the + -- type may have multiple layers of for-alls + ; return (L (getLoc expr') (mkHsCoerce gen_fn (unLoc expr'))) } + + | otherwise + = tcMonoExpr expr res_ty + +--------------- +tcPolyExprs :: [LHsExpr Name] -> [TcType] -> TcM [LHsExpr TcId] +tcPolyExprs [] [] = returnM [] +tcPolyExprs (expr:exprs) (ty:tys) + = do { expr' <- tcPolyExpr expr ty + ; exprs' <- tcPolyExprs exprs tys + ; returnM (expr':exprs') } +tcPolyExprs exprs tys = pprPanic "tcPolyExprs" (ppr exprs $$ ppr tys) + +--------------- +tcMonoExpr :: LHsExpr Name -- Expression to type check + -> BoxyRhoType -- Expected type (could be a type variable) + -- Definitely no foralls at the top + -- Can contain boxes, which will be filled in + -> TcM (LHsExpr TcId) + +tcMonoExpr (L loc expr) res_ty + = ASSERT( not (isSigmaTy res_ty) ) + setSrcSpan loc $ + do { expr' <- tcExpr expr res_ty + ; return (L loc expr') } + +--------------- +tcInferRho :: LHsExpr Name -> TcM (LHsExpr TcId, TcRhoType) +tcInferRho expr = tcInfer (tcMonoExpr expr) +\end{code} + + + +%************************************************************************ +%* * + tcExpr: the main expression typechecker +%* * +%************************************************************************ + +\begin{code} +tcExpr :: HsExpr Name -> BoxyRhoType -> TcM (HsExpr TcId) +tcExpr (HsVar name) res_ty = tcId (OccurrenceOf name) name res_ty + +tcExpr (HsLit lit) res_ty = do { boxyUnify (hsLitType lit) res_ty + ; return (HsLit lit) } + +tcExpr (HsPar expr) res_ty = do { expr' <- tcMonoExpr expr res_ty + ; return (HsPar expr') } + +tcExpr (HsSCC lbl expr) res_ty = do { expr' <- tcMonoExpr expr res_ty + ; returnM (HsSCC lbl expr') } + +tcExpr (HsCoreAnn lbl expr) res_ty -- hdaume: core annotation + = do { expr' <- tcMonoExpr expr res_ty + ; return (HsCoreAnn lbl expr') } + +tcExpr (HsOverLit lit) res_ty + = do { lit' <- tcOverloadedLit (LiteralOrigin lit) lit res_ty + ; return (HsOverLit lit') } + +tcExpr (NegApp expr neg_expr) res_ty + = do { neg_expr' <- tcSyntaxOp (OccurrenceOf negateName) neg_expr + (mkFunTy res_ty res_ty) + ; expr' <- tcMonoExpr expr res_ty + ; return (NegApp expr' neg_expr') } + +tcExpr (HsIPVar ip) res_ty + = do { -- Implicit parameters must have a *tau-type* not a + -- type scheme. We enforce this by creating a fresh + -- type variable as its type. (Because res_ty may not + -- be a tau-type.) 
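+          -- (The fresh ip_ty is then related to res_ty by the subsumption
+          --  check (tcSubExp) just below, rather than by using res_ty
+          --  directly, so the implicit parameter gets a tau-type even
+          --  when res_ty is not one.)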
+ ip_ty <- newFlexiTyVarTy argTypeKind -- argTypeKind: it can't be an unboxed tuple + ; co_fn <- tcSubExp ip_ty res_ty + ; (ip', inst) <- newIPDict (IPOccOrigin ip) ip ip_ty + ; extendLIE inst + ; return (mkHsCoerce co_fn (HsIPVar ip')) } + +tcExpr (HsApp e1 e2) res_ty + = go e1 [e2] + where + go :: LHsExpr Name -> [LHsExpr Name] -> TcM (HsExpr TcId) + go (L _ (HsApp e1 e2)) args = go e1 (e2:args) + go lfun@(L loc fun) args + = do { (fun', args') <- addErrCtxt (callCtxt lfun args) $ + tcApp fun (length args) (tcArgs lfun args) res_ty + ; return (unLoc (foldl mkHsApp (L loc fun') args')) } + +tcExpr (HsLam match) res_ty + = do { (co_fn, match') <- tcMatchLambda match res_ty + ; return (mkHsCoerce co_fn (HsLam match')) } + +tcExpr in_expr@(ExprWithTySig expr sig_ty) res_ty + = do { sig_tc_ty <- tcHsSigType ExprSigCtxt sig_ty + ; expr' <- tcPolyExpr expr sig_tc_ty + ; co_fn <- tcSubExp sig_tc_ty res_ty + ; return (mkHsCoerce co_fn (ExprWithTySigOut expr' sig_ty)) } + +tcExpr (HsType ty) res_ty + = failWithTc (text "Can't handle type argument:" <+> ppr ty) + -- This is the syntax for type applications that I was planning + -- but there are difficulties (e.g. what order for type args) + -- so it's not enabled yet. + -- Can't eliminate it altogether from the parser, because the + -- same parser parses *patterns*. +\end{code} + + +%************************************************************************ +%* * + Infix operators and sections +%* * +%************************************************************************ + +\begin{code} +tcExpr in_expr@(OpApp arg1 lop@(L loc op) fix arg2) res_ty + = do { (op', [arg1', arg2']) <- tcApp op 2 (tcArgs lop [arg1,arg2]) res_ty + ; return (OpApp arg1' (L loc op') fix arg2') } + +-- Left sections, equivalent to +-- \ x -> e op x, +-- or +-- \ x -> op e x, +-- or just +-- op e +-- +-- We treat it as similar to the latter, so we don't +-- actually require the function to take two arguments +-- at all. For example, (x `not`) means (not x); +-- you get postfix operators! Not really Haskell 98 +-- I suppose, but it's less work and kind of useful. + +tcExpr in_expr@(SectionL arg1 lop@(L loc op)) res_ty + = do { (op', [arg1']) <- tcApp op 1 (tcArgs lop [arg1]) res_ty + ; return (SectionL arg1' (L loc op')) } + +-- Right sections, equivalent to \ x -> x `op` expr, or +-- \ x -> op x expr + +tcExpr in_expr@(SectionR lop@(L loc op) arg2) res_ty + = do { (co_fn, (op', arg2')) <- subFunTys doc 1 res_ty $ \ [arg1_ty'] res_ty' -> + tcApp op 2 (tc_args arg1_ty') res_ty' + ; return (mkHsCoerce co_fn (SectionR (L loc op') arg2')) } + where + doc = ptext SLIT("The section") <+> quotes (ppr in_expr) + <+> ptext SLIT("takes one argument") + tc_args arg1_ty' [arg1_ty, arg2_ty] + = do { boxyUnify arg1_ty' arg1_ty + ; tcArg lop (arg2, arg2_ty, 2) } +\end{code} + +\begin{code} +tcExpr (HsLet binds expr) res_ty + = do { (binds', expr') <- tcLocalBinds binds $ + tcMonoExpr expr res_ty + ; return (HsLet binds' expr') } + +tcExpr (HsCase scrut matches) exp_ty + = do { -- We used to typecheck the case alternatives first. + -- The case patterns tend to give good type info to use + -- when typechecking the scrutinee. For example + -- case (map f) of + -- (x:xs) -> ... 
+ -- will report that map is applied to too few arguments + -- + -- But now, in the GADT world, we need to typecheck the scrutinee + -- first, to get type info that may be refined in the case alternatives + (scrut', scrut_ty) <- addErrCtxt (caseScrutCtxt scrut) + (tcInferRho scrut) + + ; traceTc (text "HsCase" <+> ppr scrut_ty) + ; matches' <- tcMatchesCase match_ctxt scrut_ty matches exp_ty + ; return (HsCase scrut' matches') } + where + match_ctxt = MC { mc_what = CaseAlt, + mc_body = tcPolyExpr } + +tcExpr (HsIf pred b1 b2) res_ty + = do { pred' <- addErrCtxt (predCtxt pred) $ + tcMonoExpr pred boolTy + ; b1' <- tcMonoExpr b1 res_ty + ; b2' <- tcMonoExpr b2 res_ty + ; return (HsIf pred' b1' b2') } + +tcExpr (HsDo do_or_lc stmts body _) res_ty + = tcDoStmts do_or_lc stmts body res_ty + +tcExpr in_expr@(ExplicitList _ exprs) res_ty -- Non-empty list + = do { elt_ty <- boxySplitListTy res_ty + ; exprs' <- mappM (tc_elt elt_ty) exprs + ; return (ExplicitList elt_ty exprs') } + where + tc_elt elt_ty expr = tcPolyExpr expr elt_ty + +tcExpr in_expr@(ExplicitPArr _ exprs) res_ty -- maybe empty + = do { [elt_ty] <- boxySplitTyConApp parrTyCon res_ty + ; exprs' <- mappM (tc_elt elt_ty) exprs + ; ifM (null exprs) (zapToMonotype elt_ty) + -- If there are no expressions in the comprehension + -- we must still fill in the box + -- (Not needed for [] and () becuase they happen + -- to parse as data constructors.) + ; return (ExplicitPArr elt_ty exprs') } + where + tc_elt elt_ty expr = tcPolyExpr expr elt_ty + +tcExpr (ExplicitTuple exprs boxity) res_ty + = do { arg_tys <- boxySplitTyConApp (tupleTyCon boxity (length exprs)) res_ty + ; exprs' <- tcPolyExprs exprs arg_tys + ; return (ExplicitTuple exprs' boxity) } + +tcExpr (HsProc pat cmd) res_ty + = do { (pat', cmd') <- tcProc pat cmd res_ty + ; return (HsProc pat' cmd') } + +tcExpr e@(HsArrApp _ _ _ _ _) _ + = failWithTc (vcat [ptext SLIT("The arrow command"), nest 2 (ppr e), + ptext SLIT("was found where an expression was expected")]) + +tcExpr e@(HsArrForm _ _ _) _ + = failWithTc (vcat [ptext SLIT("The arrow command"), nest 2 (ppr e), + ptext SLIT("was found where an expression was expected")]) +\end{code} + +%************************************************************************ +%* * + Record construction and update +%* * +%************************************************************************ + +\begin{code} +tcExpr expr@(RecordCon (L loc con_name) _ rbinds) res_ty + = do { data_con <- tcLookupDataCon con_name + + -- Check for missing fields + ; checkMissingFields data_con rbinds + + ; let arity = dataConSourceArity data_con + check_fields arg_tys + = do { rbinds' <- tcRecordBinds data_con arg_tys rbinds + ; mapM unBox arg_tys + ; return rbinds' } + -- The unBox ensures that all the boxes in arg_tys are indeed + -- filled, which is the invariant expected by tcIdApp + + ; (con_expr, rbinds') <- tcIdApp con_name arity check_fields res_ty + + ; returnM (RecordCon (L loc (dataConWrapId data_con)) con_expr rbinds') } + +-- The main complication with RecordUpd is that we need to explicitly +-- handle the *non-updated* fields. Consider: +-- +-- data T a b = MkT1 { fa :: a, fb :: b } +-- | MkT2 { fa :: a, fc :: Int -> Int } +-- | MkT3 { fd :: a } +-- +-- upd :: T a b -> c -> T a c +-- upd t x = t { fb = x} +-- +-- The type signature on upd is correct (i.e. the result should not be (T a b)) +-- because upd should be equivalent to: +-- +-- upd t x = case t of +-- MkT1 p q -> MkT1 p x +-- MkT2 a b -> MkT2 p b +-- MkT3 d -> error ... 
+-- +-- So we need to give a completely fresh type to the result record, +-- and then constrain it by the fields that are *not* updated ("p" above). +-- +-- Note that because MkT3 doesn't contain all the fields being updated, +-- its RHS is simply an error, so it doesn't impose any type constraints +-- +-- All this is done in STEP 4 below. +-- +-- Note about GADTs +-- ~~~~~~~~~~~~~~~~ +-- For record update we require that every constructor involved in the +-- update (i.e. that has all the specified fields) is "vanilla". I +-- don't know how to do the update otherwise. + + +tcExpr expr@(RecordUpd record_expr rbinds _ _) res_ty + = -- STEP 0 + -- Check that the field names are really field names + ASSERT( notNull rbinds ) + let + field_names = map fst rbinds + in + mappM (tcLookupGlobalId.unLoc) field_names `thenM` \ sel_ids -> + -- The renamer has already checked that they + -- are all in scope + let + bad_guys = [ setSrcSpan loc $ addErrTc (notSelector field_name) + | (L loc field_name, sel_id) <- field_names `zip` sel_ids, + not (isRecordSelector sel_id) -- Excludes class ops + ] + in + checkM (null bad_guys) (sequenceM bad_guys `thenM_` failM) `thenM_` + + -- STEP 1 + -- Figure out the tycon and data cons from the first field name + let + -- It's OK to use the non-tc splitters here (for a selector) + upd_field_lbls = recBindFields rbinds + sel_id : _ = sel_ids + (tycon, _) = recordSelectorFieldLabel sel_id -- We've failed already if + data_cons = tyConDataCons tycon -- it's not a field label + relevant_cons = filter is_relevant data_cons + is_relevant con = all (`elem` dataConFieldLabels con) upd_field_lbls + in + + -- STEP 2 + -- Check that at least one constructor has all the named fields + -- i.e. has an empty set of bad fields returned by badFields + checkTc (not (null relevant_cons)) + (badFieldsUpd rbinds) `thenM_` + + -- Check that all relevant data cons are vanilla. Doing record updates on + -- GADTs and/or existentials is more than my tiny brain can cope with today + checkTc (all isVanillaDataCon relevant_cons) + (nonVanillaUpd tycon) `thenM_` + + -- STEP 4 + -- Use the un-updated fields to find a vector of booleans saying + -- which type arguments must be the same in updatee and result. + -- + -- WARNING: this code assumes that all data_cons in a common tycon + -- have FieldLabels abstracted over the same tyvars. + let + -- A constructor is only relevant to this process if + -- it contains *all* the fields that are being updated + con1 = head relevant_cons -- A representative constructor + con1_tyvars = dataConTyVars con1 + con1_flds = dataConFieldLabels con1 + con1_arg_tys = dataConOrigArgTys con1 + common_tyvars = exactTyVarsOfTypes [ty | (fld,ty) <- con1_flds `zip` con1_arg_tys + , not (fld `elem` upd_field_lbls) ] + + is_common_tv tv = tv `elemVarSet` common_tyvars + + mk_inst_ty tv result_inst_ty + | is_common_tv tv = returnM result_inst_ty -- Same as result type + | otherwise = newFlexiTyVarTy (tyVarKind tv) -- Fresh type, of correct kind + in + tcInstTyVars con1_tyvars `thenM` \ (_, result_inst_tys, inst_env) -> + zipWithM mk_inst_ty con1_tyvars result_inst_tys `thenM` \ inst_tys -> + + -- STEP 3 + -- Typecheck the update bindings. + -- (Do this after checking for bad fields in case there's a field that + -- doesn't match the constructor.) 
+ let + result_record_ty = mkTyConApp tycon result_inst_tys + con1_arg_tys' = map (substTy inst_env) con1_arg_tys + in + tcSubExp result_record_ty res_ty `thenM` \ co_fn -> + tcRecordBinds con1 con1_arg_tys' rbinds `thenM` \ rbinds' -> + + -- STEP 5 + -- Typecheck the expression to be updated + let + record_ty = ASSERT( length inst_tys == tyConArity tycon ) + mkTyConApp tycon inst_tys + -- This is one place where the isVanilla check is important + -- So that inst_tys matches the tycon + in + tcMonoExpr record_expr record_ty `thenM` \ record_expr' -> + + -- STEP 6 + -- Figure out the LIE we need. We have to generate some + -- dictionaries for the data type context, since we are going to + -- do pattern matching over the data cons. + -- + -- What dictionaries do we need? + -- We just take the context of the first data constructor + -- This isn't right, but I just can't bear to union up all the relevant ones + let + theta' = substTheta inst_env (tyConStupidTheta tycon) + in + newDicts RecordUpdOrigin theta' `thenM` \ dicts -> + extendLIEs dicts `thenM_` + + -- Phew! + returnM (mkHsCoerce co_fn (RecordUpd record_expr' rbinds' record_ty result_record_ty)) +\end{code} + + +%************************************************************************ +%* * + Arithmetic sequences e.g. [a,b..] + and their parallel-array counterparts e.g. [: a,b.. :] + +%* * +%************************************************************************ + +\begin{code} +tcExpr (ArithSeq _ seq@(From expr)) res_ty + = do { elt_ty <- boxySplitListTy res_ty + ; expr' <- tcPolyExpr expr elt_ty + ; enum_from <- newMethodFromName (ArithSeqOrigin seq) + elt_ty enumFromName + ; return (ArithSeq (HsVar enum_from) (From expr')) } + +tcExpr in_expr@(ArithSeq _ seq@(FromThen expr1 expr2)) res_ty + = do { elt_ty <- boxySplitListTy res_ty + ; expr1' <- tcPolyExpr expr1 elt_ty + ; expr2' <- tcPolyExpr expr2 elt_ty + ; enum_from_then <- newMethodFromName (ArithSeqOrigin seq) + elt_ty enumFromThenName + ; return (ArithSeq (HsVar enum_from_then) (FromThen expr1' expr2')) } + + +tcExpr in_expr@(ArithSeq _ seq@(FromTo expr1 expr2)) res_ty + = do { elt_ty <- boxySplitListTy res_ty + ; expr1' <- tcPolyExpr expr1 elt_ty + ; expr2' <- tcPolyExpr expr2 elt_ty + ; enum_from_to <- newMethodFromName (ArithSeqOrigin seq) + elt_ty enumFromToName + ; return (ArithSeq (HsVar enum_from_to) (FromTo expr1' expr2')) } + +tcExpr in_expr@(ArithSeq _ seq@(FromThenTo expr1 expr2 expr3)) res_ty + = do { elt_ty <- boxySplitListTy res_ty + ; expr1' <- tcPolyExpr expr1 elt_ty + ; expr2' <- tcPolyExpr expr2 elt_ty + ; expr3' <- tcPolyExpr expr3 elt_ty + ; eft <- newMethodFromName (ArithSeqOrigin seq) + elt_ty enumFromThenToName + ; return (ArithSeq (HsVar eft) (FromThenTo expr1' expr2' expr3')) } + +tcExpr in_expr@(PArrSeq _ seq@(FromTo expr1 expr2)) res_ty + = do { [elt_ty] <- boxySplitTyConApp parrTyCon res_ty + ; expr1' <- tcPolyExpr expr1 elt_ty + ; expr2' <- tcPolyExpr expr2 elt_ty + ; enum_from_to <- newMethodFromName (PArrSeqOrigin seq) + elt_ty enumFromToPName + ; return (PArrSeq (HsVar enum_from_to) (FromTo expr1' expr2')) } + +tcExpr in_expr@(PArrSeq _ seq@(FromThenTo expr1 expr2 expr3)) res_ty + = do { [elt_ty] <- boxySplitTyConApp parrTyCon res_ty + ; expr1' <- tcPolyExpr expr1 elt_ty + ; expr2' <- tcPolyExpr expr2 elt_ty + ; expr3' <- tcPolyExpr expr3 elt_ty + ; eft <- newMethodFromName (PArrSeqOrigin seq) + elt_ty enumFromThenToPName + ; return (PArrSeq (HsVar eft) (FromThenTo expr1' expr2' expr3')) } + +tcExpr (PArrSeq _ _) _ + = panic "TcExpr.tcMonoExpr: 
Infinite parallel array!" + -- the parser shouldn't have generated it and the renamer shouldn't have + -- let it through +\end{code} + + +%************************************************************************ +%* * + Template Haskell +%* * +%************************************************************************ + +\begin{code} +#ifdef GHCI /* Only if bootstrapped */ + -- Rename excludes these cases otherwise +tcExpr (HsSpliceE splice) res_ty = tcSpliceExpr splice res_ty +tcExpr (HsBracket brack) res_ty = do { e <- tcBracket brack res_ty + ; return (unLoc e) } +#endif /* GHCI */ +\end{code} + + +%************************************************************************ +%* * + Catch-all +%* * +%************************************************************************ + +\begin{code} +tcExpr other _ = pprPanic "tcMonoExpr" (ppr other) +\end{code} + + +%************************************************************************ +%* * + Applications +%* * +%************************************************************************ + +\begin{code} +--------------------------- +tcApp :: HsExpr Name -- Function + -> Arity -- Number of args reqd + -> ([BoxySigmaType] -> TcM arg_results) -- Argument type-checker + -> BoxyRhoType -- Result type + -> TcM (HsExpr TcId, arg_results) + +-- (tcFun fun n_args arg_checker res_ty) +-- The argument type checker, arg_checker, will be passed exactly n_args types + +tcApp (HsVar fun_name) n_args arg_checker res_ty + = tcIdApp fun_name n_args arg_checker res_ty + +tcApp fun n_args arg_checker res_ty -- The vanilla case (rula APP) + = do { arg_boxes <- newBoxyTyVars (replicate n_args argTypeKind) + ; fun' <- tcExpr fun (mkFunTys (mkTyVarTys arg_boxes) res_ty) + ; arg_tys' <- mapM readFilledBox arg_boxes + ; args' <- arg_checker arg_tys' + ; return (fun', args') } + +--------------------------- +tcIdApp :: Name -- Function + -> Arity -- Number of args reqd + -> ([BoxySigmaType] -> TcM arg_results) -- Argument type-checker + -- The arg-checker guarantees to fill all boxes in the arg types + -> BoxyRhoType -- Result type + -> TcM (HsExpr TcId, arg_results) + +-- Call (f e1 ... en) :: res_ty +-- Type f :: forall a b c. theta => fa_1 -> ... -> fa_k -> fres +-- (where k <= n; fres has the rest) +-- NB: if k < n then the function doesn't have enough args, and +-- presumably fres is a type variable that we are going to +-- instantiate with a function type +-- +-- Then fres <= bx_(k+1) -> ... -> bx_n -> res_ty + +tcIdApp fun_name n_args arg_checker res_ty + = do { fun_id <- lookupFun (OccurrenceOf fun_name) fun_name + + -- Split up the function type + ; let (tv_theta_prs, rho) = tcMultiSplitSigmaTy (idType fun_id) + (fun_arg_tys, fun_res_ty) = tcSplitFunTysN rho n_args + + qtvs = concatMap fst tv_theta_prs -- Quantified tyvars + arg_qtvs = exactTyVarsOfTypes fun_arg_tys + res_qtvs = exactTyVarsOfType fun_res_ty + -- NB: exactTyVarsOfType. See Note [Silly type synonyms in smart-app] + tau_qtvs = arg_qtvs `unionVarSet` res_qtvs + k = length fun_arg_tys -- k <= n_args + n_missing_args = n_args - k -- Always >= 0 + + -- Match the result type of the function with the + -- result type of the context, to get an inital substitution + ; extra_arg_boxes <- newBoxyTyVars (replicate n_missing_args argTypeKind) + ; let extra_arg_tys' = mkTyVarTys extra_arg_boxes + res_ty' = mkFunTys extra_arg_tys' res_ty + subst = boxySubMatchType arg_qtvs fun_res_ty res_ty' + -- Only bind arg_qtvs, since only they will be + -- *definitely* be filled in by arg_checker + -- E.g. error :: forall a. 
String -> a + -- (error "foo") :: bx5 + -- Don't make subst [a |-> bx5] + -- because then the result subsumption becomes + -- bx5 ~ bx5 + -- and the unifer doesn't expect the + -- same box on both sides + inst_qtv tv | Just boxy_ty <- lookupTyVar subst tv = return boxy_ty + | tv `elemVarSet` tau_qtvs = do { tv' <- tcInstBoxyTyVar tv + ; return (mkTyVarTy tv') } + | otherwise = do { tv' <- tcInstTyVar tv + ; return (mkTyVarTy tv') } + -- The 'otherwise' case handles type variables that are + -- mentioned only in the constraints, not in argument or + -- result types. We'll make them tau-types + + ; qtys' <- mapM inst_qtv qtvs + ; let arg_subst = zipOpenTvSubst qtvs qtys' + fun_arg_tys' = substTys arg_subst fun_arg_tys + + -- Typecheck the arguments! + -- Doing so will fill arg_qtvs and extra_arg_tys' + ; args' <- arg_checker (fun_arg_tys' ++ extra_arg_tys') + + ; let strip qtv qty' | qtv `elemVarSet` arg_qtvs = stripBoxyType qty' + | otherwise = return qty' + ; qtys'' <- zipWithM strip qtvs qtys' + ; extra_arg_tys'' <- mapM readFilledBox extra_arg_boxes + + -- Result subsumption + ; let res_subst = zipOpenTvSubst qtvs qtys'' + fun_res_ty'' = substTy res_subst fun_res_ty + res_ty'' = mkFunTys extra_arg_tys'' res_ty + ; co_fn <- tcFunResTy fun_name fun_res_ty'' res_ty'' + + -- And pack up the results + -- By applying the coercion just to the *function* we can make + -- tcFun work nicely for OpApp and Sections too + ; fun' <- instFun fun_id qtvs qtys'' tv_theta_prs + ; co_fn' <- wrapFunResCoercion fun_arg_tys' co_fn + ; return (mkHsCoerce co_fn' fun', args') } +\end{code} + +Note [Silly type synonyms in smart-app] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +When we call sripBoxyType, all of the boxes should be filled +in. But we need to be careful about type synonyms: + type T a = Int + f :: T a -> Int + ...(f x)... +In the call (f x) we'll typecheck x, expecting it to have type +(T box). Usually that would fill in the box, but in this case not; +because 'a' is discarded by the silly type synonym T. So we must +use exactTyVarsOfType to figure out which type variables are free +in the argument type. + +\begin{code} +-- tcId is a specialisation of tcIdApp when there are no arguments +-- tcId f ty = do { (res, _) <- tcIdApp f [] (\[] -> return ()) ty +-- ; return res } + +tcId :: InstOrigin + -> Name -- Function + -> BoxyRhoType -- Result type + -> TcM (HsExpr TcId) +tcId orig fun_name res_ty + = do { traceTc (text "tcId" <+> ppr fun_name <+> ppr res_ty) + ; fun_id <- lookupFun orig fun_name + + -- Split up the function type + ; let (tv_theta_prs, fun_tau) = tcMultiSplitSigmaTy (idType fun_id) + qtvs = concatMap fst tv_theta_prs -- Quantified tyvars + tau_qtvs = exactTyVarsOfType fun_tau -- Mentiond in the tau part + inst_qtv tv | tv `elemVarSet` tau_qtvs = do { tv' <- tcInstBoxyTyVar tv + ; return (mkTyVarTy tv') } + | otherwise = do { tv' <- tcInstTyVar tv + ; return (mkTyVarTy tv') } + + -- Do the subsumption check wrt the result type + ; qtv_tys <- mapM inst_qtv qtvs + ; let res_subst = zipTopTvSubst qtvs qtv_tys + fun_tau' = substTy res_subst fun_tau + + ; co_fn <- tcFunResTy fun_name fun_tau' res_ty + + -- And pack up the results + ; fun' <- instFun fun_id qtvs qtv_tys tv_theta_prs + ; return (mkHsCoerce co_fn fun') } + +-- Note [Push result type in] +-- +-- Unify with expected result before (was: after) type-checking the args +-- so that the info from res_ty (was: args) percolates to args (was actual_res_ty). +-- This is when we might detect a too-few args situation. 
+-- (One can think of cases when the opposite order would give +-- a better error message.) +-- [March 2003: I'm experimenting with putting this first. Here's an +-- example where it actually makes a real difference +-- class C t a b | t a -> b +-- instance C Char a Bool +-- +-- data P t a = forall b. (C t a b) => MkP b +-- data Q t = MkQ (forall a. P t a) + +-- f1, f2 :: Q Char; +-- f1 = MkQ (MkP True) +-- f2 = MkQ (MkP True :: forall a. P Char a) +-- +-- With the change, f1 will type-check, because the 'Char' info from +-- the signature is propagated into MkQ's argument. With the check +-- in the other order, the extra signature in f2 is reqd.] + +--------------------------- +tcSyntaxOp :: InstOrigin -> HsExpr Name -> TcType -> TcM (HsExpr TcId) +-- Typecheck a syntax operator, checking that it has the specified type +-- The operator is always a variable at this stage (i.e. renamer output) +tcSyntaxOp orig (HsVar op) ty = tcId orig op ty +tcSyntaxOp orig other ty = pprPanic "tcSyntaxOp" (ppr other) + +--------------------------- +instFun :: TcId + -> [TyVar] -> [TcType] -- Quantified type variables and + -- their instantiating types + -> [([TyVar], ThetaType)] -- Stuff to instantiate + -> TcM (HsExpr TcId) +instFun fun_id qtvs qtv_tys [] + = return (HsVar fun_id) -- Common short cut + +instFun fun_id qtvs qtv_tys tv_theta_prs + = do { let subst = zipOpenTvSubst qtvs qtv_tys + ty_theta_prs' = map subst_pr tv_theta_prs + subst_pr (tvs, theta) = (map (substTyVar subst) tvs, + substTheta subst theta) + + -- The ty_theta_prs' is always non-empty + ((tys1',theta1') : further_prs') = ty_theta_prs' + + -- First, chuck in the constraints from + -- the "stupid theta" of a data constructor (sigh) + ; case isDataConId_maybe fun_id of + Just con -> tcInstStupidTheta con tys1' + Nothing -> return () + + ; if want_method_inst theta1' + then do { meth_id <- newMethodWithGivenTy orig fun_id tys1' + -- See Note [Multiple instantiation] + ; go (HsVar meth_id) further_prs' } + else go (HsVar fun_id) ty_theta_prs' + } + where + orig = OccurrenceOf (idName fun_id) + + go fun [] = return fun + + go fun ((tys, theta) : prs) + = do { dicts <- newDicts orig theta + ; extendLIEs dicts + ; let the_app = unLoc $ mkHsDictApp (mkHsTyApp (noLoc fun) tys) + (map instToId dicts) + ; go the_app prs } + + -- Hack Alert (want_method_inst)! + -- See Note [No method sharing] + -- If f :: (%x :: T) => Int -> Int + -- Then if we have two separate calls, (f 3, f 4), we cannot + -- make a method constraint that then gets shared, thus: + -- let m = f %x in (m 3, m 4) + -- because that loses the linearity of the constraint. + -- The simplest thing to do is never to construct a method constraint + -- in the first place that has a linear implicit parameter in it. + want_method_inst theta = not (null theta) -- Overloaded + && not (any isLinearPred theta) -- Not linear + && not opt_NoMethodSharing + -- See Note [No method sharing] below +\end{code} + +Note [Multiple instantiation] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +We are careful never to make a MethodInst that has, as its meth_id, another MethodInst. +For example, consider + f :: forall a. Eq a => forall b. Ord b => a -> b +At a call to f, at say [Int, Bool], it's tempting to translate the call to + + f_m1 + where + f_m1 :: forall b. Ord b => Int -> b + f_m1 = f Int dEqInt + + f_m2 :: Int -> Bool + f_m2 = f_m1 Bool dOrdBool + +But notice that f_m2 has f_m1 as its meth_id. 
Now the danger is that if we do +a tcSimplCheck with a Given f_mx :: f Int dEqInt, we may make a binding + f_m1 = f_mx +But it's entirely possible that f_m2 will continue to float out, because it +mentions no type variables. Result, f_m1 isn't in scope. + +Here's a concrete example that does this (test tc200): + + class C a where + f :: Eq b => b -> a -> Int + baz :: Eq a => Int -> a -> Int + + instance C Int where + baz = f + +Current solution: only do the "method sharing" thing for the first type/dict +application, not for the iterated ones. A horribly subtle point. + +Note [No method sharing] +~~~~~~~~~~~~~~~~~~~~~~~~ +The -fno-method-sharing flag controls what happens so far as the LIE +is concerned. The default case is that for an overloaded function we +generate a "method" Id, and add the Method Inst to the LIE. So you get +something like + f :: Num a => a -> a + f = /\a (d:Num a) -> let m = (+) a d in \ (x:a) -> m x x +If you specify -fno-method-sharing, the dictionary application +isn't shared, so we get + f :: Num a => a -> a + f = /\a (d:Num a) (x:a) -> (+) a d x x +This gets a bit less sharing, but + a) it's better for RULEs involving overloaded functions + b) perhaps fewer separated lambdas + +\begin{code} +tcArgs :: LHsExpr Name -- The function (for error messages) + -> [LHsExpr Name] -> [TcSigmaType] -- Actual arguments and expected arg types + -> TcM [LHsExpr TcId] -- Resulting args + +tcArgs fun args expected_arg_tys + = mapM (tcArg fun) (zip3 args expected_arg_tys [1..]) + +tcArg :: LHsExpr Name -- The function (for error messages) + -> (LHsExpr Name, BoxySigmaType, Int) -- Actual argument and expected arg type + -> TcM (LHsExpr TcId) -- Resulting argument +tcArg fun (arg, ty, arg_no) = addErrCtxt (funAppCtxt fun arg arg_no) $ + tcPolyExprNC arg ty +\end{code} + + +%************************************************************************ +%* * +\subsection{@tcId@ typchecks an identifier occurrence} +%* * +%************************************************************************ + +\begin{code} +lookupFun :: InstOrigin -> Name -> TcM TcId +lookupFun orig id_name + = do { thing <- tcLookup id_name + ; case thing of + AGlobal (ADataCon con) -> return (dataConWrapId con) + + AGlobal (AnId id) + | isNaughtyRecordSelector id -> failWithTc (naughtyRecordSel id) + | otherwise -> return id + -- A global cannot possibly be ill-staged + -- nor does it need the 'lifting' treatment + +#ifndef GHCI + ATcId id th_level _ -> return id -- Non-TH case +#else + ATcId id th_level _ -> do { use_stage <- getStage -- TH case + ; thLocalId orig id_name id th_level use_stage } +#endif + + other -> failWithTc (ppr other <+> ptext SLIT("used where a value identifer was expected")) + } + +#ifdef GHCI /* GHCI and TH is on */ +-------------------------------------- +-- thLocalId : Check for cross-stage lifting +thLocalId orig id_name id th_bind_lvl (Brack use_lvl ps_var lie_var) + | use_lvl > th_bind_lvl + = thBrackId orig id_name id ps_var lie_var +thLocalId orig id_name id th_bind_lvl use_stage + = do { checkWellStaged (quotes (ppr id)) th_bind_lvl use_stage + ; return id } + +-------------------------------------- +thBrackId orig id_name id ps_var lie_var + | isExternalName id_name + = -- Top-level identifiers in this module, + -- (which have External Names) + -- are just like the imported case: + -- no need for the 'lifting' treatment + -- E.g. 
this is fine: + -- f x = x + -- g y = [| f 3 |] + -- But we do need to put f into the keep-alive + -- set, because after desugaring the code will + -- only mention f's *name*, not f itself. + do { keepAliveTc id_name; return id } + + | otherwise + = -- Nested identifiers, such as 'x' in + -- E.g. \x -> [| h x |] + -- We must behave as if the reference to x was + -- h $(lift x) + -- We use 'x' itself as the splice proxy, used by + -- the desugarer to stitch it all back together. + -- If 'x' occurs many times we may get many identical + -- bindings of the same splice proxy, but that doesn't + -- matter, although it's a mite untidy. + do { let id_ty = idType id + ; checkTc (isTauTy id_ty) (polySpliceErr id) + -- If x is polymorphic, its occurrence sites might + -- have different instantiations, so we can't use plain + -- 'x' as the splice proxy name. I don't know how to + -- solve this, and it's probably unimportant, so I'm + -- just going to flag an error for now + + ; id_ty' <- zapToMonotype id_ty + -- The id_ty might have an OpenTypeKind, but we + -- can't instantiate the Lift class at that kind, + -- so we zap it to a LiftedTypeKind monotype + -- C.f. the call in TcPat.newLitInst + + ; setLIEVar lie_var $ do + { lift <- newMethodFromName orig id_ty' DsMeta.liftName + -- Put the 'lift' constraint into the right LIE + + -- Update the pending splices + ; ps <- readMutVar ps_var + ; writeMutVar ps_var ((id_name, nlHsApp (nlHsVar lift) (nlHsVar id)) : ps) + + ; return id } } +#endif /* GHCI */ +\end{code} + + +%************************************************************************ +%* * +\subsection{Record bindings} +%* * +%************************************************************************ + +Game plan for record bindings +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +1. Find the TyCon for the bindings, from the first field label. + +2. Instantiate its tyvars and unify (T a1 .. an) with expected_ty. + +For each binding field = value + +3. Instantiate the field type (from the field label) using the type + envt from step 2. + +4 Type check the value using tcArg, passing the field type as + the expected argument type. + +This extends OK when the field types are universally quantified. 
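+
+A small worked example (illustrative only): given
+
+	data T a = MkT { fld :: a }
+
+checking (MkT { fld = 'x' }) against an expected type (T Char) finds the
+TyCon T from the field name (step 1), instantiates MkT's tyvar so that the
+field type 'a' becomes Char (steps 2 and 3), and then checks 'x' against
+Char (step 4).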
+ + +\begin{code} +tcRecordBinds + :: DataCon + -> [TcType] -- Expected type for each field + -> HsRecordBinds Name + -> TcM (HsRecordBinds TcId) + +tcRecordBinds data_con arg_tys rbinds + = do { mb_binds <- mappM do_bind rbinds + ; return (catMaybes mb_binds) } + where + flds_w_tys = zipEqual "tcRecordBinds" (dataConFieldLabels data_con) arg_tys + do_bind (L loc field_lbl, rhs) + | Just field_ty <- assocMaybe flds_w_tys field_lbl + = addErrCtxt (fieldCtxt field_lbl) $ + do { rhs' <- tcPolyExprNC rhs field_ty + ; sel_id <- tcLookupId field_lbl + ; ASSERT( isRecordSelector sel_id ) + return (Just (L loc sel_id, rhs')) } + | otherwise + = do { addErrTc (badFieldCon data_con field_lbl) + ; return Nothing } + +checkMissingFields :: DataCon -> HsRecordBinds Name -> TcM () +checkMissingFields data_con rbinds + | null field_labels -- Not declared as a record; + -- But C{} is still valid if no strict fields + = if any isMarkedStrict field_strs then + -- Illegal if any arg is strict + addErrTc (missingStrictFields data_con []) + else + returnM () + + | otherwise -- A record + = checkM (null missing_s_fields) + (addErrTc (missingStrictFields data_con missing_s_fields)) `thenM_` + + doptM Opt_WarnMissingFields `thenM` \ warn -> + checkM (not (warn && notNull missing_ns_fields)) + (warnTc True (missingFields data_con missing_ns_fields)) + + where + missing_s_fields + = [ fl | (fl, str) <- field_info, + isMarkedStrict str, + not (fl `elem` field_names_used) + ] + missing_ns_fields + = [ fl | (fl, str) <- field_info, + not (isMarkedStrict str), + not (fl `elem` field_names_used) + ] + + field_names_used = recBindFields rbinds + field_labels = dataConFieldLabels data_con + + field_info = zipEqual "missingFields" + field_labels + field_strs + + field_strs = dataConStrictMarks data_con +\end{code} + +%************************************************************************ +%* * +\subsection{Errors and contexts} +%* * +%************************************************************************ + +Boring and alphabetical: +\begin{code} +caseScrutCtxt expr + = hang (ptext SLIT("In the scrutinee of a case expression:")) 4 (ppr expr) + +exprCtxt expr + = hang (ptext SLIT("In the expression:")) 4 (ppr expr) + +fieldCtxt field_name + = ptext SLIT("In the") <+> quotes (ppr field_name) <+> ptext SLIT("field of a record") + +funAppCtxt fun arg arg_no + = hang (hsep [ ptext SLIT("In the"), speakNth arg_no, ptext SLIT("argument of"), + quotes (ppr fun) <> text ", namely"]) + 4 (quotes (ppr arg)) + +predCtxt expr + = hang (ptext SLIT("In the predicate expression:")) 4 (ppr expr) + +nonVanillaUpd tycon + = vcat [ptext SLIT("Record update for the non-Haskell-98 data type") <+> quotes (ppr tycon) + <+> ptext SLIT("is not (yet) supported"), + ptext SLIT("Use pattern-matching instead")] +badFieldsUpd rbinds + = hang (ptext SLIT("No constructor has all these fields:")) + 4 (pprQuotedList (recBindFields rbinds)) + +naughtyRecordSel sel_id + = ptext SLIT("Cannot use record selector") <+> quotes (ppr sel_id) <+> + ptext SLIT("as a function due to escaped type variables") $$ + ptext SLIT("Probably fix: use pattern-matching syntax instead") + +notSelector field + = hsep [quotes (ppr field), ptext SLIT("is not a record selector")] + +missingStrictFields :: DataCon -> [FieldLabel] -> SDoc +missingStrictFields con fields + = header <> rest + where + rest | null fields = empty -- Happens for non-record constructors + -- with strict fields + | otherwise = colon <+> pprWithCommas ppr fields + + header = ptext SLIT("Constructor") <+> 
quotes (ppr con) <+> + ptext SLIT("does not have the required strict field(s)") + +missingFields :: DataCon -> [FieldLabel] -> SDoc +missingFields con fields + = ptext SLIT("Fields of") <+> quotes (ppr con) <+> ptext SLIT("not initialised:") + <+> pprWithCommas ppr fields + +callCtxt fun args + = ptext SLIT("In the call") <+> parens (ppr (foldl mkHsApp fun args)) + +#ifdef GHCI +polySpliceErr :: Id -> SDoc +polySpliceErr id + = ptext SLIT("Can't splice the polymorphic local variable") <+> quotes (ppr id) +#endif +\end{code} diff --git a/compiler/typecheck/TcExpr.lhs-boot b/compiler/typecheck/TcExpr.lhs-boot new file mode 100644 index 0000000000..1c4240dcc0 --- /dev/null +++ b/compiler/typecheck/TcExpr.lhs-boot @@ -0,0 +1,28 @@ +\begin{code} +module TcExpr where +import HsSyn ( HsExpr, LHsExpr ) +import Name ( Name ) +import Var ( Id ) +import TcType ( TcType, BoxySigmaType, BoxyRhoType ) +import TcRnTypes( TcM, InstOrigin ) + +tcPolyExpr :: + LHsExpr Name + -> BoxySigmaType + -> TcM (LHsExpr Id) + +tcMonoExpr :: + LHsExpr Name + -> BoxyRhoType + -> TcM (LHsExpr Id) + +tcInferRho :: + LHsExpr Name + -> TcM (LHsExpr Id, TcType) + +tcSyntaxOp :: + InstOrigin + -> HsExpr Name + -> TcType + -> TcM (HsExpr Id) +\end{code} diff --git a/compiler/typecheck/TcForeign.lhs b/compiler/typecheck/TcForeign.lhs new file mode 100644 index 0000000000..4be039bd93 --- /dev/null +++ b/compiler/typecheck/TcForeign.lhs @@ -0,0 +1,367 @@ +% +% (c) The AQUA Project, Glasgow University, 1998 +% +\section[TcForeign]{Typechecking \tr{foreign} declarations} + +A foreign declaration is used to either give an externally +implemented function a Haskell type (and calling interface) or +give a Haskell function an external calling interface. Either way, +the range of argument and result types these functions can accommodate +is restricted to what the outside world understands (read C), and this +module checks to see if a foreign declaration has got a legal type. + +\begin{code} +module TcForeign + ( + tcForeignImports + , tcForeignExports + ) where + +#include "HsVersions.h" + +import HsSyn + +import TcRnMonad +import TcHsType ( tcHsSigType, UserTypeCtxt(..) ) +import TcExpr ( tcPolyExpr ) + +import ForeignCall ( CCallConv(..) ) +import ErrUtils ( Message ) +import Id ( Id, mkLocalId, mkExportedLocalId ) +#if alpha_TARGET_ARCH +import Type ( typePrimRep ) +import SMRep ( argMachRep, primRepToCgRep, primRepHint ) +#endif +import OccName ( mkForeignExportOcc ) +import Name ( Name, NamedThing(..), mkExternalName ) +import TcType ( Type, tcSplitFunTys, tcSplitTyConApp_maybe, + tcSplitForAllTys, + isFFIArgumentTy, isFFIImportResultTy, + isFFIExportResultTy, isFFILabelTy, + isFFIExternalTy, isFFIDynArgumentTy, + isFFIDynResultTy, isFFIDotnetTy, isFFIDotnetObjTy, + toDNType + ) +import ForeignCall ( CExportSpec(..), CCallTarget(..), + CLabelString, isCLabelString, + isDynamicTarget, withDNTypes, DNKind(..), DNCallSpec(..) ) +import PrelNames ( hasKey, ioTyConKey ) +import DynFlags ( DynFlags(..), HscTarget(..) 
) +import Outputable +import SrcLoc ( Located(..), srcSpanStart ) +import Bag ( consBag ) + +#if alpha_TARGET_ARCH +import MachOp ( machRepByteWidth, MachHint(FloatHint) ) +#endif +\end{code} + +\begin{code} +-- Defines a binding +isForeignImport :: LForeignDecl name -> Bool +isForeignImport (L _ (ForeignImport _ _ _ _)) = True +isForeignImport _ = False + +-- Exports a binding +isForeignExport :: LForeignDecl name -> Bool +isForeignExport (L _ (ForeignExport _ _ _ _)) = True +isForeignExport _ = False +\end{code} + +%************************************************************************ +%* * +\subsection{Imports} +%* * +%************************************************************************ + +\begin{code} +tcForeignImports :: [LForeignDecl Name] -> TcM ([Id], [LForeignDecl Id]) +tcForeignImports decls + = mapAndUnzipM (wrapLocSndM tcFImport) (filter isForeignImport decls) + +tcFImport :: ForeignDecl Name -> TcM (Id, ForeignDecl Id) +tcFImport fo@(ForeignImport (L loc nm) hs_ty imp_decl isDeprec) + = addErrCtxt (foreignDeclCtxt fo) $ + tcHsSigType (ForSigCtxt nm) hs_ty `thenM` \ sig_ty -> + let + -- drop the foralls before inspecting the structure + -- of the foreign type. + (_, t_ty) = tcSplitForAllTys sig_ty + (arg_tys, res_ty) = tcSplitFunTys t_ty + id = mkLocalId nm sig_ty + -- Use a LocalId to obey the invariant that locally-defined + -- things are LocalIds. However, it does not need zonking, + -- (so TcHsSyn.zonkForeignExports ignores it). + in + tcCheckFIType sig_ty arg_tys res_ty imp_decl `thenM` \ imp_decl' -> + -- can't use sig_ty here because it :: Type and we need HsType Id + -- hence the undefined + returnM (id, ForeignImport (L loc id) undefined imp_decl' isDeprec) +\end{code} + + +------------ Checking types for foreign import ---------------------- +\begin{code} +tcCheckFIType _ arg_tys res_ty (DNImport spec) + = checkCg checkDotnet `thenM_` + getDOpts `thenM` \ dflags -> + checkForeignArgs (isFFIDotnetTy dflags) arg_tys `thenM_` + checkForeignRes True{-non IO ok-} (isFFIDotnetTy dflags) res_ty `thenM_` + let (DNCallSpec isStatic kind _ _ _ _) = spec in + (case kind of + DNMethod | not isStatic -> + case arg_tys of + [] -> addErrTc illegalDNMethodSig + _ + | not (isFFIDotnetObjTy (last arg_tys)) -> addErrTc illegalDNMethodSig + | otherwise -> returnM () + _ -> returnM ()) `thenM_` + returnM (DNImport (withDNTypes spec (map toDNType arg_tys) (toDNType res_ty))) + +tcCheckFIType sig_ty arg_tys res_ty idecl@(CImport _ _ _ _ (CLabel _)) + = checkCg checkCOrAsm `thenM_` + check (isFFILabelTy sig_ty) (illegalForeignTyErr empty sig_ty) `thenM_` + return idecl + +tcCheckFIType sig_ty arg_tys res_ty idecl@(CImport cconv _ _ _ CWrapper) + = -- Foreign wrapper (former f.e.d.) + -- The type must be of the form ft -> IO (FunPtr ft), where ft is a + -- valid foreign type. For legacy reasons ft -> IO (Ptr ft) as well + -- as ft -> IO Addr is accepted, too. The use of the latter two forms + -- is DEPRECATED, though. 
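+	-- A hypothetical example of such a wrapper import (for illustration
+	-- only; not taken from this module):
+	--
+	--	foreign import ccall "wrapper"
+	--	  mkIntCB :: (Int -> IO Int) -> IO (FunPtr (Int -> IO Int))
+	--
+	-- Here the checks below insist that the argument type (Int -> IO Int)
+	-- is a valid foreign type and that the overall result really is
+	-- IO (FunPtr ...) (or one of the deprecated Ptr/Addr forms).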
+ checkCg checkCOrAsmOrInterp `thenM_` + checkCConv cconv `thenM_` + (case arg_tys of + [arg1_ty] -> checkForeignArgs isFFIExternalTy arg1_tys `thenM_` + checkForeignRes nonIOok isFFIExportResultTy res1_ty `thenM_` + checkForeignRes mustBeIO isFFIDynResultTy res_ty `thenM_` + checkFEDArgs arg1_tys + where + (arg1_tys, res1_ty) = tcSplitFunTys arg1_ty + other -> addErrTc (illegalForeignTyErr empty sig_ty) ) `thenM_` + return idecl + +tcCheckFIType sig_ty arg_tys res_ty idecl@(CImport cconv safety _ _ (CFunction target)) + | isDynamicTarget target -- Foreign import dynamic + = checkCg checkCOrAsmOrInterp `thenM_` + checkCConv cconv `thenM_` + case arg_tys of -- The first arg must be Ptr, FunPtr, or Addr + [] -> + check False (illegalForeignTyErr empty sig_ty) `thenM_` + return idecl + (arg1_ty:arg_tys) -> + getDOpts `thenM` \ dflags -> + check (isFFIDynArgumentTy arg1_ty) + (illegalForeignTyErr argument arg1_ty) `thenM_` + checkForeignArgs (isFFIArgumentTy dflags safety) arg_tys `thenM_` + checkForeignRes nonIOok (isFFIImportResultTy dflags) res_ty `thenM_` + return idecl + | otherwise -- Normal foreign import + = checkCg (checkCOrAsmOrDotNetOrInterp) `thenM_` + checkCConv cconv `thenM_` + checkCTarget target `thenM_` + getDOpts `thenM` \ dflags -> + checkForeignArgs (isFFIArgumentTy dflags safety) arg_tys `thenM_` + checkForeignRes nonIOok (isFFIImportResultTy dflags) res_ty `thenM_` + return idecl + +-- This makes a convenient place to check +-- that the C identifier is valid for C +checkCTarget (StaticTarget str) + = checkCg checkCOrAsmOrDotNetOrInterp `thenM_` + check (isCLabelString str) (badCName str) +\end{code} + +On an Alpha, with foreign export dynamic, due to a giant hack when +building adjustor thunks, we only allow 4 integer arguments with +foreign export dynamic (i.e., 32 bytes of arguments after padding each +argument to a quadword, excluding floating-point arguments). + +The check is needed for both via-C and native-code routes + +\begin{code} +#include "nativeGen/NCG.h" +#if alpha_TARGET_ARCH +checkFEDArgs arg_tys + = check (integral_args <= 32) err + where + integral_args = sum [ (machRepByteWidth . argMachRep . primRepToCgRep) prim_rep + | prim_rep <- map typePrimRep arg_tys, + primRepHint prim_rep /= FloatHint ] + err = ptext SLIT("On Alpha, I can only handle 32 bytes of non-floating-point arguments to foreign export dynamic") +#else +checkFEDArgs arg_tys = returnM () +#endif +\end{code} + + +%************************************************************************ +%* * +\subsection{Exports} +%* * +%************************************************************************ + +\begin{code} +tcForeignExports :: [LForeignDecl Name] + -> TcM (LHsBinds TcId, [LForeignDecl TcId]) +tcForeignExports decls + = foldlM combine (emptyLHsBinds, []) (filter isForeignExport decls) + where + combine (binds, fs) fe = + wrapLocSndM tcFExport fe `thenM` \ (b, f) -> + returnM (b `consBag` binds, f:fs) + +tcFExport :: ForeignDecl Name -> TcM (LHsBind Id, ForeignDecl Id) +tcFExport fo@(ForeignExport (L loc nm) hs_ty spec isDeprec) = + addErrCtxt (foreignDeclCtxt fo) $ + + tcHsSigType (ForSigCtxt nm) hs_ty `thenM` \ sig_ty -> + tcPolyExpr (nlHsVar nm) sig_ty `thenM` \ rhs -> + + tcCheckFEType sig_ty spec `thenM_` + + -- we're exporting a function, but at a type possibly more + -- constrained than its declared/inferred type. Hence the need + -- to create a local binding which will call the exported function + -- at a particular type (and, maybe, overloading). 
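+    -- For illustration (hypothetical example, not from this module): given
+    --
+    --		f :: Num a => a -> a
+    --		foreign export ccall f :: Int -> Int
+    --
+    -- we make a fresh exported Id whose right-hand side is the expression
+    -- (f) checked against the type Int -> Int, and export that binding
+    -- rather than the overloaded f itself.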
+ + newUnique `thenM` \ uniq -> + getModule `thenM` \ mod -> + let + gnm = mkExternalName uniq mod (mkForeignExportOcc (getOccName nm)) + Nothing (srcSpanStart loc) + id = mkExportedLocalId gnm sig_ty + bind = L loc (VarBind id rhs) + in + returnM (bind, ForeignExport (L loc id) undefined spec isDeprec) +\end{code} + +------------ Checking argument types for foreign export ---------------------- + +\begin{code} +tcCheckFEType sig_ty (CExport (CExportStatic str _)) + = check (isCLabelString str) (badCName str) `thenM_` + checkForeignArgs isFFIExternalTy arg_tys `thenM_` + checkForeignRes nonIOok isFFIExportResultTy res_ty + where + -- Drop the foralls before inspecting n + -- the structure of the foreign type. + (_, t_ty) = tcSplitForAllTys sig_ty + (arg_tys, res_ty) = tcSplitFunTys t_ty +\end{code} + + + +%************************************************************************ +%* * +\subsection{Miscellaneous} +%* * +%************************************************************************ + +\begin{code} +------------ Checking argument types for foreign import ---------------------- +checkForeignArgs :: (Type -> Bool) -> [Type] -> TcM () +checkForeignArgs pred tys + = mappM go tys `thenM_` + returnM () + where + go ty = check (pred ty) (illegalForeignTyErr argument ty) + +------------ Checking result types for foreign calls ---------------------- +-- Check that the type has the form +-- (IO t) or (t) , and that t satisfies the given predicate. +-- +checkForeignRes :: Bool -> (Type -> Bool) -> Type -> TcM () + +nonIOok = True +mustBeIO = False + +checkForeignRes non_io_result_ok pred_res_ty ty + = case tcSplitTyConApp_maybe ty of + Just (io, [res_ty]) + | io `hasKey` ioTyConKey && pred_res_ty res_ty + -> returnM () + _ + -> check (non_io_result_ok && pred_res_ty ty) + (illegalForeignTyErr result ty) +\end{code} + +\begin{code} +checkDotnet HscILX = Nothing +#if defined(mingw32_TARGET_OS) +checkDotnet HscC = Nothing +checkDotnet _ = Just (text "requires C code generation (-fvia-C)") +#else +checkDotnet other = Just (text "requires .NET support (-filx or win32)") +#endif + +checkCOrAsm HscC = Nothing +checkCOrAsm HscAsm = Nothing +checkCOrAsm other + = Just (text "requires via-C or native code generation (-fvia-C)") + +checkCOrAsmOrInterp HscC = Nothing +checkCOrAsmOrInterp HscAsm = Nothing +checkCOrAsmOrInterp HscInterpreted = Nothing +checkCOrAsmOrInterp other + = Just (text "requires interpreted, C or native code generation") + +checkCOrAsmOrDotNetOrInterp HscC = Nothing +checkCOrAsmOrDotNetOrInterp HscAsm = Nothing +checkCOrAsmOrDotNetOrInterp HscILX = Nothing +checkCOrAsmOrDotNetOrInterp HscInterpreted = Nothing +checkCOrAsmOrDotNetOrInterp other + = Just (text "requires interpreted, C, native or .NET ILX code generation") + +checkCg check + = getDOpts `thenM` \ dflags -> + let target = hscTarget dflags in + case target of + HscNothing -> returnM () + otherwise -> + case check target of + Nothing -> returnM () + Just err -> addErrTc (text "Illegal foreign declaration:" <+> err) +\end{code} + +Calling conventions + +\begin{code} +checkCConv :: CCallConv -> TcM () +checkCConv CCallConv = return () +#if i386_TARGET_ARCH +checkCConv StdCallConv = return () +#else +checkCConv StdCallConv = addErrTc (text "calling convention not supported on this architecture: stdcall") +#endif +\end{code} + +Warnings + +\begin{code} +check :: Bool -> Message -> TcM () +check True _ = returnM () +check _ the_err = addErrTc the_err + +illegalForeignTyErr arg_or_res ty + = hang (hsep [ptext 
SLIT("Unacceptable"), arg_or_res, + ptext SLIT("type in foreign declaration:")]) + 4 (hsep [ppr ty]) + +-- Used for 'arg_or_res' argument to illegalForeignTyErr +argument = text "argument" +result = text "result" + +badCName :: CLabelString -> Message +badCName target + = sep [quotes (ppr target) <+> ptext SLIT("is not a valid C identifier")] + +foreignDeclCtxt fo + = hang (ptext SLIT("When checking declaration:")) + 4 (ppr fo) + +illegalDNMethodSig + = ptext SLIT("'This pointer' expected as last argument") + +\end{code} + diff --git a/compiler/typecheck/TcGenDeriv.lhs b/compiler/typecheck/TcGenDeriv.lhs new file mode 100644 index 0000000000..40e091d475 --- /dev/null +++ b/compiler/typecheck/TcGenDeriv.lhs @@ -0,0 +1,1480 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[TcGenDeriv]{Generating derived instance declarations} + +This module is nominally ``subordinate'' to @TcDeriv@, which is the +``official'' interface to deriving-related things. + +This is where we do all the grimy bindings' generation. + +\begin{code} +module TcGenDeriv ( + gen_Bounded_binds, + gen_Enum_binds, + gen_Eq_binds, + gen_Ix_binds, + gen_Ord_binds, + gen_Read_binds, + gen_Show_binds, + gen_Data_binds, + gen_Typeable_binds, + gen_tag_n_con_monobind, + + con2tag_RDR, tag2con_RDR, maxtag_RDR, + + TagThingWanted(..) + ) where + +#include "HsVersions.h" + +import HsSyn +import RdrName ( RdrName, mkVarUnqual, getRdrName, mkRdrUnqual, + mkDerivedRdrName ) +import BasicTypes ( Fixity(..), maxPrecedence, Boxity(..) ) +import DataCon ( isNullarySrcDataCon, dataConTag, + dataConOrigArgTys, dataConSourceArity, fIRST_TAG, + DataCon, dataConName, dataConIsInfix, + dataConFieldLabels ) +import Name ( getOccString, getSrcLoc, Name, NamedThing(..) ) + +import HscTypes ( FixityEnv, lookupFixity ) +import PrelInfo +import PrelNames +import MkId ( eRROR_ID ) +import PrimOp ( PrimOp(..) ) +import SrcLoc ( Located(..), noLoc, srcLocSpan ) +import TyCon ( TyCon, isNewTyCon, tyConDataCons, isEnumerationTyCon, tyConArity, + maybeTyConSingleCon, tyConFamilySize, tyConTyVars, tyConName + ) +import TcType ( isUnLiftedType, tcEqType, Type ) +import TysPrim ( charPrimTy, intPrimTy, wordPrimTy, addrPrimTy, floatPrimTy, doublePrimTy, + intPrimTyCon ) +import TysWiredIn ( charDataCon, intDataCon, floatDataCon, doubleDataCon, + intDataCon_RDR, true_RDR, false_RDR ) +import Util ( zipWithEqual, isSingleton, + zipWith3Equal, nOfThem, zipEqual ) +import Constants +import List ( partition, intersperse ) +import Outputable +import FastString +import OccName +import Bag +\end{code} + +%************************************************************************ +%* * +\subsection{Generating code, by derivable class} +%* * +%************************************************************************ + +%************************************************************************ +%* * +\subsubsection{Generating @Eq@ instance declarations} +%* * +%************************************************************************ + +Here are the heuristics for the code we generate for @Eq@: +\begin{itemize} +\item + Let's assume we have a data type with some (possibly zero) nullary + data constructors and some ordinary, non-nullary ones (the rest, + also possibly zero of them). Here's an example, with both \tr{N}ullary + and \tr{O}rdinary data cons. +\begin{verbatim} +data Foo ... = N1 | N2 ... | Nn | O1 a b | O2 Int | O3 Double b b | ... 
+\end{verbatim} + +\item + For the ordinary constructors (if any), we emit clauses to do The + Usual Thing, e.g.,: + +\begin{verbatim} +(==) (O1 a1 b1) (O1 a2 b2) = a1 == a2 && b1 == b2 +(==) (O2 a1) (O2 a2) = a1 == a2 +(==) (O3 a1 b1 c1) (O3 a2 b2 c2) = a1 == a2 && b1 == b2 && c1 == c2 +\end{verbatim} + + Note: if we're comparing unlifted things, e.g., if \tr{a1} and + \tr{a2} are \tr{Float#}s, then we have to generate +\begin{verbatim} +case (a1 `eqFloat#` a2) of + r -> r +\end{verbatim} + for that particular test. + +\item + If there are any nullary constructors, we emit a catch-all clause of + the form: + +\begin{verbatim} +(==) a b = case (con2tag_Foo a) of { a# -> + case (con2tag_Foo b) of { b# -> + case (a# ==# b#) of { + r -> r + }}} +\end{verbatim} + + If there aren't any nullary constructors, we emit a simpler + catch-all: +\begin{verbatim} +(==) a b = False +\end{verbatim} + +\item + For the @(/=)@ method, we normally just use the default method. + + If the type is an enumeration type, we could/may/should? generate + special code that calls @con2tag_Foo@, much like for @(==)@ shown + above. + +\item + We thought about doing this: If we're also deriving @Ord@ for this + tycon, we generate: +\begin{verbatim} +instance ... Eq (Foo ...) where + (==) a b = case (compare a b) of { _LT -> False; _EQ -> True ; _GT -> False} + (/=) a b = case (compare a b) of { _LT -> True ; _EQ -> False; _GT -> True } +\begin{verbatim} + However, that requires that \tr{Ord <whatever>} was put in the context + for the instance decl, which it probably wasn't, so the decls + produced don't get through the typechecker. +\end{itemize} + + +\begin{code} +gen_Eq_binds :: TyCon -> LHsBinds RdrName + +gen_Eq_binds tycon + = let + tycon_loc = getSrcSpan tycon + + (nullary_cons, nonnullary_cons) + | isNewTyCon tycon = ([], tyConDataCons tycon) + | otherwise = partition isNullarySrcDataCon (tyConDataCons tycon) + + rest + = if (null nullary_cons) then + case maybeTyConSingleCon tycon of + Just _ -> [] + Nothing -> -- if cons don't match, then False + [([nlWildPat, nlWildPat], false_Expr)] + else -- calc. 
and compare the tags + [([a_Pat, b_Pat], + untag_Expr tycon [(a_RDR,ah_RDR), (b_RDR,bh_RDR)] + (genOpApp (nlHsVar ah_RDR) eqInt_RDR (nlHsVar bh_RDR)))] + in + listToBag [ + mk_FunBind tycon_loc eq_RDR ((map pats_etc nonnullary_cons) ++ rest), + mk_easy_FunBind tycon_loc ne_RDR [a_Pat, b_Pat] ( + nlHsApp (nlHsVar not_RDR) (nlHsPar (nlHsVarApps eq_RDR [a_RDR, b_RDR]))) + ] + where + ------------------------------------------------------------------ + pats_etc data_con + = let + con1_pat = nlConVarPat data_con_RDR as_needed + con2_pat = nlConVarPat data_con_RDR bs_needed + + data_con_RDR = getRdrName data_con + con_arity = length tys_needed + as_needed = take con_arity as_RDRs + bs_needed = take con_arity bs_RDRs + tys_needed = dataConOrigArgTys data_con + in + ([con1_pat, con2_pat], nested_eq_expr tys_needed as_needed bs_needed) + where + nested_eq_expr [] [] [] = true_Expr + nested_eq_expr tys as bs + = foldl1 and_Expr (zipWith3Equal "nested_eq" nested_eq tys as bs) + where + nested_eq ty a b = nlHsPar (eq_Expr tycon ty (nlHsVar a) (nlHsVar b)) +\end{code} + +%************************************************************************ +%* * +\subsubsection{Generating @Ord@ instance declarations} +%* * +%************************************************************************ + +For a derived @Ord@, we concentrate our attentions on @compare@ +\begin{verbatim} +compare :: a -> a -> Ordering +data Ordering = LT | EQ | GT deriving () +\end{verbatim} + +We will use the same example data type as above: +\begin{verbatim} +data Foo ... = N1 | N2 ... | Nn | O1 a b | O2 Int | O3 Double b b | ... +\end{verbatim} + +\begin{itemize} +\item + We do all the other @Ord@ methods with calls to @compare@: +\begin{verbatim} +instance ... (Ord <wurble> <wurble>) where + a < b = case (compare a b) of { LT -> True; EQ -> False; GT -> False } + a <= b = case (compare a b) of { LT -> True; EQ -> True; GT -> False } + a >= b = case (compare a b) of { LT -> False; EQ -> True; GT -> True } + a > b = case (compare a b) of { LT -> False; EQ -> False; GT -> True } + + max a b = case (compare a b) of { LT -> b; EQ -> a; GT -> a } + min a b = case (compare a b) of { LT -> a; EQ -> b; GT -> b } + + -- compare to come... +\end{verbatim} + +\item + @compare@ always has two parts. First, we use the compared + data-constructors' tags to deal with the case of different + constructors: +\begin{verbatim} +compare a b = case (con2tag_Foo a) of { a# -> + case (con2tag_Foo b) of { b# -> + case (a# ==# b#) of { + True -> cmp_eq a b + False -> case (a# <# b#) of + True -> _LT + False -> _GT + }}} + where + cmp_eq = ... to come ... +\end{verbatim} + +\item + We are only left with the ``help'' function @cmp_eq@, to deal with + comparing data constructors with the same tag. + + For the ordinary constructors (if any), we emit the sorta-obvious + compare-style stuff; for our example: +\begin{verbatim} +cmp_eq (O1 a1 b1) (O1 a2 b2) + = case (compare a1 a2) of { LT -> LT; EQ -> compare b1 b2; GT -> GT } + +cmp_eq (O2 a1) (O2 a2) + = compare a1 a2 + +cmp_eq (O3 a1 b1 c1) (O3 a2 b2 c2) + = case (compare a1 a2) of { + LT -> LT; + GT -> GT; + EQ -> case compare b1 b2 of { + LT -> LT; + GT -> GT; + EQ -> compare c1 c2 + } + } +\end{verbatim} + + Again, we must be careful about unlifted comparisons. 
For example, + if \tr{a1} and \tr{a2} were \tr{Int#}s in the 2nd example above, we'd need to + generate: + +\begin{verbatim} +cmp_eq lt eq gt (O2 a1) (O2 a2) + = compareInt# a1 a2 + -- or maybe the unfolded equivalent +\end{verbatim} + +\item + For the remaining nullary constructors, we already know that the + tags are equal so: +\begin{verbatim} +cmp_eq _ _ = EQ +\end{verbatim} +\end{itemize} + +If there is only one constructor in the Data Type we don't need the WildCard Pattern. +JJQC-30-Nov-1997 + +\begin{code} +gen_Ord_binds :: TyCon -> LHsBinds RdrName + +gen_Ord_binds tycon + = unitBag compare -- `AndMonoBinds` compare + -- The default declaration in PrelBase handles this + where + tycon_loc = getSrcSpan tycon + -------------------------------------------------------------------- + + compare = L tycon_loc (mkFunBind (L tycon_loc compare_RDR) compare_matches) + compare_matches = [mkMatch [a_Pat, b_Pat] compare_rhs cmp_eq_binds] + cmp_eq_binds = HsValBinds (ValBindsIn (unitBag cmp_eq) []) + + compare_rhs + | single_con_type = cmp_eq_Expr a_Expr b_Expr + | otherwise + = untag_Expr tycon [(a_RDR, ah_RDR), (b_RDR, bh_RDR)] + (cmp_tags_Expr eqInt_RDR ah_RDR bh_RDR + (cmp_eq_Expr a_Expr b_Expr) -- True case + -- False case; they aren't equal + -- So we need to do a less-than comparison on the tags + (cmp_tags_Expr ltInt_RDR ah_RDR bh_RDR ltTag_Expr gtTag_Expr)) + + tycon_data_cons = tyConDataCons tycon + single_con_type = isSingleton tycon_data_cons + (nullary_cons, nonnullary_cons) + | isNewTyCon tycon = ([], tyConDataCons tycon) + | otherwise = partition isNullarySrcDataCon tycon_data_cons + + cmp_eq = mk_FunBind tycon_loc cmp_eq_RDR cmp_eq_match + cmp_eq_match + | isEnumerationTyCon tycon + -- We know the tags are equal, so if it's an enumeration TyCon, + -- then there is nothing left to do + -- Catch this specially to avoid warnings + -- about overlapping patterns from the desugarer, + -- and to avoid unnecessary pattern-matching + = [([nlWildPat,nlWildPat], eqTag_Expr)] + | otherwise + = map pats_etc nonnullary_cons ++ + (if single_con_type then -- Omit wildcards when there's just one + [] -- constructor, to silence desugarer + else + [([nlWildPat, nlWildPat], default_rhs)]) + + where + pats_etc data_con + = ([con1_pat, con2_pat], + nested_compare_expr tys_needed as_needed bs_needed) + where + con1_pat = nlConVarPat data_con_RDR as_needed + con2_pat = nlConVarPat data_con_RDR bs_needed + + data_con_RDR = getRdrName data_con + con_arity = length tys_needed + as_needed = take con_arity as_RDRs + bs_needed = take con_arity bs_RDRs + tys_needed = dataConOrigArgTys data_con + + nested_compare_expr [ty] [a] [b] + = careful_compare_Case tycon ty eqTag_Expr (nlHsVar a) (nlHsVar b) + + nested_compare_expr (ty:tys) (a:as) (b:bs) + = let eq_expr = nested_compare_expr tys as bs + in careful_compare_Case tycon ty eq_expr (nlHsVar a) (nlHsVar b) + + default_rhs | null nullary_cons = impossible_Expr -- Keep desugarer from complaining about + -- inexhaustive patterns + | otherwise = eqTag_Expr -- Some nullary constructors; + -- Tags are equal, no args => return EQ +\end{code} + +%************************************************************************ +%* * +\subsubsection{Generating @Enum@ instance declarations} +%* * +%************************************************************************ + +@Enum@ can only be derived for enumeration types. For a type +\begin{verbatim} +data Foo ... = N1 | N2 | ... 
| Nn +\end{verbatim} + +we use both @con2tag_Foo@ and @tag2con_Foo@ functions, as well as a +@maxtag_Foo@ variable (all generated by @gen_tag_n_con_binds@). + +\begin{verbatim} +instance ... Enum (Foo ...) where + succ x = toEnum (1 + fromEnum x) + pred x = toEnum (fromEnum x - 1) + + toEnum i = tag2con_Foo i + + enumFrom a = map tag2con_Foo [con2tag_Foo a .. maxtag_Foo] + + -- or, really... + enumFrom a + = case con2tag_Foo a of + a# -> map tag2con_Foo (enumFromTo (I# a#) maxtag_Foo) + + enumFromThen a b + = map tag2con_Foo [con2tag_Foo a, con2tag_Foo b .. maxtag_Foo] + + -- or, really... + enumFromThen a b + = case con2tag_Foo a of { a# -> + case con2tag_Foo b of { b# -> + map tag2con_Foo (enumFromThenTo (I# a#) (I# b#) maxtag_Foo) + }} +\end{verbatim} + +For @enumFromTo@ and @enumFromThenTo@, we use the default methods. + +\begin{code} +gen_Enum_binds :: TyCon -> LHsBinds RdrName + +gen_Enum_binds tycon + = listToBag [ + succ_enum, + pred_enum, + to_enum, + enum_from, + enum_from_then, + from_enum + ] + where + tycon_loc = getSrcSpan tycon + occ_nm = getOccString tycon + + succ_enum + = mk_easy_FunBind tycon_loc succ_RDR [a_Pat] $ + untag_Expr tycon [(a_RDR, ah_RDR)] $ + nlHsIf (nlHsApps eq_RDR [nlHsVar (maxtag_RDR tycon), + nlHsVarApps intDataCon_RDR [ah_RDR]]) + (illegal_Expr "succ" occ_nm "tried to take `succ' of last tag in enumeration") + (nlHsApp (nlHsVar (tag2con_RDR tycon)) + (nlHsApps plus_RDR [nlHsVarApps intDataCon_RDR [ah_RDR], + nlHsIntLit 1])) + + pred_enum + = mk_easy_FunBind tycon_loc pred_RDR [a_Pat] $ + untag_Expr tycon [(a_RDR, ah_RDR)] $ + nlHsIf (nlHsApps eq_RDR [nlHsIntLit 0, + nlHsVarApps intDataCon_RDR [ah_RDR]]) + (illegal_Expr "pred" occ_nm "tried to take `pred' of first tag in enumeration") + (nlHsApp (nlHsVar (tag2con_RDR tycon)) + (nlHsApps plus_RDR [nlHsVarApps intDataCon_RDR [ah_RDR], + nlHsLit (HsInt (-1))])) + + to_enum + = mk_easy_FunBind tycon_loc toEnum_RDR [a_Pat] $ + nlHsIf (nlHsApps and_RDR + [nlHsApps ge_RDR [nlHsVar a_RDR, nlHsIntLit 0], + nlHsApps le_RDR [nlHsVar a_RDR, nlHsVar (maxtag_RDR tycon)]]) + (nlHsVarApps (tag2con_RDR tycon) [a_RDR]) + (illegal_toEnum_tag occ_nm (maxtag_RDR tycon)) + + enum_from + = mk_easy_FunBind tycon_loc enumFrom_RDR [a_Pat] $ + untag_Expr tycon [(a_RDR, ah_RDR)] $ + nlHsApps map_RDR + [nlHsVar (tag2con_RDR tycon), + nlHsPar (enum_from_to_Expr + (nlHsVarApps intDataCon_RDR [ah_RDR]) + (nlHsVar (maxtag_RDR tycon)))] + + enum_from_then + = mk_easy_FunBind tycon_loc enumFromThen_RDR [a_Pat, b_Pat] $ + untag_Expr tycon [(a_RDR, ah_RDR), (b_RDR, bh_RDR)] $ + nlHsApp (nlHsVarApps map_RDR [tag2con_RDR tycon]) $ + nlHsPar (enum_from_then_to_Expr + (nlHsVarApps intDataCon_RDR [ah_RDR]) + (nlHsVarApps intDataCon_RDR [bh_RDR]) + (nlHsIf (nlHsApps gt_RDR [nlHsVarApps intDataCon_RDR [ah_RDR], + nlHsVarApps intDataCon_RDR [bh_RDR]]) + (nlHsIntLit 0) + (nlHsVar (maxtag_RDR tycon)) + )) + + from_enum + = mk_easy_FunBind tycon_loc fromEnum_RDR [a_Pat] $ + untag_Expr tycon [(a_RDR, ah_RDR)] $ + (nlHsVarApps intDataCon_RDR [ah_RDR]) +\end{code} + +%************************************************************************ +%* * +\subsubsection{Generating @Bounded@ instance declarations} +%* * +%************************************************************************ + +\begin{code} +gen_Bounded_binds tycon + = if isEnumerationTyCon tycon then + listToBag [ min_bound_enum, max_bound_enum ] + else + ASSERT(isSingleton data_cons) + listToBag [ min_bound_1con, max_bound_1con ] + where + data_cons = tyConDataCons tycon + tycon_loc = 
getSrcSpan tycon + + ----- enum-flavored: --------------------------- + min_bound_enum = mkVarBind tycon_loc minBound_RDR (nlHsVar data_con_1_RDR) + max_bound_enum = mkVarBind tycon_loc maxBound_RDR (nlHsVar data_con_N_RDR) + + data_con_1 = head data_cons + data_con_N = last data_cons + data_con_1_RDR = getRdrName data_con_1 + data_con_N_RDR = getRdrName data_con_N + + ----- single-constructor-flavored: ------------- + arity = dataConSourceArity data_con_1 + + min_bound_1con = mkVarBind tycon_loc minBound_RDR $ + nlHsVarApps data_con_1_RDR (nOfThem arity minBound_RDR) + max_bound_1con = mkVarBind tycon_loc maxBound_RDR $ + nlHsVarApps data_con_1_RDR (nOfThem arity maxBound_RDR) +\end{code} + +%************************************************************************ +%* * +\subsubsection{Generating @Ix@ instance declarations} +%* * +%************************************************************************ + +Deriving @Ix@ is only possible for enumeration types and +single-constructor types. We deal with them in turn. + +For an enumeration type, e.g., +\begin{verbatim} + data Foo ... = N1 | N2 | ... | Nn +\end{verbatim} +things go not too differently from @Enum@: +\begin{verbatim} +instance ... Ix (Foo ...) where + range (a, b) + = map tag2con_Foo [con2tag_Foo a .. con2tag_Foo b] + + -- or, really... + range (a, b) + = case (con2tag_Foo a) of { a# -> + case (con2tag_Foo b) of { b# -> + map tag2con_Foo (enumFromTo (I# a#) (I# b#)) + }} + + -- Generate code for unsafeIndex, becuase using index leads + -- to lots of redundant range tests + unsafeIndex c@(a, b) d + = case (con2tag_Foo d -# con2tag_Foo a) of + r# -> I# r# + + inRange (a, b) c + = let + p_tag = con2tag_Foo c + in + p_tag >= con2tag_Foo a && p_tag <= con2tag_Foo b + + -- or, really... + inRange (a, b) c + = case (con2tag_Foo a) of { a_tag -> + case (con2tag_Foo b) of { b_tag -> + case (con2tag_Foo c) of { c_tag -> + if (c_tag >=# a_tag) then + c_tag <=# b_tag + else + False + }}} +\end{verbatim} +(modulo suitable case-ification to handle the unlifted tags) + +For a single-constructor type (NB: this includes all tuples), e.g., +\begin{verbatim} + data Foo ... = MkFoo a b Int Double c c +\end{verbatim} +we follow the scheme given in Figure~19 of the Haskell~1.2 report +(p.~147). 
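+
+For a hypothetical two-field constructor, say \tr{data T a b = MkT a b},
+the generated code amounts, roughly, to the following (an illustrative
+sketch only):
+\begin{verbatim}
+  range (MkT l1 l2, MkT u1 u2)
+    = [ MkT i1 i2 | i1 <- range (l1,u1), i2 <- range (l2,u2) ]
+
+  unsafeIndex (MkT l1 l2, MkT u1 u2) (MkT i1 i2)
+    = unsafeIndex (l1,u1) i1
+      + unsafeRangeSize (l1,u1) * unsafeIndex (l2,u2) i2
+
+  inRange (MkT l1 l2, MkT u1 u2) (MkT i1 i2)
+    = inRange (l1,u1) i1 && inRange (l2,u2) i2
+\end{verbatim}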
+ +\begin{code} +gen_Ix_binds :: TyCon -> LHsBinds RdrName + +gen_Ix_binds tycon + = if isEnumerationTyCon tycon + then enum_ixes + else single_con_ixes + where + tycon_loc = getSrcSpan tycon + + -------------------------------------------------------------- + enum_ixes = listToBag [ enum_range, enum_index, enum_inRange ] + + enum_range + = mk_easy_FunBind tycon_loc range_RDR [nlTuplePat [a_Pat, b_Pat] Boxed] $ + untag_Expr tycon [(a_RDR, ah_RDR)] $ + untag_Expr tycon [(b_RDR, bh_RDR)] $ + nlHsApp (nlHsVarApps map_RDR [tag2con_RDR tycon]) $ + nlHsPar (enum_from_to_Expr + (nlHsVarApps intDataCon_RDR [ah_RDR]) + (nlHsVarApps intDataCon_RDR [bh_RDR])) + + enum_index + = mk_easy_FunBind tycon_loc unsafeIndex_RDR + [noLoc (AsPat (noLoc c_RDR) + (nlTuplePat [a_Pat, nlWildPat] Boxed)), + d_Pat] ( + untag_Expr tycon [(a_RDR, ah_RDR)] ( + untag_Expr tycon [(d_RDR, dh_RDR)] ( + let + rhs = nlHsVarApps intDataCon_RDR [c_RDR] + in + nlHsCase + (genOpApp (nlHsVar dh_RDR) minusInt_RDR (nlHsVar ah_RDR)) + [mkSimpleHsAlt (nlVarPat c_RDR) rhs] + )) + ) + + enum_inRange + = mk_easy_FunBind tycon_loc inRange_RDR [nlTuplePat [a_Pat, b_Pat] Boxed, c_Pat] $ + untag_Expr tycon [(a_RDR, ah_RDR)] ( + untag_Expr tycon [(b_RDR, bh_RDR)] ( + untag_Expr tycon [(c_RDR, ch_RDR)] ( + nlHsIf (genOpApp (nlHsVar ch_RDR) geInt_RDR (nlHsVar ah_RDR)) ( + (genOpApp (nlHsVar ch_RDR) leInt_RDR (nlHsVar bh_RDR)) + ) {-else-} ( + false_Expr + )))) + + -------------------------------------------------------------- + single_con_ixes + = listToBag [single_con_range, single_con_index, single_con_inRange] + + data_con + = case maybeTyConSingleCon tycon of -- just checking... + Nothing -> panic "get_Ix_binds" + Just dc | any isUnLiftedType (dataConOrigArgTys dc) + -> pprPanic "Can't derive Ix for a single-constructor type with primitive argument types:" (ppr tycon) + | otherwise -> dc + + con_arity = dataConSourceArity data_con + data_con_RDR = getRdrName data_con + + as_needed = take con_arity as_RDRs + bs_needed = take con_arity bs_RDRs + cs_needed = take con_arity cs_RDRs + + con_pat xs = nlConVarPat data_con_RDR xs + con_expr = nlHsVarApps data_con_RDR cs_needed + + -------------------------------------------------------------- + single_con_range + = mk_easy_FunBind tycon_loc range_RDR + [nlTuplePat [con_pat as_needed, con_pat bs_needed] Boxed] $ + nlHsDo ListComp stmts con_expr + where + stmts = zipWith3Equal "single_con_range" mk_qual as_needed bs_needed cs_needed + + mk_qual a b c = noLoc $ mkBindStmt (nlVarPat c) + (nlHsApp (nlHsVar range_RDR) + (nlTuple [nlHsVar a, nlHsVar b] Boxed)) + + ---------------- + single_con_index + = mk_easy_FunBind tycon_loc unsafeIndex_RDR + [nlTuplePat [con_pat as_needed, con_pat bs_needed] Boxed, + con_pat cs_needed] + (mk_index (zip3 as_needed bs_needed cs_needed)) + where + -- index (l1,u1) i1 + rangeSize (l1,u1) * (index (l2,u2) i2 + ...) 
+ mk_index [] = nlHsIntLit 0 + mk_index [(l,u,i)] = mk_one l u i + mk_index ((l,u,i) : rest) + = genOpApp ( + mk_one l u i + ) plus_RDR ( + genOpApp ( + (nlHsApp (nlHsVar unsafeRangeSize_RDR) + (nlTuple [nlHsVar l, nlHsVar u] Boxed)) + ) times_RDR (mk_index rest) + ) + mk_one l u i + = nlHsApps unsafeIndex_RDR [nlTuple [nlHsVar l, nlHsVar u] Boxed, nlHsVar i] + + ------------------ + single_con_inRange + = mk_easy_FunBind tycon_loc inRange_RDR + [nlTuplePat [con_pat as_needed, con_pat bs_needed] Boxed, + con_pat cs_needed] $ + foldl1 and_Expr (zipWith3Equal "single_con_inRange" in_range as_needed bs_needed cs_needed) + where + in_range a b c = nlHsApps inRange_RDR [nlTuple [nlHsVar a, nlHsVar b] Boxed, + nlHsVar c] +\end{code} + +%************************************************************************ +%* * +\subsubsection{Generating @Read@ instance declarations} +%* * +%************************************************************************ + +Example + + infix 4 %% + data T = Int %% Int + | T1 { f1 :: Int } + | T2 Int + + +instance Read T where + readPrec = + parens + ( prec 4 ( + do x <- ReadP.step Read.readPrec + Symbol "%%" <- Lex.lex + y <- ReadP.step Read.readPrec + return (x %% y)) + +++ + prec appPrec ( + do Ident "T1" <- Lex.lex + Punc '{' <- Lex.lex + Ident "f1" <- Lex.lex + Punc '=' <- Lex.lex + x <- ReadP.reset Read.readPrec + Punc '}' <- Lex.lex + return (T1 { f1 = x })) + +++ + prec appPrec ( + do Ident "T2" <- Lex.lexP + x <- ReadP.step Read.readPrec + return (T2 x)) + ) + + readListPrec = readListPrecDefault + readList = readListDefault + + +\begin{code} +gen_Read_binds :: FixityEnv -> TyCon -> LHsBinds RdrName + +gen_Read_binds get_fixity tycon + = listToBag [read_prec, default_readlist, default_readlistprec] + where + ----------------------------------------------------------------------- + default_readlist + = mkVarBind loc readList_RDR (nlHsVar readListDefault_RDR) + + default_readlistprec + = mkVarBind loc readListPrec_RDR (nlHsVar readListPrecDefault_RDR) + ----------------------------------------------------------------------- + + loc = getSrcSpan tycon + data_cons = tyConDataCons tycon + (nullary_cons, non_nullary_cons) = partition isNullarySrcDataCon data_cons + + read_prec = mkVarBind loc readPrec_RDR + (nlHsApp (nlHsVar parens_RDR) read_cons) + + read_cons = foldr1 mk_alt (read_nullary_cons ++ read_non_nullary_cons) + read_non_nullary_cons = map read_non_nullary_con non_nullary_cons + + read_nullary_cons + = case nullary_cons of + [] -> [] + [con] -> [nlHsDo DoExpr [bindLex (ident_pat (data_con_str con))] + (result_expr con [])] + _ -> [nlHsApp (nlHsVar choose_RDR) + (nlList (map mk_pair nullary_cons))] + + mk_pair con = nlTuple [nlHsLit (mkHsString (data_con_str con)), + nlHsApp (nlHsVar returnM_RDR) (nlHsVar (getRdrName con))] + Boxed + + read_non_nullary_con data_con + = nlHsApps prec_RDR [nlHsIntLit prec, nlHsDo DoExpr stmts body] + where + stmts | is_infix = infix_stmts + | length labels > 0 = lbl_stmts + | otherwise = prefix_stmts + + body = result_expr data_con as_needed + con_str = data_con_str data_con + + prefix_stmts -- T a b c + = [bindLex (ident_pat (wrapOpParens con_str))] + ++ read_args + + infix_stmts -- a %% b, or a `T` b + = [read_a1] + ++ (if isSym con_str + then [bindLex (symbol_pat con_str)] + else [read_punc "`", bindLex (ident_pat con_str), read_punc "`"]) + ++ [read_a2] + + lbl_stmts -- T { f1 = a, f2 = b } + = [bindLex (ident_pat (wrapOpParens con_str)), + read_punc "{"] + ++ concat (intersperse [read_punc ","] field_stmts) + ++ 
[read_punc "}"] + + field_stmts = zipWithEqual "lbl_stmts" read_field labels as_needed + + con_arity = dataConSourceArity data_con + labels = dataConFieldLabels data_con + dc_nm = getName data_con + is_infix = dataConIsInfix data_con + as_needed = take con_arity as_RDRs + read_args = zipWithEqual "gen_Read_binds" read_arg as_needed (dataConOrigArgTys data_con) + (read_a1:read_a2:_) = read_args + prec = getPrec is_infix get_fixity dc_nm + + ------------------------------------------------------------------------ + -- Helpers + ------------------------------------------------------------------------ + mk_alt e1 e2 = genOpApp e1 alt_RDR e2 + bindLex pat = noLoc (mkBindStmt pat (nlHsVar lexP_RDR)) + con_app c as = nlHsVarApps (getRdrName c) as + result_expr c as = nlHsApp (nlHsVar returnM_RDR) (con_app c as) + + punc_pat s = nlConPat punc_RDR [nlLitPat (mkHsString s)] -- Punc 'c' + ident_pat s = nlConPat ident_RDR [nlLitPat (mkHsString s)] -- Ident "foo" + symbol_pat s = nlConPat symbol_RDR [nlLitPat (mkHsString s)] -- Symbol ">>" + + data_con_str con = occNameString (getOccName con) + + read_punc c = bindLex (punc_pat c) + read_arg a ty + | isUnLiftedType ty = pprPanic "Error in deriving:" (text "Can't read unlifted types yet:" <+> ppr ty) + | otherwise = noLoc (mkBindStmt (nlVarPat a) (nlHsVarApps step_RDR [readPrec_RDR])) + + read_field lbl a = read_lbl lbl ++ + [read_punc "=", + noLoc (mkBindStmt (nlVarPat a) (nlHsVarApps reset_RDR [readPrec_RDR]))] + + -- When reading field labels we might encounter + -- a = 3 + -- _a = 3 + -- or (#) = 4 + -- Note the parens! + read_lbl lbl | isSym lbl_str + = [read_punc "(", + bindLex (symbol_pat lbl_str), + read_punc ")"] + | otherwise + = [bindLex (ident_pat lbl_str)] + where + lbl_str = occNameString (getOccName lbl) +\end{code} + + +%************************************************************************ +%* * +\subsubsection{Generating @Show@ instance declarations} +%* * +%************************************************************************ + +Example + + infixr 5 :^: + + data Tree a = Leaf a | Tree a :^: Tree a + + instance (Show a) => Show (Tree a) where + + showsPrec d (Leaf m) = showParen (d > app_prec) showStr + where + showStr = showString "Leaf " . showsPrec (app_prec+1) m + + showsPrec d (u :^: v) = showParen (d > up_prec) showStr + where + showStr = showsPrec (up_prec+1) u . + showString " :^: " . + showsPrec (up_prec+1) v + -- Note: right-associativity of :^: ignored + + up_prec = 5 -- Precedence of :^: + app_prec = 10 -- Application has precedence one more than + -- the most tightly-binding operator + +\begin{code} +gen_Show_binds :: FixityEnv -> TyCon -> LHsBinds RdrName + +gen_Show_binds get_fixity tycon + = listToBag [shows_prec, show_list] + where + tycon_loc = getSrcSpan tycon + ----------------------------------------------------------------------- + show_list = mkVarBind tycon_loc showList_RDR + (nlHsApp (nlHsVar showList___RDR) (nlHsPar (nlHsApp (nlHsVar showsPrec_RDR) (nlHsIntLit 0)))) + ----------------------------------------------------------------------- + shows_prec = mk_FunBind tycon_loc showsPrec_RDR (map pats_etc (tyConDataCons tycon)) + where + pats_etc data_con + | nullary_con = -- skip the showParen junk... 
+ ASSERT(null bs_needed) + ([nlWildPat, con_pat], mk_showString_app con_str) + | otherwise = + ([a_Pat, con_pat], + showParen_Expr (nlHsPar (genOpApp a_Expr ge_RDR (nlHsLit (HsInt con_prec_plus_one)))) + (nlHsPar (nested_compose_Expr show_thingies))) + where + data_con_RDR = getRdrName data_con + con_arity = dataConSourceArity data_con + bs_needed = take con_arity bs_RDRs + arg_tys = dataConOrigArgTys data_con -- Correspond 1-1 with bs_needed + con_pat = nlConVarPat data_con_RDR bs_needed + nullary_con = con_arity == 0 + labels = dataConFieldLabels data_con + lab_fields = length labels + record_syntax = lab_fields > 0 + + dc_nm = getName data_con + dc_occ_nm = getOccName data_con + con_str = occNameString dc_occ_nm + op_con_str = wrapOpParens con_str + backquote_str = wrapOpBackquotes con_str + + show_thingies + | is_infix = [show_arg1, mk_showString_app (" " ++ backquote_str ++ " "), show_arg2] + | record_syntax = mk_showString_app (op_con_str ++ " {") : + show_record_args ++ [mk_showString_app "}"] + | otherwise = mk_showString_app (op_con_str ++ " ") : show_prefix_args + + show_label l = mk_showString_app (nm ++ " = ") + -- Note the spaces around the "=" sign. If we don't have them + -- then we get Foo { x=-1 } and the "=-" parses as a single + -- lexeme. Only the space after the '=' is necessary, but + -- it seems tidier to have them both sides. + where + occ_nm = getOccName l + nm = wrapOpParens (occNameString occ_nm) + + show_args = zipWith show_arg bs_needed arg_tys + (show_arg1:show_arg2:_) = show_args + show_prefix_args = intersperse (nlHsVar showSpace_RDR) show_args + + -- Assumption for record syntax: no of fields == no of labelled fields + -- (and in same order) + show_record_args = concat $ + intersperse [mk_showString_app ", "] $ + [ [show_label lbl, arg] + | (lbl,arg) <- zipEqual "gen_Show_binds" + labels show_args ] + + -- Generates (showsPrec p x) for argument x, but it also boxes + -- the argument first if necessary. Note that this prints unboxed + -- things without any '#' decorations; could change that if need be + show_arg b arg_ty = nlHsApps showsPrec_RDR [nlHsLit (HsInt arg_prec), + box_if_necy "Show" tycon (nlHsVar b) arg_ty] + + -- Fixity stuff + is_infix = dataConIsInfix data_con + con_prec_plus_one = 1 + getPrec is_infix get_fixity dc_nm + arg_prec | record_syntax = 0 -- Record fields don't need parens + | otherwise = con_prec_plus_one + +wrapOpParens :: String -> String +wrapOpParens s | isSym s = '(' : s ++ ")" + | otherwise = s + +wrapOpBackquotes :: String -> String +wrapOpBackquotes s | isSym s = s + | otherwise = '`' : s ++ "`" + +isSym :: String -> Bool +isSym "" = False +isSym (c:cs) = startsVarSym c || startsConSym c + +mk_showString_app str = nlHsApp (nlHsVar showString_RDR) (nlHsLit (mkHsString str)) +\end{code} + +\begin{code} +getPrec :: Bool -> FixityEnv -> Name -> Integer +getPrec is_infix get_fixity nm + | not is_infix = appPrecedence + | otherwise = getPrecedence get_fixity nm + +appPrecedence :: Integer +appPrecedence = fromIntegral maxPrecedence + 1 + -- One more than the precedence of the most + -- tightly-binding operator + +getPrecedence :: FixityEnv -> Name -> Integer +getPrecedence get_fixity nm + = case lookupFixity get_fixity nm of + Fixity x _ -> fromIntegral x +\end{code} + + +%************************************************************************ +%* * +\subsection{Typeable} +%* * +%************************************************************************ + +From the data type + + data T a b = .... 
+ +we generate + + instance Typeable2 T where + typeOf2 _ = mkTyConApp (mkTyConRep "T") [] + +We are passed the Typeable2 class as well as T + +\begin{code} +gen_Typeable_binds :: TyCon -> LHsBinds RdrName +gen_Typeable_binds tycon + = unitBag $ + mk_easy_FunBind tycon_loc + (mk_typeOf_RDR tycon) -- Name of appropriate type0f function + [nlWildPat] + (nlHsApps mkTypeRep_RDR [tycon_rep, nlList []]) + where + tycon_loc = getSrcSpan tycon + tycon_rep = nlHsVar mkTyConRep_RDR `nlHsApp` nlHsLit (mkHsString (showSDoc (ppr tycon))) + +mk_typeOf_RDR :: TyCon -> RdrName +-- Use the arity of the TyCon to make the right typeOfn function +mk_typeOf_RDR tycon = varQual_RDR tYPEABLE (mkFastString ("typeOf" ++ suffix)) + where + arity = tyConArity tycon + suffix | arity == 0 = "" + | otherwise = show arity +\end{code} + + + +%************************************************************************ +%* * +\subsection{Data} +%* * +%************************************************************************ + +From the data type + + data T a b = T1 a b | T2 + +we generate + + $cT1 = mkDataCon $dT "T1" Prefix + $cT2 = mkDataCon $dT "T2" Prefix + $dT = mkDataType "Module.T" [] [$con_T1, $con_T2] + -- the [] is for field labels. + + instance (Data a, Data b) => Data (T a b) where + gfoldl k z (T1 a b) = z T `k` a `k` b + gfoldl k z T2 = z T2 + -- ToDo: add gmapT,Q,M, gfoldr + + gunfold k z c = case conIndex c of + I# 1# -> k (k (z T1)) + I# 2# -> z T2 + + toConstr (T1 _ _) = $cT1 + toConstr T2 = $cT2 + + dataTypeOf _ = $dT + +\begin{code} +gen_Data_binds :: FixityEnv + -> TyCon + -> (LHsBinds RdrName, -- The method bindings + LHsBinds RdrName) -- Auxiliary bindings +gen_Data_binds fix_env tycon + = (listToBag [gfoldl_bind, gunfold_bind, toCon_bind, dataTypeOf_bind], + -- Auxiliary definitions: the data type and constructors + datatype_bind `consBag` listToBag (map mk_con_bind data_cons)) + where + tycon_loc = getSrcSpan tycon + tycon_name = tyConName tycon + data_cons = tyConDataCons tycon + n_cons = length data_cons + one_constr = n_cons == 1 + + ------------ gfoldl + gfoldl_bind = mk_FunBind tycon_loc gfoldl_RDR (map gfoldl_eqn data_cons) + gfoldl_eqn con = ([nlVarPat k_RDR, nlVarPat z_RDR, nlConVarPat con_name as_needed], + foldl mk_k_app (nlHsVar z_RDR `nlHsApp` nlHsVar con_name) as_needed) + where + con_name :: RdrName + con_name = getRdrName con + as_needed = take (dataConSourceArity con) as_RDRs + mk_k_app e v = nlHsPar (nlHsOpApp e k_RDR (nlHsVar v)) + + ------------ gunfold + gunfold_bind = mk_FunBind tycon_loc + gunfold_RDR + [([k_Pat, z_Pat, if one_constr then nlWildPat else c_Pat], + gunfold_rhs)] + + gunfold_rhs + | one_constr = mk_unfold_rhs (head data_cons) -- No need for case + | otherwise = nlHsCase (nlHsVar conIndex_RDR `nlHsApp` c_Expr) + (map gunfold_alt data_cons) + + gunfold_alt dc = mkSimpleHsAlt (mk_unfold_pat dc) (mk_unfold_rhs dc) + mk_unfold_rhs dc = foldr nlHsApp + (nlHsVar z_RDR `nlHsApp` nlHsVar (getRdrName dc)) + (replicate (dataConSourceArity dc) (nlHsVar k_RDR)) + + mk_unfold_pat dc -- Last one is a wild-pat, to avoid + -- redundant test, and annoying warning + | tag-fIRST_TAG == n_cons-1 = nlWildPat -- Last constructor + | otherwise = nlConPat intDataCon_RDR [nlLitPat (HsIntPrim (toInteger tag))] + where + tag = dataConTag dc + + ------------ toConstr + toCon_bind = mk_FunBind tycon_loc toConstr_RDR (map to_con_eqn data_cons) + to_con_eqn dc = ([nlWildConPat dc], nlHsVar (mk_constr_name dc)) + + ------------ dataTypeOf + dataTypeOf_bind = mk_easy_FunBind + tycon_loc + 
dataTypeOf_RDR + [nlWildPat] + (nlHsVar data_type_name) + + ------------ $dT + + data_type_name = mkDerivedRdrName tycon_name mkDataTOcc + datatype_bind = mkVarBind + tycon_loc + data_type_name + ( nlHsVar mkDataType_RDR + `nlHsApp` nlHsLit (mkHsString (showSDoc (ppr tycon))) + `nlHsApp` nlList constrs + ) + constrs = [nlHsVar (mk_constr_name con) | con <- data_cons] + + + ------------ $cT1 etc + mk_constr_name con = mkDerivedRdrName (dataConName con) mkDataCOcc + mk_con_bind dc = mkVarBind + tycon_loc + (mk_constr_name dc) + (nlHsApps mkConstr_RDR (constr_args dc)) + constr_args dc = + [ -- nlHsIntLit (toInteger (dataConTag dc)), -- Tag + nlHsVar data_type_name, -- DataType + nlHsLit (mkHsString (occNameString dc_occ)), -- String name + nlList labels, -- Field labels + nlHsVar fixity] -- Fixity + where + labels = map (nlHsLit . mkHsString . getOccString) + (dataConFieldLabels dc) + dc_occ = getOccName dc + is_infix = isDataSymOcc dc_occ + fixity | is_infix = infix_RDR + | otherwise = prefix_RDR + +gfoldl_RDR = varQual_RDR gENERICS FSLIT("gfoldl") +gunfold_RDR = varQual_RDR gENERICS FSLIT("gunfold") +toConstr_RDR = varQual_RDR gENERICS FSLIT("toConstr") +dataTypeOf_RDR = varQual_RDR gENERICS FSLIT("dataTypeOf") +mkConstr_RDR = varQual_RDR gENERICS FSLIT("mkConstr") +mkDataType_RDR = varQual_RDR gENERICS FSLIT("mkDataType") +conIndex_RDR = varQual_RDR gENERICS FSLIT("constrIndex") +prefix_RDR = dataQual_RDR gENERICS FSLIT("Prefix") +infix_RDR = dataQual_RDR gENERICS FSLIT("Infix") +\end{code} + +%************************************************************************ +%* * +\subsection{Generating extra binds (@con2tag@ and @tag2con@)} +%* * +%************************************************************************ + +\begin{verbatim} +data Foo ... = ... + +con2tag_Foo :: Foo ... -> Int# +tag2con_Foo :: Int -> Foo ... -- easier if Int, not Int# +maxtag_Foo :: Int -- ditto (NB: not unlifted) +\end{verbatim} + +The `tags' here start at zero, hence the @fIRST_TAG@ (currently one) +fiddling around. + +\begin{code} +data TagThingWanted + = GenCon2Tag | GenTag2Con | GenMaxTag + +gen_tag_n_con_monobind + :: ( RdrName, -- (proto)Name for the thing in question + TyCon, -- tycon in question + TagThingWanted) + -> LHsBind RdrName + +gen_tag_n_con_monobind (rdr_name, tycon, GenCon2Tag) + | lots_of_constructors + = mk_FunBind tycon_loc rdr_name [([], get_tag_rhs)] + + | otherwise + = mk_FunBind tycon_loc rdr_name (map mk_stuff (tyConDataCons tycon)) + + where + tycon_loc = getSrcSpan tycon + + tvs = map (mkRdrUnqual . getOccName) (tyConTyVars tycon) + -- We can't use gerRdrName because that makes an Exact RdrName + -- and we can't put them in the LocalRdrEnv + + -- Give a signature to the bound variable, so + -- that the case expression generated by getTag is + -- monomorphic. In the push-enter model we get better code. 
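+	-- Concretely, the RHS built just below is roughly (illustration,
+	-- for a hypothetical 'data T p q'):
+	--	(\ a -> getTag a) :: forall p q. T p q -> Int#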
+ get_tag_rhs = noLoc $ ExprWithTySig + (nlHsLam (mkSimpleHsAlt (nlVarPat a_RDR) + (nlHsApp (nlHsVar getTag_RDR) a_Expr))) + (noLoc (mkExplicitHsForAllTy (map (noLoc.UserTyVar) tvs) (noLoc []) con2tag_ty)) + + con2tag_ty = foldl nlHsAppTy (nlHsTyVar (getRdrName tycon)) + (map nlHsTyVar tvs) + `nlHsFunTy` + nlHsTyVar (getRdrName intPrimTyCon) + + lots_of_constructors = tyConFamilySize tycon > mAX_FAMILY_SIZE_FOR_VEC_RETURNS + + mk_stuff :: DataCon -> ([LPat RdrName], LHsExpr RdrName) + mk_stuff con = ([nlWildConPat con], + nlHsLit (HsIntPrim (toInteger ((dataConTag con) - fIRST_TAG)))) + +gen_tag_n_con_monobind (rdr_name, tycon, GenTag2Con) + = mk_FunBind (getSrcSpan tycon) rdr_name + [([nlConVarPat intDataCon_RDR [a_RDR]], + noLoc (ExprWithTySig (nlHsApp (nlHsVar tagToEnum_RDR) a_Expr) + (nlHsTyVar (getRdrName tycon))))] + +gen_tag_n_con_monobind (rdr_name, tycon, GenMaxTag) + = mkVarBind (getSrcSpan tycon) rdr_name + (nlHsApp (nlHsVar intDataCon_RDR) (nlHsLit (HsIntPrim max_tag))) + where + max_tag = case (tyConDataCons tycon) of + data_cons -> toInteger ((length data_cons) - fIRST_TAG) + +\end{code} + +%************************************************************************ +%* * +\subsection{Utility bits for generating bindings} +%* * +%************************************************************************ + + +ToDo: Better SrcLocs. + +\begin{code} +compare_gen_Case :: + LHsExpr RdrName -- What to do for equality + -> LHsExpr RdrName -> LHsExpr RdrName + -> LHsExpr RdrName +careful_compare_Case :: -- checks for primitive types... + TyCon -- The tycon we are deriving for + -> Type + -> LHsExpr RdrName -- What to do for equality + -> LHsExpr RdrName -> LHsExpr RdrName + -> LHsExpr RdrName + +cmp_eq_Expr a b = nlHsApp (nlHsApp (nlHsVar cmp_eq_RDR) a) b + -- Was: compare_gen_Case cmp_eq_RDR + +compare_gen_Case (L _ (HsVar eq_tag)) a b | eq_tag == eqTag_RDR + = nlHsApp (nlHsApp (nlHsVar compare_RDR) a) b -- Simple case +compare_gen_Case eq a b -- General case + = nlHsCase (nlHsPar (nlHsApp (nlHsApp (nlHsVar compare_RDR) a) b)) {-of-} + [mkSimpleHsAlt (nlNullaryConPat ltTag_RDR) ltTag_Expr, + mkSimpleHsAlt (nlNullaryConPat eqTag_RDR) eq, + mkSimpleHsAlt (nlNullaryConPat gtTag_RDR) gtTag_Expr] + +careful_compare_Case tycon ty eq a b + | not (isUnLiftedType ty) + = compare_gen_Case eq a b + | otherwise -- We have to do something special for primitive things... 
+ = nlHsIf (genOpApp a relevant_eq_op b) + eq + (nlHsIf (genOpApp a relevant_lt_op b) ltTag_Expr gtTag_Expr) + where + relevant_eq_op = primOpRdrName (assoc_ty_id "Ord" tycon eq_op_tbl ty) + relevant_lt_op = primOpRdrName (assoc_ty_id "Ord" tycon lt_op_tbl ty) + + +box_if_necy :: String -- The class involved + -> TyCon -- The tycon involved + -> LHsExpr RdrName -- The argument + -> Type -- The argument type + -> LHsExpr RdrName -- Boxed version of the arg +box_if_necy cls_str tycon arg arg_ty + | isUnLiftedType arg_ty = nlHsApp (nlHsVar box_con) arg + | otherwise = arg + where + box_con = assoc_ty_id cls_str tycon box_con_tbl arg_ty + +assoc_ty_id :: String -- The class involved + -> TyCon -- The tycon involved + -> [(Type,a)] -- The table + -> Type -- The type + -> a -- The result of the lookup +assoc_ty_id cls_str tycon tbl ty + | null res = pprPanic "Error in deriving:" (text "Can't derive" <+> text cls_str <+> + text "for primitive type" <+> ppr ty) + | otherwise = head res + where + res = [id | (ty',id) <- tbl, ty `tcEqType` ty'] + +eq_op_tbl :: [(Type, PrimOp)] +eq_op_tbl = + [(charPrimTy, CharEqOp) + ,(intPrimTy, IntEqOp) + ,(wordPrimTy, WordEqOp) + ,(addrPrimTy, AddrEqOp) + ,(floatPrimTy, FloatEqOp) + ,(doublePrimTy, DoubleEqOp) + ] + +lt_op_tbl :: [(Type, PrimOp)] +lt_op_tbl = + [(charPrimTy, CharLtOp) + ,(intPrimTy, IntLtOp) + ,(wordPrimTy, WordLtOp) + ,(addrPrimTy, AddrLtOp) + ,(floatPrimTy, FloatLtOp) + ,(doublePrimTy, DoubleLtOp) + ] + +box_con_tbl = + [(charPrimTy, getRdrName charDataCon) + ,(intPrimTy, getRdrName intDataCon) + ,(wordPrimTy, wordDataCon_RDR) + ,(addrPrimTy, addrDataCon_RDR) + ,(floatPrimTy, getRdrName floatDataCon) + ,(doublePrimTy, getRdrName doubleDataCon) + ] + +----------------------------------------------------------------------- + +and_Expr :: LHsExpr RdrName -> LHsExpr RdrName -> LHsExpr RdrName +and_Expr a b = genOpApp a and_RDR b + +----------------------------------------------------------------------- + +eq_Expr :: TyCon -> Type -> LHsExpr RdrName -> LHsExpr RdrName -> LHsExpr RdrName +eq_Expr tycon ty a b = genOpApp a eq_op b + where + eq_op + | not (isUnLiftedType ty) = eq_RDR + | otherwise = primOpRdrName (assoc_ty_id "Eq" tycon eq_op_tbl ty) + -- we have to do something special for primitive things... 
+\end{code} + +\begin{code} +untag_Expr :: TyCon -> [( RdrName, RdrName)] -> LHsExpr RdrName -> LHsExpr RdrName +untag_Expr tycon [] expr = expr +untag_Expr tycon ((untag_this, put_tag_here) : more) expr + = nlHsCase (nlHsPar (nlHsVarApps (con2tag_RDR tycon) [untag_this])) {-of-} + [mkSimpleHsAlt (nlVarPat put_tag_here) (untag_Expr tycon more expr)] + +cmp_tags_Expr :: RdrName -- Comparison op + -> RdrName -> RdrName -- Things to compare + -> LHsExpr RdrName -- What to return if true + -> LHsExpr RdrName -- What to return if false + -> LHsExpr RdrName + +cmp_tags_Expr op a b true_case false_case + = nlHsIf (genOpApp (nlHsVar a) op (nlHsVar b)) true_case false_case + +enum_from_to_Expr + :: LHsExpr RdrName -> LHsExpr RdrName + -> LHsExpr RdrName +enum_from_then_to_Expr + :: LHsExpr RdrName -> LHsExpr RdrName -> LHsExpr RdrName + -> LHsExpr RdrName + +enum_from_to_Expr f t2 = nlHsApp (nlHsApp (nlHsVar enumFromTo_RDR) f) t2 +enum_from_then_to_Expr f t t2 = nlHsApp (nlHsApp (nlHsApp (nlHsVar enumFromThenTo_RDR) f) t) t2 + +showParen_Expr + :: LHsExpr RdrName -> LHsExpr RdrName + -> LHsExpr RdrName + +showParen_Expr e1 e2 = nlHsApp (nlHsApp (nlHsVar showParen_RDR) e1) e2 + +nested_compose_Expr :: [LHsExpr RdrName] -> LHsExpr RdrName + +nested_compose_Expr [e] = parenify e +nested_compose_Expr (e:es) + = nlHsApp (nlHsApp (nlHsVar compose_RDR) (parenify e)) (nested_compose_Expr es) + +-- impossible_Expr is used in case RHSs that should never happen. +-- We generate these to keep the desugarer from complaining that they *might* happen! +impossible_Expr = nlHsApp (nlHsVar error_RDR) (nlHsLit (mkHsString "Urk! in TcGenDeriv")) + +-- illegal_Expr is used when signalling error conditions in the RHS of a derived +-- method. It is currently only used by Enum.{succ,pred} +illegal_Expr meth tp msg = + nlHsApp (nlHsVar error_RDR) (nlHsLit (mkHsString (meth ++ '{':tp ++ "}: " ++ msg))) + +-- illegal_toEnum_tag is an extended version of illegal_Expr, which also allows you +-- to include the value of a_RDR in the error string. +illegal_toEnum_tag tp maxtag = + nlHsApp (nlHsVar error_RDR) + (nlHsApp (nlHsApp (nlHsVar append_RDR) + (nlHsLit (mkHsString ("toEnum{" ++ tp ++ "}: tag (")))) + (nlHsApp (nlHsApp (nlHsApp + (nlHsVar showsPrec_RDR) + (nlHsIntLit 0)) + (nlHsVar a_RDR)) + (nlHsApp (nlHsApp + (nlHsVar append_RDR) + (nlHsLit (mkHsString ") is outside of enumeration's range (0,"))) + (nlHsApp (nlHsApp (nlHsApp + (nlHsVar showsPrec_RDR) + (nlHsIntLit 0)) + (nlHsVar maxtag)) + (nlHsLit (mkHsString ")")))))) + +parenify e@(L _ (HsVar _)) = e +parenify e = mkHsPar e + +-- genOpApp wraps brackets round the operator application, so that the +-- renamer won't subsequently try to re-associate it. +genOpApp e1 op e2 = nlHsPar (nlHsOpApp e1 op e2) +\end{code} + +\begin{code} +getSrcSpan = srcLocSpan . getSrcLoc +\end{code} + +\begin{code} +a_RDR = mkVarUnqual FSLIT("a") +b_RDR = mkVarUnqual FSLIT("b") +c_RDR = mkVarUnqual FSLIT("c") +d_RDR = mkVarUnqual FSLIT("d") +k_RDR = mkVarUnqual FSLIT("k") +z_RDR = mkVarUnqual FSLIT("z") +ah_RDR = mkVarUnqual FSLIT("a#") +bh_RDR = mkVarUnqual FSLIT("b#") +ch_RDR = mkVarUnqual FSLIT("c#") +dh_RDR = mkVarUnqual FSLIT("d#") +cmp_eq_RDR = mkVarUnqual FSLIT("cmp_eq") + +as_RDRs = [ mkVarUnqual (mkFastString ("a"++show i)) | i <- [(1::Int) .. ] ] +bs_RDRs = [ mkVarUnqual (mkFastString ("b"++show i)) | i <- [(1::Int) .. ] ] +cs_RDRs = [ mkVarUnqual (mkFastString ("c"++show i)) | i <- [(1::Int) .. 
] ] + +a_Expr = nlHsVar a_RDR +b_Expr = nlHsVar b_RDR +c_Expr = nlHsVar c_RDR +ltTag_Expr = nlHsVar ltTag_RDR +eqTag_Expr = nlHsVar eqTag_RDR +gtTag_Expr = nlHsVar gtTag_RDR +false_Expr = nlHsVar false_RDR +true_Expr = nlHsVar true_RDR + +a_Pat = nlVarPat a_RDR +b_Pat = nlVarPat b_RDR +c_Pat = nlVarPat c_RDR +d_Pat = nlVarPat d_RDR +k_Pat = nlVarPat k_RDR +z_Pat = nlVarPat z_RDR + +con2tag_RDR, tag2con_RDR, maxtag_RDR :: TyCon -> RdrName +-- Generates Orig s RdrName, for the binding positions +con2tag_RDR tycon = mk_tc_deriv_name tycon "con2tag_" +tag2con_RDR tycon = mk_tc_deriv_name tycon "tag2con_" +maxtag_RDR tycon = mk_tc_deriv_name tycon "maxtag_" + +mk_tc_deriv_name tycon str + = mkDerivedRdrName tc_name mk_occ + where + tc_name = tyConName tycon + mk_occ tc_occ = mkVarOccFS (mkFastString new_str) + where + new_str = str ++ occNameString tc_occ ++ "#" +\end{code} + +s RdrName for PrimOps. Can't be done in PrelNames, because PrimOp imports +PrelNames, so PrelNames can't import PrimOp. + +\begin{code} +primOpRdrName op = getRdrName (primOpId op) + +minusInt_RDR = primOpRdrName IntSubOp +eqInt_RDR = primOpRdrName IntEqOp +ltInt_RDR = primOpRdrName IntLtOp +geInt_RDR = primOpRdrName IntGeOp +leInt_RDR = primOpRdrName IntLeOp +tagToEnum_RDR = primOpRdrName TagToEnumOp + +error_RDR = getRdrName eRROR_ID +\end{code} diff --git a/compiler/typecheck/TcHsSyn.lhs b/compiler/typecheck/TcHsSyn.lhs new file mode 100644 index 0000000000..6389f34aef --- /dev/null +++ b/compiler/typecheck/TcHsSyn.lhs @@ -0,0 +1,961 @@ +% +% (c) The AQUA Project, Glasgow University, 1996-1998 +% +\section[TcHsSyn]{Specialisations of the @HsSyn@ syntax for the typechecker} + +This module is an extension of @HsSyn@ syntax, for use in the type +checker. + +\begin{code} +module TcHsSyn ( + mkHsTyApp, mkHsDictApp, mkHsConApp, + mkHsTyLam, mkHsDictLam, mkHsDictLet, mkHsApp, + hsLitType, hsPatType, mkHsAppTy, mkSimpleHsAlt, + nlHsIntLit, mkVanillaTuplePat, + + + -- re-exported from TcMonad + TcId, TcIdSet, TcDictBinds, + + zonkTopDecls, zonkTopExpr, zonkTopLExpr, + zonkId, zonkTopBndrs + ) where + +#include "HsVersions.h" + +-- friends: +import HsSyn -- oodles of it + +-- others: +import Id ( idType, setIdType, Id ) + +import TcRnMonad +import Type ( Type ) +import TcType ( TcType, TcTyVar, mkTyVarTy, mkTyConApp, isImmutableTyVar ) +import Kind ( isLiftedTypeKind, liftedTypeKind, isSubKind ) +import qualified Type +import TcMType ( zonkQuantifiedTyVar, zonkType, zonkTcType, writeMetaTyVar ) +import TysPrim ( charPrimTy, intPrimTy, floatPrimTy, + doublePrimTy, addrPrimTy + ) +import TysWiredIn ( charTy, stringTy, intTy, + mkListTy, mkPArrTy, mkTupleTy, unitTy, + voidTy, listTyCon, tupleTyCon ) +import TyCon ( mkPrimTyCon, tyConKind, PrimRep(..) ) +import Kind ( splitKindFunTys ) +import Name ( Name, getOccName, mkInternalName, mkDerivedTyConOcc ) +import Var ( Var, isId, isLocalVar, tyVarKind ) +import VarSet +import VarEnv +import BasicTypes ( Boxity(..), IPName(..), ipNameName, mapIPName ) +import Maybes ( orElse ) +import Unique ( Uniquable(..) ) +import SrcLoc ( noSrcLoc, noLoc, Located(..), unLoc ) +import Util ( mapSnd ) +import Bag +import Outputable +\end{code} + + +%************************************************************************ +%* * +\subsection[mkFailurePair]{Code for pattern-matching and other failures} +%* * +%************************************************************************ + +Note: If @hsPatType@ doesn't bear a strong resemblance to @exprType@, +then something is wrong. 
+\begin{code} +mkVanillaTuplePat :: [OutPat Id] -> Boxity -> Pat Id +-- A vanilla tuple pattern simply gets its type from its sub-patterns +mkVanillaTuplePat pats box + = TuplePat pats box (mkTupleTy box (length pats) (map hsPatType pats)) + +hsPatType :: OutPat Id -> Type +hsPatType (L _ pat) = pat_type pat + +pat_type (ParPat pat) = hsPatType pat +pat_type (WildPat ty) = ty +pat_type (VarPat var) = idType var +pat_type (VarPatOut var _) = idType var +pat_type (BangPat pat) = hsPatType pat +pat_type (LazyPat pat) = hsPatType pat +pat_type (LitPat lit) = hsLitType lit +pat_type (AsPat var pat) = idType (unLoc var) +pat_type (ListPat _ ty) = mkListTy ty +pat_type (PArrPat _ ty) = mkPArrTy ty +pat_type (TuplePat pats box ty) = ty +pat_type (ConPatOut _ _ _ _ _ ty) = ty +pat_type (SigPatOut pat ty) = ty +pat_type (NPat lit _ _ ty) = ty +pat_type (NPlusKPat id _ _ _) = idType (unLoc id) +pat_type (DictPat ds ms) = case (ds ++ ms) of + [] -> unitTy + [d] -> idType d + ds -> mkTupleTy Boxed (length ds) (map idType ds) + + +hsLitType :: HsLit -> TcType +hsLitType (HsChar c) = charTy +hsLitType (HsCharPrim c) = charPrimTy +hsLitType (HsString str) = stringTy +hsLitType (HsStringPrim s) = addrPrimTy +hsLitType (HsInt i) = intTy +hsLitType (HsIntPrim i) = intPrimTy +hsLitType (HsInteger i ty) = ty +hsLitType (HsRat _ ty) = ty +hsLitType (HsFloatPrim f) = floatPrimTy +hsLitType (HsDoublePrim d) = doublePrimTy +\end{code} + + +%************************************************************************ +%* * +\subsection[BackSubst-HsBinds]{Running a substitution over @HsBinds@} +%* * +%************************************************************************ + +\begin{code} +-- zonkId is used *during* typechecking just to zonk the Id's type +zonkId :: TcId -> TcM TcId +zonkId id + = zonkTcType (idType id) `thenM` \ ty' -> + returnM (setIdType id ty') +\end{code} + +The rest of the zonking is done *after* typechecking. +The main zonking pass runs over the bindings + + a) to convert TcTyVars to TyVars etc, dereferencing any bindings etc + b) convert unbound TcTyVar to Void + c) convert each TcId to an Id by zonking its type + +The type variables are converted by binding mutable tyvars to immutable ones +and then zonking as normal. + +The Ids are converted by binding them in the normal Tc envt; that +way we maintain sharing; eg an Id is zonked at its binding site and they +all occurrences of that Id point to the common zonked copy + +It's all pretty boring stuff, because HsSyn is such a large type, and +the environment manipulation is tiresome. + +\begin{code} +data ZonkEnv = ZonkEnv (TcType -> TcM Type) -- How to zonk a type + (IdEnv Id) -- What variables are in scope + -- Maps an Id to its zonked version; both have the same Name + -- Is only consulted lazily; hence knot-tying + +emptyZonkEnv = ZonkEnv zonkTypeZapping emptyVarEnv + +extendZonkEnv :: ZonkEnv -> [Id] -> ZonkEnv +extendZonkEnv (ZonkEnv zonk_ty env) ids + = ZonkEnv zonk_ty (extendVarEnvList env [(id,id) | id <- ids]) + +extendZonkEnv1 :: ZonkEnv -> Id -> ZonkEnv +extendZonkEnv1 (ZonkEnv zonk_ty env) id + = ZonkEnv zonk_ty (extendVarEnv env id id) + +setZonkType :: ZonkEnv -> (TcType -> TcM Type) -> ZonkEnv +setZonkType (ZonkEnv _ env) zonk_ty = ZonkEnv zonk_ty env + +zonkEnvIds :: ZonkEnv -> [Id] +zonkEnvIds (ZonkEnv _ env) = varEnvElts env + +zonkIdOcc :: ZonkEnv -> TcId -> Id +-- Ids defined in this module should be in the envt; +-- ignore others. 
(Actually, data constructors are also +-- not LocalVars, even when locally defined, but that is fine.) +-- (Also foreign-imported things aren't currently in the ZonkEnv; +-- that's ok because they don't need zonking.) +-- +-- Actually, Template Haskell works in 'chunks' of declarations, and +-- an earlier chunk won't be in the 'env' that the zonking phase +-- carries around. Instead it'll be in the tcg_gbl_env, already fully +-- zonked. There's no point in looking it up there (except for error +-- checking), and it's not conveniently to hand; hence the simple +-- 'orElse' case in the LocalVar branch. +-- +-- Even without template splices, in module Main, the checking of +-- 'main' is done as a separate chunk. +zonkIdOcc (ZonkEnv zonk_ty env) id + | isLocalVar id = lookupVarEnv env id `orElse` id + | otherwise = id + +zonkIdOccs env ids = map (zonkIdOcc env) ids + +-- zonkIdBndr is used *after* typechecking to get the Id's type +-- to its final form. The TyVarEnv give +zonkIdBndr :: ZonkEnv -> TcId -> TcM Id +zonkIdBndr env id + = zonkTcTypeToType env (idType id) `thenM` \ ty' -> + returnM (setIdType id ty') + +zonkIdBndrs :: ZonkEnv -> [TcId] -> TcM [Id] +zonkIdBndrs env ids = mappM (zonkIdBndr env) ids + +zonkTopBndrs :: [TcId] -> TcM [Id] +zonkTopBndrs ids = zonkIdBndrs emptyZonkEnv ids +\end{code} + + +\begin{code} +zonkTopExpr :: HsExpr TcId -> TcM (HsExpr Id) +zonkTopExpr e = zonkExpr emptyZonkEnv e + +zonkTopLExpr :: LHsExpr TcId -> TcM (LHsExpr Id) +zonkTopLExpr e = zonkLExpr emptyZonkEnv e + +zonkTopDecls :: LHsBinds TcId -> [LRuleDecl TcId] -> [LForeignDecl TcId] + -> TcM ([Id], + Bag (LHsBind Id), + [LForeignDecl Id], + [LRuleDecl Id]) +zonkTopDecls binds rules fords + = do { (env, binds') <- zonkRecMonoBinds emptyZonkEnv binds + -- Top level is implicitly recursive + ; rules' <- zonkRules env rules + ; fords' <- zonkForeignExports env fords + ; return (zonkEnvIds env, binds', fords', rules') } + +--------------------------------------------- +zonkLocalBinds :: ZonkEnv -> HsLocalBinds TcId -> TcM (ZonkEnv, HsLocalBinds Id) +zonkLocalBinds env EmptyLocalBinds + = return (env, EmptyLocalBinds) + +zonkLocalBinds env (HsValBinds binds) + = do { (env1, new_binds) <- zonkValBinds env binds + ; return (env1, HsValBinds new_binds) } + +zonkLocalBinds env (HsIPBinds (IPBinds binds dict_binds)) + = mappM (wrapLocM zonk_ip_bind) binds `thenM` \ new_binds -> + let + env1 = extendZonkEnv env [ipNameName n | L _ (IPBind n _) <- new_binds] + in + zonkRecMonoBinds env1 dict_binds `thenM` \ (env2, new_dict_binds) -> + returnM (env2, HsIPBinds (IPBinds new_binds new_dict_binds)) + where + zonk_ip_bind (IPBind n e) + = mapIPNameTc (zonkIdBndr env) n `thenM` \ n' -> + zonkLExpr env e `thenM` \ e' -> + returnM (IPBind n' e') + + +--------------------------------------------- +zonkValBinds :: ZonkEnv -> HsValBinds TcId -> TcM (ZonkEnv, HsValBinds Id) +zonkValBinds env bs@(ValBindsIn _ _) + = panic "zonkValBinds" -- Not in typechecker output +zonkValBinds env (ValBindsOut binds sigs) + = do { (env1, new_binds) <- go env binds + ; return (env1, ValBindsOut new_binds sigs) } + where + go env [] = return (env, []) + go env ((r,b):bs) = do { (env1, b') <- zonkRecMonoBinds env b + ; (env2, bs') <- go env1 bs + ; return (env2, (r,b'):bs') } + +--------------------------------------------- +zonkRecMonoBinds :: ZonkEnv -> LHsBinds TcId -> TcM (ZonkEnv, LHsBinds Id) +zonkRecMonoBinds env binds + = fixM (\ ~(_, new_binds) -> do + { let env1 = extendZonkEnv env (collectHsBindBinders new_binds) + ; binds' <- 
zonkMonoBinds env1 binds + ; return (env1, binds') }) + +--------------------------------------------- +zonkMonoBinds :: ZonkEnv -> LHsBinds TcId -> TcM (LHsBinds Id) +zonkMonoBinds env binds = mapBagM (wrapLocM (zonk_bind env)) binds + +zonk_bind :: ZonkEnv -> HsBind TcId -> TcM (HsBind Id) +zonk_bind env bind@(PatBind { pat_lhs = pat, pat_rhs = grhss, pat_rhs_ty = ty}) + = do { (_env, new_pat) <- zonkPat env pat -- Env already extended + ; new_grhss <- zonkGRHSs env grhss + ; new_ty <- zonkTcTypeToType env ty + ; return (bind { pat_lhs = new_pat, pat_rhs = new_grhss, pat_rhs_ty = new_ty }) } + +zonk_bind env (VarBind { var_id = var, var_rhs = expr }) + = zonkIdBndr env var `thenM` \ new_var -> + zonkLExpr env expr `thenM` \ new_expr -> + returnM (VarBind { var_id = new_var, var_rhs = new_expr }) + +zonk_bind env bind@(FunBind { fun_id = var, fun_matches = ms, fun_co_fn = co_fn }) + = wrapLocM (zonkIdBndr env) var `thenM` \ new_var -> + zonkCoFn env co_fn `thenM` \ (env1, new_co_fn) -> + zonkMatchGroup env1 ms `thenM` \ new_ms -> + returnM (bind { fun_id = new_var, fun_matches = new_ms, fun_co_fn = new_co_fn }) + +zonk_bind env (AbsBinds { abs_tvs = tyvars, abs_dicts = dicts, + abs_exports = exports, abs_binds = val_binds }) + = ASSERT( all isImmutableTyVar tyvars ) + zonkIdBndrs env dicts `thenM` \ new_dicts -> + fixM (\ ~(new_val_binds, _) -> + let + env1 = extendZonkEnv env new_dicts + env2 = extendZonkEnv env1 (collectHsBindBinders new_val_binds) + in + zonkMonoBinds env2 val_binds `thenM` \ new_val_binds -> + mappM (zonkExport env2) exports `thenM` \ new_exports -> + returnM (new_val_binds, new_exports) + ) `thenM` \ (new_val_bind, new_exports) -> + returnM (AbsBinds { abs_tvs = tyvars, abs_dicts = new_dicts, + abs_exports = new_exports, abs_binds = new_val_bind }) + where + zonkExport env (tyvars, global, local, prags) + = zonkIdBndr env global `thenM` \ new_global -> + mapM zonk_prag prags `thenM` \ new_prags -> + returnM (tyvars, new_global, zonkIdOcc env local, new_prags) + zonk_prag prag@(InlinePrag {}) = return prag + zonk_prag (SpecPrag expr ty ds inl) = do { expr' <- zonkExpr env expr + ; ty' <- zonkTcTypeToType env ty + ; let ds' = zonkIdOccs env ds + ; return (SpecPrag expr' ty' ds' inl) } +\end{code} + +%************************************************************************ +%* * +\subsection[BackSubst-Match-GRHSs]{Match and GRHSs} +%* * +%************************************************************************ + +\begin{code} +zonkMatchGroup :: ZonkEnv -> MatchGroup TcId-> TcM (MatchGroup Id) +zonkMatchGroup env (MatchGroup ms ty) + = do { ms' <- mapM (zonkMatch env) ms + ; ty' <- zonkTcTypeToType env ty + ; return (MatchGroup ms' ty') } + +zonkMatch :: ZonkEnv -> LMatch TcId-> TcM (LMatch Id) +zonkMatch env (L loc (Match pats _ grhss)) + = do { (env1, new_pats) <- zonkPats env pats + ; new_grhss <- zonkGRHSs env1 grhss + ; return (L loc (Match new_pats Nothing new_grhss)) } + +------------------------------------------------------------------------- +zonkGRHSs :: ZonkEnv -> GRHSs TcId -> TcM (GRHSs Id) + +zonkGRHSs env (GRHSs grhss binds) + = zonkLocalBinds env binds `thenM` \ (new_env, new_binds) -> + let + zonk_grhs (GRHS guarded rhs) + = zonkStmts new_env guarded `thenM` \ (env2, new_guarded) -> + zonkLExpr env2 rhs `thenM` \ new_rhs -> + returnM (GRHS new_guarded new_rhs) + in + mappM (wrapLocM zonk_grhs) grhss `thenM` \ new_grhss -> + returnM (GRHSs new_grhss new_binds) +\end{code} + +%************************************************************************ +%* * 
+\subsection[BackSubst-HsExpr]{Running a zonkitution over a TypeCheckedExpr} +%* * +%************************************************************************ + +\begin{code} +zonkLExprs :: ZonkEnv -> [LHsExpr TcId] -> TcM [LHsExpr Id] +zonkLExpr :: ZonkEnv -> LHsExpr TcId -> TcM (LHsExpr Id) +zonkExpr :: ZonkEnv -> HsExpr TcId -> TcM (HsExpr Id) + +zonkLExprs env exprs = mappM (zonkLExpr env) exprs +zonkLExpr env expr = wrapLocM (zonkExpr env) expr + +zonkExpr env (HsVar id) + = returnM (HsVar (zonkIdOcc env id)) + +zonkExpr env (HsIPVar id) + = returnM (HsIPVar (mapIPName (zonkIdOcc env) id)) + +zonkExpr env (HsLit (HsRat f ty)) + = zonkTcTypeToType env ty `thenM` \ new_ty -> + returnM (HsLit (HsRat f new_ty)) + +zonkExpr env (HsLit lit) + = returnM (HsLit lit) + +zonkExpr env (HsOverLit lit) + = do { lit' <- zonkOverLit env lit + ; return (HsOverLit lit') } + +zonkExpr env (HsLam matches) + = zonkMatchGroup env matches `thenM` \ new_matches -> + returnM (HsLam new_matches) + +zonkExpr env (HsApp e1 e2) + = zonkLExpr env e1 `thenM` \ new_e1 -> + zonkLExpr env e2 `thenM` \ new_e2 -> + returnM (HsApp new_e1 new_e2) + +zonkExpr env (HsBracketOut body bs) + = mappM zonk_b bs `thenM` \ bs' -> + returnM (HsBracketOut body bs') + where + zonk_b (n,e) = zonkLExpr env e `thenM` \ e' -> + returnM (n,e') + +zonkExpr env (HsSpliceE s) = WARN( True, ppr s ) -- Should not happen + returnM (HsSpliceE s) + +zonkExpr env (OpApp e1 op fixity e2) + = zonkLExpr env e1 `thenM` \ new_e1 -> + zonkLExpr env op `thenM` \ new_op -> + zonkLExpr env e2 `thenM` \ new_e2 -> + returnM (OpApp new_e1 new_op fixity new_e2) + +zonkExpr env (NegApp expr op) + = zonkLExpr env expr `thenM` \ new_expr -> + zonkExpr env op `thenM` \ new_op -> + returnM (NegApp new_expr new_op) + +zonkExpr env (HsPar e) + = zonkLExpr env e `thenM` \new_e -> + returnM (HsPar new_e) + +zonkExpr env (SectionL expr op) + = zonkLExpr env expr `thenM` \ new_expr -> + zonkLExpr env op `thenM` \ new_op -> + returnM (SectionL new_expr new_op) + +zonkExpr env (SectionR op expr) + = zonkLExpr env op `thenM` \ new_op -> + zonkLExpr env expr `thenM` \ new_expr -> + returnM (SectionR new_op new_expr) + +zonkExpr env (HsCase expr ms) + = zonkLExpr env expr `thenM` \ new_expr -> + zonkMatchGroup env ms `thenM` \ new_ms -> + returnM (HsCase new_expr new_ms) + +zonkExpr env (HsIf e1 e2 e3) + = zonkLExpr env e1 `thenM` \ new_e1 -> + zonkLExpr env e2 `thenM` \ new_e2 -> + zonkLExpr env e3 `thenM` \ new_e3 -> + returnM (HsIf new_e1 new_e2 new_e3) + +zonkExpr env (HsLet binds expr) + = zonkLocalBinds env binds `thenM` \ (new_env, new_binds) -> + zonkLExpr new_env expr `thenM` \ new_expr -> + returnM (HsLet new_binds new_expr) + +zonkExpr env (HsDo do_or_lc stmts body ty) + = zonkStmts env stmts `thenM` \ (new_env, new_stmts) -> + zonkLExpr new_env body `thenM` \ new_body -> + zonkTcTypeToType env ty `thenM` \ new_ty -> + returnM (HsDo (zonkDo env do_or_lc) + new_stmts new_body new_ty) + +zonkExpr env (ExplicitList ty exprs) + = zonkTcTypeToType env ty `thenM` \ new_ty -> + zonkLExprs env exprs `thenM` \ new_exprs -> + returnM (ExplicitList new_ty new_exprs) + +zonkExpr env (ExplicitPArr ty exprs) + = zonkTcTypeToType env ty `thenM` \ new_ty -> + zonkLExprs env exprs `thenM` \ new_exprs -> + returnM (ExplicitPArr new_ty new_exprs) + +zonkExpr env (ExplicitTuple exprs boxed) + = zonkLExprs env exprs `thenM` \ new_exprs -> + returnM (ExplicitTuple new_exprs boxed) + +zonkExpr env (RecordCon data_con con_expr rbinds) + = zonkExpr env con_expr `thenM` \ new_con_expr -> + 
zonkRbinds env rbinds `thenM` \ new_rbinds -> + returnM (RecordCon data_con new_con_expr new_rbinds) + +zonkExpr env (RecordUpd expr rbinds in_ty out_ty) + = zonkLExpr env expr `thenM` \ new_expr -> + zonkTcTypeToType env in_ty `thenM` \ new_in_ty -> + zonkTcTypeToType env out_ty `thenM` \ new_out_ty -> + zonkRbinds env rbinds `thenM` \ new_rbinds -> + returnM (RecordUpd new_expr new_rbinds new_in_ty new_out_ty) + +zonkExpr env (ExprWithTySigOut e ty) + = do { e' <- zonkLExpr env e + ; return (ExprWithTySigOut e' ty) } + +zonkExpr env (ExprWithTySig _ _) = panic "zonkExpr env:ExprWithTySig" + +zonkExpr env (ArithSeq expr info) + = zonkExpr env expr `thenM` \ new_expr -> + zonkArithSeq env info `thenM` \ new_info -> + returnM (ArithSeq new_expr new_info) + +zonkExpr env (PArrSeq expr info) + = zonkExpr env expr `thenM` \ new_expr -> + zonkArithSeq env info `thenM` \ new_info -> + returnM (PArrSeq new_expr new_info) + +zonkExpr env (HsSCC lbl expr) + = zonkLExpr env expr `thenM` \ new_expr -> + returnM (HsSCC lbl new_expr) + +-- hdaume: core annotations +zonkExpr env (HsCoreAnn lbl expr) + = zonkLExpr env expr `thenM` \ new_expr -> + returnM (HsCoreAnn lbl new_expr) + +zonkExpr env (TyLam tyvars expr) + = ASSERT( all isImmutableTyVar tyvars ) + zonkLExpr env expr `thenM` \ new_expr -> + returnM (TyLam tyvars new_expr) + +zonkExpr env (TyApp expr tys) + = zonkLExpr env expr `thenM` \ new_expr -> + zonkTcTypeToTypes env tys `thenM` \ new_tys -> + returnM (TyApp new_expr new_tys) + +zonkExpr env (DictLam dicts expr) + = zonkIdBndrs env dicts `thenM` \ new_dicts -> + let + env1 = extendZonkEnv env new_dicts + in + zonkLExpr env1 expr `thenM` \ new_expr -> + returnM (DictLam new_dicts new_expr) + +zonkExpr env (DictApp expr dicts) + = zonkLExpr env expr `thenM` \ new_expr -> + returnM (DictApp new_expr (zonkIdOccs env dicts)) + +-- arrow notation extensions +zonkExpr env (HsProc pat body) + = do { (env1, new_pat) <- zonkPat env pat + ; new_body <- zonkCmdTop env1 body + ; return (HsProc new_pat new_body) } + +zonkExpr env (HsArrApp e1 e2 ty ho rl) + = zonkLExpr env e1 `thenM` \ new_e1 -> + zonkLExpr env e2 `thenM` \ new_e2 -> + zonkTcTypeToType env ty `thenM` \ new_ty -> + returnM (HsArrApp new_e1 new_e2 new_ty ho rl) + +zonkExpr env (HsArrForm op fixity args) + = zonkLExpr env op `thenM` \ new_op -> + mappM (zonkCmdTop env) args `thenM` \ new_args -> + returnM (HsArrForm new_op fixity new_args) + +zonkExpr env (HsCoerce co_fn expr) + = zonkCoFn env co_fn `thenM` \ (env1, new_co_fn) -> + zonkExpr env1 expr `thenM` \ new_expr -> + return (HsCoerce new_co_fn new_expr) + +zonkExpr env other = pprPanic "zonkExpr" (ppr other) + +zonkCmdTop :: ZonkEnv -> LHsCmdTop TcId -> TcM (LHsCmdTop Id) +zonkCmdTop env cmd = wrapLocM (zonk_cmd_top env) cmd + +zonk_cmd_top env (HsCmdTop cmd stack_tys ty ids) + = zonkLExpr env cmd `thenM` \ new_cmd -> + zonkTcTypeToTypes env stack_tys `thenM` \ new_stack_tys -> + zonkTcTypeToType env ty `thenM` \ new_ty -> + mapSndM (zonkExpr env) ids `thenM` \ new_ids -> + returnM (HsCmdTop new_cmd new_stack_tys new_ty new_ids) + +------------------------------------------------------------------------- +zonkCoFn :: ZonkEnv -> ExprCoFn -> TcM (ZonkEnv, ExprCoFn) +zonkCoFn env CoHole = return (env, CoHole) +zonkCoFn env (CoCompose c1 c2) = do { (env1, c1') <- zonkCoFn env c1 + ; (env2, c2') <- zonkCoFn env1 c2 + ; return (env2, CoCompose c1' c2') } +zonkCoFn env (CoLams ids c) = do { ids' <- zonkIdBndrs env ids + ; let env1 = extendZonkEnv env ids' + ; (env2, c') <- zonkCoFn env1 c + 
; return (env2, CoLams ids' c') } +zonkCoFn env (CoTyLams tvs c) = ASSERT( all isImmutableTyVar tvs ) + do { (env1, c') <- zonkCoFn env c + ; return (env1, CoTyLams tvs c') } +zonkCoFn env (CoApps c ids) = do { (env1, c') <- zonkCoFn env c + ; return (env1, CoApps c' (zonkIdOccs env ids)) } +zonkCoFn env (CoTyApps c tys) = do { tys' <- zonkTcTypeToTypes env tys + ; (env1, c') <- zonkCoFn env c + ; return (env1, CoTyApps c' tys') } +zonkCoFn env (CoLet bs c) = do { (env1, bs') <- zonkRecMonoBinds env bs + ; (env2, c') <- zonkCoFn env1 c + ; return (env2, CoLet bs' c') } + + +------------------------------------------------------------------------- +zonkDo :: ZonkEnv -> HsStmtContext Name -> HsStmtContext Name +-- Only used for 'do', so the only Ids are in a MDoExpr table +zonkDo env (MDoExpr tbl) = MDoExpr (mapSnd (zonkIdOcc env) tbl) +zonkDo env do_or_lc = do_or_lc + +------------------------------------------------------------------------- +zonkOverLit :: ZonkEnv -> HsOverLit TcId -> TcM (HsOverLit Id) +zonkOverLit env (HsIntegral i e) + = do { e' <- zonkExpr env e; return (HsIntegral i e') } +zonkOverLit env (HsFractional r e) + = do { e' <- zonkExpr env e; return (HsFractional r e') } + +------------------------------------------------------------------------- +zonkArithSeq :: ZonkEnv -> ArithSeqInfo TcId -> TcM (ArithSeqInfo Id) + +zonkArithSeq env (From e) + = zonkLExpr env e `thenM` \ new_e -> + returnM (From new_e) + +zonkArithSeq env (FromThen e1 e2) + = zonkLExpr env e1 `thenM` \ new_e1 -> + zonkLExpr env e2 `thenM` \ new_e2 -> + returnM (FromThen new_e1 new_e2) + +zonkArithSeq env (FromTo e1 e2) + = zonkLExpr env e1 `thenM` \ new_e1 -> + zonkLExpr env e2 `thenM` \ new_e2 -> + returnM (FromTo new_e1 new_e2) + +zonkArithSeq env (FromThenTo e1 e2 e3) + = zonkLExpr env e1 `thenM` \ new_e1 -> + zonkLExpr env e2 `thenM` \ new_e2 -> + zonkLExpr env e3 `thenM` \ new_e3 -> + returnM (FromThenTo new_e1 new_e2 new_e3) + + +------------------------------------------------------------------------- +zonkStmts :: ZonkEnv -> [LStmt TcId] -> TcM (ZonkEnv, [LStmt Id]) +zonkStmts env [] = return (env, []) +zonkStmts env (s:ss) = do { (env1, s') <- wrapLocSndM (zonkStmt env) s + ; (env2, ss') <- zonkStmts env1 ss + ; return (env2, s' : ss') } + +zonkStmt :: ZonkEnv -> Stmt TcId -> TcM (ZonkEnv, Stmt Id) +zonkStmt env (ParStmt stmts_w_bndrs) + = mappM zonk_branch stmts_w_bndrs `thenM` \ new_stmts_w_bndrs -> + let + new_binders = concat (map snd new_stmts_w_bndrs) + env1 = extendZonkEnv env new_binders + in + return (env1, ParStmt new_stmts_w_bndrs) + where + zonk_branch (stmts, bndrs) = zonkStmts env stmts `thenM` \ (env1, new_stmts) -> + returnM (new_stmts, zonkIdOccs env1 bndrs) + +zonkStmt env (RecStmt segStmts lvs rvs rets binds) + = zonkIdBndrs env rvs `thenM` \ new_rvs -> + let + env1 = extendZonkEnv env new_rvs + in + zonkStmts env1 segStmts `thenM` \ (env2, new_segStmts) -> + -- Zonk the ret-expressions in an envt that + -- has the polymorphic bindings in the envt + mapM (zonkExpr env2) rets `thenM` \ new_rets -> + let + new_lvs = zonkIdOccs env2 lvs + env3 = extendZonkEnv env new_lvs -- Only the lvs are needed + in + zonkRecMonoBinds env3 binds `thenM` \ (env4, new_binds) -> + returnM (env4, RecStmt new_segStmts new_lvs new_rvs new_rets new_binds) + +zonkStmt env (ExprStmt expr then_op ty) + = zonkLExpr env expr `thenM` \ new_expr -> + zonkExpr env then_op `thenM` \ new_then -> + zonkTcTypeToType env ty `thenM` \ new_ty -> + returnM (env, ExprStmt new_expr new_then new_ty) + +zonkStmt env 
(LetStmt binds) + = zonkLocalBinds env binds `thenM` \ (env1, new_binds) -> + returnM (env1, LetStmt new_binds) + +zonkStmt env (BindStmt pat expr bind_op fail_op) + = do { new_expr <- zonkLExpr env expr + ; (env1, new_pat) <- zonkPat env pat + ; new_bind <- zonkExpr env bind_op + ; new_fail <- zonkExpr env fail_op + ; return (env1, BindStmt new_pat new_expr new_bind new_fail) } + + +------------------------------------------------------------------------- +zonkRbinds :: ZonkEnv -> HsRecordBinds TcId -> TcM (HsRecordBinds Id) + +zonkRbinds env rbinds + = mappM zonk_rbind rbinds + where + zonk_rbind (field, expr) + = zonkLExpr env expr `thenM` \ new_expr -> + returnM (fmap (zonkIdOcc env) field, new_expr) + +------------------------------------------------------------------------- +mapIPNameTc :: (a -> TcM b) -> IPName a -> TcM (IPName b) +mapIPNameTc f (Dupable n) = f n `thenM` \ r -> returnM (Dupable r) +mapIPNameTc f (Linear n) = f n `thenM` \ r -> returnM (Linear r) +\end{code} + + +%************************************************************************ +%* * +\subsection[BackSubst-Pats]{Patterns} +%* * +%************************************************************************ + +\begin{code} +zonkPat :: ZonkEnv -> OutPat TcId -> TcM (ZonkEnv, OutPat Id) +-- Extend the environment as we go, because it's possible for one +-- pattern to bind something that is used in another (inside or +-- to the right) +zonkPat env pat = wrapLocSndM (zonk_pat env) pat + +zonk_pat env (ParPat p) + = do { (env', p') <- zonkPat env p + ; return (env', ParPat p') } + +zonk_pat env (WildPat ty) + = do { ty' <- zonkTcTypeToType env ty + ; return (env, WildPat ty') } + +zonk_pat env (VarPat v) + = do { v' <- zonkIdBndr env v + ; return (extendZonkEnv1 env v', VarPat v') } + +zonk_pat env (VarPatOut v binds) + = do { v' <- zonkIdBndr env v + ; (env', binds') <- zonkRecMonoBinds (extendZonkEnv1 env v') binds + ; returnM (env', VarPatOut v' binds') } + +zonk_pat env (LazyPat pat) + = do { (env', pat') <- zonkPat env pat + ; return (env', LazyPat pat') } + +zonk_pat env (BangPat pat) + = do { (env', pat') <- zonkPat env pat + ; return (env', BangPat pat') } + +zonk_pat env (AsPat (L loc v) pat) + = do { v' <- zonkIdBndr env v + ; (env', pat') <- zonkPat (extendZonkEnv1 env v') pat + ; return (env', AsPat (L loc v') pat') } + +zonk_pat env (ListPat pats ty) + = do { ty' <- zonkTcTypeToType env ty + ; (env', pats') <- zonkPats env pats + ; return (env', ListPat pats' ty') } + +zonk_pat env (PArrPat pats ty) + = do { ty' <- zonkTcTypeToType env ty + ; (env', pats') <- zonkPats env pats + ; return (env', PArrPat pats' ty') } + +zonk_pat env (TuplePat pats boxed ty) + = do { ty' <- zonkTcTypeToType env ty + ; (env', pats') <- zonkPats env pats + ; return (env', TuplePat pats' boxed ty') } + +zonk_pat env (ConPatOut n tvs dicts binds stuff ty) + = ASSERT( all isImmutableTyVar tvs ) + do { new_ty <- zonkTcTypeToType env ty + ; new_dicts <- zonkIdBndrs env dicts + ; let env1 = extendZonkEnv env new_dicts + ; (env2, new_binds) <- zonkRecMonoBinds env1 binds + ; (env', new_stuff) <- zonkConStuff env2 stuff + ; returnM (env', ConPatOut n tvs new_dicts new_binds new_stuff new_ty) } + +zonk_pat env (LitPat lit) = return (env, LitPat lit) + +zonk_pat env (SigPatOut pat ty) + = do { ty' <- zonkTcTypeToType env ty + ; (env', pat') <- zonkPat env pat + ; return (env', SigPatOut pat' ty') } + +zonk_pat env (NPat lit mb_neg eq_expr ty) + = do { lit' <- zonkOverLit env lit + ; mb_neg' <- case mb_neg of + Nothing -> return Nothing + 
Just neg -> do { neg' <- zonkExpr env neg + ; return (Just neg') } + ; eq_expr' <- zonkExpr env eq_expr + ; ty' <- zonkTcTypeToType env ty + ; return (env, NPat lit' mb_neg' eq_expr' ty') } + +zonk_pat env (NPlusKPat (L loc n) lit e1 e2) + = do { n' <- zonkIdBndr env n + ; lit' <- zonkOverLit env lit + ; e1' <- zonkExpr env e1 + ; e2' <- zonkExpr env e2 + ; return (extendZonkEnv1 env n', NPlusKPat (L loc n') lit' e1' e2') } + +zonk_pat env (DictPat ds ms) + = do { ds' <- zonkIdBndrs env ds + ; ms' <- zonkIdBndrs env ms + ; return (extendZonkEnv env (ds' ++ ms'), DictPat ds' ms') } + +--------------------------- +zonkConStuff env (PrefixCon pats) + = do { (env', pats') <- zonkPats env pats + ; return (env', PrefixCon pats') } + +zonkConStuff env (InfixCon p1 p2) + = do { (env1, p1') <- zonkPat env p1 + ; (env', p2') <- zonkPat env1 p2 + ; return (env', InfixCon p1' p2') } + +zonkConStuff env (RecCon rpats) + = do { (env', pats') <- zonkPats env pats + ; returnM (env', RecCon (fields `zip` pats')) } + where + (fields, pats) = unzip rpats + +--------------------------- +zonkPats env [] = return (env, []) +zonkPats env (pat:pats) = do { (env1, pat') <- zonkPat env pat + ; (env', pats') <- zonkPats env1 pats + ; return (env', pat':pats') } +\end{code} + +%************************************************************************ +%* * +\subsection[BackSubst-Foreign]{Foreign exports} +%* * +%************************************************************************ + + +\begin{code} +zonkForeignExports :: ZonkEnv -> [LForeignDecl TcId] -> TcM [LForeignDecl Id] +zonkForeignExports env ls = mappM (wrapLocM (zonkForeignExport env)) ls + +zonkForeignExport :: ZonkEnv -> ForeignDecl TcId -> TcM (ForeignDecl Id) +zonkForeignExport env (ForeignExport i hs_ty spec isDeprec) = + returnM (ForeignExport (fmap (zonkIdOcc env) i) undefined spec isDeprec) +zonkForeignExport env for_imp + = returnM for_imp -- Foreign imports don't need zonking +\end{code} + +\begin{code} +zonkRules :: ZonkEnv -> [LRuleDecl TcId] -> TcM [LRuleDecl Id] +zonkRules env rs = mappM (wrapLocM (zonkRule env)) rs + +zonkRule :: ZonkEnv -> RuleDecl TcId -> TcM (RuleDecl Id) +zonkRule env (HsRule name act (vars::[RuleBndr TcId]) lhs fv_lhs rhs fv_rhs) + = mappM zonk_bndr vars `thenM` \ new_bndrs -> + newMutVar emptyVarSet `thenM` \ unbound_tv_set -> + let + env_rhs = extendZonkEnv env [id | b <- new_bndrs, let id = unLoc b, isId id] + -- Type variables don't need an envt + -- They are bound through the mutable mechanism + + env_lhs = setZonkType env_rhs (zonkTypeCollecting unbound_tv_set) + -- We need to gather the type variables mentioned on the LHS so we can + -- quantify over them. Example: + -- data T a = C + -- + -- foo :: T a -> Int + -- foo C = 1 + -- + -- {-# RULES "myrule" foo C = 1 #-} + -- + -- After type checking the LHS becomes (foo a (C a)) + -- and we do not want to zap the unbound tyvar 'a' to (), because + -- that limits the applicability of the rule. Instead, we + -- want to quantify over it! + -- + -- It's easiest to find the free tyvars here. Attempts to do so earlier + -- are tiresome, because (a) the data type is big and (b) finding the + -- free type vars of an expression is necessarily monadic operation. 
+ -- (consider /\a -> f @ b, where b is side-effected to a) + in + zonkLExpr env_lhs lhs `thenM` \ new_lhs -> + zonkLExpr env_rhs rhs `thenM` \ new_rhs -> + + readMutVar unbound_tv_set `thenM` \ unbound_tvs -> + let + final_bndrs :: [Located Var] + final_bndrs = map noLoc (varSetElems unbound_tvs) ++ new_bndrs + in + returnM (HsRule name act (map RuleBndr final_bndrs) new_lhs fv_lhs new_rhs fv_rhs) + -- I hate this map RuleBndr stuff + where + zonk_bndr (RuleBndr v) + | isId (unLoc v) = wrapLocM (zonkIdBndr env) v + | otherwise = ASSERT( isImmutableTyVar (unLoc v) ) + return v +\end{code} + + +%************************************************************************ +%* * +\subsection[BackSubst-Foreign]{Foreign exports} +%* * +%************************************************************************ + +\begin{code} +zonkTcTypeToType :: ZonkEnv -> TcType -> TcM Type +zonkTcTypeToType (ZonkEnv zonk_ty _) ty = zonk_ty ty + +zonkTcTypeToTypes :: ZonkEnv -> [TcType] -> TcM [Type] +zonkTcTypeToTypes env tys = mapM (zonkTcTypeToType env) tys + +zonkTypeCollecting :: TcRef TyVarSet -> TcType -> TcM Type +-- This variant collects unbound type variables in a mutable variable +zonkTypeCollecting unbound_tv_set + = zonkType zonk_unbound_tyvar + where + zonk_unbound_tyvar tv + = zonkQuantifiedTyVar tv `thenM` \ tv' -> + readMutVar unbound_tv_set `thenM` \ tv_set -> + writeMutVar unbound_tv_set (extendVarSet tv_set tv') `thenM_` + return (mkTyVarTy tv') + +zonkTypeZapping :: TcType -> TcM Type +-- This variant is used for everything except the LHS of rules +-- It zaps unbound type variables to (), or some other arbitrary type +zonkTypeZapping ty + = zonkType zonk_unbound_tyvar ty + where + -- Zonk a mutable but unbound type variable to an arbitrary type + -- We know it's unbound even though we don't carry an environment, + -- because at the binding site for a type variable we bind the + -- mutable tyvar to a fresh immutable one. So the mutable store + -- plays the role of an environment. If we come across a mutable + -- type variable that isn't so bound, it must be completely free. + zonk_unbound_tyvar tv = do { writeMetaTyVar tv ty; return ty } + where + ty = mkArbitraryType tv + + +-- When the type checker finds a type variable with no binding, +-- which means it can be instantiated with an arbitrary type, it +-- usually instantiates it to Void. Eg. +-- +-- length [] +-- ===> +-- length Void (Nil Void) +-- +-- But in really obscure programs, the type variable might have +-- a kind other than *, so we need to invent a suitably-kinded type. +-- +-- This commit uses +-- Void for kind * +-- List for kind *->* +-- Tuple for kind *->...*->* +-- +-- which deals with most cases. (Previously, it only dealt with +-- kind *.) +-- +-- In the other cases, it just makes up a TyCon with a suitable +-- kind. If this gets into an interface file, anyone reading that +-- file won't understand it. This is fixable (by making the client +-- of the interface file make up a TyCon too) but it is tiresome and +-- never happens, so I am leaving it + +mkArbitraryType :: TcTyVar -> Type +-- Make up an arbitrary type whose kind is the same as the tyvar. +-- We'll use this to instantiate the (unbound) tyvar. 
+mkArbitraryType tv + | liftedTypeKind `isSubKind` kind = voidTy -- The vastly common case + | otherwise = mkTyConApp tycon [] + where + kind = tyVarKind tv + (args,res) = splitKindFunTys kind + + tycon | kind == tyConKind listTyCon -- *->* + = listTyCon -- No tuples this size + + | all isLiftedTypeKind args && isLiftedTypeKind res + = tupleTyCon Boxed (length args) -- *-> ... ->*->* + + | otherwise + = pprTrace "Urk! Inventing strangely-kinded void TyCon:" (ppr tc_name $$ ppr kind) $ + mkPrimTyCon tc_name kind 0 [] VoidRep + -- Same name as the tyvar, apart from making it start with a colon (sigh) + -- I dread to think what will happen if this gets out into an + -- interface file. Catastrophe likely. Major sigh. + + tc_name = mkInternalName (getUnique tv) (mkDerivedTyConOcc (getOccName tv)) noSrcLoc +\end{code} diff --git a/compiler/typecheck/TcHsType.lhs b/compiler/typecheck/TcHsType.lhs new file mode 100644 index 0000000000..968ccfb960 --- /dev/null +++ b/compiler/typecheck/TcHsType.lhs @@ -0,0 +1,816 @@ + +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[TcMonoType]{Typechecking user-specified @MonoTypes@} + +\begin{code} +module TcHsType ( + tcHsSigType, tcHsDeriv, + UserTypeCtxt(..), + + -- Kind checking + kcHsTyVars, kcHsSigType, kcHsLiftedSigType, + kcCheckHsType, kcHsContext, kcHsType, + + -- Typechecking kinded types + tcHsKindedContext, tcHsKindedType, tcHsBangType, + tcTyVarBndrs, dsHsType, tcLHsConResTy, + tcDataKindSig, + + -- Pattern type signatures + tcHsPatSigType, tcPatSig + ) where + +#include "HsVersions.h" + +import HsSyn ( HsType(..), LHsType, HsTyVarBndr(..), LHsTyVarBndr, + LHsContext, HsPred(..), LHsPred, HsExplicitForAll(..) ) +import RnHsSyn ( extractHsTyVars ) +import TcRnMonad +import TcEnv ( tcExtendTyVarEnv, tcExtendKindEnvTvs, + tcLookup, tcLookupClass, tcLookupTyCon, + TyThing(..), getInLocalScope, getScopedTyVarBinds, + wrongThingErr + ) +import TcMType ( newKindVar, + zonkTcKindToKind, + tcInstBoxyTyVar, readFilledBox, + checkValidType + ) +import TcUnify ( boxyUnify, unifyFunKind, checkExpectedKind ) +import TcIface ( checkWiredInTyCon ) +import TcType ( Type, PredType(..), ThetaType, BoxySigmaType, + TcType, TcKind, isRigidTy, + UserTypeCtxt(..), pprUserTypeCtxt, + substTyWith, mkTyVarTys, tcEqType, + tcIsTyVarTy, mkFunTy, mkSigmaTy, mkPredTy, + mkTyConApp, mkAppTys, typeKind ) +import Kind ( Kind, isLiftedTypeKind, liftedTypeKind, ubxTupleKind, + openTypeKind, argTypeKind, splitKindFunTys ) +import Var ( TyVar, mkTyVar, tyVarName ) +import TyCon ( TyCon, tyConKind ) +import Class ( Class, classTyCon ) +import Name ( Name, mkInternalName ) +import OccName ( mkOccName, tvName ) +import NameSet +import PrelNames ( genUnitTyConName ) +import TysWiredIn ( mkListTy, listTyCon, mkPArrTy, parrTyCon, tupleTyCon ) +import BasicTypes ( Boxity(..) ) +import SrcLoc ( Located(..), unLoc, noLoc, getLoc, srcSpanStart ) +import UniqSupply ( uniqsFromSupply ) +import Outputable +\end{code} + + + ---------------------------- + General notes + ---------------------------- + +Generally speaking we now type-check types in three phases + + 1. kcHsType: kind check the HsType + *includes* performing any TH type splices; + so it returns a translated, and kind-annotated, type + + 2. dsHsType: convert from HsType to Type: + perform zonking + expand type synonyms [mkGenTyApps] + hoist the foralls [tcHsType] + + 3. checkValidType: check the validity of the resulting type + +Often these steps are done one after the other (tcHsSigType). 
+But in mutually recursive groups of type and class decls we do + 1 kind-check the whole group + 2 build TyCons/Classes in a knot-tied way + 3 check the validity of types in the now-unknotted TyCons/Classes + +For example, when we find + (forall a m. m a -> m a) +we bind a,m to kind varibles and kind-check (m a -> m a). This makes +a get kind *, and m get kind *->*. Now we typecheck (m a -> m a) in +an environment that binds a and m suitably. + +The kind checker passed to tcHsTyVars needs to look at enough to +establish the kind of the tyvar: + * For a group of type and class decls, it's just the group, not + the rest of the program + * For a tyvar bound in a pattern type signature, its the types + mentioned in the other type signatures in that bunch of patterns + * For a tyvar bound in a RULE, it's the type signatures on other + universally quantified variables in the rule + +Note that this may occasionally give surprising results. For example: + + data T a b = MkT (a b) + +Here we deduce a::*->*, b::* +But equally valid would be a::(*->*)-> *, b::*->* + + +Validity checking +~~~~~~~~~~~~~~~~~ +Some of the validity check could in principle be done by the kind checker, +but not all: + +- During desugaring, we normalise by expanding type synonyms. Only + after this step can we check things like type-synonym saturation + e.g. type T k = k Int + type S a = a + Then (T S) is ok, because T is saturated; (T S) expands to (S Int); + and then S is saturated. This is a GHC extension. + +- Similarly, also a GHC extension, we look through synonyms before complaining + about the form of a class or instance declaration + +- Ambiguity checks involve functional dependencies, and it's easier to wait + until knots have been resolved before poking into them + +Also, in a mutually recursive group of types, we can't look at the TyCon until we've +finished building the loop. So to keep things simple, we postpone most validity +checking until step (3). + +Knot tying +~~~~~~~~~~ +During step (1) we might fault in a TyCon defined in another module, and it might +(via a loop) refer back to a TyCon defined in this module. So when we tie a big +knot around type declarations with ARecThing, so that the fault-in code can get +the TyCon being defined. + + +%************************************************************************ +%* * +\subsection{Checking types} +%* * +%************************************************************************ + +\begin{code} +tcHsSigType :: UserTypeCtxt -> LHsType Name -> TcM Type + -- Do kind checking, and hoist for-alls to the top + -- NB: it's important that the foralls that come from the top-level + -- HsForAllTy in hs_ty occur *first* in the returned type. + -- See Note [Scoped] with TcSigInfo +tcHsSigType ctxt hs_ty + = addErrCtxt (pprHsSigCtxt ctxt hs_ty) $ + do { kinded_ty <- kcTypeType hs_ty + ; ty <- tcHsKindedType kinded_ty + ; checkValidType ctxt ty + ; returnM ty } + +-- Used for the deriving(...) items +tcHsDeriv :: LHsType Name -> TcM ([TyVar], Class, [Type]) +tcHsDeriv = addLocM (tc_hs_deriv []) + +tc_hs_deriv tv_names (HsPredTy (HsClassP cls_name hs_tys)) + = kcHsTyVars tv_names $ \ tv_names' -> + do { cls_kind <- kcClass cls_name + ; (tys, res_kind) <- kcApps cls_kind (ppr cls_name) hs_tys + ; tcTyVarBndrs tv_names' $ \ tyvars -> + do { arg_tys <- dsHsTypes tys + ; cls <- tcLookupClass cls_name + ; return (tyvars, cls, arg_tys) }} + +tc_hs_deriv tv_names1 (HsForAllTy _ tv_names2 (L _ []) (L _ ty)) + = -- Funny newtype deriving form + -- forall a. 
C [a] + -- where C has arity 2. Hence can't use regular functions + tc_hs_deriv (tv_names1 ++ tv_names2) ty + +tc_hs_deriv _ other + = failWithTc (ptext SLIT("Illegal deriving item") <+> ppr other) +\end{code} + + These functions are used during knot-tying in + type and class declarations, when we have to + separate kind-checking, desugaring, and validity checking + +\begin{code} +kcHsSigType, kcHsLiftedSigType :: LHsType Name -> TcM (LHsType Name) + -- Used for type signatures +kcHsSigType ty = kcTypeType ty +kcHsLiftedSigType ty = kcLiftedType ty + +tcHsKindedType :: LHsType Name -> TcM Type + -- Don't do kind checking, nor validity checking, + -- but do hoist for-alls to the top + -- This is used in type and class decls, where kinding is + -- done in advance, and validity checking is done later + -- [Validity checking done later because of knot-tying issues.] +tcHsKindedType hs_ty = dsHsType hs_ty + +tcHsBangType :: LHsType Name -> TcM Type +-- Permit a bang, but discard it +tcHsBangType (L span (HsBangTy b ty)) = tcHsKindedType ty +tcHsBangType ty = tcHsKindedType ty + +tcHsKindedContext :: LHsContext Name -> TcM ThetaType +-- Used when we are expecting a ClassContext (i.e. no implicit params) +-- Does not do validity checking, like tcHsKindedType +tcHsKindedContext hs_theta = addLocM (mappM dsHsLPred) hs_theta +\end{code} + + +%************************************************************************ +%* * + The main kind checker: kcHsType +%* * +%************************************************************************ + + First a couple of simple wrappers for kcHsType + +\begin{code} +--------------------------- +kcLiftedType :: LHsType Name -> TcM (LHsType Name) +-- The type ty must be a *lifted* *type* +kcLiftedType ty = kcCheckHsType ty liftedTypeKind + +--------------------------- +kcTypeType :: LHsType Name -> TcM (LHsType Name) +-- The type ty must be a *type*, but it can be lifted or +-- unlifted or an unboxed tuple. +kcTypeType ty = kcCheckHsType ty openTypeKind + +--------------------------- +kcCheckHsType :: LHsType Name -> TcKind -> TcM (LHsType Name) +-- Check that the type has the specified kind +-- Be sure to use checkExpectedKind, rather than simply unifying +-- with OpenTypeKind, because it gives better error messages +kcCheckHsType (L span ty) exp_kind + = setSrcSpan span $ + do { (ty', act_kind) <- add_ctxt ty (kc_hs_type ty) + -- Add the context round the inner check only + -- because checkExpectedKind already mentions + -- 'ty' by name in any error message + + ; checkExpectedKind ty act_kind exp_kind + ; return (L span ty') } + where + -- Wrap a context around only if we want to + -- show that contexts. Omit invisble ones + -- and ones user's won't grok (HsPred p). + add_ctxt (HsPredTy p) thing = thing + add_ctxt (HsForAllTy Implicit tvs (L _ []) ty) thing = thing + add_ctxt other_ty thing = addErrCtxt (typeCtxt ty) thing +\end{code} + + Here comes the main function + +\begin{code} +kcHsType :: LHsType Name -> TcM (LHsType Name, TcKind) +kcHsType ty = wrapLocFstM kc_hs_type ty +-- kcHsType *returns* the kind of the type, rather than taking an expected +-- kind as argument as tcExpr does. +-- Reasons: +-- (a) the kind of (->) is +-- forall bx1 bx2. Type bx1 -> Type bx2 -> Type Boxed +-- so we'd need to generate huge numbers of bx variables. 
+-- (b) kinds are so simple that the error messages are fine +-- +-- The translated type has explicitly-kinded type-variable binders + +kc_hs_type (HsParTy ty) + = kcHsType ty `thenM` \ (ty', kind) -> + returnM (HsParTy ty', kind) + +kc_hs_type (HsTyVar name) + = kcTyVar name `thenM` \ kind -> + returnM (HsTyVar name, kind) + +kc_hs_type (HsListTy ty) + = kcLiftedType ty `thenM` \ ty' -> + returnM (HsListTy ty', liftedTypeKind) + +kc_hs_type (HsPArrTy ty) + = kcLiftedType ty `thenM` \ ty' -> + returnM (HsPArrTy ty', liftedTypeKind) + +kc_hs_type (HsNumTy n) + = returnM (HsNumTy n, liftedTypeKind) + +kc_hs_type (HsKindSig ty k) + = kcCheckHsType ty k `thenM` \ ty' -> + returnM (HsKindSig ty' k, k) + +kc_hs_type (HsTupleTy Boxed tys) + = mappM kcLiftedType tys `thenM` \ tys' -> + returnM (HsTupleTy Boxed tys', liftedTypeKind) + +kc_hs_type (HsTupleTy Unboxed tys) + = mappM kcTypeType tys `thenM` \ tys' -> + returnM (HsTupleTy Unboxed tys', ubxTupleKind) + +kc_hs_type (HsFunTy ty1 ty2) + = kcCheckHsType ty1 argTypeKind `thenM` \ ty1' -> + kcTypeType ty2 `thenM` \ ty2' -> + returnM (HsFunTy ty1' ty2', liftedTypeKind) + +kc_hs_type ty@(HsOpTy ty1 op ty2) + = addLocM kcTyVar op `thenM` \ op_kind -> + kcApps op_kind (ppr op) [ty1,ty2] `thenM` \ ([ty1',ty2'], res_kind) -> + returnM (HsOpTy ty1' op ty2', res_kind) + +kc_hs_type ty@(HsAppTy ty1 ty2) + = kcHsType fun_ty `thenM` \ (fun_ty', fun_kind) -> + kcApps fun_kind (ppr fun_ty) arg_tys `thenM` \ ((arg_ty':arg_tys'), res_kind) -> + returnM (foldl mk_app (HsAppTy fun_ty' arg_ty') arg_tys', res_kind) + where + (fun_ty, arg_tys) = split ty1 [ty2] + split (L _ (HsAppTy f a)) as = split f (a:as) + split f as = (f,as) + mk_app fun arg = HsAppTy (noLoc fun) arg -- Add noLocs for inner nodes of + -- the application; they are never used + +kc_hs_type (HsPredTy pred) + = kcHsPred pred `thenM` \ pred' -> + returnM (HsPredTy pred', liftedTypeKind) + +kc_hs_type (HsForAllTy exp tv_names context ty) + = kcHsTyVars tv_names $ \ tv_names' -> + kcHsContext context `thenM` \ ctxt' -> + kcLiftedType ty `thenM` \ ty' -> + -- The body of a forall is usually a type, but in principle + -- there's no reason to prohibit *unlifted* types. 
+ -- In fact, GHC can itself construct a function with an + -- unboxed tuple inside a for-all (via CPR analyis; see + -- typecheck/should_compile/tc170) + -- + -- Still, that's only for internal interfaces, which aren't + -- kind-checked, so we only allow liftedTypeKind here + returnM (HsForAllTy exp tv_names' ctxt' ty', liftedTypeKind) + +kc_hs_type (HsBangTy b ty) + = do { (ty', kind) <- kcHsType ty + ; return (HsBangTy b ty', kind) } + +kc_hs_type ty@(HsSpliceTy _) + = failWithTc (ptext SLIT("Unexpected type splice:") <+> ppr ty) + + +--------------------------- +kcApps :: TcKind -- Function kind + -> SDoc -- Function + -> [LHsType Name] -- Arg types + -> TcM ([LHsType Name], TcKind) -- Kind-checked args +kcApps fun_kind ppr_fun args + = split_fk fun_kind (length args) `thenM` \ (arg_kinds, res_kind) -> + zipWithM kc_arg args arg_kinds `thenM` \ args' -> + returnM (args', res_kind) + where + split_fk fk 0 = returnM ([], fk) + split_fk fk n = unifyFunKind fk `thenM` \ mb_fk -> + case mb_fk of + Nothing -> failWithTc too_many_args + Just (ak,fk') -> split_fk fk' (n-1) `thenM` \ (aks, rk) -> + returnM (ak:aks, rk) + + kc_arg arg arg_kind = kcCheckHsType arg arg_kind + + too_many_args = ptext SLIT("Kind error:") <+> quotes ppr_fun <+> + ptext SLIT("is applied to too many type arguments") + +--------------------------- +kcHsContext :: LHsContext Name -> TcM (LHsContext Name) +kcHsContext ctxt = wrapLocM (mappM kcHsLPred) ctxt + +kcHsLPred :: LHsPred Name -> TcM (LHsPred Name) +kcHsLPred = wrapLocM kcHsPred + +kcHsPred :: HsPred Name -> TcM (HsPred Name) +kcHsPred pred -- Checks that the result is of kind liftedType + = kc_pred pred `thenM` \ (pred', kind) -> + checkExpectedKind pred kind liftedTypeKind `thenM_` + returnM pred' + +--------------------------- +kc_pred :: HsPred Name -> TcM (HsPred Name, TcKind) + -- Does *not* check for a saturated + -- application (reason: used from TcDeriv) +kc_pred pred@(HsIParam name ty) + = kcHsType ty `thenM` \ (ty', kind) -> + returnM (HsIParam name ty', kind) + +kc_pred pred@(HsClassP cls tys) + = kcClass cls `thenM` \ kind -> + kcApps kind (ppr cls) tys `thenM` \ (tys', res_kind) -> + returnM (HsClassP cls tys', res_kind) + +--------------------------- +kcTyVar :: Name -> TcM TcKind +kcTyVar name -- Could be a tyvar or a tycon + = traceTc (text "lk1" <+> ppr name) `thenM_` + tcLookup name `thenM` \ thing -> + traceTc (text "lk2" <+> ppr name <+> ppr thing) `thenM_` + case thing of + ATyVar _ ty -> returnM (typeKind ty) + AThing kind -> returnM kind + AGlobal (ATyCon tc) -> returnM (tyConKind tc) + other -> wrongThingErr "type" thing name + +kcClass :: Name -> TcM TcKind +kcClass cls -- Must be a class + = tcLookup cls `thenM` \ thing -> + case thing of + AThing kind -> returnM kind + AGlobal (AClass cls) -> returnM (tyConKind (classTyCon cls)) + other -> wrongThingErr "class" thing cls +\end{code} + + +%************************************************************************ +%* * + Desugaring +%* * +%************************************************************************ + +The type desugarer + + * Transforms from HsType to Type + * Zonks any kinds + +It cannot fail, and does no validity checking, except for +structural matters, such as + (a) spurious ! annotations. 
+ (b) a class used as a type + +\begin{code} +dsHsType :: LHsType Name -> TcM Type +-- All HsTyVarBndrs in the intput type are kind-annotated +dsHsType ty = ds_type (unLoc ty) + +ds_type ty@(HsTyVar name) + = ds_app ty [] + +ds_type (HsParTy ty) -- Remove the parentheses markers + = dsHsType ty + +ds_type ty@(HsBangTy _ _) -- No bangs should be here + = failWithTc (ptext SLIT("Unexpected strictness annotation:") <+> ppr ty) + +ds_type (HsKindSig ty k) + = dsHsType ty -- Kind checking done already + +ds_type (HsListTy ty) + = dsHsType ty `thenM` \ tau_ty -> + checkWiredInTyCon listTyCon `thenM_` + returnM (mkListTy tau_ty) + +ds_type (HsPArrTy ty) + = dsHsType ty `thenM` \ tau_ty -> + checkWiredInTyCon parrTyCon `thenM_` + returnM (mkPArrTy tau_ty) + +ds_type (HsTupleTy boxity tys) + = dsHsTypes tys `thenM` \ tau_tys -> + checkWiredInTyCon tycon `thenM_` + returnM (mkTyConApp tycon tau_tys) + where + tycon = tupleTyCon boxity (length tys) + +ds_type (HsFunTy ty1 ty2) + = dsHsType ty1 `thenM` \ tau_ty1 -> + dsHsType ty2 `thenM` \ tau_ty2 -> + returnM (mkFunTy tau_ty1 tau_ty2) + +ds_type (HsOpTy ty1 (L span op) ty2) + = dsHsType ty1 `thenM` \ tau_ty1 -> + dsHsType ty2 `thenM` \ tau_ty2 -> + setSrcSpan span (ds_var_app op [tau_ty1,tau_ty2]) + +ds_type (HsNumTy n) + = ASSERT(n==1) + tcLookupTyCon genUnitTyConName `thenM` \ tc -> + returnM (mkTyConApp tc []) + +ds_type ty@(HsAppTy _ _) + = ds_app ty [] + +ds_type (HsPredTy pred) + = dsHsPred pred `thenM` \ pred' -> + returnM (mkPredTy pred') + +ds_type full_ty@(HsForAllTy exp tv_names ctxt ty) + = tcTyVarBndrs tv_names $ \ tyvars -> + mappM dsHsLPred (unLoc ctxt) `thenM` \ theta -> + dsHsType ty `thenM` \ tau -> + returnM (mkSigmaTy tyvars theta tau) + +dsHsTypes arg_tys = mappM dsHsType arg_tys +\end{code} + +Help functions for type applications +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +\begin{code} +ds_app :: HsType Name -> [LHsType Name] -> TcM Type +ds_app (HsAppTy ty1 ty2) tys + = ds_app (unLoc ty1) (ty2:tys) + +ds_app ty tys + = dsHsTypes tys `thenM` \ arg_tys -> + case ty of + HsTyVar fun -> ds_var_app fun arg_tys + other -> ds_type ty `thenM` \ fun_ty -> + returnM (mkAppTys fun_ty arg_tys) + +ds_var_app :: Name -> [Type] -> TcM Type +ds_var_app name arg_tys + = tcLookup name `thenM` \ thing -> + case thing of + ATyVar _ ty -> returnM (mkAppTys ty arg_tys) + AGlobal (ATyCon tc) -> returnM (mkTyConApp tc arg_tys) + other -> wrongThingErr "type" thing name +\end{code} + + +Contexts +~~~~~~~~ + +\begin{code} +dsHsLPred :: LHsPred Name -> TcM PredType +dsHsLPred pred = dsHsPred (unLoc pred) + +dsHsPred pred@(HsClassP class_name tys) + = dsHsTypes tys `thenM` \ arg_tys -> + tcLookupClass class_name `thenM` \ clas -> + returnM (ClassP clas arg_tys) + +dsHsPred (HsIParam name ty) + = dsHsType ty `thenM` \ arg_ty -> + returnM (IParam name arg_ty) +\end{code} + +GADT constructor signatures + +\begin{code} +tcLHsConResTy :: LHsType Name -> TcM (TyCon, [TcType]) +tcLHsConResTy ty@(L span _) + = setSrcSpan span $ + addErrCtxt (gadtResCtxt ty) $ + tc_con_res ty [] + +tc_con_res (L _ (HsAppTy fun res_ty)) res_tys + = do { res_ty' <- dsHsType res_ty + ; tc_con_res fun (res_ty' : res_tys) } + +tc_con_res ty@(L _ (HsTyVar name)) res_tys + = do { thing <- tcLookup name + ; case thing of + AGlobal (ATyCon tc) -> return (tc, res_tys) + other -> failWithTc (badGadtDecl ty) + } + +tc_con_res ty _ = failWithTc (badGadtDecl ty) + +gadtResCtxt ty + = hang (ptext SLIT("In the result type of a data constructor:")) + 2 (ppr ty) +badGadtDecl ty + = hang (ptext 
SLIT("Malformed constructor result type:")) + 2 (ppr ty) + +typeCtxt ty = ptext SLIT("In the type") <+> quotes (ppr ty) +\end{code} + +%************************************************************************ +%* * + Type-variable binders +%* * +%************************************************************************ + + +\begin{code} +kcHsTyVars :: [LHsTyVarBndr Name] + -> ([LHsTyVarBndr Name] -> TcM r) -- These binders are kind-annotated + -- They scope over the thing inside + -> TcM r +kcHsTyVars tvs thing_inside + = mappM (wrapLocM kcHsTyVar) tvs `thenM` \ bndrs -> + tcExtendKindEnvTvs bndrs (thing_inside bndrs) + +kcHsTyVar :: HsTyVarBndr Name -> TcM (HsTyVarBndr Name) + -- Return a *kind-annotated* binder, and a tyvar with a mutable kind in it +kcHsTyVar (UserTyVar name) = newKindVar `thenM` \ kind -> + returnM (KindedTyVar name kind) +kcHsTyVar (KindedTyVar name kind) = returnM (KindedTyVar name kind) + +------------------ +tcTyVarBndrs :: [LHsTyVarBndr Name] -- Kind-annotated binders, which need kind-zonking + -> ([TyVar] -> TcM r) + -> TcM r +-- Used when type-checking types/classes/type-decls +-- Brings into scope immutable TyVars, not mutable ones that require later zonking +tcTyVarBndrs bndrs thing_inside + = mapM (zonk . unLoc) bndrs `thenM` \ tyvars -> + tcExtendTyVarEnv tyvars (thing_inside tyvars) + where + zonk (KindedTyVar name kind) = do { kind' <- zonkTcKindToKind kind + ; return (mkTyVar name kind') } + zonk (UserTyVar name) = pprTrace "Un-kinded tyvar" (ppr name) $ + return (mkTyVar name liftedTypeKind) + +----------------------------------- +tcDataKindSig :: Maybe Kind -> TcM [TyVar] +-- GADT decls can have a (perhpas partial) kind signature +-- e.g. data T :: * -> * -> * where ... +-- This function makes up suitable (kinded) type variables for +-- the argument kinds, and checks that the result kind is indeed * +tcDataKindSig Nothing = return [] +tcDataKindSig (Just kind) + = do { checkTc (isLiftedTypeKind res_kind) (badKindSig kind) + ; span <- getSrcSpanM + ; us <- newUniqueSupply + ; let loc = srcSpanStart span + uniqs = uniqsFromSupply us + ; return [ mk_tv loc uniq str kind + | ((kind, str), uniq) <- arg_kinds `zip` names `zip` uniqs ] } + where + (arg_kinds, res_kind) = splitKindFunTys kind + mk_tv loc uniq str kind = mkTyVar name kind + where + name = mkInternalName uniq occ loc + occ = mkOccName tvName str + + names :: [String] -- a,b,c...aa,ab,ac etc + names = [ c:cs | cs <- "" : names, c <- ['a'..'z'] ] + +badKindSig :: Kind -> SDoc +badKindSig kind + = hang (ptext SLIT("Kind signature on data type declaration has non-* return kind")) + 2 (ppr kind) +\end{code} + + +%************************************************************************ +%* * + Scoped type variables +%* * +%************************************************************************ + + +tcAddScopedTyVars is used for scoped type variables added by pattern +type signatures + e.g. \ ((x::a), (y::a)) -> x+y +They never have explicit kinds (because this is source-code only) +They are mutable (because they can get bound to a more specific type). + +Usually we kind-infer and expand type splices, and then +tupecheck/desugar the type. That doesn't work well for scoped type +variables, because they scope left-right in patterns. (e.g. in the +example above, the 'a' in (y::a) is bound by the 'a' in (x::a). 
+ +The current not-very-good plan is to + * find all the types in the patterns + * find their free tyvars + * do kind inference + * bring the kinded type vars into scope + * BUT throw away the kind-checked type + (we'll kind-check it again when we type-check the pattern) + +This is bad because throwing away the kind checked type throws away +its splices. But too bad for now. [July 03] + +Historical note: + We no longer specify that these type variables must be univerally + quantified (lots of email on the subject). If you want to put that + back in, you need to + a) Do a checkSigTyVars after thing_inside + b) More insidiously, don't pass in expected_ty, else + we unify with it too early and checkSigTyVars barfs + Instead you have to pass in a fresh ty var, and unify + it with expected_ty afterwards + +\begin{code} +tcHsPatSigType :: UserTypeCtxt + -> LHsType Name -- The type signature + -> TcM ([TyVar], -- Newly in-scope type variables + Type) -- The signature +-- Used for type-checking type signatures in +-- (a) patterns e.g f (x::Int) = e +-- (b) result signatures e.g. g x :: Int = e +-- (c) RULE forall bndrs e.g. forall (x::Int). f x = x + +tcHsPatSigType ctxt hs_ty + = addErrCtxt (pprHsSigCtxt ctxt hs_ty) $ + do { -- Find the type variables that are mentioned in the type + -- but not already in scope. These are the ones that + -- should be bound by the pattern signature + in_scope <- getInLocalScope + ; let span = getLoc hs_ty + sig_tvs = [ L span (UserTyVar n) + | n <- nameSetToList (extractHsTyVars hs_ty), + not (in_scope n) ] + + -- Behave very like type-checking (HsForAllTy sig_tvs hs_ty), + -- except that we want to keep the tvs separate + ; (kinded_tvs, kinded_ty) <- kcHsTyVars sig_tvs $ \ kinded_tvs -> do + { kinded_ty <- kcTypeType hs_ty + ; return (kinded_tvs, kinded_ty) } + ; tcTyVarBndrs kinded_tvs $ \ tyvars -> do + { sig_ty <- dsHsType kinded_ty + ; checkValidType ctxt sig_ty + ; return (tyvars, sig_ty) + } } + +tcPatSig :: UserTypeCtxt + -> LHsType Name + -> BoxySigmaType + -> TcM (TcType, -- The type to use for "inside" the signature + [(Name,TcType)]) -- The new bit of type environment, binding + -- the scoped type variables +tcPatSig ctxt sig res_ty + = do { (sig_tvs, sig_ty) <- tcHsPatSigType ctxt sig + + ; if null sig_tvs then do { + -- The type signature binds no type variables, + -- and hence is rigid, so use it to zap the res_ty + boxyUnify sig_ty res_ty + ; return (sig_ty, []) + + } else do { + -- Type signature binds at least one scoped type variable + + -- A pattern binding cannot bind scoped type variables + -- The renamer fails with a name-out-of-scope error + -- if a pattern binding tries to bind a type variable, + -- So we just have an ASSERT here + ; let in_pat_bind = case ctxt of + BindPatSigCtxt -> True + other -> False + ; ASSERT( not in_pat_bind || null sig_tvs ) return () + + -- Check that pat_ty is rigid + ; checkTc (isRigidTy res_ty) (wobblyPatSig sig_tvs) + + -- Now match the pattern signature against res_ty + -- For convenience, and uniform-looking error messages + -- we do the matching by allocating meta type variables, + -- unifying, and reading out the results. + -- This is a strictly local operation. 
+ ; box_tvs <- mapM tcInstBoxyTyVar sig_tvs + ; boxyUnify (substTyWith sig_tvs (mkTyVarTys box_tvs) sig_ty) res_ty + ; sig_tv_tys <- mapM readFilledBox box_tvs + + -- Check that each is bound to a distinct type variable, + -- and one that is not already in scope + ; let tv_binds = map tyVarName sig_tvs `zip` sig_tv_tys + ; binds_in_scope <- getScopedTyVarBinds + ; check binds_in_scope tv_binds + + -- Phew! + ; return (res_ty, tv_binds) + } } + where + check in_scope [] = return () + check in_scope ((n,ty):rest) = do { check_one in_scope n ty + ; check ((n,ty):in_scope) rest } + + check_one in_scope n ty + = do { checkTc (tcIsTyVarTy ty) (scopedNonVar n ty) + -- Must bind to a type variable + + ; checkTc (null dups) (dupInScope n (head dups) ty) + -- Must not bind to the same type variable + -- as some other in-scope type variable + + ; return () } + where + dups = [n' | (n',ty') <- in_scope, tcEqType ty' ty] +\end{code} + + +%************************************************************************ +%* * + Scoped type variables +%* * +%************************************************************************ + +\begin{code} +pprHsSigCtxt :: UserTypeCtxt -> LHsType Name -> SDoc +pprHsSigCtxt ctxt hs_ty = vcat [ ptext SLIT("In") <+> pprUserTypeCtxt ctxt <> colon, + nest 2 (pp_sig ctxt) ] + where + pp_sig (FunSigCtxt n) = pp_n_colon n + pp_sig (ConArgCtxt n) = pp_n_colon n + pp_sig (ForSigCtxt n) = pp_n_colon n + pp_sig (RuleSigCtxt n) = pp_n_colon n + pp_sig other = ppr (unLoc hs_ty) + + pp_n_colon n = ppr n <+> dcolon <+> ppr (unLoc hs_ty) + + +wobblyPatSig sig_tvs + = hang (ptext SLIT("A pattern type signature cannot bind scoped type variables") + <+> pprQuotedList sig_tvs) + 2 (ptext SLIT("unless the pattern has a rigid type context")) + +scopedNonVar n ty + = vcat [sep [ptext SLIT("The scoped type variable") <+> quotes (ppr n), + nest 2 (ptext SLIT("is bound to the type") <+> quotes (ppr ty))], + nest 2 (ptext SLIT("You can only bind scoped type variables to type variables"))] + +dupInScope n n' ty + = hang (ptext SLIT("The scoped type variables") <+> quotes (ppr n) <+> ptext SLIT("and") <+> quotes (ppr n')) + 2 (vcat [ptext SLIT("are bound to the same type (variable)"), + ptext SLIT("Distinct scoped type variables must be distinct")]) +\end{code} + diff --git a/compiler/typecheck/TcInstDcls.lhs b/compiler/typecheck/TcInstDcls.lhs new file mode 100644 index 0000000000..45338d0a1e --- /dev/null +++ b/compiler/typecheck/TcInstDcls.lhs @@ -0,0 +1,610 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[TcInstDecls]{Typechecking instance declarations} + +\begin{code} +module TcInstDcls ( tcInstDecls1, tcInstDecls2 ) where + +#include "HsVersions.h" + +import HsSyn +import TcBinds ( mkPragFun, tcPrags, badBootDeclErr ) +import TcClassDcl ( tcMethodBind, mkMethodBind, badMethodErr, + tcClassDecl2, getGenericInstances ) +import TcRnMonad +import TcMType ( tcSkolSigType, checkValidInstance, checkValidInstHead ) +import TcType ( mkClassPred, tcSplitSigmaTy, tcSplitDFunHead, mkTyVarTys, + SkolemInfo(InstSkol), tcSplitDFunTy ) +import Inst ( tcInstClassOp, newDicts, instToId, showLIE, + getOverlapFlag, tcExtendLocalInstEnv ) +import InstEnv ( mkLocalInstance, instanceDFunId ) +import TcDeriv ( tcDeriving ) +import TcEnv ( InstInfo(..), InstBindings(..), + newDFunName, tcExtendIdEnv + ) +import TcHsType ( kcHsSigType, tcHsKindedType ) +import TcUnify ( checkSigTyVars ) +import TcSimplify ( tcSimplifyCheck, tcSimplifySuperClasses ) +import Type ( zipOpenTvSubst, substTheta, 
substTys ) +import DataCon ( classDataCon ) +import Class ( classBigSig ) +import Var ( Id, idName, idType ) +import MkId ( mkDictFunId ) +import Name ( Name, getSrcLoc ) +import Maybe ( catMaybes ) +import SrcLoc ( srcLocSpan, unLoc, noLoc, Located(..), srcSpanStart ) +import ListSetOps ( minusList ) +import Outputable +import Bag +import BasicTypes ( Activation( AlwaysActive ), InlineSpec(..) ) +import FastString +\end{code} + +Typechecking instance declarations is done in two passes. The first +pass, made by @tcInstDecls1@, collects information to be used in the +second pass. + +This pre-processed info includes the as-yet-unprocessed bindings +inside the instance declaration. These are type-checked in the second +pass, when the class-instance envs and GVE contain all the info from +all the instance and value decls. Indeed that's the reason we need +two passes over the instance decls. + + +Here is the overall algorithm. +Assume that we have an instance declaration + + instance c => k (t tvs) where b + +\begin{enumerate} +\item +$LIE_c$ is the LIE for the context of class $c$ +\item +$betas_bar$ is the free variables in the class method type, excluding the + class variable +\item +$LIE_cop$ is the LIE constraining a particular class method +\item +$tau_cop$ is the tau type of a class method +\item +$LIE_i$ is the LIE for the context of instance $i$ +\item +$X$ is the instance constructor tycon +\item +$gammas_bar$ is the set of type variables of the instance +\item +$LIE_iop$ is the LIE for a particular class method instance +\item +$tau_iop$ is the tau type for this instance of a class method +\item +$alpha$ is the class variable +\item +$LIE_cop' = LIE_cop [X gammas_bar / alpha, fresh betas_bar]$ +\item +$tau_cop' = tau_cop [X gammas_bar / alpha, fresh betas_bar]$ +\end{enumerate} + +ToDo: Update the list above with names actually in the code. + +\begin{enumerate} +\item +First, make the LIEs for the class and instance contexts, which means +instantiate $thetaC [X inst_tyvars / alpha ]$, yielding LIElistC' and LIEC', +and make LIElistI and LIEI. +\item +Then process each method in turn. +\item +order the instance methods according to the ordering of the class methods +\item +express LIEC' in terms of LIEI, yielding $dbinds_super$ or an error +\item +Create final dictionary function from bindings generated already +\begin{pseudocode} +df = lambda inst_tyvars + lambda LIEI + let Bop1 + Bop2 + ... + Bopn + and dbinds_super + in <op1,op2,...,opn,sd1,...,sdm> +\end{pseudocode} +Here, Bop1 \ldots Bopn bind the methods op1 \ldots opn, +and $dbinds_super$ bind the superclass dictionaries sd1 \ldots sdm. 
+\end{enumerate} + + +%************************************************************************ +%* * +\subsection{Extracting instance decls} +%* * +%************************************************************************ + +Gather up the instance declarations from their various sources + +\begin{code} +tcInstDecls1 -- Deal with both source-code and imported instance decls + :: [LTyClDecl Name] -- For deriving stuff + -> [LInstDecl Name] -- Source code instance decls + -> TcM (TcGblEnv, -- The full inst env + [InstInfo], -- Source-code instance decls to process; + -- contains all dfuns for this module + HsValBinds Name) -- Supporting bindings for derived instances + +tcInstDecls1 tycl_decls inst_decls + = checkNoErrs $ + -- Stop if addInstInfos etc discovers any errors + -- (they recover, so that we get more than one error each round) + + -- (1) Do the ordinary instance declarations + mappM tcLocalInstDecl1 inst_decls `thenM` \ local_inst_infos -> + + let + local_inst_info = catMaybes local_inst_infos + clas_decls = filter (isClassDecl.unLoc) tycl_decls + in + -- (2) Instances from generic class declarations + getGenericInstances clas_decls `thenM` \ generic_inst_info -> + + -- Next, construct the instance environment so far, consisting of + -- a) local instance decls + -- b) generic instances + addInsts local_inst_info $ + addInsts generic_inst_info $ + + -- (3) Compute instances from "deriving" clauses; + -- This stuff computes a context for the derived instance decl, so it + -- needs to know about all the instances possible; hence inst_env4 + tcDeriving tycl_decls `thenM` \ (deriv_inst_info, deriv_binds) -> + addInsts deriv_inst_info $ + + getGblEnv `thenM` \ gbl_env -> + returnM (gbl_env, + generic_inst_info ++ deriv_inst_info ++ local_inst_info, + deriv_binds) + +addInsts :: [InstInfo] -> TcM a -> TcM a +addInsts infos thing_inside + = tcExtendLocalInstEnv (map iSpec infos) thing_inside +\end{code} + +\begin{code} +tcLocalInstDecl1 :: LInstDecl Name + -> TcM (Maybe InstInfo) -- Nothing if there was an error + -- A source-file instance declaration + -- Type-check all the stuff before the "where" + -- + -- We check for respectable instance type, and context +tcLocalInstDecl1 decl@(L loc (InstDecl poly_ty binds uprags)) + = -- Prime error recovery, set source location + recoverM (returnM Nothing) $ + setSrcSpan loc $ + addErrCtxt (instDeclCtxt1 poly_ty) $ + + do { is_boot <- tcIsHsBoot + ; checkTc (not is_boot || (isEmptyLHsBinds binds && null uprags)) + badBootDeclErr + + -- Typecheck the instance type itself. We can't use + -- tcHsSigType, because it's not a valid user type. 
+ ; kinded_ty <- kcHsSigType poly_ty + ; poly_ty' <- tcHsKindedType kinded_ty + ; let (tyvars, theta, tau) = tcSplitSigmaTy poly_ty' + + ; (clas, inst_tys) <- checkValidInstHead tau + ; checkValidInstance tyvars theta clas inst_tys + + ; dfun_name <- newDFunName clas inst_tys (srcSpanStart loc) + ; overlap_flag <- getOverlapFlag + ; let dfun = mkDictFunId dfun_name tyvars theta clas inst_tys + ispec = mkLocalInstance dfun overlap_flag + + ; return (Just (InstInfo { iSpec = ispec, iBinds = VanillaInst binds uprags })) } +\end{code} + + +%************************************************************************ +%* * +\subsection{Type-checking instance declarations, pass 2} +%* * +%************************************************************************ + +\begin{code} +tcInstDecls2 :: [LTyClDecl Name] -> [InstInfo] + -> TcM (LHsBinds Id, TcLclEnv) +-- (a) From each class declaration, +-- generate any default-method bindings +-- (b) From each instance decl +-- generate the dfun binding + +tcInstDecls2 tycl_decls inst_decls + = do { -- (a) Default methods from class decls + (dm_binds_s, dm_ids_s) <- mapAndUnzipM tcClassDecl2 $ + filter (isClassDecl.unLoc) tycl_decls + ; tcExtendIdEnv (concat dm_ids_s) $ do + + -- (b) instance declarations + ; inst_binds_s <- mappM tcInstDecl2 inst_decls + + -- Done + ; let binds = unionManyBags dm_binds_s `unionBags` + unionManyBags inst_binds_s + ; tcl_env <- getLclEnv -- Default method Ids in here + ; returnM (binds, tcl_env) } +\end{code} + +======= New documentation starts here (Sept 92) ============== + +The main purpose of @tcInstDecl2@ is to return a @HsBinds@ which defines +the dictionary function for this instance declaration. For example +\begin{verbatim} + instance Foo a => Foo [a] where + op1 x = ... + op2 y = ... +\end{verbatim} +might generate something like +\begin{verbatim} + dfun.Foo.List dFoo_a = let op1 x = ... + op2 y = ... + in + Dict [op1, op2] +\end{verbatim} + +HOWEVER, if the instance decl has no context, then it returns a +bigger @HsBinds@ with declarations for each method. For example +\begin{verbatim} + instance Foo [a] where + op1 x = ... + op2 y = ... +\end{verbatim} +might produce +\begin{verbatim} + dfun.Foo.List a = Dict [Foo.op1.List a, Foo.op2.List a] + const.Foo.op1.List a x = ... + const.Foo.op2.List a y = ... +\end{verbatim} +This group may be mutually recursive, because (for example) there may +be no method supplied for op2 in which case we'll get +\begin{verbatim} + const.Foo.op2.List a = default.Foo.op2 (dfun.Foo.List a) +\end{verbatim} +that is, the default method applied to the dictionary at this type. + +What we actually produce in either case is: + + AbsBinds [a] [dfun_theta_dicts] + [(dfun.Foo.List, d)] ++ (maybe) [(const.Foo.op1.List, op1), ...] + { d = (sd1,sd2, ..., op1, op2, ...) + op1 = ... + op2 = ... + } + +The "maybe" says that we only ask AbsBinds to make global constant methods +if the dfun_theta is empty. + + +For an instance declaration, say, + + instance (C1 a, C2 b) => C (T a b) where + ... + +where the {\em immediate} superclasses of C are D1, D2, we build a dictionary +function whose type is + + (C1 a, C2 b, D1 (T a b), D2 (T a b)) => C (T a b) + +Notice that we pass it the superclass dictionaries at the instance type; this +is the ``Mark Jones optimisation''. The stuff before the "=>" here +is the @dfun_theta@ below. + +First comes the easy case of a non-local instance decl. 
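To keep the superclass point concrete while reading the code that follows, here
is a small worked sketch.  The class, instance and dfun names below are
hypothetical (they are not taken from this module); they simply instantiate the
(C1 a, C2 b, D1 (T a b), D2 (T a b)) => C (T a b) pattern described above:
\begin{verbatim}
  class Eq' a => Ord' a where
    cmp :: a -> a -> Ordering

  instance Ord' a => Ord' [a] where
    cmp = ...

  -- The immediate superclass of Ord' is Eq', so the dictionary function
  -- takes the superclass dictionary at the *instance* type [a]:
  --
  --	dfun.Ord'.List :: (Ord' a, Eq' [a]) => Ord' [a]
  --
  -- i.e. dfun_theta = (Ord' a, Eq' [a]), rather than extracting Eq' [a]
  -- from inside the Ord' a argument.
\end{verbatim}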
+ + +\begin{code} +tcInstDecl2 :: InstInfo -> TcM (LHsBinds Id) + +tcInstDecl2 (InstInfo { iSpec = ispec, iBinds = binds }) + = let + dfun_id = instanceDFunId ispec + rigid_info = InstSkol dfun_id + inst_ty = idType dfun_id + in + -- Prime error recovery + recoverM (returnM emptyLHsBinds) $ + setSrcSpan (srcLocSpan (getSrcLoc dfun_id)) $ + addErrCtxt (instDeclCtxt2 (idType dfun_id)) $ + + -- Instantiate the instance decl with skolem constants + tcSkolSigType rigid_info inst_ty `thenM` \ (inst_tyvars', dfun_theta', inst_head') -> + -- These inst_tyvars' scope over the 'where' part + -- Those tyvars are inside the dfun_id's type, which is a bit + -- bizarre, but OK so long as you realise it! + let + (clas, inst_tys') = tcSplitDFunHead inst_head' + (class_tyvars, sc_theta, _, op_items) = classBigSig clas + + -- Instantiate the super-class context with inst_tys + sc_theta' = substTheta (zipOpenTvSubst class_tyvars inst_tys') sc_theta + origin = SigOrigin rigid_info + in + -- Create dictionary Ids from the specified instance contexts. + newDicts InstScOrigin sc_theta' `thenM` \ sc_dicts -> + newDicts origin dfun_theta' `thenM` \ dfun_arg_dicts -> + newDicts origin [mkClassPred clas inst_tys'] `thenM` \ [this_dict] -> + -- Default-method Ids may be mentioned in synthesised RHSs, + -- but they'll already be in the environment. + + -- Typecheck the methods + let -- These insts are in scope; quite a few, eh? + avail_insts = [this_dict] ++ dfun_arg_dicts ++ sc_dicts + in + tcMethods origin clas inst_tyvars' + dfun_theta' inst_tys' avail_insts + op_items binds `thenM` \ (meth_ids, meth_binds) -> + + -- Figure out bindings for the superclass context + -- Don't include this_dict in the 'givens', else + -- sc_dicts get bound by just selecting from this_dict!! + addErrCtxt superClassCtxt + (tcSimplifySuperClasses inst_tyvars' + dfun_arg_dicts + sc_dicts) `thenM` \ sc_binds -> + + -- It's possible that the superclass stuff might unified one + -- of the inst_tyavars' with something in the envt + checkSigTyVars inst_tyvars' `thenM_` + + -- Deal with 'SPECIALISE instance' pragmas + let + specs = case binds of + VanillaInst _ prags -> filter isSpecInstLSig prags + other -> [] + in + tcPrags dfun_id specs `thenM` \ prags -> + + -- Create the result bindings + let + dict_constr = classDataCon clas + scs_and_meths = map instToId sc_dicts ++ meth_ids + this_dict_id = instToId this_dict + inline_prag | null dfun_arg_dicts = [] + | otherwise = [InlinePrag (Inline AlwaysActive True)] + -- Always inline the dfun; this is an experimental decision + -- because it makes a big performance difference sometimes. + -- Often it means we can do the method selection, and then + -- inline the method as well. Marcin's idea; see comments below. + -- + -- BUT: don't inline it if it's a constant dictionary; + -- we'll get all the benefit without inlining, and we get + -- a **lot** of code duplication if we inline it + -- + -- See Note [Inline dfuns] below + + dict_rhs + = mkHsConApp dict_constr inst_tys' (map HsVar scs_and_meths) + -- We don't produce a binding for the dict_constr; instead we + -- rely on the simplifier to unfold this saturated application + -- We do this rather than generate an HsCon directly, because + -- it means that the special cases (e.g. dictionary with only one + -- member) are dealt with by the common MkId.mkDataConWrapId code rather + -- than needing to be repeated here. 
+ + dict_bind = noLoc (VarBind this_dict_id dict_rhs) + all_binds = dict_bind `consBag` (sc_binds `unionBags` meth_binds) + + main_bind = noLoc $ AbsBinds + inst_tyvars' + (map instToId dfun_arg_dicts) + [(inst_tyvars', dfun_id, this_dict_id, + inline_prag ++ prags)] + all_binds + in + showLIE (text "instance") `thenM_` + returnM (unitBag main_bind) + + +tcMethods origin clas inst_tyvars' dfun_theta' inst_tys' + avail_insts op_items (VanillaInst monobinds uprags) + = -- Check that all the method bindings come from this class + let + sel_names = [idName sel_id | (sel_id, _) <- op_items] + bad_bndrs = collectHsBindBinders monobinds `minusList` sel_names + in + mappM (addErrTc . badMethodErr clas) bad_bndrs `thenM_` + + -- Make the method bindings + let + mk_method_bind = mkMethodBind origin clas inst_tys' monobinds + in + mapAndUnzipM mk_method_bind op_items `thenM` \ (meth_insts, meth_infos) -> + + -- And type check them + -- It's really worth making meth_insts available to the tcMethodBind + -- Consider instance Monad (ST s) where + -- {-# INLINE (>>) #-} + -- (>>) = ...(>>=)... + -- If we don't include meth_insts, we end up with bindings like this: + -- rec { dict = MkD then bind ... + -- then = inline_me (... (GHC.Base.>>= dict) ...) + -- bind = ... } + -- The trouble is that (a) 'then' and 'dict' are mutually recursive, + -- and (b) the inline_me prevents us inlining the >>= selector, which + -- would unravel the loop. Result: (>>) ends up as a loop breaker, and + -- is not inlined across modules. Rather ironic since this does not + -- happen without the INLINE pragma! + -- + -- Solution: make meth_insts available, so that 'then' refers directly + -- to the local 'bind' rather than going via the dictionary. + -- + -- BUT WATCH OUT! If the method type mentions the class variable, then + -- this optimisation is not right. Consider + -- class C a where + -- op :: Eq a => a + -- + -- instance C Int where + -- op = op + -- The occurrence of 'op' on the rhs gives rise to a constraint + -- op at Int + -- The trouble is that the 'meth_inst' for op, which is 'available', also + -- looks like 'op at Int'. But they are not the same. 
+ let + prag_fn = mkPragFun uprags + all_insts = avail_insts ++ catMaybes meth_insts + tc_method_bind = tcMethodBind inst_tyvars' dfun_theta' all_insts prag_fn + meth_ids = [meth_id | (_,meth_id,_) <- meth_infos] + in + + mapM tc_method_bind meth_infos `thenM` \ meth_binds_s -> + + returnM (meth_ids, unionManyBags meth_binds_s) + + +-- Derived newtype instances +tcMethods origin clas inst_tyvars' dfun_theta' inst_tys' + avail_insts op_items (NewTypeDerived rep_tys) + = getInstLoc origin `thenM` \ inst_loc -> + mapAndUnzip3M (do_one inst_loc) op_items `thenM` \ (meth_ids, meth_binds, rhs_insts) -> + + tcSimplifyCheck + (ptext SLIT("newtype derived instance")) + inst_tyvars' avail_insts rhs_insts `thenM` \ lie_binds -> + + -- I don't think we have to do the checkSigTyVars thing + + returnM (meth_ids, lie_binds `unionBags` listToBag meth_binds) + + where + do_one inst_loc (sel_id, _) + = -- The binding is like "op @ NewTy = op @ RepTy" + -- Make the *binder*, like in mkMethodBind + tcInstClassOp inst_loc sel_id inst_tys' `thenM` \ meth_inst -> + + -- Make the *occurrence on the rhs* + tcInstClassOp inst_loc sel_id rep_tys' `thenM` \ rhs_inst -> + let + meth_id = instToId meth_inst + in + return (meth_id, noLoc (VarBind meth_id (nlHsVar (instToId rhs_inst))), rhs_inst) + + -- Instantiate rep_tys with the relevant type variables + -- This looks a bit odd, because inst_tyvars' are the skolemised version + -- of the type variables in the instance declaration; but rep_tys doesn't + -- have the skolemised version, so we substitute them in here + rep_tys' = substTys subst rep_tys + subst = zipOpenTvSubst inst_tyvars' (mkTyVarTys inst_tyvars') +\end{code} + + + ------------------------------ + [Inline dfuns] Inlining dfuns unconditionally + ------------------------------ + +The code above unconditionally inlines dict funs. Here's why. +Consider this program: + + test :: Int -> Int -> Bool + test x y = (x,y) == (y,x) || test y x + -- Recursive to avoid making it inline. + +This needs the (Eq (Int,Int)) instance. If we inline that dfun +the code we end up with is good: + + Test.$wtest = + \r -> case ==# [ww ww1] of wild { + PrelBase.False -> Test.$wtest ww1 ww; + PrelBase.True -> + case ==# [ww1 ww] of wild1 { + PrelBase.False -> Test.$wtest ww1 ww; + PrelBase.True -> PrelBase.True []; + }; + }; + Test.test = \r [w w1] + case w of w2 { + PrelBase.I# ww -> + case w1 of w3 { PrelBase.I# ww1 -> Test.$wtest ww ww1; }; + }; + +If we don't inline the dfun, the code is not nearly as good: + + (==) = case PrelTup.$fEq(,) PrelBase.$fEqInt PrelBase.$fEqInt of tpl { + PrelBase.:DEq tpl1 tpl2 -> tpl2; + }; + + Test.$wtest = + \r [ww ww1] + let { y = PrelBase.I#! [ww1]; } in + let { x = PrelBase.I#! [ww]; } in + let { sat_slx = PrelTup.(,)! [y x]; } in + let { sat_sly = PrelTup.(,)! [x y]; + } in + case == sat_sly sat_slx of wild { + PrelBase.False -> Test.$wtest ww1 ww; + PrelBase.True -> PrelBase.True []; + }; + + Test.test = + \r [w w1] + case w of w2 { + PrelBase.I# ww -> + case w1 of w3 { PrelBase.I# ww1 -> Test.$wtest ww ww1; }; + }; + +Why doesn't GHC inline $fEq? 
Because it looks big: + + PrelTup.zdfEqZ1T{-rcX-} + = \ @ a{-reT-} :: * @ b{-reS-} :: * + zddEq{-rf6-} _Ks :: {PrelBase.Eq{-23-} a{-reT-}} + zddEq1{-rf7-} _Ks :: {PrelBase.Eq{-23-} b{-reS-}} -> + let { + zeze{-rf0-} _Kl :: (b{-reS-} -> b{-reS-} -> PrelBase.Bool{-3c-}) + zeze{-rf0-} = PrelBase.zeze{-01L-}@ b{-reS-} zddEq1{-rf7-} } in + let { + zeze1{-rf3-} _Kl :: (a{-reT-} -> a{-reT-} -> PrelBase.Bool{-3c-}) + zeze1{-rf3-} = PrelBase.zeze{-01L-} @ a{-reT-} zddEq{-rf6-} } in + let { + zeze2{-reN-} :: ((a{-reT-}, b{-reS-}) -> (a{-reT-}, b{-reS-})-> PrelBase.Bool{-3c-}) + zeze2{-reN-} = \ ds{-rf5-} _Ks :: (a{-reT-}, b{-reS-}) + ds1{-rf4-} _Ks :: (a{-reT-}, b{-reS-}) -> + case ds{-rf5-} + of wild{-reW-} _Kd { (a1{-rf2-} _Ks, a2{-reZ-} _Ks) -> + case ds1{-rf4-} + of wild1{-reX-} _Kd { (b1{-rf1-} _Ks, b2{-reY-} _Ks) -> + PrelBase.zaza{-r4e-} + (zeze1{-rf3-} a1{-rf2-} b1{-rf1-}) + (zeze{-rf0-} a2{-reZ-} b2{-reY-}) + } + } } in + let { + a1{-reR-} :: ((a{-reT-}, b{-reS-})-> (a{-reT-}, b{-reS-})-> PrelBase.Bool{-3c-}) + a1{-reR-} = \ a2{-reV-} _Ks :: (a{-reT-}, b{-reS-}) + b1{-reU-} _Ks :: (a{-reT-}, b{-reS-}) -> + PrelBase.not{-r6I-} (zeze2{-reN-} a2{-reV-} b1{-reU-}) + } in + PrelBase.zdwZCDEq{-r8J-} @ (a{-reT-}, b{-reS-}) a1{-reR-} zeze2{-reN-}) + +and it's not as bad as it seems, because it's further dramatically +simplified: only zeze2 is extracted and its body is simplified. + + +%************************************************************************ +%* * +\subsection{Error messages} +%* * +%************************************************************************ + +\begin{code} +instDeclCtxt1 hs_inst_ty + = inst_decl_ctxt (case unLoc hs_inst_ty of + HsForAllTy _ _ _ (L _ (HsPredTy pred)) -> ppr pred + HsPredTy pred -> ppr pred + other -> ppr hs_inst_ty) -- Don't expect this +instDeclCtxt2 dfun_ty + = inst_decl_ctxt (ppr (mkClassPred cls tys)) + where + (_,_,cls,tys) = tcSplitDFunTy dfun_ty + +inst_decl_ctxt doc = ptext SLIT("In the instance declaration for") <+> quotes doc + +superClassCtxt = ptext SLIT("When checking the super-classes of an instance declaration") +\end{code} diff --git a/compiler/typecheck/TcMType.lhs b/compiler/typecheck/TcMType.lhs new file mode 100644 index 0000000000..fa129d3927 --- /dev/null +++ b/compiler/typecheck/TcMType.lhs @@ -0,0 +1,1206 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section{Monadic type operations} + +This module contains monadic operations over types that contain mutable type variables + +\begin{code} +module TcMType ( + TcTyVar, TcKind, TcType, TcTauType, TcThetaType, TcTyVarSet, + + -------------------------------- + -- Creating new mutable type variables + newFlexiTyVar, + newFlexiTyVarTy, -- Kind -> TcM TcType + newFlexiTyVarTys, -- Int -> Kind -> TcM [TcType] + newKindVar, newKindVars, + lookupTcTyVar, LookupTyVarResult(..), + newMetaTyVar, readMetaTyVar, writeMetaTyVar, + + -------------------------------- + -- Boxy type variables + newBoxyTyVar, newBoxyTyVars, newBoxyTyVarTys, readFilledBox, + + -------------------------------- + -- Instantiation + tcInstTyVar, tcInstType, tcInstTyVars, tcInstBoxy, tcInstBoxyTyVar, + tcInstSigTyVars, zonkSigTyVar, + tcInstSkolTyVar, tcInstSkolTyVars, tcInstSkolType, + tcSkolSigType, tcSkolSigTyVars, + + -------------------------------- + -- Checking type validity + Rank, UserTypeCtxt(..), checkValidType, + SourceTyCtxt(..), checkValidTheta, checkFreeness, + checkValidInstHead, checkValidInstance, checkAmbiguity, + checkInstTermination, + arityErr, + + 
-------------------------------- + -- Zonking + zonkType, zonkTcPredType, + zonkTcTyVar, zonkTcTyVars, zonkTcTyVarsAndFV, zonkQuantifiedTyVar, + zonkTcType, zonkTcTypes, zonkTcClassConstraints, zonkTcThetaType, + zonkTcKindToKind, zonkTcKind, + + readKindVar, writeKindVar + + ) where + +#include "HsVersions.h" + + +-- friends: +import TypeRep ( Type(..), PredType(..), -- Friend; can see representation + ThetaType + ) +import TcType ( TcType, TcThetaType, TcTauType, TcPredType, + TcTyVarSet, TcKind, TcTyVar, TcTyVarDetails(..), + MetaDetails(..), SkolemInfo(..), BoxInfo(..), + BoxyTyVar, BoxyType, BoxyThetaType, BoxySigmaType, + UserTypeCtxt(..), + isMetaTyVar, isSigTyVar, metaTvRef, + tcCmpPred, isClassPred, tcGetTyVar, + tcSplitPhiTy, tcSplitPredTy_maybe, tcSplitAppTy_maybe, + tcValidInstHeadTy, tcSplitForAllTys, + tcIsTyVarTy, tcSplitSigmaTy, + isUnLiftedType, isIPPred, + typeKind, isSkolemTyVar, + mkAppTy, mkTyVarTy, mkTyVarTys, + tyVarsOfPred, getClassPredTys_maybe, + tyVarsOfType, tyVarsOfTypes, tcView, + pprPred, pprTheta, pprClassPred ) +import Kind ( Kind(..), KindVar, kindVarRef, mkKindVar, + isLiftedTypeKind, isArgTypeKind, isOpenTypeKind, + liftedTypeKind, defaultKind + ) +import Type ( TvSubst, zipTopTvSubst, substTy ) +import Class ( Class, classArity, className ) +import TyCon ( TyCon, isSynTyCon, isUnboxedTupleTyCon, + tyConArity, tyConName ) +import Var ( TyVar, tyVarKind, tyVarName, isTcTyVar, + mkTyVar, mkTcTyVar, tcTyVarDetails ) + + -- Assertions +#ifdef DEBUG +import TcType ( isFlexi, isBoxyTyVar, isImmutableTyVar ) +import Kind ( isSubKind ) +#endif + +-- others: +import TcRnMonad -- TcType, amongst others +import FunDeps ( grow, checkInstCoverage ) +import Name ( Name, setNameUnique, mkSysTvName ) +import VarSet +import DynFlags ( dopt, DynFlag(..) ) +import Util ( nOfThem, isSingleton, notNull ) +import ListSetOps ( removeDups ) +import Outputable + +import Control.Monad ( when ) +import Data.List ( (\\) ) +\end{code} + + +%************************************************************************ +%* * + Instantiation in general +%* * +%************************************************************************ + +\begin{code} +tcInstType :: ([TyVar] -> TcM [TcTyVar]) -- How to instantiate the type variables + -> TcType -- Type to instantiate + -> TcM ([TcTyVar], TcThetaType, TcType) -- Result +tcInstType inst_tyvars ty + = case tcSplitForAllTys ty of + ([], rho) -> let -- There may be overloading despite no type variables; + -- (?x :: Int) => Int -> Int + (theta, tau) = tcSplitPhiTy rho + in + return ([], theta, tau) + + (tyvars, rho) -> do { tyvars' <- inst_tyvars tyvars + + ; let tenv = zipTopTvSubst tyvars (mkTyVarTys tyvars') + -- Either the tyvars are freshly made, by inst_tyvars, + -- or (in the call from tcSkolSigType) any nested foralls + -- have different binders. 
Either way, zipTopTvSubst is ok + + ; let (theta, tau) = tcSplitPhiTy (substTy tenv rho) + ; return (tyvars', theta, tau) } +\end{code} + + +%************************************************************************ +%* * + Kind variables +%* * +%************************************************************************ + +\begin{code} +newKindVar :: TcM TcKind +newKindVar = do { uniq <- newUnique + ; ref <- newMutVar Nothing + ; return (KindVar (mkKindVar uniq ref)) } + +newKindVars :: Int -> TcM [TcKind] +newKindVars n = mappM (\ _ -> newKindVar) (nOfThem n ()) +\end{code} + + +%************************************************************************ +%* * + SkolemTvs (immutable) +%* * +%************************************************************************ + +\begin{code} +mkSkolTyVar :: Name -> Kind -> SkolemInfo -> TcTyVar +mkSkolTyVar name kind info = mkTcTyVar name kind (SkolemTv info) + +tcSkolSigType :: SkolemInfo -> Type -> TcM ([TcTyVar], TcThetaType, TcType) +-- Instantiate a type signature with skolem constants, but +-- do *not* give them fresh names, because we want the name to +-- be in the type environment -- it is lexically scoped. +tcSkolSigType info ty = tcInstType (\tvs -> return (tcSkolSigTyVars info tvs)) ty + +tcSkolSigTyVars :: SkolemInfo -> [TyVar] -> [TcTyVar] +-- Make skolem constants, but do *not* give them new names, as above +tcSkolSigTyVars info tyvars = [ mkSkolTyVar (tyVarName tv) (tyVarKind tv) info + | tv <- tyvars ] + +tcInstSkolType :: SkolemInfo -> TcType -> TcM ([TcTyVar], TcThetaType, TcType) +-- Instantiate a type with fresh skolem constants +tcInstSkolType info ty = tcInstType (tcInstSkolTyVars info) ty + +tcInstSkolTyVar :: SkolemInfo -> TyVar -> TcM TcTyVar +tcInstSkolTyVar info tyvar + = do { uniq <- newUnique + ; let name = setNameUnique (tyVarName tyvar) uniq + kind = tyVarKind tyvar + ; return (mkSkolTyVar name kind info) } + +tcInstSkolTyVars :: SkolemInfo -> [TyVar] -> TcM [TcTyVar] +tcInstSkolTyVars info tyvars = mapM (tcInstSkolTyVar info) tyvars +\end{code} + + +%************************************************************************ +%* * + MetaTvs (meta type variables; mutable) +%* * +%************************************************************************ + +\begin{code} +newMetaTyVar :: BoxInfo -> Kind -> TcM TcTyVar +-- Make a new meta tyvar out of thin air +newMetaTyVar box_info kind + = do { uniq <- newUnique + ; ref <- newMutVar Flexi ; + ; let name = mkSysTvName uniq fs + fs = case box_info of + BoxTv -> FSLIT("bx") + TauTv -> FSLIT("t") + SigTv _ -> FSLIT("a") + ; return (mkTcTyVar name kind (MetaTv box_info ref)) } + +instMetaTyVar :: BoxInfo -> TyVar -> TcM TcTyVar +-- Make a new meta tyvar whose Name and Kind +-- come from an existing TyVar +instMetaTyVar box_info tyvar + = do { uniq <- newUnique + ; ref <- newMutVar Flexi ; + ; let name = setNameUnique (tyVarName tyvar) uniq + kind = tyVarKind tyvar + ; return (mkTcTyVar name kind (MetaTv box_info ref)) } + +readMetaTyVar :: TyVar -> TcM MetaDetails +readMetaTyVar tyvar = ASSERT2( isMetaTyVar tyvar, ppr tyvar ) + readMutVar (metaTvRef tyvar) + +writeMetaTyVar :: TcTyVar -> TcType -> TcM () +#ifndef DEBUG +writeMetaTyVar tyvar ty = writeMutVar (metaTvRef tyvar) (Indirect ty) +#else +writeMetaTyVar tyvar ty + | not (isMetaTyVar tyvar) + = pprTrace "writeMetaTyVar" (ppr tyvar) $ + returnM () + + | otherwise + = ASSERT( isMetaTyVar tyvar ) + ASSERT2( k2 `isSubKind` k1, (ppr tyvar <+> ppr k1) $$ (ppr ty <+> ppr k2) ) + do { ASSERTM2( do { details <- readMetaTyVar tyvar; return 
(isFlexi details) }, ppr tyvar ) + ; writeMutVar (metaTvRef tyvar) (Indirect ty) } + where + k1 = tyVarKind tyvar + k2 = typeKind ty +#endif +\end{code} + + +%************************************************************************ +%* * + MetaTvs: TauTvs +%* * +%************************************************************************ + +\begin{code} +newFlexiTyVar :: Kind -> TcM TcTyVar +newFlexiTyVar kind = newMetaTyVar TauTv kind + +newFlexiTyVarTy :: Kind -> TcM TcType +newFlexiTyVarTy kind + = newFlexiTyVar kind `thenM` \ tc_tyvar -> + returnM (TyVarTy tc_tyvar) + +newFlexiTyVarTys :: Int -> Kind -> TcM [TcType] +newFlexiTyVarTys n kind = mappM newFlexiTyVarTy (nOfThem n kind) + +tcInstTyVar :: TyVar -> TcM TcTyVar +-- Instantiate with a META type variable +tcInstTyVar tyvar = instMetaTyVar TauTv tyvar + +tcInstTyVars :: [TyVar] -> TcM ([TcTyVar], [TcType], TvSubst) +-- Instantiate with META type variables +tcInstTyVars tyvars + = do { tc_tvs <- mapM tcInstTyVar tyvars + ; let tys = mkTyVarTys tc_tvs + ; returnM (tc_tvs, tys, zipTopTvSubst tyvars tys) } + -- Since the tyvars are freshly made, + -- they cannot possibly be captured by + -- any existing for-alls. Hence zipTopTvSubst +\end{code} + + +%************************************************************************ +%* * + MetaTvs: SigTvs +%* * +%************************************************************************ + +\begin{code} +tcInstSigTyVars :: SkolemInfo -> [TyVar] -> TcM [TcTyVar] +-- Instantiate with meta SigTvs +tcInstSigTyVars skol_info tyvars + = mapM (instMetaTyVar (SigTv skol_info)) tyvars + +zonkSigTyVar :: TcTyVar -> TcM TcTyVar +zonkSigTyVar sig_tv + | isSkolemTyVar sig_tv + = return sig_tv -- Happens in the call in TcBinds.checkDistinctTyVars + | otherwise + = ASSERT( isSigTyVar sig_tv ) + do { ty <- zonkTcTyVar sig_tv + ; return (tcGetTyVar "zonkSigTyVar" ty) } + -- 'ty' is bound to be a type variable, because SigTvs + -- can only be unified with type variables +\end{code} + + +%************************************************************************ +%* * + MetaTvs: BoxTvs +%* * +%************************************************************************ + +\begin{code} +newBoxyTyVar :: Kind -> TcM BoxyTyVar +newBoxyTyVar kind = newMetaTyVar BoxTv kind + +newBoxyTyVars :: [Kind] -> TcM [BoxyTyVar] +newBoxyTyVars kinds = mapM newBoxyTyVar kinds + +newBoxyTyVarTys :: [Kind] -> TcM [BoxyType] +newBoxyTyVarTys kinds = do { tvs <- mapM newBoxyTyVar kinds; return (mkTyVarTys tvs) } + +readFilledBox :: BoxyTyVar -> TcM TcType +-- Read the contents of the box, which should be filled in by now +readFilledBox box_tv = ASSERT( isBoxyTyVar box_tv ) + do { cts <- readMetaTyVar box_tv + ; case cts of + Flexi -> pprPanic "readFilledBox" (ppr box_tv) + Indirect ty -> return ty } + +tcInstBoxyTyVar :: TyVar -> TcM BoxyTyVar +-- Instantiate with a BOXY type variable +tcInstBoxyTyVar tyvar = instMetaTyVar BoxTv tyvar + +tcInstBoxy :: TcType -> TcM ([BoxyTyVar], BoxyThetaType, BoxySigmaType) +-- tcInstType instantiates the outer-level for-alls of a TcType with +-- fresh BOXY type variables, splits off the dictionary part, +-- and returns the pieces. 
+tcInstBoxy ty = tcInstType (mapM tcInstBoxyTyVar) ty +\end{code} + + +%************************************************************************ +%* * +\subsection{Putting and getting mutable type variables} +%* * +%************************************************************************ + +But it's more fun to short out indirections on the way: If this +version returns a TyVar, then that TyVar is unbound. If it returns +any other type, then there might be bound TyVars embedded inside it. + +We return Nothing iff the original box was unbound. + +\begin{code} +data LookupTyVarResult -- The result of a lookupTcTyVar call + = DoneTv TcTyVarDetails -- SkolemTv or virgin MetaTv + | IndirectTv TcType + +lookupTcTyVar :: TcTyVar -> TcM LookupTyVarResult +lookupTcTyVar tyvar + = case details of + SkolemTv _ -> return (DoneTv details) + MetaTv _ ref -> do { meta_details <- readMutVar ref + ; case meta_details of + Indirect ty -> return (IndirectTv ty) + Flexi -> return (DoneTv details) } + where + details = tcTyVarDetails tyvar + +{- +-- gaw 2004 We aren't shorting anything out anymore, at least for now +getTcTyVar tyvar + | not (isTcTyVar tyvar) + = pprTrace "getTcTyVar" (ppr tyvar) $ + returnM (Just (mkTyVarTy tyvar)) + + | otherwise + = ASSERT2( isTcTyVar tyvar, ppr tyvar ) + readMetaTyVar tyvar `thenM` \ maybe_ty -> + case maybe_ty of + Just ty -> short_out ty `thenM` \ ty' -> + writeMetaTyVar tyvar (Just ty') `thenM_` + returnM (Just ty') + + Nothing -> returnM Nothing + +short_out :: TcType -> TcM TcType +short_out ty@(TyVarTy tyvar) + | not (isTcTyVar tyvar) + = returnM ty + + | otherwise + = readMetaTyVar tyvar `thenM` \ maybe_ty -> + case maybe_ty of + Just ty' -> short_out ty' `thenM` \ ty' -> + writeMetaTyVar tyvar (Just ty') `thenM_` + returnM ty' + + other -> returnM ty + +short_out other_ty = returnM other_ty +-} +\end{code} + + +%************************************************************************ +%* * +\subsection{Zonking -- the exernal interfaces} +%* * +%************************************************************************ + +----------------- Type variables + +\begin{code} +zonkTcTyVars :: [TcTyVar] -> TcM [TcType] +zonkTcTyVars tyvars = mappM zonkTcTyVar tyvars + +zonkTcTyVarsAndFV :: [TcTyVar] -> TcM TcTyVarSet +zonkTcTyVarsAndFV tyvars = mappM zonkTcTyVar tyvars `thenM` \ tys -> + returnM (tyVarsOfTypes tys) + +zonkTcTyVar :: TcTyVar -> TcM TcType +zonkTcTyVar tyvar = ASSERT( isTcTyVar tyvar ) + zonk_tc_tyvar (\ tv -> returnM (TyVarTy tv)) tyvar +\end{code} + +----------------- Types + +\begin{code} +zonkTcType :: TcType -> TcM TcType +zonkTcType ty = zonkType (\ tv -> returnM (TyVarTy tv)) ty + +zonkTcTypes :: [TcType] -> TcM [TcType] +zonkTcTypes tys = mappM zonkTcType tys + +zonkTcClassConstraints cts = mappM zonk cts + where zonk (clas, tys) + = zonkTcTypes tys `thenM` \ new_tys -> + returnM (clas, new_tys) + +zonkTcThetaType :: TcThetaType -> TcM TcThetaType +zonkTcThetaType theta = mappM zonkTcPredType theta + +zonkTcPredType :: TcPredType -> TcM TcPredType +zonkTcPredType (ClassP c ts) + = zonkTcTypes ts `thenM` \ new_ts -> + returnM (ClassP c new_ts) +zonkTcPredType (IParam n t) + = zonkTcType t `thenM` \ new_t -> + returnM (IParam n new_t) +\end{code} + +------------------- These ...ToType, ...ToKind versions + are used at the end of type checking + +\begin{code} +zonkQuantifiedTyVar :: TcTyVar -> TcM TyVar +-- zonkQuantifiedTyVar is applied to the a TcTyVar when quantifying over it. 
+-- It might be a meta TyVar, in which case we freeze it into an ordinary TyVar. +-- When we do this, we also default the kind -- see notes with Kind.defaultKind +-- The meta tyvar is updated to point to the new regular TyVar. Now any +-- bound occurences of the original type variable will get zonked to +-- the immutable version. +-- +-- We leave skolem TyVars alone; they are immutable. +zonkQuantifiedTyVar tv + | isSkolemTyVar tv = return tv + -- It might be a skolem type variable, + -- for example from a user type signature + + | otherwise -- It's a meta-type-variable + = do { details <- readMetaTyVar tv + + -- Create the new, frozen, regular type variable + ; let final_kind = defaultKind (tyVarKind tv) + final_tv = mkTyVar (tyVarName tv) final_kind + + -- Bind the meta tyvar to the new tyvar + ; case details of + Indirect ty -> WARN( True, ppr tv $$ ppr ty ) + return () + -- [Sept 04] I don't think this should happen + -- See note [Silly Type Synonym] + + Flexi -> writeMetaTyVar tv (mkTyVarTy final_tv) + + -- Return the new tyvar + ; return final_tv } +\end{code} + +[Silly Type Synonyms] + +Consider this: + type C u a = u -- Note 'a' unused + + foo :: (forall a. C u a -> C u a) -> u + foo x = ... + + bar :: Num u => u + bar = foo (\t -> t + t) + +* From the (\t -> t+t) we get type {Num d} => d -> d + where d is fresh. + +* Now unify with type of foo's arg, and we get: + {Num (C d a)} => C d a -> C d a + where a is fresh. + +* Now abstract over the 'a', but float out the Num (C d a) constraint + because it does not 'really' mention a. (see exactTyVarsOfType) + The arg to foo becomes + /\a -> \t -> t+t + +* So we get a dict binding for Num (C d a), which is zonked to give + a = () + [Note Sept 04: now that we are zonking quantified type variables + on construction, the 'a' will be frozen as a regular tyvar on + quantification, so the floated dict will still have type (C d a). + Which renders this whole note moot; happily!] + +* Then the /\a abstraction has a zonked 'a' in it. + +All very silly. I think its harmless to ignore the problem. We'll end up with +a /\a in the final result but all the occurrences of a will be zonked to () + + +%************************************************************************ +%* * +\subsection{Zonking -- the main work-horses: zonkType, zonkTyVar} +%* * +%* For internal use only! * +%* * +%************************************************************************ + +\begin{code} +-- For unbound, mutable tyvars, zonkType uses the function given to it +-- For tyvars bound at a for-all, zonkType zonks them to an immutable +-- type variable and zonks the kind too + +zonkType :: (TcTyVar -> TcM Type) -- What to do with unbound mutable type variables + -- see zonkTcType, and zonkTcTypeToType + -> TcType + -> TcM Type +zonkType unbound_var_fn ty + = go ty + where + go (NoteTy _ ty2) = go ty2 -- Discard free-tyvar annotations + + go (TyConApp tc tys) = mappM go tys `thenM` \ tys' -> + returnM (TyConApp tc tys') + + go (PredTy p) = go_pred p `thenM` \ p' -> + returnM (PredTy p') + + go (FunTy arg res) = go arg `thenM` \ arg' -> + go res `thenM` \ res' -> + returnM (FunTy arg' res') + + go (AppTy fun arg) = go fun `thenM` \ fun' -> + go arg `thenM` \ arg' -> + returnM (mkAppTy fun' arg') + -- NB the mkAppTy; we might have instantiated a + -- type variable to a type constructor, so we need + -- to pull the TyConApp to the top. + + -- The two interesting cases! 
+ go (TyVarTy tyvar) | isTcTyVar tyvar = zonk_tc_tyvar unbound_var_fn tyvar + | otherwise = return (TyVarTy tyvar) + -- Ordinary (non Tc) tyvars occur inside quantified types + + go (ForAllTy tyvar ty) = ASSERT( isImmutableTyVar tyvar ) + go ty `thenM` \ ty' -> + returnM (ForAllTy tyvar ty') + + go_pred (ClassP c tys) = mappM go tys `thenM` \ tys' -> + returnM (ClassP c tys') + go_pred (IParam n ty) = go ty `thenM` \ ty' -> + returnM (IParam n ty') + +zonk_tc_tyvar :: (TcTyVar -> TcM Type) -- What to do for an unbound mutable variable + -> TcTyVar -> TcM TcType +zonk_tc_tyvar unbound_var_fn tyvar + | not (isMetaTyVar tyvar) -- Skolems + = returnM (TyVarTy tyvar) + + | otherwise -- Mutables + = do { cts <- readMetaTyVar tyvar + ; case cts of + Flexi -> unbound_var_fn tyvar -- Unbound meta type variable + Indirect ty -> zonkType unbound_var_fn ty } +\end{code} + + + +%************************************************************************ +%* * + Zonking kinds +%* * +%************************************************************************ + +\begin{code} +readKindVar :: KindVar -> TcM (Maybe TcKind) +writeKindVar :: KindVar -> TcKind -> TcM () +readKindVar kv = readMutVar (kindVarRef kv) +writeKindVar kv val = writeMutVar (kindVarRef kv) (Just val) + +------------- +zonkTcKind :: TcKind -> TcM TcKind +zonkTcKind (FunKind k1 k2) = do { k1' <- zonkTcKind k1 + ; k2' <- zonkTcKind k2 + ; returnM (FunKind k1' k2') } +zonkTcKind k@(KindVar kv) = do { mb_kind <- readKindVar kv + ; case mb_kind of + Nothing -> returnM k + Just k -> zonkTcKind k } +zonkTcKind other_kind = returnM other_kind + +------------- +zonkTcKindToKind :: TcKind -> TcM Kind +zonkTcKindToKind (FunKind k1 k2) = do { k1' <- zonkTcKindToKind k1 + ; k2' <- zonkTcKindToKind k2 + ; returnM (FunKind k1' k2') } + +zonkTcKindToKind (KindVar kv) = do { mb_kind <- readKindVar kv + ; case mb_kind of + Nothing -> return liftedTypeKind + Just k -> zonkTcKindToKind k } + +zonkTcKindToKind OpenTypeKind = returnM liftedTypeKind -- An "Open" kind defaults to * +zonkTcKindToKind other_kind = returnM other_kind +\end{code} + +%************************************************************************ +%* * +\subsection{Checking a user type} +%* * +%************************************************************************ + +When dealing with a user-written type, we first translate it from an HsType +to a Type, performing kind checking, and then check various things that should +be true about it. We don't want to perform these checks at the same time +as the initial translation because (a) they are unnecessary for interface-file +types and (b) when checking a mutually recursive group of type and class decls, +we can't "look" at the tycons/classes yet. Also, the checks are are rather +diverse, and used to really mess up the other code. + +One thing we check for is 'rank'. + + Rank 0: monotypes (no foralls) + Rank 1: foralls at the front only, Rank 0 inside + Rank 2: foralls at the front, Rank 1 on left of fn arrow, + + basic ::= tyvar | T basic ... basic + + r2 ::= forall tvs. cxt => r2a + r2a ::= r1 -> r2a | basic + r1 ::= forall tvs. cxt => r0 + r0 ::= r0 -> r0 | basic + +Another thing is to check that type synonyms are saturated. +This might not necessarily show up in kind checking. + type A i = i + data T k = MkT (k Int) + f :: T A -- BAD! 
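To make the rank grammar above concrete, here are a few illustrative
signatures (hypothetical functions, not taken from the compiler):

	f0 :: Int -> Bool				-- Rank 0: a plain monotype
	f1 :: forall a. Eq a => a -> a -> Bool		-- Rank 1: foralls at the front only
	f2 :: (forall a. a -> a) -> Int			-- Rank 2: a Rank-1 type to the left of an arrow

Under the Haskell 98 rules below only f0 and f1 are accepted in an ordinary
type signature; f2 needs -fglasgow-exts.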
+ + +\begin{code} +checkValidType :: UserTypeCtxt -> Type -> TcM () +-- Checks that the type is valid for the given context +checkValidType ctxt ty + = traceTc (text "checkValidType" <+> ppr ty) `thenM_` + doptM Opt_GlasgowExts `thenM` \ gla_exts -> + let + rank | gla_exts = Arbitrary + | otherwise + = case ctxt of -- Haskell 98 + GenPatCtxt -> Rank 0 + LamPatSigCtxt -> Rank 0 + BindPatSigCtxt -> Rank 0 + DefaultDeclCtxt-> Rank 0 + ResSigCtxt -> Rank 0 + TySynCtxt _ -> Rank 0 + ExprSigCtxt -> Rank 1 + FunSigCtxt _ -> Rank 1 + ConArgCtxt _ -> Rank 1 -- We are given the type of the entire + -- constructor, hence rank 1 + ForSigCtxt _ -> Rank 1 + RuleSigCtxt _ -> Rank 1 + SpecInstCtxt -> Rank 1 + + actual_kind = typeKind ty + + kind_ok = case ctxt of + TySynCtxt _ -> True -- Any kind will do + ResSigCtxt -> isOpenTypeKind actual_kind + ExprSigCtxt -> isOpenTypeKind actual_kind + GenPatCtxt -> isLiftedTypeKind actual_kind + ForSigCtxt _ -> isLiftedTypeKind actual_kind + other -> isArgTypeKind actual_kind + + ubx_tup | not gla_exts = UT_NotOk + | otherwise = case ctxt of + TySynCtxt _ -> UT_Ok + ExprSigCtxt -> UT_Ok + other -> UT_NotOk + -- Unboxed tuples ok in function results, + -- but for type synonyms we allow them even at + -- top level + in + -- Check that the thing has kind Type, and is lifted if necessary + checkTc kind_ok (kindErr actual_kind) `thenM_` + + -- Check the internal validity of the type itself + check_poly_type rank ubx_tup ty `thenM_` + + traceTc (text "checkValidType done" <+> ppr ty) +\end{code} + + +\begin{code} +data Rank = Rank Int | Arbitrary + +decRank :: Rank -> Rank +decRank Arbitrary = Arbitrary +decRank (Rank n) = Rank (n-1) + +---------------------------------------- +data UbxTupFlag = UT_Ok | UT_NotOk + -- The "Ok" version means "ok if -fglasgow-exts is on" + +---------------------------------------- +check_poly_type :: Rank -> UbxTupFlag -> Type -> TcM () +check_poly_type (Rank 0) ubx_tup ty + = check_tau_type (Rank 0) ubx_tup ty + +check_poly_type rank ubx_tup ty + = let + (tvs, theta, tau) = tcSplitSigmaTy ty + in + check_valid_theta SigmaCtxt theta `thenM_` + check_tau_type (decRank rank) ubx_tup tau `thenM_` + checkFreeness tvs theta `thenM_` + checkAmbiguity tvs theta (tyVarsOfType tau) + +---------------------------------------- +check_arg_type :: Type -> TcM () +-- The sort of type that can instantiate a type variable, +-- or be the argument of a type constructor. +-- Not an unboxed tuple, but now *can* be a forall (since impredicativity) +-- Other unboxed types are very occasionally allowed as type +-- arguments depending on the kind of the type constructor +-- +-- For example, we want to reject things like: +-- +-- instance Ord a => Ord (forall s. T s a) +-- and +-- g :: T s (forall b.b) +-- +-- NB: unboxed tuples can have polymorphic or unboxed args. +-- This happens in the workers for functions returning +-- product types with polymorphic components. +-- But not in user code. +-- Anyway, they are dealt with by a special case in check_tau_type + +check_arg_type ty + = check_poly_type Arbitrary UT_NotOk ty `thenM_` + checkTc (not (isUnLiftedType ty)) (unliftedArgErr ty) + +---------------------------------------- +check_tau_type :: Rank -> UbxTupFlag -> Type -> TcM () +-- Rank is allowed rank for function args +-- No foralls otherwise + +check_tau_type rank ubx_tup ty@(ForAllTy _ _) = failWithTc (forAllTyErr ty) +check_tau_type rank ubx_tup ty@(FunTy (PredTy _) _) = failWithTc (forAllTyErr ty) + -- Reject e.g. 
(Maybe (?x::Int => Int)), with a decent error message + +-- Naked PredTys don't usually show up, but they can as a result of +-- {-# SPECIALISE instance Ord Char #-} +-- The Right Thing would be to fix the way that SPECIALISE instance pragmas +-- are handled, but the quick thing is just to permit PredTys here. +check_tau_type rank ubx_tup (PredTy sty) = getDOpts `thenM` \ dflags -> + check_source_ty dflags TypeCtxt sty + +check_tau_type rank ubx_tup (TyVarTy _) = returnM () +check_tau_type rank ubx_tup ty@(FunTy arg_ty res_ty) + = check_poly_type rank UT_NotOk arg_ty `thenM_` + check_poly_type rank UT_Ok res_ty + +check_tau_type rank ubx_tup (AppTy ty1 ty2) + = check_arg_type ty1 `thenM_` check_arg_type ty2 + +check_tau_type rank ubx_tup (NoteTy other_note ty) + = check_tau_type rank ubx_tup ty + +check_tau_type rank ubx_tup ty@(TyConApp tc tys) + | isSynTyCon tc + = do { -- It's OK to have an *over-applied* type synonym + -- data Tree a b = ... + -- type Foo a = Tree [a] + -- f :: Foo a b -> ... + ; case tcView ty of + Just ty' -> check_tau_type rank ubx_tup ty' -- Check expansion + Nothing -> failWithTc arity_msg + + ; gla_exts <- doptM Opt_GlasgowExts + ; if gla_exts then + -- If -fglasgow-exts then don't check the type arguments + -- This allows us to instantiate a synonym defn with a + -- for-all type, or with a partially-applied type synonym. + -- e.g. type T a b = a + -- type S m = m () + -- f :: S (T Int) + -- Here, T is partially applied, so it's illegal in H98. + -- But if you expand S first, then T we get just + -- f :: Int + -- which is fine. + returnM () + else + -- For H98, do check the type args + mappM_ check_arg_type tys + } + + | isUnboxedTupleTyCon tc + = doptM Opt_GlasgowExts `thenM` \ gla_exts -> + checkTc (ubx_tup_ok gla_exts) ubx_tup_msg `thenM_` + mappM_ (check_tau_type (Rank 0) UT_Ok) tys + -- Args are allowed to be unlifted, or + -- more unboxed tuples, so can't use check_arg_ty + + | otherwise + = mappM_ check_arg_type tys + + where + ubx_tup_ok gla_exts = case ubx_tup of { UT_Ok -> gla_exts; other -> False } + + n_args = length tys + tc_arity = tyConArity tc + + arity_msg = arityErr "Type synonym" (tyConName tc) tc_arity n_args + ubx_tup_msg = ubxArgTyErr ty + +---------------------------------------- +forAllTyErr ty = ptext SLIT("Illegal polymorphic or qualified type:") <+> ppr ty +unliftedArgErr ty = ptext SLIT("Illegal unlifted type argument:") <+> ppr ty +ubxArgTyErr ty = ptext SLIT("Illegal unboxed tuple type as function argument:") <+> ppr ty +kindErr kind = ptext SLIT("Expecting an ordinary type, but found a type of kind") <+> ppr kind +\end{code} + + + +%************************************************************************ +%* * +\subsection{Checking a theta or source type} +%* * +%************************************************************************ + +\begin{code} +-- Enumerate the contexts in which a "source type", <S>, can occur +-- Eq a +-- or ?x::Int +-- or r <: {x::Int} +-- or (N a) where N is a newtype + +data SourceTyCtxt + = ClassSCCtxt Name -- Superclasses of clas + -- class <S> => C a where ... + | SigmaCtxt -- Theta part of a normal for-all type + -- f :: <S> => a -> a + | DataTyCtxt Name -- Theta part of a data decl + -- data <S> => T a = MkT a + | TypeCtxt -- Source type in an ordinary type + -- f :: N a -> N a + | InstThetaCtxt -- Context of an instance decl + -- instance <S> => C [a] where ... 
+
+pprSourceTyCtxt (ClassSCCtxt c) = ptext SLIT("the super-classes of class") <+> quotes (ppr c)
+pprSourceTyCtxt SigmaCtxt       = ptext SLIT("the context of a polymorphic type")
+pprSourceTyCtxt (DataTyCtxt tc) = ptext SLIT("the context of the data type declaration for") <+> quotes (ppr tc)
+pprSourceTyCtxt InstThetaCtxt   = ptext SLIT("the context of an instance declaration")
+pprSourceTyCtxt TypeCtxt        = ptext SLIT("the context of a type")
+\end{code}
+
+\begin{code}
+checkValidTheta :: SourceTyCtxt -> ThetaType -> TcM ()
+checkValidTheta ctxt theta
+  = addErrCtxt (checkThetaCtxt ctxt theta) (check_valid_theta ctxt theta)
+
+-------------------------
+check_valid_theta ctxt []
+  = returnM ()
+check_valid_theta ctxt theta
+  = getDOpts					`thenM` \ dflags ->
+    warnTc (notNull dups) (dupPredWarn dups)	`thenM_`
+    mappM_ (check_source_ty dflags ctxt) theta
+  where
+    (_,dups) = removeDups tcCmpPred theta
+
+-------------------------
+check_source_ty dflags ctxt pred@(ClassP cls tys)
+  = 	-- Class predicates are valid in all contexts
+    checkTc (arity == n_tys) arity_err		`thenM_`
+
+	-- Check the form of the argument types
+    mappM_ check_arg_type tys			`thenM_`
+    checkTc (check_class_pred_tys dflags ctxt tys)
+	    (predTyVarErr pred $$ how_to_allow)
+
+  where
+    class_name = className cls
+    arity      = classArity cls
+    n_tys      = length tys
+    arity_err  = arityErr "Class" class_name arity n_tys
+    how_to_allow = parens (ptext SLIT("Use -fglasgow-exts to permit this"))
+
+check_source_ty dflags SigmaCtxt (IParam _ ty) = check_arg_type ty
+	-- Implicit parameters are only allowed in type
+	-- signatures; not in instance decls, superclasses etc
+	-- The reason for not allowing implicit params in instances is a bit subtle
+	-- If we allowed	instance (?x::Int, Eq a) => Foo [a] where ...
+	-- then when we saw (e :: (?x::Int) => t) it would be unclear how to
+	-- discharge all the potential uses of the ?x in e.  For example, a
+	-- constraint Foo [Int] might come out of e, and applying the
+	-- instance decl would show up two uses of ?x.
+
+-- Catch-all
+check_source_ty dflags ctxt sty = failWithTc (badSourceTyErr sty)
+
+-------------------------
+check_class_pred_tys dflags ctxt tys
+  = case ctxt of
+	TypeCtxt      -> True	-- {-# SPECIALISE instance Eq (T Int) #-} is fine
+	InstThetaCtxt -> gla_exts || undecidable_ok || all tcIsTyVarTy tys
+				-- Further checks on head and theta in
+				-- checkInstTermination
+	other	      -> gla_exts || all tyvar_head tys
+  where
+    gla_exts       = dopt Opt_GlasgowExts dflags
+    undecidable_ok = dopt Opt_AllowUndecidableInstances dflags
+
+-------------------------
+tyvar_head ty			-- Haskell 98 allows predicates of form
+  | tcIsTyVarTy ty = True	-- 	C (a ty1 .. tyn)
+  | otherwise			-- where a is a type variable
+  = case tcSplitAppTy_maybe ty of
+	Just (ty, _) -> tyvar_head ty
+	Nothing	     -> False
+\end{code}
+
+Check for ambiguity
+~~~~~~~~~~~~~~~~~~~
+	  forall V. P => tau
+is ambiguous if P contains generic variables
+(i.e. one of the Vs) that are not mentioned in tau
+
+However, we need to take account of functional dependencies
+when we speak of 'mentioned in tau'.  Example:
+	class C a b | a -> b where ...
+Then the type
+	forall x y. (C x y) => x
+is not ambiguous because x is mentioned and x determines y
+
+NB: the ambiguity check is only used for *user* types, not for types
+coming from interface files.  The latter can legitimately have
+ambiguous types.
Example
+
+   class S a where s :: a -> (Int,Int)
+   instance S Char where s _ = (1,1)
+   f :: S a => [a] -> Int -> (Int,Int)
+   f (_::[a]) x = (a*x,b)
+	where (a,b) = s (undefined::a)
+
+Here the worker for f gets the type
+	fw :: forall a. S a => Int -> (# Int, Int #)
+
+If the list of tv_names is empty, we have a monotype, and then we
+don't need to check for ambiguity either, because the test can't fail
+(see is_ambig).
+
+\begin{code}
+checkAmbiguity :: [TyVar] -> ThetaType -> TyVarSet -> TcM ()
+checkAmbiguity forall_tyvars theta tau_tyvars
+  = mappM_ complain (filter is_ambig theta)
+  where
+    complain pred     = addErrTc (ambigErr pred)
+    extended_tau_vars = grow theta tau_tyvars
+
+	-- Only a *class* predicate can give rise to ambiguity
+	-- An *implicit parameter* cannot.  For example:
+	--	foo :: (?x :: [a]) => Int
+	--	foo = length ?x
+	-- is fine.  The call site will supply a particular 'x'
+    is_ambig pred     = isClassPred  pred &&
+			any ambig_var (varSetElems (tyVarsOfPred pred))
+
+    ambig_var ct_var  = (ct_var `elem` forall_tyvars) &&
+			not (ct_var `elemVarSet` extended_tau_vars)
+
+ambigErr pred
+  = sep [ptext SLIT("Ambiguous constraint") <+> quotes (pprPred pred),
+	 nest 4 (ptext SLIT("At least one of the forall'd type variables mentioned by the constraint") $$
+		 ptext SLIT("must be reachable from the type after the '=>'"))]
+\end{code}
+
+In addition, GHC insists that at least one type variable
+in each constraint is in V.  So we disallow a type like
+	forall a. Eq b => b -> b
+even in a scope where b is in scope.
+
+\begin{code}
+checkFreeness forall_tyvars theta
+  = mappM_ complain (filter is_free theta)
+  where
+    is_free pred     =  not (isIPPred pred)
+		     && not (any bound_var (varSetElems (tyVarsOfPred pred)))
+    bound_var ct_var = ct_var `elem` forall_tyvars
+    complain pred    = addErrTc (freeErr pred)
+
+freeErr pred
+  = sep [ptext SLIT("All of the type variables in the constraint") <+> quotes (pprPred pred) <+>
+		   ptext SLIT("are already in scope"),
+	 nest 4 (ptext SLIT("(at least one must be universally quantified here)"))
+	]
+\end{code}
+
+\begin{code}
+checkThetaCtxt ctxt theta
+  = vcat [ptext SLIT("In the context:") <+> pprTheta theta,
+	  ptext SLIT("While checking") <+> pprSourceTyCtxt ctxt ]
+
+badSourceTyErr sty = ptext SLIT("Illegal constraint") <+> pprPred sty
+predTyVarErr pred  = sep [ptext SLIT("Non type-variable argument"),
+			  nest 2 (ptext SLIT("in the constraint:") <+> pprPred pred)]
+dupPredWarn dups   = ptext SLIT("Duplicate constraint(s):") <+> pprWithCommas pprPred (map head dups)
+
+arityErr kind name n m
+  = hsep [ text kind, quotes (ppr name), ptext SLIT("should have"),
+	   n_arguments <> comma, text "but has been given", int m]
+    where
+	n_arguments | n == 0 = ptext SLIT("no arguments")
+		    | n == 1 = ptext SLIT("1 argument")
+		    | True   = hsep [int n, ptext SLIT("arguments")]
+\end{code}
+
+
+%************************************************************************
+%*                                                                      *
+\subsection{Checking for a decent instance head type}
+%*                                                                      *
+%************************************************************************
+
+@checkValidInstHead@ checks the type {\em and} its syntactic constraints:
+it must normally look like: @instance Foo (Tycon a b c ...) ...@
+
+The exceptions to this syntactic checking: (1)~if the @GlasgowExts@
+flag is on, or (2)~the instance is imported (they must have been
+compiled elsewhere). In these cases, we let them go through anyway.
+
+We can also have instances for functions: @instance Foo (a -> b) ...@.
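For example, with a hypothetical class Foo and data type T, the shapes that
this check accepts or rejects are:

	instance Foo (T a b)	-- OK in Haskell 98: a constructor applied to distinct tyvars
	instance Foo (a -> b)	-- OK: (->) counts as a type constructor here
	instance Foo [Int]	-- rejected in Haskell 98 (Int is not a type variable),
				-- but accepted with -fglasgow-exts
	instance Foo String	-- rejected in Haskell 98: String is a type synonym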
+ +\begin{code} +checkValidInstHead :: Type -> TcM (Class, [TcType]) + +checkValidInstHead ty -- Should be a source type + = case tcSplitPredTy_maybe ty of { + Nothing -> failWithTc (instTypeErr (ppr ty) empty) ; + Just pred -> + + case getClassPredTys_maybe pred of { + Nothing -> failWithTc (instTypeErr (pprPred pred) empty) ; + Just (clas,tys) -> + + getDOpts `thenM` \ dflags -> + mappM_ check_arg_type tys `thenM_` + check_inst_head dflags clas tys `thenM_` + returnM (clas, tys) + }} + +check_inst_head dflags clas tys + -- If GlasgowExts then check at least one isn't a type variable + | dopt Opt_GlasgowExts dflags + = mapM_ check_one tys + + -- WITH HASKELL 98, MUST HAVE C (T a b c) + | otherwise + = checkTc (isSingleton tys && tcValidInstHeadTy first_ty) + (instTypeErr (pprClassPred clas tys) head_shape_msg) + + where + (first_ty : _) = tys + + head_shape_msg = parens (text "The instance type must be of form (T a b c)" $$ + text "where T is not a synonym, and a,b,c are distinct type variables") + + -- For now, I only allow tau-types (not polytypes) in + -- the head of an instance decl. + -- E.g. instance C (forall a. a->a) is rejected + -- One could imagine generalising that, but I'm not sure + -- what all the consequences might be + check_one ty = do { check_tau_type (Rank 0) UT_NotOk ty + ; checkTc (not (isUnLiftedType ty)) (unliftedArgErr ty) } + +instTypeErr pp_ty msg + = sep [ptext SLIT("Illegal instance declaration for") <+> quotes pp_ty, + nest 4 msg] +\end{code} + + +%************************************************************************ +%* * +\subsection{Checking instance for termination} +%* * +%************************************************************************ + + +\begin{code} +checkValidInstance :: [TyVar] -> ThetaType -> Class -> [TcType] -> TcM () +checkValidInstance tyvars theta clas inst_tys + = do { gla_exts <- doptM Opt_GlasgowExts + ; undecidable_ok <- doptM Opt_AllowUndecidableInstances + + ; checkValidTheta InstThetaCtxt theta + ; checkAmbiguity tyvars theta (tyVarsOfTypes inst_tys) + + -- Check that instance inference will terminate (if we care) + -- For Haskell 98, checkValidTheta has already done that + ; when (gla_exts && not undecidable_ok) $ + checkInstTermination theta inst_tys + + -- The Coverage Condition + ; checkTc (undecidable_ok || checkInstCoverage clas inst_tys) + (instTypeErr (pprClassPred clas inst_tys) msg) + } + where + msg = parens (ptext SLIT("the Coverage Condition fails for one of the functional dependencies")) +\end{code} + +Termination test: each assertion in the context satisfies + (1) no variable has more occurrences in the assertion than in the head, and + (2) the assertion has fewer constructors and variables (taken together + and counting repetitions) than the head. +This is only needed with -fglasgow-exts, as Haskell 98 restrictions +(which have already been checked) guarantee termination. + +The underlying idea is that + + for any ground substitution, each assertion in the + context has fewer type constructors than the head. 
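To see the two conditions in action on user code, here is a standalone sketch (not part of the patch; @Size@ is an invented class), ahead of the implementation below:

\begin{code}
-- Illustration only: instance contexts the termination test accepts or rejects.
module TerminationDemo where

class Size a where
  size :: a -> Int

-- Accepted: the assertion (Size a) mentions 'a' no more often than the head
-- (Size [a]) does, and it has fewer constructors and variables (1 against 2).
instance Size a => Size [a] where
  size xs = sum (map size xs)

-- Rejected (even with -fglasgow-exts) unless -fallow-undecidable-instances
-- is given: 'a' occurs twice in the assertion but only once in the head, and
-- the assertion (3 symbols) is no smaller than the head (2 symbols), so
-- context reduction could grow the goal instead of shrinking it.
-- instance Size (a, a) => Size (Maybe a) where
--   size _ = 1
\end{code}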
+ + +\begin{code} +checkInstTermination :: ThetaType -> [TcType] -> TcM () +checkInstTermination theta tys + = do { mappM_ (check_nomore (fvTypes tys)) theta + ; mappM_ (check_smaller (sizeTypes tys)) theta } + +check_nomore :: [TyVar] -> PredType -> TcM () +check_nomore fvs pred + = checkTc (null (fvPred pred \\ fvs)) + (predUndecErr pred nomoreMsg $$ parens undecidableMsg) + +check_smaller :: Int -> PredType -> TcM () +check_smaller n pred + = checkTc (sizePred pred < n) + (predUndecErr pred smallerMsg $$ parens undecidableMsg) + +predUndecErr pred msg = sep [msg, + nest 2 (ptext SLIT("in the constraint:") <+> pprPred pred)] + +nomoreMsg = ptext SLIT("Variable occurs more often in a constraint than in the instance head") +smallerMsg = ptext SLIT("Constraint is no smaller than the instance head") +undecidableMsg = ptext SLIT("Use -fallow-undecidable-instances to permit this") + +-- Free variables of a type, retaining repetitions, and expanding synonyms +fvType :: Type -> [TyVar] +fvType ty | Just exp_ty <- tcView ty = fvType exp_ty +fvType (TyVarTy tv) = [tv] +fvType (TyConApp _ tys) = fvTypes tys +fvType (NoteTy _ ty) = fvType ty +fvType (PredTy pred) = fvPred pred +fvType (FunTy arg res) = fvType arg ++ fvType res +fvType (AppTy fun arg) = fvType fun ++ fvType arg +fvType (ForAllTy tyvar ty) = filter (/= tyvar) (fvType ty) + +fvTypes :: [Type] -> [TyVar] +fvTypes tys = concat (map fvType tys) + +fvPred :: PredType -> [TyVar] +fvPred (ClassP _ tys') = fvTypes tys' +fvPred (IParam _ ty) = fvType ty + +-- Size of a type: the number of variables and constructors +sizeType :: Type -> Int +sizeType ty | Just exp_ty <- tcView ty = sizeType exp_ty +sizeType (TyVarTy _) = 1 +sizeType (TyConApp _ tys) = sizeTypes tys + 1 +sizeType (NoteTy _ ty) = sizeType ty +sizeType (PredTy pred) = sizePred pred +sizeType (FunTy arg res) = sizeType arg + sizeType res + 1 +sizeType (AppTy fun arg) = sizeType fun + sizeType arg +sizeType (ForAllTy _ ty) = sizeType ty + +sizeTypes :: [Type] -> Int +sizeTypes xs = sum (map sizeType xs) + +sizePred :: PredType -> Int +sizePred (ClassP _ tys') = sizeTypes tys' +sizePred (IParam _ ty) = sizeType ty +\end{code} diff --git a/compiler/typecheck/TcMatches.hi-boot-5 b/compiler/typecheck/TcMatches.hi-boot-5 new file mode 100644 index 0000000000..43e2330683 --- /dev/null +++ b/compiler/typecheck/TcMatches.hi-boot-5 @@ -0,0 +1,10 @@ +__interface TcMatches 1 0 where +__export TcMatches tcGRHSsPat tcMatchesFun; +1 tcGRHSsPat :: HsExpr.GRHSs Name.Name + -> TcUnify.Expected TcType.TcType + -> TcRnTypes.TcM (HsExpr.GRHSs Var.Id) ; + +1 tcMatchesFun :: Name.Name + -> [HsExpr.LMatch Name.Name] + -> TcUnify.Expected TcType.TcType + -> TcRnTypes.TcM [HsExpr.LMatch Var.Id] ; diff --git a/compiler/typecheck/TcMatches.hi-boot-6 b/compiler/typecheck/TcMatches.hi-boot-6 new file mode 100644 index 0000000000..fb723a4527 --- /dev/null +++ b/compiler/typecheck/TcMatches.hi-boot-6 @@ -0,0 +1,10 @@ +module TcMatches where + +tcGRHSsPat :: HsExpr.GRHSs Name.Name + -> TcType.BoxyRhoType + -> TcRnTypes.TcM (HsExpr.GRHSs Var.Id) + +tcMatchesFun :: Name.Name + -> HsExpr.MatchGroup Name.Name + -> TcType.BoxyRhoType + -> TcRnTypes.TcM (HsBinds.ExprCoFn, HsExpr.MatchGroup Var.Id) diff --git a/compiler/typecheck/TcMatches.lhs b/compiler/typecheck/TcMatches.lhs new file mode 100644 index 0000000000..07a1094d58 --- /dev/null +++ b/compiler/typecheck/TcMatches.lhs @@ -0,0 +1,515 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[TcMatches]{Typecheck some @Matches@} + 
+\begin{code} +module TcMatches ( tcMatchesFun, tcGRHSsPat, tcMatchesCase, tcMatchLambda, + matchCtxt, TcMatchCtxt(..), + tcStmts, tcDoStmts, + tcDoStmt, tcMDoStmt, tcGuardStmt + ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} TcExpr( tcSyntaxOp, tcInferRho, tcMonoExpr, tcPolyExpr ) + +import HsSyn ( HsExpr(..), LHsExpr, MatchGroup(..), + Match(..), LMatch, GRHSs(..), GRHS(..), + Stmt(..), LStmt, HsMatchContext(..), HsStmtContext(..), + pprMatch, isIrrefutableHsPat, mkHsCoerce, + pprMatchContext, pprStmtContext, + noSyntaxExpr, matchGroupArity, pprMatches, + ExprCoFn ) + +import TcRnMonad +import TcHsType ( tcPatSig, UserTypeCtxt(..) ) +import Inst ( newMethodFromName ) +import TcEnv ( TcId, tcLookupLocalIds, tcLookupId, tcExtendIdEnv, + tcExtendTyVarEnv2 ) +import TcPat ( PatCtxt(..), tcPats, tcPat ) +import TcMType ( newFlexiTyVarTy, newFlexiTyVarTys ) +import TcType ( TcType, TcRhoType, + BoxySigmaType, BoxyRhoType, + mkFunTys, mkFunTy, mkAppTy, mkTyConApp, + liftedTypeKind ) +import TcBinds ( tcLocalBinds ) +import TcUnify ( boxySplitAppTy, boxySplitTyConApp, boxySplitListTy, + subFunTys, tcSubExp, withBox ) +import TcSimplify ( bindInstsOfLocalFuns ) +import Name ( Name ) +import TysWiredIn ( stringTy, boolTy, parrTyCon, listTyCon, mkListTy, mkPArrTy ) +import PrelNames ( bindMName, returnMName, mfixName, thenMName, failMName ) +import Id ( idType, mkLocalId ) +import TyCon ( TyCon ) +import Outputable +import SrcLoc ( Located(..), getLoc ) +import ErrUtils ( Message ) +\end{code} + +%************************************************************************ +%* * +\subsection{tcMatchesFun, tcMatchesCase} +%* * +%************************************************************************ + +@tcMatchesFun@ typechecks a @[Match]@ list which occurs in a +@FunMonoBind@. The second argument is the name of the function, which +is used in error messages. It checks that all the equations have the +same number of arguments before using @tcMatches@ to do the work. + +\begin{code} +tcMatchesFun :: Name + -> MatchGroup Name + -> BoxyRhoType -- Expected type of function + -> TcM (ExprCoFn, MatchGroup TcId) -- Returns type of body + +tcMatchesFun fun_name matches exp_ty + = do { -- Check that they all have the same no of arguments + -- Location is in the monad, set the caller so that + -- any inter-equation error messages get some vaguely + -- sensible location. Note: we have to do this odd + -- ann-grabbing, because we don't always have annotations in + -- hand when we call tcMatchesFun... + checkArgs fun_name matches + + -- ToDo: Don't use "expected" stuff if there ain't a type signature + -- because inconsistency between branches + -- may show up as something wrong with the (non-existent) type signature + + -- This is one of two places places we call subFunTys + -- The point is that if expected_y is a "hole", we want + -- to make pat_tys and rhs_ty as "holes" too. + ; subFunTys doc n_pats exp_ty $ \ pat_tys rhs_ty -> + tcMatches match_ctxt pat_tys rhs_ty matches + } + where + doc = ptext SLIT("The equation(s) for") <+> quotes (ppr fun_name) + <+> ptext SLIT("have") <+> speakNOf n_pats (ptext SLIT("argument")) + n_pats = matchGroupArity matches + match_ctxt = MC { mc_what = FunRhs fun_name, mc_body = tcPolyExpr } +\end{code} + +@tcMatchesCase@ doesn't do the argument-count check because the +parser guarantees that each equation has exactly one argument. 
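For illustration, here is a small source-level example of what this arity check (@checkArgs@, defined towards the end of this module) enforces, and why case alternatives need no such check. It is a sketch, not part of the patch; the function names are made up.

\begin{code}
-- Illustration only: the per-function equation-arity check described above.
module ArityDemo where

-- Accepted: every equation for 'both' has the same number of patterns.
both :: Bool -> Int -> Int
both True  n = n
both False _ = 0

-- Rejected, with a message along the lines of
--   Equations for `bad' have different numbers of arguments
-- because the first equation has one pattern and the second has two.
-- bad :: Bool -> Int -> Int
-- bad True    = \n -> n
-- bad False _ = 0

-- Case alternatives always carry exactly one pattern (the parser
-- guarantees it), so tcMatchesCase needs no arity check at all.
caseDemo :: Maybe Int -> Int
caseDemo m = case m of
  Nothing -> 0
  Just n  -> n
\end{code}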
+ +\begin{code} +tcMatchesCase :: TcMatchCtxt -- Case context + -> TcRhoType -- Type of scrutinee + -> MatchGroup Name -- The case alternatives + -> BoxyRhoType -- Type of whole case expressions + -> TcM (MatchGroup TcId) -- Translated alternatives + +tcMatchesCase ctxt scrut_ty matches res_ty + = tcMatches ctxt [scrut_ty] res_ty matches + +tcMatchLambda :: MatchGroup Name -> BoxyRhoType -> TcM (ExprCoFn, MatchGroup TcId) +tcMatchLambda match res_ty + = subFunTys doc n_pats res_ty $ \ pat_tys rhs_ty -> + tcMatches match_ctxt pat_tys rhs_ty match + where + n_pats = matchGroupArity match + doc = sep [ ptext SLIT("The lambda expression") + <+> quotes (pprSetDepth 1 $ pprMatches LambdaExpr match), + -- The pprSetDepth makes the abstraction print briefly + ptext SLIT("has") <+> speakNOf n_pats (ptext SLIT("argument"))] + match_ctxt = MC { mc_what = LambdaExpr, + mc_body = tcPolyExpr } +\end{code} + +@tcGRHSsPat@ typechecks @[GRHSs]@ that occur in a @PatMonoBind@. + +\begin{code} +tcGRHSsPat :: GRHSs Name -> BoxyRhoType -> TcM (GRHSs TcId) +tcGRHSsPat grhss res_ty = tcGRHSs match_ctxt grhss res_ty + where + match_ctxt = MC { mc_what = PatBindRhs, + mc_body = tcPolyExpr } +\end{code} + + +%************************************************************************ +%* * +\subsection{tcMatch} +%* * +%************************************************************************ + +\begin{code} +tcMatches :: TcMatchCtxt + -> [BoxySigmaType] -- Expected pattern types + -> BoxyRhoType -- Expected result-type of the Match. + -> MatchGroup Name + -> TcM (MatchGroup TcId) + +data TcMatchCtxt -- c.f. TcStmtCtxt, also in this module + = MC { mc_what :: HsMatchContext Name, -- What kind of thing this is + mc_body :: LHsExpr Name -- Type checker for a body of an alternative + -> BoxyRhoType + -> TcM (LHsExpr TcId) } + +tcMatches ctxt pat_tys rhs_ty (MatchGroup matches _) + = do { matches' <- mapM (tcMatch ctxt pat_tys rhs_ty) matches + ; return (MatchGroup matches' (mkFunTys pat_tys rhs_ty)) } + +------------- +tcMatch :: TcMatchCtxt + -> [BoxySigmaType] -- Expected pattern types + -> BoxyRhoType -- Expected result-type of the Match. 
+ -> LMatch Name + -> TcM (LMatch TcId) + +tcMatch ctxt pat_tys rhs_ty match + = wrapLocM (tc_match ctxt pat_tys rhs_ty) match + where + tc_match ctxt pat_tys rhs_ty match@(Match pats maybe_rhs_sig grhss) + = addErrCtxt (matchCtxt (mc_what ctxt) match) $ + do { (pats', grhss') <- tcPats LamPat pats pat_tys rhs_ty $ + tc_grhss ctxt maybe_rhs_sig grhss + ; returnM (Match pats' Nothing grhss') } + + tc_grhss ctxt Nothing grhss rhs_ty + = tcGRHSs ctxt grhss rhs_ty -- No result signature + + tc_grhss ctxt (Just res_sig) grhss rhs_ty + = do { (inner_ty, sig_tvs) <- tcPatSig ResSigCtxt res_sig rhs_ty + ; tcExtendTyVarEnv2 sig_tvs $ + tcGRHSs ctxt grhss inner_ty } + +------------- +tcGRHSs :: TcMatchCtxt -> GRHSs Name -> BoxyRhoType -> TcM (GRHSs TcId) + +-- Notice that we pass in the full res_ty, so that we get +-- good inference from simple things like +-- f = \(x::forall a.a->a) -> <stuff> +-- We used to force it to be a monotype when there was more than one guard +-- but we don't need to do that any more + +tcGRHSs ctxt (GRHSs grhss binds) res_ty + = do { (binds', grhss') <- tcLocalBinds binds $ + mappM (wrapLocM (tcGRHS ctxt res_ty)) grhss + + ; returnM (GRHSs grhss' binds') } + +------------- +tcGRHS :: TcMatchCtxt -> BoxyRhoType -> GRHS Name -> TcM (GRHS TcId) + +tcGRHS ctxt res_ty (GRHS guards rhs) + = do { (guards', rhs') <- tcStmts stmt_ctxt tcGuardStmt guards res_ty $ + mc_body ctxt rhs + ; return (GRHS guards' rhs') } + where + stmt_ctxt = PatGuard (mc_what ctxt) +\end{code} + + +%************************************************************************ +%* * +\subsection{@tcDoStmts@ typechecks a {\em list} of do statements} +%* * +%************************************************************************ + +\begin{code} +tcDoStmts :: HsStmtContext Name + -> [LStmt Name] + -> LHsExpr Name + -> BoxyRhoType + -> TcM (HsExpr TcId) -- Returns a HsDo +tcDoStmts ListComp stmts body res_ty + = do { elt_ty <- boxySplitListTy res_ty + ; (stmts', body') <- tcStmts ListComp (tcLcStmt listTyCon) stmts elt_ty $ + tcBody (doBodyCtxt ListComp body) body + ; return (HsDo ListComp stmts' body' (mkListTy elt_ty)) } + +tcDoStmts PArrComp stmts body res_ty + = do { [elt_ty] <- boxySplitTyConApp parrTyCon res_ty + ; (stmts', body') <- tcStmts PArrComp (tcLcStmt parrTyCon) stmts elt_ty $ + tcBody (doBodyCtxt PArrComp body) body + ; return (HsDo PArrComp stmts' body' (mkPArrTy elt_ty)) } + +tcDoStmts DoExpr stmts body res_ty + = do { (m_ty, elt_ty) <- boxySplitAppTy res_ty + ; let res_ty' = mkAppTy m_ty elt_ty -- The boxySplit consumes res_ty + ; (stmts', body') <- tcStmts DoExpr (tcDoStmt m_ty) stmts res_ty' $ + tcBody (doBodyCtxt DoExpr body) body + ; return (HsDo DoExpr stmts' body' res_ty') } + +tcDoStmts ctxt@(MDoExpr _) stmts body res_ty + = do { (m_ty, elt_ty) <- boxySplitAppTy res_ty + ; let res_ty' = mkAppTy m_ty elt_ty -- The boxySplit consumes res_ty + tc_rhs rhs = withBox liftedTypeKind $ \ pat_ty -> + tcMonoExpr rhs (mkAppTy m_ty pat_ty) + + ; (stmts', body') <- tcStmts ctxt (tcMDoStmt tc_rhs) stmts res_ty' $ + tcBody (doBodyCtxt ctxt body) body + + ; let names = [mfixName, bindMName, thenMName, returnMName, failMName] + ; insts <- mapM (newMethodFromName DoOrigin m_ty) names + ; return (HsDo (MDoExpr (names `zip` insts)) stmts' body' res_ty') } + +tcDoStmts ctxt stmts body res_ty = pprPanic "tcDoStmts" (pprStmtContext ctxt) + +tcBody :: Message -> LHsExpr Name -> BoxyRhoType -> TcM (LHsExpr TcId) +tcBody ctxt body res_ty + = -- addErrCtxt ctxt $ -- This context adds little that is useful + 
tcPolyExpr body res_ty +\end{code} + + +%************************************************************************ +%* * +\subsection{tcStmts} +%* * +%************************************************************************ + +\begin{code} +type TcStmtChecker + = forall thing. HsStmtContext Name + -> Stmt Name + -> BoxyRhoType -- Result type for comprehension + -> (BoxyRhoType -> TcM thing) -- Checker for what follows the stmt + -> TcM (Stmt TcId, thing) + + -- The incoming BoxyRhoType may be refined by type refinements + -- before being passed to the thing_inside + +tcStmts :: HsStmtContext Name + -> TcStmtChecker -- NB: higher-rank type + -> [LStmt Name] + -> BoxyRhoType + -> (BoxyRhoType -> TcM thing) + -> TcM ([LStmt TcId], thing) + +-- Note the higher-rank type. stmt_chk is applied at different +-- types in the equations for tcStmts + +tcStmts ctxt stmt_chk [] res_ty thing_inside + = do { thing <- thing_inside res_ty + ; return ([], thing) } + +-- LetStmts are handled uniformly, regardless of context +tcStmts ctxt stmt_chk (L loc (LetStmt binds) : stmts) res_ty thing_inside + = do { (binds', (stmts',thing)) <- tcLocalBinds binds $ + tcStmts ctxt stmt_chk stmts res_ty thing_inside + ; return (L loc (LetStmt binds') : stmts', thing) } + +-- For the vanilla case, handle the location-setting part +tcStmts ctxt stmt_chk (L loc stmt : stmts) res_ty thing_inside + = do { (stmt', (stmts', thing)) <- + setSrcSpan loc $ + addErrCtxt (stmtCtxt ctxt stmt) $ + stmt_chk ctxt stmt res_ty $ \ res_ty' -> + popErrCtxt $ + tcStmts ctxt stmt_chk stmts res_ty' $ + thing_inside + ; return (L loc stmt' : stmts', thing) } + +-------------------------------- +-- Pattern guards +tcGuardStmt :: TcStmtChecker +tcGuardStmt ctxt (ExprStmt guard _ _) res_ty thing_inside + = do { guard' <- tcMonoExpr guard boolTy + ; thing <- thing_inside res_ty + ; return (ExprStmt guard' noSyntaxExpr boolTy, thing) } + +tcGuardStmt ctxt (BindStmt pat rhs _ _) res_ty thing_inside + = do { (rhs', rhs_ty) <- tcInferRho rhs + ; (pat', thing) <- tcPat LamPat pat rhs_ty res_ty thing_inside + ; return (BindStmt pat' rhs' noSyntaxExpr noSyntaxExpr, thing) } + +tcGuardStmt ctxt stmt res_ty thing_inside + = pprPanic "tcGuardStmt: unexpected Stmt" (ppr stmt) + + +-------------------------------- +-- List comprehensions and PArrays + +tcLcStmt :: TyCon -- The list/Parray type constructor ([] or PArray) + -> TcStmtChecker + +-- A generator, pat <- rhs +tcLcStmt m_tc ctxt (BindStmt pat rhs _ _) res_ty thing_inside + = do { (rhs', pat_ty) <- withBox liftedTypeKind $ \ ty -> + tcMonoExpr rhs (mkTyConApp m_tc [ty]) + ; (pat', thing) <- tcPat LamPat pat pat_ty res_ty thing_inside + ; return (BindStmt pat' rhs' noSyntaxExpr noSyntaxExpr, thing) } + +-- A boolean guard +tcLcStmt m_tc ctxt (ExprStmt rhs _ _) res_ty thing_inside + = do { rhs' <- tcMonoExpr rhs boolTy + ; thing <- thing_inside res_ty + ; return (ExprStmt rhs' noSyntaxExpr boolTy, thing) } + +-- A parallel set of comprehensions +-- [ (g x, h x) | ... ; let g v = ... +-- | ... ; let h v = ... ] +-- +-- It's possible that g,h are overloaded, so we need to feed the LIE from the +-- (g x, h x) up through both lots of bindings (so we get the bindInstsOfLocalFuns). +-- Similarly if we had an existential pattern match: +-- +-- data T = forall a. Show a => C a +-- +-- [ (show x, show y) | ... ; C x <- ... +-- | ... ; C y <- ... ] +-- +-- Then we need the LIE from (show x, show y) to be simplified against +-- the bindings for x and y. 
+-- +-- It's difficult to do this in parallel, so we rely on the renamer to +-- ensure that g,h and x,y don't duplicate, and simply grow the environment. +-- So the binders of the first parallel group will be in scope in the second +-- group. But that's fine; there's no shadowing to worry about. + +tcLcStmt m_tc ctxt (ParStmt bndr_stmts_s) elt_ty thing_inside + = do { (pairs', thing) <- loop bndr_stmts_s + ; return (ParStmt pairs', thing) } + where + -- loop :: [([LStmt Name], [Name])] -> TcM ([([LStmt TcId], [TcId])], thing) + loop [] = do { thing <- thing_inside elt_ty -- No refinement from pattern + ; return ([], thing) } -- matching in the branches + + loop ((stmts, names) : pairs) + = do { (stmts', (ids, pairs', thing)) + <- tcStmts ctxt (tcLcStmt m_tc) stmts elt_ty $ \ elt_ty' -> + do { ids <- tcLookupLocalIds names + ; (pairs', thing) <- loop pairs + ; return (ids, pairs', thing) } + ; return ( (stmts', ids) : pairs', thing ) } + +tcLcStmt m_tc ctxt stmt elt_ty thing_inside + = pprPanic "tcLcStmt: unexpected Stmt" (ppr stmt) + +-------------------------------- +-- Do-notation +-- The main excitement here is dealing with rebindable syntax + +tcDoStmt :: TcType -- Monad type, m + -> TcStmtChecker + +tcDoStmt m_ty ctxt (BindStmt pat rhs bind_op fail_op) res_ty thing_inside + = do { (rhs', pat_ty) <- withBox liftedTypeKind $ \ pat_ty -> + tcMonoExpr rhs (mkAppTy m_ty pat_ty) + -- We should use type *inference* for the RHS computations, becuase of GADTs. + -- do { pat <- rhs; <rest> } + -- is rather like + -- case rhs of { pat -> <rest> } + -- We do inference on rhs, so that information about its type can be refined + -- when type-checking the pattern. + + ; (pat', thing) <- tcPat LamPat pat pat_ty res_ty thing_inside + + -- Deal with rebindable syntax; (>>=) :: m a -> (a -> m b) -> m b + ; let bind_ty = mkFunTys [mkAppTy m_ty pat_ty, + mkFunTy pat_ty res_ty] res_ty + ; bind_op' <- tcSyntaxOp DoOrigin bind_op bind_ty + -- If (but only if) the pattern can fail, + -- typecheck the 'fail' operator + ; fail_op' <- if isIrrefutableHsPat pat' + then return noSyntaxExpr + else tcSyntaxOp DoOrigin fail_op (mkFunTy stringTy res_ty) + ; return (BindStmt pat' rhs' bind_op' fail_op', thing) } + + +tcDoStmt m_ty ctxt (ExprStmt rhs then_op _) res_ty thing_inside + = do { -- Deal with rebindable syntax; (>>) :: m a -> m b -> m b + a_ty <- newFlexiTyVarTy liftedTypeKind + ; let rhs_ty = mkAppTy m_ty a_ty + then_ty = mkFunTys [rhs_ty, res_ty] res_ty + ; then_op' <- tcSyntaxOp DoOrigin then_op then_ty + ; rhs' <- tcPolyExpr rhs rhs_ty + ; thing <- thing_inside res_ty + ; return (ExprStmt rhs' then_op' rhs_ty, thing) } + +tcDoStmt m_ty ctxt stmt res_ty thing_inside + = pprPanic "tcDoStmt: unexpected Stmt" (ppr stmt) + +-------------------------------- +-- Mdo-notation +-- The distinctive features here are +-- (a) RecStmts, and +-- (b) no rebindable syntax + +tcMDoStmt :: (LHsExpr Name -> TcM (LHsExpr TcId, TcType)) -- RHS inference + -> TcStmtChecker +tcMDoStmt tc_rhs ctxt (BindStmt pat rhs bind_op fail_op) res_ty thing_inside + = do { (rhs', pat_ty) <- tc_rhs rhs + ; (pat', thing) <- tcPat LamPat pat pat_ty res_ty thing_inside + ; return (BindStmt pat' rhs' noSyntaxExpr noSyntaxExpr, thing) } + +tcMDoStmt tc_rhs ctxt (ExprStmt rhs then_op _) res_ty thing_inside + = do { (rhs', elt_ty) <- tc_rhs rhs + ; thing <- thing_inside res_ty + ; return (ExprStmt rhs' noSyntaxExpr elt_ty, thing) } + +tcMDoStmt tc_rhs ctxt (RecStmt stmts laterNames recNames _ _) res_ty thing_inside + = do { rec_tys <- newFlexiTyVarTys 
(length recNames) liftedTypeKind + ; let rec_ids = zipWith mkLocalId recNames rec_tys + ; tcExtendIdEnv rec_ids $ do + { (stmts', (later_ids, rec_rets)) + <- tcStmts ctxt (tcMDoStmt tc_rhs) stmts res_ty $ \ res_ty' -> + -- ToDo: res_ty not really right + do { rec_rets <- zipWithM tc_ret recNames rec_tys + ; later_ids <- tcLookupLocalIds laterNames + ; return (later_ids, rec_rets) } + + ; (thing,lie) <- tcExtendIdEnv later_ids (getLIE (thing_inside res_ty)) + -- NB: The rec_ids for the recursive things + -- already scope over this part. This binding may shadow + -- some of them with polymorphic things with the same Name + -- (see note [RecStmt] in HsExpr) + ; lie_binds <- bindInstsOfLocalFuns lie later_ids + + ; return (RecStmt stmts' later_ids rec_ids rec_rets lie_binds, thing) + }} + where + -- Unify the types of the "final" Ids with those of "knot-tied" Ids + tc_ret rec_name mono_ty + = do { poly_id <- tcLookupId rec_name + -- poly_id may have a polymorphic type + -- but mono_ty is just a monomorphic type variable + ; co_fn <- tcSubExp (idType poly_id) mono_ty + ; return (mkHsCoerce co_fn (HsVar poly_id)) } + +tcMDoStmt tc_rhs ctxt stmt res_ty thing_inside + = pprPanic "tcMDoStmt: unexpected Stmt" (ppr stmt) + +\end{code} + + +%************************************************************************ +%* * +\subsection{Errors and contexts} +%* * +%************************************************************************ + +@sameNoOfArgs@ takes a @[RenamedMatch]@ and decides whether the same +number of args are used in each equation. + +\begin{code} +checkArgs :: Name -> MatchGroup Name -> TcM () +checkArgs fun (MatchGroup (match1:matches) _) + | null bad_matches = return () + | otherwise + = failWithTc (vcat [ptext SLIT("Equations for") <+> quotes (ppr fun) <+> + ptext SLIT("have different numbers of arguments"), + nest 2 (ppr (getLoc match1)), + nest 2 (ppr (getLoc (head bad_matches)))]) + where + n_args1 = args_in_match match1 + bad_matches = [m | m <- matches, args_in_match m /= n_args1] + + args_in_match :: LMatch Name -> Int + args_in_match (L _ (Match pats _ _)) = length pats +\end{code} + +\begin{code} +matchCtxt ctxt match = hang (ptext SLIT("In") <+> pprMatchContext ctxt <> colon) + 4 (pprMatch ctxt match) + +doBodyCtxt :: HsStmtContext Name -> LHsExpr Name -> SDoc +doBodyCtxt ctxt body = hang (ptext SLIT("In the result of") <+> pprStmtContext ctxt <> colon) + 4 (ppr body) + +stmtCtxt ctxt stmt = hang (ptext SLIT("In") <+> pprStmtContext ctxt <> colon) + 4 (ppr stmt) +\end{code} diff --git a/compiler/typecheck/TcMatches.lhs-boot b/compiler/typecheck/TcMatches.lhs-boot new file mode 100644 index 0000000000..18a79fa984 --- /dev/null +++ b/compiler/typecheck/TcMatches.lhs-boot @@ -0,0 +1,17 @@ +\begin{code} +module TcMatches where +import HsSyn ( GRHSs, MatchGroup, ExprCoFn ) +import Name ( Name ) +import Var ( Id ) +import TcType ( BoxyRhoType ) +import TcRnTypes( TcM ) + +tcGRHSsPat :: GRHSs Name + -> BoxyRhoType + -> TcM (GRHSs Id) + +tcMatchesFun :: Name + -> MatchGroup Name + -> BoxyRhoType + -> TcM (ExprCoFn, MatchGroup Id) +\end{code} diff --git a/compiler/typecheck/TcPat.lhs b/compiler/typecheck/TcPat.lhs new file mode 100644 index 0000000000..4c56b083bb --- /dev/null +++ b/compiler/typecheck/TcPat.lhs @@ -0,0 +1,816 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[TcPat]{Typechecking patterns} + +\begin{code} +module TcPat ( tcPat, tcPats, tcOverloadedLit, + PatCtxt(..), badFieldCon, polyPatSig ) where + +#include "HsVersions.h" + +import {-# 
SOURCE #-} TcExpr( tcSyntaxOp ) +import HsSyn ( Pat(..), LPat, HsConDetails(..), HsLit(..), HsOverLit(..), HsExpr(..), + LHsBinds, emptyLHsBinds, isEmptyLHsBinds, + collectPatsBinders, nlHsLit ) +import TcHsSyn ( TcId, hsLitType ) +import TcRnMonad +import Inst ( InstOrigin(..), shortCutFracLit, shortCutIntLit, + newDicts, instToId, tcInstStupidTheta, isHsVar + ) +import Id ( Id, idType, mkLocalId ) +import CoreFVs ( idFreeTyVars ) +import Name ( Name, mkSystemVarName ) +import TcSimplify ( tcSimplifyCheck, bindInstsOfLocalFuns ) +import TcEnv ( newLocalName, tcExtendIdEnv1, tcExtendTyVarEnv2, + tcLookupClass, tcLookupDataCon, tcLookupId, refineEnvironment, + tcMetaTy ) +import TcMType ( newFlexiTyVarTy, arityErr, tcInstSkolTyVars, newBoxyTyVar, zonkTcType ) +import TcType ( TcType, TcTyVar, TcSigmaType, TcRhoType, + SkolemInfo(PatSkol), + BoxySigmaType, BoxyRhoType, + pprSkolTvBinding, isRefineableTy, isRigidTy, tcTyVarsOfTypes, mkTyVarTy, lookupTyVar, + emptyTvSubst, substTyVar, substTy, mkTopTvSubst, zipTopTvSubst, zipOpenTvSubst, + mkTyVarTys, mkClassPred, mkTyConApp, isOverloadedTy, + mkFunTy, mkFunTys, exactTyVarsOfTypes, + tidyOpenTypes ) +import VarSet ( elemVarSet, mkVarSet ) +import Kind ( liftedTypeKind, openTypeKind ) +import TcUnify ( boxySplitTyConApp, boxySplitListTy, + unBox, stripBoxyType, zapToMonotype, + boxyMatchTypes, boxyUnify, boxyUnifyList, checkSigTyVarsWrt ) +import TcHsType ( UserTypeCtxt(..), tcPatSig ) +import TysWiredIn ( boolTy, parrTyCon, tupleTyCon ) +import Unify ( MaybeErr(..), gadtRefineTys ) +import Type ( substTys, substTheta ) +import StaticFlags ( opt_IrrefutableTuples ) +import TyCon ( TyCon ) +import DataCon ( DataCon, dataConTyCon, isVanillaDataCon, + dataConFieldLabels, dataConSourceArity, dataConSig ) +import PrelNames ( integralClassName, fromIntegerName, integerTyConName, + fromRationalName, rationalTyConName ) +import BasicTypes ( isBoxed ) +import SrcLoc ( Located(..), SrcSpan, noLoc ) +import ErrUtils ( Message ) +import Util ( takeList, zipEqual ) +import Outputable +import FastString +\end{code} + + +%************************************************************************ +%* * + External interface +%* * +%************************************************************************ + +\begin{code} +tcPats :: PatCtxt + -> [LPat Name] -- Patterns, + -> [BoxySigmaType] -- and their types + -> BoxyRhoType -- Result type, + -> (BoxyRhoType -> TcM a) -- and the checker for the body + -> TcM ([LPat TcId], a) + +-- This is the externally-callable wrapper function +-- Typecheck the patterns, extend the environment to bind the variables, +-- do the thing inside, use any existentially-bound dictionaries to +-- discharge parts of the returning LIE, and deal with pattern type +-- signatures + +-- 1. Initialise the PatState +-- 2. Check the patterns +-- 3. Apply the refinement +-- 4. Check the body +-- 5. 
Check that no existentials escape + +tcPats ctxt pats tys res_ty thing_inside + = do { let init_state = PS { pat_ctxt = ctxt, pat_reft = emptyTvSubst } + + ; (pats', ex_tvs, res) <- tc_lpats init_state pats tys $ \ pstate' -> + refineEnvironment (pat_reft pstate') $ + thing_inside (refineType (pat_reft pstate') res_ty) + + ; tcCheckExistentialPat ctxt pats' ex_tvs tys res_ty + + ; returnM (pats', res) } + + +----------------- +tcPat :: PatCtxt + -> LPat Name -> BoxySigmaType + -> BoxyRhoType -- Result type + -> (BoxyRhoType -> TcM a) -- Checker for body, given its result type + -> TcM (LPat TcId, a) +tcPat ctxt pat pat_ty res_ty thing_inside + = do { ([pat'],thing) <- tcPats ctxt [pat] [pat_ty] res_ty thing_inside + ; return (pat', thing) } + + +----------------- +tcCheckExistentialPat :: PatCtxt + -> [LPat TcId] -- Patterns (just for error message) + -> [TcTyVar] -- Existentially quantified tyvars bound by pattern + -> [BoxySigmaType] -- Types of the patterns + -> BoxyRhoType -- Type of the body of the match + -- Tyvars in either of these must not escape + -> TcM () +-- NB: we *must* pass "pats_tys" not just "body_ty" to tcCheckExistentialPat +-- For example, we must reject this program: +-- data C = forall a. C (a -> Int) +-- f (C g) x = g x +-- Here, result_ty will be simply Int, but expected_ty is (C -> a -> Int). + +tcCheckExistentialPat ctxt pats [] pat_tys body_ty + = return () -- Short cut for case when there are no existentials + +tcCheckExistentialPat (LetPat _) pats ex_tvs pat_tys body_ty + -- Don't know how to deal with pattern-bound existentials yet + = failWithTc (existentialExplode pats) + +tcCheckExistentialPat ctxt pats ex_tvs pat_tys body_ty + = addErrCtxtM (sigPatCtxt (collectPatsBinders pats) ex_tvs pat_tys) $ + checkSigTyVarsWrt (tcTyVarsOfTypes (body_ty:pat_tys)) ex_tvs + +data PatState = PS { + pat_ctxt :: PatCtxt, + pat_reft :: GadtRefinement -- Binds rigid TcTyVars to their refinements + } + +data PatCtxt + = LamPat + | LetPat (Name -> Maybe TcRhoType) -- Used for let(rec) bindings + +patSigCtxt :: PatState -> UserTypeCtxt +patSigCtxt (PS { pat_ctxt = LetPat _ }) = BindPatSigCtxt +patSigCtxt other = LamPatSigCtxt +\end{code} + + + +%************************************************************************ +%* * + Binders +%* * +%************************************************************************ + +\begin{code} +tcPatBndr :: PatState -> Name -> BoxySigmaType -> TcM TcId +tcPatBndr (PS { pat_ctxt = LamPat }) bndr_name pat_ty + = do { pat_ty' <- unBox pat_ty + -- We have an undecorated binder, so we do rule ABS1, + -- by unboxing the boxy type, forcing any un-filled-in + -- boxes to become monotypes + -- NB that pat_ty' can still be a polytype: + -- data T = MkT (forall a. a->a) + -- f t = case t of { MkT g -> ... } + -- Here, the 'g' must get type (forall a. 
a->a) from the + -- MkT context + ; return (mkLocalId bndr_name pat_ty') } + +tcPatBndr (PS { pat_ctxt = LetPat lookup_sig }) bndr_name pat_ty + | Just mono_ty <- lookup_sig bndr_name + = do { mono_name <- newLocalName bndr_name + ; boxyUnify mono_ty pat_ty + ; return (mkLocalId mono_name mono_ty) } + + | otherwise + = do { pat_ty' <- unBox pat_ty + ; mono_name <- newLocalName bndr_name + ; return (mkLocalId mono_name pat_ty') } + + +------------------- +bindInstsOfPatId :: TcId -> TcM a -> TcM (a, LHsBinds TcId) +bindInstsOfPatId id thing_inside + | not (isOverloadedTy (idType id)) + = do { res <- thing_inside; return (res, emptyLHsBinds) } + | otherwise + = do { (res, lie) <- getLIE thing_inside + ; binds <- bindInstsOfLocalFuns lie [id] + ; return (res, binds) } +\end{code} + + +%************************************************************************ +%* * + The main worker functions +%* * +%************************************************************************ + +Note [Nesting] +~~~~~~~~~~~~~~ +tcPat takes a "thing inside" over which the patter scopes. This is partly +so that tcPat can extend the environment for the thing_inside, but also +so that constraints arising in the thing_inside can be discharged by the +pattern. + +This does not work so well for the ErrCtxt carried by the monad: we don't +want the error-context for the pattern to scope over the RHS. +Hence the getErrCtxt/setErrCtxt stuff in tc_lpats. + +\begin{code} +-------------------- +tc_lpats :: PatState + -> [LPat Name] + -> [BoxySigmaType] + -> (PatState -> TcM a) + -> TcM ([LPat TcId], [TcTyVar], a) + +tc_lpats pstate pats pat_tys thing_inside + = do { err_ctxt <- getErrCtxt + ; let loop pstate [] [] + = do { res <- thing_inside pstate + ; return ([], [], res) } + + loop pstate (p:ps) (ty:tys) + = do { (p', p_tvs, (ps', ps_tvs, res)) + <- tc_lpat pstate p ty $ \ pstate' -> + setErrCtxt err_ctxt $ + loop pstate' ps tys + -- setErrCtxt: restore context before doing the next pattern + -- See note [Nesting] above + + ; return (p':ps', p_tvs ++ ps_tvs, res) } + + loop _ _ _ = pprPanic "tc_lpats" (ppr pats $$ ppr pat_tys) + + ; loop pstate pats pat_tys } + +-------------------- +tc_lpat :: PatState + -> LPat Name + -> BoxySigmaType + -> (PatState -> TcM a) + -> TcM (LPat TcId, [TcTyVar], a) +tc_lpat pstate (L span pat) pat_ty thing_inside + = setSrcSpan span $ + maybeAddErrCtxt (patCtxt pat) $ + do { let pat_ty' = refineType (pat_reft pstate) pat_ty + -- Make sure the result type reflects the current refinement + ; (pat', tvs, res) <- tc_pat pstate pat pat_ty' thing_inside + ; return (L span pat', tvs, res) } + + +-------------------- +tc_pat :: PatState + -> Pat Name -> BoxySigmaType -- Fully refined result type + -> (PatState -> TcM a) -- Thing inside + -> TcM (Pat TcId, -- Translated pattern + [TcTyVar], -- Existential binders + a) -- Result of thing inside + +tc_pat pstate (VarPat name) pat_ty thing_inside + = do { id <- tcPatBndr pstate name pat_ty + ; (res, binds) <- bindInstsOfPatId id $ + tcExtendIdEnv1 name id $ + (traceTc (text "binding" <+> ppr name <+> ppr (idType id)) + >> thing_inside pstate) + ; let pat' | isEmptyLHsBinds binds = VarPat id + | otherwise = VarPatOut id binds + ; return (pat', [], res) } + +tc_pat pstate (ParPat pat) pat_ty thing_inside + = do { (pat', tvs, res) <- tc_lpat pstate pat pat_ty thing_inside + ; return (ParPat pat', tvs, res) } + +tc_pat pstate (BangPat pat) pat_ty thing_inside + = do { (pat', tvs, res) <- tc_lpat pstate pat pat_ty thing_inside + ; return (BangPat pat', tvs, res) 
} + +-- There's a wrinkle with irrefuatable patterns, namely that we +-- must not propagate type refinement from them. For example +-- data T a where { T1 :: Int -> T Int; ... } +-- f :: T a -> Int -> a +-- f ~(T1 i) y = y +-- It's obviously not sound to refine a to Int in the right +-- hand side, because the arugment might not match T1 at all! +-- +-- Nor should a lazy pattern bind any existential type variables +-- because they won't be in scope when we do the desugaring +tc_pat pstate lpat@(LazyPat pat) pat_ty thing_inside + = do { (pat', pat_tvs, res) <- tc_lpat pstate pat pat_ty $ \ _ -> + thing_inside pstate + -- Ignore refined pstate', + -- revert to pstate + ; if (null pat_tvs) then return () + else lazyPatErr lpat pat_tvs + ; return (LazyPat pat', [], res) } + +tc_pat pstate (WildPat _) pat_ty thing_inside + = do { pat_ty' <- unBox pat_ty -- Make sure it's filled in with monotypes + ; res <- thing_inside pstate + ; return (WildPat pat_ty', [], res) } + +tc_pat pstate (AsPat (L nm_loc name) pat) pat_ty thing_inside + = do { bndr_id <- setSrcSpan nm_loc (tcPatBndr pstate name pat_ty) + ; (pat', tvs, res) <- tcExtendIdEnv1 name bndr_id $ + tc_lpat pstate pat (idType bndr_id) thing_inside + -- NB: if we do inference on: + -- \ (y@(x::forall a. a->a)) = e + -- we'll fail. The as-pattern infers a monotype for 'y', which then + -- fails to unify with the polymorphic type for 'x'. This could + -- perhaps be fixed, but only with a bit more work. + -- + -- If you fix it, don't forget the bindInstsOfPatIds! + ; return (AsPat (L nm_loc bndr_id) pat', tvs, res) } + +-- Type signatures in patterns +-- See Note [Pattern coercions] below +tc_pat pstate (SigPatIn pat sig_ty) pat_ty thing_inside + = do { (inner_ty, tv_binds) <- tcPatSig (patSigCtxt pstate) sig_ty pat_ty + ; (pat', tvs, res) <- tcExtendTyVarEnv2 tv_binds $ + tc_lpat pstate pat inner_ty thing_inside + ; return (SigPatOut pat' inner_ty, tvs, res) } + +tc_pat pstate pat@(TypePat ty) pat_ty thing_inside + = failWithTc (badTypePat pat) + +------------------------ +-- Lists, tuples, arrays +tc_pat pstate (ListPat pats _) pat_ty thing_inside + = do { elt_ty <- boxySplitListTy pat_ty + ; let elt_tys = takeList pats (repeat elt_ty) + ; (pats', pats_tvs, res) <- tc_lpats pstate pats elt_tys thing_inside + ; return (ListPat pats' elt_ty, pats_tvs, res) } + +tc_pat pstate (PArrPat pats _) pat_ty thing_inside + = do { [elt_ty] <- boxySplitTyConApp parrTyCon pat_ty + ; let elt_tys = takeList pats (repeat elt_ty) + ; (pats', pats_tvs, res) <- tc_lpats pstate pats elt_tys thing_inside + ; ifM (null pats) (zapToMonotype pat_ty) -- c.f. ExplicitPArr in TcExpr + ; return (PArrPat pats' elt_ty, pats_tvs, res) } + +tc_pat pstate (TuplePat pats boxity _) pat_ty thing_inside + = do { arg_tys <- boxySplitTyConApp (tupleTyCon boxity (length pats)) pat_ty + ; (pats', pats_tvs, res) <- tc_lpats pstate pats arg_tys thing_inside + + -- Under flag control turn a pattern (x,y,z) into ~(x,y,z) + -- so that we can experiment with lazy tuple-matching. + -- This is a pretty odd place to make the switch, but + -- it was easy to do. 
+ ; let unmangled_result = TuplePat pats' boxity pat_ty + possibly_mangled_result + | opt_IrrefutableTuples && isBoxed boxity = LazyPat (noLoc unmangled_result) + | otherwise = unmangled_result + + ; ASSERT( length arg_tys == length pats ) -- Syntactically enforced + return (possibly_mangled_result, pats_tvs, res) } + +------------------------ +-- Data constructors +tc_pat pstate pat_in@(ConPatIn (L con_span con_name) arg_pats) pat_ty thing_inside + = do { data_con <- tcLookupDataCon con_name + ; let tycon = dataConTyCon data_con + ; tcConPat pstate con_span data_con tycon pat_ty arg_pats thing_inside } + +------------------------ +-- Literal patterns +tc_pat pstate (LitPat simple_lit) pat_ty thing_inside + = do { boxyUnify (hsLitType simple_lit) pat_ty + ; res <- thing_inside pstate + ; returnM (LitPat simple_lit, [], res) } + +------------------------ +-- Overloaded patterns: n, and n+k +tc_pat pstate pat@(NPat over_lit mb_neg eq _) pat_ty thing_inside + = do { let orig = LiteralOrigin over_lit + ; lit' <- tcOverloadedLit orig over_lit pat_ty + ; eq' <- tcSyntaxOp orig eq (mkFunTys [pat_ty, pat_ty] boolTy) + ; mb_neg' <- case mb_neg of + Nothing -> return Nothing -- Positive literal + Just neg -> -- Negative literal + -- The 'negate' is re-mappable syntax + do { neg' <- tcSyntaxOp orig neg (mkFunTy pat_ty pat_ty) + ; return (Just neg') } + ; res <- thing_inside pstate + ; returnM (NPat lit' mb_neg' eq' pat_ty, [], res) } + +tc_pat pstate pat@(NPlusKPat (L nm_loc name) lit ge minus) pat_ty thing_inside + = do { bndr_id <- setSrcSpan nm_loc (tcPatBndr pstate name pat_ty) + ; let pat_ty' = idType bndr_id + orig = LiteralOrigin lit + ; lit' <- tcOverloadedLit orig lit pat_ty' + + -- The '>=' and '-' parts are re-mappable syntax + ; ge' <- tcSyntaxOp orig ge (mkFunTys [pat_ty', pat_ty'] boolTy) + ; minus' <- tcSyntaxOp orig minus (mkFunTys [pat_ty', pat_ty'] pat_ty') + + -- The Report says that n+k patterns must be in Integral + -- We may not want this when using re-mappable syntax, though (ToDo?) 
+ ; icls <- tcLookupClass integralClassName + ; dicts <- newDicts orig [mkClassPred icls [pat_ty']] + ; extendLIEs dicts + + ; res <- tcExtendIdEnv1 name bndr_id (thing_inside pstate) + ; returnM (NPlusKPat (L nm_loc bndr_id) lit' ge' minus', [], res) } +\end{code} + + +%************************************************************************ +%* * + Most of the work for constructors is here + (the rest is in the ConPatIn case of tc_pat) +%* * +%************************************************************************ + +\begin{code} +tcConPat :: PatState -> SrcSpan -> DataCon -> TyCon + -> BoxySigmaType -- Type of the pattern + -> HsConDetails Name (LPat Name) -> (PatState -> TcM a) + -> TcM (Pat TcId, [TcTyVar], a) +tcConPat pstate con_span data_con tycon pat_ty arg_pats thing_inside + | isVanillaDataCon data_con + = do { ty_args <- boxySplitTyConApp tycon pat_ty + ; let (tvs, _, arg_tys, _, _) = dataConSig data_con + arg_tvs = exactTyVarsOfTypes arg_tys + -- See Note [Silly type synonyms in smart-app] in TcExpr + -- for why we must use exactTyVarsOfTypes + inst_prs = zipEqual "tcConPat" tvs ty_args + subst = mkTopTvSubst inst_prs + arg_tys' = substTys subst arg_tys + unconstrained_ty_args = [ty_arg | (tv,ty_arg) <- inst_prs, + not (tv `elemVarSet` arg_tvs)] + ; mapM unBox unconstrained_ty_args -- Zap these to monotypes + ; tcInstStupidTheta data_con ty_args + ; traceTc (text "tcConPat" <+> vcat [ppr data_con, ppr ty_args, ppr arg_tys']) + ; (arg_pats', tvs, res) <- tcConArgs pstate data_con arg_pats arg_tys' thing_inside + ; return (ConPatOut (L con_span data_con) [] [] emptyLHsBinds + arg_pats' (mkTyConApp tycon ty_args), + tvs, res) } + + | otherwise -- GADT case + = do { ty_args <- boxySplitTyConApp tycon pat_ty + ; span <- getSrcSpanM -- The whole pattern + + -- Instantiate the constructor type variables and result type + ; let (tvs, theta, arg_tys, _, res_tys) = dataConSig data_con + arg_tvs = exactTyVarsOfTypes arg_tys + -- See Note [Silly type synonyms in smart-app] in TcExpr + -- for why we must use exactTyVarsOfTypes + skol_info = PatSkol data_con span + arg_flags = [ tv `elemVarSet` arg_tvs | tv <- tvs ] + ; tvs' <- tcInstSkolTyVars skol_info tvs + ; let res_tys' = substTys (zipTopTvSubst tvs (mkTyVarTys tvs')) res_tys + + -- Do type refinement! 
+ ; traceTc (text "tcGadtPat" <+> vcat [ppr data_con, ppr tvs', ppr res_tys', + text "ty-args:" <+> ppr ty_args ]) + ; refineAlt pstate data_con tvs' arg_flags res_tys' ty_args + $ \ pstate' tv_tys' -> do + + -- ToDo: arg_tys should be boxy, but I don't think theta' should be, + -- or the tv_tys' in the call to tcInstStupidTheta + { let tenv' = zipTopTvSubst tvs tv_tys' + theta' = substTheta tenv' theta + arg_tys' = substTys tenv' arg_tys -- Boxy types + + ; ((arg_pats', inner_tvs, res), lie_req) <- getLIE $ + do { tcInstStupidTheta data_con tv_tys' + -- The stupid-theta mentions the newly-bound tyvars, so + -- it must live inside the getLIE, so that the + -- tcSimplifyCheck will apply the type refinement to it + ; tcConArgs pstate' data_con arg_pats arg_tys' thing_inside } + + ; dicts <- newDicts (SigOrigin skol_info) theta' + ; dict_binds <- tcSimplifyCheck doc tvs' dicts lie_req + + ; return (ConPatOut (L con_span data_con) + tvs' (map instToId dicts) dict_binds + arg_pats' (mkTyConApp tycon ty_args), + tvs' ++ inner_tvs, res) + } } + where + doc = ptext SLIT("existential context for") <+> quotes (ppr data_con) + +tcConArgs :: PatState -> DataCon + -> HsConDetails Name (LPat Name) -> [TcSigmaType] + -> (PatState -> TcM a) + -> TcM (HsConDetails TcId (LPat Id), [TcTyVar], a) + +tcConArgs pstate data_con (PrefixCon arg_pats) arg_tys thing_inside + = do { checkTc (con_arity == no_of_args) -- Check correct arity + (arityErr "Constructor" data_con con_arity no_of_args) + ; (arg_pats', tvs, res) <- tc_lpats pstate arg_pats arg_tys thing_inside + ; return (PrefixCon arg_pats', tvs, res) } + where + con_arity = dataConSourceArity data_con + no_of_args = length arg_pats + +tcConArgs pstate data_con (InfixCon p1 p2) arg_tys thing_inside + = do { checkTc (con_arity == 2) -- Check correct arity + (arityErr "Constructor" data_con con_arity 2) + ; ([p1',p2'], tvs, res) <- tc_lpats pstate [p1,p2] arg_tys thing_inside + ; return (InfixCon p1' p2', tvs, res) } + where + con_arity = dataConSourceArity data_con + +tcConArgs pstate data_con (RecCon rpats) arg_tys thing_inside + = do { (rpats', tvs, res) <- tc_fields pstate rpats thing_inside + ; return (RecCon rpats', tvs, res) } + where + tc_fields :: PatState -> [(Located Name, LPat Name)] + -> (PatState -> TcM a) + -> TcM ([(Located TcId, LPat TcId)], [TcTyVar], a) + tc_fields pstate [] thing_inside + = do { res <- thing_inside pstate + ; return ([], [], res) } + + tc_fields pstate (rpat : rpats) thing_inside + = do { (rpat', tvs1, (rpats', tvs2, res)) + <- tc_field pstate rpat $ \ pstate' -> + tc_fields pstate' rpats thing_inside + ; return (rpat':rpats', tvs1 ++ tvs2, res) } + + tc_field pstate (field_lbl, pat) thing_inside + = do { (sel_id, pat_ty) <- wrapLocFstM find_field_ty field_lbl + ; (pat', tvs, res) <- tc_lpat pstate pat pat_ty thing_inside + ; return ((sel_id, pat'), tvs, res) } + + find_field_ty field_lbl + = case [ty | (f,ty) <- field_tys, f == field_lbl] of + + -- No matching field; chances are this field label comes from some + -- other record type (or maybe none). As well as reporting an + -- error we still want to typecheck the pattern, principally to + -- make sure that all the variables it binds are put into the + -- environment, else the type checker crashes later: + -- f (R { foo = (a,b) }) = a+b + -- If foo isn't one of R's fields, we don't want to crash when + -- typechecking the "a+b". 
+ [] -> do { addErrTc (badFieldCon data_con field_lbl) + ; bogus_ty <- newFlexiTyVarTy liftedTypeKind + ; return (error "Bogus selector Id", bogus_ty) } + + -- The normal case, when the field comes from the right constructor + (pat_ty : extras) -> + ASSERT( null extras ) + do { sel_id <- tcLookupId field_lbl + ; return (sel_id, pat_ty) } + + field_tys = zip (dataConFieldLabels data_con) arg_tys + -- Don't use zipEqual! If the constructor isn't really a record, then + -- dataConFieldLabels will be empty (and each field in the pattern + -- will generate an error below). +\end{code} + + +%************************************************************************ +%* * + Type refinement +%* * +%************************************************************************ + +\begin{code} +refineAlt :: PatState + -> DataCon -- For tracing only + -> [TcTyVar] -- Type variables from pattern + -> [Bool] -- Flags indicating which type variables occur + -- in the type of at least one argument + -> [TcType] -- Result types from the pattern + -> [BoxySigmaType] -- Result types from the scrutinee (context) + -> (PatState -> [BoxySigmaType] -> TcM a) + -- Possibly-refined existentials + -> TcM a +refineAlt pstate con pat_tvs arg_flags pat_res_tys ctxt_res_tys thing_inside + | not (all isRigidTy ctxt_res_tys) + -- The context is not a rigid type, so we do no type refinement here. + = do { let arg_tvs = mkVarSet [ tv | (tv, True) <- pat_tvs `zip` arg_flags] + subst = boxyMatchTypes arg_tvs pat_res_tys ctxt_res_tys + + res_tvs = tcTyVarsOfTypes pat_res_tys + -- The tvs are (already) all fresh skolems. We need a + -- fresh skolem for each type variable (to bind in the pattern) + -- even if it's refined away by the type refinement + find_inst tv + | not (tv `elemVarSet` res_tvs) = return (mkTyVarTy tv) + | Just boxy_ty <- lookupTyVar subst tv = return boxy_ty + | otherwise = do { tv <- newBoxyTyVar openTypeKind + ; return (mkTyVarTy tv) } + ; pat_tys' <- mapM find_inst pat_tvs + + -- Do the thing inside + ; res <- thing_inside pstate pat_tys' + + -- Unbox the types that have been filled in by the thing_inside + -- I.e. the ones whose type variables are mentioned in at least one arg + ; let strip ty in_arg_tv | in_arg_tv = stripBoxyType ty + | otherwise = return ty + ; pat_tys'' <- zipWithM strip pat_tys' arg_flags + ; let subst = zipOpenTvSubst pat_tvs pat_tys'' + ; boxyUnifyList (substTys subst pat_res_tys) ctxt_res_tys + + ; return res } -- All boxes now filled + + | otherwise -- The context is rigid, so we can do type refinement + = case gadtRefineTys (pat_reft pstate) con pat_tvs pat_res_tys ctxt_res_tys of + Failed msg -> failWithTc (inaccessibleAlt msg) + Succeeded (new_subst, all_bound_here) + | all_bound_here -- All the new bindings are for pat_tvs, so no need + -- to refine the environment or pstate + -> do { traceTc trace_msg + ; thing_inside pstate pat_tvs' } + | otherwise -- New bindings affect the context, so pass down pstate'. 
+ -- DO NOT refine the envt, because we might be inside a + -- lazy pattern + -> do { traceTc trace_msg + ; thing_inside pstate' pat_tvs' } + where + pat_tvs' = map (substTyVar new_subst) pat_tvs + pstate' = pstate { pat_reft = new_subst } + trace_msg = text "refineTypes:match" <+> ppr con <+> ppr new_subst + +refineType :: GadtRefinement -> BoxyRhoType -> BoxyRhoType +-- Refine the type if it is rigid +refineType reft ty + | isRefineableTy ty = substTy reft ty + | otherwise = ty +\end{code} + + +%************************************************************************ +%* * + Overloaded literals +%* * +%************************************************************************ + +In tcOverloadedLit we convert directly to an Int or Integer if we +know that's what we want. This may save some time, by not +temporarily generating overloaded literals, but it won't catch all +cases (the rest are caught in lookupInst). + +\begin{code} +tcOverloadedLit :: InstOrigin + -> HsOverLit Name + -> BoxyRhoType + -> TcM (HsOverLit TcId) +tcOverloadedLit orig lit@(HsIntegral i fi) res_ty + | not (fi `isHsVar` fromIntegerName) -- Do not generate a LitInst for rebindable syntax. + -- Reason: If we do, tcSimplify will call lookupInst, which + -- will call tcSyntaxName, which does unification, + -- which tcSimplify doesn't like + -- ToDo: noLoc sadness + = do { integer_ty <- tcMetaTy integerTyConName + ; fi' <- tcSyntaxOp orig fi (mkFunTy integer_ty res_ty) + ; return (HsIntegral i (HsApp (noLoc fi') (nlHsLit (HsInteger i integer_ty)))) } + + | Just expr <- shortCutIntLit i res_ty + = return (HsIntegral i expr) + + | otherwise + = do { expr <- newLitInst orig lit res_ty + ; return (HsIntegral i expr) } + +tcOverloadedLit orig lit@(HsFractional r fr) res_ty + | not (fr `isHsVar` fromRationalName) -- c.f. HsIntegral case + = do { rat_ty <- tcMetaTy rationalTyConName + ; fr' <- tcSyntaxOp orig fr (mkFunTy rat_ty res_ty) + -- Overloaded literals must have liftedTypeKind, because + -- we're instantiating an overloaded function here, + -- whereas res_ty might be openTypeKind. This was a bug in 6.2.2 + -- However this'll be picked up by tcSyntaxOp if necessary + ; return (HsFractional r (HsApp (noLoc fr') (nlHsLit (HsRat r rat_ty)))) } + + | Just expr <- shortCutFracLit r res_ty + = return (HsFractional r expr) + + | otherwise + = do { expr <- newLitInst orig lit res_ty + ; return (HsFractional r expr) } + +newLitInst :: InstOrigin -> HsOverLit Name -> BoxyRhoType -> TcM (HsExpr TcId) +newLitInst orig lit res_ty -- Make a LitInst + = do { loc <- getInstLoc orig + ; res_tau <- zapToMonotype res_ty + ; new_uniq <- newUnique + ; let + lit_nm = mkSystemVarName new_uniq FSLIT("lit") + lit_inst = LitInst lit_nm lit res_tau loc + ; extendLIE lit_inst + ; return (HsVar (instToId lit_inst)) } +\end{code} + + +%************************************************************************ +%* * + Note [Pattern coercions] +%* * +%************************************************************************ + +In principle, these program would be reasonable: + + f :: (forall a. a->a) -> Int + f (x :: Int->Int) = x 3 + + g :: (forall a. [a]) -> Bool + g [] = True + +In both cases, the function type signature restricts what arguments can be passed +in a call (to polymorphic ones). The pattern type signature then instantiates this +type. For example, in the first case, (forall a. a->a) <= Int -> Int, and we +generate the translated term + f = \x' :: (forall a. a->a). 
let x = x' Int in x 3 + +From a type-system point of view, this is perfectly fine, but it's *very* seldom useful. +And it requires a significant amount of code to implement, becuase we need to decorate +the translated pattern with coercion functions (generated from the subsumption check +by tcSub). + +So for now I'm just insisting on type *equality* in patterns. No subsumption. + +Old notes about desugaring, at a time when pattern coercions were handled: + +A SigPat is a type coercion and must be handled one at at time. We can't +combine them unless the type of the pattern inside is identical, and we don't +bother to check for that. For example: + + data T = T1 Int | T2 Bool + f :: (forall a. a -> a) -> T -> t + f (g::Int->Int) (T1 i) = T1 (g i) + f (g::Bool->Bool) (T2 b) = T2 (g b) + +We desugar this as follows: + + f = \ g::(forall a. a->a) t::T -> + let gi = g Int + in case t of { T1 i -> T1 (gi i) + other -> + let gb = g Bool + in case t of { T2 b -> T2 (gb b) + other -> fail }} + +Note that we do not treat the first column of patterns as a +column of variables, because the coerced variables (gi, gb) +would be of different types. So we get rather grotty code. +But I don't think this is a common case, and if it was we could +doubtless improve it. + +Meanwhile, the strategy is: + * treat each SigPat coercion (always non-identity coercions) + as a separate block + * deal with the stuff inside, and then wrap a binding round + the result to bind the new variable (gi, gb, etc) + + +%************************************************************************ +%* * +\subsection{Errors and contexts} +%* * +%************************************************************************ + +\begin{code} +patCtxt :: Pat Name -> Maybe Message -- Not all patterns are worth pushing a context +patCtxt (VarPat _) = Nothing +patCtxt (ParPat _) = Nothing +patCtxt (AsPat _ _) = Nothing +patCtxt pat = Just (hang (ptext SLIT("In the pattern:")) + 4 (ppr pat)) + +----------------------------------------------- + +existentialExplode pats + = hang (vcat [text "My brain just exploded.", + text "I can't handle pattern bindings for existentially-quantified constructors.", + text "In the binding group for"]) + 4 (vcat (map ppr pats)) + +sigPatCtxt bound_ids bound_tvs tys tidy_env + = -- tys is (body_ty : pat_tys) + mapM zonkTcType tys `thenM` \ tys' -> + let + (env1, tidy_tys) = tidyOpenTypes tidy_env (map idType show_ids) + (_env2, tidy_body_ty : tidy_pat_tys) = tidyOpenTypes env1 tys' + in + returnM (env1, + sep [ptext SLIT("When checking an existential match that binds"), + nest 4 (vcat (zipWith ppr_id show_ids tidy_tys)), + ptext SLIT("The pattern(s) have type(s):") <+> vcat (map ppr tidy_pat_tys), + ptext SLIT("The body has type:") <+> ppr tidy_body_ty + ]) + where + show_ids = filter is_interesting bound_ids + is_interesting id = any (`elemVarSet` idFreeTyVars id) bound_tvs + + ppr_id id ty = ppr id <+> dcolon <+> ppr ty + -- Don't zonk the types so we get the separate, un-unified versions + +badFieldCon :: DataCon -> Name -> SDoc +badFieldCon con field + = hsep [ptext SLIT("Constructor") <+> quotes (ppr con), + ptext SLIT("does not have field"), quotes (ppr field)] + +polyPatSig :: TcType -> SDoc +polyPatSig sig_ty + = hang (ptext SLIT("Illegal polymorphic type signature in pattern:")) + 4 (ppr sig_ty) + +badTypePat pat = ptext SLIT("Illegal type pattern") <+> ppr pat + +lazyPatErr pat tvs + = failWithTc $ + hang (ptext SLIT("A lazy (~) pattern connot bind existential type variables")) + 2 (vcat (map pprSkolTvBinding 
tvs)) + +inaccessibleAlt msg + = hang (ptext SLIT("Inaccessible case alternative:")) 2 msg +\end{code} diff --git a/compiler/typecheck/TcRnDriver.lhs b/compiler/typecheck/TcRnDriver.lhs new file mode 100644 index 0000000000..5f4b487103 --- /dev/null +++ b/compiler/typecheck/TcRnDriver.lhs @@ -0,0 +1,1357 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[TcModule]{Typechecking a whole module} + +\begin{code} +module TcRnDriver ( +#ifdef GHCI + tcRnStmt, tcRnExpr, tcRnType, + tcRnLookupRdrName, + tcRnLookupName, + tcRnGetInfo, + getModuleExports, +#endif + tcRnModule, + tcTopSrcDecls, + tcRnExtCore + ) where + +#include "HsVersions.h" + +import IO +#ifdef GHCI +import {-# SOURCE #-} TcSplice ( tcSpliceDecls ) +#endif + +import DynFlags ( DynFlag(..), DynFlags(..), dopt, GhcMode(..) ) +import StaticFlags ( opt_PprStyle_Debug ) +import Packages ( checkForPackageConflicts, mkHomeModules ) +import HsSyn ( HsModule(..), HsExtCore(..), HsGroup(..), LHsDecl, + SpliceDecl(..), HsBind(..), LHsBinds, + emptyRdrGroup, emptyRnGroup, appendGroups, plusHsValBinds, + nlHsApp, nlHsVar, pprLHsBinds ) +import RdrHsSyn ( findSplice ) + +import PrelNames ( runMainIOName, rootMainKey, rOOT_MAIN, mAIN, + main_RDR_Unqual ) +import RdrName ( RdrName, mkRdrUnqual, emptyGlobalRdrEnv ) +import TcHsSyn ( zonkTopDecls ) +import TcExpr ( tcInferRho ) +import TcRnMonad +import TcType ( tidyTopType, tcEqType, mkTyVarTys, substTyWith ) +import Inst ( showLIE ) +import InstEnv ( extendInstEnvList, Instance, pprInstances, instanceDFunId ) +import TcBinds ( tcTopBinds, tcHsBootSigs ) +import TcDefaults ( tcDefaults ) +import TcEnv ( tcExtendGlobalValEnv, iDFunId ) +import TcRules ( tcRules ) +import TcForeign ( tcForeignImports, tcForeignExports ) +import TcInstDcls ( tcInstDecls1, tcInstDecls2 ) +import TcIface ( tcExtCoreBindings, tcHiBootIface ) +import TcSimplify ( tcSimplifyTop ) +import TcTyClsDecls ( tcTyAndClassDecls ) +import LoadIface ( loadOrphanModules ) +import RnNames ( importsFromLocalDecls, rnImports, rnExports, + mkRdrEnvAndImports, mkExportNameSet, + reportUnusedNames, reportDeprecations ) +import RnEnv ( lookupSrcOcc_maybe ) +import RnSource ( rnSrcDecls, rnTyClDecls, checkModDeprec ) +import PprCore ( pprRules, pprCoreBindings ) +import CoreSyn ( CoreRule, bindersOfBinds ) +import DataCon ( dataConWrapId ) +import ErrUtils ( Messages, mkDumpDoc, showPass ) +import Id ( Id, mkExportedLocalId, isLocalId, idName, idType ) +import Var ( Var ) +import Module ( Module, ModuleEnv, moduleEnvElts, elemModuleEnv ) +import OccName ( mkVarOccFS ) +import Name ( Name, NamedThing(..), isExternalName, getSrcLoc, isWiredInName, + mkExternalName, isInternalName ) +import NameSet +import TyCon ( tyConHasGenerics, isSynTyCon, synTyConDefn, tyConKind ) +import SrcLoc ( srcLocSpan, Located(..), noLoc ) +import DriverPhases ( HscSource(..), isHsBoot ) +import HscTypes ( ModGuts(..), ModDetails(..), emptyModDetails, + HscEnv(..), ExternalPackageState(..), + IsBootInterface, noDependencies, + Deprecs( NoDeprecs ), plusDeprecs, + ForeignStubs(NoStubs), TyThing(..), + TypeEnv, lookupTypeEnv, hptInstances, + extendTypeEnvWithIds, typeEnvIds, typeEnvTyCons, typeEnvElts, + emptyFixityEnv + ) +import Outputable + +#ifdef GHCI +import HsSyn ( HsStmtContext(..), Stmt(..), HsExpr(..), + HsLocalBinds(..), HsValBinds(..), + LStmt, LHsExpr, LHsType, mkMatch, emptyLocalBinds, + collectLStmtsBinders, collectLStmtBinders, nlVarPat, + mkFunBind, placeHolderType, noSyntaxExpr ) +import RdrName ( GlobalRdrElt(..), 
globalRdrEnvElts, + unQualOK, lookupLocalRdrEnv, extendLocalRdrEnv ) +import RnSource ( addTcgDUs ) +import TcHsSyn ( mkHsDictLet, zonkTopLExpr, zonkTopBndrs ) +import TcHsType ( kcHsType ) +import TcMType ( zonkTcType, zonkQuantifiedTyVar ) +import TcMatches ( tcStmts, tcDoStmt ) +import TcSimplify ( tcSimplifyInteractive, tcSimplifyInfer ) +import TcType ( Type, mkForAllTys, mkFunTys, mkTyConApp, tyVarsOfType, isTauTy, + isUnLiftedType, tyClsNamesOfDFunHead, tyClsNamesOfType, isUnitTy ) +import TcEnv ( tcLookupTyCon, tcLookupId, tcLookupGlobal ) +import RnTypes ( rnLHsType ) +import Inst ( tcGetInstEnvs ) +import InstEnv ( classInstances, instEnvElts ) +import RnExpr ( rnStmts, rnLExpr ) +import LoadIface ( loadSrcInterface, loadSysInterface ) +import IfaceEnv ( ifaceExportNames ) +import Module ( moduleSetElts, mkModuleSet ) +import RnEnv ( lookupOccRn, dataTcOccs, lookupFixityRn ) +import Id ( setIdType ) +import MkId ( unsafeCoerceId ) +import TyCon ( tyConName ) +import TysWiredIn ( mkListTy, unitTy ) +import IdInfo ( GlobalIdDetails(..) ) +import Kind ( Kind ) +import Var ( globaliseId ) +import Name ( nameOccName, nameModule, isBuiltInSyntax ) +import OccName ( isTcOcc ) +import NameEnv ( delListFromNameEnv ) +import PrelNames ( iNTERACTIVE, ioTyConName, printName, itName, + bindIOName, thenIOName, returnIOName ) +import HscTypes ( InteractiveContext(..), + ModIface(..), icPrintUnqual, + Dependencies(..) ) +import BasicTypes ( Fixity, RecFlag(..) ) +import SrcLoc ( unLoc ) +#endif + +import FastString ( mkFastString ) +import Maybes ( MaybeErr(..) ) +import Util ( sortLe ) +import Bag ( unionBags, snocBag, emptyBag, unitBag, unionManyBags ) + +import Maybe ( isJust ) +\end{code} + + + +%************************************************************************ +%* * + Typecheck and rename a module +%* * +%************************************************************************ + + +\begin{code} +tcRnModule :: HscEnv + -> HscSource + -> Bool -- True <=> save renamed syntax + -> Located (HsModule RdrName) + -> IO (Messages, Maybe TcGblEnv) + +tcRnModule hsc_env hsc_src save_rn_syntax + (L loc (HsModule maybe_mod export_ies + import_decls local_decls mod_deprec)) + = do { showPass (hsc_dflags hsc_env) "Renamer/typechecker" ; + + let { this_mod = case maybe_mod of + Nothing -> mAIN -- 'module M where' is omitted + Just (L _ mod) -> mod } ; -- The normal case + + initTc hsc_env hsc_src this_mod $ + setSrcSpan loc $ + do { + -- Deal with imports; + rn_imports <- rnImports import_decls ; + (rdr_env, imports) <- mkRdrEnvAndImports rn_imports ; + + let { dep_mods :: ModuleEnv (Module, IsBootInterface) + ; dep_mods = imp_dep_mods imports + + -- We want instance declarations from all home-package + -- modules below this one, including boot modules, except + -- ourselves. 
The 'except ourselves' is so that we don't + -- get the instances from this module's hs-boot file + ; want_instances :: Module -> Bool + ; want_instances mod = mod `elemModuleEnv` dep_mods + && mod /= this_mod + ; home_insts = hptInstances hsc_env want_instances + } ; + + -- Record boot-file info in the EPS, so that it's + -- visible to loadHiBootInterface in tcRnSrcDecls, + -- and any other incrementally-performed imports + updateEps_ (\eps -> eps { eps_is_boot = dep_mods }) ; + + checkConflicts imports this_mod $ do { + + -- Update the gbl env + updGblEnv ( \ gbl -> + gbl { tcg_rdr_env = rdr_env, + tcg_inst_env = extendInstEnvList (tcg_inst_env gbl) home_insts, + tcg_imports = tcg_imports gbl `plusImportAvails` imports, + tcg_rn_imports = if save_rn_syntax then + Just rn_imports + else + Nothing, + tcg_rn_decls = if save_rn_syntax then + Just emptyRnGroup + else + Nothing }) + $ do { + + traceRn (text "rn1" <+> ppr (imp_dep_mods imports)) ; + -- Fail if there are any errors so far + -- The error printing (if needed) takes advantage + -- of the tcg_env we have now set + failIfErrsM ; + + -- Load any orphan-module interfaces, so that + -- their rules and instance decls will be found + loadOrphanModules (imp_orphs imports) ; + + traceRn (text "rn1a") ; + -- Rename and type check the declarations + tcg_env <- if isHsBoot hsc_src then + tcRnHsBootDecls local_decls + else + tcRnSrcDecls local_decls ; + setGblEnv tcg_env $ do { + + traceRn (text "rn3") ; + + -- Report the use of any deprecated things + -- We do this before processsing the export list so + -- that we don't bleat about re-exporting a deprecated + -- thing (especially via 'module Foo' export item) + -- Only uses in the body of the module are complained about + reportDeprecations tcg_env ; + + -- Process the export list + rn_exports <- rnExports export_ies ; + let { liftM2' fn a b = do a' <- a; b' <- b; return (fn a' b') } ; + exports <- mkExportNameSet (isJust maybe_mod) (liftM2' (,) rn_exports export_ies) ; + + -- Check whether the entire module is deprecated + -- This happens only once per module + let { mod_deprecs = checkModDeprec mod_deprec } ; + + -- Add exports and deprecations to envt + let { final_env = tcg_env { tcg_exports = exports, + tcg_rn_exports = if save_rn_syntax then + rn_exports + else Nothing, + tcg_dus = tcg_dus tcg_env `plusDU` usesOnly exports, + tcg_deprecs = tcg_deprecs tcg_env `plusDeprecs` + mod_deprecs } + -- A module deprecation over-rides the earlier ones + } ; + + -- Report unused names + reportUnusedNames export_ies final_env ; + + -- Dump output and return + tcDump final_env ; + return final_env + }}}}} + + +-- The program is not allowed to contain two modules with the same +-- name, and we check for that here. It could happen if the home package +-- contains a module that is also present in an external package, for example. +checkConflicts imports this_mod and_then = do + dflags <- getDOpts + let + dep_mods = this_mod : map fst (moduleEnvElts (imp_dep_mods imports)) + -- don't forget to include the current module! 
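+	-- For example (hypothetical case, not from the original comment):
+	-- a home package that defines its own module called Data.Maybe
+	-- clashes with the Data.Maybe exposed by the base package, and the
+	-- check below rejects it.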
+ + mb_dep_pkgs = checkForPackageConflicts + dflags dep_mods (imp_dep_pkgs imports) + -- + case mb_dep_pkgs of + Failed msg -> + do addErr msg; failM + Succeeded _ -> + updGblEnv (\gbl -> gbl{ tcg_home_mods = mkHomeModules dep_mods }) + and_then +\end{code} + + +%************************************************************************ +%* * + Type-checking external-core modules +%* * +%************************************************************************ + +\begin{code} +tcRnExtCore :: HscEnv + -> HsExtCore RdrName + -> IO (Messages, Maybe ModGuts) + -- Nothing => some error occurred + +tcRnExtCore hsc_env (HsExtCore this_mod decls src_binds) + -- The decls are IfaceDecls; all names are original names + = do { showPass (hsc_dflags hsc_env) "Renamer/typechecker" ; + + initTc hsc_env ExtCoreFile this_mod $ do { + + let { ldecls = map noLoc decls } ; + + -- Deal with the type declarations; first bring their stuff + -- into scope, then rname them, then type check them + tcg_env <- importsFromLocalDecls (mkFakeGroup ldecls) ; + + setGblEnv tcg_env $ do { + + rn_decls <- rnTyClDecls ldecls ; + failIfErrsM ; + + -- Dump trace of renaming part + rnDump (ppr rn_decls) ; + + -- Typecheck them all together so that + -- any mutually recursive types are done right + tcg_env <- checkNoErrs (tcTyAndClassDecls emptyModDetails rn_decls) ; + -- Make the new type env available to stuff slurped from interface files + + setGblEnv tcg_env $ do { + + -- Now the core bindings + core_binds <- initIfaceExtCore (tcExtCoreBindings src_binds) ; + + -- Wrap up + let { + bndrs = bindersOfBinds core_binds ; + my_exports = mkNameSet (map idName bndrs) ; + -- ToDo: export the data types also? + + final_type_env = extendTypeEnvWithIds (tcg_type_env tcg_env) bndrs ; + + mod_guts = ModGuts { mg_module = this_mod, + mg_boot = False, + mg_usages = [], -- ToDo: compute usage + mg_dir_imps = [], -- ?? + mg_deps = noDependencies, -- ?? + mg_home_mods = mkHomeModules [], -- ?? wrong!! + mg_exports = my_exports, + mg_types = final_type_env, + mg_insts = tcg_insts tcg_env, + mg_rules = [], + mg_binds = core_binds, + + -- Stubs + mg_rdr_env = emptyGlobalRdrEnv, + mg_fix_env = emptyFixityEnv, + mg_deprecs = NoDeprecs, + mg_foreign = NoStubs + } } ; + + tcCoreDump mod_guts ; + + return mod_guts + }}}} + +mkFakeGroup decls -- Rather clumsy; lots of unused fields + = emptyRdrGroup { hs_tyclds = decls } +\end{code} + + +%************************************************************************ +%* * + Type-checking the top level of a module +%* * +%************************************************************************ + +\begin{code} +tcRnSrcDecls :: [LHsDecl RdrName] -> TcM TcGblEnv + -- Returns the variables free in the decls + -- Reason: solely to report unused imports and bindings +tcRnSrcDecls decls + = do { -- Load the hi-boot interface for this module, if any + -- We do this now so that the boot_names can be passed + -- to tcTyAndClassDecls, because the boot_names are + -- automatically considered to be loop breakers + mod <- getModule ; + boot_iface <- tcHiBootIface mod ; + + -- Do all the declarations + (tc_envs, lie) <- getLIE (tc_rn_src_decls boot_iface decls) ; + + -- tcSimplifyTop deals with constant or ambiguous InstIds. + -- How could there be ambiguous ones? They can only arise if a + -- top-level decl falls under the monomorphism + -- restriction, and no subsequent decl instantiates its + -- type. (Usually, ambiguous type variables are resolved + -- during the generalisation step.) 
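+	-- Illustration (hypothetical example, not from the original comment):
+	-- a top-level binding such as
+	--	twoPi = 2 * pi
+	-- falls under the monomorphism restriction, so its Floating constraint
+	-- cannot be generalised; it reaches tcSimplifyTop as one of these
+	-- constant/ambiguous constraints and is resolved there by defaulting.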
+ traceTc (text "Tc8") ; + inst_binds <- setEnvs tc_envs (tcSimplifyTop lie) ; + -- Setting the global env exposes the instances to tcSimplifyTop + -- Setting the local env exposes the local Ids to tcSimplifyTop, + -- so that we get better error messages (monomorphism restriction) + + -- Backsubstitution. This must be done last. + -- Even tcSimplifyTop may do some unification. + traceTc (text "Tc9") ; + let { (tcg_env, _) = tc_envs ; + TcGblEnv { tcg_type_env = type_env, tcg_binds = binds, + tcg_rules = rules, tcg_fords = fords } = tcg_env } ; + + tcDump tcg_env ; + (bind_ids, binds', fords', rules') <- zonkTopDecls (binds `unionBags` inst_binds) + rules fords ; + + let { final_type_env = extendTypeEnvWithIds type_env bind_ids + ; tcg_env' = tcg_env { tcg_type_env = final_type_env, + tcg_binds = binds', + tcg_rules = rules', + tcg_fords = fords' } } ; + + -- Make the new type env available to stuff slurped from interface files + writeMutVar (tcg_type_env_var tcg_env) final_type_env ; + + -- Compare the hi-boot iface (if any) with the real thing + dfun_binds <- checkHiBootIface tcg_env' boot_iface ; + + return (tcg_env' { tcg_binds = tcg_binds tcg_env' `unionBags` dfun_binds }) + } + +tc_rn_src_decls :: ModDetails -> [LHsDecl RdrName] -> TcM (TcGblEnv, TcLclEnv) +-- Loops around dealing with each top level inter-splice group +-- in turn, until it's dealt with the entire module +tc_rn_src_decls boot_details ds + = do { let { (first_group, group_tail) = findSplice ds } ; + -- If ds is [] we get ([], Nothing) + + -- Type check the decls up to, but not including, the first splice + tc_envs@(tcg_env,tcl_env) <- tcRnGroup boot_details first_group ; + + -- Bale out if errors; for example, error recovery when checking + -- the RHS of 'main' can mean that 'main' is not in the envt for + -- the subsequent checkMain test + failIfErrsM ; + + setEnvs tc_envs $ + + -- If there is no splice, we're nearly done + case group_tail of { + Nothing -> do { -- Last thing: check for `main' + tcg_env <- checkMain ; + return (tcg_env, tcl_env) + } ; + + -- If there's a splice, we must carry on + Just (SpliceDecl splice_expr, rest_ds) -> do { +#ifndef GHCI + failWithTc (text "Can't do a top-level splice; need a bootstrapped compiler") +#else + + -- Rename the splice expression, and get its supporting decls + (rn_splice_expr, splice_fvs) <- rnLExpr splice_expr ; + failIfErrsM ; -- Don't typecheck if renaming failed + + -- Execute the splice + spliced_decls <- tcSpliceDecls rn_splice_expr ; + + -- Glue them on the front of the remaining decls and loop + setGblEnv (tcg_env `addTcgDUs` usesOnly splice_fvs) $ + tc_rn_src_decls boot_details (spliced_decls ++ rest_ds) +#endif /* GHCI */ + }}} +\end{code} + +%************************************************************************ +%* * + Compiling hs-boot source files, and + comparing the hi-boot interface with the real thing +%* * +%************************************************************************ + +\begin{code} +tcRnHsBootDecls :: [LHsDecl RdrName] -> TcM TcGblEnv +tcRnHsBootDecls decls + = do { let { (first_group, group_tail) = findSplice decls } + + ; case group_tail of + Just stuff -> spliceInHsBootErr stuff + Nothing -> return () + + -- Rename the declarations + ; (tcg_env, rn_group) <- rnTopSrcDecls first_group + ; setGblEnv tcg_env $ do { + + -- Todo: check no foreign decls, no rules, no default decls + + -- Typecheck type/class decls + ; traceTc (text "Tc2") + ; let tycl_decls = hs_tyclds rn_group + ; tcg_env <- checkNoErrs (tcTyAndClassDecls 
emptyModDetails tycl_decls) + ; setGblEnv tcg_env $ do { + + -- Typecheck instance decls + ; traceTc (text "Tc3") + ; (tcg_env, inst_infos, _binds) <- tcInstDecls1 tycl_decls (hs_instds rn_group) + ; setGblEnv tcg_env $ do { + + -- Typecheck value declarations + ; traceTc (text "Tc5") + ; val_ids <- tcHsBootSigs (hs_valds rn_group) + + -- Wrap up + -- No simplification or zonking to do + ; traceTc (text "Tc7a") + ; gbl_env <- getGblEnv + + -- Make the final type-env + -- Include the dfun_ids so that their type sigs get + -- are written into the interface file + ; let { type_env0 = tcg_type_env gbl_env + ; type_env1 = extendTypeEnvWithIds type_env0 val_ids + ; type_env2 = extendTypeEnvWithIds type_env1 dfun_ids + ; dfun_ids = map iDFunId inst_infos } + ; return (gbl_env { tcg_type_env = type_env2 }) + }}}} + +spliceInHsBootErr (SpliceDecl (L loc _), _) + = addErrAt loc (ptext SLIT("Splices are not allowed in hs-boot files")) +\end{code} + +Once we've typechecked the body of the module, we want to compare what +we've found (gathered in a TypeEnv) with the hi-boot details (if any). + +\begin{code} +checkHiBootIface :: TcGblEnv -> ModDetails -> TcM (LHsBinds Id) +-- Compare the hi-boot file for this module (if there is one) +-- with the type environment we've just come up with +-- In the common case where there is no hi-boot file, the list +-- of boot_names is empty. +-- +-- The bindings we return give bindings for the dfuns defined in the +-- hs-boot file, such as $fbEqT = $fEqT + +checkHiBootIface + (TcGblEnv { tcg_insts = local_insts, tcg_type_env = local_type_env }) + (ModDetails { md_insts = boot_insts, md_types = boot_type_env }) + = do { mapM_ check_one (typeEnvElts boot_type_env) + ; dfun_binds <- mapM check_inst boot_insts + ; return (unionManyBags dfun_binds) } + where + check_one boot_thing + | no_check name + = return () + | otherwise + = case lookupTypeEnv local_type_env name of + Nothing -> addErrTc (missingBootThing boot_thing) + Just real_thing -> check_thing boot_thing real_thing + where + name = getName boot_thing + + no_check name = isWiredInName name -- No checking for wired-in names. 
In particular,
+						-- 'error' is handled by a rather gross hack
+						-- (see comments in GHC.Err.hs-boot)
+		  || name `elem` dfun_names
+    dfun_names = map getName boot_insts
+
+    check_inst boot_inst
+      = case [dfun | inst <- local_insts,
+		     let dfun = instanceDFunId inst,
+		     idType dfun `tcEqType` boot_inst_ty ] of
+	  [] -> do { addErrTc (instMisMatch boot_inst); return emptyBag }
+	  (dfun:_) -> return (unitBag $ noLoc $ VarBind local_boot_dfun (nlHsVar dfun))
+      where
+	  boot_dfun = instanceDFunId boot_inst
+	  boot_inst_ty = idType boot_dfun
+	  local_boot_dfun = mkExportedLocalId (idName boot_dfun) boot_inst_ty
+
+----------------
+check_thing (ATyCon boot_tc) (ATyCon real_tc)
+  | isSynTyCon boot_tc && isSynTyCon real_tc,
+    defn1 `tcEqType` substTyWith tvs2 (mkTyVarTys tvs1) defn2
+  = return ()
+
+  | tyConKind boot_tc == tyConKind real_tc
+  = return ()
+  where
+    (tvs1, defn1) = synTyConDefn boot_tc
+    (tvs2, defn2) = synTyConDefn real_tc
+
+check_thing (AnId boot_id) (AnId real_id)
+  | idType boot_id `tcEqType` idType real_id
+  = return ()
+
+check_thing (ADataCon dc1) (ADataCon dc2)
+  | idType (dataConWrapId dc1) `tcEqType` idType (dataConWrapId dc2)
+  = return ()
+
+	-- Can't declare a class in a hi-boot file
+
+check_thing boot_thing real_thing	-- Default case; failure
+  = addErrAt (srcLocSpan (getSrcLoc real_thing))
+	     (bootMisMatch real_thing)
+
+----------------
+missingBootThing thing
+  = ppr thing <+> ptext SLIT("is defined in the hs-boot file, but not in the module")
+bootMisMatch thing
+  = ppr thing <+> ptext SLIT("has conflicting definitions in the module and its hs-boot file")
+instMisMatch inst
+  = hang (ppr inst)
+       2 (ptext SLIT("is defined in the hs-boot file, but not in the module"))
+\end{code}
+
+
+%************************************************************************
+%* *
+	Type-checking the top level of a module
+%* *
+%************************************************************************
+
+tcRnGroup takes a bunch of top-level source-code declarations, and
+ * renames them
+ * gets supporting declarations from interface files
+ * typechecks them
+ * zonks them
+ * and augments the TcGblEnv with the results
+
+In Template Haskell it may be called repeatedly for each group of
+declarations.  It expects there to be an incoming TcGblEnv in the
+monad; it augments it and returns the new TcGblEnv.
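+
+For illustration (a hypothetical example, not part of the original notes), a
+Template Haskell module such as
+
+	f x = x
+	$(mkDecls)		-- mkDecls stands for some splice
+	g y = f (f y)
+
+is processed as two groups: tcRnGroup is called once for the group containing
+f, the splice is then run, and tcRnGroup is called again for the group
+containing g together with whatever declarations the splice produced.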
+ +\begin{code} +tcRnGroup :: ModDetails -> HsGroup RdrName -> TcM (TcGblEnv, TcLclEnv) + -- Returns the variables free in the decls, for unused-binding reporting +tcRnGroup boot_details decls + = do { -- Rename the declarations + (tcg_env, rn_decls) <- rnTopSrcDecls decls ; + setGblEnv tcg_env $ do { + + -- Typecheck the declarations + tcTopSrcDecls boot_details rn_decls + }} + +------------------------------------------------ +rnTopSrcDecls :: HsGroup RdrName -> TcM (TcGblEnv, HsGroup Name) +rnTopSrcDecls group + = do { -- Bring top level binders into scope + tcg_env <- importsFromLocalDecls group ; + setGblEnv tcg_env $ do { + + failIfErrsM ; -- No point in continuing if (say) we have duplicate declarations + + -- Rename the source decls + (tcg_env, rn_decls) <- rnSrcDecls group ; + failIfErrsM ; + + -- save the renamed syntax, if we want it + let { tcg_env' + | Just grp <- tcg_rn_decls tcg_env + = tcg_env{ tcg_rn_decls = Just (appendGroups grp rn_decls) } + | otherwise + = tcg_env }; + + -- Dump trace of renaming part + rnDump (ppr rn_decls) ; + + return (tcg_env', rn_decls) + }} + +------------------------------------------------ +tcTopSrcDecls :: ModDetails -> HsGroup Name -> TcM (TcGblEnv, TcLclEnv) +tcTopSrcDecls boot_details + (HsGroup { hs_tyclds = tycl_decls, + hs_instds = inst_decls, + hs_fords = foreign_decls, + hs_defds = default_decls, + hs_ruleds = rule_decls, + hs_valds = val_binds }) + = do { -- Type-check the type and class decls, and all imported decls + -- The latter come in via tycl_decls + traceTc (text "Tc2") ; + + tcg_env <- checkNoErrs (tcTyAndClassDecls boot_details tycl_decls) ; + -- tcTyAndClassDecls recovers internally, but if anything gave rise to + -- an error we'd better stop now, to avoid a cascade + + -- Make these type and class decls available to stuff slurped from interface files + writeMutVar (tcg_type_env_var tcg_env) (tcg_type_env tcg_env) ; + + + setGblEnv tcg_env $ do { + -- Source-language instances, including derivings, + -- and import the supporting declarations + traceTc (text "Tc3") ; + (tcg_env, inst_infos, deriv_binds) <- tcInstDecls1 tycl_decls inst_decls ; + setGblEnv tcg_env $ do { + + -- Foreign import declarations next. No zonking necessary + -- here; we can tuck them straight into the global environment. 
+ traceTc (text "Tc4") ; + (fi_ids, fi_decls) <- tcForeignImports foreign_decls ; + tcExtendGlobalValEnv fi_ids $ do { + + -- Default declarations + traceTc (text "Tc4a") ; + default_tys <- tcDefaults default_decls ; + updGblEnv (\gbl -> gbl { tcg_default = default_tys }) $ do { + + -- Value declarations next + -- We also typecheck any extra binds that came out + -- of the "deriving" process (deriv_binds) + traceTc (text "Tc5") ; + (tc_val_binds, tcl_env) <- tcTopBinds (val_binds `plusHsValBinds` deriv_binds) ; + setLclTypeEnv tcl_env $ do { + + -- Second pass over class and instance declarations, + traceTc (text "Tc6") ; + (inst_binds, tcl_env) <- tcInstDecls2 tycl_decls inst_infos ; + showLIE (text "after instDecls2") ; + + -- Foreign exports + -- They need to be zonked, so we return them + traceTc (text "Tc7") ; + (foe_binds, foe_decls) <- tcForeignExports foreign_decls ; + + -- Rules + rules <- tcRules rule_decls ; + + -- Wrap up + traceTc (text "Tc7a") ; + tcg_env <- getGblEnv ; + let { all_binds = tc_val_binds `unionBags` + inst_binds `unionBags` + foe_binds ; + + -- Extend the GblEnv with the (as yet un-zonked) + -- bindings, rules, foreign decls + tcg_env' = tcg_env { tcg_binds = tcg_binds tcg_env `unionBags` all_binds, + tcg_rules = tcg_rules tcg_env ++ rules, + tcg_fords = tcg_fords tcg_env ++ foe_decls ++ fi_decls } } ; + return (tcg_env', tcl_env) + }}}}}} +\end{code} + + +%************************************************************************ +%* * + Checking for 'main' +%* * +%************************************************************************ + +\begin{code} +checkMain :: TcM TcGblEnv +-- If we are in module Main, check that 'main' is defined. +checkMain + = do { ghc_mode <- getGhcMode ; + tcg_env <- getGblEnv ; + dflags <- getDOpts ; + let { main_mod = mainModIs dflags ; + main_fn = case mainFunIs dflags of { + Just fn -> mkRdrUnqual (mkVarOccFS (mkFastString fn)) ; + Nothing -> main_RDR_Unqual } } ; + + check_main ghc_mode tcg_env main_mod main_fn + } + + +check_main ghc_mode tcg_env main_mod main_fn + | mod /= main_mod + = traceTc (text "checkMain not" <+> ppr main_mod <+> ppr mod) >> + return tcg_env + + | otherwise + = addErrCtxt mainCtxt $ + do { mb_main <- lookupSrcOcc_maybe main_fn + -- Check that 'main' is in scope + -- It might be imported from another module! + ; case mb_main of { + Nothing -> do { traceTc (text "checkMain fail" <+> ppr main_mod <+> ppr main_fn) + ; complain_no_main + ; return tcg_env } ; + Just main_name -> do + { traceTc (text "checkMain found" <+> ppr main_mod <+> ppr main_fn) + ; let { rhs = nlHsApp (nlHsVar runMainIOName) (nlHsVar main_name) } + -- :Main.main :: IO () = runMainIO main + + ; (main_expr, ty) <- setSrcSpan (srcLocSpan (getSrcLoc main_name)) $ + tcInferRho rhs + + -- The function that the RTS invokes is always :Main.main, + -- which we call root_main_id. + -- (Because GHC allows the user to have a module not called + -- Main as the main module, we can't rely on the main function + -- being called "Main.main". That's why root_main_id has a fixed + -- module ":Main".) + -- We also make root_main_id an implicit Id, by making main_name + -- its parent (hence (Just main_name)). That has the effect + -- of preventing its type and unfolding from getting out into + -- the interface file. Otherwise we can end up with two defns + -- for 'main' in the interface file! 
+ + ; let { root_main_name = mkExternalName rootMainKey rOOT_MAIN + (mkVarOccFS FSLIT("main")) + (Just main_name) (getSrcLoc main_name) + ; root_main_id = mkExportedLocalId root_main_name ty + ; main_bind = noLoc (VarBind root_main_id main_expr) } + + ; return (tcg_env { tcg_binds = tcg_binds tcg_env + `snocBag` main_bind, + tcg_dus = tcg_dus tcg_env + `plusDU` usesOnly (unitFV main_name) + -- Record the use of 'main', so that we don't + -- complain about it being defined but not used + }) + }}} + where + mod = tcg_mod tcg_env + + complain_no_main | ghc_mode == Interactive = return () + | otherwise = failWithTc noMainMsg + -- In interactive mode, don't worry about the absence of 'main' + -- In other modes, fail altogether, so that we don't go on + -- and complain a second time when processing the export list. + + mainCtxt = ptext SLIT("When checking the type of the main function") <+> quotes (ppr main_fn) + noMainMsg = ptext SLIT("The main function") <+> quotes (ppr main_fn) + <+> ptext SLIT("is not defined in module") <+> quotes (ppr main_mod) +\end{code} + +%********************************************************* +%* * + GHCi stuff +%* * +%********************************************************* + +\begin{code} +#ifdef GHCI +setInteractiveContext :: HscEnv -> InteractiveContext -> TcRn a -> TcRn a +setInteractiveContext hsc_env icxt thing_inside + = let + -- Initialise the tcg_inst_env with instances + -- from all home modules. This mimics the more selective + -- call to hptInstances in tcRnModule + dfuns = hptInstances hsc_env (\mod -> True) + in + updGblEnv (\env -> env { + tcg_rdr_env = ic_rn_gbl_env icxt, + tcg_type_env = ic_type_env icxt, + tcg_inst_env = extendInstEnvList (tcg_inst_env env) dfuns }) $ + + updLclEnv (\env -> env { tcl_rdr = ic_rn_local_env icxt }) $ + + do { traceTc (text "setIC" <+> ppr (ic_type_env icxt)) + ; thing_inside } +\end{code} + + +\begin{code} +tcRnStmt :: HscEnv + -> InteractiveContext + -> LStmt RdrName + -> IO (Maybe (InteractiveContext, [Name], LHsExpr Id)) + -- The returned [Name] is the same as the input except for + -- ExprStmt, in which case the returned [Name] is [itName] + -- + -- The returned TypecheckedHsExpr is of type IO [ () ], + -- a list of the bound values, coerced to (). + +tcRnStmt hsc_env ictxt rdr_stmt + = initTcPrintErrors hsc_env iNTERACTIVE $ + setInteractiveContext hsc_env ictxt $ do { + + -- Rename; use CmdLineMode because tcRnStmt is only used interactively + (([rn_stmt], _), fvs) <- rnStmts DoExpr [rdr_stmt] (return ((), emptyFVs)) ; + traceRn (text "tcRnStmt" <+> vcat [ppr rdr_stmt, ppr rn_stmt, ppr fvs]) ; + failIfErrsM ; + + -- The real work is done here + (bound_ids, tc_expr) <- mkPlan rn_stmt ; + zonked_expr <- zonkTopLExpr tc_expr ; + zonked_ids <- zonkTopBndrs bound_ids ; + + -- None of the Ids should be of unboxed type, because we + -- cast them all to HValues in the end! + mappM bad_unboxed (filter (isUnLiftedType . idType) zonked_ids) ; + + traceTc (text "tcs 1") ; + let { -- (a) Make all the bound ids "global" ids, now that + -- they're notionally top-level bindings. This is + -- important: otherwise when we come to compile an expression + -- using these ids later, the byte code generator will consider + -- the occurrences to be free rather than global. 
+ -- + -- (b) Tidy their types; this is important, because :info may + -- ask to look at them, and :info expects the things it looks + -- up to have tidy types + global_ids = map globaliseAndTidy zonked_ids ; + + -- Update the interactive context + rn_env = ic_rn_local_env ictxt ; + type_env = ic_type_env ictxt ; + + bound_names = map idName global_ids ; + new_rn_env = extendLocalRdrEnv rn_env bound_names ; + + -- Remove any shadowed bindings from the type_env; + -- they are inaccessible but might, I suppose, cause + -- a space leak if we leave them there + shadowed = [ n | name <- bound_names, + let rdr_name = mkRdrUnqual (nameOccName name), + Just n <- [lookupLocalRdrEnv rn_env rdr_name] ] ; + + filtered_type_env = delListFromNameEnv type_env shadowed ; + new_type_env = extendTypeEnvWithIds filtered_type_env global_ids ; + + new_ic = ictxt { ic_rn_local_env = new_rn_env, + ic_type_env = new_type_env } + } ; + + dumpOptTcRn Opt_D_dump_tc + (vcat [text "Bound Ids" <+> pprWithCommas ppr global_ids, + text "Typechecked expr" <+> ppr zonked_expr]) ; + + returnM (new_ic, bound_names, zonked_expr) + } + where + bad_unboxed id = addErr (sep [ptext SLIT("GHCi can't bind a variable of unlifted type:"), + nest 2 (ppr id <+> dcolon <+> ppr (idType id))]) + +globaliseAndTidy :: Id -> Id +globaliseAndTidy id +-- Give the Id a Global Name, and tidy its type + = setIdType (globaliseId VanillaGlobal id) tidy_type + where + tidy_type = tidyTopType (idType id) +\end{code} + +Here is the grand plan, implemented in tcUserStmt + + What you type The IO [HValue] that hscStmt returns + ------------- ------------------------------------ + let pat = expr ==> let pat = expr in return [coerce HVal x, coerce HVal y, ...] + bindings: [x,y,...] + + pat <- expr ==> expr >>= \ pat -> return [coerce HVal x, coerce HVal y, ...] + bindings: [x,y,...] + + expr (of IO type) ==> expr >>= \ it -> return [coerce HVal it] + [NB: result not printed] bindings: [it] + + expr (of non-IO type, ==> let it = expr in print it >> return [coerce HVal it] + result showable) bindings: [it] + + expr (of non-IO type, + result not showable) ==> error + + +\begin{code} +--------------------------- +type PlanResult = ([Id], LHsExpr Id) +type Plan = TcM PlanResult + +runPlans :: [Plan] -> TcM PlanResult +-- Try the plans in order. If one fails (by raising an exn), try the next. +-- If one succeeds, take it. +runPlans [] = panic "runPlans" +runPlans [p] = p +runPlans (p:ps) = tryTcLIE_ (runPlans ps) p + +-------------------- +mkPlan :: LStmt Name -> TcM PlanResult +mkPlan (L loc (ExprStmt expr _ _)) -- An expression typed at the prompt + = do { uniq <- newUnique -- is treated very specially + ; let fresh_it = itName uniq + the_bind = L loc $ mkFunBind (L loc fresh_it) matches + matches = [mkMatch [] expr emptyLocalBinds] + let_stmt = L loc $ LetStmt (HsValBinds (ValBindsOut [(NonRecursive,unitBag the_bind)] [])) + bind_stmt = L loc $ BindStmt (nlVarPat fresh_it) expr + (HsVar bindIOName) noSyntaxExpr + print_it = L loc $ ExprStmt (nlHsApp (nlHsVar printName) (nlHsVar fresh_it)) + (HsVar thenIOName) placeHolderType + + -- The plans are: + -- [it <- e; print it] but not if it::() + -- [it <- e] + -- [let it = e; print it] + ; runPlans [ -- Plan A + do { stuff@([it_id], _) <- tcGhciStmts [bind_stmt, print_it] + ; it_ty <- zonkTcType (idType it_id) + ; ifM (isUnitTy it_ty) failM + ; return stuff }, + + -- Plan B; a naked bind statment + tcGhciStmts [bind_stmt], + + -- Plan C; check that the let-binding is typeable all by itself. 
+ -- If not, fail; if so, try to print it. + -- The two-step process avoids getting two errors: one from + -- the expression itself, and one from the 'print it' part + -- This two-step story is very clunky, alas + do { checkNoErrs (tcGhciStmts [let_stmt]) + --- checkNoErrs defeats the error recovery of let-bindings + ; tcGhciStmts [let_stmt, print_it] } + ]} + +mkPlan stmt@(L loc (BindStmt {})) + | [L _ v] <- collectLStmtBinders stmt -- One binder, for a bind stmt + = do { let print_v = L loc $ ExprStmt (nlHsApp (nlHsVar printName) (nlHsVar v)) + (HsVar thenIOName) placeHolderType + -- The plans are: + -- [stmt; print v] but not if v::() + -- [stmt] + ; runPlans [do { stuff@([v_id], _) <- tcGhciStmts [stmt, print_v] + ; v_ty <- zonkTcType (idType v_id) + ; ifM (isUnitTy v_ty || not (isTauTy v_ty)) failM + ; return stuff }, + tcGhciStmts [stmt] + ]} + +mkPlan stmt + = tcGhciStmts [stmt] + +--------------------------- +tcGhciStmts :: [LStmt Name] -> TcM PlanResult +tcGhciStmts stmts + = do { ioTyCon <- tcLookupTyCon ioTyConName ; + ret_id <- tcLookupId returnIOName ; -- return @ IO + let { + io_ty = mkTyConApp ioTyCon [] ; + ret_ty = mkListTy unitTy ; + io_ret_ty = mkTyConApp ioTyCon [ret_ty] ; + + names = map unLoc (collectLStmtsBinders stmts) ; + + -- mk_return builds the expression + -- returnIO @ [()] [coerce () x, .., coerce () z] + -- + -- Despite the inconvenience of building the type applications etc, + -- this *has* to be done in type-annotated post-typecheck form + -- because we are going to return a list of *polymorphic* values + -- coerced to type (). If we built a *source* stmt + -- return [coerce x, ..., coerce z] + -- then the type checker would instantiate x..z, and we wouldn't + -- get their *polymorphic* values. (And we'd get ambiguity errs + -- if they were overloaded, since they aren't applied to anything.) + mk_return ids = nlHsApp (noLoc $ TyApp (nlHsVar ret_id) [ret_ty]) + (noLoc $ ExplicitList unitTy (map mk_item ids)) ; + mk_item id = nlHsApp (noLoc $ TyApp (nlHsVar unsafeCoerceId) [idType id, unitTy]) + (nlHsVar id) + } ; + + -- OK, we're ready to typecheck the stmts + traceTc (text "tcs 2") ; + ((tc_stmts, ids), lie) <- getLIE $ + tcStmts DoExpr (tcDoStmt io_ty) stmts io_ret_ty $ \ _ -> + mappM tcLookupId names ; + -- Look up the names right in the middle, + -- where they will all be in scope + + -- Simplify the context + const_binds <- checkNoErrs (tcSimplifyInteractive lie) ; + -- checkNoErrs ensures that the plan fails if context redn fails + + return (ids, mkHsDictLet const_binds $ + noLoc (HsDo DoExpr tc_stmts (mk_return ids) io_ret_ty)) + } +\end{code} + + +tcRnExpr just finds the type of an expression + +\begin{code} +tcRnExpr :: HscEnv + -> InteractiveContext + -> LHsExpr RdrName + -> IO (Maybe Type) +tcRnExpr hsc_env ictxt rdr_expr + = initTcPrintErrors hsc_env iNTERACTIVE $ + setInteractiveContext hsc_env ictxt $ do { + + (rn_expr, fvs) <- rnLExpr rdr_expr ; + failIfErrsM ; + + -- Now typecheck the expression; + -- it might have a rank-2 type (e.g. 
:t runST) + ((tc_expr, res_ty), lie) <- getLIE (tcInferRho rn_expr) ; + ((qtvs, _, dict_ids), lie_top) <- getLIE (tcSimplifyInfer smpl_doc (tyVarsOfType res_ty) lie) ; + tcSimplifyInteractive lie_top ; + qtvs' <- mappM zonkQuantifiedTyVar qtvs ; + + let { all_expr_ty = mkForAllTys qtvs' $ + mkFunTys (map idType dict_ids) $ + res_ty } ; + zonkTcType all_expr_ty + } + where + smpl_doc = ptext SLIT("main expression") +\end{code} + +tcRnType just finds the kind of a type + +\begin{code} +tcRnType :: HscEnv + -> InteractiveContext + -> LHsType RdrName + -> IO (Maybe Kind) +tcRnType hsc_env ictxt rdr_type + = initTcPrintErrors hsc_env iNTERACTIVE $ + setInteractiveContext hsc_env ictxt $ do { + + rn_type <- rnLHsType doc rdr_type ; + failIfErrsM ; + + -- Now kind-check the type + (ty', kind) <- kcHsType rn_type ; + return kind + } + where + doc = ptext SLIT("In GHCi input") + +#endif /* GHCi */ +\end{code} + + +%************************************************************************ +%* * + More GHCi stuff, to do with browsing and getting info +%* * +%************************************************************************ + +\begin{code} +#ifdef GHCI +-- ASSUMES that the module is either in the HomePackageTable or is +-- a package module with an interface on disk. If neither of these is +-- true, then the result will be an error indicating the interface +-- could not be found. +getModuleExports :: HscEnv -> Module -> IO (Messages, Maybe NameSet) +getModuleExports hsc_env mod + = initTc hsc_env HsSrcFile iNTERACTIVE (tcGetModuleExports mod) + +tcGetModuleExports :: Module -> TcM NameSet +tcGetModuleExports mod = do + iface <- load_iface mod + loadOrphanModules (dep_orphs (mi_deps iface)) + -- Load any orphan-module interfaces, + -- so their instances are visible + ifaceExportNames (mi_exports iface) + +load_iface mod = loadSrcInterface doc mod False {- Not boot iface -} + where + doc = ptext SLIT("context for compiling statements") + + +tcRnLookupRdrName :: HscEnv -> RdrName -> IO (Maybe [Name]) +tcRnLookupRdrName hsc_env rdr_name + = initTcPrintErrors hsc_env iNTERACTIVE $ + setInteractiveContext hsc_env (hsc_IC hsc_env) $ + lookup_rdr_name rdr_name + +lookup_rdr_name rdr_name = do { + -- If the identifier is a constructor (begins with an + -- upper-case letter), then we need to consider both + -- constructor and type class identifiers. + let { rdr_names = dataTcOccs rdr_name } ; + + -- results :: [Either Messages Name] + results <- mapM (tryTcErrs . 
lookupOccRn) rdr_names ;
+
+	  traceRn (text "xx" <+> vcat [ppr rdr_names, ppr (map snd results)]);
+	-- The successful lookups will be (Just name)
+    let { (warns_s, good_names) = unzip [ (msgs, name)
+					| (msgs, Just name) <- results] ;
+	  errs_s = [msgs | (msgs, Nothing) <- results] } ;
+
+	-- Fail if nothing good happened, else add warnings
+    if null good_names then
+		-- No lookup succeeded, so
+		-- pick the first error message and report it
+		-- ToDo: If one of the errors is "could be Foo.X or Baz.X",
+		--	 while the other is "X is not in scope",
+		--	 we definitely want the former; but we might pick the latter
+	do { addMessages (head errs_s) ; failM }
+      else			-- Add deprecation warnings
+	mapM_ addMessages warns_s ;
+
+    return good_names
+ }
+
+
+tcRnLookupName :: HscEnv -> Name -> IO (Maybe TyThing)
+tcRnLookupName hsc_env name
+  = initTcPrintErrors hsc_env iNTERACTIVE $
+    setInteractiveContext hsc_env (hsc_IC hsc_env) $
+    tcLookupGlobal name
+
+
+tcRnGetInfo :: HscEnv
+	    -> Name
+	    -> IO (Maybe (TyThing, Fixity, [Instance]))
+
+-- Used to implement :info in GHCi
+--
+-- Look up a RdrName and return all the TyThings it might be
+-- A capitalised RdrName is given to us in the DataName namespace,
+-- but we want to treat it as *both* a data constructor
+--  *and* as a type or class constructor;
+-- hence the call to dataTcOccs, and we return up to two results
+tcRnGetInfo hsc_env name
+  = initTcPrintErrors hsc_env iNTERACTIVE $
+    let ictxt = hsc_IC hsc_env in
+    setInteractiveContext hsc_env ictxt $ do
+
+	-- Load the interface for all unqualified types and classes
+	-- That way we will find all the instance declarations
+	-- (Packages have no orphan modules, and we assume that
+	--  in the home package all relevant modules are loaded.)
+    loadUnqualIfaces ictxt
+
+    thing  <- tcLookupGlobal name
+    fixity <- lookupFixityRn name
+    ispecs <- lookupInsts (icPrintUnqual ictxt) thing
+    return (thing, fixity, ispecs)
+
+
+lookupInsts :: PrintUnqualified -> TyThing -> TcM [Instance]
+-- Filter the instances by the ones whose tycons (or classes resp)
+-- are in scope unqualified.  Otherwise we list a whole lot too many!
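+-- For example (an illustration, not from the original comment): if Data.Map
+-- is imported only qualified, ':info Ord' should not list
+-- 'instance Ord k => Ord (Map k v)', because 'Map' does not print
+-- unqualified here; plausibleDFun (below) filters such instances out.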
+lookupInsts print_unqual (AClass cls) + = do { inst_envs <- tcGetInstEnvs + ; return [ ispec + | ispec <- classInstances inst_envs cls + , plausibleDFun print_unqual (instanceDFunId ispec) ] } + +lookupInsts print_unqual (ATyCon tc) + = do { eps <- getEps -- Load all instances for all classes that are + -- in the type environment (which are all the ones + -- we've seen in any interface file so far) + ; (pkg_ie, home_ie) <- tcGetInstEnvs -- Search all + ; return [ ispec + | ispec <- instEnvElts home_ie ++ instEnvElts pkg_ie + , let dfun = instanceDFunId ispec + , relevant dfun + , plausibleDFun print_unqual dfun ] } + where + relevant df = tc_name `elemNameSet` tyClsNamesOfDFunHead (idType df) + tc_name = tyConName tc + +lookupInsts print_unqual other = return [] + +plausibleDFun print_unqual dfun -- Dfun involving only names that print unqualified + = all ok (nameSetToList (tyClsNamesOfType (idType dfun))) + where + ok name | isBuiltInSyntax name = True + | isExternalName name = print_unqual (nameModule name) (nameOccName name) + | otherwise = True + +loadUnqualIfaces :: InteractiveContext -> TcM () +-- Load the home module for everything that is in scope unqualified +-- This is so that we can accurately report the instances for +-- something +loadUnqualIfaces ictxt + = initIfaceTcRn $ + mapM_ (loadSysInterface doc) (moduleSetElts (mkModuleSet unqual_mods)) + where + unqual_mods = [ nameModule name + | gre <- globalRdrEnvElts (ic_rn_gbl_env ictxt), + let name = gre_name gre, + not (isInternalName name), + isTcOcc (nameOccName name), -- Types and classes only + unQualOK gre ] -- In scope unqualified + doc = ptext SLIT("Need interface for module whose export(s) are in scope unqualified") +#endif /* GHCI */ +\end{code} + +%************************************************************************ +%* * + Degugging output +%* * +%************************************************************************ + +\begin{code} +rnDump :: SDoc -> TcRn () +-- Dump, with a banner, if -ddump-rn +rnDump doc = do { dumpOptTcRn Opt_D_dump_rn (mkDumpDoc "Renamer" doc) } + +tcDump :: TcGblEnv -> TcRn () +tcDump env + = do { dflags <- getDOpts ; + + -- Dump short output if -ddump-types or -ddump-tc + ifM (dopt Opt_D_dump_types dflags || dopt Opt_D_dump_tc dflags) + (dumpTcRn short_dump) ; + + -- Dump bindings if -ddump-tc + dumpOptTcRn Opt_D_dump_tc (mkDumpDoc "Typechecker" full_dump) + } + where + short_dump = pprTcGblEnv env + full_dump = pprLHsBinds (tcg_binds env) + -- NB: foreign x-d's have undefined's in their types; + -- hence can't show the tc_fords + +tcCoreDump mod_guts + = do { dflags <- getDOpts ; + ifM (dopt Opt_D_dump_types dflags || dopt Opt_D_dump_tc dflags) + (dumpTcRn (pprModGuts mod_guts)) ; + + -- Dump bindings if -ddump-tc + dumpOptTcRn Opt_D_dump_tc (mkDumpDoc "Typechecker" full_dump) } + where + full_dump = pprCoreBindings (mg_binds mod_guts) + +-- It's unpleasant having both pprModGuts and pprModDetails here +pprTcGblEnv :: TcGblEnv -> SDoc +pprTcGblEnv (TcGblEnv { tcg_type_env = type_env, + tcg_insts = dfun_ids, + tcg_rules = rules, + tcg_imports = imports }) + = vcat [ ppr_types dfun_ids type_env + , ppr_insts dfun_ids + , vcat (map ppr rules) + , ppr_gen_tycons (typeEnvTyCons type_env) + , ptext SLIT("Dependent modules:") <+> ppr (moduleEnvElts (imp_dep_mods imports)) + , ptext SLIT("Dependent packages:") <+> ppr (imp_dep_pkgs imports)] + +pprModGuts :: ModGuts -> SDoc +pprModGuts (ModGuts { mg_types = type_env, + mg_rules = rules }) + = vcat [ ppr_types [] type_env, + ppr_rules rules ] 
+ + +ppr_types :: [Instance] -> TypeEnv -> SDoc +ppr_types ispecs type_env + = text "TYPE SIGNATURES" $$ nest 4 (ppr_sigs ids) + where + dfun_ids = map instanceDFunId ispecs + ids = [id | id <- typeEnvIds type_env, want_sig id] + want_sig id | opt_PprStyle_Debug = True + | otherwise = isLocalId id && + isExternalName (idName id) && + not (id `elem` dfun_ids) + -- isLocalId ignores data constructors, records selectors etc. + -- The isExternalName ignores local dictionary and method bindings + -- that the type checker has invented. Top-level user-defined things + -- have External names. + +ppr_insts :: [Instance] -> SDoc +ppr_insts [] = empty +ppr_insts ispecs = text "INSTANCES" $$ nest 2 (pprInstances ispecs) + +ppr_sigs :: [Var] -> SDoc +ppr_sigs ids + -- Print type signatures; sort by OccName + = vcat (map ppr_sig (sortLe le_sig ids)) + where + le_sig id1 id2 = getOccName id1 <= getOccName id2 + ppr_sig id = ppr id <+> dcolon <+> ppr (tidyTopType (idType id)) + +ppr_rules :: [CoreRule] -> SDoc +ppr_rules [] = empty +ppr_rules rs = vcat [ptext SLIT("{-# RULES"), + nest 4 (pprRules rs), + ptext SLIT("#-}")] + +ppr_gen_tycons [] = empty +ppr_gen_tycons tcs = vcat [ptext SLIT("Tycons with generics:"), + nest 2 (fsep (map ppr (filter tyConHasGenerics tcs)))] +\end{code} diff --git a/compiler/typecheck/TcRnMonad.lhs b/compiler/typecheck/TcRnMonad.lhs new file mode 100644 index 0000000000..ff1979bc06 --- /dev/null +++ b/compiler/typecheck/TcRnMonad.lhs @@ -0,0 +1,1042 @@ +\begin{code} +module TcRnMonad( + module TcRnMonad, + module TcRnTypes, + module IOEnv + ) where + +#include "HsVersions.h" + +import TcRnTypes -- Re-export all +import IOEnv -- Re-export all + +#if defined(GHCI) && defined(BREAKPOINT) +import TypeRep ( Type(..), liftedTypeKind, TyThing(..) ) +import Var ( mkTyVar, mkGlobalId ) +import IdInfo ( GlobalIdDetails(..), vanillaIdInfo ) +import OccName ( mkOccName, tvName ) +import SrcLoc ( noSrcLoc ) +import TysWiredIn ( intTy, stringTy, mkListTy, unitTy ) +import PrelNames ( breakpointJumpName ) +import NameEnv ( mkNameEnv ) +#endif + +import HsSyn ( emptyLHsBinds ) +import HscTypes ( HscEnv(..), ModGuts(..), ModIface(..), + TyThing, TypeEnv, emptyTypeEnv, HscSource(..), + isHsBoot, ModSummary(..), + ExternalPackageState(..), HomePackageTable, + Deprecs(..), FixityEnv, FixItem, + lookupType, unQualInScope ) +import Module ( Module, unitModuleEnv ) +import RdrName ( GlobalRdrEnv, emptyGlobalRdrEnv, + LocalRdrEnv, emptyLocalRdrEnv ) +import Name ( Name, isInternalName, mkInternalName, tidyNameOcc, nameOccName, getSrcLoc ) +import Type ( Type ) +import TcType ( tcIsTyVarTy, tcGetTyVar ) +import NameEnv ( extendNameEnvList, nameEnvElts ) +import InstEnv ( emptyInstEnv ) + +import Var ( setTyVarName ) +import VarSet ( emptyVarSet ) +import VarEnv ( TidyEnv, emptyTidyEnv, extendVarEnv ) +import ErrUtils ( Message, Messages, emptyMessages, errorsFound, + mkWarnMsg, printErrorsAndWarnings, + mkLocMessage, mkLongErrMsg ) +import Packages ( mkHomeModules ) +import SrcLoc ( mkGeneralSrcSpan, isGoodSrcSpan, SrcSpan, Located(..) 
) +import NameEnv ( emptyNameEnv ) +import NameSet ( NameSet, emptyDUs, emptyNameSet, unionNameSets, addOneToNameSet ) +import OccName ( emptyOccEnv, tidyOccName ) +import Bag ( emptyBag ) +import Outputable +import UniqSupply ( UniqSupply, mkSplitUniqSupply, uniqFromSupply, splitUniqSupply ) +import Unique ( Unique ) +import DynFlags ( DynFlags(..), DynFlag(..), dopt, dopt_set, GhcMode ) +import StaticFlags ( opt_PprStyle_Debug ) +import Bag ( snocBag, unionBags ) +import Panic ( showException ) + +import IO ( stderr ) +import DATA_IOREF ( newIORef, readIORef ) +import EXCEPTION ( Exception ) +\end{code} + + + +%************************************************************************ +%* * + initTc +%* * +%************************************************************************ + +\begin{code} +ioToTcRn :: IO r -> TcRn r +ioToTcRn = ioToIOEnv +\end{code} + +\begin{code} +initTc :: HscEnv + -> HscSource + -> Module + -> TcM r + -> IO (Messages, Maybe r) + -- Nothing => error thrown by the thing inside + -- (error messages should have been printed already) + +initTc hsc_env hsc_src mod do_this + = do { errs_var <- newIORef (emptyBag, emptyBag) ; + tvs_var <- newIORef emptyVarSet ; + type_env_var <- newIORef emptyNameEnv ; + dfuns_var <- newIORef emptyNameSet ; + keep_var <- newIORef emptyNameSet ; + th_var <- newIORef False ; + dfun_n_var <- newIORef 1 ; + let { + gbl_env = TcGblEnv { + tcg_mod = mod, + tcg_src = hsc_src, + tcg_rdr_env = emptyGlobalRdrEnv, + tcg_fix_env = emptyNameEnv, + tcg_default = Nothing, + tcg_type_env = emptyNameEnv, + tcg_type_env_var = type_env_var, + tcg_inst_env = emptyInstEnv, + tcg_inst_uses = dfuns_var, + tcg_th_used = th_var, + tcg_exports = emptyNameSet, + tcg_imports = init_imports, + tcg_home_mods = home_mods, + tcg_dus = emptyDUs, + tcg_rn_imports = Nothing, + tcg_rn_exports = Nothing, + tcg_rn_decls = Nothing, + tcg_binds = emptyLHsBinds, + tcg_deprecs = NoDeprecs, + tcg_insts = [], + tcg_rules = [], + tcg_fords = [], + tcg_dfun_n = dfun_n_var, + tcg_keep = keep_var + } ; + lcl_env = TcLclEnv { + tcl_errs = errs_var, + tcl_loc = mkGeneralSrcSpan FSLIT("Top level"), + tcl_ctxt = [], + tcl_rdr = emptyLocalRdrEnv, + tcl_th_ctxt = topStage, + tcl_arrow_ctxt = NoArrowCtxt, + tcl_env = emptyNameEnv, + tcl_tyvars = tvs_var, + tcl_lie = panic "initTc:LIE" -- LIE only valid inside a getLIE + } ; + } ; + + -- OK, here's the business end! + maybe_res <- initTcRnIf 'a' hsc_env gbl_env lcl_env $ + do { +#if defined(GHCI) && defined(BREAKPOINT) + unique <- newUnique ; + let { var = mkInternalName unique (mkOccName tvName "a") noSrcLoc; + tyvar = mkTyVar var liftedTypeKind; + breakpointJumpType = mkGlobalId + (VanillaGlobal) + (breakpointJumpName) + (FunTy intTy + (FunTy (mkListTy unitTy) + (FunTy stringTy + (ForAllTy tyvar + (FunTy (TyVarTy tyvar) + (TyVarTy tyvar)))))) + (vanillaIdInfo); + new_env = mkNameEnv [(breakpointJumpName,AGlobal (AnId breakpointJumpType))]; + }; + r <- tryM (updLclEnv (\gbl -> gbl{tcl_env=new_env}) do_this) +#else + r <- tryM do_this +#endif + ; case r of + Right res -> return (Just res) + Left _ -> return Nothing } ; + + -- Collect any error messages + msgs <- readIORef errs_var ; + + let { dflags = hsc_dflags hsc_env + ; final_res | errorsFound dflags msgs = Nothing + | otherwise = maybe_res } ; + + return (msgs, final_res) + } + where + home_mods = mkHomeModules (map ms_mod (hsc_mod_graph hsc_env)) + -- A guess at the home modules. 
This will be correct in + -- --make and GHCi modes, but in one-shot mode we need to + -- fix it up after we know the real dependencies of the current + -- module (see tcRnModule). + -- Setting it here is necessary for the typechecker entry points + -- other than tcRnModule: tcRnGetInfo, for example. These are + -- all called via the GHC module, so hsc_mod_graph will contain + -- something sensible. + + init_imports = emptyImportAvails {imp_env = unitModuleEnv mod emptyNameSet} + -- Initialise tcg_imports with an empty set of bindings for + -- this module, so that if we see 'module M' in the export + -- list, and there are no bindings in M, we don't bleat + -- "unknown module M". + +initTcPrintErrors -- Used from the interactive loop only + :: HscEnv + -> Module + -> TcM r + -> IO (Maybe r) +initTcPrintErrors env mod todo = do + (msgs, res) <- initTc env HsSrcFile mod todo + printErrorsAndWarnings (hsc_dflags env) msgs + return res + +-- mkImpTypeEnv makes the imported symbol table +mkImpTypeEnv :: ExternalPackageState -> HomePackageTable + -> Name -> Maybe TyThing +mkImpTypeEnv pcs hpt = lookup + where + pte = eps_PTE pcs + lookup name | isInternalName name = Nothing + | otherwise = lookupType hpt pte name +\end{code} + + +%************************************************************************ +%* * + Initialisation +%* * +%************************************************************************ + + +\begin{code} +initTcRnIf :: Char -- Tag for unique supply + -> HscEnv + -> gbl -> lcl + -> TcRnIf gbl lcl a + -> IO a +initTcRnIf uniq_tag hsc_env gbl_env lcl_env thing_inside + = do { us <- mkSplitUniqSupply uniq_tag ; + ; us_var <- newIORef us ; + + ; let { env = Env { env_top = hsc_env, + env_us = us_var, + env_gbl = gbl_env, + env_lcl = lcl_env } } + + ; runIOEnv env thing_inside + } +\end{code} + +%************************************************************************ +%* * + Simple accessors +%* * +%************************************************************************ + +\begin{code} +getTopEnv :: TcRnIf gbl lcl HscEnv +getTopEnv = do { env <- getEnv; return (env_top env) } + +getGblEnv :: TcRnIf gbl lcl gbl +getGblEnv = do { env <- getEnv; return (env_gbl env) } + +updGblEnv :: (gbl -> gbl) -> TcRnIf gbl lcl a -> TcRnIf gbl lcl a +updGblEnv upd = updEnv (\ env@(Env { env_gbl = gbl }) -> + env { env_gbl = upd gbl }) + +setGblEnv :: gbl -> TcRnIf gbl lcl a -> TcRnIf gbl lcl a +setGblEnv gbl_env = updEnv (\ env -> env { env_gbl = gbl_env }) + +getLclEnv :: TcRnIf gbl lcl lcl +getLclEnv = do { env <- getEnv; return (env_lcl env) } + +updLclEnv :: (lcl -> lcl) -> TcRnIf gbl lcl a -> TcRnIf gbl lcl a +updLclEnv upd = updEnv (\ env@(Env { env_lcl = lcl }) -> + env { env_lcl = upd lcl }) + +setLclEnv :: lcl' -> TcRnIf gbl lcl' a -> TcRnIf gbl lcl a +setLclEnv lcl_env = updEnv (\ env -> env { env_lcl = lcl_env }) + +getEnvs :: TcRnIf gbl lcl (gbl, lcl) +getEnvs = do { env <- getEnv; return (env_gbl env, env_lcl env) } + +setEnvs :: (gbl', lcl') -> TcRnIf gbl' lcl' a -> TcRnIf gbl lcl a +setEnvs (gbl_env, lcl_env) = updEnv (\ env -> env { env_gbl = gbl_env, env_lcl = lcl_env }) +\end{code} + + +Command-line flags + +\begin{code} +getDOpts :: TcRnIf gbl lcl DynFlags +getDOpts = do { env <- getTopEnv; return (hsc_dflags env) } + +doptM :: DynFlag -> TcRnIf gbl lcl Bool +doptM flag = do { dflags <- getDOpts; return (dopt flag dflags) } + +setOptM :: DynFlag -> TcRnIf gbl lcl a -> TcRnIf gbl lcl a +setOptM flag = updEnv (\ env@(Env { env_top = top }) -> + env { env_top = top { hsc_dflags = 
dopt_set (hsc_dflags top) flag}} ) + +ifOptM :: DynFlag -> TcRnIf gbl lcl () -> TcRnIf gbl lcl () -- Do it flag is true +ifOptM flag thing_inside = do { b <- doptM flag; + if b then thing_inside else return () } + +getGhcMode :: TcRnIf gbl lcl GhcMode +getGhcMode = do { env <- getTopEnv; return (ghcMode (hsc_dflags env)) } +\end{code} + +\begin{code} +getEpsVar :: TcRnIf gbl lcl (TcRef ExternalPackageState) +getEpsVar = do { env <- getTopEnv; return (hsc_EPS env) } + +getEps :: TcRnIf gbl lcl ExternalPackageState +getEps = do { env <- getTopEnv; readMutVar (hsc_EPS env) } + +-- Updating the EPS. This should be an atomic operation. +-- Note the delicate 'seq' which forces the EPS before putting it in the +-- variable. Otherwise what happens is that we get +-- write eps_var (....(unsafeRead eps_var)....) +-- and if the .... is strict, that's obviously bottom. By forcing it beforehand +-- we make the unsafeRead happen before we update the variable. + +updateEps :: (ExternalPackageState -> (ExternalPackageState, a)) + -> TcRnIf gbl lcl a +updateEps upd_fn = do { traceIf (text "updating EPS") + ; eps_var <- getEpsVar + ; eps <- readMutVar eps_var + ; let { (eps', val) = upd_fn eps } + ; seq eps' (writeMutVar eps_var eps') + ; return val } + +updateEps_ :: (ExternalPackageState -> ExternalPackageState) + -> TcRnIf gbl lcl () +updateEps_ upd_fn = do { traceIf (text "updating EPS_") + ; eps_var <- getEpsVar + ; eps <- readMutVar eps_var + ; let { eps' = upd_fn eps } + ; seq eps' (writeMutVar eps_var eps') } + +getHpt :: TcRnIf gbl lcl HomePackageTable +getHpt = do { env <- getTopEnv; return (hsc_HPT env) } + +getEpsAndHpt :: TcRnIf gbl lcl (ExternalPackageState, HomePackageTable) +getEpsAndHpt = do { env <- getTopEnv; eps <- readMutVar (hsc_EPS env) + ; return (eps, hsc_HPT env) } +\end{code} + +%************************************************************************ +%* * + Unique supply +%* * +%************************************************************************ + +\begin{code} +newUnique :: TcRnIf gbl lcl Unique +newUnique = do { us <- newUniqueSupply ; + return (uniqFromSupply us) } + +newUniqueSupply :: TcRnIf gbl lcl UniqSupply +newUniqueSupply + = do { env <- getEnv ; + let { u_var = env_us env } ; + us <- readMutVar u_var ; + let { (us1, us2) = splitUniqSupply us } ; + writeMutVar u_var us1 ; + return us2 } + +newLocalName :: Name -> TcRnIf gbl lcl Name +newLocalName name -- Make a clone + = newUnique `thenM` \ uniq -> + returnM (mkInternalName uniq (nameOccName name) (getSrcLoc name)) +\end{code} + + +%************************************************************************ +%* * + Debugging +%* * +%************************************************************************ + +\begin{code} +traceTc, traceRn :: SDoc -> TcRn () +traceRn = traceOptTcRn Opt_D_dump_rn_trace +traceTc = traceOptTcRn Opt_D_dump_tc_trace +traceSplice = traceOptTcRn Opt_D_dump_splices + + +traceIf :: SDoc -> TcRnIf m n () +traceIf = traceOptIf Opt_D_dump_if_trace +traceHiDiffs = traceOptIf Opt_D_dump_hi_diffs + + +traceOptIf :: DynFlag -> SDoc -> TcRnIf m n () -- No RdrEnv available, so qualify everything +traceOptIf flag doc = ifOptM flag $ + ioToIOEnv (printForUser stderr alwaysQualify doc) + +traceOptTcRn :: DynFlag -> SDoc -> TcRn () +traceOptTcRn flag doc = ifOptM flag $ do + { ctxt <- getErrCtxt + ; loc <- getSrcSpanM + ; env0 <- tcInitTidyEnv + ; ctxt_msgs <- do_ctxt env0 ctxt + ; let real_doc = mkLocMessage loc (vcat (doc : ctxt_to_use ctxt_msgs)) + ; dumpTcRn real_doc } + +dumpTcRn :: SDoc -> TcRn () 
+dumpTcRn doc = do { rdr_env <- getGlobalRdrEnv ; + ioToTcRn (printForUser stderr (unQualInScope rdr_env) doc) } + +dumpOptTcRn :: DynFlag -> SDoc -> TcRn () +dumpOptTcRn flag doc = ifOptM flag (dumpTcRn doc) +\end{code} + + +%************************************************************************ +%* * + Typechecker global environment +%* * +%************************************************************************ + +\begin{code} +getModule :: TcRn Module +getModule = do { env <- getGblEnv; return (tcg_mod env) } + +setModule :: Module -> TcRn a -> TcRn a +setModule mod thing_inside = updGblEnv (\env -> env { tcg_mod = mod }) thing_inside + +tcIsHsBoot :: TcRn Bool +tcIsHsBoot = do { env <- getGblEnv; return (isHsBoot (tcg_src env)) } + +getGlobalRdrEnv :: TcRn GlobalRdrEnv +getGlobalRdrEnv = do { env <- getGblEnv; return (tcg_rdr_env env) } + +getImports :: TcRn ImportAvails +getImports = do { env <- getGblEnv; return (tcg_imports env) } + +getFixityEnv :: TcRn FixityEnv +getFixityEnv = do { env <- getGblEnv; return (tcg_fix_env env) } + +extendFixityEnv :: [(Name,FixItem)] -> RnM a -> RnM a +extendFixityEnv new_bit + = updGblEnv (\env@(TcGblEnv { tcg_fix_env = old_fix_env }) -> + env {tcg_fix_env = extendNameEnvList old_fix_env new_bit}) + +getDefaultTys :: TcRn (Maybe [Type]) +getDefaultTys = do { env <- getGblEnv; return (tcg_default env) } +\end{code} + +%************************************************************************ +%* * + Error management +%* * +%************************************************************************ + +\begin{code} +getSrcSpanM :: TcRn SrcSpan + -- Avoid clash with Name.getSrcLoc +getSrcSpanM = do { env <- getLclEnv; return (tcl_loc env) } + +setSrcSpan :: SrcSpan -> TcRn a -> TcRn a +setSrcSpan loc thing_inside + | isGoodSrcSpan loc = updLclEnv (\env -> env { tcl_loc = loc }) thing_inside + | otherwise = thing_inside -- Don't overwrite useful info with useless + +addLocM :: (a -> TcM b) -> Located a -> TcM b +addLocM fn (L loc a) = setSrcSpan loc $ fn a + +wrapLocM :: (a -> TcM b) -> Located a -> TcM (Located b) +wrapLocM fn (L loc a) = setSrcSpan loc $ do b <- fn a; return (L loc b) + +wrapLocFstM :: (a -> TcM (b,c)) -> Located a -> TcM (Located b, c) +wrapLocFstM fn (L loc a) = + setSrcSpan loc $ do + (b,c) <- fn a + return (L loc b, c) + +wrapLocSndM :: (a -> TcM (b,c)) -> Located a -> TcM (b, Located c) +wrapLocSndM fn (L loc a) = + setSrcSpan loc $ do + (b,c) <- fn a + return (b, L loc c) +\end{code} + + +\begin{code} +getErrsVar :: TcRn (TcRef Messages) +getErrsVar = do { env <- getLclEnv; return (tcl_errs env) } + +setErrsVar :: TcRef Messages -> TcRn a -> TcRn a +setErrsVar v = updLclEnv (\ env -> env { tcl_errs = v }) + +addErr :: Message -> TcRn () +addErr msg = do { loc <- getSrcSpanM ; addErrAt loc msg } + +addLocErr :: Located e -> (e -> Message) -> TcRn () +addLocErr (L loc e) fn = addErrAt loc (fn e) + +addErrAt :: SrcSpan -> Message -> TcRn () +addErrAt loc msg = addLongErrAt loc msg empty + +addLongErrAt :: SrcSpan -> Message -> Message -> TcRn () +addLongErrAt loc msg extra + = do { traceTc (ptext SLIT("Adding error:") <+> (mkLocMessage loc (msg $$ extra))) ; + errs_var <- getErrsVar ; + rdr_env <- getGlobalRdrEnv ; + let { err = mkLongErrMsg loc (unQualInScope rdr_env) msg extra } ; + (warns, errs) <- readMutVar errs_var ; + writeMutVar errs_var (warns, errs `snocBag` err) } + +addErrs :: [(SrcSpan,Message)] -> TcRn () +addErrs msgs = mappM_ add msgs + where + add (loc,msg) = addErrAt loc msg + +addReport :: Message -> TcRn () 
+addReport msg = do loc <- getSrcSpanM; addReportAt loc msg + +addReportAt :: SrcSpan -> Message -> TcRn () +addReportAt loc msg + = do { errs_var <- getErrsVar ; + rdr_env <- getGlobalRdrEnv ; + let { warn = mkWarnMsg loc (unQualInScope rdr_env) msg } ; + (warns, errs) <- readMutVar errs_var ; + writeMutVar errs_var (warns `snocBag` warn, errs) } + +addWarn :: Message -> TcRn () +addWarn msg = addReport (ptext SLIT("Warning:") <+> msg) + +addWarnAt :: SrcSpan -> Message -> TcRn () +addWarnAt loc msg = addReportAt loc (ptext SLIT("Warning:") <+> msg) + +addLocWarn :: Located e -> (e -> Message) -> TcRn () +addLocWarn (L loc e) fn = addReportAt loc (fn e) + +checkErr :: Bool -> Message -> TcRn () +-- Add the error if the bool is False +checkErr ok msg = checkM ok (addErr msg) + +warnIf :: Bool -> Message -> TcRn () +warnIf True msg = addWarn msg +warnIf False msg = return () + +addMessages :: Messages -> TcRn () +addMessages (m_warns, m_errs) + = do { errs_var <- getErrsVar ; + (warns, errs) <- readMutVar errs_var ; + writeMutVar errs_var (warns `unionBags` m_warns, + errs `unionBags` m_errs) } + +discardWarnings :: TcRn a -> TcRn a +-- Ignore warnings inside the thing inside; +-- used to ignore-unused-variable warnings inside derived code +-- With -dppr-debug, the effects is switched off, so you can still see +-- what warnings derived code would give +discardWarnings thing_inside + | opt_PprStyle_Debug = thing_inside + | otherwise + = do { errs_var <- newMutVar emptyMessages + ; result <- setErrsVar errs_var thing_inside + ; (_warns, errs) <- readMutVar errs_var + ; addMessages (emptyBag, errs) + ; return result } +\end{code} + + +\begin{code} +try_m :: TcRn r -> TcRn (Either Exception r) +-- Does try_m, with a debug-trace on failure +try_m thing + = do { mb_r <- tryM thing ; + case mb_r of + Left exn -> do { traceTc (exn_msg exn); return mb_r } + Right r -> return mb_r } + where + exn_msg exn = text "tryTc/recoverM recovering from" <+> text (showException exn) + +----------------------- +recoverM :: TcRn r -- Recovery action; do this if the main one fails + -> TcRn r -- Main action: do this first + -> TcRn r +-- Errors in 'thing' are retained +recoverM recover thing + = do { mb_res <- try_m thing ; + case mb_res of + Left exn -> recover + Right res -> returnM res } + +----------------------- +tryTc :: TcRn a -> TcRn (Messages, Maybe a) +-- (tryTc m) executes m, and returns +-- Just r, if m succeeds (returning r) +-- Nothing, if m fails +-- It also returns all the errors and warnings accumulated by m +-- It always succeeds (never raises an exception) +tryTc m + = do { errs_var <- newMutVar emptyMessages ; + res <- try_m (setErrsVar errs_var m) ; + msgs <- readMutVar errs_var ; + return (msgs, case res of + Left exn -> Nothing + Right val -> Just val) + -- The exception is always the IOEnv built-in + -- in exception; see IOEnv.failM + } + +----------------------- +tryTcErrs :: TcRn a -> TcRn (Messages, Maybe a) +-- Run the thing, returning +-- Just r, if m succceeds with no error messages +-- Nothing, if m fails, or if it succeeds but has error messages +-- Either way, the messages are returned; even in the Just case +-- there might be warnings +tryTcErrs thing + = do { (msgs, res) <- tryTc thing + ; dflags <- getDOpts + ; let errs_found = errorsFound dflags msgs + ; return (msgs, case res of + Nothing -> Nothing + Just val | errs_found -> Nothing + | otherwise -> Just val) + } + +----------------------- +tryTcLIE :: TcM a -> TcM (Messages, Maybe a) +-- Just like tryTcErrs, except that it 
ensures that the LIE +-- for the thing is propagated only if there are no errors +-- Hence it's restricted to the type-check monad +tryTcLIE thing_inside + = do { ((msgs, mb_res), lie) <- getLIE (tryTcErrs thing_inside) ; + ; case mb_res of + Nothing -> return (msgs, Nothing) + Just val -> do { extendLIEs lie; return (msgs, Just val) } + } + +----------------------- +tryTcLIE_ :: TcM r -> TcM r -> TcM r +-- (tryTcLIE_ r m) tries m; +-- if m succeeds with no error messages, it's the answer +-- otherwise tryTcLIE_ drops everything from m and tries r instead. +tryTcLIE_ recover main + = do { (msgs, mb_res) <- tryTcLIE main + ; case mb_res of + Just val -> do { addMessages msgs -- There might be warnings + ; return val } + Nothing -> recover -- Discard all msgs + } + +----------------------- +checkNoErrs :: TcM r -> TcM r +-- (checkNoErrs m) succeeds iff m succeeds and generates no errors +-- If m fails then (checkNoErrsTc m) fails. +-- If m succeeds, it checks whether m generated any errors messages +-- (it might have recovered internally) +-- If so, it fails too. +-- Regardless, any errors generated by m are propagated to the enclosing context. +checkNoErrs main + = do { (msgs, mb_res) <- tryTcLIE main + ; addMessages msgs + ; case mb_res of + Nothing -> failM + Just val -> return val + } + +ifErrsM :: TcRn r -> TcRn r -> TcRn r +-- ifErrsM bale_out main +-- does 'bale_out' if there are errors in errors collection +-- otherwise does 'main' +ifErrsM bale_out normal + = do { errs_var <- getErrsVar ; + msgs <- readMutVar errs_var ; + dflags <- getDOpts ; + if errorsFound dflags msgs then + bale_out + else + normal } + +failIfErrsM :: TcRn () +-- Useful to avoid error cascades +failIfErrsM = ifErrsM failM (return ()) +\end{code} + + +%************************************************************************ +%* * + Context management and error message generation + for the type checker +%* * +%************************************************************************ + +\begin{code} +getErrCtxt :: TcM ErrCtxt +getErrCtxt = do { env <- getLclEnv; return (tcl_ctxt env) } + +setErrCtxt :: ErrCtxt -> TcM a -> TcM a +setErrCtxt ctxt = updLclEnv (\ env -> env { tcl_ctxt = ctxt }) + +addErrCtxt :: Message -> TcM a -> TcM a +addErrCtxt msg = addErrCtxtM (\env -> returnM (env, msg)) + +addErrCtxtM :: (TidyEnv -> TcM (TidyEnv, Message)) -> TcM a -> TcM a +addErrCtxtM msg = updCtxt (\ msgs -> msg : msgs) + +-- Helper function for the above +updCtxt :: (ErrCtxt -> ErrCtxt) -> TcM a -> TcM a +updCtxt upd = updLclEnv (\ env@(TcLclEnv { tcl_ctxt = ctxt }) -> + env { tcl_ctxt = upd ctxt }) + +-- Conditionally add an error context +maybeAddErrCtxt :: Maybe Message -> TcM a -> TcM a +maybeAddErrCtxt (Just msg) thing_inside = addErrCtxt msg thing_inside +maybeAddErrCtxt Nothing thing_inside = thing_inside + +popErrCtxt :: TcM a -> TcM a +popErrCtxt = updCtxt (\ msgs -> case msgs of { [] -> []; (m:ms) -> ms }) + +getInstLoc :: InstOrigin -> TcM InstLoc +getInstLoc origin + = do { loc <- getSrcSpanM ; env <- getLclEnv ; + return (InstLoc origin loc (tcl_ctxt env)) } + +addInstCtxt :: InstLoc -> TcM a -> TcM a +-- Add the SrcSpan and context from the first Inst in the list +-- (they all have similar locations) +addInstCtxt (InstLoc _ src_loc ctxt) thing_inside + = setSrcSpan src_loc (updCtxt (\ old_ctxt -> ctxt) thing_inside) +\end{code} + + The addErrTc functions add an error message, but do not cause failure. 
+ The 'M' variants pass a TidyEnv that has already been used to + tidy up the message; we then use it to tidy the context messages + +\begin{code} +addErrTc :: Message -> TcM () +addErrTc err_msg = do { env0 <- tcInitTidyEnv + ; addErrTcM (env0, err_msg) } + +addErrsTc :: [Message] -> TcM () +addErrsTc err_msgs = mappM_ addErrTc err_msgs + +addErrTcM :: (TidyEnv, Message) -> TcM () +addErrTcM (tidy_env, err_msg) + = do { ctxt <- getErrCtxt ; + loc <- getSrcSpanM ; + add_err_tcm tidy_env err_msg loc ctxt } +\end{code} + +The failWith functions add an error message and cause failure + +\begin{code} +failWithTc :: Message -> TcM a -- Add an error message and fail +failWithTc err_msg + = addErrTc err_msg >> failM + +failWithTcM :: (TidyEnv, Message) -> TcM a -- Add an error message and fail +failWithTcM local_and_msg + = addErrTcM local_and_msg >> failM + +checkTc :: Bool -> Message -> TcM () -- Check that the boolean is true +checkTc True err = returnM () +checkTc False err = failWithTc err +\end{code} + + Warnings have no 'M' variant, nor failure + +\begin{code} +addWarnTc :: Message -> TcM () +addWarnTc msg + = do { ctxt <- getErrCtxt ; + env0 <- tcInitTidyEnv ; + ctxt_msgs <- do_ctxt env0 ctxt ; + addWarn (vcat (msg : ctxt_to_use ctxt_msgs)) } + +warnTc :: Bool -> Message -> TcM () +warnTc warn_if_true warn_msg + | warn_if_true = addWarnTc warn_msg + | otherwise = return () +\end{code} + +----------------------------------- + Tidying + +We initialise the "tidy-env", used for tidying types before printing, +by building a reverse map from the in-scope type variables to the +OccName that the programmer originally used for them + +\begin{code} +tcInitTidyEnv :: TcM TidyEnv +tcInitTidyEnv + = do { lcl_env <- getLclEnv + ; let nm_tv_prs = [ (name, tcGetTyVar "tcInitTidyEnv" ty) + | ATyVar name ty <- nameEnvElts (tcl_env lcl_env) + , tcIsTyVarTy ty ] + ; return (foldl add emptyTidyEnv nm_tv_prs) } + where + add (env,subst) (name, tyvar) + = case tidyOccName env (nameOccName name) of + (env', occ') -> (env', extendVarEnv subst tyvar tyvar') + where + tyvar' = setTyVarName tyvar name' + name' = tidyNameOcc name occ' +\end{code} + +----------------------------------- + Other helper functions + +\begin{code} +add_err_tcm tidy_env err_msg loc ctxt + = do { ctxt_msgs <- do_ctxt tidy_env ctxt ; + addLongErrAt loc err_msg (vcat (ctxt_to_use ctxt_msgs)) } + +do_ctxt tidy_env [] + = return [] +do_ctxt tidy_env (c:cs) + = do { (tidy_env', m) <- c tidy_env ; + ms <- do_ctxt tidy_env' cs ; + return (m:ms) } + +ctxt_to_use ctxt | opt_PprStyle_Debug = ctxt + | otherwise = take 3 ctxt +\end{code} + +debugTc is useful for monadic debugging code + +\begin{code} +debugTc :: TcM () -> TcM () +#ifdef DEBUG +debugTc thing = thing +#else +debugTc thing = return () +#endif +\end{code} + + %************************************************************************ +%* * + Type constraints (the so-called LIE) +%* * +%************************************************************************ + +\begin{code} +nextDFunIndex :: TcM Int -- Get the next dfun index +nextDFunIndex = do { env <- getGblEnv + ; let dfun_n_var = tcg_dfun_n env + ; n <- readMutVar dfun_n_var + ; writeMutVar dfun_n_var (n+1) + ; return n } + +getLIEVar :: TcM (TcRef LIE) +getLIEVar = do { env <- getLclEnv; return (tcl_lie env) } + +setLIEVar :: TcRef LIE -> TcM a -> TcM a +setLIEVar lie_var = updLclEnv (\ env -> env { tcl_lie = lie_var }) + +getLIE :: TcM a -> TcM (a, [Inst]) +-- (getLIE m) runs m, and returns the type constraints it generates +getLIE 
thing_inside + = do { lie_var <- newMutVar emptyLIE ; + res <- updLclEnv (\ env -> env { tcl_lie = lie_var }) + thing_inside ; + lie <- readMutVar lie_var ; + return (res, lieToList lie) } + +extendLIE :: Inst -> TcM () +extendLIE inst + = do { lie_var <- getLIEVar ; + lie <- readMutVar lie_var ; + writeMutVar lie_var (inst `consLIE` lie) } + +extendLIEs :: [Inst] -> TcM () +extendLIEs [] + = returnM () +extendLIEs insts + = do { lie_var <- getLIEVar ; + lie <- readMutVar lie_var ; + writeMutVar lie_var (mkLIE insts `plusLIE` lie) } +\end{code} + +\begin{code} +setLclTypeEnv :: TcLclEnv -> TcM a -> TcM a +-- Set the local type envt, but do *not* disturb other fields, +-- notably the lie_var +setLclTypeEnv lcl_env thing_inside + = updLclEnv upd thing_inside + where + upd env = env { tcl_env = tcl_env lcl_env, + tcl_tyvars = tcl_tyvars lcl_env } +\end{code} + + +%************************************************************************ +%* * + Template Haskell context +%* * +%************************************************************************ + +\begin{code} +recordThUse :: TcM () +recordThUse = do { env <- getGblEnv; writeMutVar (tcg_th_used env) True } + +keepAliveTc :: Name -> TcM () -- Record the name in the keep-alive set +keepAliveTc n = do { env <- getGblEnv; + ; updMutVar (tcg_keep env) (`addOneToNameSet` n) } + +keepAliveSetTc :: NameSet -> TcM () -- Record the name in the keep-alive set +keepAliveSetTc ns = do { env <- getGblEnv; + ; updMutVar (tcg_keep env) (`unionNameSets` ns) } + +getStage :: TcM ThStage +getStage = do { env <- getLclEnv; return (tcl_th_ctxt env) } + +setStage :: ThStage -> TcM a -> TcM a +setStage s = updLclEnv (\ env -> env { tcl_th_ctxt = s }) +\end{code} + + +%************************************************************************ +%* * + Stuff for the renamer's local env +%* * +%************************************************************************ + +\begin{code} +getLocalRdrEnv :: RnM LocalRdrEnv +getLocalRdrEnv = do { env <- getLclEnv; return (tcl_rdr env) } + +setLocalRdrEnv :: LocalRdrEnv -> RnM a -> RnM a +setLocalRdrEnv rdr_env thing_inside + = updLclEnv (\env -> env {tcl_rdr = rdr_env}) thing_inside +\end{code} + + +%************************************************************************ +%* * + Stuff for interface decls +%* * +%************************************************************************ + +\begin{code} +mkIfLclEnv :: Module -> SDoc -> IfLclEnv +mkIfLclEnv mod loc = IfLclEnv { if_mod = mod, + if_loc = loc, + if_tv_env = emptyOccEnv, + if_id_env = emptyOccEnv } + +initIfaceTcRn :: IfG a -> TcRn a +initIfaceTcRn thing_inside + = do { tcg_env <- getGblEnv + ; let { if_env = IfGblEnv { if_rec_types = Just (tcg_mod tcg_env, get_type_env) } + ; get_type_env = readMutVar (tcg_type_env_var tcg_env) } + ; setEnvs (if_env, ()) thing_inside } + +initIfaceExtCore :: IfL a -> TcRn a +initIfaceExtCore thing_inside + = do { tcg_env <- getGblEnv + ; let { mod = tcg_mod tcg_env + ; doc = ptext SLIT("External Core file for") <+> quotes (ppr mod) + ; if_env = IfGblEnv { + if_rec_types = Just (mod, return (tcg_type_env tcg_env)) } + ; if_lenv = mkIfLclEnv mod doc + } + ; setEnvs (if_env, if_lenv) thing_inside } + +initIfaceCheck :: HscEnv -> IfG a -> IO a +-- Used when checking the up-to-date-ness of the old Iface +-- Initialise the environment with no useful info at all +initIfaceCheck hsc_env do_this + = do { let gbl_env = IfGblEnv { if_rec_types = Nothing } + ; initTcRnIf 'i' hsc_env gbl_env () do_this + } + +initIfaceTc :: ModIface + -> (TcRef 
TypeEnv -> IfL a) -> TcRnIf gbl lcl a +-- Used when type-checking checking an up-to-date interface file +-- No type envt from the current module, but we do know the module dependencies +initIfaceTc iface do_this + = do { tc_env_var <- newMutVar emptyTypeEnv + ; let { gbl_env = IfGblEnv { if_rec_types = Just (mod, readMutVar tc_env_var) } ; + ; if_lenv = mkIfLclEnv mod doc + } + ; setEnvs (gbl_env, if_lenv) (do_this tc_env_var) + } + where + mod = mi_module iface + doc = ptext SLIT("The interface for") <+> quotes (ppr mod) + +initIfaceRules :: HscEnv -> ModGuts -> IfG a -> IO a +-- Used when sucking in new Rules in SimplCore +-- We have available the type envt of the module being compiled, and we must use it +initIfaceRules hsc_env guts do_this + = do { let { + type_info = (mg_module guts, return (mg_types guts)) + ; gbl_env = IfGblEnv { if_rec_types = Just type_info } ; + } + + -- Run the thing; any exceptions just bubble out from here + ; initTcRnIf 'i' hsc_env gbl_env () do_this + } + +initIfaceLcl :: Module -> SDoc -> IfL a -> IfM lcl a +initIfaceLcl mod loc_doc thing_inside + = setLclEnv (mkIfLclEnv mod loc_doc) thing_inside + +getIfModule :: IfL Module +getIfModule = do { env <- getLclEnv; return (if_mod env) } + +-------------------- +failIfM :: Message -> IfL a +-- The Iface monad doesn't have a place to accumulate errors, so we +-- just fall over fast if one happens; it "shouldnt happen". +-- We use IfL here so that we can get context info out of the local env +failIfM msg + = do { env <- getLclEnv + ; let full_msg = (if_loc env <> colon) $$ nest 2 msg + ; ioToIOEnv (printErrs (full_msg defaultErrStyle)) + ; failM } + +-------------------- +forkM_maybe :: SDoc -> IfL a -> IfL (Maybe a) +-- Run thing_inside in an interleaved thread. +-- It shares everything with the parent thread, so this is DANGEROUS. +-- +-- It returns Nothing if the computation fails +-- +-- It's used for lazily type-checking interface +-- signatures, which is pretty benign + +forkM_maybe doc thing_inside + = do { unsafeInterleaveM $ + do { traceIf (text "Starting fork {" <+> doc) + ; mb_res <- tryM thing_inside ; + case mb_res of + Right r -> do { traceIf (text "} ending fork" <+> doc) + ; return (Just r) } + Left exn -> do { + + -- Bleat about errors in the forked thread, if -ddump-if-trace is on + -- Otherwise we silently discard errors. 
Errors can legitimately + -- happen when compiling interface signatures (see tcInterfaceSigs) + ifOptM Opt_D_dump_if_trace + (print_errs (hang (text "forkM failed:" <+> doc) + 4 (text (show exn)))) + + ; traceIf (text "} ending fork (badly)" <+> doc) + ; return Nothing } + }} + where + print_errs sdoc = ioToIOEnv (printErrs (sdoc defaultErrStyle)) + +forkM :: SDoc -> IfL a -> IfL a +forkM doc thing_inside + = do { mb_res <- forkM_maybe doc thing_inside + ; return (case mb_res of + Nothing -> pgmError "Cannot continue after interface file error" + -- pprPanic "forkM" doc + Just r -> r) } +\end{code} + + diff --git a/compiler/typecheck/TcRnTypes.lhs b/compiler/typecheck/TcRnTypes.lhs new file mode 100644 index 0000000000..62281b56a1 --- /dev/null +++ b/compiler/typecheck/TcRnTypes.lhs @@ -0,0 +1,818 @@ +% +% (c) The GRASP Project, Glasgow University, 1992-2002 +% +\begin{code} +module TcRnTypes( + TcRnIf, TcRn, TcM, RnM, IfM, IfL, IfG, -- The monad is opaque outside this module + TcRef, + + -- The environment types + Env(..), + TcGblEnv(..), TcLclEnv(..), + IfGblEnv(..), IfLclEnv(..), + + -- Ranamer types + ErrCtxt, + ImportAvails(..), emptyImportAvails, plusImportAvails, + plusAvail, pruneAvails, + AvailEnv, emptyAvailEnv, unitAvailEnv, plusAvailEnv, + mkAvailEnv, lookupAvailEnv, lookupAvailEnv_maybe, availEnvElts, addAvail, + WhereFrom(..), mkModDeps, + + -- Typechecker types + TcTyThing(..), pprTcTyThingCategory, + GadtRefinement, + + -- Template Haskell + ThStage(..), topStage, topSpliceStage, + ThLevel, impLevel, topLevel, + + -- Arrows + ArrowCtxt(NoArrowCtxt), newArrowScope, escapeArrowScope, + + -- Insts + Inst(..), InstOrigin(..), InstLoc(..), pprInstLoc, + instLocSrcLoc, instLocSrcSpan, + LIE, emptyLIE, unitLIE, plusLIE, consLIE, + plusLIEs, mkLIE, isEmptyLIE, lieToList, listToLIE, + + -- Misc other types + TcId, TcIdSet, TcDictBinds + ) where + +#include "HsVersions.h" + +import HsSyn ( PendingSplice, HsOverLit, LRuleDecl, LForeignDecl, + ArithSeqInfo, DictBinds, LHsBinds, LImportDecl, HsGroup, + IE ) +import HscTypes ( FixityEnv, + HscEnv, TypeEnv, TyThing, + GenAvailInfo(..), AvailInfo, HscSource(..), + availName, IsBootInterface, Deprecations ) +import Packages ( PackageId, HomeModules ) +import Type ( Type, pprTyThingCategory ) +import TcType ( TcTyVarSet, TcType, TcThetaType, SkolemInfo, TvSubst, + TcPredType, TcKind, tcCmpPred, tcCmpType, tcCmpTypes, pprSkolInfo ) +import InstEnv ( Instance, InstEnv ) +import IOEnv +import RdrName ( GlobalRdrEnv, LocalRdrEnv ) +import Name ( Name ) +import NameEnv +import NameSet ( NameSet, unionNameSets, DefUses ) +import OccName ( OccEnv ) +import Var ( Id, TyVar ) +import VarEnv ( TidyEnv ) +import Module +import SrcLoc ( SrcSpan, SrcLoc, Located, srcSpanStart ) +import VarSet ( IdSet ) +import ErrUtils ( Messages, Message ) +import UniqSupply ( UniqSupply ) +import BasicTypes ( IPName ) +import Util ( thenCmp ) +import Bag +import Outputable +import Maybe ( mapMaybe ) +import ListSetOps ( unionLists ) +\end{code} + + +%************************************************************************ +%* * + Standard monad definition for TcRn + All the combinators for the monad can be found in TcRnMonad +%* * +%************************************************************************ + +The monad itself has to be defined here, because it is mentioned by ErrCtxt + +\begin{code} +type TcRef a = IORef a +type TcId = Id -- Type may be a TcType +type TcIdSet = IdSet +type TcDictBinds = DictBinds TcId -- Bag of dictionary bindings + + + +type TcRnIf a b 
c = IOEnv (Env a b) c +type IfM lcl a = TcRnIf IfGblEnv lcl a -- Iface stuff +type IfG a = IfM () a -- Top level +type IfL a = IfM IfLclEnv a -- Nested +type TcRn a = TcRnIf TcGblEnv TcLclEnv a +type RnM a = TcRn a -- Historical +type TcM a = TcRn a -- Historical +\end{code} + + +%************************************************************************ +%* * + The main environment types +%* * +%************************************************************************ + +\begin{code} +data Env gbl lcl -- Changes as we move into an expression + = Env { + env_top :: HscEnv, -- Top-level stuff that never changes + -- Includes all info about imported things + + env_us :: TcRef UniqSupply, -- Unique supply for local varibles + + env_gbl :: gbl, -- Info about things defined at the top level + -- of the module being compiled + + env_lcl :: lcl -- Nested stuff; changes as we go into + -- an expression + } + +-- TcGblEnv describes the top-level of the module at the +-- point at which the typechecker is finished work. +-- It is this structure that is handed on to the desugarer + +data TcGblEnv + = TcGblEnv { + tcg_mod :: Module, -- Module being compiled + tcg_src :: HscSource, -- What kind of module + -- (regular Haskell, hs-boot, ext-core) + + tcg_rdr_env :: GlobalRdrEnv, -- Top level envt; used during renaming + tcg_default :: Maybe [Type], -- Types used for defaulting + -- Nothing => no 'default' decl + + tcg_fix_env :: FixityEnv, -- Just for things in this module + + tcg_type_env :: TypeEnv, -- Global type env for the module we are compiling now + -- All TyCons and Classes (for this module) end up in here right away, + -- along with their derived constructors, selectors. + -- + -- (Ids defined in this module start in the local envt, + -- though they move to the global envt during zonking) + + tcg_type_env_var :: TcRef TypeEnv, + -- Used only to initialise the interface-file + -- typechecker in initIfaceTcRn, so that it can see stuff + -- bound in this module when dealing with hi-boot recursions + -- Updated at intervals (e.g. after dealing with types and classes) + + tcg_inst_env :: InstEnv, -- Instance envt for *home-package* modules + -- Includes the dfuns in tcg_insts + -- Now a bunch of things about this module that are simply + -- accumulated, but never consulted until the end. + -- Nevertheless, it's convenient to accumulate them along + -- with the rest of the info from this module. + tcg_exports :: NameSet, -- What is exported + tcg_imports :: ImportAvails, -- Information about what was imported + -- from where, including things bound + -- in this module + + tcg_home_mods :: HomeModules, + -- Calculated from ImportAvails, allows us to + -- call Packages.isHomeModule + + tcg_dus :: DefUses, -- What is defined in this module and what is used. + -- The latter is used to generate + -- (a) version tracking; no need to recompile if these + -- things have not changed version stamp + -- (b) unused-import info + + tcg_keep :: TcRef NameSet, -- Locally-defined top-level names to keep alive + -- "Keep alive" means give them an Exported flag, so + -- that the simplifier does not discard them as dead + -- code, and so that they are exposed in the interface file + -- (but not to export to the user). + -- + -- Some things, like dict-fun Ids and default-method Ids are + -- "born" with the Exported flag on, for exactly the above reason, + -- but some we only discover as we go. 
Specifically: + -- * The to/from functions for generic data types + -- * Top-level variables appearing free in the RHS of an orphan rule + -- * Top-level variables appearing free in a TH bracket + + tcg_inst_uses :: TcRef NameSet, -- Home-package Dfuns actually used + -- Used to generate version dependencies + -- This records usages, rather like tcg_dus, but it has to + -- be a mutable variable so it can be augmented + -- when we look up an instance. These uses of dfuns are + -- rather like the free variables of the program, but + -- are implicit instead of explicit. + + tcg_th_used :: TcRef Bool, -- True <=> Template Haskell syntax used + -- We need this so that we can generate a dependency on the + -- Template Haskell package, becuase the desugarer is going to + -- emit loads of references to TH symbols. It's rather like + -- tcg_inst_uses; the reference is implicit rather than explicit, + -- so we have to zap a mutable variable. + + tcg_dfun_n :: TcRef Int, -- Allows us to number off the names of DFuns + -- It's convenient to allocate an External Name for a DFun, with + -- a permanently-fixed unique, just like other top-level functions + -- defined in this module. But that means we need a canonical + -- occurrence name, distinct from all other dfuns in this module, + -- and this name supply serves that purpose (df1, df2, etc). + + -- The next fields accumulate the payload of the module + -- The binds, rules and foreign-decl fiels are collected + -- initially in un-zonked form and are finally zonked in tcRnSrcDecls + + -- The next fields accumulate the payload of the + -- module The binds, rules and foreign-decl fiels are + -- collected initially in un-zonked form and are + -- finally zonked in tcRnSrcDecls + + tcg_rn_imports :: Maybe [LImportDecl Name], + tcg_rn_exports :: Maybe [Located (IE Name)], + tcg_rn_decls :: Maybe (HsGroup Name), -- renamed decls, maybe + -- Nothing <=> Don't retain renamed decls + + tcg_binds :: LHsBinds Id, -- Value bindings in this module + tcg_deprecs :: Deprecations, -- ...Deprecations + tcg_insts :: [Instance], -- ...Instances + tcg_rules :: [LRuleDecl Id], -- ...Rules + tcg_fords :: [LForeignDecl Id] -- ...Foreign import & exports + } +\end{code} + +%************************************************************************ +%* * + The interface environments + Used when dealing with IfaceDecls +%* * +%************************************************************************ + +\begin{code} +data IfGblEnv + = IfGblEnv { + -- The type environment for the module being compiled, + -- in case the interface refers back to it via a reference that + -- was originally a hi-boot file. + -- We need the module name so we can test when it's appropriate + -- to look in this env. + if_rec_types :: Maybe (Module, IfG TypeEnv) + -- Allows a read effect, so it can be in a mutable + -- variable; c.f. 
handling the external package type env + -- Nothing => interactive stuff, no loops possible + } + +data IfLclEnv + = IfLclEnv { + -- The module for the current IfaceDecl + -- So if we see f = \x -> x + -- it means M.f = \x -> x, where M is the if_mod + if_mod :: Module, + + -- The field is used only for error reporting + -- if (say) there's a Lint error in it + if_loc :: SDoc, + -- Where the interface came from: + -- .hi file, or GHCi state, or ext core + -- plus which bit is currently being examined + + if_tv_env :: OccEnv TyVar, -- Nested tyvar bindings + if_id_env :: OccEnv Id -- Nested id binding + } +\end{code} + + +%************************************************************************ +%* * + The local typechecker environment +%* * +%************************************************************************ + +The Global-Env/Local-Env story +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +During type checking, we keep in the tcg_type_env + * All types and classes + * All Ids derived from types and classes (constructors, selectors) + +At the end of type checking, we zonk the local bindings, +and as we do so we add to the tcg_type_env + * Locally defined top-level Ids + +Why? Because they are now Ids not TcIds. This final GlobalEnv is + a) fed back (via the knot) to typechecking the + unfoldings of interface signatures + b) used in the ModDetails of this module + +\begin{code} +data TcLclEnv -- Changes as we move inside an expression + -- Discarded after typecheck/rename; not passed on to desugarer + = TcLclEnv { + tcl_loc :: SrcSpan, -- Source span + tcl_ctxt :: ErrCtxt, -- Error context + tcl_errs :: TcRef Messages, -- Place to accumulate errors + + tcl_th_ctxt :: ThStage, -- Template Haskell context + tcl_arrow_ctxt :: ArrowCtxt, -- Arrow-notation context + + tcl_rdr :: LocalRdrEnv, -- Local name envt + -- Maintained during renaming, of course, but also during + -- type checking, solely so that when renaming a Template-Haskell + -- splice we have the right environment for the renamer. + -- + -- Does *not* include global name envt; may shadow it + -- Includes both ordinary variables and type variables; + -- they are kept distinct because tyvar have a different + -- occurrence contructor (Name.TvOcc) + -- We still need the unsullied global name env so that + -- we can look up record field names + + tcl_env :: NameEnv TcTyThing, -- The local type environment: Ids and TyVars + -- defined in this module + + tcl_tyvars :: TcRef TcTyVarSet, -- The "global tyvars" + -- Namely, the in-scope TyVars bound in tcl_env, + -- plus the tyvars mentioned in the types of Ids bound in tcl_lenv + -- Why mutable? see notes with tcGetGlobalTyVars + + tcl_lie :: TcRef LIE -- Place to accumulate type constraints + } + +type GadtRefinement = TvSubst + +{- Note [Given Insts] + ~~~~~~~~~~~~~~~~~~ +Because of GADTs, we have to pass inwards the Insts provided by type signatures +and existential contexts. Consider + data T a where { T1 :: b -> b -> T [b] } + f :: Eq a => T a -> Bool + f (T1 x y) = [x]==[y] + +The constructor T1 binds an existential variable 'b', and we need Eq [b]. +Well, we have it, because Eq a refines to Eq [b], but we can only spot that if we +pass it inwards. 
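+(Within this module the refinement is represented by the GadtRefinement
+synonym above, just a TvSubst, and the Bool field of ATcId below records
+whether that refinement should be applied to a particular local Id.)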
+ +-} + +--------------------------- +-- Template Haskell levels +--------------------------- + +type ThLevel = Int + -- Indicates how many levels of brackets we are inside + -- (always >= 0) + -- Incremented when going inside a bracket, + -- decremented when going inside a splice + +impLevel, topLevel :: ThLevel +topLevel = 1 -- Things defined at top level of this module +impLevel = 0 -- Imported things; they can be used inside a top level splice +-- +-- For example: +-- f = ... +-- g1 = $(map ...) is OK +-- g2 = $(f ...) is not OK; because we havn't compiled f yet + + +data ThStage + = Comp -- Ordinary compiling, at level topLevel + | Splice ThLevel -- Inside a splice + | Brack ThLevel -- Inside brackets; + (TcRef [PendingSplice]) -- accumulate pending splices here + (TcRef LIE) -- and type constraints here +topStage, topSpliceStage :: ThStage +topStage = Comp +topSpliceStage = Splice (topLevel - 1) -- Stage for the body of a top-level splice + +--------------------------- +-- Arrow-notation context +--------------------------- + +{- +In arrow notation, a variable bound by a proc (or enclosed let/kappa) +is not in scope to the left of an arrow tail (-<) or the head of (|..|). +For example + + proc x -> (e1 -< e2) + +Here, x is not in scope in e1, but it is in scope in e2. This can get +a bit complicated: + + let x = 3 in + proc y -> (proc z -> e1) -< e2 + +Here, x and z are in scope in e1, but y is not. We implement this by +recording the environment when passing a proc (using newArrowScope), +and returning to that (using escapeArrowScope) on the left of -< and the +head of (|..|). +-} + +data ArrowCtxt + = NoArrowCtxt + | ArrowCtxt (Env TcGblEnv TcLclEnv) + +-- Record the current environment (outside a proc) +newArrowScope :: TcM a -> TcM a +newArrowScope + = updEnv $ \env -> + env { env_lcl = (env_lcl env) { tcl_arrow_ctxt = ArrowCtxt env } } + +-- Return to the stored environment (from the enclosing proc) +escapeArrowScope :: TcM a -> TcM a +escapeArrowScope + = updEnv $ \ env -> case tcl_arrow_ctxt (env_lcl env) of + NoArrowCtxt -> env + ArrowCtxt env' -> env' + +--------------------------- +-- TcTyThing +--------------------------- + +data TcTyThing + = AGlobal TyThing -- Used only in the return type of a lookup + + | ATcId TcId -- Ids defined in this module; may not be fully zonked + ThLevel + Bool -- True <=> apply the type refinement to me + + | ATyVar Name TcType -- The type to which the lexically scoped type vaiable + -- is currently refined. We only need the Name + -- for error-message purposes + + | AThing TcKind -- Used temporarily, during kind checking, for the + -- tycons and clases in this recursive group + +instance Outputable TcTyThing where -- Debugging only + ppr (AGlobal g) = ppr g + ppr (ATcId g tl rig) = text "Identifier" <> + ifPprDebug (brackets (ppr g <> comma <> ppr tl <+> ppr rig)) + ppr (ATyVar tv _) = text "Type variable" <+> quotes (ppr tv) + ppr (AThing k) = text "AThing" <+> ppr k + +pprTcTyThingCategory :: TcTyThing -> SDoc +pprTcTyThingCategory (AGlobal thing) = pprTyThingCategory thing +pprTcTyThingCategory (ATyVar {}) = ptext SLIT("Type variable") +pprTcTyThingCategory (ATcId {}) = ptext SLIT("Local identifier") +pprTcTyThingCategory (AThing {}) = ptext SLIT("Kinded thing") +\end{code} + +\begin{code} +type ErrCtxt = [TidyEnv -> TcM (TidyEnv, Message)] + -- Innermost first. 
Monadic so that we have a chance + -- to deal with bound type variables just before error + -- message construction +\end{code} + + +%************************************************************************ +%* * + Operations over ImportAvails +%* * +%************************************************************************ + +ImportAvails summarises what was imported from where, irrespective +of whether the imported things are actually used or not +It is used * when processing the export list + * when constructing usage info for the inteface file + * to identify the list of directly imported modules + for initialisation purposes + * when figuring out what things are really unused + +\begin{code} +data ImportAvails + = ImportAvails { + imp_env :: ModuleEnv NameSet, + -- All the things imported, classified by + -- the *module qualifier* for its import + -- e.g. import List as Foo + -- would add a binding Foo |-> ...stuff from List... + -- to imp_env. + -- + -- We need to classify them like this so that we can figure out + -- "module M" export specifiers in an export list + -- (see 1.4 Report Section 5.1.1). Ultimately, we want to find + -- everything that is unambiguously in scope as 'M.x' + -- and where plain 'x' is (perhaps ambiguously) in scope. + -- So the starting point is all things that are in scope as 'M.x', + -- which is what this field tells us. + + imp_mods :: ModuleEnv (Module, Bool, SrcSpan), + -- Domain is all directly-imported modules + -- Bool means: + -- True => import was "import Foo ()" + -- False => import was some other form + -- + -- We need the Module in the range because we can't get + -- the keys of a ModuleEnv + -- Used + -- (a) to help construct the usage information in + -- the interface file; if we import somethign we + -- need to recompile if the export version changes + -- (b) to specify what child modules to initialise + + imp_dep_mods :: ModuleEnv (Module, IsBootInterface), + -- Home-package modules needed by the module being compiled + -- + -- It doesn't matter whether any of these dependencies + -- are actually *used* when compiling the module; they + -- are listed if they are below it at all. For + -- example, suppose M imports A which imports X. Then + -- compiling M might not need to consult X.hi, but X + -- is still listed in M's dependencies. + + imp_dep_pkgs :: [PackageId], + -- Packages needed by the module being compiled, whether + -- directly, or via other modules in this package, or via + -- modules imported from other packages. 
+ + imp_orphs :: [Module] + -- Orphan modules below us in the import tree + } + +mkModDeps :: [(Module, IsBootInterface)] + -> ModuleEnv (Module, IsBootInterface) +mkModDeps deps = foldl add emptyModuleEnv deps + where + add env elt@(m,_) = extendModuleEnv env m elt + +emptyImportAvails :: ImportAvails +emptyImportAvails = ImportAvails { imp_env = emptyModuleEnv, + imp_mods = emptyModuleEnv, + imp_dep_mods = emptyModuleEnv, + imp_dep_pkgs = [], + imp_orphs = [] } + +plusImportAvails :: ImportAvails -> ImportAvails -> ImportAvails +plusImportAvails + (ImportAvails { imp_env = env1, imp_mods = mods1, + imp_dep_mods = dmods1, imp_dep_pkgs = dpkgs1, imp_orphs = orphs1 }) + (ImportAvails { imp_env = env2, imp_mods = mods2, + imp_dep_mods = dmods2, imp_dep_pkgs = dpkgs2, imp_orphs = orphs2 }) + = ImportAvails { imp_env = plusModuleEnv_C unionNameSets env1 env2, + imp_mods = mods1 `plusModuleEnv` mods2, + imp_dep_mods = plusModuleEnv_C plus_mod_dep dmods1 dmods2, + imp_dep_pkgs = dpkgs1 `unionLists` dpkgs2, + imp_orphs = orphs1 `unionLists` orphs2 } + where + plus_mod_dep (m1, boot1) (m2, boot2) + = WARN( not (m1 == m2), (ppr m1 <+> ppr m2) $$ (ppr boot1 <+> ppr boot2) ) + -- Check mod-names match + (m1, boot1 && boot2) -- If either side can "see" a non-hi-boot interface, use that +\end{code} + +%************************************************************************ +%* * + Avails, AvailEnv, etc +%* * +v%************************************************************************ + +\begin{code} +plusAvail (Avail n1) (Avail n2) = Avail n1 +plusAvail (AvailTC n1 ns1) (AvailTC n2 ns2) = AvailTC n2 (ns1 `unionLists` ns2) +-- Added SOF 4/97 +#ifdef DEBUG +plusAvail a1 a2 = pprPanic "RnEnv.plusAvail" (hsep [ppr a1,ppr a2]) +#endif + +------------------------- +pruneAvails :: (Name -> Bool) -- Keep if this is True + -> [AvailInfo] + -> [AvailInfo] +pruneAvails keep avails + = mapMaybe del avails + where + del :: AvailInfo -> Maybe AvailInfo -- Nothing => nothing left! 
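+    -- e.g. if 'keep' keeps only 'index' (cf. the Ix example in mkAvailEnv below):
+    --     del (AvailTC Ix [Ix,index,range]) = Just (AvailTC Ix [index])
+    --     del (Avail range)                 = Nothing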
+ del (Avail n) | keep n = Just (Avail n) + | otherwise = Nothing + del (AvailTC n ns) | null ns' = Nothing + | otherwise = Just (AvailTC n ns') + where + ns' = filter keep ns +\end{code} + +--------------------------------------- + AvailEnv and friends +--------------------------------------- + +\begin{code} +type AvailEnv = NameEnv AvailInfo -- Maps a Name to the AvailInfo that contains it + +emptyAvailEnv :: AvailEnv +emptyAvailEnv = emptyNameEnv + +unitAvailEnv :: AvailInfo -> AvailEnv +unitAvailEnv a = unitNameEnv (availName a) a + +plusAvailEnv :: AvailEnv -> AvailEnv -> AvailEnv +plusAvailEnv = plusNameEnv_C plusAvail + +lookupAvailEnv_maybe :: AvailEnv -> Name -> Maybe AvailInfo +lookupAvailEnv_maybe = lookupNameEnv + +lookupAvailEnv :: AvailEnv -> Name -> AvailInfo +lookupAvailEnv env n = case lookupNameEnv env n of + Just avail -> avail + Nothing -> pprPanic "lookupAvailEnv" (ppr n) + +availEnvElts = nameEnvElts + +addAvail :: AvailEnv -> AvailInfo -> AvailEnv +addAvail avails avail = extendNameEnv_C plusAvail avails (availName avail) avail + +mkAvailEnv :: [AvailInfo] -> AvailEnv + -- 'avails' may have several items with the same availName + -- E.g import Ix( Ix(..), index ) + -- will give Ix(Ix,index,range) and Ix(index) + -- We want to combine these; addAvail does that +mkAvailEnv avails = foldl addAvail emptyAvailEnv avails +\end{code} + +%************************************************************************ +%* * +\subsection{Where from} +%* * +%************************************************************************ + +The @WhereFrom@ type controls where the renamer looks for an interface file + +\begin{code} +data WhereFrom + = ImportByUser IsBootInterface -- Ordinary user import (perhaps {-# SOURCE #-}) + | ImportBySystem -- Non user import. + +instance Outputable WhereFrom where + ppr (ImportByUser is_boot) | is_boot = ptext SLIT("{- SOURCE -}") + | otherwise = empty + ppr ImportBySystem = ptext SLIT("{- SYSTEM -}") +\end{code} + + +%************************************************************************ +%* * +\subsection[Inst-types]{@Inst@ types} +%* * +v%************************************************************************ + +An @Inst@ is either a dictionary, an instance of an overloaded +literal, or an instance of an overloaded value. We call the latter a +``method'' even though it may not correspond to a class operation. +For example, we might have an instance of the @double@ function at +type Int, represented by + + Method 34 doubleId [Int] origin + +\begin{code} +data Inst + = Dict + Name + TcPredType + InstLoc + + | Method + Id + + TcId -- The overloaded function + -- This function will be a global, local, or ClassOpId; + -- inside instance decls (only) it can also be an InstId! + -- The id needn't be completely polymorphic. + -- You'll probably find its name (for documentation purposes) + -- inside the InstOrigin + + [TcType] -- The types to which its polymorphic tyvars + -- should be instantiated. + -- These types must saturate the Id's foralls. + + TcThetaType -- The (types of the) dictionaries to which the function + -- must be applied to get the method + + InstLoc + + -- INVARIANT 1: in (Method u f tys theta tau loc) + -- type of (f tys dicts(from theta)) = tau + + -- INVARIANT 2: tau must not be of form (Pred -> Tau) + -- Reason: two methods are considered equal if the + -- base Id matches, and the instantiating types + -- match. The TcThetaType should then match too. 
+ -- This only bites in the call to tcInstClassOp in TcClassDcl.mkMethodBind + + | LitInst + Name + (HsOverLit Name) -- The literal from the occurrence site + -- INVARIANT: never a rebindable-syntax literal + -- Reason: tcSyntaxName does unification, and we + -- don't want to deal with that during tcSimplify, + -- when resolving LitInsts + TcType -- The type at which the literal is used + InstLoc +\end{code} + +@Insts@ are ordered by their class/type info, rather than by their +unique. This allows the context-reduction mechanism to use standard finite +maps to do their stuff. + +\begin{code} +instance Ord Inst where + compare = cmpInst + +instance Eq Inst where + (==) i1 i2 = case i1 `cmpInst` i2 of + EQ -> True + other -> False + +cmpInst (Dict _ pred1 _) (Dict _ pred2 _) = pred1 `tcCmpPred` pred2 +cmpInst (Dict _ _ _) other = LT + +cmpInst (Method _ _ _ _ _) (Dict _ _ _) = GT +cmpInst (Method _ id1 tys1 _ _) (Method _ id2 tys2 _ _) = (id1 `compare` id2) `thenCmp` (tys1 `tcCmpTypes` tys2) +cmpInst (Method _ _ _ _ _) other = LT + +cmpInst (LitInst _ _ _ _) (Dict _ _ _) = GT +cmpInst (LitInst _ _ _ _) (Method _ _ _ _ _) = GT +cmpInst (LitInst _ lit1 ty1 _) (LitInst _ lit2 ty2 _) = (lit1 `compare` lit2) `thenCmp` (ty1 `tcCmpType` ty2) +\end{code} + + +%************************************************************************ +%* * +\subsection[Inst-collections]{LIE: a collection of Insts} +%* * +%************************************************************************ + +\begin{code} +-- FIXME: Rename this. It clashes with (Located (IE ...)) +type LIE = Bag Inst + +isEmptyLIE = isEmptyBag +emptyLIE = emptyBag +unitLIE inst = unitBag inst +mkLIE insts = listToBag insts +plusLIE lie1 lie2 = lie1 `unionBags` lie2 +consLIE inst lie = inst `consBag` lie +plusLIEs lies = unionManyBags lies +lieToList = bagToList +listToLIE = listToBag +\end{code} + + +%************************************************************************ +%* * +\subsection[Inst-origin]{The @InstOrigin@ type} +%* * +%************************************************************************ + +The @InstOrigin@ type gives information about where a dictionary came from. +This is important for decent error message reporting because dictionaries +don't appear in the original source code. Doubtless this type will evolve... + +It appears in TcMonad because there are a couple of error-message-generation +functions that deal with it. 
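+As a rough illustration (the exact wording comes from pprInstLoc in the code
+below; the file name and span here are invented), an occurrence site builds
+its location with getInstLoc (OccurrenceOf name) in TcRnMonad, and the
+resulting InstLoc is later rendered along the lines of
+
+    arising from use of `show' at Foo.hs:10:3-8
+
+with the identifier supplied by the InstOrigin and the position by the
+SrcSpan stored in the InstLoc.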
+ +\begin{code} +data InstLoc = InstLoc InstOrigin SrcSpan ErrCtxt + +instLocSrcLoc :: InstLoc -> SrcLoc +instLocSrcLoc (InstLoc _ src_span _) = srcSpanStart src_span + +instLocSrcSpan :: InstLoc -> SrcSpan +instLocSrcSpan (InstLoc _ src_span _) = src_span + +data InstOrigin + = SigOrigin SkolemInfo -- Pattern, class decl, inst decl etc; + -- Places that bind type variables and introduce + -- available constraints + + | IPBindOrigin (IPName Name) -- Binding site of an implicit parameter + + ------------------------------------------------------- + -- The rest are all occurrences: Insts that are 'wanted' + ------------------------------------------------------- + | OccurrenceOf Name -- Occurrence of an overloaded identifier + + | IPOccOrigin (IPName Name) -- Occurrence of an implicit parameter + + | LiteralOrigin (HsOverLit Name) -- Occurrence of a literal + + | ArithSeqOrigin (ArithSeqInfo Name) -- [x..], [x..y] etc + | PArrSeqOrigin (ArithSeqInfo Name) -- [:x..y:] and [:x,y..z:] + + | InstSigOrigin -- A dict occurrence arising from instantiating + -- a polymorphic type during a subsumption check + + | RecordUpdOrigin + | InstScOrigin -- Typechecking superclasses of an instance declaration + | DerivOrigin -- Typechecking deriving + | DefaultOrigin -- Typechecking a default decl + | DoOrigin -- Arising from a do expression + | ProcOrigin -- Arising from a proc expression +\end{code} + +\begin{code} +pprInstLoc :: InstLoc -> SDoc +pprInstLoc (InstLoc orig locn _) + = hsep [text "arising from", pp_orig orig, text "at", ppr locn] + where + pp_orig (OccurrenceOf name) = hsep [ptext SLIT("use of"), quotes (ppr name)] + pp_orig (IPOccOrigin name) = hsep [ptext SLIT("use of implicit parameter"), quotes (ppr name)] + pp_orig (IPBindOrigin name) = hsep [ptext SLIT("binding for implicit parameter"), quotes (ppr name)] + pp_orig RecordUpdOrigin = ptext SLIT("a record update") + pp_orig (LiteralOrigin lit) = hsep [ptext SLIT("the literal"), quotes (ppr lit)] + pp_orig (ArithSeqOrigin seq) = hsep [ptext SLIT("the arithmetic sequence"), quotes (ppr seq)] + pp_orig (PArrSeqOrigin seq) = hsep [ptext SLIT("the parallel array sequence"), quotes (ppr seq)] + pp_orig InstSigOrigin = ptext SLIT("instantiating a type signature") + pp_orig InstScOrigin = ptext SLIT("the superclasses of an instance declaration") + pp_orig DerivOrigin = ptext SLIT("the 'deriving' clause of a data type declaration") + pp_orig DefaultOrigin = ptext SLIT("a 'default' declaration") + pp_orig DoOrigin = ptext SLIT("a do statement") + pp_orig ProcOrigin = ptext SLIT("a proc expression") + pp_orig (SigOrigin info) = pprSkolInfo info +\end{code} diff --git a/compiler/typecheck/TcRules.lhs b/compiler/typecheck/TcRules.lhs new file mode 100644 index 0000000000..b14c2c9449 --- /dev/null +++ b/compiler/typecheck/TcRules.lhs @@ -0,0 +1,116 @@ +% +% (c) The AQUA Project, Glasgow University, 1993-1998 +% +\section[TcRules]{Typechecking transformation rules} + +\begin{code} +module TcRules ( tcRules ) where + +#include "HsVersions.h" + +import HsSyn ( RuleDecl(..), LRuleDecl, RuleBndr(..), mkHsDictLet ) +import TcRnMonad +import TcSimplify ( tcSimplifyToDicts, tcSimplifyInferCheck ) +import TcMType ( newFlexiTyVarTy, zonkQuantifiedTyVar, tcSkolSigTyVars ) +import TcType ( tyVarsOfTypes, openTypeKind, SkolemInfo(..), substTyWith, mkTyVarTys ) +import TcHsType ( UserTypeCtxt(..), tcHsPatSigType ) +import TcExpr ( tcMonoExpr ) +import TcEnv ( tcExtendIdEnv, tcExtendTyVarEnv ) +import Inst ( instToId ) +import Id ( idType, mkLocalId ) +import Name ( 
Name ) +import SrcLoc ( noLoc, unLoc ) +import Outputable +\end{code} + +\begin{code} +tcRules :: [LRuleDecl Name] -> TcM [LRuleDecl TcId] +tcRules decls = mappM (wrapLocM tcRule) decls + +tcRule :: RuleDecl Name -> TcM (RuleDecl TcId) +tcRule (HsRule name act vars lhs fv_lhs rhs fv_rhs) + = addErrCtxt (ruleCtxt name) $ + traceTc (ptext SLIT("---- Rule ------") + <+> ppr name) `thenM_` + newFlexiTyVarTy openTypeKind `thenM` \ rule_ty -> + + -- Deal with the tyvars mentioned in signatures + tcRuleBndrs vars (\ ids -> + -- Now LHS and RHS + getLIE (tcMonoExpr lhs rule_ty) `thenM` \ (lhs', lhs_lie) -> + getLIE (tcMonoExpr rhs rule_ty) `thenM` \ (rhs', rhs_lie) -> + returnM (ids, lhs', rhs', lhs_lie, rhs_lie) + ) `thenM` \ (ids, lhs', rhs', lhs_lie, rhs_lie) -> + + -- Check that LHS has no overloading at all + getLIE (tcSimplifyToDicts lhs_lie) `thenM` \ (lhs_binds, lhs_dicts) -> + + -- Gather the template variables and tyvars + let + tpl_ids = map instToId lhs_dicts ++ ids + + -- IMPORTANT! We *quantify* over any dicts that appear in the LHS + -- Reason: + -- a) The particular dictionary isn't important, because its value + -- depends only on the type + -- e.g gcd Int $fIntegralInt + -- Here we'd like to match against (gcd Int any_d) for any 'any_d' + -- + -- b) We'd like to make available the dictionaries bound + -- on the LHS in the RHS, so quantifying over them is good + -- See the 'lhs_dicts' in tcSimplifyAndCheck for the RHS + + -- We initially quantify over any tyvars free in *either* the rule + -- *or* the bound variables. The latter is important. Consider + -- ss (x,(y,z)) = (x,z) + -- RULE: forall v. fst (ss v) = fst v + -- The type of the rhs of the rule is just a, but v::(a,(b,c)) + -- + -- We also need to get the free tyvars of the LHS; but we do that + -- during zonking (see TcHsSyn.zonkRule) + -- + forall_tvs = tyVarsOfTypes (rule_ty : map idType tpl_ids) + in + -- RHS can be a bit more lenient. In particular, + -- we let constant dictionaries etc float outwards + -- + -- NB: tcSimplifyInferCheck zonks the forall_tvs, and + -- knocks out any that are constrained by the environment + tcSimplifyInferCheck (text "tcRule") + forall_tvs + lhs_dicts rhs_lie `thenM` \ (forall_tvs1, rhs_binds) -> + mappM zonkQuantifiedTyVar forall_tvs1 `thenM` \ forall_tvs2 -> + -- This zonk is exactly the same as the one in TcBinds.tcBindWithSigs + + returnM (HsRule name act + (map (RuleBndr . 
noLoc) (forall_tvs2 ++ tpl_ids)) -- yuk + (mkHsDictLet lhs_binds lhs') fv_lhs + (mkHsDictLet rhs_binds rhs') fv_rhs) + where + +tcRuleBndrs [] thing_inside = thing_inside [] +tcRuleBndrs (RuleBndr var : vars) thing_inside + = do { ty <- newFlexiTyVarTy openTypeKind + ; let id = mkLocalId (unLoc var) ty + ; tcExtendIdEnv [id] $ + tcRuleBndrs vars (\ids -> thing_inside (id:ids)) } +tcRuleBndrs (RuleBndrSig var rn_ty : vars) thing_inside +-- e.g x :: a->a +-- The tyvar 'a' is brought into scope first, just as if you'd written +-- a::*, x :: a->a + = do { let ctxt = RuleSigCtxt (unLoc var) + ; (tyvars, ty) <- tcHsPatSigType ctxt rn_ty + ; let skol_tvs = tcSkolSigTyVars (SigSkol ctxt) tyvars + id_ty = substTyWith tyvars (mkTyVarTys skol_tvs) ty + id = mkLocalId (unLoc var) id_ty + ; tcExtendTyVarEnv skol_tvs $ + tcExtendIdEnv [id] $ + tcRuleBndrs vars (\ids -> thing_inside (id:ids)) } + +ruleCtxt name = ptext SLIT("When checking the transformation rule") <+> + doubleQuotes (ftext name) +\end{code} + + + + diff --git a/compiler/typecheck/TcSimplify.lhs b/compiler/typecheck/TcSimplify.lhs new file mode 100644 index 0000000000..7656198a25 --- /dev/null +++ b/compiler/typecheck/TcSimplify.lhs @@ -0,0 +1,2534 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[TcSimplify]{TcSimplify} + + + +\begin{code} +module TcSimplify ( + tcSimplifyInfer, tcSimplifyInferCheck, + tcSimplifyCheck, tcSimplifyRestricted, + tcSimplifyToDicts, tcSimplifyIPs, + tcSimplifySuperClasses, + tcSimplifyTop, tcSimplifyInteractive, + tcSimplifyBracket, + + tcSimplifyDeriv, tcSimplifyDefault, + bindInstsOfLocalFuns + ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} TcUnify( unifyType ) +import TypeRep ( Type(..) ) +import HsSyn ( HsBind(..), HsExpr(..), LHsExpr, emptyLHsBinds ) +import TcHsSyn ( mkHsApp, mkHsTyApp, mkHsDictApp ) + +import TcRnMonad +import Inst ( lookupInst, LookupInstResult(..), + tyVarsOfInst, fdPredsOfInsts, newDicts, + isDict, isClassDict, isLinearInst, linearInstType, + isMethodFor, isMethod, + instToId, tyVarsOfInsts, cloneDict, + ipNamesOfInsts, ipNamesOfInst, dictPred, + fdPredsOfInst, + newDictsAtLoc, tcInstClassOp, + getDictClassTys, isTyVarDict, instLoc, + zonkInst, tidyInsts, tidyMoreInsts, + pprInsts, pprDictsInFull, pprInstInFull, tcGetInstEnvs, + isInheritableInst, pprDictsTheta + ) +import TcEnv ( tcGetGlobalTyVars, tcLookupId, findGlobals, pprBinders, + lclEnvElts, tcMetaTy ) +import InstEnv ( lookupInstEnv, classInstances, pprInstances ) +import TcMType ( zonkTcTyVarsAndFV, tcInstTyVars, zonkTcPredType, + checkAmbiguity, checkInstTermination ) +import TcType ( TcTyVar, TcTyVarSet, ThetaType, TcPredType, tidyPred, + mkClassPred, isOverloadedTy, mkTyConApp, isSkolemTyVar, + mkTyVarTy, tcGetTyVar, isTyVarClassPred, mkTyVarTys, + tyVarsOfPred, tcEqType, pprPred, mkPredTy, tcIsTyVarTy ) +import TcIface ( checkWiredInTyCon ) +import Id ( idType, mkUserLocal ) +import Var ( TyVar ) +import TyCon ( TyCon ) +import Name ( Name, getOccName, getSrcLoc ) +import NameSet ( NameSet, mkNameSet, elemNameSet ) +import Class ( classBigSig, classKey ) +import FunDeps ( oclose, grow, improve, pprEquation ) +import PrelInfo ( isNumericClass, isStandardClass ) +import PrelNames ( splitName, fstName, sndName, integerTyConName, + showClassKey, eqClassKey, ordClassKey ) +import Type ( zipTopTvSubst, substTheta, substTy ) +import TysWiredIn ( pairTyCon, doubleTy, doubleTyCon ) +import ErrUtils ( Message ) +import BasicTypes ( TopLevelFlag, isNotTopLevel ) +import VarSet +import 
VarEnv ( TidyEnv ) +import FiniteMap +import Bag +import Outputable +import ListSetOps ( equivClasses ) +import Util ( zipEqual, isSingleton ) +import List ( partition ) +import SrcLoc ( Located(..) ) +import DynFlags ( DynFlag(..) ) +import StaticFlags +\end{code} + + +%************************************************************************ +%* * +\subsection{NOTES} +%* * +%************************************************************************ + + -------------------------------------- + Notes on functional dependencies (a bug) + -------------------------------------- + +| > class Foo a b | a->b +| > +| > class Bar a b | a->b +| > +| > data Obj = Obj +| > +| > instance Bar Obj Obj +| > +| > instance (Bar a b) => Foo a b +| > +| > foo:: (Foo a b) => a -> String +| > foo _ = "works" +| > +| > runFoo:: (forall a b. (Foo a b) => a -> w) -> w +| > runFoo f = f Obj +| +| *Test> runFoo foo +| +| <interactive>:1: +| Could not deduce (Bar a b) from the context (Foo a b) +| arising from use of `foo' at <interactive>:1 +| Probable fix: +| Add (Bar a b) to the expected type of an expression +| In the first argument of `runFoo', namely `foo' +| In the definition of `it': it = runFoo foo +| +| Why all of the sudden does GHC need the constraint Bar a b? The +| function foo didn't ask for that... + +The trouble is that to type (runFoo foo), GHC has to solve the problem: + + Given constraint Foo a b + Solve constraint Foo a b' + +Notice that b and b' aren't the same. To solve this, just do +improvement and then they are the same. But GHC currently does + simplify constraints + apply improvement + and loop + +That is usually fine, but it isn't here, because it sees that Foo a b is +not the same as Foo a b', and so instead applies the instance decl for +instance Bar a b => Foo a b. And that's where the Bar constraint comes +from. + +The Right Thing is to improve whenever the constraint set changes at +all. Not hard in principle, but it'll take a bit of fiddling to do. + + + + -------------------------------------- + Notes on quantification + -------------------------------------- + +Suppose we are about to do a generalisation step. +We have in our hand + + G the environment + T the type of the RHS + C the constraints from that RHS + +The game is to figure out + + Q the set of type variables over which to quantify + Ct the constraints we will *not* quantify over + Cq the constraints we will quantify over + +So we're going to infer the type + + forall Q. Cq => T + +and float the constraints Ct further outwards. + +Here are the things that *must* be true: + + (A) Q intersect fv(G) = EMPTY limits how big Q can be + (B) Q superset fv(Cq union T) \ oclose(fv(G),C) limits how small Q can be + +(A) says we can't quantify over a variable that's free in the +environment. (B) says we must quantify over all the truly free +variables in T, else we won't get a sufficiently general type. We do +not *need* to quantify over any variable that is fixed by the free +vars of the environment G. + + BETWEEN THESE TWO BOUNDS, ANY Q WILL DO! + +Example: class H x y | x->y where ... + + fv(G) = {a} C = {H a b, H c d} + T = c -> b + + (A) Q intersect {a} is empty + (B) Q superset {a,b,c,d} \ oclose({a}, C) = {a,b,c,d} \ {a,b} = {c,d} + + So Q can be {c,d}, {b,c,d} + +Other things being equal, however, we'd like to quantify over as few +variables as possible: smaller types, fewer type applications, more +constraints can get into Ct instead of Cq. 
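+(A hedged aside, not part of GHC itself: the 'grow' operation defined
+informally just below can be read as a fixed point over plain sets, where
+each constraint is represented only by its set of free tyvars:
+
+    import qualified Data.Set as Set
+
+    grow :: Ord a => Set.Set a -> [Set.Set a] -> Set.Set a
+    grow vs preds
+      | vs' == vs = vs                  -- nothing new became reachable
+      | otherwise = grow vs' preds
+      where
+        -- absorb every constraint that already touches vs
+        vs' = foldr absorb vs preds
+        absorb p acc
+          | Set.null (p `Set.intersection` vs) = acc
+          | otherwise                          = acc `Set.union` p
+
+With vs = {a} and C = {H [a] b, K (b,Int) c, Eq e} this reaches {a,b,c},
+matching the example below.)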
+ + +----------------------------------------- +We will make use of + + fv(T) the free type vars of T + + oclose(vs,C) The result of extending the set of tyvars vs + using the functional dependencies from C + + grow(vs,C) The result of extend the set of tyvars vs + using all conceivable links from C. + + E.g. vs = {a}, C = {H [a] b, K (b,Int) c, Eq e} + Then grow(vs,C) = {a,b,c} + + Note that grow(vs,C) `superset` grow(vs,simplify(C)) + That is, simplfication can only shrink the result of grow. + +Notice that + oclose is conservative one way: v `elem` oclose(vs,C) => v is definitely fixed by vs + grow is conservative the other way: if v might be fixed by vs => v `elem` grow(vs,C) + + +----------------------------------------- + +Choosing Q +~~~~~~~~~~ +Here's a good way to choose Q: + + Q = grow( fv(T), C ) \ oclose( fv(G), C ) + +That is, quantify over all variable that that MIGHT be fixed by the +call site (which influences T), but which aren't DEFINITELY fixed by +G. This choice definitely quantifies over enough type variables, +albeit perhaps too many. + +Why grow( fv(T), C ) rather than fv(T)? Consider + + class H x y | x->y where ... + + T = c->c + C = (H c d) + + If we used fv(T) = {c} we'd get the type + + forall c. H c d => c -> b + + And then if the fn was called at several different c's, each of + which fixed d differently, we'd get a unification error, because + d isn't quantified. Solution: quantify d. So we must quantify + everything that might be influenced by c. + +Why not oclose( fv(T), C )? Because we might not be able to see +all the functional dependencies yet: + + class H x y | x->y where ... + instance H x y => Eq (T x y) where ... + + T = c->c + C = (Eq (T c d)) + + Now oclose(fv(T),C) = {c}, because the functional dependency isn't + apparent yet, and that's wrong. We must really quantify over d too. + + +There really isn't any point in quantifying over any more than +grow( fv(T), C ), because the call sites can't possibly influence +any other type variables. + + + + -------------------------------------- + Notes on ambiguity + -------------------------------------- + +It's very hard to be certain when a type is ambiguous. Consider + + class K x + class H x y | x -> y + instance H x y => K (x,y) + +Is this type ambiguous? + forall a b. (K (a,b), Eq b) => a -> a + +Looks like it! But if we simplify (K (a,b)) we get (H a b) and +now we see that a fixes b. So we can't tell about ambiguity for sure +without doing a full simplification. And even that isn't possible if +the context has some free vars that may get unified. Urgle! + +Here's another example: is this ambiguous? + forall a b. Eq (T b) => a -> a +Not if there's an insance decl (with no context) + instance Eq (T b) where ... + +You may say of this example that we should use the instance decl right +away, but you can't always do that: + + class J a b where ... + instance J Int b where ... + + f :: forall a b. J a b => a -> a + +(Notice: no functional dependency in J's class decl.) +Here f's type is perfectly fine, provided f is only called at Int. +It's premature to complain when meeting f's signature, or even +when inferring a type for f. + + + +However, we don't *need* to report ambiguity right away. It'll always +show up at the call site.... and eventually at main, which needs special +treatment. Nevertheless, reporting ambiguity promptly is an excellent thing. + +So here's the plan. 
We WARN about probable ambiguity if + + fv(Cq) is not a subset of oclose(fv(T) union fv(G), C) + +(all tested before quantification). +That is, all the type variables in Cq must be fixed by the the variables +in the environment, or by the variables in the type. + +Notice that we union before calling oclose. Here's an example: + + class J a b c | a b -> c + fv(G) = {a} + +Is this ambiguous? + forall b c. (J a b c) => b -> b + +Only if we union {a} from G with {b} from T before using oclose, +do we see that c is fixed. + +It's a bit vague exactly which C we should use for this oclose call. If we +don't fix enough variables we might complain when we shouldn't (see +the above nasty example). Nothing will be perfect. That's why we can +only issue a warning. + + +Can we ever be *certain* about ambiguity? Yes: if there's a constraint + + c in C such that fv(c) intersect (fv(G) union fv(T)) = EMPTY + +then c is a "bubble"; there's no way it can ever improve, and it's +certainly ambiguous. UNLESS it is a constant (sigh). And what about +the nasty example? + + class K x + class H x y | x -> y + instance H x y => K (x,y) + +Is this type ambiguous? + forall a b. (K (a,b), Eq b) => a -> a + +Urk. The (Eq b) looks "definitely ambiguous" but it isn't. What we are after +is a "bubble" that's a set of constraints + + Cq = Ca union Cq' st fv(Ca) intersect (fv(Cq') union fv(T) union fv(G)) = EMPTY + +Hence another idea. To decide Q start with fv(T) and grow it +by transitive closure in Cq (no functional dependencies involved). +Now partition Cq using Q, leaving the definitely-ambiguous and probably-ok. +The definitely-ambiguous can then float out, and get smashed at top level +(which squashes out the constants, like Eq (T a) above) + + + -------------------------------------- + Notes on principal types + -------------------------------------- + + class C a where + op :: a -> a + + f x = let g y = op (y::Int) in True + +Here the principal type of f is (forall a. a->a) +but we'll produce the non-principal type + f :: forall a. C Int => a -> a + + + -------------------------------------- + The need for forall's in constraints + -------------------------------------- + +[Exchange on Haskell Cafe 5/6 Dec 2000] + + class C t where op :: t -> Bool + instance C [t] where op x = True + + p y = (let f :: c -> Bool; f x = op (y >> return x) in f, y ++ []) + q y = (y ++ [], let f :: c -> Bool; f x = op (y >> return x) in f) + +The definitions of p and q differ only in the order of the components in +the pair on their right-hand sides. And yet: + + ghc and "Typing Haskell in Haskell" reject p, but accept q; + Hugs rejects q, but accepts p; + hbc rejects both p and q; + nhc98 ... (Malcolm, can you fill in the blank for us!). + +The type signature for f forces context reduction to take place, and +the results of this depend on whether or not the type of y is known, +which in turn depends on which component of the pair the type checker +analyzes first. + +Solution: if y::m a, float out the constraints + Monad m, forall c. C (m c) +When m is later unified with [], we can solve both constraints. + + + -------------------------------------- + Notes on implicit parameters + -------------------------------------- + +Question 1: can we "inherit" implicit parameters +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Consider this: + + f x = (x::Int) + ?y + +where f is *not* a top-level binding. +From the RHS of f we'll get the constraint (?y::Int). 
+ +There are two types we might infer for f: + + f :: Int -> Int + +(so we get ?y from the context of f's definition), or + + f :: (?y::Int) => Int -> Int + +At first you might think the first was better, because then +?y behaves like a free variable of the definition, rather than +having to be passed at each call site. But of course, the WHOLE +IDEA is that ?y should be passed at each call site (that's what +dynamic binding means) so we'd better infer the second. + +BOTTOM LINE: when *inferring types* you *must* quantify +over implicit parameters. See the predicate isFreeWhenInferring. + + +Question 2: type signatures +~~~~~~~~~~~~~~~~~~~~~~~~~~~ +BUT WATCH OUT: When you supply a type signature, we can't force you +to quantify over implicit parameters. For example: + + (?x + 1) :: Int + +This is perfectly reasonable. We do not want to insist on + + (?x + 1) :: (?x::Int => Int) + +That would be silly. Here, the definition site *is* the occurrence site, +so the above strictures don't apply. Hence the difference between +tcSimplifyCheck (which *does* allow implicit parameters to be inherited) +and tcSimplifyCheckBind (which does not). + +What about when you supply a type signature for a binding? +Is it legal to give the following explicit, user type +signature to f, thus: + + f :: Int -> Int + f x = (x::Int) + ?y + +At first sight this seems reasonable, but it has the nasty property +that adding a type signature changes the dynamic semantics. +Consider this: + + (let f x = (x::Int) + ?y + in (f 3, f 3 with ?y=5)) with ?y = 6 + + returns (3+6, 3+5) +vs + (let f :: Int -> Int + f x = x + ?y + in (f 3, f 3 with ?y=5)) with ?y = 6 + + returns (3+6, 3+6) + +Indeed, simply inlining f (at the Haskell source level) would change the +dynamic semantics. + +Nevertheless, as Launchbury says (email Oct 01) we can't really give the +semantics for a Haskell program without knowing its typing, so if you +change the typing you may change the semantics. + +To make things consistent in all cases where we are *checking* against +a supplied signature (as opposed to inferring a type), we adopt the +rule: + + a signature does not need to quantify over implicit params. + +[This represents a (rather marginal) change of policy since GHC 5.02, +which *required* an explicit signature to quantify over all implicit +params for the reasons mentioned above.] + +But that raises a new question. Consider + + Given (signature) ?x::Int + Wanted (inferred) ?x::Int, ?y::Bool + +Clearly we want to discharge the ?x and float the ?y out. But +what is the criterion that distinguishes them? Clearly it isn't +what free type variables they have. The Right Thing seems to be +to float a constraint that + neither mentions any of the quantified type variables + nor any of the quantified implicit parameters + +See the predicate isFreeWhenChecking. + + +Question 3: monomorphism +~~~~~~~~~~~~~~~~~~~~~~~~ +There's a nasty corner case when the monomorphism restriction bites: + + z = (x::Int) + ?y + +The argument above suggests that we *must* generalise +over the ?y parameter, to get + z :: (?y::Int) => Int, +but the monomorphism restriction says that we *must not*, giving + z :: Int. +Why does the monomorphism restriction say this? Because if you have + + let z = x + ?y in z+z + +you might not expect the addition to be done twice --- but it will if +we follow the argument of Question 2 and generalise over ?y. + + +Question 4: top level +~~~~~~~~~~~~~~~~~~~~~ +At the top level, monomorphism makes no sense at all.
+ + module Main where + main = let ?x = 5 in print foo + + foo = woggle 3 + + woggle :: (?x :: Int) => Int -> Int + woggle y = ?x + y + +We definitely don't want (foo :: Int) with a top-level implicit parameter +(?x::Int) because there is no way to bind it. + + +Possible choices +~~~~~~~~~~~~~~~~ +(A) Always generalise over implicit parameters + Bindings that fall under the monomorphism restriction can't + be generalised + + Consequences: + * Inlining remains valid + * No unexpected loss of sharing + * But simple bindings like + z = ?y + 1 + will be rejected, unless you add an explicit type signature + (to avoid the monomorphism restriction) + z :: (?y::Int) => Int + z = ?y + 1 + This seems unacceptable + +(B) Monomorphism restriction "wins" + Bindings that fall under the monomorphism restriction can't + be generalised + Always generalise over implicit parameters *except* for bindings + that fall under the monomorphism restriction + + Consequences + * Inlining isn't valid in general + * No unexpected loss of sharing + * Simple bindings like + z = ?y + 1 + accepted (get value of ?y from binding site) + +(C) Always generalise over implicit parameters + Bindings that fall under the monomorphism restriction can't + be generalised, EXCEPT for implicit parameters + Consequences + * Inlining remains valid + * Unexpected loss of sharing (from the extra generalisation) + * Simple bindings like + z = ?y + 1 + accepted (get value of ?y from occurrence sites) + + +Discussion +~~~~~~~~~~ +None of these choices seems very satisfactory. But at least we should +decide which we want to do. + +It's really not clear what is the Right Thing To Do. If you see + + z = (x::Int) + ?y + +would you expect the value of ?y to be got from the *occurrence sites* +of 'z', or from the value of ?y at the *definition* of 'z'? In the +case of function definitions, the answer is clearly the former, but +less so in the case of non-function definitions. On the other hand, +if we say that we get the value of ?y from the definition site of 'z', +then inlining 'z' might change the semantics of the program. + +Choice (C) really says "the monomorphism restriction doesn't apply +to implicit parameters". Which is fine, but remember that every +innocent binding 'x = ...' that mentions an implicit parameter in +the RHS becomes a *function* of that parameter, called at each +use of 'x'. Now, the chances are that there are no intervening 'with' +clauses that bind ?y, so a decent compiler should common up all +those function calls. So I think I strongly favour (C). Indeed, +one could make a similar argument for abolishing the monomorphism +restriction altogether. + +BOTTOM LINE: we choose (B) at present. See tcSimplifyRestricted + + + +%************************************************************************ +%* * +\subsection{tcSimplifyInfer} +%* * +%************************************************************************ + +tcSimplify is called when we are *inferring* a type. Here's the overall game plan: + + 1. Compute Q = grow( fvs(T), C ) + + 2. Partition C based on Q into Ct and Cq. Notice that ambiguous + predicates will end up in Ct; we deal with them at the top level + + 3. Try improvement, using functional dependencies + + 4. If Step 3 did any unification, repeat from step 1 + (Unification can change the result of 'grow'.) + +Note: we don't reduce dictionaries in step 2. For example, if we have +Eq (a,b), we don't simplify to (Eq a, Eq b). So Q won't be different +after step 2.
However note that we may therefore quantify over more +type variables than we absolutely have to. + +For the guts, we need a loop, that alternates context reduction and +improvement with unification. E.g. Suppose we have + + class C x y | x->y where ... + +and tcSimplify is called with: + (C Int a, C Int b) +Then improvement unifies a with b, giving + (C Int a, C Int a) + +If we need to unify anything, we rattle round the whole thing all over +again. + + +\begin{code} +tcSimplifyInfer + :: SDoc + -> TcTyVarSet -- fv(T); type vars + -> [Inst] -- Wanted + -> TcM ([TcTyVar], -- Tyvars to quantify (zonked) + TcDictBinds, -- Bindings + [TcId]) -- Dict Ids that must be bound here (zonked) + -- Any free (escaping) Insts are tossed into the environment +\end{code} + + +\begin{code} +tcSimplifyInfer doc tau_tvs wanted_lie + = inferLoop doc (varSetElems tau_tvs) + wanted_lie `thenM` \ (qtvs, frees, binds, irreds) -> + + extendLIEs frees `thenM_` + returnM (qtvs, binds, map instToId irreds) + +inferLoop doc tau_tvs wanteds + = -- Step 1 + zonkTcTyVarsAndFV tau_tvs `thenM` \ tau_tvs' -> + mappM zonkInst wanteds `thenM` \ wanteds' -> + tcGetGlobalTyVars `thenM` \ gbl_tvs -> + let + preds = fdPredsOfInsts wanteds' + qtvs = grow preds tau_tvs' `minusVarSet` oclose preds gbl_tvs + + try_me inst + | isFreeWhenInferring qtvs inst = Free + | isClassDict inst = DontReduceUnlessConstant -- Dicts + | otherwise = ReduceMe NoSCs -- Lits and Methods + in + traceTc (text "infloop" <+> vcat [ppr tau_tvs', ppr wanteds', ppr preds, + ppr (grow preds tau_tvs'), ppr qtvs]) `thenM_` + -- Step 2 + reduceContext doc try_me [] wanteds' `thenM` \ (no_improvement, frees, binds, irreds) -> + + -- Step 3 + if no_improvement then + returnM (varSetElems qtvs, frees, binds, irreds) + else + -- If improvement did some unification, we go round again. There + -- are two subtleties: + -- a) We start again with irreds, not wanteds + -- Using an instance decl might have introduced a fresh type variable + -- which might have been unified, so we'd get an infinite loop + -- if we started again with wanteds! See example [LOOP] + -- + -- b) It's also essential to re-process frees, because unification + -- might mean that a type variable that looked free isn't now. + -- + -- Hence the (irreds ++ frees) + + -- However, NOTICE that when we are done, we might have some bindings, but + -- the final qtvs might be empty. See [NO TYVARS] below. + + inferLoop doc tau_tvs (irreds ++ frees) `thenM` \ (qtvs1, frees1, binds1, irreds1) -> + returnM (qtvs1, frees1, binds `unionBags` binds1, irreds1) +\end{code} + +Example [LOOP] + + class If b t e r | b t e -> r + instance If T t e t + instance If F t e e + class Lte a b c | a b -> c where lte :: a -> b -> c + instance Lte Z b T + instance (Lte a b l,If l b a c) => Max a b c + +Wanted: Max Z (S x) y + +Then we'll reduce using the Max instance to: + (Lte Z (S x) l, If l (S x) Z y) +and improve by binding l->T, after which we can do some reduction +on both the Lte and If constraints. What we *can't* do is start again +with (Max Z (S x) y)! + +[NO TYVARS] + + class Y a b | a -> b where + y :: a -> X b + + instance Y [[a]] a where + y ((x:_):_) = X x + + k :: X a -> X a -> X a + + g :: Num a => [X a] -> [X a] + g xs = h xs + where + h ys = ys ++ map (k (y [[0]])) xs + +The excitement comes when simplifying the bindings for h. Initially +try to simplify {y @ [[t1]] t2, 0 @ t1}, with initial qtvs = {t2}. +From this we get t1:=:t2, but also various bindings. 
We can't forget +the bindings (because of [LOOP]), but in fact t1 is what g is +polymorphic in. + +The net effect of [NO TYVARS] + +\begin{code} +isFreeWhenInferring :: TyVarSet -> Inst -> Bool +isFreeWhenInferring qtvs inst + = isFreeWrtTyVars qtvs inst -- Constrains no quantified vars + && isInheritableInst inst -- And no implicit parameter involved + -- (see "Notes on implicit parameters") + +isFreeWhenChecking :: TyVarSet -- Quantified tyvars + -> NameSet -- Quantified implicit parameters + -> Inst -> Bool +isFreeWhenChecking qtvs ips inst + = isFreeWrtTyVars qtvs inst + && isFreeWrtIPs ips inst + +isFreeWrtTyVars qtvs inst = not (tyVarsOfInst inst `intersectsVarSet` qtvs) +isFreeWrtIPs ips inst = not (any (`elemNameSet` ips) (ipNamesOfInst inst)) +\end{code} + + +%************************************************************************ +%* * +\subsection{tcSimplifyCheck} +%* * +%************************************************************************ + +@tcSimplifyCheck@ is used when we know exactly the set of variables +we are going to quantify over. For example, a class or instance declaration. + +\begin{code} +tcSimplifyCheck + :: SDoc + -> [TcTyVar] -- Quantify over these + -> [Inst] -- Given + -> [Inst] -- Wanted + -> TcM TcDictBinds -- Bindings + +-- tcSimplifyCheck is used when checking expression type signatures, +-- class decls, instance decls etc. +-- +-- NB: tcSimplifyCheck does not consult the +-- global type variables in the environment; so you don't +-- need to worry about setting them before calling tcSimplifyCheck +tcSimplifyCheck doc qtvs givens wanted_lie + = ASSERT( all isSkolemTyVar qtvs ) + do { (qtvs', frees, binds) <- tcSimplCheck doc get_qtvs AddSCs givens wanted_lie + ; extendLIEs frees + ; return binds } + where +-- get_qtvs = zonkTcTyVarsAndFV qtvs + get_qtvs = return (mkVarSet qtvs) -- All skolems + + +-- tcSimplifyInferCheck is used when we know the constraints we are to simplify +-- against, but we don't know the type variables over which we are going to quantify. +-- This happens when we have a type signature for a mutually recursive group +tcSimplifyInferCheck + :: SDoc + -> TcTyVarSet -- fv(T) + -> [Inst] -- Given + -> [Inst] -- Wanted + -> TcM ([TcTyVar], -- Variables over which to quantify + TcDictBinds) -- Bindings + +tcSimplifyInferCheck doc tau_tvs givens wanted_lie + = do { (qtvs', frees, binds) <- tcSimplCheck doc get_qtvs AddSCs givens wanted_lie + ; extendLIEs frees + ; return (qtvs', binds) } + where + -- Figure out which type variables to quantify over + -- You might think it should just be the signature tyvars, + -- but in bizarre cases you can get extra ones + -- f :: forall a. Num a => a -> a + -- f x = fst (g (x, head [])) + 1 + -- g a b = (b,a) + -- Here we infer g :: forall a b. a -> b -> (b,a) + -- We don't want g to be monomorphic in b just because + -- f isn't quantified over b. + all_tvs = varSetElems (tau_tvs `unionVarSet` tyVarsOfInsts givens) + + get_qtvs = zonkTcTyVarsAndFV all_tvs `thenM` \ all_tvs' -> + tcGetGlobalTyVars `thenM` \ gbl_tvs -> + let + qtvs = all_tvs' `minusVarSet` gbl_tvs + -- We could close gbl_tvs, but its not necessary for + -- soundness, and it'll only affect which tyvars, not which + -- dictionaries, we quantify over + in + returnM qtvs +\end{code} + +Here is the workhorse function for all three wrappers. 
+ +\begin{code} +tcSimplCheck doc get_qtvs want_scs givens wanted_lie + = do { (qtvs, frees, binds, irreds) <- check_loop givens wanted_lie + + -- Complain about any irreducible ones + ; if not (null irreds) + then do { givens' <- mappM zonkInst given_dicts_and_ips + ; groupErrs (addNoInstanceErrs (Just doc) givens') irreds } + else return () + + ; returnM (qtvs, frees, binds) } + where + given_dicts_and_ips = filter (not . isMethod) givens + -- For error reporting, filter out methods, which are + -- only added to the given set as an optimisation + + ip_set = mkNameSet (ipNamesOfInsts givens) + + check_loop givens wanteds + = -- Step 1 + mappM zonkInst givens `thenM` \ givens' -> + mappM zonkInst wanteds `thenM` \ wanteds' -> + get_qtvs `thenM` \ qtvs' -> + + -- Step 2 + let + -- When checking against a given signature we always reduce + -- until we find a match against something given, or can't reduce + try_me inst | isFreeWhenChecking qtvs' ip_set inst = Free + | otherwise = ReduceMe want_scs + in + reduceContext doc try_me givens' wanteds' `thenM` \ (no_improvement, frees, binds, irreds) -> + + -- Step 3 + if no_improvement then + returnM (varSetElems qtvs', frees, binds, irreds) + else + check_loop givens' (irreds ++ frees) `thenM` \ (qtvs', frees1, binds1, irreds1) -> + returnM (qtvs', frees1, binds `unionBags` binds1, irreds1) +\end{code} + + +%************************************************************************ +%* * + tcSimplifySuperClasses +%* * +%************************************************************************ + +Note [SUPERCLASS-LOOP 1] +~~~~~~~~~~~~~~~~~~~~~~~~ +We have to be very, very careful when generating superclasses, lest we +accidentally build a loop. Here's an example: + + class S a + + class S a => C a where { opc :: a -> a } + class S b => D b where { opd :: b -> b } + + instance C Int where + opc = opd + + instance D Int where + opd = opc + +From (instance C Int) we get the constraint set {ds1:S Int, dd:D Int} +Simplifying, we may well get: + $dfCInt = :C ds1 (opd dd) + dd = $dfDInt + ds1 = $p1 dd +Notice that we spot that we can extract ds1 from dd. + +Alas! Alack! We can do the same for (instance D Int): + + $dfDInt = :D ds2 (opc dc) + dc = $dfCInt + ds2 = $p1 dc + +And now we've defined the superclass in terms of itself. + +Solution: never generate a superclass selectors at all when +satisfying the superclass context of an instance declaration. + +Two more nasty cases are in + tcrun021 + tcrun033 + +\begin{code} +tcSimplifySuperClasses qtvs givens sc_wanteds + = ASSERT( all isSkolemTyVar qtvs ) + do { (_, frees, binds1) <- tcSimplCheck doc get_qtvs NoSCs givens sc_wanteds + ; binds2 <- tc_simplify_top doc False NoSCs frees + ; return (binds1 `unionBags` binds2) } + where + get_qtvs = return (mkVarSet qtvs) + doc = ptext SLIT("instance declaration superclass context") +\end{code} + + +%************************************************************************ +%* * +\subsection{tcSimplifyRestricted} +%* * +%************************************************************************ + +tcSimplifyRestricted infers which type variables to quantify for a +group of restricted bindings. This isn't trivial. + +Eg1: id = \x -> x + We want to quantify over a to get id :: forall a. 
a->a + +Eg2: eq = (==) + We do not want to quantify over a, because there's an Eq a + constraint, so we get eq :: a->a->Bool (notice no forall) + +So, assume: + RHS has type 'tau', whose free tyvars are tau_tvs + RHS has constraints 'wanteds' + +Plan A (simple) + Quantify over (tau_tvs \ ftvs(wanteds)) + This is bad. The constraints may contain (Monad (ST s)) + where we have instance Monad (ST s) where... + so there's no need to be monomorphic in s! + + Also the constraint might be a method constraint, + whose type mentions a perfectly innocent tyvar: + op :: Num a => a -> b -> a + Here, b is unconstrained. A good example would be + foo = op (3::Int) + We want to infer the polymorphic type + foo :: forall b. b -> b + + +Plan B (cunning, used for a long time up to and including GHC 6.2) + Step 1: Simplify the constraints as much as possible (to deal + with Plan A's problem). Then set + qtvs = tau_tvs \ ftvs( simplify( wanteds ) ) + + Step 2: Now simplify again, treating the constraint as 'free' if + it does not mention qtvs, and trying to reduce it otherwise. + The reasons for this is to maximise sharing. + + This fails for a very subtle reason. Suppose that in the Step 2 + a constraint (Foo (Succ Zero) (Succ Zero) b) gets thrown upstairs as 'free'. + In the Step 1 this constraint might have been simplified, perhaps to + (Foo Zero Zero b), AND THEN THAT MIGHT BE IMPROVED, to bind 'b' to 'T'. + This won't happen in Step 2... but that in turn might prevent some other + constraint (Baz [a] b) being simplified (e.g. via instance Baz [a] T where {..}) + and that in turn breaks the invariant that no constraints are quantified over. + + Test typecheck/should_compile/tc177 (which failed in GHC 6.2) demonstrates + the problem. + + +Plan C (brutal) + Step 1: Simplify the constraints as much as possible (to deal + with Plan A's problem). Then set + qtvs = tau_tvs \ ftvs( simplify( wanteds ) ) + Return the bindings from Step 1. + + +A note about Plan C (arising from "bug" reported by George Russel March 2004) +Consider this: + + instance (HasBinary ty IO) => HasCodedValue ty + + foo :: HasCodedValue a => String -> IO a + + doDecodeIO :: HasCodedValue a => () -> () -> IO a + doDecodeIO codedValue view + = let { act = foo "foo" } in act + +You might think this should work becuase the call to foo gives rise to a constraint +(HasCodedValue t), which can be satisfied by the type sig for doDecodeIO. But the +restricted binding act = ... calls tcSimplifyRestricted, and PlanC simplifies the +constraint using the (rather bogus) instance declaration, and now we are stuffed. + +I claim this is not really a bug -- but it bit Sergey as well as George. So here's +plan D + + +Plan D (a variant of plan B) + Step 1: Simplify the constraints as much as possible (to deal + with Plan A's problem), BUT DO NO IMPROVEMENT. Then set + qtvs = tau_tvs \ ftvs( simplify( wanteds ) ) + + Step 2: Now simplify again, treating the constraint as 'free' if + it does not mention qtvs, and trying to reduce it otherwise. + + The point here is that it's generally OK to have too few qtvs; that is, + to make the thing more monomorphic than it could be. We don't want to + do that in the common cases, but in wierd cases it's ok: the programmer + can always add a signature. + + Too few qtvs => too many wanteds, which is what happens if you do less + improvement. + + +\begin{code} +tcSimplifyRestricted -- Used for restricted binding groups + -- i.e. 
ones subject to the monomorphism restriction + :: SDoc + -> TopLevelFlag + -> [Name] -- Things bound in this group + -> TcTyVarSet -- Free in the type of the RHSs + -> [Inst] -- Free in the RHSs + -> TcM ([TcTyVar], -- Tyvars to quantify (zonked) + TcDictBinds) -- Bindings + -- tcSimpifyRestricted returns no constraints to + -- quantify over; by definition there are none. + -- They are all thrown back in the LIE + +tcSimplifyRestricted doc top_lvl bndrs tau_tvs wanteds + -- Zonk everything in sight + = mappM zonkInst wanteds `thenM` \ wanteds' -> + zonkTcTyVarsAndFV (varSetElems tau_tvs) `thenM` \ tau_tvs' -> + tcGetGlobalTyVars `thenM` \ gbl_tvs' -> + + -- 'reduceMe': Reduce as far as we can. Don't stop at + -- dicts; the idea is to get rid of as many type + -- variables as possible, and we don't want to stop + -- at (say) Monad (ST s), because that reduces + -- immediately, with no constraint on s. + -- + -- BUT do no improvement! See Plan D above + reduceContextWithoutImprovement + doc reduceMe wanteds' `thenM` \ (_frees, _binds, constrained_dicts) -> + + -- Next, figure out the tyvars we will quantify over + let + constrained_tvs = tyVarsOfInsts constrained_dicts + qtvs = (tau_tvs' `minusVarSet` oclose (fdPredsOfInsts constrained_dicts) gbl_tvs') + `minusVarSet` constrained_tvs + in + traceTc (text "tcSimplifyRestricted" <+> vcat [ + pprInsts wanteds, pprInsts _frees, pprInsts constrained_dicts, + ppr _binds, + ppr constrained_tvs, ppr tau_tvs', ppr qtvs ]) `thenM_` + + -- The first step may have squashed more methods than + -- necessary, so try again, this time more gently, knowing the exact + -- set of type variables to quantify over. + -- + -- We quantify only over constraints that are captured by qtvs; + -- these will just be a subset of non-dicts. This in contrast + -- to normal inference (using isFreeWhenInferring) in which we quantify over + -- all *non-inheritable* constraints too. This implements choice + -- (B) under "implicit parameter and monomorphism" above. + -- + -- Remember that we may need to do *some* simplification, to + -- (for example) squash {Monad (ST s)} into {}. It's not enough + -- just to float all constraints + -- + -- At top level, we *do* squash methods becuase we want to + -- expose implicit parameters to the test that follows + let + is_nested_group = isNotTopLevel top_lvl + try_me inst | isFreeWrtTyVars qtvs inst, + (is_nested_group || isDict inst) = Free + | otherwise = ReduceMe AddSCs + in + reduceContextWithoutImprovement + doc try_me wanteds' `thenM` \ (frees, binds, irreds) -> + ASSERT( null irreds ) + + -- See "Notes on implicit parameters, Question 4: top level" + if is_nested_group then + extendLIEs frees `thenM_` + returnM (varSetElems qtvs, binds) + else + let + (non_ips, bad_ips) = partition isClassDict frees + in + addTopIPErrs bndrs bad_ips `thenM_` + extendLIEs non_ips `thenM_` + returnM (varSetElems qtvs, binds) +\end{code} + + +%************************************************************************ +%* * +\subsection{tcSimplifyToDicts} +%* * +%************************************************************************ + +On the LHS of transformation rules we only simplify methods and constants, +getting dictionaries. We want to keep all of them unsimplified, to serve +as the available stuff for the RHS of the rule. + +The same thing is used for specialise pragmas. Consider + + f :: Num a => a -> a + {-# SPECIALISE f :: Int -> Int #-} + f = ... 
+ +The type checker generates a binding like: + + f_spec = (f :: Int -> Int) + +and we want to end up with + + f_spec = _inline_me_ (f Int dNumInt) + +But that means that we must simplify the Method for f to (f Int dNumInt)! +So tcSimplifyToDicts squeezes out all Methods. + +IMPORTANT NOTE: we *don't* want to do superclass commoning up. Consider + + fromIntegral :: (Integral a, Num b) => a -> b + {-# RULES "foo" fromIntegral = id :: Int -> Int #-} + +Here, a=b=Int, and Num Int is a superclass of Integral Int. But we *dont* +want to get + + forall dIntegralInt. + fromIntegral Int Int dIntegralInt (scsel dIntegralInt) = id Int + +because the scsel will mess up RULE matching. Instead we want + + forall dIntegralInt, dNumInt. + fromIntegral Int Int dIntegralInt dNumInt = id Int + +Hence "WithoutSCs" + +\begin{code} +tcSimplifyToDicts :: [Inst] -> TcM (TcDictBinds) +tcSimplifyToDicts wanteds + = simpleReduceLoop doc try_me wanteds `thenM` \ (frees, binds, irreds) -> + -- Since try_me doesn't look at types, we don't need to + -- do any zonking, so it's safe to call reduceContext directly + ASSERT( null frees ) + extendLIEs irreds `thenM_` + returnM binds + + where + doc = text "tcSimplifyToDicts" + + -- Reduce methods and lits only; stop as soon as we get a dictionary + try_me inst | isDict inst = KeepDictWithoutSCs -- See notes above re "WithoutSCs" + | otherwise = ReduceMe NoSCs +\end{code} + + + +tcSimplifyBracket is used when simplifying the constraints arising from +a Template Haskell bracket [| ... |]. We want to check that there aren't +any constraints that can't be satisfied (e.g. Show Foo, where Foo has no +Show instance), but we aren't otherwise interested in the results. +Nor do we care about ambiguous dictionaries etc. We will type check +this bracket again at its usage site. + +\begin{code} +tcSimplifyBracket :: [Inst] -> TcM () +tcSimplifyBracket wanteds + = simpleReduceLoop doc reduceMe wanteds `thenM_` + returnM () + where + doc = text "tcSimplifyBracket" +\end{code} + + +%************************************************************************ +%* * +\subsection{Filtering at a dynamic binding} +%* * +%************************************************************************ + +When we have + let ?x = R in B + +we must discharge all the ?x constraints from B. We also do an improvement +step; if we have ?x::t1 and ?x::t2 we must unify t1, t2. + +Actually, the constraints from B might improve the types in ?x. For example + + f :: (?x::Int) => Char -> Char + let ?x = 3 in f 'c' + +then the constraint (?x::Int) arising from the call to f will +force the binding for ?x to be of type Int. 
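+
+Here is a small standalone example of that situation (a hypothetical
+module, not part of this file; it needs the implicit-parameter
+extension, e.g. -fglasgow-exts):
+
+\begin{verbatim}
+{-# OPTIONS_GHC -fglasgow-exts #-}
+module IPDemo where
+
+f :: (?x :: Int) => Char -> Char
+f c = if ?x > 0 then c else 'z'
+
+-- The (?x::Int) constraint arising from the call to f is discharged
+-- by the (let ?x = 3), and forces the 3 to have type Int.
+demo :: Char
+demo = let ?x = 3 in f 'c'
+\end{verbatim}
+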
+ +\begin{code} +tcSimplifyIPs :: [Inst] -- The implicit parameters bound here + -> [Inst] -- Wanted + -> TcM TcDictBinds +tcSimplifyIPs given_ips wanteds + = simpl_loop given_ips wanteds `thenM` \ (frees, binds) -> + extendLIEs frees `thenM_` + returnM binds + where + doc = text "tcSimplifyIPs" <+> ppr given_ips + ip_set = mkNameSet (ipNamesOfInsts given_ips) + + -- Simplify any methods that mention the implicit parameter + try_me inst | isFreeWrtIPs ip_set inst = Free + | otherwise = ReduceMe NoSCs + + simpl_loop givens wanteds + = mappM zonkInst givens `thenM` \ givens' -> + mappM zonkInst wanteds `thenM` \ wanteds' -> + + reduceContext doc try_me givens' wanteds' `thenM` \ (no_improvement, frees, binds, irreds) -> + + if no_improvement then + ASSERT( null irreds ) + returnM (frees, binds) + else + simpl_loop givens' (irreds ++ frees) `thenM` \ (frees1, binds1) -> + returnM (frees1, binds `unionBags` binds1) +\end{code} + + +%************************************************************************ +%* * +\subsection[binds-for-local-funs]{@bindInstsOfLocalFuns@} +%* * +%************************************************************************ + +When doing a binding group, we may have @Insts@ of local functions. +For example, we might have... +\begin{verbatim} +let f x = x + 1 -- orig local function (overloaded) + f.1 = f Int -- two instances of f + f.2 = f Float + in + (f.1 5, f.2 6.7) +\end{verbatim} +The point is: we must drop the bindings for @f.1@ and @f.2@ here, +where @f@ is in scope; those @Insts@ must certainly not be passed +upwards towards the top-level. If the @Insts@ were binding-ified up +there, they would have unresolvable references to @f@. + +We pass in an @init_lie@ of @Insts@ and a list of locally-bound @Ids@. +For each method @Inst@ in the @init_lie@ that mentions one of the +@Ids@, we create a binding. We return the remaining @Insts@ (in an +@LIE@), as well as the @HsBinds@ generated. 
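+
+In source terms, the situation sketched above arises from a program
+shaped roughly like the following (a hypothetical, standalone sketch):
+
+\begin{verbatim}
+module LocalOverload where
+
+-- 'f' is overloaded (Num a => a -> a); its two monomorphic
+-- instantiations, at Int and at Float, must be bound here, where
+-- f is in scope, rather than floated towards the top level.
+pair :: (Int, Float)
+pair = let f x = x + 1
+       in (f 5, f 6.7)
+\end{verbatim}
+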
+ +\begin{code} +bindInstsOfLocalFuns :: [Inst] -> [TcId] -> TcM TcDictBinds +-- Simplifies only MethodInsts, and generates only bindings of the form +-- fm = f tys dicts +-- We're careful not to even generate bindings of the form +-- d1 = d2 +-- You'd think that'd be fine, but it interacts with what is +-- arguably a bug in Match.tidyEqnInfo (see notes there) + +bindInstsOfLocalFuns wanteds local_ids + | null overloaded_ids + -- Common case + = extendLIEs wanteds `thenM_` + returnM emptyLHsBinds + + | otherwise + = simpleReduceLoop doc try_me for_me `thenM` \ (frees, binds, irreds) -> + ASSERT( null irreds ) + extendLIEs not_for_me `thenM_` + extendLIEs frees `thenM_` + returnM binds + where + doc = text "bindInsts" <+> ppr local_ids + overloaded_ids = filter is_overloaded local_ids + is_overloaded id = isOverloadedTy (idType id) + (for_me, not_for_me) = partition (isMethodFor overloaded_set) wanteds + + overloaded_set = mkVarSet overloaded_ids -- There can occasionally be a lot of them + -- so it's worth building a set, so that + -- lookup (in isMethodFor) is faster + try_me inst | isMethod inst = ReduceMe NoSCs + | otherwise = Free +\end{code} + + +%************************************************************************ +%* * +\subsection{Data types for the reduction mechanism} +%* * +%************************************************************************ + +The main control over context reduction is here. + +\begin{code} +data WhatToDo + = ReduceMe WantSCs -- Try to reduce this + -- If there's no instance, behave exactly like + -- DontReduce: add the inst to the irreducible ones, + -- but don't produce an error message of any kind. + -- It might be quite legitimate, such as (Eq a)! + + | KeepDictWithoutSCs -- Return as irreducible; don't add its superclasses + -- Rather specialised: see notes with tcSimplifyToDicts + + | DontReduceUnlessConstant -- Return as irreducible unless it can + -- be reduced to a constant in one step + + | Free -- Return as free + +reduceMe :: Inst -> WhatToDo +reduceMe inst = ReduceMe AddSCs + +data WantSCs = NoSCs | AddSCs -- Tells whether we should add the superclasses + -- of a predicate when adding it to the avails + -- The reason for this flag is entirely the super-class loop problem + -- Note [SUPERCLASS-LOOP 1] +\end{code} + + + +\begin{code} +type Avails = FiniteMap Inst Avail +emptyAvails = emptyFM + +data Avail + = IsFree -- Used for free Insts + | Irred -- Used for irreducible dictionaries, + -- which are going to be lambda bound + + | Given TcId -- Used for dictionaries for which we have a binding + -- e.g. those "given" in a signature + Bool -- True <=> actually consumed (splittable IPs only) + + | Rhs -- Used when there is a RHS + (LHsExpr TcId) -- The RHS + [Inst] -- Insts free in the RHS; we need these too + + | Linear -- Splittable Insts only.
+ Int -- The Int is always 2 or more; indicates how + -- many copies are required + Inst -- The splitter + Avail -- Where the "master copy" is + + | LinRhss -- Splittable Insts only; this is used only internally + -- by extractResults, where a Linear + -- is turned into an LinRhss + [LHsExpr TcId] -- A supply of suitable RHSs + +pprAvails avails = vcat [sep [ppr inst, nest 2 (equals <+> pprAvail avail)] + | (inst,avail) <- fmToList avails ] + +instance Outputable Avail where + ppr = pprAvail + +pprAvail IsFree = text "Free" +pprAvail Irred = text "Irred" +pprAvail (Given x b) = text "Given" <+> ppr x <+> + if b then text "(used)" else empty +pprAvail (Rhs rhs bs) = text "Rhs" <+> ppr rhs <+> braces (ppr bs) +pprAvail (Linear n i a) = text "Linear" <+> ppr n <+> braces (ppr i) <+> ppr a +pprAvail (LinRhss rhss) = text "LinRhss" <+> ppr rhss +\end{code} + +Extracting the bindings from a bunch of Avails. +The bindings do *not* come back sorted in dependency order. +We assume that they'll be wrapped in a big Rec, so that the +dependency analyser can sort them out later + +The loop startes +\begin{code} +extractResults :: Avails + -> [Inst] -- Wanted + -> TcM (TcDictBinds, -- Bindings + [Inst], -- Irreducible ones + [Inst]) -- Free ones + +extractResults avails wanteds + = go avails emptyBag [] [] wanteds + where + go avails binds irreds frees [] + = returnM (binds, irreds, frees) + + go avails binds irreds frees (w:ws) + = case lookupFM avails w of + Nothing -> pprTrace "Urk: extractResults" (ppr w) $ + go avails binds irreds frees ws + + Just IsFree -> go (add_free avails w) binds irreds (w:frees) ws + Just Irred -> go (add_given avails w) binds (w:irreds) frees ws + + Just (Given id _) -> go avails new_binds irreds frees ws + where + new_binds | id == instToId w = binds + | otherwise = addBind binds w (L (instSpan w) (HsVar id)) + -- The sought Id can be one of the givens, via a superclass chain + -- and then we definitely don't want to generate an x=x binding! + + Just (Rhs rhs ws') -> go (add_given avails w) new_binds irreds frees (ws' ++ ws) + where + new_binds = addBind binds w rhs + + Just (Linear n split_inst avail) -- Transform Linear --> LinRhss + -> get_root irreds frees avail w `thenM` \ (irreds', frees', root_id) -> + split n (instToId split_inst) root_id w `thenM` \ (binds', rhss) -> + go (addToFM avails w (LinRhss rhss)) + (binds `unionBags` binds') + irreds' frees' (split_inst : w : ws) + + Just (LinRhss (rhs:rhss)) -- Consume one of the Rhss + -> go new_avails new_binds irreds frees ws + where + new_binds = addBind binds w rhs + new_avails = addToFM avails w (LinRhss rhss) + + get_root irreds frees (Given id _) w = returnM (irreds, frees, id) + get_root irreds frees Irred w = cloneDict w `thenM` \ w' -> + returnM (w':irreds, frees, instToId w') + get_root irreds frees IsFree w = cloneDict w `thenM` \ w' -> + returnM (irreds, w':frees, instToId w') + + add_given avails w = addToFM avails w (Given (instToId w) True) + + add_free avails w | isMethod w = avails + | otherwise = add_given avails w + -- NB: Hack alert! + -- Do *not* replace Free by Given if it's a method. + -- The following situation shows why this is bad: + -- truncate :: forall a. RealFrac a => forall b. 
Integral b => a -> b + -- From an application (truncate f i) we get + -- t1 = truncate at f + -- t2 = t1 at i + -- If we have also have a second occurrence of truncate, we get + -- t3 = truncate at f + -- t4 = t3 at i + -- When simplifying with i,f free, we might still notice that + -- t1=t3; but alas, the binding for t2 (which mentions t1) + -- will continue to float out! + -- (split n i a) returns: n rhss + -- auxiliary bindings + -- 1 or 0 insts to add to irreds + + +split :: Int -> TcId -> TcId -> Inst + -> TcM (TcDictBinds, [LHsExpr TcId]) +-- (split n split_id root_id wanted) returns +-- * a list of 'n' expressions, all of which witness 'avail' +-- * a bunch of auxiliary bindings to support these expressions +-- * one or zero insts needed to witness the whole lot +-- (maybe be zero if the initial Inst is a Given) +-- +-- NB: 'wanted' is just a template + +split n split_id root_id wanted + = go n + where + ty = linearInstType wanted + pair_ty = mkTyConApp pairTyCon [ty,ty] + id = instToId wanted + occ = getOccName id + loc = getSrcLoc id + span = instSpan wanted + + go 1 = returnM (emptyBag, [L span $ HsVar root_id]) + + go n = go ((n+1) `div` 2) `thenM` \ (binds1, rhss) -> + expand n rhss `thenM` \ (binds2, rhss') -> + returnM (binds1 `unionBags` binds2, rhss') + + -- (expand n rhss) + -- Given ((n+1)/2) rhss, make n rhss, using auxiliary bindings + -- e.g. expand 3 [rhs1, rhs2] + -- = ( { x = split rhs1 }, + -- [fst x, snd x, rhs2] ) + expand n rhss + | n `rem` 2 == 0 = go rhss -- n is even + | otherwise = go (tail rhss) `thenM` \ (binds', rhss') -> + returnM (binds', head rhss : rhss') + where + go rhss = mapAndUnzipM do_one rhss `thenM` \ (binds', rhss') -> + returnM (listToBag binds', concat rhss') + + do_one rhs = newUnique `thenM` \ uniq -> + tcLookupId fstName `thenM` \ fst_id -> + tcLookupId sndName `thenM` \ snd_id -> + let + x = mkUserLocal occ uniq pair_ty loc + in + returnM (L span (VarBind x (mk_app span split_id rhs)), + [mk_fs_app span fst_id ty x, mk_fs_app span snd_id ty x]) + +mk_fs_app span id ty var = L span (HsVar id) `mkHsTyApp` [ty,ty] `mkHsApp` (L span (HsVar var)) + +mk_app span id rhs = L span (HsApp (L span (HsVar id)) rhs) + +addBind binds inst rhs = binds `unionBags` unitBag (L (instLocSrcSpan (instLoc inst)) + (VarBind (instToId inst) rhs)) +instSpan wanted = instLocSrcSpan (instLoc wanted) +\end{code} + + +%************************************************************************ +%* * +\subsection[reduce]{@reduce@} +%* * +%************************************************************************ + +When the "what to do" predicate doesn't depend on the quantified type variables, +matters are easier. We don't need to do any zonking, unless the improvement step +does something, in which case we zonk before iterating. + +The "given" set is always empty. 
+ +\begin{code} +simpleReduceLoop :: SDoc + -> (Inst -> WhatToDo) -- What to do, *not* based on the quantified type variables + -> [Inst] -- Wanted + -> TcM ([Inst], -- Free + TcDictBinds, + [Inst]) -- Irreducible + +simpleReduceLoop doc try_me wanteds + = mappM zonkInst wanteds `thenM` \ wanteds' -> + reduceContext doc try_me [] wanteds' `thenM` \ (no_improvement, frees, binds, irreds) -> + if no_improvement then + returnM (frees, binds, irreds) + else + simpleReduceLoop doc try_me (irreds ++ frees) `thenM` \ (frees1, binds1, irreds1) -> + returnM (frees1, binds `unionBags` binds1, irreds1) +\end{code} + + + +\begin{code} +reduceContext :: SDoc + -> (Inst -> WhatToDo) + -> [Inst] -- Given + -> [Inst] -- Wanted + -> TcM (Bool, -- True <=> improve step did no unification + [Inst], -- Free + TcDictBinds, -- Dictionary bindings + [Inst]) -- Irreducible + +reduceContext doc try_me givens wanteds + = + traceTc (text "reduceContext" <+> (vcat [ + text "----------------------", + doc, + text "given" <+> ppr givens, + text "wanted" <+> ppr wanteds, + text "----------------------" + ])) `thenM_` + + -- Build the Avail mapping from "givens" + foldlM addGiven emptyAvails givens `thenM` \ init_state -> + + -- Do the real work + reduceList (0,[]) try_me wanteds init_state `thenM` \ avails -> + + -- Do improvement, using everything in avails + -- In particular, avails includes all superclasses of everything + tcImprove avails `thenM` \ no_improvement -> + + extractResults avails wanteds `thenM` \ (binds, irreds, frees) -> + + traceTc (text "reduceContext end" <+> (vcat [ + text "----------------------", + doc, + text "given" <+> ppr givens, + text "wanted" <+> ppr wanteds, + text "----", + text "avails" <+> pprAvails avails, + text "frees" <+> ppr frees, + text "no_improvement =" <+> ppr no_improvement, + text "----------------------" + ])) `thenM_` + + returnM (no_improvement, frees, binds, irreds) + +-- reduceContextWithoutImprovement differs from reduceContext +-- (a) no improvement +-- (b) 'givens' is assumed empty +reduceContextWithoutImprovement doc try_me wanteds + = + traceTc (text "reduceContextWithoutImprovement" <+> (vcat [ + text "----------------------", + doc, + text "wanted" <+> ppr wanteds, + text "----------------------" + ])) `thenM_` + + -- Do the real work + reduceList (0,[]) try_me wanteds emptyAvails `thenM` \ avails -> + extractResults avails wanteds `thenM` \ (binds, irreds, frees) -> + + traceTc (text "reduceContextWithoutImprovement end" <+> (vcat [ + text "----------------------", + doc, + text "wanted" <+> ppr wanteds, + text "----", + text "avails" <+> pprAvails avails, + text "frees" <+> ppr frees, + text "----------------------" + ])) `thenM_` + + returnM (frees, binds, irreds) + +tcImprove :: Avails -> TcM Bool -- False <=> no change +-- Perform improvement using all the predicates in Avails +tcImprove avails + = tcGetInstEnvs `thenM` \ inst_envs -> + let + preds = [ (pred, pp_loc) + | (inst, avail) <- fmToList avails, + pred <- get_preds inst avail, + let pp_loc = pprInstLoc (instLoc inst) + ] + -- Avails has all the superclasses etc (good) + -- It also has all the intermediates of the deduction (good) + -- It does not have duplicates (good) + -- NB that (?x::t1) and (?x::t2) will be held separately in avails + -- so that improve will see them separate + + -- For free Methods, we want to take predicates from their context, + -- but for Methods that have been squished their context will already + -- be in Avails, and we don't want duplicates. 
Hence this rather + -- horrid get_preds function + get_preds inst IsFree = fdPredsOfInst inst + get_preds inst other | isDict inst = [dictPred inst] + | otherwise = [] + + eqns = improve get_insts preds + get_insts clas = classInstances inst_envs clas + in + if null eqns then + returnM True + else + traceTc (ptext SLIT("Improve:") <+> vcat (map pprEquationDoc eqns)) `thenM_` + mappM_ unify eqns `thenM_` + returnM False + where + unify ((qtvs, pairs), what1, what2) + = addErrCtxtM (mkEqnMsg what1 what2) $ + tcInstTyVars (varSetElems qtvs) `thenM` \ (_, _, tenv) -> + mapM_ (unif_pr tenv) pairs + unif_pr tenv (ty1,ty2) = unifyType (substTy tenv ty1) (substTy tenv ty2) + +pprEquationDoc (eqn, (p1,w1), (p2,w2)) = vcat [pprEquation eqn, nest 2 (ppr p1), nest 2 (ppr p2)] + +mkEqnMsg (pred1,from1) (pred2,from2) tidy_env + = do { pred1' <- zonkTcPredType pred1; pred2' <- zonkTcPredType pred2 + ; let { pred1'' = tidyPred tidy_env pred1'; pred2'' = tidyPred tidy_env pred2' } + ; let msg = vcat [ptext SLIT("When using functional dependencies to combine"), + nest 2 (sep [ppr pred1'' <> comma, nest 2 from1]), + nest 2 (sep [ppr pred2'' <> comma, nest 2 from2])] + ; return (tidy_env, msg) } +\end{code} + +The main context-reduction function is @reduce@. Here's its game plan. + +\begin{code} +reduceList :: (Int,[Inst]) -- Stack (for err msgs) + -- along with its depth + -> (Inst -> WhatToDo) + -> [Inst] + -> Avails + -> TcM Avails +\end{code} + +@reduce@ is passed + try_me: given an inst, this function returns + Reduce reduce this + DontReduce return this in "irreds" + Free return this in "frees" + + wanteds: The list of insts to reduce + state: An accumulating parameter of type Avails + that contains the state of the algorithm + + It returns a Avails. + +The (n,stack) pair is just used for error reporting. +n is always the depth of the stack. +The stack is the stack of Insts being reduced: to produce X +I had to produce Y, to produce Y I had to produce Z, and so on. + +\begin{code} +reduceList (n,stack) try_me wanteds state + | n > opt_MaxContextReductionDepth + = failWithTc (reduceDepthErr n stack) + + | otherwise + = +#ifdef DEBUG + (if n > 8 then + pprTrace "Interesting! Context reduction stack deeper than 8:" + (int n $$ ifPprDebug (nest 2 (pprStack stack))) + else (\x->x)) +#endif + go wanteds state + where + go [] state = returnM state + go (w:ws) state = reduce (n+1, w:stack) try_me w state `thenM` \ state' -> + go ws state' + + -- Base case: we're done! 
+reduce stack try_me wanted avails + -- It's the same as an existing inst, or a superclass thereof + | Just avail <- isAvailable avails wanted + = if isLinearInst wanted then + addLinearAvailable avails avail wanted `thenM` \ (avails', wanteds') -> + reduceList stack try_me wanteds' avails' + else + returnM avails -- No op for non-linear things + + | otherwise + = case try_me wanted of { + + KeepDictWithoutSCs -> addIrred NoSCs avails wanted + + ; DontReduceUnlessConstant -> -- It's irreducible (or at least should not be reduced) + -- First, see if the inst can be reduced to a constant in one step + try_simple (addIrred AddSCs) -- Assume want superclasses + + ; Free -> -- It's free so just chuck it upstairs + -- First, see if the inst can be reduced to a constant in one step + try_simple addFree + + ; ReduceMe want_scs -> -- It should be reduced + lookupInst wanted `thenM` \ lookup_result -> + case lookup_result of + GenInst wanteds' rhs -> addIrred NoSCs avails wanted `thenM` \ avails1 -> + reduceList stack try_me wanteds' avails1 `thenM` \ avails2 -> + addWanted want_scs avails2 wanted rhs wanteds' + -- Experiment with temporarily doing addIrred *before* the reduceList, + -- which has the effect of adding the thing we are trying + -- to prove to the database before trying to prove the things it + -- needs. See note [RECURSIVE DICTIONARIES] + -- NB: we must not do an addWanted before, because that adds the + -- superclasses too, and thaat can lead to a spurious loop; see + -- the examples in [SUPERCLASS-LOOP] + -- So we do an addIrred before, and then overwrite it afterwards with addWanted + + SimpleInst rhs -> addWanted want_scs avails wanted rhs [] + + NoInstance -> -- No such instance! + -- Add it and its superclasses + addIrred want_scs avails wanted + } + where + try_simple do_this_otherwise + = lookupInst wanted `thenM` \ lookup_result -> + case lookup_result of + SimpleInst rhs -> addWanted AddSCs avails wanted rhs [] + other -> do_this_otherwise avails wanted +\end{code} + + +\begin{code} +------------------------- +isAvailable :: Avails -> Inst -> Maybe Avail +isAvailable avails wanted = lookupFM avails wanted + -- NB 1: the Ord instance of Inst compares by the class/type info + -- *not* by unique. 
So + -- d1::C Int == d2::C Int + +addLinearAvailable :: Avails -> Avail -> Inst -> TcM (Avails, [Inst]) +addLinearAvailable avails avail wanted + -- avails currently maps [wanted -> avail] + -- Extend avails to reflect a neeed for an extra copy of avail + + | Just avail' <- split_avail avail + = returnM (addToFM avails wanted avail', []) + + | otherwise + = tcLookupId splitName `thenM` \ split_id -> + tcInstClassOp (instLoc wanted) split_id + [linearInstType wanted] `thenM` \ split_inst -> + returnM (addToFM avails wanted (Linear 2 split_inst avail), [split_inst]) + + where + split_avail :: Avail -> Maybe Avail + -- (Just av) if there's a modified version of avail that + -- we can use to replace avail in avails + -- Nothing if there isn't, so we need to create a Linear + split_avail (Linear n i a) = Just (Linear (n+1) i a) + split_avail (Given id used) | not used = Just (Given id True) + | otherwise = Nothing + split_avail Irred = Nothing + split_avail IsFree = Nothing + split_avail other = pprPanic "addLinearAvailable" (ppr avail $$ ppr wanted $$ ppr avails) + +------------------------- +addFree :: Avails -> Inst -> TcM Avails + -- When an Inst is tossed upstairs as 'free' we nevertheless add it + -- to avails, so that any other equal Insts will be commoned up right + -- here rather than also being tossed upstairs. This is really just + -- an optimisation, and perhaps it is more trouble that it is worth, + -- as the following comments show! + -- + -- NB: do *not* add superclasses. If we have + -- df::Floating a + -- dn::Num a + -- but a is not bound here, then we *don't* want to derive + -- dn from df here lest we lose sharing. + -- +addFree avails free = returnM (addToFM avails free IsFree) + +addWanted :: WantSCs -> Avails -> Inst -> LHsExpr TcId -> [Inst] -> TcM Avails +addWanted want_scs avails wanted rhs_expr wanteds + = addAvailAndSCs want_scs avails wanted avail + where + avail = Rhs rhs_expr wanteds + +addGiven :: Avails -> Inst -> TcM Avails +addGiven avails given = addAvailAndSCs AddSCs avails given (Given (instToId given) False) + -- Always add superclasses for 'givens' + -- + -- No ASSERT( not (given `elemFM` avails) ) because in an instance + -- decl for Ord t we can add both Ord t and Eq t as 'givens', + -- so the assert isn't true + +addIrred :: WantSCs -> Avails -> Inst -> TcM Avails +addIrred want_scs avails irred = ASSERT2( not (irred `elemFM` avails), ppr irred $$ ppr avails ) + addAvailAndSCs want_scs avails irred Irred + +addAvailAndSCs :: WantSCs -> Avails -> Inst -> Avail -> TcM Avails +addAvailAndSCs want_scs avails inst avail + | not (isClassDict inst) = return avails_with_inst + | NoSCs <- want_scs = return avails_with_inst + | otherwise = do { traceTc (text "addAvailAndSCs" <+> vcat [ppr inst, ppr deps]) + ; addSCs is_loop avails_with_inst inst } + where + avails_with_inst = addToFM avails inst avail + + is_loop pred = any (`tcEqType` mkPredTy pred) dep_tys + -- Note: this compares by *type*, not by Unique + deps = findAllDeps (unitVarSet (instToId inst)) avail + dep_tys = map idType (varSetElems deps) + + findAllDeps :: IdSet -> Avail -> IdSet + -- Find all the Insts that this one depends on + -- See Note [SUPERCLASS-LOOP 2] + -- Watch out, though. 
Since the avails may contain loops + -- (see Note [RECURSIVE DICTIONARIES]), so we need to track the ones we've seen so far + findAllDeps so_far (Rhs _ kids) = foldl find_all so_far kids + findAllDeps so_far other = so_far + + find_all :: IdSet -> Inst -> IdSet + find_all so_far kid + | kid_id `elemVarSet` so_far = so_far + | Just avail <- lookupFM avails kid = findAllDeps so_far' avail + | otherwise = so_far' + where + so_far' = extendVarSet so_far kid_id -- Add the new kid to so_far + kid_id = instToId kid + +addSCs :: (TcPredType -> Bool) -> Avails -> Inst -> TcM Avails + -- Add all the superclasses of the Inst to Avails + -- The first param says "dont do this because the original thing + -- depends on this one, so you'd build a loop" + -- Invariant: the Inst is already in Avails. + +addSCs is_loop avails dict + = do { sc_dicts <- newDictsAtLoc (instLoc dict) sc_theta' + ; foldlM add_sc avails (zipEqual "add_scs" sc_dicts sc_sels) } + where + (clas, tys) = getDictClassTys dict + (tyvars, sc_theta, sc_sels, _) = classBigSig clas + sc_theta' = substTheta (zipTopTvSubst tyvars tys) sc_theta + + add_sc avails (sc_dict, sc_sel) + | is_loop (dictPred sc_dict) = return avails -- See Note [SUPERCLASS-LOOP 2] + | is_given sc_dict = return avails + | otherwise = addSCs is_loop avails' sc_dict + where + sc_sel_rhs = mkHsDictApp (mkHsTyApp (L (instSpan dict) (HsVar sc_sel)) tys) [instToId dict] + avails' = addToFM avails sc_dict (Rhs sc_sel_rhs [dict]) + + is_given :: Inst -> Bool + is_given sc_dict = case lookupFM avails sc_dict of + Just (Given _ _) -> True -- Given is cheaper than superclass selection + other -> False +\end{code} + +Note [SUPERCLASS-LOOP 2] +~~~~~~~~~~~~~~~~~~~~~~~~ +But the above isn't enough. Suppose we are *given* d1:Ord a, +and want to deduce (d2:C [a]) where + + class Ord a => C a where + instance Ord [a] => C [a] where ... + +Then we'll use the instance decl to deduce C [a] from Ord [a], and then add the +superclasses of C [a] to avails. But we must not overwrite the binding +for Ord [a] (which is obtained from Ord a) with a superclass selection or we'll just +build a loop! + +Here's another variant, immortalised in tcrun020 + class Monad m => C1 m + class C1 m => C2 m x + instance C2 Maybe Bool +For the instance decl we need to build (C1 Maybe), and it's no good if +we run around and add (C2 Maybe Bool) and its superclasses to the avails +before we search for C1 Maybe. + +Here's another example + class Eq b => Foo a b + instance Eq a => Foo [a] a +If we are reducing + (Foo [t] t) + +we'll first deduce that it holds (via the instance decl). We must not +then overwrite the Eq t constraint with a superclass selection! + +At first I had a gross hack, whereby I simply did not add superclass constraints +in addWanted, though I did for addGiven and addIrred. This was sub-optimal, +becuase it lost legitimate superclass sharing, and it still didn't do the job: +I found a very obscure program (now tcrun021) in which improvement meant the +simplifier got two bites a the cherry... so something seemed to be an Irred +first time, but reducible next time. + +Now we implement the Right Solution, which is to check for loops directly +when adding superclasses. It's a bit like the occurs check in unification. 
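+
+For concreteness, here is the Foo example above as a complete,
+standalone program (hypothetical; the method and module name are made
+up, and it needs -fglasgow-exts for the multi-parameter class):
+
+\begin{verbatim}
+{-# OPTIONS_GHC -fglasgow-exts #-}
+module SCLoop where
+
+class Eq b => Foo a b where
+  foo :: a -> b -> Bool
+
+-- If a wanted (Foo [t] t) is reduced via this instance we get a new
+-- wanted (Eq t); that binding must not then be overwritten by a
+-- superclass selection from the very Foo dictionary being built, or
+-- the dictionaries would be defined in terms of each other.
+instance Eq a => Foo [a] a where
+  foo xs y = y `elem` xs
+\end{verbatim}
+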
+ + +Note [RECURSIVE DICTIONARIES] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Consider + data D r = ZeroD | SuccD (r (D r)); + + instance (Eq (r (D r))) => Eq (D r) where + ZeroD == ZeroD = True + (SuccD a) == (SuccD b) = a == b + _ == _ = False; + + equalDC :: D [] -> D [] -> Bool; + equalDC = (==); + +We need to prove (Eq (D [])). Here's how we go: + + d1 : Eq (D []) + +by instance decl, holds if + d2 : Eq [D []] + where d1 = dfEqD d2 + +by instance decl of Eq, holds if + d3 : D [] + where d2 = dfEqList d3 + d1 = dfEqD d2 + +But now we can "tie the knot" to give + + d3 = d1 + d2 = dfEqList d3 + d1 = dfEqD d2 + +and it'll even run! The trick is to put the thing we are trying to prove +(in this case Eq (D []) into the database before trying to prove its +contributing clauses. + + +%************************************************************************ +%* * +\section{tcSimplifyTop: defaulting} +%* * +%************************************************************************ + + +@tcSimplifyTop@ is called once per module to simplify all the constant +and ambiguous Insts. + +We need to be careful of one case. Suppose we have + + instance Num a => Num (Foo a b) where ... + +and @tcSimplifyTop@ is given a constraint (Num (Foo x y)). Then it'll simplify +to (Num x), and default x to Int. But what about y?? + +It's OK: the final zonking stage should zap y to (), which is fine. + + +\begin{code} +tcSimplifyTop, tcSimplifyInteractive :: [Inst] -> TcM TcDictBinds +tcSimplifyTop wanteds + = tc_simplify_top doc False {- Not interactive loop -} AddSCs wanteds + where + doc = text "tcSimplifyTop" + +tcSimplifyInteractive wanteds + = tc_simplify_top doc True {- Interactive loop -} AddSCs wanteds + where + doc = text "tcSimplifyTop" + +-- The TcLclEnv should be valid here, solely to improve +-- error message generation for the monomorphism restriction +tc_simplify_top doc is_interactive want_scs wanteds + = do { lcl_env <- getLclEnv + ; traceTc (text "tcSimplifyTop" <+> ppr (lclEnvElts lcl_env)) + + ; let try_me inst = ReduceMe want_scs + ; (frees, binds, irreds) <- simpleReduceLoop doc try_me wanteds + + ; let + -- First get rid of implicit parameters + (non_ips, bad_ips) = partition isClassDict irreds + + -- All the non-tv or multi-param ones are definite errors + (unary_tv_dicts, non_tvs) = partition is_unary_tyvar_dict non_ips + bad_tyvars = unionVarSets (map tyVarsOfInst non_tvs) + + -- Group by type variable + tv_groups = equivClasses cmp_by_tyvar unary_tv_dicts + + -- Pick the ones which its worth trying to disambiguate + -- namely, the ones whose type variable isn't bound + -- up with one of the non-tyvar classes + (default_gps, non_default_gps) = partition defaultable_group tv_groups + defaultable_group ds + = not (bad_tyvars `intersectsVarSet` tyVarsOfInst (head ds)) + && defaultable_classes (map get_clas ds) + defaultable_classes clss + | is_interactive = any isInteractiveClass clss + | otherwise = all isStandardClass clss && any isNumericClass clss + + isInteractiveClass cls = isNumericClass cls + || (classKey cls `elem` [showClassKey, eqClassKey, ordClassKey]) + -- In interactive mode, we default Show a to Show () + -- to avoid graututious errors on "show []" + + + -- Collect together all the bad guys + bad_guys = non_tvs ++ concat non_default_gps + (ambigs, no_insts) = partition isTyVarDict bad_guys + -- If the dict has no type constructors involved, it must be ambiguous, + -- except I suppose that another error with fundeps maybe should have + -- constrained those type variables + + -- Report 
definite errors + ; ASSERT( null frees ) + groupErrs (addNoInstanceErrs Nothing []) no_insts + ; strangeTopIPErrs bad_ips + + -- Deal with ambiguity errors, but only if + -- if there has not been an error so far: + -- errors often give rise to spurious ambiguous Insts. + -- For example: + -- f = (*) -- Monomorphic + -- g :: Num a => a -> a + -- g x = f x x + -- Here, we get a complaint when checking the type signature for g, + -- that g isn't polymorphic enough; but then we get another one when + -- dealing with the (Num a) context arising from f's definition; + -- we try to unify a with Int (to default it), but find that it's + -- already been unified with the rigid variable from g's type sig + ; binds_ambig <- ifErrsM (returnM []) $ + do { -- Complain about the ones that don't fall under + -- the Haskell rules for disambiguation + -- This group includes both non-existent instances + -- e.g. Num (IO a) and Eq (Int -> Int) + -- and ambiguous dictionaries + -- e.g. Num a + addTopAmbigErrs ambigs + + -- Disambiguate the ones that look feasible + ; mappM disambigGroup default_gps } + + ; return (binds `unionBags` unionManyBags binds_ambig) } + +---------------------------------- +d1 `cmp_by_tyvar` d2 = get_tv d1 `compare` get_tv d2 + +is_unary_tyvar_dict :: Inst -> Bool -- Dicts of form (C a) + -- Invariant: argument is a ClassDict, not IP or method +is_unary_tyvar_dict d = case getDictClassTys d of + (_, [ty]) -> tcIsTyVarTy ty + other -> False + +get_tv d = case getDictClassTys d of + (clas, [ty]) -> tcGetTyVar "tcSimplify" ty +get_clas d = case getDictClassTys d of + (clas, _) -> clas +\end{code} + +If a dictionary constrains a type variable which is + * not mentioned in the environment + * and not mentioned in the type of the expression +then it is ambiguous. No further information will arise to instantiate +the type variable; nor will it be generalised and turned into an extra +parameter to a function. + +It is an error for this to occur, except that Haskell provided for +certain rules to be applied in the special case of numeric types. +Specifically, if + * at least one of its classes is a numeric class, and + * all of its classes are numeric or standard +then the type variable can be defaulted to the first type in the +default-type list which is an instance of all the offending classes. + +So here is the function which does the work. It takes the ambiguous +dictionaries and either resolves them (producing bindings) or +complains. It works by splitting the dictionary list by type +variable, and using @disambigOne@ to do the real business. + +@disambigOne@ assumes that its arguments dictionaries constrain all +the same type variable. + +ADR Comment 20/6/94: I've changed the @CReturnable@ case to default to +@()@ instead of @Int@. I reckon this is the Right Thing to do since +the most common use of defaulting is code like: +\begin{verbatim} + _ccall_ foo `seqPrimIO` bar +\end{verbatim} +Since we're not using the result of @foo@, the result if (presumably) +@void@. + +\begin{code} +disambigGroup :: [Inst] -- All standard classes of form (C a) + -> TcM TcDictBinds + +disambigGroup dicts + = -- THE DICTS OBEY THE DEFAULTABLE CONSTRAINT + -- SO, TRY DEFAULT TYPES IN ORDER + + -- Failure here is caused by there being no type in the + -- default list which can satisfy all the ambiguous classes. + -- For example, if Real a is reqd, but the only type in the + -- default list is Int. 
+ get_default_tys `thenM` \ default_tys -> + let + try_default [] -- No defaults work, so fail + = failM + + try_default (default_ty : default_tys) + = tryTcLIE_ (try_default default_tys) $ -- If default_ty fails, we try + -- default_tys instead + tcSimplifyDefault theta `thenM` \ _ -> + returnM default_ty + where + theta = [mkClassPred clas [default_ty] | clas <- classes] + in + -- See if any default works + tryM (try_default default_tys) `thenM` \ mb_ty -> + case mb_ty of + Left _ -> bomb_out + Right chosen_default_ty -> choose_default chosen_default_ty + where + tyvar = get_tv (head dicts) -- Should be non-empty + classes = map get_clas dicts + + choose_default default_ty -- Commit to tyvar = default_ty + = -- Bind the type variable + unifyType default_ty (mkTyVarTy tyvar) `thenM_` + -- and reduce the context, for real this time + simpleReduceLoop (text "disambig" <+> ppr dicts) + reduceMe dicts `thenM` \ (frees, binds, ambigs) -> + WARN( not (null frees && null ambigs), ppr frees $$ ppr ambigs ) + warnDefault dicts default_ty `thenM_` + returnM binds + + bomb_out = addTopAmbigErrs dicts `thenM_` + returnM emptyBag + +get_default_tys + = do { mb_defaults <- getDefaultTys + ; case mb_defaults of + Just tys -> return tys + Nothing -> -- No use-supplied default; + -- use [Integer, Double] + do { integer_ty <- tcMetaTy integerTyConName + ; checkWiredInTyCon doubleTyCon + ; return [integer_ty, doubleTy] } } +\end{code} + +[Aside - why the defaulting mechanism is turned off when + dealing with arguments and results to ccalls. + +When typechecking _ccall_s, TcExpr ensures that the external +function is only passed arguments (and in the other direction, +results) of a restricted set of 'native' types. + +The interaction between the defaulting mechanism for numeric +values and CC & CR can be a bit puzzling to the user at times. +For example, + + x <- _ccall_ f + if (x /= 0) then + _ccall_ g x + else + return () + +What type has 'x' got here? That depends on the default list +in operation, if it is equal to Haskell 98's default-default +of (Integer, Double), 'x' has type Double, since Integer +is not an instance of CR. If the default list is equal to +Haskell 1.4's default-default of (Int, Double), 'x' has type +Int. + +End of aside] + + +%************************************************************************ +%* * +\subsection[simple]{@Simple@ versions} +%* * +%************************************************************************ + +Much simpler versions when there are no bindings to make! + +@tcSimplifyThetas@ simplifies class-type constraints formed by +@deriving@ declarations and when specialising instances. We are +only interested in the simplified bunch of class/type constraints. + +It simplifies to constraints of the form (C a b c) where +a,b,c are type variables. This is required for the context of +instance declarations. + +\begin{code} +tcSimplifyDeriv :: TyCon + -> [TyVar] + -> ThetaType -- Wanted + -> TcM ThetaType -- Needed + +tcSimplifyDeriv tc tyvars theta + = tcInstTyVars tyvars `thenM` \ (tvs, _, tenv) -> + -- The main loop may do unification, and that may crash if + -- it doesn't see a TcTyVar, so we have to instantiate. Sigh + -- ToDo: what if two of them do get unified? 
+ newDicts DerivOrigin (substTheta tenv theta) `thenM` \ wanteds -> + simpleReduceLoop doc reduceMe wanteds `thenM` \ (frees, _, irreds) -> + ASSERT( null frees ) -- reduceMe never returns Free + + doptM Opt_GlasgowExts `thenM` \ gla_exts -> + doptM Opt_AllowUndecidableInstances `thenM` \ undecidable_ok -> + let + tv_set = mkVarSet tvs + + (bad_insts, ok_insts) = partition is_bad_inst irreds + is_bad_inst dict + = let pred = dictPred dict -- reduceMe squashes all non-dicts + in isEmptyVarSet (tyVarsOfPred pred) + -- Things like (Eq T) are bad + || (not gla_exts && not (isTyVarClassPred pred)) + + simpl_theta = map dictPred ok_insts + weird_preds = [pred | pred <- simpl_theta + , not (tyVarsOfPred pred `subVarSet` tv_set)] + -- Check for a bizarre corner case, when the derived instance decl should + -- have form instance C a b => D (T a) where ... + -- Note that 'b' isn't a parameter of T. This gives rise to all sorts + -- of problems; in particular, it's hard to compare solutions for + -- equality when finding the fixpoint. So I just rule it out for now. + + rev_env = zipTopTvSubst tvs (mkTyVarTys tyvars) + -- This reverse-mapping is a Royal Pain, + -- but the result should mention TyVars not TcTyVars + + head_ty = TyConApp tc (map TyVarTy tvs) + in + + addNoInstanceErrs Nothing [] bad_insts `thenM_` + mapM_ (addErrTc . badDerivedPred) weird_preds `thenM_` + checkAmbiguity tvs simpl_theta tv_set `thenM_` + -- Check instance termination as for user-declared instances. + -- unless we had -fallow-undecidable-instances (which risks + -- non-termination in the 'deriving' context-inference fixpoint + -- loop). + ifM (gla_exts && not undecidable_ok) + (checkInstTermination simpl_theta [head_ty]) `thenM_` + returnM (substTheta rev_env simpl_theta) + where + doc = ptext SLIT("deriving classes for a data type") +\end{code} + +@tcSimplifyDefault@ just checks class-type constraints, essentially; +used with \tr{default} declarations. We are only interested in +whether it worked or not. + +\begin{code} +tcSimplifyDefault :: ThetaType -- Wanted; has no type variables in it + -> TcM () + +tcSimplifyDefault theta + = newDicts DefaultOrigin theta `thenM` \ wanteds -> + simpleReduceLoop doc reduceMe wanteds `thenM` \ (frees, _, irreds) -> + ASSERT( null frees ) -- try_me never returns Free + addNoInstanceErrs Nothing [] irreds `thenM_` + if null irreds then + returnM () + else + failM + where + doc = ptext SLIT("default declaration") +\end{code} + + +%************************************************************************ +%* * +\section{Errors and contexts} +%* * +%************************************************************************ + +ToDo: for these error messages, should we note the location as coming +from the insts, or just whatever seems to be around in the monad just +now? + +\begin{code} +groupErrs :: ([Inst] -> TcM ()) -- Deal with one group + -> [Inst] -- The offending Insts + -> TcM () +-- Group together insts with the same origin +-- We want to report them together in error messages + +groupErrs report_err [] + = returnM () +groupErrs report_err (inst:insts) + = do_one (inst:friends) `thenM_` + groupErrs report_err others + + where + -- (It may seem a bit crude to compare the error messages, + -- but it makes sure that we combine just what the user sees, + -- and it avoids need equality on InstLocs.) 
+ (friends, others) = partition is_friend insts + loc_msg = showSDoc (pprInstLoc (instLoc inst)) + is_friend friend = showSDoc (pprInstLoc (instLoc friend)) == loc_msg + do_one insts = addInstCtxt (instLoc (head insts)) (report_err insts) + -- Add location and context information derived from the Insts + +-- Add the "arising from..." part to a message about bunch of dicts +addInstLoc :: [Inst] -> Message -> Message +addInstLoc insts msg = msg $$ nest 2 (pprInstLoc (instLoc (head insts))) + +addTopIPErrs :: [Name] -> [Inst] -> TcM () +addTopIPErrs bndrs [] + = return () +addTopIPErrs bndrs ips + = addErrTcM (tidy_env, mk_msg tidy_ips) + where + (tidy_env, tidy_ips) = tidyInsts ips + mk_msg ips = vcat [sep [ptext SLIT("Implicit parameters escape from"), + nest 2 (ptext SLIT("the monomorphic top-level binding(s) of") + <+> pprBinders bndrs <> colon)], + nest 2 (vcat (map ppr_ip ips)), + monomorphism_fix] + ppr_ip ip = pprPred (dictPred ip) <+> pprInstLoc (instLoc ip) + +strangeTopIPErrs :: [Inst] -> TcM () +strangeTopIPErrs dicts -- Strange, becuase addTopIPErrs should have caught them all + = groupErrs report tidy_dicts + where + (tidy_env, tidy_dicts) = tidyInsts dicts + report dicts = addErrTcM (tidy_env, mk_msg dicts) + mk_msg dicts = addInstLoc dicts (ptext SLIT("Unbound implicit parameter") <> + plural tidy_dicts <+> pprDictsTheta tidy_dicts) + +addNoInstanceErrs :: Maybe SDoc -- Nothing => top level + -- Just d => d describes the construct + -> [Inst] -- What is given by the context or type sig + -> [Inst] -- What is wanted + -> TcM () +addNoInstanceErrs mb_what givens [] + = returnM () +addNoInstanceErrs mb_what givens dicts + = -- Some of the dicts are here because there is no instances + -- and some because there are too many instances (overlap) + tcGetInstEnvs `thenM` \ inst_envs -> + let + (tidy_env1, tidy_givens) = tidyInsts givens + (tidy_env2, tidy_dicts) = tidyMoreInsts tidy_env1 dicts + + -- Run through the dicts, generating a message for each + -- overlapping one, but simply accumulating all the + -- no-instance ones so they can be reported as a group + (overlap_doc, no_inst_dicts) = foldl check_overlap (empty, []) tidy_dicts + check_overlap (overlap_doc, no_inst_dicts) dict + | not (isClassDict dict) = (overlap_doc, dict : no_inst_dicts) + | otherwise + = case lookupInstEnv inst_envs clas tys of + -- The case of exactly one match and no unifiers means + -- a successful lookup. 
That can't happen here, becuase + -- dicts only end up here if they didn't match in Inst.lookupInst +#ifdef DEBUG + ([m],[]) -> pprPanic "addNoInstanceErrs" (ppr dict) +#endif + ([], _) -> (overlap_doc, dict : no_inst_dicts) -- No match + res -> (mk_overlap_msg dict res $$ overlap_doc, no_inst_dicts) + where + (clas,tys) = getDictClassTys dict + in + + -- Now generate a good message for the no-instance bunch + mk_probable_fix tidy_env2 no_inst_dicts `thenM` \ (tidy_env3, probable_fix) -> + let + no_inst_doc | null no_inst_dicts = empty + | otherwise = vcat [addInstLoc no_inst_dicts heading, probable_fix] + heading | null givens = ptext SLIT("No instance") <> plural no_inst_dicts <+> + ptext SLIT("for") <+> pprDictsTheta no_inst_dicts + | otherwise = sep [ptext SLIT("Could not deduce") <+> pprDictsTheta no_inst_dicts, + nest 2 $ ptext SLIT("from the context") <+> pprDictsTheta tidy_givens] + in + -- And emit both the non-instance and overlap messages + addErrTcM (tidy_env3, no_inst_doc $$ overlap_doc) + where + mk_overlap_msg dict (matches, unifiers) + = vcat [ addInstLoc [dict] ((ptext SLIT("Overlapping instances for") + <+> pprPred (dictPred dict))), + sep [ptext SLIT("Matching instances") <> colon, + nest 2 (vcat [pprInstances ispecs, pprInstances unifiers])], + ASSERT( not (null matches) ) + if not (isSingleton matches) + then -- Two or more matches + empty + else -- One match, plus some unifiers + ASSERT( not (null unifiers) ) + parens (vcat [ptext SLIT("The choice depends on the instantiation of") <+> + quotes (pprWithCommas ppr (varSetElems (tyVarsOfInst dict))), + ptext SLIT("Use -fallow-incoherent-instances to use the first choice above")])] + where + ispecs = [ispec | (_, ispec) <- matches] + + mk_probable_fix tidy_env dicts + = returnM (tidy_env, sep [ptext SLIT("Possible fix:"), nest 2 (vcat fixes)]) + where + fixes = add_ors (fix1 ++ fix2) + + fix1 = case mb_what of + Nothing -> [] -- Top level + Just what -> -- Nested (type signatures, instance decls) + [ sep [ ptext SLIT("add") <+> pprDictsTheta dicts, + ptext SLIT("to the") <+> what] ] + + fix2 | null instance_dicts = [] + | otherwise = [ ptext SLIT("add an instance declaration for") + <+> pprDictsTheta instance_dicts ] + instance_dicts = [d | d <- dicts, isClassDict d, not (isTyVarDict d)] + -- Insts for which it is worth suggesting an adding an instance declaration + -- Exclude implicit parameters, and tyvar dicts + + add_ors :: [SDoc] -> [SDoc] -- The empty case should not happen + add_ors [] = [ptext SLIT("[No suggested fixes]")] -- Strange + add_ors (f1:fs) = f1 : map (ptext SLIT("or") <+>) fs + +addTopAmbigErrs dicts +-- Divide into groups that share a common set of ambiguous tyvars + = mapM report (equivClasses cmp [(d, tvs_of d) | d <- tidy_dicts]) + where + (tidy_env, tidy_dicts) = tidyInsts dicts + + tvs_of :: Inst -> [TcTyVar] + tvs_of d = varSetElems (tyVarsOfInst d) + cmp (_,tvs1) (_,tvs2) = tvs1 `compare` tvs2 + + report :: [(Inst,[TcTyVar])] -> TcM () + report pairs@((inst,tvs) : _) -- The pairs share a common set of ambiguous tyvars + = mkMonomorphismMsg tidy_env tvs `thenM` \ (tidy_env, mono_msg) -> + setSrcSpan (instLocSrcSpan (instLoc inst)) $ + -- the location of the first one will do for the err message + addErrTcM (tidy_env, msg $$ mono_msg) + where + dicts = map fst pairs + msg = sep [text "Ambiguous type variable" <> plural tvs <+> + pprQuotedList tvs <+> in_msg, + nest 2 (pprDictsInFull dicts)] + in_msg = text "in the constraint" <> plural dicts <> colon + + +mkMonomorphismMsg :: TidyEnv -> [TcTyVar] 
-> TcM (TidyEnv, Message) +-- There's an error with these Insts; if they have free type variables +-- it's probably caused by the monomorphism restriction. +-- Try to identify the offending variable +-- ASSUMPTION: the Insts are fully zonked +mkMonomorphismMsg tidy_env inst_tvs + = findGlobals (mkVarSet inst_tvs) tidy_env `thenM` \ (tidy_env, docs) -> + returnM (tidy_env, mk_msg docs) + where + mk_msg [] = ptext SLIT("Probable fix: add a type signature that fixes these type variable(s)") + -- This happens in things like + -- f x = show (read "foo") + -- whre monomorphism doesn't play any role + mk_msg docs = vcat [ptext SLIT("Possible cause: the monomorphism restriction applied to the following:"), + nest 2 (vcat docs), + monomorphism_fix + ] +monomorphism_fix :: SDoc +monomorphism_fix = ptext SLIT("Probable fix:") <+> + (ptext SLIT("give these definition(s) an explicit type signature") + $$ ptext SLIT("or use -fno-monomorphism-restriction")) + +warnDefault dicts default_ty + = doptM Opt_WarnTypeDefaults `thenM` \ warn_flag -> + addInstCtxt (instLoc (head dicts)) (warnTc warn_flag warn_msg) + where + -- Tidy them first + (_, tidy_dicts) = tidyInsts dicts + warn_msg = vcat [ptext SLIT("Defaulting the following constraint(s) to type") <+> + quotes (ppr default_ty), + pprDictsInFull tidy_dicts] + +-- Used for the ...Thetas variants; all top level +badDerivedPred pred + = vcat [ptext SLIT("Can't derive instances where the instance context mentions"), + ptext SLIT("type variables that are not data type parameters"), + nest 2 (ptext SLIT("Offending constraint:") <+> ppr pred)] + +reduceDepthErr n stack + = vcat [ptext SLIT("Context reduction stack overflow; size =") <+> int n, + ptext SLIT("Use -fcontext-stack20 to increase stack size to (e.g.) 20"), + nest 4 (pprStack stack)] + +pprStack stack = vcat (map pprInstInFull stack) +\end{code} diff --git a/compiler/typecheck/TcSplice.hi-boot-6 b/compiler/typecheck/TcSplice.hi-boot-6 new file mode 100644 index 0000000000..aa73980e5a --- /dev/null +++ b/compiler/typecheck/TcSplice.hi-boot-6 @@ -0,0 +1,15 @@ +module TcSplice where + +tcSpliceExpr :: HsExpr.HsSplice Name.Name + -> TcType.BoxyRhoType + -> TcRnTypes.TcM (HsExpr.HsExpr Var.Id) + +kcSpliceType :: HsExpr.HsSplice Name.Name + -> TcRnTypes.TcM (HsTypes.HsType Name.Name, TcType.TcKind) + +tcBracket :: HsExpr.HsBracket Name.Name + -> TcType.BoxyRhoType + -> TcRnTypes.TcM (HsExpr.LHsExpr Var.Id) + +tcSpliceDecls :: HsExpr.LHsExpr Name.Name + -> TcRnTypes.TcM [HsDecls.LHsDecl RdrName.RdrName] diff --git a/compiler/typecheck/TcSplice.lhs b/compiler/typecheck/TcSplice.lhs new file mode 100644 index 0000000000..beb72f1932 --- /dev/null +++ b/compiler/typecheck/TcSplice.lhs @@ -0,0 +1,694 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[TcSplice]{Template Haskell splices} + +\begin{code} +module TcSplice( tcSpliceExpr, tcSpliceDecls, tcBracket ) where + +#include "HsVersions.h" + +import HscMain ( compileExpr ) +import TcRnDriver ( tcTopSrcDecls ) + -- These imports are the reason that TcSplice + -- is very high up the module hierarchy + +import qualified Language.Haskell.TH as TH +-- THSyntax gives access to internal functions and data types +import qualified Language.Haskell.TH.Syntax as TH + +import HsSyn ( HsBracket(..), HsExpr(..), HsSplice(..), LHsExpr, LHsDecl, + HsType, LHsType ) +import Convert ( convertToHsExpr, convertToHsDecls, convertToHsType, thRdrName ) +import RnExpr ( rnLExpr ) +import RnEnv ( lookupFixityRn, lookupSrcOcc_maybe, lookupImportedName ) 
+import RdrName ( RdrName, lookupLocalRdrEnv, isSrcRdrName ) +import RnTypes ( rnLHsType ) +import TcExpr ( tcMonoExpr ) +import TcHsSyn ( mkHsDictLet, zonkTopLExpr ) +import TcSimplify ( tcSimplifyTop, tcSimplifyBracket ) +import TcUnify ( boxyUnify, unBox ) +import TcType ( TcType, TcKind, BoxyRhoType, liftedTypeKind, mkAppTy, tcSplitSigmaTy ) +import TcEnv ( spliceOK, tcMetaTy, bracketOK ) +import TcMType ( newFlexiTyVarTy, newKindVar, UserTypeCtxt(ExprSigCtxt), zonkTcType ) +import TcHsType ( tcHsSigType, kcHsType ) +import TcIface ( tcImportDecl ) +import TypeRep ( Type(..), PredType(..), TyThing(..) ) -- For reification +import PrelNames ( thFAKE ) +import Name ( Name, NamedThing(..), nameOccName, nameModule, isExternalName, + nameIsLocalOrFrom ) +import NameEnv ( lookupNameEnv ) +import HscTypes ( lookupType, ExternalPackageState(..), emptyModDetails ) +import OccName +import Var ( Id, TyVar, idType ) +import Module ( moduleString ) +import TcRnMonad +import IfaceEnv ( lookupOrig ) +import Class ( Class, classExtraBigSig ) +import TyCon ( TyCon, tyConTyVars, synTyConDefn, + isSynTyCon, isNewTyCon, tyConDataCons, isPrimTyCon, isFunTyCon, + tyConArity, tyConStupidTheta, isUnLiftedTyCon ) +import DataCon ( DataCon, dataConTyCon, dataConOrigArgTys, dataConStrictMarks, + dataConName, dataConFieldLabels, dataConWrapId, dataConIsInfix, + isVanillaDataCon ) +import Id ( idName, globalIdDetails ) +import IdInfo ( GlobalIdDetails(..) ) +import TysWiredIn ( mkListTy ) +import DsMeta ( expQTyConName, typeQTyConName, decTyConName, qTyConName, nameTyConName ) +import ErrUtils ( Message ) +import SrcLoc ( SrcSpan, noLoc, unLoc, getLoc ) +import Outputable +import Unique ( Unique, Uniquable(..), getKey, mkUniqueGrimily ) + +import BasicTypes ( StrictnessMark(..), Fixity(..), FixityDirection(..) ) +import Panic ( showException ) +import FastString ( LitString ) + +import GHC.Base ( unsafeCoerce#, Int#, Int(..) ) -- Should have a better home in the module hierarchy +import Monad ( liftM ) + +#ifdef GHCI +import FastString ( mkFastString ) +#endif +\end{code} + + +%************************************************************************ +%* * +\subsection{Main interface + stubs for the non-GHCI case +%* * +%************************************************************************ + +\begin{code} +tcSpliceDecls :: LHsExpr Name -> TcM [LHsDecl RdrName] +tcSpliceExpr :: HsSplice Name -> BoxyRhoType -> TcM (HsExpr TcId) +kcSpliceType :: HsSplice Name -> TcM (HsType Name, TcKind) + +#ifndef GHCI +tcSpliceExpr n e ty = pprPanic "Cant do tcSpliceExpr without GHCi" (ppr e) +tcSpliceDecls e = pprPanic "Cant do tcSpliceDecls without GHCi" (ppr e) +#else +\end{code} + +%************************************************************************ +%* * +\subsection{Quoting an expression} +%* * +%************************************************************************ + +\begin{code} +tcBracket :: HsBracket Name -> BoxyRhoType -> TcM (LHsExpr Id) +tcBracket brack res_ty + = getStage `thenM` \ level -> + case bracketOK level of { + Nothing -> failWithTc (illegalBracket level) ; + Just next_level -> + + -- Typecheck expr to make sure it is valid, + -- but throw away the results. We'll type check + -- it again when we actually use it. 
+	recordThUse `thenM_`
+	newMutVar []		`thenM` \ pending_splices ->
+	getLIEVar		`thenM` \ lie_var ->
+
+	setStage (Brack next_level pending_splices lie_var) (
+		getLIE (tc_bracket brack)
+	)				`thenM` \ (meta_ty, lie) ->
+	tcSimplifyBracket lie		`thenM_`
+
+	-- Make the expected type have the right shape
+	boxyUnify meta_ty res_ty	`thenM_`
+
+	-- Return the original expression, not the type-decorated one
+	readMutVar pending_splices	`thenM` \ pendings ->
+	returnM (noLoc (HsBracketOut brack pendings))
+	}
+
+tc_bracket :: HsBracket Name -> TcM TcType
+tc_bracket (VarBr v)
+  = tcMetaTy nameTyConName	-- Result type is Var (not Q-monadic)
+
+tc_bracket (ExpBr expr)
+  = newFlexiTyVarTy liftedTypeKind	`thenM` \ any_ty ->
+    tcMonoExpr expr any_ty		`thenM_`
+    tcMetaTy expQTyConName
+	-- Result type is Expr (= Q Exp)
+
+tc_bracket (TypBr typ)
+  = tcHsSigType ExprSigCtxt typ		`thenM_`
+    tcMetaTy typeQTyConName
+	-- Result type is Type (= Q Typ)
+
+tc_bracket (DecBr decls)
+  = do	{ tcTopSrcDecls emptyModDetails decls
+	-- Typecheck the declarations, discarding the result
+	-- We'll get all that stuff later, when we splice it in
+
+	; decl_ty <- tcMetaTy decTyConName
+	; q_ty    <- tcMetaTy qTyConName
+	; return (mkAppTy q_ty (mkListTy decl_ty))
+	-- Result type is Q [Dec]
+	}
+
+tc_bracket (PatBr _)
+  = failWithTc (ptext SLIT("Template Haskell pattern brackets are not supported yet"))
+\end{code}
+
+
+%************************************************************************
+%*									*
+\subsection{Splicing an expression}
+%*									*
+%************************************************************************
+
+\begin{code}
+tcSpliceExpr (HsSplice name expr) res_ty
+  = setSrcSpan (getLoc expr) 	$
+    getStage		`thenM` \ level ->
+    case spliceOK level of {
+	Nothing		-> failWithTc (illegalSplice level) ;
+	Just next_level ->
+
+    case level of {
+	Comp		       -> do { e <- tcTopSplice expr res_ty
+				     ; returnM (unLoc e) } ;
+	Brack _ ps_var lie_var ->
+
+	-- A splice inside brackets
+	-- NB: ignore res_ty, apart from zapping it to a mono-type
+	-- e.g.   [| reverse $(h 4) |]
+	-- Here (h 4) :: Q Exp
+	-- but $(h 4) :: forall a.a 	i.e. anything!
+
+	unBox res_ty				`thenM_`
+	tcMetaTy expQTyConName			`thenM` \ meta_exp_ty ->
+	setStage (Splice next_level) (
+	  setLIEVar lie_var	$
+	  tcMonoExpr expr meta_exp_ty
+	)					`thenM` \ expr' ->
+
+	-- Write the pending splice into the bucket
+	readMutVar ps_var			`thenM` \ ps ->
+	writeMutVar ps_var ((name,expr') : ps)	`thenM_`
+
+	returnM (panic "tcSpliceExpr")	-- The returned expression is ignored
+    }}
+
+-- tcTopSplice used to have this:
+-- Note that we do not decrement the level (to -1) before
+--   typechecking the expression.  For example:
+--	f x = $( ...$(g 3) ... 
) +-- The recursive call to tcMonoExpr will simply expand the +-- inner escape before dealing with the outer one + +tcTopSplice :: LHsExpr Name -> BoxyRhoType -> TcM (LHsExpr Id) +tcTopSplice expr res_ty + = tcMetaTy expQTyConName `thenM` \ meta_exp_ty -> + + -- Typecheck the expression + tcTopSpliceExpr expr meta_exp_ty `thenM` \ zonked_q_expr -> + + -- Run the expression + traceTc (text "About to run" <+> ppr zonked_q_expr) `thenM_` + runMetaE convertToHsExpr zonked_q_expr `thenM` \ expr2 -> + + traceTc (text "Got result" <+> ppr expr2) `thenM_` + + showSplice "expression" + zonked_q_expr (ppr expr2) `thenM_` + + -- Rename it, but bale out if there are errors + -- otherwise the type checker just gives more spurious errors + checkNoErrs (rnLExpr expr2) `thenM` \ (exp3, fvs) -> + + tcMonoExpr exp3 res_ty + + +tcTopSpliceExpr :: LHsExpr Name -> TcType -> TcM (LHsExpr Id) +-- Type check an expression that is the body of a top-level splice +-- (the caller will compile and run it) +tcTopSpliceExpr expr meta_ty + = checkNoErrs $ -- checkNoErrs: must not try to run the thing + -- if the type checker fails! + + setStage topSpliceStage $ do + + + do { recordThUse -- Record that TH is used (for pkg depdendency) + + -- Typecheck the expression + ; (expr', lie) <- getLIE (tcMonoExpr expr meta_ty) + + -- Solve the constraints + ; const_binds <- tcSimplifyTop lie + + -- And zonk it + ; zonkTopLExpr (mkHsDictLet const_binds expr') } +\end{code} + + +%************************************************************************ +%* * + Splicing a type +%* * +%************************************************************************ + +Very like splicing an expression, but we don't yet share code. + +\begin{code} +kcSpliceType (HsSplice name hs_expr) + = setSrcSpan (getLoc hs_expr) $ do + { level <- getStage + ; case spliceOK level of { + Nothing -> failWithTc (illegalSplice level) ; + Just next_level -> do + + { case level of { + Comp -> do { (t,k) <- kcTopSpliceType hs_expr + ; return (unLoc t, k) } ; + Brack _ ps_var lie_var -> do + + { -- A splice inside brackets + ; meta_ty <- tcMetaTy typeQTyConName + ; expr' <- setStage (Splice next_level) $ + setLIEVar lie_var $ + tcMonoExpr hs_expr meta_ty + + -- Write the pending splice into the bucket + ; ps <- readMutVar ps_var + ; writeMutVar ps_var ((name,expr') : ps) + + -- e.g. [| Int -> $(h 4) |] + -- Here (h 4) :: Q Type + -- but $(h 4) :: forall a.a i.e. 
any kind + ; kind <- newKindVar + ; returnM (panic "kcSpliceType", kind) -- The returned type is ignored + }}}}} + +kcTopSpliceType :: LHsExpr Name -> TcM (LHsType Name, TcKind) +kcTopSpliceType expr + = do { meta_ty <- tcMetaTy typeQTyConName + + -- Typecheck the expression + ; zonked_q_expr <- tcTopSpliceExpr expr meta_ty + + -- Run the expression + ; traceTc (text "About to run" <+> ppr zonked_q_expr) + ; hs_ty2 <- runMetaT convertToHsType zonked_q_expr + + ; traceTc (text "Got result" <+> ppr hs_ty2) + + ; showSplice "type" zonked_q_expr (ppr hs_ty2) + + -- Rename it, but bale out if there are errors + -- otherwise the type checker just gives more spurious errors + ; let doc = ptext SLIT("In the spliced type") <+> ppr hs_ty2 + ; hs_ty3 <- checkNoErrs (rnLHsType doc hs_ty2) + + ; kcHsType hs_ty3 } +\end{code} + +%************************************************************************ +%* * +\subsection{Splicing an expression} +%* * +%************************************************************************ + +\begin{code} +-- Always at top level +-- Type sig at top of file: +-- tcSpliceDecls :: LHsExpr Name -> TcM [LHsDecl RdrName] +tcSpliceDecls expr + = do { meta_dec_ty <- tcMetaTy decTyConName + ; meta_q_ty <- tcMetaTy qTyConName + ; let list_q = mkAppTy meta_q_ty (mkListTy meta_dec_ty) + ; zonked_q_expr <- tcTopSpliceExpr expr list_q + + -- Run the expression + ; traceTc (text "About to run" <+> ppr zonked_q_expr) + ; decls <- runMetaD convertToHsDecls zonked_q_expr + + ; traceTc (text "Got result" <+> vcat (map ppr decls)) + ; showSplice "declarations" + zonked_q_expr + (ppr (getLoc expr) $$ (vcat (map ppr decls))) + ; returnM decls } + + where handleErrors :: [Either a Message] -> TcM [a] + handleErrors [] = return [] + handleErrors (Left x:xs) = liftM (x:) (handleErrors xs) + handleErrors (Right m:xs) = do addErrTc m + handleErrors xs +\end{code} + + +%************************************************************************ +%* * +\subsection{Running an expression} +%* * +%************************************************************************ + +\begin{code} +runMetaE :: (SrcSpan -> TH.Exp -> Either Message (LHsExpr RdrName)) + -> LHsExpr Id -- Of type (Q Exp) + -> TcM (LHsExpr RdrName) +runMetaE = runMeta + +runMetaT :: (SrcSpan -> TH.Type -> Either Message (LHsType RdrName)) + -> LHsExpr Id -- Of type (Q Type) + -> TcM (LHsType RdrName) +runMetaT = runMeta + +runMetaD :: (SrcSpan -> [TH.Dec] -> Either Message [LHsDecl RdrName]) + -> LHsExpr Id -- Of type Q [Dec] + -> TcM [LHsDecl RdrName] +runMetaD = runMeta + +runMeta :: (SrcSpan -> th_syn -> Either Message hs_syn) + -> LHsExpr Id -- Of type X + -> TcM hs_syn -- Of type t +runMeta convert expr + = do { hsc_env <- getTopEnv + ; tcg_env <- getGblEnv + ; this_mod <- getModule + ; let type_env = tcg_type_env tcg_env + rdr_env = tcg_rdr_env tcg_env + + -- Compile and link it; might fail if linking fails + ; either_hval <- tryM $ ioToTcRn $ + HscMain.compileExpr + hsc_env this_mod + rdr_env type_env expr + ; case either_hval of { + Left exn -> failWithTc (mk_msg "compile and link" exn) ; + Right hval -> do + + { -- Coerce it to Q t, and run it + -- Running might fail if it throws an exception of any kind (hence tryAllM) + -- including, say, a pattern-match exception in the code we are running + -- + -- We also do the TH -> HS syntax conversion inside the same + -- exception-cacthing thing so that if there are any lurking + -- exceptions in the data structure returned by hval, we'll + -- encounter them inside the tryALlM + 
either_tval <- tryAllM $ do + { th_syn <- TH.runQ (unsafeCoerce# hval) + ; case convert (getLoc expr) th_syn of + Left err -> do { addErrTc err; return Nothing } + Right hs_syn -> return (Just hs_syn) } + + ; case either_tval of + Right (Just v) -> return v + Right Nothing -> failM -- Error already in Tc monad + Left exn -> failWithTc (mk_msg "run" exn) -- Exception + }}} + where + mk_msg s exn = vcat [text "Exception when trying to" <+> text s <+> text "compile-time code:", + nest 2 (text (Panic.showException exn)), + nest 2 (text "Code:" <+> ppr expr)] +\end{code} + +To call runQ in the Tc monad, we need to make TcM an instance of Quasi: + +\begin{code} +instance TH.Quasi (IOEnv (Env TcGblEnv TcLclEnv)) where + qNewName s = do { u <- newUnique + ; let i = getKey u + ; return (TH.mkNameU s i) } + + qReport True msg = addErr (text msg) + qReport False msg = addReport (text msg) + + qCurrentModule = do { m <- getModule; return (moduleString m) } + qReify v = reify v + qRecover = recoverM + + qRunIO io = ioToTcRn io +\end{code} + + +%************************************************************************ +%* * +\subsection{Errors and contexts} +%* * +%************************************************************************ + +\begin{code} +showSplice :: String -> LHsExpr Id -> SDoc -> TcM () +showSplice what before after + = getSrcSpanM `thenM` \ loc -> + traceSplice (vcat [ppr loc <> colon <+> text "Splicing" <+> text what, + nest 2 (sep [nest 2 (ppr before), + text "======>", + nest 2 after])]) + +illegalBracket level + = ptext SLIT("Illegal bracket at level") <+> ppr level + +illegalSplice level + = ptext SLIT("Illegal splice at level") <+> ppr level + +#endif /* GHCI */ +\end{code} + + +%************************************************************************ +%* * + Reification +%* * +%************************************************************************ + + +\begin{code} +reify :: TH.Name -> TcM TH.Info +reify th_name + = do { name <- lookupThName th_name + ; thing <- tcLookupTh name + -- ToDo: this tcLookup could fail, which would give a + -- rather unhelpful error message + ; traceIf (text "reify" <+> text (show th_name) <+> brackets (ppr_ns th_name) <+> ppr name) + ; reifyThing thing + } + where + ppr_ns (TH.Name _ (TH.NameG TH.DataName mod)) = text "data" + ppr_ns (TH.Name _ (TH.NameG TH.TcClsName mod)) = text "tc" + ppr_ns (TH.Name _ (TH.NameG TH.VarName mod)) = text "var" + +lookupThName :: TH.Name -> TcM Name +lookupThName th_name@(TH.Name occ flavour) + = do { let rdr_name = thRdrName guessed_ns occ_str flavour + + -- Repeat much of lookupOccRn, becase we want + -- to report errors in a TH-relevant way + ; rdr_env <- getLocalRdrEnv + ; case lookupLocalRdrEnv rdr_env rdr_name of + Just name -> return name + Nothing | not (isSrcRdrName rdr_name) -- Exact, Orig + -> lookupImportedName rdr_name + | otherwise -- Unqual, Qual + -> do { + mb_name <- lookupSrcOcc_maybe rdr_name + ; case mb_name of + Just name -> return name + Nothing -> failWithTc (notInScope th_name) } + } + where + -- guessed_ns is the name space guessed from looking at the TH name + guessed_ns | isLexCon (mkFastString occ_str) = OccName.dataName + | otherwise = OccName.varName + occ_str = TH.occString occ + +tcLookupTh :: Name -> TcM TcTyThing +-- This is a specialised version of TcEnv.tcLookup; specialised mainly in that +-- it gives a reify-related error message on failure, whereas in the normal +-- tcLookup, failure is a bug. 
+tcLookupTh name + = do { (gbl_env, lcl_env) <- getEnvs + ; case lookupNameEnv (tcl_env lcl_env) name of { + Just thing -> returnM thing; + Nothing -> do + { if nameIsLocalOrFrom (tcg_mod gbl_env) name + then -- It's defined in this module + case lookupNameEnv (tcg_type_env gbl_env) name of + Just thing -> return (AGlobal thing) + Nothing -> failWithTc (notInEnv name) + + else do -- It's imported + { (eps,hpt) <- getEpsAndHpt + ; case lookupType hpt (eps_PTE eps) name of + Just thing -> return (AGlobal thing) + Nothing -> do { thing <- tcImportDecl name + ; return (AGlobal thing) } + -- Imported names should always be findable; + -- if not, we fail hard in tcImportDecl + }}}} + +notInScope :: TH.Name -> SDoc +notInScope th_name = quotes (text (TH.pprint th_name)) <+> + ptext SLIT("is not in scope at a reify") + -- Ugh! Rather an indirect way to display the name + +notInEnv :: Name -> SDoc +notInEnv name = quotes (ppr name) <+> + ptext SLIT("is not in the type environment at a reify") + +------------------------------ +reifyThing :: TcTyThing -> TcM TH.Info +-- The only reason this is monadic is for error reporting, +-- which in turn is mainly for the case when TH can't express +-- some random GHC extension + +reifyThing (AGlobal (AnId id)) + = do { ty <- reifyType (idType id) + ; fix <- reifyFixity (idName id) + ; let v = reifyName id + ; case globalIdDetails id of + ClassOpId cls -> return (TH.ClassOpI v ty (reifyName cls) fix) + other -> return (TH.VarI v ty Nothing fix) + } + +reifyThing (AGlobal (ATyCon tc)) = reifyTyCon tc +reifyThing (AGlobal (AClass cls)) = reifyClass cls +reifyThing (AGlobal (ADataCon dc)) + = do { let name = dataConName dc + ; ty <- reifyType (idType (dataConWrapId dc)) + ; fix <- reifyFixity name + ; return (TH.DataConI (reifyName name) ty (reifyName (dataConTyCon dc)) fix) } + +reifyThing (ATcId id _ _) + = do { ty1 <- zonkTcType (idType id) -- Make use of all the info we have, even + -- though it may be incomplete + ; ty2 <- reifyType ty1 + ; fix <- reifyFixity (idName id) + ; return (TH.VarI (reifyName id) ty2 Nothing fix) } + +reifyThing (ATyVar tv ty) + = do { ty1 <- zonkTcType ty + ; ty2 <- reifyType ty1 + ; return (TH.TyVarI (reifyName tv) ty2) } + +------------------------------ +reifyTyCon :: TyCon -> TcM TH.Info +reifyTyCon tc + | isFunTyCon tc = return (TH.PrimTyConI (reifyName tc) 2 False) + | isPrimTyCon tc = return (TH.PrimTyConI (reifyName tc) (tyConArity tc) (isUnLiftedTyCon tc)) + | isSynTyCon tc + = do { let (tvs, rhs) = synTyConDefn tc + ; rhs' <- reifyType rhs + ; return (TH.TyConI $ TH.TySynD (reifyName tc) (reifyTyVars tvs) rhs') } + +reifyTyCon tc + = do { cxt <- reifyCxt (tyConStupidTheta tc) + ; cons <- mapM reifyDataCon (tyConDataCons tc) + ; let name = reifyName tc + tvs = reifyTyVars (tyConTyVars tc) + deriv = [] -- Don't know about deriving + decl | isNewTyCon tc = TH.NewtypeD cxt name tvs (head cons) deriv + | otherwise = TH.DataD cxt name tvs cons deriv + ; return (TH.TyConI decl) } + +reifyDataCon :: DataCon -> TcM TH.Con +reifyDataCon dc + | isVanillaDataCon dc + = do { arg_tys <- reifyTypes (dataConOrigArgTys dc) + ; let stricts = map reifyStrict (dataConStrictMarks dc) + fields = dataConFieldLabels dc + name = reifyName dc + [a1,a2] = arg_tys + [s1,s2] = stricts + ; ASSERT( length arg_tys == length stricts ) + if not (null fields) then + return (TH.RecC name (zip3 (map reifyName fields) stricts arg_tys)) + else + if dataConIsInfix dc then + ASSERT( length arg_tys == 2 ) + return (TH.InfixC (s1,a1) name (s2,a2)) + else + return 
(TH.NormalC name (stricts `zip` arg_tys)) } + | otherwise + = failWithTc (ptext SLIT("Can't reify a non-Haskell-98 data constructor:") + <+> quotes (ppr dc)) + +------------------------------ +reifyClass :: Class -> TcM TH.Info +reifyClass cls + = do { cxt <- reifyCxt theta + ; ops <- mapM reify_op op_stuff + ; return (TH.ClassI $ TH.ClassD cxt (reifyName cls) (reifyTyVars tvs) fds' ops) } + where + (tvs, fds, theta, _, op_stuff) = classExtraBigSig cls + fds' = map reifyFunDep fds + reify_op (op, _) = do { ty <- reifyType (idType op) + ; return (TH.SigD (reifyName op) ty) } + +------------------------------ +reifyType :: TypeRep.Type -> TcM TH.Type +reifyType (TyVarTy tv) = return (TH.VarT (reifyName tv)) +reifyType (TyConApp tc tys) = reify_tc_app (reifyName tc) tys +reifyType (NoteTy _ ty) = reifyType ty +reifyType (AppTy t1 t2) = do { [r1,r2] <- reifyTypes [t1,t2] ; return (r1 `TH.AppT` r2) } +reifyType (FunTy t1 t2) = do { [r1,r2] <- reifyTypes [t1,t2] ; return (TH.ArrowT `TH.AppT` r1 `TH.AppT` r2) } +reifyType ty@(ForAllTy _ _) = do { cxt' <- reifyCxt cxt; + ; tau' <- reifyType tau + ; return (TH.ForallT (reifyTyVars tvs) cxt' tau') } + where + (tvs, cxt, tau) = tcSplitSigmaTy ty +reifyTypes = mapM reifyType +reifyCxt = mapM reifyPred + +reifyFunDep :: ([TyVar], [TyVar]) -> TH.FunDep +reifyFunDep (xs, ys) = TH.FunDep (map reifyName xs) (map reifyName ys) + +reifyTyVars :: [TyVar] -> [TH.Name] +reifyTyVars = map reifyName + +reify_tc_app :: TH.Name -> [TypeRep.Type] -> TcM TH.Type +reify_tc_app tc tys = do { tys' <- reifyTypes tys + ; return (foldl TH.AppT (TH.ConT tc) tys') } + +reifyPred :: TypeRep.PredType -> TcM TH.Type +reifyPred (ClassP cls tys) = reify_tc_app (reifyName cls) tys +reifyPred p@(IParam _ _) = noTH SLIT("implicit parameters") (ppr p) + + +------------------------------ +reifyName :: NamedThing n => n -> TH.Name +reifyName thing + | isExternalName name = mk_varg mod occ_str + | otherwise = TH.mkNameU occ_str (getKey (getUnique name)) + -- Many of the things we reify have local bindings, and + -- NameL's aren't supposed to appear in binding positions, so + -- we use NameU. When/if we start to reify nested things, that + -- have free variables, we may need to generate NameL's for them. 
+ where + name = getName thing + mod = moduleString (nameModule name) + occ_str = occNameString occ + occ = nameOccName name + mk_varg | OccName.isDataOcc occ = TH.mkNameG_d + | OccName.isVarOcc occ = TH.mkNameG_v + | OccName.isTcOcc occ = TH.mkNameG_tc + | otherwise = pprPanic "reifyName" (ppr name) + +------------------------------ +reifyFixity :: Name -> TcM TH.Fixity +reifyFixity name + = do { fix <- lookupFixityRn name + ; return (conv_fix fix) } + where + conv_fix (BasicTypes.Fixity i d) = TH.Fixity i (conv_dir d) + conv_dir BasicTypes.InfixR = TH.InfixR + conv_dir BasicTypes.InfixL = TH.InfixL + conv_dir BasicTypes.InfixN = TH.InfixN + +reifyStrict :: BasicTypes.StrictnessMark -> TH.Strict +reifyStrict MarkedStrict = TH.IsStrict +reifyStrict MarkedUnboxed = TH.IsStrict +reifyStrict NotMarkedStrict = TH.NotStrict + +------------------------------ +noTH :: LitString -> SDoc -> TcM a +noTH s d = failWithTc (hsep [ptext SLIT("Can't represent") <+> ptext s <+> + ptext SLIT("in Template Haskell:"), + nest 2 d]) +\end{code} diff --git a/compiler/typecheck/TcSplice.lhs-boot b/compiler/typecheck/TcSplice.lhs-boot new file mode 100644 index 0000000000..d161770672 --- /dev/null +++ b/compiler/typecheck/TcSplice.lhs-boot @@ -0,0 +1,21 @@ +\begin{code} +module TcSplice where +import HsSyn ( HsSplice, HsBracket, HsExpr, LHsExpr, HsType, LHsDecl ) +import Var ( Id ) +import Name ( Name ) +import RdrName ( RdrName ) +import TcRnTypes( TcM ) +import TcType ( TcKind, BoxyRhoType ) + +tcSpliceExpr :: HsSplice Name + -> BoxyRhoType + -> TcM (HsExpr Id) + +kcSpliceType :: HsSplice Name -> TcM (HsType Name, TcKind) + +tcBracket :: HsBracket Name + -> BoxyRhoType + -> TcM (LHsExpr Id) + +tcSpliceDecls :: LHsExpr Name -> TcM [LHsDecl RdrName] +\end{code} diff --git a/compiler/typecheck/TcTyClsDecls.lhs b/compiler/typecheck/TcTyClsDecls.lhs new file mode 100644 index 0000000000..9e0b6cc6ed --- /dev/null +++ b/compiler/typecheck/TcTyClsDecls.lhs @@ -0,0 +1,829 @@ +% +% (c) The AQUA Project, Glasgow University, 1996-1998 +% +\section[TcTyClsDecls]{Typecheck type and class declarations} + +\begin{code} +module TcTyClsDecls ( + tcTyAndClassDecls + ) where + +#include "HsVersions.h" + +import HsSyn ( TyClDecl(..), HsConDetails(..), HsTyVarBndr(..), + ConDecl(..), Sig(..), , NewOrData(..), ResType(..), + tyClDeclTyVars, isSynDecl, hsConArgs, + LTyClDecl, tcdName, hsTyVarName, LHsTyVarBndr + ) +import HsTypes ( HsBang(..), getBangStrictness ) +import BasicTypes ( RecFlag(..), StrictnessMark(..) ) +import HscTypes ( implicitTyThings, ModDetails ) +import BuildTyCl ( buildClass, buildAlgTyCon, buildSynTyCon, buildDataCon, + mkDataTyConRhs, mkNewTyConRhs ) +import TcRnMonad +import TcEnv ( TyThing(..), + tcLookupLocated, tcLookupLocatedGlobal, + tcExtendGlobalEnv, tcExtendKindEnv, tcExtendKindEnvTvs, + tcExtendRecEnv, tcLookupTyVar ) +import TcTyDecls ( calcTyConArgVrcs, calcRecFlags, calcClassCycles, calcSynCycles ) +import TcClassDcl ( tcClassSigs, tcAddDeclCtxt ) +import TcHsType ( kcHsTyVars, kcHsLiftedSigType, kcHsType, + kcHsContext, tcTyVarBndrs, tcHsKindedType, tcHsKindedContext, + kcHsSigType, tcHsBangType, tcLHsConResTy, tcDataKindSig ) +import TcMType ( newKindVar, checkValidTheta, checkValidType, + -- checkFreeness, + UserTypeCtxt(..), SourceTyCtxt(..) 
) +import TcType ( TcKind, TcType, tyVarsOfType, mkPhiTy, + mkArrowKind, liftedTypeKind, mkTyVarTys, + tcSplitSigmaTy, tcEqTypes, tcGetTyVar_maybe ) +import Type ( splitTyConApp_maybe, + -- pprParendType, pprThetaArrow + ) +import Kind ( mkArrowKinds, splitKindFunTys ) +import Generics ( validGenericMethodType, canDoGenerics ) +import Class ( Class, className, classTyCon, DefMeth(..), classBigSig, classTyVars ) +import TyCon ( TyCon, ArgVrcs, AlgTyConRhs( AbstractTyCon ), + tyConDataCons, mkForeignTyCon, isProductTyCon, isRecursiveTyCon, + tyConStupidTheta, synTyConRhs, isSynTyCon, tyConName ) +import DataCon ( DataCon, dataConWrapId, dataConName, + dataConFieldLabels, dataConTyCon, + dataConTyVars, dataConFieldType, dataConResTys ) +import Var ( TyVar, idType, idName ) +import VarSet ( elemVarSet, mkVarSet ) +import Name ( Name, getSrcLoc ) +import Outputable +import Maybe ( isJust ) +import Maybes ( expectJust ) +import Unify ( tcMatchTys, tcMatchTyX ) +import Util ( zipLazy, isSingleton, notNull, sortLe ) +import List ( partition ) +import SrcLoc ( Located(..), unLoc, getLoc, srcLocSpan ) +import ListSetOps ( equivClasses ) +import List ( delete ) +import Digraph ( SCC(..) ) +import DynFlags ( DynFlag( Opt_GlasgowExts, Opt_Generics, + Opt_UnboxStrictFields ) ) +\end{code} + + +%************************************************************************ +%* * +\subsection{Type checking for type and class declarations} +%* * +%************************************************************************ + +Dealing with a group +~~~~~~~~~~~~~~~~~~~~ +Consider a mutually-recursive group, binding +a type constructor T and a class C. + +Step 1: getInitialKind + Construct a KindEnv by binding T and C to a kind variable + +Step 2: kcTyClDecl + In that environment, do a kind check + +Step 3: Zonk the kinds + +Step 4: buildTyConOrClass + Construct an environment binding T to a TyCon and C to a Class. + a) Their kinds comes from zonking the relevant kind variable + b) Their arity (for synonyms) comes direct from the decl + c) The funcional dependencies come from the decl + d) The rest comes a knot-tied binding of T and C, returned from Step 4 + e) The variances of the tycons in the group is calculated from + the knot-tied stuff + +Step 5: tcTyClDecl1 + In this environment, walk over the decls, constructing the TyCons and Classes. + This uses in a strict way items (a)-(c) above, which is why they must + be constructed in Step 4. Feed the results back to Step 4. + For this step, pass the is-recursive flag as the wimp-out flag + to tcTyClDecl1. + + +Step 6: Extend environment + We extend the type environment with bindings not only for the TyCons and Classes, + but also for their "implicit Ids" like data constructors and class selectors + +Step 7: checkValidTyCl + For a recursive group only, check all the decls again, just + to check all the side conditions on validity. We could not + do this before because we were in a mutually recursive knot. + + +The knot-tying parameters: @rec_details_list@ is an alist mapping @Name@s to +@TyThing@s. @rec_vrcs@ is a finite map from @Name@s to @ArgVrcs@s. 
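+
+As a small, purely illustrative example of the kind of group the steps above
+are designed for, consider a class and a data type that mention each other:
+
+	class C a where
+	  op :: a -> T a
+
+	data (C a) => T a = MkT a
+
+Kind-checking op's signature needs a kind for T, and kind-checking T's context
+needs a kind for C, so Steps 1-3 must treat the two declarations together.
+Likewise, building the Class for C needs the TyCon for T (it occurs in op's
+type), while building the TyCon for T needs the Class for C (it occurs in T's
+stupid theta); that mutual dependency is what the knot tied in Steps 4 and 5
+resolves.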
+ +\begin{code} +tcTyAndClassDecls :: ModDetails -> [LTyClDecl Name] + -> TcM TcGblEnv -- Input env extended by types and classes + -- and their implicit Ids,DataCons +tcTyAndClassDecls boot_details decls + = do { -- First check for cyclic type synonysm or classes + -- See notes with checkCycleErrs + checkCycleErrs decls + ; mod <- getModule + ; traceTc (text "tcTyAndCl" <+> ppr mod) + ; (syn_tycons, alg_tyclss) <- fixM (\ ~(rec_syn_tycons, rec_alg_tyclss) -> + do { let { -- Calculate variances and rec-flag + ; (syn_decls, alg_decls) = partition (isSynDecl . unLoc) decls } + + -- Extend the global env with the knot-tied results + -- for data types and classes + -- + -- We must populate the environment with the loop-tied T's right + -- away, because the kind checker may "fault in" some type + -- constructors that recursively mention T + ; let { gbl_things = mkGlobalThings alg_decls rec_alg_tyclss } + ; tcExtendRecEnv gbl_things $ do + + -- Kind-check the declarations + { (kc_syn_decls, kc_alg_decls) <- kcTyClDecls syn_decls alg_decls + + ; let { calc_vrcs = calcTyConArgVrcs (rec_syn_tycons ++ rec_alg_tyclss) + ; calc_rec = calcRecFlags boot_details rec_alg_tyclss + ; tc_decl = addLocM (tcTyClDecl calc_vrcs calc_rec) } + -- Type-check the type synonyms, and extend the envt + ; syn_tycons <- tcSynDecls calc_vrcs kc_syn_decls + ; tcExtendGlobalEnv syn_tycons $ do + + -- Type-check the data types and classes + { alg_tyclss <- mappM tc_decl kc_alg_decls + ; return (syn_tycons, alg_tyclss) + }}}) + -- Finished with knot-tying now + -- Extend the environment with the finished things + ; tcExtendGlobalEnv (syn_tycons ++ alg_tyclss) $ do + + -- Perform the validity check + { traceTc (text "ready for validity check") + ; mappM_ (addLocM checkValidTyCl) decls + ; traceTc (text "done") + + -- Add the implicit things; + -- we want them in the environment because + -- they may be mentioned in interface files + ; let { implicit_things = concatMap implicitTyThings alg_tyclss } + ; traceTc ((text "Adding" <+> ppr alg_tyclss) $$ (text "and" <+> ppr implicit_things)) + ; tcExtendGlobalEnv implicit_things getGblEnv + }} + +mkGlobalThings :: [LTyClDecl Name] -- The decls + -> [TyThing] -- Knot-tied, in 1-1 correspondence with the decls + -> [(Name,TyThing)] +-- Driven by the Decls, and treating the TyThings lazily +-- make a TypeEnv for the new things +mkGlobalThings decls things + = map mk_thing (decls `zipLazy` things) + where + mk_thing (L _ (ClassDecl {tcdLName = L _ name}), ~(AClass cl)) + = (name, AClass cl) + mk_thing (L _ decl, ~(ATyCon tc)) + = (tcdName decl, ATyCon tc) +\end{code} + + +%************************************************************************ +%* * + Kind checking +%* * +%************************************************************************ + +We need to kind check all types in the mutually recursive group +before we know the kind of the type variables. For example: + +class C a where + op :: D b => a -> b -> b + +class D c where + bop :: (Monad c) => ... + +Here, the kind of the locally-polymorphic type variable "b" +depends on *all the uses of class D*. For example, the use of +Monad c in bop's type signature means that D must have kind Type->Type. + +However type synonyms work differently. 
They can have kinds which don't +just involve (->) and *: + type R = Int# -- Kind # + type S a = Array# a -- Kind * -> # + type T a b = (# a,b #) -- Kind * -> * -> (# a,b #) +So we must infer their kinds from their right-hand sides *first* and then +use them, whereas for the mutually recursive data types D we bring into +scope kind bindings D -> k, where k is a kind variable, and do inference. + +\begin{code} +kcTyClDecls syn_decls alg_decls + = do { -- First extend the kind env with each data + -- type and class, mapping them to a type variable + alg_kinds <- mappM getInitialKind alg_decls + ; tcExtendKindEnv alg_kinds $ do + + -- Now kind-check the type synonyms, in dependency order + -- We do these differently to data type and classes, + -- because a type synonym can be an unboxed type + -- type Foo = Int# + -- and a kind variable can't unify with UnboxedTypeKind + -- So we infer their kinds in dependency order + { (kc_syn_decls, syn_kinds) <- kcSynDecls (calcSynCycles syn_decls) + ; tcExtendKindEnv syn_kinds $ do + + -- Now kind-check the data type and class declarations, + -- returning kind-annotated decls + { kc_alg_decls <- mappM (wrapLocM kcTyClDecl) alg_decls + + ; return (kc_syn_decls, kc_alg_decls) }}} + +------------------------------------------------------------------------ +getInitialKind :: LTyClDecl Name -> TcM (Name, TcKind) +-- Only for data type and class declarations +-- Get as much info as possible from the data or class decl, +-- so as to maximise usefulness of error messages +getInitialKind (L _ decl) + = do { arg_kinds <- mapM (mk_arg_kind . unLoc) (tyClDeclTyVars decl) + ; res_kind <- mk_res_kind decl + ; return (tcdName decl, mkArrowKinds arg_kinds res_kind) } + where + mk_arg_kind (UserTyVar _) = newKindVar + mk_arg_kind (KindedTyVar _ kind) = return kind + + mk_res_kind (TyData { tcdKindSig = Just kind }) = return kind + -- On GADT-style declarations we allow a kind signature + -- data T :: *->* where { ... } + mk_res_kind other = return liftedTypeKind + + +---------------- +kcSynDecls :: [SCC (LTyClDecl Name)] + -> TcM ([LTyClDecl Name], -- Kind-annotated decls + [(Name,TcKind)]) -- Kind bindings +kcSynDecls [] + = return ([], []) +kcSynDecls (group : groups) + = do { (decl, nk) <- kcSynDecl group + ; (decls, nks) <- tcExtendKindEnv [nk] (kcSynDecls groups) + ; return (decl:decls, nk:nks) } + +---------------- +kcSynDecl :: SCC (LTyClDecl Name) + -> TcM (LTyClDecl Name, -- Kind-annotated decls + (Name,TcKind)) -- Kind bindings +kcSynDecl (AcyclicSCC ldecl@(L loc decl)) + = tcAddDeclCtxt decl $ + kcHsTyVars (tcdTyVars decl) (\ k_tvs -> + do { traceTc (text "kcd1" <+> ppr (unLoc (tcdLName decl)) <+> brackets (ppr (tcdTyVars decl)) + <+> brackets (ppr k_tvs)) + ; (k_rhs, rhs_kind) <- kcHsType (tcdSynRhs decl) + ; traceTc (text "kcd2" <+> ppr (unLoc (tcdLName decl))) + ; let tc_kind = foldr (mkArrowKind . 
kindedTyVarKind) rhs_kind k_tvs + ; return (L loc (decl { tcdTyVars = k_tvs, tcdSynRhs = k_rhs }), + (unLoc (tcdLName decl), tc_kind)) }) + +kcSynDecl (CyclicSCC decls) + = do { recSynErr decls; failM } -- Fail here to avoid error cascade + -- of out-of-scope tycons + +kindedTyVarKind (L _ (KindedTyVar _ k)) = k + +------------------------------------------------------------------------ +kcTyClDecl :: TyClDecl Name -> TcM (TyClDecl Name) + -- Not used for type synonyms (see kcSynDecl) + +kcTyClDecl decl@(TyData {tcdND = new_or_data, tcdCtxt = ctxt, tcdCons = cons}) + = kcTyClDeclBody decl $ \ tvs' -> + do { ctxt' <- kcHsContext ctxt + ; cons' <- mappM (wrapLocM kc_con_decl) cons + ; return (decl {tcdTyVars = tvs', tcdCtxt = ctxt', tcdCons = cons'}) } + where + kc_con_decl (ConDecl name expl ex_tvs ex_ctxt details res) = do + kcHsTyVars ex_tvs $ \ex_tvs' -> do + ex_ctxt' <- kcHsContext ex_ctxt + details' <- kc_con_details details + res' <- case res of + ResTyH98 -> return ResTyH98 + ResTyGADT ty -> return . ResTyGADT =<< kcHsSigType ty + return (ConDecl name expl ex_tvs' ex_ctxt' details' res') + + kc_con_details (PrefixCon btys) + = do { btys' <- mappM kc_larg_ty btys ; return (PrefixCon btys') } + kc_con_details (InfixCon bty1 bty2) + = do { bty1' <- kc_larg_ty bty1; bty2' <- kc_larg_ty bty2; return (InfixCon bty1' bty2') } + kc_con_details (RecCon fields) + = do { fields' <- mappM kc_field fields; return (RecCon fields') } + + kc_field (fld, bty) = do { bty' <- kc_larg_ty bty ; return (fld, bty') } + + kc_larg_ty bty = case new_or_data of + DataType -> kcHsSigType bty + NewType -> kcHsLiftedSigType bty + -- Can't allow an unlifted type for newtypes, because we're effectively + -- going to remove the constructor while coercing it to a lifted type. + -- And newtypes can't be bang'd + +kcTyClDecl decl@(ClassDecl {tcdCtxt = ctxt, tcdSigs = sigs}) + = kcTyClDeclBody decl $ \ tvs' -> + do { is_boot <- tcIsHsBoot + ; checkTc (not is_boot) badBootClassDeclErr + ; ctxt' <- kcHsContext ctxt + ; sigs' <- mappM (wrapLocM kc_sig) sigs + ; return (decl {tcdTyVars = tvs', tcdCtxt = ctxt', tcdSigs = sigs'}) } + where + kc_sig (TypeSig nm op_ty) = do { op_ty' <- kcHsLiftedSigType op_ty + ; return (TypeSig nm op_ty') } + kc_sig other_sig = return other_sig + +kcTyClDecl decl@(ForeignType {}) + = return decl + +kcTyClDeclBody :: TyClDecl Name + -> ([LHsTyVarBndr Name] -> TcM a) + -> TcM a +-- getInitialKind has made a suitably-shaped kind for the type or class +-- Unpack it, and attribute those kinds to the type variables +-- Extend the env with bindings for the tyvars, taken from +-- the kind of the tycon/class. 
Give it to the thing inside, and + -- check the result kind matches +kcTyClDeclBody decl thing_inside + = tcAddDeclCtxt decl $ + do { tc_ty_thing <- tcLookupLocated (tcdLName decl) + ; let tc_kind = case tc_ty_thing of { AThing k -> k } + (kinds, _) = splitKindFunTys tc_kind + hs_tvs = tcdTyVars decl + kinded_tvs = ASSERT( length kinds >= length hs_tvs ) + [ L loc (KindedTyVar (hsTyVarName tv) k) + | (L loc tv, k) <- zip hs_tvs kinds] + ; tcExtendKindEnvTvs kinded_tvs (thing_inside kinded_tvs) } +\end{code} + + +%************************************************************************ +%* * +\subsection{Type checking} +%* * +%************************************************************************ + +\begin{code} +tcSynDecls :: (Name -> ArgVrcs) -> [LTyClDecl Name] -> TcM [TyThing] +tcSynDecls calc_vrcs [] = return [] +tcSynDecls calc_vrcs (decl : decls) + = do { syn_tc <- addLocM (tcSynDecl calc_vrcs) decl + ; syn_tcs <- tcExtendGlobalEnv [syn_tc] (tcSynDecls calc_vrcs decls) + ; return (syn_tc : syn_tcs) } + +tcSynDecl calc_vrcs + (TySynonym {tcdLName = L _ tc_name, tcdTyVars = tvs, tcdSynRhs = rhs_ty}) + = tcTyVarBndrs tvs $ \ tvs' -> do + { traceTc (text "tcd1" <+> ppr tc_name) + ; rhs_ty' <- tcHsKindedType rhs_ty + ; return (ATyCon (buildSynTyCon tc_name tvs' rhs_ty' (calc_vrcs tc_name))) } + +-------------------- +tcTyClDecl :: (Name -> ArgVrcs) -> (Name -> RecFlag) + -> TyClDecl Name -> TcM TyThing + +tcTyClDecl calc_vrcs calc_isrec decl + = tcAddDeclCtxt decl (tcTyClDecl1 calc_vrcs calc_isrec decl) + +tcTyClDecl1 calc_vrcs calc_isrec + (TyData {tcdND = new_or_data, tcdCtxt = ctxt, tcdTyVars = tvs, + tcdLName = L _ tc_name, tcdKindSig = mb_ksig, tcdCons = cons}) + = tcTyVarBndrs tvs $ \ tvs' -> do + { extra_tvs <- tcDataKindSig mb_ksig + ; let final_tvs = tvs' ++ extra_tvs + ; stupid_theta <- tcHsKindedContext ctxt + ; want_generic <- doptM Opt_Generics + ; unbox_strict <- doptM Opt_UnboxStrictFields + ; gla_exts <- doptM Opt_GlasgowExts + ; is_boot <- tcIsHsBoot -- Are we compiling an hs-boot file? 
+ + -- Check that we don't use GADT syntax in H98 world + ; checkTc (gla_exts || h98_syntax) (badGadtDecl tc_name) + + -- Check that there's at least one condecl, + -- or else we're reading an interface file, or -fglasgow-exts + ; checkTc (not (null cons) || gla_exts || is_boot) + (emptyConDeclsErr tc_name) + + -- Check that a newtype has exactly one constructor + ; checkTc (new_or_data == DataType || isSingleton cons) + (newtypeConError tc_name (length cons)) + + ; tycon <- fixM (\ tycon -> do + { data_cons <- mappM (addLocM (tcConDecl unbox_strict new_or_data + tycon final_tvs)) + cons + ; let tc_rhs + | null cons && is_boot -- In a hs-boot file, empty cons means + = AbstractTyCon -- "don't know"; hence Abstract + | otherwise + = case new_or_data of + DataType -> mkDataTyConRhs data_cons + NewType -> ASSERT( isSingleton data_cons ) + mkNewTyConRhs tycon (head data_cons) + ; buildAlgTyCon tc_name final_tvs stupid_theta tc_rhs arg_vrcs is_rec + (want_generic && canDoGenerics data_cons) + }) + ; return (ATyCon tycon) + } + where + arg_vrcs = calc_vrcs tc_name + is_rec = calc_isrec tc_name + h98_syntax = case cons of -- All constructors have same shape + L _ (ConDecl { con_res = ResTyGADT _ }) : _ -> False + other -> True + +tcTyClDecl1 calc_vrcs calc_isrec + (ClassDecl {tcdLName = L _ class_name, tcdTyVars = tvs, + tcdCtxt = ctxt, tcdMeths = meths, + tcdFDs = fundeps, tcdSigs = sigs} ) + = tcTyVarBndrs tvs $ \ tvs' -> do + { ctxt' <- tcHsKindedContext ctxt + ; fds' <- mappM (addLocM tc_fundep) fundeps + ; sig_stuff <- tcClassSigs class_name sigs meths + ; clas <- fixM (\ clas -> + let -- This little knot is just so we can get + -- hold of the name of the class TyCon, which we + -- need to look up its recursiveness and variance + tycon_name = tyConName (classTyCon clas) + tc_isrec = calc_isrec tycon_name + tc_vrcs = calc_vrcs tycon_name + in + buildClass class_name tvs' ctxt' fds' + sig_stuff tc_isrec tc_vrcs) + ; return (AClass clas) } + where + tc_fundep (tvs1, tvs2) = do { tvs1' <- mappM tcLookupTyVar tvs1 ; + ; tvs2' <- mappM tcLookupTyVar tvs2 ; + ; return (tvs1', tvs2') } + + +tcTyClDecl1 calc_vrcs calc_isrec + (ForeignType {tcdLName = L _ tc_name, tcdExtName = tc_ext_name}) + = returnM (ATyCon (mkForeignTyCon tc_name tc_ext_name liftedTypeKind 0 [])) + +----------------------------------- +tcConDecl :: Bool -- True <=> -funbox-strict_fields + -> NewOrData -> TyCon -> [TyVar] + -> ConDecl Name -> TcM DataCon + +tcConDecl unbox_strict NewType tycon tc_tvs -- Newtypes + (ConDecl name _ ex_tvs ex_ctxt details ResTyH98) + = do { let tc_datacon field_lbls arg_ty + = do { arg_ty' <- tcHsKindedType arg_ty -- No bang on newtype + ; buildDataCon (unLoc name) False {- Prefix -} + True {- Vanilla -} [NotMarkedStrict] + (map unLoc field_lbls) + tc_tvs [] [arg_ty'] + tycon (mkTyVarTys tc_tvs) } + + -- Check that a newtype has no existential stuff + ; checkTc (null ex_tvs && null (unLoc ex_ctxt)) (newtypeExError name) + + ; case details of + PrefixCon [arg_ty] -> tc_datacon [] arg_ty + RecCon [(field_lbl, arg_ty)] -> tc_datacon [field_lbl] arg_ty + other -> failWithTc (newtypeFieldErr name (length (hsConArgs details))) + -- Check that the constructor has exactly one field + } + +tcConDecl unbox_strict DataType tycon tc_tvs -- Data types + (ConDecl name _ tvs ctxt details res_ty) + = tcTyVarBndrs tvs $ \ tvs' -> do + { ctxt' <- tcHsKindedContext ctxt + ; (data_tc, res_ty_args) <- tcResultType tycon tc_tvs res_ty + ; let + con_tvs = case res_ty of + ResTyH98 -> tc_tvs ++ tvs' + ResTyGADT _ -> 
tryVanilla tvs' res_ty_args + + -- Vanilla iff result type matches the quantified vars exactly, + -- and there is no existential context + -- Must check the context too because of implicit params; e.g. + -- data T = (?x::Int) => MkT Int + is_vanilla = res_ty_args `tcEqTypes` mkTyVarTys con_tvs + && null (unLoc ctxt) + + tc_datacon is_infix field_lbls btys + = do { let bangs = map getBangStrictness btys + ; arg_tys <- mappM tcHsBangType btys + ; buildDataCon (unLoc name) is_infix is_vanilla + (argStrictness unbox_strict tycon bangs arg_tys) + (map unLoc field_lbls) + con_tvs ctxt' arg_tys + data_tc res_ty_args } + -- NB: we put data_tc, the type constructor gotten from the constructor + -- type signature into the data constructor; that way + -- checkValidDataCon can complain if it's wrong. + + ; case details of + PrefixCon btys -> tc_datacon False [] btys + InfixCon bty1 bty2 -> tc_datacon True [] [bty1,bty2] + RecCon fields -> tc_datacon False field_names btys + where + (field_names, btys) = unzip fields + + } + +tcResultType :: TyCon -> [TyVar] -> ResType Name -> TcM (TyCon, [TcType]) +tcResultType tycon tvs ResTyH98 = return (tycon, mkTyVarTys tvs) +tcResultType _ _ (ResTyGADT res_ty) = tcLHsConResTy res_ty + +tryVanilla :: [TyVar] -> [TcType] -> [TyVar] +-- (tryVanilla tvs tys) returns a permutation of tvs. +-- It tries to re-order the tvs so that it exactly +-- matches the [Type], if that is possible +tryVanilla tvs (ty:tys) | Just tv <- tcGetTyVar_maybe ty -- The type is a tyvar + , tv `elem` tvs -- That tyvar is in the list + = tv : tryVanilla (delete tv tvs) tys +tryVanilla tvs tys = tvs -- Fall through case + + +------------------- +argStrictness :: Bool -- True <=> -funbox-strict_fields + -> TyCon -> [HsBang] + -> [TcType] -> [StrictnessMark] +argStrictness unbox_strict tycon bangs arg_tys + = ASSERT( length bangs == length arg_tys ) + zipWith (chooseBoxingStrategy unbox_strict tycon) arg_tys bangs + +-- We attempt to unbox/unpack a strict field when either: +-- (i) The field is marked '!!', or +-- (ii) The field is marked '!', and the -funbox-strict-fields flag is on. + +chooseBoxingStrategy :: Bool -> TyCon -> TcType -> HsBang -> StrictnessMark +chooseBoxingStrategy unbox_strict_fields tycon arg_ty bang + = case bang of + HsNoBang -> NotMarkedStrict + HsStrict | unbox_strict_fields && can_unbox -> MarkedUnboxed + HsUnbox | can_unbox -> MarkedUnboxed + other -> MarkedStrict + where + can_unbox = case splitTyConApp_maybe arg_ty of + Nothing -> False + Just (arg_tycon, _) -> not (isRecursiveTyCon tycon) && + isProductTyCon arg_tycon +\end{code} + +%************************************************************************ +%* * +\subsection{Dependency analysis} +%* * +%************************************************************************ + +Validity checking is done once the mutually-recursive knot has been +tied, so we can look at things freely. 
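Cyclic type synonym declarations and cyclic superclass contexts, on the
other hand, are simply rejected (checkCycleErrs just below; the
corresponding recSynErr and recClsErr messages appear further down).  As a
small illustrative sketch, not part of the original source, these are the
sort of declarations that get reported:

	type Rows = [Row]
	type Row  = Rows
		-- reported as "Cycle in type synonym declarations"

	class D a => C a
	class C a => D a
		-- reported as "Cycle in class declarations (via superclasses)"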
+ +\begin{code} +checkCycleErrs :: [LTyClDecl Name] -> TcM () +checkCycleErrs tyclss + | null cls_cycles + = return () + | otherwise + = do { mappM_ recClsErr cls_cycles + ; failM } -- Give up now, because later checkValidTyCl + -- will loop if the synonym is recursive + where + cls_cycles = calcClassCycles tyclss + +checkValidTyCl :: TyClDecl Name -> TcM () +-- We do the validity check over declarations, rather than TyThings +-- only so that we can add a nice context with tcAddDeclCtxt +checkValidTyCl decl + = tcAddDeclCtxt decl $ + do { thing <- tcLookupLocatedGlobal (tcdLName decl) + ; traceTc (text "Validity of" <+> ppr thing) + ; case thing of + ATyCon tc -> checkValidTyCon tc + AClass cl -> checkValidClass cl + ; traceTc (text "Done validity of" <+> ppr thing) + } + +------------------------- +-- For data types declared with record syntax, we require +-- that each constructor that has a field 'f' +-- (a) has the same result type +-- (b) has the same type for 'f' +-- module alpha conversion of the quantified type variables +-- of the constructor. + +checkValidTyCon :: TyCon -> TcM () +checkValidTyCon tc + | isSynTyCon tc + = checkValidType syn_ctxt syn_rhs + | otherwise + = -- Check the context on the data decl + checkValidTheta (DataTyCtxt name) (tyConStupidTheta tc) `thenM_` + + -- Check arg types of data constructors + mappM_ (checkValidDataCon tc) data_cons `thenM_` + + -- Check that fields with the same name share a type + mappM_ check_fields groups + + where + syn_ctxt = TySynCtxt name + name = tyConName tc + syn_rhs = synTyConRhs tc + data_cons = tyConDataCons tc + + groups = equivClasses cmp_fld (concatMap get_fields data_cons) + cmp_fld (f1,_) (f2,_) = f1 `compare` f2 + get_fields con = dataConFieldLabels con `zip` repeat con + -- dataConFieldLabels may return the empty list, which is fine + + -- XXX - autrijus - Make this far more complex to acommodate + -- for different return types. Add res_ty to the mix, + -- comparing them in two steps, all for good error messages. + -- Plan: Use Unify.tcMatchTys to compare the first candidate's + -- result type against other candidates' types (check bothways). + -- If they magically agrees, take the substitution and + -- apply them to the latter ones, and see if they match perfectly. 
+ -- check_fields fields@((first_field_label, field_ty) : other_fields) + check_fields fields@((label, con1) : other_fields) + -- These fields all have the same name, but are from + -- different constructors in the data type + = recoverM (return ()) $ mapM_ checkOne other_fields + -- Check that all the fields in the group have the same type + -- NB: this check assumes that all the constructors of a given + -- data type use the same type variables + where + tvs1 = mkVarSet (dataConTyVars con1) + res1 = dataConResTys con1 + fty1 = dataConFieldType con1 label + + checkOne (_, con2) -- Do it bothways to ensure they are structurally identical + = do { checkFieldCompat label con1 con2 tvs1 res1 res2 fty1 fty2 + ; checkFieldCompat label con2 con1 tvs2 res2 res1 fty2 fty1 } + where + tvs2 = mkVarSet (dataConTyVars con2) + res2 = dataConResTys con2 + fty2 = dataConFieldType con2 label + +checkFieldCompat fld con1 con2 tvs1 res1 res2 fty1 fty2 + = do { checkTc (isJust mb_subst1) (resultTypeMisMatch fld con1 con2) + ; checkTc (isJust mb_subst2) (fieldTypeMisMatch fld con1 con2) } + where + mb_subst1 = tcMatchTys tvs1 res1 res2 + mb_subst2 = tcMatchTyX tvs1 (expectJust "checkFieldCompat" mb_subst1) fty1 fty2 + +------------------------------- +checkValidDataCon :: TyCon -> DataCon -> TcM () +checkValidDataCon tc con + = setSrcSpan (srcLocSpan (getSrcLoc con)) $ + addErrCtxt (dataConCtxt con) $ + do { checkTc (dataConTyCon con == tc) (badDataConTyCon con) + ; checkValidType ctxt (idType (dataConWrapId con)) } + + -- This checks the argument types and + -- ambiguity of the existential context (if any) + -- + -- Note [Sept 04] Now that tvs is all the tvs, this + -- test doesn't actually check anything +-- ; checkFreeness tvs ex_theta } + where + ctxt = ConArgCtxt (dataConName con) +-- (tvs, ex_theta, _, _, _) = dataConSig con + + +------------------------------- +checkValidClass :: Class -> TcM () +checkValidClass cls + = do { -- CHECK ARITY 1 FOR HASKELL 1.4 + gla_exts <- doptM Opt_GlasgowExts + + -- Check that the class is unary, unless GlaExs + ; checkTc (notNull tyvars) (nullaryClassErr cls) + ; checkTc (gla_exts || unary) (classArityErr cls) + + -- Check the super-classes + ; checkValidTheta (ClassSCCtxt (className cls)) theta + + -- Check the class operations + ; mappM_ (check_op gla_exts) op_stuff + + -- Check that if the class has generic methods, then the + -- class has only one parameter. We can't do generic + -- multi-parameter type classes! 
+ ; checkTc (unary || no_generics) (genericMultiParamErr cls) + } + where + (tyvars, theta, _, op_stuff) = classBigSig cls + unary = isSingleton tyvars + no_generics = null [() | (_, GenDefMeth) <- op_stuff] + + check_op gla_exts (sel_id, dm) + = addErrCtxt (classOpCtxt sel_id tau) $ do + { checkValidTheta SigmaCtxt (tail theta) + -- The 'tail' removes the initial (C a) from the + -- class itself, leaving just the method type + + ; checkValidType (FunSigCtxt op_name) tau + + -- Check that the type mentions at least one of + -- the class type variables + ; checkTc (any (`elemVarSet` tyVarsOfType tau) tyvars) + (noClassTyVarErr cls sel_id) + + -- Check that for a generic method, the type of + -- the method is sufficiently simple + ; checkTc (dm /= GenDefMeth || validGenericMethodType tau) + (badGenericMethodType op_name op_ty) + } + where + op_name = idName sel_id + op_ty = idType sel_id + (_,theta1,tau1) = tcSplitSigmaTy op_ty + (_,theta2,tau2) = tcSplitSigmaTy tau1 + (theta,tau) | gla_exts = (theta1 ++ theta2, tau2) + | otherwise = (theta1, mkPhiTy (tail theta1) tau1) + -- Ugh! The function might have a type like + -- op :: forall a. C a => forall b. (Eq b, Eq a) => tau2 + -- With -fglasgow-exts, we want to allow this, even though the inner + -- forall has an (Eq a) constraint. Whereas in general, each constraint + -- in the context of a for-all must mention at least one quantified + -- type variable. What a mess! + + +--------------------------------------------------------------------- +resultTypeMisMatch field_name con1 con2 + = vcat [sep [ptext SLIT("Constructors") <+> ppr con1 <+> ptext SLIT("and") <+> ppr con2, + ptext SLIT("have a common field") <+> quotes (ppr field_name) <> comma], + nest 2 $ ptext SLIT("but have different result types")] +fieldTypeMisMatch field_name con1 con2 + = sep [ptext SLIT("Constructors") <+> ppr con1 <+> ptext SLIT("and") <+> ppr con2, + ptext SLIT("give different types for field"), quotes (ppr field_name)] + +dataConCtxt con = ptext SLIT("In the definition of data constructor") <+> quotes (ppr con) + +classOpCtxt sel_id tau = sep [ptext SLIT("When checking the class method:"), + nest 2 (ppr sel_id <+> dcolon <+> ppr tau)] + +nullaryClassErr cls + = ptext SLIT("No parameters for class") <+> quotes (ppr cls) + +classArityErr cls + = vcat [ptext SLIT("Too many parameters for class") <+> quotes (ppr cls), + parens (ptext SLIT("Use -fglasgow-exts to allow multi-parameter classes"))] + +noClassTyVarErr clas op + = sep [ptext SLIT("The class method") <+> quotes (ppr op), + ptext SLIT("mentions none of the type variables of the class") <+> + ppr clas <+> hsep (map ppr (classTyVars clas))] + +genericMultiParamErr clas + = ptext SLIT("The multi-parameter class") <+> quotes (ppr clas) <+> + ptext SLIT("cannot have generic methods") + +badGenericMethodType op op_ty + = hang (ptext SLIT("Generic method type is too complex")) + 4 (vcat [ppr op <+> dcolon <+> ppr op_ty, + ptext SLIT("You can only use type variables, arrows, lists, and tuples")]) + +recSynErr syn_decls + = setSrcSpan (getLoc (head sorted_decls)) $ + addErr (sep [ptext SLIT("Cycle in type synonym declarations:"), + nest 2 (vcat (map ppr_decl sorted_decls))]) + where + sorted_decls = sortLocated syn_decls + ppr_decl (L loc decl) = ppr loc <> colon <+> ppr decl + +recClsErr cls_decls + = setSrcSpan (getLoc (head sorted_decls)) $ + addErr (sep [ptext SLIT("Cycle in class declarations (via superclasses):"), + nest 2 (vcat (map ppr_decl sorted_decls))]) + where + sorted_decls = sortLocated cls_decls + 
ppr_decl (L loc decl) = ppr loc <> colon <+> ppr (decl { tcdSigs = [] }) + +sortLocated :: [Located a] -> [Located a] +sortLocated things = sortLe le things + where + le (L l1 _) (L l2 _) = l1 <= l2 + +badDataConTyCon data_con + = hang (ptext SLIT("Data constructor") <+> quotes (ppr data_con) <+> + ptext SLIT("returns type") <+> quotes (ppr (dataConTyCon data_con))) + 2 (ptext SLIT("instead of its parent type")) + +badGadtDecl tc_name + = vcat [ ptext SLIT("Illegal generalised algebraic data declaration for") <+> quotes (ppr tc_name) + , nest 2 (parens $ ptext SLIT("Use -fglasgow-exts to allow GADTs")) ] + +newtypeConError tycon n + = sep [ptext SLIT("A newtype must have exactly one constructor,"), + nest 2 $ ptext SLIT("but") <+> quotes (ppr tycon) <+> ptext SLIT("has") <+> speakN n ] + +newtypeExError con + = sep [ptext SLIT("A newtype constructor cannot have an existential context,"), + nest 2 $ ptext SLIT("but") <+> quotes (ppr con) <+> ptext SLIT("does")] + +newtypeFieldErr con_name n_flds + = sep [ptext SLIT("The constructor of a newtype must have exactly one field"), + nest 2 $ ptext SLIT("but") <+> quotes (ppr con_name) <+> ptext SLIT("has") <+> speakN n_flds] + +emptyConDeclsErr tycon + = sep [quotes (ppr tycon) <+> ptext SLIT("has no constructors"), + nest 2 $ ptext SLIT("(-fglasgow-exts permits this)")] + +badBootClassDeclErr = ptext SLIT("Illegal class declaration in hs-boot file") +\end{code} diff --git a/compiler/typecheck/TcTyDecls.lhs b/compiler/typecheck/TcTyDecls.lhs new file mode 100644 index 0000000000..4ce5fed3f3 --- /dev/null +++ b/compiler/typecheck/TcTyDecls.lhs @@ -0,0 +1,473 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1999 +% + +Analysis functions over data types. Specficially + a) detecting recursive types + b) computing argument variances + +This stuff is only used for source-code decls; it's recorded in interface +files for imported data types. + + +\begin{code} +module TcTyDecls( + calcTyConArgVrcs, + calcRecFlags, + calcClassCycles, calcSynCycles + ) where + +#include "HsVersions.h" + +import TypeRep ( Type(..), TyNote(..), PredType(..) ) -- friend +import HsSyn ( TyClDecl(..), HsPred(..), LTyClDecl, isClassDecl ) +import RnHsSyn ( extractHsTyNames ) +import Type ( predTypeRep, tcView ) +import HscTypes ( TyThing(..), ModDetails(..) ) +import TyCon ( TyCon, ArgVrcs, tyConArity, tyConDataCons, tyConTyVars, + synTyConDefn, isSynTyCon, isAlgTyCon, + tyConName, isNewTyCon, isProductTyCon, tyConArgVrcs, newTyConRhs ) +import Class ( classTyCon ) +import DataCon ( dataConOrigArgTys ) +import Var ( TyVar ) +import VarSet +import Name ( Name, isTyVarName ) +import NameEnv +import NameSet +import Digraph ( SCC(..), stronglyConnComp, stronglyConnCompR ) +import BasicTypes ( RecFlag(..) ) +import SrcLoc ( Located(..), unLoc ) +import Outputable +\end{code} + + +%************************************************************************ +%* * + Cycles in class and type synonym declarations +%* * +%************************************************************************ + +Checking for class-decl loops is easy, because we don't allow class decls +in interface files. + +We allow type synonyms in hi-boot files, but we *trust* hi-boot files, +so we don't check for loops that involve them. So we only look for synonym +loops in the module being compiled. + +We check for type synonym and class cycles on the *source* code. 
+Main reasons: + + a) Otherwise we'd need a special function to extract type-synonym tycons + from a type, whereas we have extractHsTyNames already + + b) If we checked for type synonym loops after building the TyCon, we + can't do a hoistForAllTys on the type synonym rhs, (else we fall into + a black hole) which seems unclean. Apart from anything else, it'd mean + that a type-synonym rhs could have for-alls to the right of an arrow, + which means adding new cases to the validity checker + + Indeed, in general, checking for cycles beforehand means we need to + be less careful about black holes through synonym cycles. + +The main disadvantage is that a cycle that goes via a type synonym in an +.hi-boot file can lead the compiler into a loop, because it assumes that cycles +only occur entirely within the source code of the module being compiled. +But hi-boot files are trusted anyway, so this isn't much worse than (say) +a kind error. + +[ NOTE ---------------------------------------------- +If we reverse this decision, this comment came from tcTyDecl1, and should + go back there + -- dsHsType, not tcHsKindedType, to avoid a loop. tcHsKindedType does hoisting, + -- which requires looking through synonyms... and therefore goes into a loop + -- on (erroneously) recursive synonyms. + -- Solution: do not hoist synonyms, because they'll be hoisted soon enough + -- when they are substituted + +We'd also need to add back in this definition + +synTyConsOfType :: Type -> [TyCon] +-- Does not look through type synonyms at all +-- Return a list of synonym tycons +synTyConsOfType ty + = nameEnvElts (go ty) + where + go :: Type -> NameEnv TyCon -- The NameEnv does duplicate elim + go (TyVarTy v) = emptyNameEnv + go (TyConApp tc tys) = go_tc tc tys + go (AppTy a b) = go a `plusNameEnv` go b + go (FunTy a b) = go a `plusNameEnv` go b + go (PredTy (IParam _ ty)) = go ty + go (PredTy (ClassP cls tys)) = go_s tys -- Ignore class + go (NoteTy _ ty) = go ty + go (ForAllTy _ ty) = go ty + + go_tc tc tys | isSynTyCon tc = extendNameEnv (go_s tys) (tyConName tc) tc + | otherwise = go_s tys + go_s tys = foldr (plusNameEnv . go) emptyNameEnv tys +---------------------------------------- END NOTE ] + +\begin{code} +calcSynCycles :: [LTyClDecl Name] -> [SCC (LTyClDecl Name)] +calcSynCycles decls + = stronglyConnComp syn_edges + where + syn_edges = [ (ldecl, unLoc (tcdLName decl), + mk_syn_edges (tcdSynRhs decl)) + | ldecl@(L _ decl) <- decls ] + + mk_syn_edges rhs = [ tc | tc <- nameSetToList (extractHsTyNames rhs), + not (isTyVarName tc) ] + + +calcClassCycles :: [LTyClDecl Name] -> [[LTyClDecl Name]] +calcClassCycles decls + = [decls | CyclicSCC decls <- stronglyConnComp cls_edges] + where + cls_edges = [ (ldecl, unLoc (tcdLName decl), + mk_cls_edges (unLoc (tcdCtxt decl))) + | ldecl@(L _ decl) <- decls, isClassDecl decl ] + + mk_cls_edges ctxt = [ cls | L _ (HsClassP cls _) <- ctxt ] +\end{code} + + +%************************************************************************ +%* * + Deciding which type constructors are recursive +%* * +%************************************************************************ + +For newtypes, we label some as "recursive" such that + + INVARIANT: there is no cycle of non-recursive newtypes + +In any loop, only one newtype need be marked as recursive; it is +a "loop breaker". Labelling more than necessary as recursive is OK, +provided the invariant is maintained. 
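For instance (an editorial sketch, not part of the original source), given
the mutually recursive pair

	newtype A = MkA B
	newtype B = MkB A

it is enough to mark either A or B as recursive; marking both is also
fine, but marking neither would leave a cycle consisting entirely of
non-recursive newtypes, violating the invariant above.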
+ +A newtype M.T is defined to be "recursive" iff + (a) it is declared in an hi-boot file (see RdrHsSyn.hsIfaceDecl) + (b) it is declared in a source file, but that source file has a + companion hi-boot file which declares the type + or (c) one can get from T's rhs to T via type + synonyms, or non-recursive newtypes *in M* + e.g. newtype T = MkT (T -> Int) + +(a) is conservative; declarations in hi-boot files are always + made loop breakers. That's why in (b) we can restrict attention + to tycons in M, because any loops through newtypes outside M + will be broken by those newtypes +(b) ensures that a newtype is not treated as a loop breaker in one place +and later as a non-loop-breaker. This matters in GHCi particularly, when +a newtype T might be embedded in many types in the environment, and then +T's source module is compiled. We don't want T's recursiveness to change. + +The "recursive" flag for algebraic data types is irrelevant (never consulted) +for types with more than one constructor. + +An algebraic data type M.T is "recursive" iff + it has just one constructor, and + (a) it is declared in an hi-boot file (see RdrHsSyn.hsIfaceDecl) + (b) it is declared in a source file, but that source file has a + companion hi-boot file which declares the type + or (c) one can get from its arg types to T via type synonyms, + or by non-recursive newtypes or non-recursive product types in M + e.g. data T = MkT (T -> Int) Bool +Just like newtype in fact + +A type synonym is recursive if one can get from its +right hand side back to it via type synonyms. (This is +reported as an error.) + +A class is recursive if one can get from its superclasses +back to it. (This is an error too.) + +Hi-boot types +~~~~~~~~~~~~~ +A data type read from an hi-boot file will have an AbstractTyCon as its AlgTyConRhs +and will respond True to isHiBootTyCon. The idea is that we treat these as if one +could get from these types to anywhere. So when we see + + module Baz where + import {-# SOURCE #-} Foo( T ) + newtype S = MkS T + +then we mark S as recursive, just in case. What that means is that if we see + + import Baz( S ) + newtype R = MkR S + +then we don't need to look inside S to compute R's recursiveness. Since S is imported +(not from an hi-boot file), one cannot get from R back to S except via an hi-boot file, +and that means that some data type will be marked recursive along the way. So R is +unconditionly non-recursive (i.e. there'll be a loop breaker elsewhere if necessary) + +This in turn means that we grovel through fewer interface files when computing +recursiveness, because we need only look at the type decls in the module being +compiled, plus the outer structure of directly-mentioned types. + +\begin{code} +calcRecFlags :: ModDetails -> [TyThing] -> (Name -> RecFlag) +-- The 'boot_names' are the things declared in M.hi-boot, if M is the current module. 
+-- Any type constructors in boot_names are automatically considered loop breakers +calcRecFlags boot_details tyclss + = is_rec + where + is_rec n | n `elemNameSet` rec_names = Recursive + | otherwise = NonRecursive + + boot_name_set = md_exports boot_details + rec_names = boot_name_set `unionNameSets` + nt_loop_breakers `unionNameSets` + prod_loop_breakers + + all_tycons = [ tc | tycls <- tyclss, + -- Recursion of newtypes/data types can happen via + -- the class TyCon, so tyclss includes the class tycons + let tc = getTyCon tycls, + not (tyConName tc `elemNameSet` boot_name_set) ] + -- Remove the boot_name_set because they are going + -- to be loop breakers regardless. + + ------------------------------------------------- + -- NOTE + -- These edge-construction loops rely on + -- every loop going via tyclss, the types and classes + -- in the module being compiled. Stuff in interface + -- files should be correctly marked. If not (e.g. a + -- type synonym in a hi-boot file) we can get an infinite + -- loop. We could program round this, but it'd make the code + -- rather less nice, so I'm not going to do that yet. + + --------------- Newtypes ---------------------- + new_tycons = filter isNewTyCon all_tycons + nt_loop_breakers = mkNameSet (findLoopBreakers nt_edges) + is_rec_nt tc = tyConName tc `elemNameSet` nt_loop_breakers + -- is_rec_nt is a locally-used helper function + + nt_edges = [(t, mk_nt_edges t) | t <- new_tycons] + + mk_nt_edges nt -- Invariant: nt is a newtype + = concatMap (mk_nt_edges1 nt) (tcTyConsOfType (new_tc_rhs nt)) + -- tyConsOfType looks through synonyms + + mk_nt_edges1 nt tc + | tc `elem` new_tycons = [tc] -- Loop + -- At this point we know that either it's a local *data* type, + -- or it's imported. Either way, it can't form part of a newtype cycle + | otherwise = [] + + --------------- Product types ---------------------- + -- The "prod_tycons" are the non-newtype products + prod_tycons = [tc | tc <- all_tycons, + not (isNewTyCon tc), isProductTyCon tc] + prod_loop_breakers = mkNameSet (findLoopBreakers prod_edges) + + prod_edges = [(tc, mk_prod_edges tc) | tc <- prod_tycons] + + mk_prod_edges tc -- Invariant: tc is a product tycon + = concatMap (mk_prod_edges1 tc) (dataConOrigArgTys (head (tyConDataCons tc))) + + mk_prod_edges1 ptc ty = concatMap (mk_prod_edges2 ptc) (tcTyConsOfType ty) + + mk_prod_edges2 ptc tc + | tc `elem` prod_tycons = [tc] -- Local product + | tc `elem` new_tycons = if is_rec_nt tc -- Local newtype + then [] + else mk_prod_edges1 ptc (new_tc_rhs tc) + -- At this point we know that either it's a local non-product data type, + -- or it's imported. Either way, it can't form part of a cycle + | otherwise = [] + +new_tc_rhs tc = snd (newTyConRhs tc) -- Ignore the type variables + +getTyCon (ATyCon tc) = tc +getTyCon (AClass cl) = classTyCon cl + +findLoopBreakers :: [(TyCon, [TyCon])] -> [Name] +-- Finds a set of tycons that cut all loops +findLoopBreakers deps + = go [(tc,tc,ds) | (tc,ds) <- deps] + where + go edges = [ name + | CyclicSCC ((tc,_,_) : edges') <- stronglyConnCompR edges, + name <- tyConName tc : go edges'] +\end{code} + +These two functions know about type representations, so they could be +in Type or TcType -- but they are very specialised to this module, so +I've chosen to put them here. + +\begin{code} +tcTyConsOfType :: Type -> [TyCon] +-- tcTyConsOfType looks through all synonyms, but not through any newtypes. +-- When it finds a Class, it returns the class TyCon. 
The reaons it's here +-- (not in Type.lhs) is because it is newtype-aware. +tcTyConsOfType ty + = nameEnvElts (go ty) + where + go :: Type -> NameEnv TyCon -- The NameEnv does duplicate elim + go ty | Just ty' <- tcView ty = go ty' + go (TyVarTy v) = emptyNameEnv + go (TyConApp tc tys) = go_tc tc tys + go (AppTy a b) = go a `plusNameEnv` go b + go (FunTy a b) = go a `plusNameEnv` go b + go (PredTy (IParam _ ty)) = go ty + go (PredTy (ClassP cls tys)) = go_tc (classTyCon cls) tys + go (ForAllTy _ ty) = go ty + + go_tc tc tys = extendNameEnv (go_s tys) (tyConName tc) tc + go_s tys = foldr (plusNameEnv . go) emptyNameEnv tys +\end{code} + + +%************************************************************************ +%* * + Compuing TyCon argument variances +%* * +%************************************************************************ + +Computing the tyConArgVrcs info +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +@tyConArgVrcs@ gives a list of (occPos,occNeg) flags, one for each +tyvar. For @AlgTyCon@s and @SynTyCon@s, this info must be precomputed +separately. Note that this is information about occurrences of type +variables, not usages of term variables. + +The function @calcTyConArgVrcs@ must be passed a list of *algebraic or +syntycons only* such that all tycons referred to (by mutual recursion) +appear in the list. The fixpointing will be done on this set of +tycons as a whole. It returns a list of @tyconVrcInfo@ data, ready to +be (knot-tyingly?) stuck back into the appropriate fields. + +\begin{code} +calcTyConArgVrcs :: [TyThing] -> Name -> ArgVrcs +-- Gives arg variances for TyCons, +-- including the class TyCon of a class +calcTyConArgVrcs tyclss + = get_vrc + where + tycons = map getTyCon tyclss + + -- We should only look up things that are in the map + get_vrc n = case lookupNameEnv final_oi n of + Just (_, pms) -> pms + Nothing -> pprPanic "calcVrcs" (ppr n) + + -- We are going to fold over this map, + -- so we need the TyCon in the range + final_oi :: NameEnv (TyCon, ArgVrcs) + final_oi = tcaoFix initial_oi + + initial_oi :: NameEnv (TyCon, ArgVrcs) + initial_oi = mkNameEnv [(tyConName tc, (tc, initial tc)) + | tc <- tycons] + initial tc = replicate (tyConArity tc) (False,False) + + tcaoFix :: NameEnv (TyCon, ArgVrcs) -- initial ArgVrcs per tycon + -> NameEnv (TyCon, ArgVrcs) -- fixpointed ArgVrcs per tycon + tcaoFix oi + | changed = tcaoFix oi' + | otherwise = oi' + where + (changed,oi') = foldNameEnv iterate (False,oi) oi + + iterate (tc, pms) (changed,oi') + = (changed || (pms /= pms'), + extendNameEnv oi' (tyConName tc) (tc, pms')) + where + pms' = tcaoIter oi' tc -- seq not simult + + tcaoIter :: NameEnv (TyCon, ArgVrcs) -- reference ArgVrcs (initial) + -> TyCon -- tycon to update + -> ArgVrcs -- new ArgVrcs for tycon + + tcaoIter oi tc | isAlgTyCon tc + = map (\v -> anyVrc (vrcInTy (lookup oi) v) argtys) vs + where + data_cons = tyConDataCons tc + vs = tyConTyVars tc + argtys = concatMap dataConOrigArgTys data_cons -- Rep? or Orig? + + tcaoIter oi tc | isSynTyCon tc + = let (tyvs,ty) = synTyConDefn tc + -- we use the already-computed result for tycons not in this SCC + in map (\v -> vrcInTy (lookup oi) v ty) tyvs + + lookup oi tc = case lookupNameEnv oi (tyConName tc) of + Just (_, pms) -> pms + Nothing -> tyConArgVrcs tc + -- We use the already-computed result for tycons not in this SCC +\end{code} + + +Variance of tyvars in a type +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +A general variance-check function. 
We pass a function for determining +the @ArgVrc@s of a tycon; when fixpointing this refers to the current +value; otherwise this should be looked up from the tycon's own +tyConArgVrcs. Again, it knows the representation of Types. + +\begin{code} +vrcInTy :: (TyCon -> ArgVrcs) -- function to get argVrcs of a tycon (break out of recursion) + -> TyVar -- tyvar to check Vrcs of + -> Type -- type to check for occ in + -> (Bool,Bool) -- (occurs positively, occurs negatively) + +vrcInTy fao v (NoteTy (FTVNote ftv) ty) = if elemVarSet v ftv + then vrcInTy fao v ty + else (False,False) + -- note that ftv cannot be calculated as occPos||occNeg, + -- since if a tyvar occurs only as unused tyconarg, + -- occPos==occNeg==False, but ftv=True + +vrcInTy fao v (TyVarTy v') = if v==v' + then (True,False) + else (False,False) + +vrcInTy fao v (AppTy ty1 ty2) = if vrcInTy fao v ty2 /= (False,False) + then (True,True) + else vrcInTy fao v ty1 + -- ty1 is probably unknown (or it would have been beta-reduced); + -- hence if v occurs in ty2 at all then it could occur with + -- either variance. Otherwise it occurs as it does in ty1. + +vrcInTy fao v (FunTy ty1 ty2) = negVrc (vrcInTy fao v ty1) + `orVrc` + vrcInTy fao v ty2 + +vrcInTy fao v (ForAllTy v' ty) = if v==v' + then (False,False) + else vrcInTy fao v ty + +vrcInTy fao v (TyConApp tc tys) = let pms1 = map (vrcInTy fao v) tys + pms2 = fao tc + in orVrcs (zipWith timesVrc pms1 pms2) + +vrcInTy fao v (PredTy st) = vrcInTy fao v (predTypeRep st) +\end{code} + +Variance algebra +~~~~~~~~~~~~~~~~ + +\begin{code} +orVrc :: (Bool,Bool) -> (Bool,Bool) -> (Bool,Bool) +orVrc (p1,m1) (p2,m2) = (p1||p2,m1||m2) + +orVrcs :: [(Bool,Bool)] -> (Bool,Bool) +orVrcs = foldl orVrc (False,False) + +negVrc :: (Bool,Bool) -> (Bool,Bool) +negVrc (p1,m1) = (m1,p1) + +anyVrc :: (a -> (Bool,Bool)) -> [a] -> (Bool,Bool) +anyVrc p as = foldl (\ pm a -> pm `orVrc` p a) + (False,False) as + +timesVrc :: (Bool,Bool) -> (Bool,Bool) -> (Bool,Bool) +timesVrc (p1,m1) (p2,m2) = (p1 && p2 || m1 && m2, + p1 && m2 || m1 && p2) +\end{code} diff --git a/compiler/typecheck/TcType.hi-boot-5 b/compiler/typecheck/TcType.hi-boot-5 new file mode 100644 index 0000000000..23b3a9c963 --- /dev/null +++ b/compiler/typecheck/TcType.hi-boot-5 @@ -0,0 +1,3 @@ +__interface TcType 1 0 where +__export TcType TyVarDetails; +1 data TyVarDetails ; diff --git a/compiler/typecheck/TcType.hi-boot-6 b/compiler/typecheck/TcType.hi-boot-6 new file mode 100644 index 0000000000..d1fc721c64 --- /dev/null +++ b/compiler/typecheck/TcType.hi-boot-6 @@ -0,0 +1,5 @@ +module TcType where + +data TcTyVarDetails + +pprTcTyVarDetails :: TcTyVarDetails -> Outputable.SDoc
\ No newline at end of file diff --git a/compiler/typecheck/TcType.lhs b/compiler/typecheck/TcType.lhs new file mode 100644 index 0000000000..4b6e7b814e --- /dev/null +++ b/compiler/typecheck/TcType.lhs @@ -0,0 +1,1202 @@ + +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[TcType]{Types used in the typechecker} + +This module provides the Type interface for front-end parts of the +compiler. These parts + + * treat "source types" as opaque: + newtypes, and predicates are meaningful. + * look through usage types + +The "tc" prefix is for "typechechecker", because the type checker +is the principal client. + +\begin{code} +module TcType ( + -------------------------------- + -- Types + TcType, TcSigmaType, TcRhoType, TcTauType, TcPredType, TcThetaType, + TcTyVar, TcTyVarSet, TcKind, + + BoxyTyVar, BoxySigmaType, BoxyRhoType, BoxyThetaType, BoxyType, + + -------------------------------- + -- MetaDetails + UserTypeCtxt(..), pprUserTypeCtxt, + TcTyVarDetails(..), BoxInfo(..), pprTcTyVarDetails, + MetaDetails(Flexi, Indirect), SkolemInfo(..), pprSkolTvBinding, pprSkolInfo, + isImmutableTyVar, isSkolemTyVar, isMetaTyVar, isBoxyTyVar, isSigTyVar, isExistentialTyVar, + metaTvRef, + isFlexi, isIndirect, + + -------------------------------- + -- Builders + mkPhiTy, mkSigmaTy, + + -------------------------------- + -- Splitters + -- These are important because they do not look through newtypes + tcView, + tcSplitForAllTys, tcSplitPhiTy, + tcSplitFunTy_maybe, tcSplitFunTys, tcFunArgTy, tcFunResultTy, tcSplitFunTysN, + tcSplitTyConApp, tcSplitTyConApp_maybe, tcTyConAppTyCon, tcTyConAppArgs, + tcSplitAppTy_maybe, tcSplitAppTy, tcSplitAppTys, + tcValidInstHeadTy, tcGetTyVar_maybe, tcGetTyVar, + tcSplitSigmaTy, tcMultiSplitSigmaTy, + + --------------------------------- + -- Predicates. 
+ -- Again, newtypes are opaque + tcEqType, tcEqTypes, tcEqPred, tcCmpType, tcCmpTypes, tcCmpPred, tcEqTypeX, + isSigmaTy, isOverloadedTy, isRigidTy, isBoxyTy, + isDoubleTy, isFloatTy, isIntTy, isStringTy, + isIntegerTy, isAddrTy, isBoolTy, isUnitTy, + isTauTy, isTauTyCon, tcIsTyVarTy, tcIsForAllTy, + + --------------------------------- + -- Misc type manipulators + deNoteType, classesOfTheta, + tyClsNamesOfType, tyClsNamesOfDFunHead, + getDFunTyKey, + + --------------------------------- + -- Predicate types + getClassPredTys_maybe, getClassPredTys, + isClassPred, isTyVarClassPred, + mkDictTy, tcSplitPredTy_maybe, + isPredTy, isDictTy, tcSplitDFunTy, tcSplitDFunHead, predTyUnique, + mkClassPred, isInheritablePred, isLinearPred, isIPPred, mkPredName, + dataConsStupidTheta, isRefineableTy, + + --------------------------------- + -- Foreign import and export + isFFIArgumentTy, -- :: DynFlags -> Safety -> Type -> Bool + isFFIImportResultTy, -- :: DynFlags -> Type -> Bool + isFFIExportResultTy, -- :: Type -> Bool + isFFIExternalTy, -- :: Type -> Bool + isFFIDynArgumentTy, -- :: Type -> Bool + isFFIDynResultTy, -- :: Type -> Bool + isFFILabelTy, -- :: Type -> Bool + isFFIDotnetTy, -- :: DynFlags -> Type -> Bool + isFFIDotnetObjTy, -- :: Type -> Bool + isFFITy, -- :: Type -> Bool + + toDNType, -- :: Type -> DNType + + -------------------------------- + -- Rexported from Type + Kind, -- Stuff to do with kinds is insensitive to pre/post Tc + unliftedTypeKind, liftedTypeKind, openTypeKind, mkArrowKind, mkArrowKinds, + isLiftedTypeKind, isUnliftedTypeKind, isOpenTypeKind, + isArgTypeKind, isSubKind, defaultKind, + + Type, PredType(..), ThetaType, + mkForAllTy, mkForAllTys, + mkFunTy, mkFunTys, zipFunTys, + mkTyConApp, mkAppTy, mkAppTys, applyTy, applyTys, + mkTyVarTy, mkTyVarTys, mkTyConTy, mkPredTy, mkPredTys, + + -- Type substitutions + TvSubst(..), -- Representation visible to a few friends + TvSubstEnv, emptyTvSubst, + mkOpenTvSubst, zipOpenTvSubst, zipTopTvSubst, mkTopTvSubst, notElemTvSubst, + getTvSubstEnv, setTvSubstEnv, getTvInScope, extendTvInScope, lookupTyVar, + extendTvSubst, extendTvSubstList, isInScope, mkTvSubst, zipTyEnv, + substTy, substTys, substTyWith, substTheta, substTyVar, substTyVarBndr, + + isUnLiftedType, -- Source types are always lifted + isUnboxedTupleType, -- Ditto + isPrimitiveType, + + tidyTopType, tidyType, tidyPred, tidyTypes, tidyFreeTyVars, tidyOpenType, tidyOpenTypes, + tidyTyVarBndr, tidyOpenTyVar, tidyOpenTyVars, tidySkolemTyVar, + typeKind, tidyKind, + + tyVarsOfType, tyVarsOfTypes, tyVarsOfPred, tyVarsOfTheta, + tcTyVarsOfType, tcTyVarsOfTypes, exactTyVarsOfType, exactTyVarsOfTypes, + + pprKind, pprParendKind, + pprType, pprParendType, pprTyThingCategory, + pprPred, pprTheta, pprThetaArrow, pprClassPred + + ) where + +#include "HsVersions.h" + +-- friends: +import TypeRep ( Type(..), funTyCon ) -- friend + +import Type ( -- Re-exports + tyVarsOfType, tyVarsOfTypes, tyVarsOfPred, + tyVarsOfTheta, Kind, PredType(..), + ThetaType, unliftedTypeKind, + liftedTypeKind, openTypeKind, mkArrowKind, + isLiftedTypeKind, isUnliftedTypeKind, + mkArrowKinds, mkForAllTy, mkForAllTys, + defaultKind, isArgTypeKind, isOpenTypeKind, + mkFunTy, mkFunTys, zipFunTys, + mkTyConApp, mkAppTy, + mkAppTys, applyTy, applyTys, + mkTyVarTy, mkTyVarTys, mkTyConTy, mkPredTy, + mkPredTys, isUnLiftedType, + isUnboxedTupleType, isPrimitiveType, + splitTyConApp_maybe, + tidyTopType, tidyType, tidyPred, tidyTypes, + tidyFreeTyVars, tidyOpenType, tidyOpenTypes, + tidyTyVarBndr, tidyOpenTyVar, + 
tidyOpenTyVars, tidyKind, + isSubKind, tcView, + + tcEqType, tcEqTypes, tcCmpType, tcCmpTypes, + tcEqPred, tcCmpPred, tcEqTypeX, + + TvSubst(..), + TvSubstEnv, emptyTvSubst, mkTvSubst, zipTyEnv, + mkOpenTvSubst, zipOpenTvSubst, zipTopTvSubst, mkTopTvSubst, + getTvSubstEnv, setTvSubstEnv, getTvInScope, extendTvInScope, + extendTvSubst, extendTvSubstList, isInScope, notElemTvSubst, + substTy, substTys, substTyWith, substTheta, + substTyVar, substTyVarBndr, substPred, lookupTyVar, + + typeKind, repType, + pprKind, pprParendKind, + pprType, pprParendType, pprTyThingCategory, + pprPred, pprTheta, pprThetaArrow, pprClassPred + ) +import TyCon ( TyCon, isUnLiftedTyCon, isSynTyCon, synTyConDefn, tyConUnique ) +import DataCon ( DataCon, dataConStupidTheta, dataConResTys ) +import Class ( Class ) +import Var ( TyVar, Id, isTcTyVar, mkTcTyVar, tyVarName, tyVarKind, tcTyVarDetails ) +import ForeignCall ( Safety, playSafe, DNType(..) ) +import Unify ( tcMatchTys ) +import VarSet + +-- others: +import DynFlags ( DynFlags, DynFlag( Opt_GlasgowExts ), dopt ) +import Name ( Name, NamedThing(..), mkInternalName, getSrcLoc ) +import NameSet +import VarEnv ( TidyEnv ) +import OccName ( OccName, mkDictOcc ) +import PrelNames -- Lots (e.g. in isFFIArgumentTy) +import TysWiredIn ( unitTyCon, charTyCon, listTyCon ) +import BasicTypes ( IPName(..), Arity, ipNameName ) +import SrcLoc ( SrcLoc, SrcSpan ) +import Util ( snocView, equalLength ) +import Maybes ( maybeToBool, expectJust, mapCatMaybes ) +import ListSetOps ( hasNoDups ) +import List ( nubBy ) +import Outputable +import DATA_IOREF +\end{code} + + +%************************************************************************ +%* * +\subsection{Types} +%* * +%************************************************************************ + +The type checker divides the generic Type world into the +following more structured beasts: + +sigma ::= forall tyvars. phi + -- A sigma type is a qualified type + -- + -- Note that even if 'tyvars' is empty, theta + -- may not be: e.g. (?x::Int) => Int + + -- Note that 'sigma' is in prenex form: + -- all the foralls are at the front. + -- A 'phi' type has no foralls to the right of + -- an arrow + +phi :: theta => rho + +rho ::= sigma -> rho + | tau + +-- A 'tau' type has no quantification anywhere +-- Note that the args of a type constructor must be taus +tau ::= tyvar + | tycon tau_1 .. tau_n + | tau_1 tau_2 + | tau_1 -> tau_2 + +-- In all cases, a (saturated) type synonym application is legal, +-- provided it expands to the required form. + +\begin{code} +type TcTyVar = TyVar -- Used only during type inference +type TcType = Type -- A TcType can have mutable type variables + -- Invariant on ForAllTy in TcTypes: + -- forall a. 
T + -- a cannot occur inside a MutTyVar in T; that is, + -- T is "flattened" before quantifying over a + +-- These types do not have boxy type variables in them +type TcPredType = PredType +type TcThetaType = ThetaType +type TcSigmaType = TcType +type TcRhoType = TcType +type TcTauType = TcType +type TcKind = Kind +type TcTyVarSet = TyVarSet + +-- These types may have boxy type variables in them +type BoxyTyVar = TcTyVar +type BoxyRhoType = TcType +type BoxyThetaType = TcThetaType +type BoxySigmaType = TcType +type BoxyType = TcType +\end{code} + + +%************************************************************************ +%* * +\subsection{TyVarDetails} +%* * +%************************************************************************ + +TyVarDetails gives extra info about type variables, used during type +checking. It's attached to mutable type variables only. +It's knot-tied back to Var.lhs. There is no reason in principle +why Var.lhs shouldn't actually have the definition, but it "belongs" here. + + +Note [Signature skolems] +~~~~~~~~~~~~~~~~~~~~~~~~ +Consider this + + x :: [a] + y :: b + (x,y,z) = ([y,z], z, head x) + +Here, x and y have type sigs, which go into the environment. We used to +instantiate their types with skolem constants, and push those types into +the RHS, so we'd typecheck the RHS with type + ( [a*], b*, c ) +where a*, b* are skolem constants, and c is an ordinary meta type varible. + +The trouble is that the occurrences of z in the RHS force a* and b* to +be the *same*, so we can't make them into skolem constants that don't unify +with each other. Alas. + +On the other hand, we *must* use skolems for signature type variables, +becuase GADT type refinement refines skolems only. + +One solution would be insist that in the above defn the programmer uses +the same type variable in both type signatures. But that takes explanation. + +The alternative (currently implemented) is to have a special kind of skolem +constant, SigSkokTv, which can unify with other SigSkolTvs. + + +\begin{code} +-- A TyVarDetails is inside a TyVar +data TcTyVarDetails + = SkolemTv SkolemInfo -- A skolem constant + + | MetaTv BoxInfo (IORef MetaDetails) + +data BoxInfo + = BoxTv -- The contents is a (non-boxy) sigma-type + -- That is, this MetaTv is a "box" + + | TauTv -- The contents is a (non-boxy) tau-type + -- That is, this MetaTv is an ordinary unification variable + + | SigTv SkolemInfo -- A variant of TauTv, except that it should not be + -- unified with a type, only with a type variable + -- SigTvs are only distinguished to improve error messages + -- see Note [Signature skolems] + -- The MetaDetails, if filled in, will + -- always be another SigTv or a SkolemTv + +-- INVARIANTS: +-- A TauTv is always filled in with a tau-type, which +-- never contains any BoxTvs, nor any ForAlls +-- +-- However, a BoxTv can contain a type that contains further BoxTvs +-- Notably, when typechecking an explicit list, say [e1,e2], with +-- expected type being a box b1, we fill in b1 with (List b2), where +-- b2 is another (currently empty) box. + +data MetaDetails + = Flexi -- Flexi type variables unify to become + -- Indirects. 
+ + | Indirect TcType -- INVARIANT: + -- For a BoxTv, this type must be non-boxy + -- For a TauTv, this type must be a tau-type + +data SkolemInfo + = SigSkol UserTypeCtxt -- A skolem that is created by instantiating + -- a programmer-supplied type signature + -- Location of the binding site is on the TyVar + + -- The rest are for non-scoped skolems + | ClsSkol Class -- Bound at a class decl + | InstSkol Id -- Bound at an instance decl + | PatSkol DataCon -- An existential type variable bound by a pattern for + SrcSpan -- a data constructor with an existential type. E.g. + -- data T = forall a. Eq a => MkT a + -- f (MkT x) = ... + -- The pattern MkT x will allocate an existential type + -- variable for 'a'. + | ArrowSkol SrcSpan -- An arrow form (see TcArrows) + + | GenSkol [TcTyVar] -- Bound when doing a subsumption check for + TcType -- (forall tvs. ty) + SrcSpan + + | UnkSkol -- Unhelpful info (until I improve it) + +------------------------------------- +-- UserTypeCtxt describes the places where a +-- programmer-written type signature can occur +data UserTypeCtxt + = FunSigCtxt Name -- Function type signature + -- Also used for types in SPECIALISE pragmas + | ExprSigCtxt -- Expression type signature + | ConArgCtxt Name -- Data constructor argument + | TySynCtxt Name -- RHS of a type synonym decl + | GenPatCtxt -- Pattern in generic decl + -- f{| a+b |} (Inl x) = ... + | LamPatSigCtxt -- Type sig in lambda pattern + -- f (x::t) = ... + | BindPatSigCtxt -- Type sig in pattern binding pattern + -- (x::t, y) = e + | ResSigCtxt -- Result type sig + -- f x :: t = .... + | ForSigCtxt Name -- Foreign inport or export signature + | RuleSigCtxt Name -- Signature on a forall'd variable in a RULE + | DefaultDeclCtxt -- Types in a default declaration + | SpecInstCtxt -- SPECIALISE instance pragma + +-- Notes re TySynCtxt +-- We allow type synonyms that aren't types; e.g. type List = [] +-- +-- If the RHS mentions tyvars that aren't in scope, we'll +-- quantify over them: +-- e.g. type T = a->a +-- will become type T = forall a. a->a +-- +-- With gla-exts that's right, but for H98 we should complain. 
+\end{code} + +%************************************************************************ +%* * + Pretty-printing +%* * +%************************************************************************ + +\begin{code} +pprTcTyVarDetails :: TcTyVarDetails -> SDoc +-- For debugging +pprTcTyVarDetails (SkolemTv _) = ptext SLIT("sk") +pprTcTyVarDetails (MetaTv BoxTv _) = ptext SLIT("box") +pprTcTyVarDetails (MetaTv TauTv _) = ptext SLIT("tau") +pprTcTyVarDetails (MetaTv (SigTv _) _) = ptext SLIT("sig") + +pprUserTypeCtxt :: UserTypeCtxt -> SDoc +pprUserTypeCtxt (FunSigCtxt n) = ptext SLIT("the type signature for") <+> quotes (ppr n) +pprUserTypeCtxt ExprSigCtxt = ptext SLIT("an expression type signature") +pprUserTypeCtxt (ConArgCtxt c) = ptext SLIT("the type of the constructor") <+> quotes (ppr c) +pprUserTypeCtxt (TySynCtxt c) = ptext SLIT("the RHS of the type synonym") <+> quotes (ppr c) +pprUserTypeCtxt GenPatCtxt = ptext SLIT("the type pattern of a generic definition") +pprUserTypeCtxt LamPatSigCtxt = ptext SLIT("a pattern type signature") +pprUserTypeCtxt BindPatSigCtxt = ptext SLIT("a pattern type signature") +pprUserTypeCtxt ResSigCtxt = ptext SLIT("a result type signature") +pprUserTypeCtxt (ForSigCtxt n) = ptext SLIT("the foreign declaration for") <+> quotes (ppr n) +pprUserTypeCtxt (RuleSigCtxt n) = ptext SLIT("the type signature for") <+> quotes (ppr n) +pprUserTypeCtxt DefaultDeclCtxt = ptext SLIT("a type in a `default' declaration") +pprUserTypeCtxt SpecInstCtxt = ptext SLIT("a SPECIALISE instance pragma") + + +-------------------------------- +tidySkolemTyVar :: TidyEnv -> TcTyVar -> (TidyEnv, TcTyVar) +-- Tidy the type inside a GenSkol, preparatory to printing it +tidySkolemTyVar env tv + = ASSERT( isSkolemTyVar tv ) + (env1, mkTcTyVar (tyVarName tv) (tyVarKind tv) info1) + where + (env1, info1) = case tcTyVarDetails tv of + SkolemTv (GenSkol tvs ty loc) -> (env2, SkolemTv (GenSkol tvs1 ty1 loc)) + where + (env1, tvs1) = tidyOpenTyVars env tvs + (env2, ty1) = tidyOpenType env1 ty + info -> (env, info) + +pprSkolTvBinding :: TcTyVar -> SDoc +-- Print info about the binding of a skolem tyvar, +-- or nothing if we don't have anything useful to say +pprSkolTvBinding tv + = ppr_details (tcTyVarDetails tv) + where + ppr_details (MetaTv TauTv _) = quotes (ppr tv) <+> ptext SLIT("is a meta type variable") + ppr_details (MetaTv BoxTv _) = quotes (ppr tv) <+> ptext SLIT("is a boxy type variable") + ppr_details (MetaTv (SigTv info) _) = ppr_skol info + ppr_details (SkolemTv info) = ppr_skol info + + ppr_skol UnkSkol = empty -- Unhelpful; omit + ppr_skol (SigSkol ctxt) = sep [quotes (ppr tv) <+> ptext SLIT("is bound by") <+> pprUserTypeCtxt ctxt, + nest 2 (ptext SLIT("at") <+> ppr (getSrcLoc tv))] + ppr_skol info = quotes (ppr tv) <+> pprSkolInfo info + +pprSkolInfo :: SkolemInfo -> SDoc +pprSkolInfo (SigSkol ctxt) = ptext SLIT("is bound by") <+> pprUserTypeCtxt ctxt +pprSkolInfo (ClsSkol cls) = ptext SLIT("is bound by the class declaration for") <+> quotes (ppr cls) +pprSkolInfo (InstSkol df) = ptext SLIT("is bound by the instance declaration at") <+> ppr (getSrcLoc df) +pprSkolInfo (ArrowSkol loc) = ptext SLIT("is bound by the arrow form at") <+> ppr loc +pprSkolInfo (PatSkol dc loc) = sep [ptext SLIT("is bound by the pattern for") <+> quotes (ppr dc), + nest 2 (ptext SLIT("at") <+> ppr loc)] +pprSkolInfo (GenSkol tvs ty loc) = sep [sep [ptext SLIT("is bound by the polymorphic type"), + nest 2 (quotes (ppr (mkForAllTys tvs ty)))], + nest 2 (ptext SLIT("at") <+> ppr loc)] +-- UnkSkol, SigSkol +-- 
For type variables the others are dealt with by pprSkolTvBinding. +-- For Insts, these cases should not happen +pprSkolInfo UnkSkol = panic "UnkSkol" + +instance Outputable MetaDetails where + ppr Flexi = ptext SLIT("Flexi") + ppr (Indirect ty) = ptext SLIT("Indirect") <+> ppr ty +\end{code} + + +%************************************************************************ +%* * + Predicates +%* * +%************************************************************************ + +\begin{code} +isImmutableTyVar, isSkolemTyVar, isExistentialTyVar, isBoxyTyVar, isMetaTyVar :: TyVar -> Bool +isImmutableTyVar tv + | isTcTyVar tv = isSkolemTyVar tv + | otherwise = True + +isSkolemTyVar tv + = ASSERT( isTcTyVar tv ) + case tcTyVarDetails tv of + SkolemTv _ -> True + MetaTv _ _ -> False + +isExistentialTyVar tv -- Existential type variable, bound by a pattern + = ASSERT( isTcTyVar tv ) + case tcTyVarDetails tv of + SkolemTv (PatSkol _ _) -> True + other -> False + +isMetaTyVar tv + = ASSERT2( isTcTyVar tv, ppr tv ) + case tcTyVarDetails tv of + MetaTv _ _ -> True + other -> False + +isBoxyTyVar tv + = ASSERT( isTcTyVar tv ) + case tcTyVarDetails tv of + MetaTv BoxTv _ -> True + other -> False + +isSigTyVar tv + = ASSERT( isTcTyVar tv ) + case tcTyVarDetails tv of + MetaTv (SigTv _) _ -> True + other -> False + +metaTvRef :: TyVar -> IORef MetaDetails +metaTvRef tv + = ASSERT( isTcTyVar tv ) + case tcTyVarDetails tv of + MetaTv _ ref -> ref + other -> pprPanic "metaTvRef" (ppr tv) + +isFlexi, isIndirect :: MetaDetails -> Bool +isFlexi Flexi = True +isFlexi other = False + +isIndirect (Indirect _) = True +isIndirect other = False +\end{code} + + +%************************************************************************ +%* * +\subsection{Tau, sigma and rho} +%* * +%************************************************************************ + +\begin{code} +mkSigmaTy tyvars theta tau = mkForAllTys tyvars (mkPhiTy theta tau) + +mkPhiTy :: [PredType] -> Type -> Type +mkPhiTy theta ty = foldr (\p r -> FunTy (mkPredTy p) r) ty theta +\end{code} + +@isTauTy@ tests for nested for-alls. It should not be called on a boxy type. 
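As a small editorial illustration (not part of the original source), in
terms of the sigma/rho/tau grammar given earlier in this module:

	Int -> [Bool]               -- a tau type: no foralls anywhere
	forall a. a -> a            -- a sigma type, not a tau
	(forall a. a -> a) -> Int   -- a rho type; isTauTy is False here
	                            -- because of the nested forall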
+ +\begin{code} +isTauTy :: Type -> Bool +isTauTy ty | Just ty' <- tcView ty = isTauTy ty' +isTauTy (TyVarTy tv) = ASSERT( not (isTcTyVar tv && isBoxyTyVar tv) ) + True +isTauTy (TyConApp tc tys) = all isTauTy tys && isTauTyCon tc +isTauTy (AppTy a b) = isTauTy a && isTauTy b +isTauTy (FunTy a b) = isTauTy a && isTauTy b +isTauTy (PredTy p) = True -- Don't look through source types +isTauTy other = False + + +isTauTyCon :: TyCon -> Bool +-- Returns False for type synonyms whose expansion is a polytype +isTauTyCon tc | isSynTyCon tc = isTauTy (snd (synTyConDefn tc)) + | otherwise = True + +--------------- +isBoxyTy :: TcType -> Bool +isBoxyTy ty = any isBoxyTyVar (varSetElems (tcTyVarsOfType ty)) + +isRigidTy :: TcType -> Bool +-- A type is rigid if it has no meta type variables in it +isRigidTy ty = all isSkolemTyVar (varSetElems (tcTyVarsOfType ty)) + +isRefineableTy :: TcType -> Bool +-- A type should have type refinements applied to it if it has +-- free type variables, and they are all rigid +isRefineableTy ty = not (null tc_tvs) && all isSkolemTyVar tc_tvs + where + tc_tvs = varSetElems (tcTyVarsOfType ty) + +--------------- +getDFunTyKey :: Type -> OccName -- Get some string from a type, to be used to + -- construct a dictionary function name +getDFunTyKey ty | Just ty' <- tcView ty = getDFunTyKey ty' +getDFunTyKey (TyVarTy tv) = getOccName tv +getDFunTyKey (TyConApp tc _) = getOccName tc +getDFunTyKey (AppTy fun _) = getDFunTyKey fun +getDFunTyKey (FunTy arg _) = getOccName funTyCon +getDFunTyKey (ForAllTy _ t) = getDFunTyKey t +getDFunTyKey ty = pprPanic "getDFunTyKey" (pprType ty) +-- PredTy shouldn't happen +\end{code} + + +%************************************************************************ +%* * +\subsection{Expanding and splitting} +%* * +%************************************************************************ + +These tcSplit functions are like their non-Tc analogues, but + a) they do not look through newtypes + b) they do not look through PredTys + c) [future] they ignore usage-type annotations + +However, they are non-monadic and do not follow through mutable type +variables. It's up to you to make sure this doesn't matter. + +\begin{code} +tcSplitForAllTys :: Type -> ([TyVar], Type) +tcSplitForAllTys ty = split ty ty [] + where + split orig_ty ty tvs | Just ty' <- tcView ty = split orig_ty ty' tvs + split orig_ty (ForAllTy tv ty) tvs = split ty ty (tv:tvs) + split orig_ty t tvs = (reverse tvs, orig_ty) + +tcIsForAllTy ty | Just ty' <- tcView ty = tcIsForAllTy ty' +tcIsForAllTy (ForAllTy tv ty) = True +tcIsForAllTy t = False + +tcSplitPhiTy :: Type -> ([PredType], Type) +tcSplitPhiTy ty = split ty ty [] + where + split orig_ty ty tvs | Just ty' <- tcView ty = split orig_ty ty' tvs + split orig_ty (FunTy arg res) ts = case tcSplitPredTy_maybe arg of + Just p -> split res res (p:ts) + Nothing -> (reverse ts, orig_ty) + split orig_ty ty ts = (reverse ts, orig_ty) + +tcSplitSigmaTy ty = case tcSplitForAllTys ty of + (tvs, rho) -> case tcSplitPhiTy rho of + (theta, tau) -> (tvs, theta, tau) + +----------------------- +tcMultiSplitSigmaTy + :: TcSigmaType + -> ( [([TyVar], ThetaType)], -- forall as.C => forall bs.D + TcSigmaType) -- The rest of the type + +-- We need a loop here because we are now prepared to entertain +-- types like +-- f:: forall a. Eq a => forall b. 
Baz b => tau +-- We want to instantiate this to +-- f2::tau {f2 = f1 b (Baz b), f1 = f a (Eq a)} + +tcMultiSplitSigmaTy sigma + = case (tcSplitSigmaTy sigma) of + ([],[],ty) -> ([], sigma) + (tvs, theta, ty) -> case tcMultiSplitSigmaTy ty of + (pairs, rest) -> ((tvs,theta):pairs, rest) + +----------------------- +tcTyConAppTyCon :: Type -> TyCon +tcTyConAppTyCon ty = fst (tcSplitTyConApp ty) + +tcTyConAppArgs :: Type -> [Type] +tcTyConAppArgs ty = snd (tcSplitTyConApp ty) + +tcSplitTyConApp :: Type -> (TyCon, [Type]) +tcSplitTyConApp ty = case tcSplitTyConApp_maybe ty of + Just stuff -> stuff + Nothing -> pprPanic "tcSplitTyConApp" (pprType ty) + +tcSplitTyConApp_maybe :: Type -> Maybe (TyCon, [Type]) +tcSplitTyConApp_maybe ty | Just ty' <- tcView ty = tcSplitTyConApp_maybe ty' +tcSplitTyConApp_maybe (TyConApp tc tys) = Just (tc, tys) +tcSplitTyConApp_maybe (FunTy arg res) = Just (funTyCon, [arg,res]) + -- Newtypes are opaque, so they may be split + -- However, predicates are not treated + -- as tycon applications by the type checker +tcSplitTyConApp_maybe other = Nothing + +----------------------- +tcSplitFunTys :: Type -> ([Type], Type) +tcSplitFunTys ty = case tcSplitFunTy_maybe ty of + Nothing -> ([], ty) + Just (arg,res) -> (arg:args, res') + where + (args,res') = tcSplitFunTys res + +tcSplitFunTy_maybe :: Type -> Maybe (Type, Type) +tcSplitFunTy_maybe ty | Just ty' <- tcView ty = tcSplitFunTy_maybe ty' +tcSplitFunTy_maybe (FunTy arg res) = Just (arg, res) +tcSplitFunTy_maybe other = Nothing + +tcSplitFunTysN + :: TcRhoType + -> Arity -- N: Number of desired args + -> ([TcSigmaType], -- Arg types (N or fewer) + TcSigmaType) -- The rest of the type + +tcSplitFunTysN ty n_args + | n_args == 0 + = ([], ty) + | Just (arg,res) <- tcSplitFunTy_maybe ty + = case tcSplitFunTysN res (n_args - 1) of + (args, res) -> (arg:args, res) + | otherwise + = ([], ty) + +tcFunArgTy ty = case tcSplitFunTy_maybe ty of { Just (arg,res) -> arg } +tcFunResultTy ty = case tcSplitFunTy_maybe ty of { Just (arg,res) -> res } + + +----------------------- +tcSplitAppTy_maybe :: Type -> Maybe (Type, Type) +tcSplitAppTy_maybe ty | Just ty' <- tcView ty = tcSplitAppTy_maybe ty' +tcSplitAppTy_maybe (FunTy ty1 ty2) = Just (TyConApp funTyCon [ty1], ty2) +tcSplitAppTy_maybe (AppTy ty1 ty2) = Just (ty1, ty2) +tcSplitAppTy_maybe (TyConApp tc tys) = case snocView tys of + Just (tys', ty') -> Just (TyConApp tc tys', ty') + Nothing -> Nothing +tcSplitAppTy_maybe other = Nothing + +tcSplitAppTy ty = case tcSplitAppTy_maybe ty of + Just stuff -> stuff + Nothing -> pprPanic "tcSplitAppTy" (pprType ty) + +tcSplitAppTys :: Type -> (Type, [Type]) +tcSplitAppTys ty + = go ty [] + where + go ty args = case tcSplitAppTy_maybe ty of + Just (ty', arg) -> go ty' (arg:args) + Nothing -> (ty,args) + +----------------------- +tcGetTyVar_maybe :: Type -> Maybe TyVar +tcGetTyVar_maybe ty | Just ty' <- tcView ty = tcGetTyVar_maybe ty' +tcGetTyVar_maybe (TyVarTy tv) = Just tv +tcGetTyVar_maybe other = Nothing + +tcGetTyVar :: String -> Type -> TyVar +tcGetTyVar msg ty = expectJust msg (tcGetTyVar_maybe ty) + +tcIsTyVarTy :: Type -> Bool +tcIsTyVarTy ty = maybeToBool (tcGetTyVar_maybe ty) + +----------------------- +tcSplitDFunTy :: Type -> ([TyVar], [PredType], Class, [Type]) +-- Split the type of a dictionary function +tcSplitDFunTy ty + = case tcSplitSigmaTy ty of { (tvs, theta, tau) -> + case tcSplitDFunHead tau of { (clas, tys) -> + (tvs, theta, clas, tys) }} + +tcSplitDFunHead :: Type -> (Class, [Type]) +tcSplitDFunHead tau + = case 
tcSplitPredTy_maybe tau of + Just (ClassP clas tys) -> (clas, tys) + +tcValidInstHeadTy :: Type -> Bool +-- Used in Haskell-98 mode, for the argument types of an instance head +-- These must not be type synonyms, but everywhere else type synonyms +-- are transparent, so we need a special function here +tcValidInstHeadTy ty + = case ty of + NoteTy _ ty -> tcValidInstHeadTy ty + TyConApp tc tys -> not (isSynTyCon tc) && ok tys + FunTy arg res -> ok [arg, res] + other -> False + where + -- Check that all the types are type variables, + -- and that each is distinct + ok tys = equalLength tvs tys && hasNoDups tvs + where + tvs = mapCatMaybes get_tv tys + + get_tv (NoteTy _ ty) = get_tv ty -- Again, do not look + get_tv (TyVarTy tv) = Just tv -- through synonyms + get_tv other = Nothing +\end{code} + + + +%************************************************************************ +%* * +\subsection{Predicate types} +%* * +%************************************************************************ + +\begin{code} +tcSplitPredTy_maybe :: Type -> Maybe PredType + -- Returns Just for predicates only +tcSplitPredTy_maybe ty | Just ty' <- tcView ty = tcSplitPredTy_maybe ty' +tcSplitPredTy_maybe (PredTy p) = Just p +tcSplitPredTy_maybe other = Nothing + +predTyUnique :: PredType -> Unique +predTyUnique (IParam n _) = getUnique (ipNameName n) +predTyUnique (ClassP clas tys) = getUnique clas + +mkPredName :: Unique -> SrcLoc -> PredType -> Name +mkPredName uniq loc (ClassP cls tys) = mkInternalName uniq (mkDictOcc (getOccName cls)) loc +mkPredName uniq loc (IParam ip ty) = mkInternalName uniq (getOccName (ipNameName ip)) loc +\end{code} + + +--------------------- Dictionary types --------------------------------- + +\begin{code} +mkClassPred clas tys = ClassP clas tys + +isClassPred :: PredType -> Bool +isClassPred (ClassP clas tys) = True +isClassPred other = False + +isTyVarClassPred (ClassP clas tys) = all tcIsTyVarTy tys +isTyVarClassPred other = False + +getClassPredTys_maybe :: PredType -> Maybe (Class, [Type]) +getClassPredTys_maybe (ClassP clas tys) = Just (clas, tys) +getClassPredTys_maybe _ = Nothing + +getClassPredTys :: PredType -> (Class, [Type]) +getClassPredTys (ClassP clas tys) = (clas, tys) + +mkDictTy :: Class -> [Type] -> Type +mkDictTy clas tys = mkPredTy (ClassP clas tys) + +isDictTy :: Type -> Bool +isDictTy ty | Just ty' <- tcView ty = isDictTy ty' +isDictTy (PredTy p) = isClassPred p +isDictTy other = False +\end{code} + +--------------------- Implicit parameters --------------------------------- + +\begin{code} +isIPPred :: PredType -> Bool +isIPPred (IParam _ _) = True +isIPPred other = False + +isInheritablePred :: PredType -> Bool +-- Can be inherited by a context. 
For example, consider +-- f x = let g y = (?v, y+x) +-- in (g 3 with ?v = 8, +-- g 4 with ?v = 9) +-- The point is that g's type must be quantifed over ?v: +-- g :: (?v :: a) => a -> a +-- but it doesn't need to be quantified over the Num a dictionary +-- which can be free in g's rhs, and shared by both calls to g +isInheritablePred (ClassP _ _) = True +isInheritablePred other = False + +isLinearPred :: TcPredType -> Bool +isLinearPred (IParam (Linear n) _) = True +isLinearPred other = False +\end{code} + +--------------------- The stupid theta (sigh) --------------------------------- + +\begin{code} +dataConsStupidTheta :: [DataCon] -> ThetaType +-- Union the stupid thetas from all the specified constructors (non-empty) +-- All the constructors should have the same result type, modulo alpha conversion +-- The resulting ThetaType uses type variables from the *first* constructor in the list +-- +-- It's here because it's used in MkId.mkRecordSelId, and in TcExpr +dataConsStupidTheta (con1:cons) + = nubBy tcEqPred all_preds + where + all_preds = dataConStupidTheta con1 ++ other_stupids + res_tys1 = dataConResTys con1 + tvs1 = tyVarsOfTypes res_tys1 + other_stupids = [ substPred subst pred + | con <- cons + , let Just subst = tcMatchTys tvs1 res_tys1 (dataConResTys con) + , pred <- dataConStupidTheta con ] +\end{code} + + +%************************************************************************ +%* * +\subsection{Predicates} +%* * +%************************************************************************ + +isSigmaTy returns true of any qualified type. It doesn't *necessarily* have +any foralls. E.g. + f :: (?x::Int) => Int -> Int + +\begin{code} +isSigmaTy :: Type -> Bool +isSigmaTy ty | Just ty' <- tcView ty = isSigmaTy ty' +isSigmaTy (ForAllTy tyvar ty) = True +isSigmaTy (FunTy a b) = isPredTy a +isSigmaTy _ = False + +isOverloadedTy :: Type -> Bool +isOverloadedTy ty | Just ty' <- tcView ty = isOverloadedTy ty' +isOverloadedTy (ForAllTy tyvar ty) = isOverloadedTy ty +isOverloadedTy (FunTy a b) = isPredTy a +isOverloadedTy _ = False + +isPredTy :: Type -> Bool -- Belongs in TcType because it does + -- not look through newtypes, or predtypes (of course) +isPredTy ty | Just ty' <- tcView ty = isPredTy ty' +isPredTy (PredTy sty) = True +isPredTy _ = False +\end{code} + +\begin{code} +isFloatTy = is_tc floatTyConKey +isDoubleTy = is_tc doubleTyConKey +isIntegerTy = is_tc integerTyConKey +isIntTy = is_tc intTyConKey +isAddrTy = is_tc addrTyConKey +isBoolTy = is_tc boolTyConKey +isUnitTy = is_tc unitTyConKey + +is_tc :: Unique -> Type -> Bool +-- Newtypes are opaque to this +is_tc uniq ty = case tcSplitTyConApp_maybe ty of + Just (tc, _) -> uniq == getUnique tc + Nothing -> False +\end{code} + + +%************************************************************************ +%* * +\subsection{Misc} +%* * +%************************************************************************ + +\begin{code} +deNoteType :: Type -> Type +-- Remove all *outermost* type synonyms and other notes +deNoteType ty | Just ty' <- tcView ty = deNoteType ty' +deNoteType ty = ty +\end{code} + +\begin{code} +tcTyVarsOfType :: Type -> TcTyVarSet +-- Just the tc type variables free in the type +tcTyVarsOfType (TyVarTy tv) = if isTcTyVar tv then unitVarSet tv + else emptyVarSet +tcTyVarsOfType (TyConApp tycon tys) = tcTyVarsOfTypes tys +tcTyVarsOfType (NoteTy _ ty) = tcTyVarsOfType ty +tcTyVarsOfType (PredTy sty) = tcTyVarsOfPred sty +tcTyVarsOfType (FunTy arg res) = tcTyVarsOfType arg `unionVarSet` tcTyVarsOfType res 
+tcTyVarsOfType (AppTy fun arg) = tcTyVarsOfType fun `unionVarSet` tcTyVarsOfType arg +tcTyVarsOfType (ForAllTy tyvar ty) = tcTyVarsOfType ty `delVarSet` tyvar + -- We do sometimes quantify over skolem TcTyVars + +tcTyVarsOfTypes :: [Type] -> TyVarSet +tcTyVarsOfTypes tys = foldr (unionVarSet.tcTyVarsOfType) emptyVarSet tys + +tcTyVarsOfPred :: PredType -> TyVarSet +tcTyVarsOfPred (IParam _ ty) = tcTyVarsOfType ty +tcTyVarsOfPred (ClassP _ tys) = tcTyVarsOfTypes tys +\end{code} + +Note [Silly type synonym] +~~~~~~~~~~~~~~~~~~~~~~~~~ +Consider + type T a = Int +What are the free tyvars of (T x)? Empty, of course! +Here's the example that Ralf Laemmel showed me: + foo :: (forall a. C u a -> C u a) -> u + mappend :: Monoid u => u -> u -> u + + bar :: Monoid u => u + bar = foo (\t -> t `mappend` t) +We have to generalise at the arg to f, and we don't +want to capture the constraint (Monad (C u a)) because +it appears to mention a. Pretty silly, but it was useful to him. + +exactTyVarsOfType is used by the type checker to figure out exactly +which type variables are mentioned in a type. It's also used in the +smart-app checking code --- see TcExpr.tcIdApp + +\begin{code} +exactTyVarsOfType :: TcType -> TyVarSet +-- Find the free type variables (of any kind) +-- but *expand* type synonyms. See Note [Silly type synonym] belos. +exactTyVarsOfType ty + = go ty + where + go ty | Just ty' <- tcView ty = go ty' -- This is the key line + go (TyVarTy tv) = unitVarSet tv + go (TyConApp tycon tys) = exactTyVarsOfTypes tys + go (PredTy ty) = go_pred ty + go (FunTy arg res) = go arg `unionVarSet` go res + go (AppTy fun arg) = go fun `unionVarSet` go arg + go (ForAllTy tyvar ty) = delVarSet (go ty) tyvar + + go_pred (IParam _ ty) = go ty + go_pred (ClassP _ tys) = exactTyVarsOfTypes tys + +exactTyVarsOfTypes :: [TcType] -> TyVarSet +exactTyVarsOfTypes tys = foldr (unionVarSet . exactTyVarsOfType) emptyVarSet tys +\end{code} + +Find the free tycons and classes of a type. This is used in the front +end of the compiler. + +\begin{code} +tyClsNamesOfType :: Type -> NameSet +tyClsNamesOfType (TyVarTy tv) = emptyNameSet +tyClsNamesOfType (TyConApp tycon tys) = unitNameSet (getName tycon) `unionNameSets` tyClsNamesOfTypes tys +tyClsNamesOfType (NoteTy _ ty2) = tyClsNamesOfType ty2 +tyClsNamesOfType (PredTy (IParam n ty)) = tyClsNamesOfType ty +tyClsNamesOfType (PredTy (ClassP cl tys)) = unitNameSet (getName cl) `unionNameSets` tyClsNamesOfTypes tys +tyClsNamesOfType (FunTy arg res) = tyClsNamesOfType arg `unionNameSets` tyClsNamesOfType res +tyClsNamesOfType (AppTy fun arg) = tyClsNamesOfType fun `unionNameSets` tyClsNamesOfType arg +tyClsNamesOfType (ForAllTy tyvar ty) = tyClsNamesOfType ty + +tyClsNamesOfTypes tys = foldr (unionNameSets . tyClsNamesOfType) emptyNameSet tys + +tyClsNamesOfDFunHead :: Type -> NameSet +-- Find the free type constructors and classes +-- of the head of the dfun instance type +-- The 'dfun_head_type' is because of +-- instance Foo a => Baz T where ... 
+-- The decl is an orphan if Baz and T are both not locally defined, +-- even if Foo *is* locally defined +tyClsNamesOfDFunHead dfun_ty + = case tcSplitSigmaTy dfun_ty of + (tvs,_,head_ty) -> tyClsNamesOfType head_ty + +classesOfTheta :: ThetaType -> [Class] +-- Looks just for ClassP things; maybe it should check +classesOfTheta preds = [ c | ClassP c _ <- preds ] +\end{code} + + +%************************************************************************ +%* * +\subsection[TysWiredIn-ext-type]{External types} +%* * +%************************************************************************ + +The compiler's foreign function interface supports the passing of a +restricted set of types as arguments and results (the restricting factor +being the ) + +\begin{code} +isFFITy :: Type -> Bool +-- True for any TyCon that can possibly be an arg or result of an FFI call +isFFITy ty = checkRepTyCon legalFFITyCon ty + +isFFIArgumentTy :: DynFlags -> Safety -> Type -> Bool +-- Checks for valid argument type for a 'foreign import' +isFFIArgumentTy dflags safety ty + = checkRepTyCon (legalOutgoingTyCon dflags safety) ty + +isFFIExternalTy :: Type -> Bool +-- Types that are allowed as arguments of a 'foreign export' +isFFIExternalTy ty = checkRepTyCon legalFEArgTyCon ty + +isFFIImportResultTy :: DynFlags -> Type -> Bool +isFFIImportResultTy dflags ty + = checkRepTyCon (legalFIResultTyCon dflags) ty + +isFFIExportResultTy :: Type -> Bool +isFFIExportResultTy ty = checkRepTyCon legalFEResultTyCon ty + +isFFIDynArgumentTy :: Type -> Bool +-- The argument type of a foreign import dynamic must be Ptr, FunPtr, Addr, +-- or a newtype of either. +isFFIDynArgumentTy = checkRepTyConKey [ptrTyConKey, funPtrTyConKey, addrTyConKey] + +isFFIDynResultTy :: Type -> Bool +-- The result type of a foreign export dynamic must be Ptr, FunPtr, Addr, +-- or a newtype of either. +isFFIDynResultTy = checkRepTyConKey [ptrTyConKey, funPtrTyConKey, addrTyConKey] + +isFFILabelTy :: Type -> Bool +-- The type of a foreign label must be Ptr, FunPtr, Addr, +-- or a newtype of either. +isFFILabelTy = checkRepTyConKey [ptrTyConKey, funPtrTyConKey, addrTyConKey] + +isFFIDotnetTy :: DynFlags -> Type -> Bool +isFFIDotnetTy dflags ty + = checkRepTyCon (\ tc -> not (isByteArrayLikeTyCon tc) && + (legalFIResultTyCon dflags tc || + isFFIDotnetObjTy ty || isStringTy ty)) ty + +-- Support String as an argument or result from a .NET FFI call. +isStringTy ty = + case tcSplitTyConApp_maybe (repType ty) of + Just (tc, [arg_ty]) + | tc == listTyCon -> + case tcSplitTyConApp_maybe (repType arg_ty) of + Just (cc,[]) -> cc == charTyCon + _ -> False + _ -> False + +-- Support String as an argument or result from a .NET FFI call. 
+isFFIDotnetObjTy ty = + let + (_, t_ty) = tcSplitForAllTys ty + in + case tcSplitTyConApp_maybe (repType t_ty) of + Just (tc, [arg_ty]) | getName tc == objectTyConName -> True + _ -> False + +toDNType :: Type -> DNType +toDNType ty + | isStringTy ty = DNString + | isFFIDotnetObjTy ty = DNObject + | Just (tc,argTys) <- tcSplitTyConApp_maybe ty = + case lookup (getUnique tc) dn_assoc of + Just x -> x + Nothing + | tc `hasKey` ioTyConKey -> toDNType (head argTys) + | otherwise -> pprPanic ("toDNType: unsupported .NET type") (pprType ty <+> parens (hcat (map pprType argTys)) <+> ppr tc) + where + dn_assoc :: [ (Unique, DNType) ] + dn_assoc = [ (unitTyConKey, DNUnit) + , (intTyConKey, DNInt) + , (int8TyConKey, DNInt8) + , (int16TyConKey, DNInt16) + , (int32TyConKey, DNInt32) + , (int64TyConKey, DNInt64) + , (wordTyConKey, DNInt) + , (word8TyConKey, DNWord8) + , (word16TyConKey, DNWord16) + , (word32TyConKey, DNWord32) + , (word64TyConKey, DNWord64) + , (floatTyConKey, DNFloat) + , (doubleTyConKey, DNDouble) + , (addrTyConKey, DNPtr) + , (ptrTyConKey, DNPtr) + , (funPtrTyConKey, DNPtr) + , (charTyConKey, DNChar) + , (boolTyConKey, DNBool) + ] + +checkRepTyCon :: (TyCon -> Bool) -> Type -> Bool + -- Look through newtypes + -- Non-recursive ones are transparent to splitTyConApp, + -- but recursive ones aren't. Manuel had: + -- newtype T = MkT (Ptr T) + -- and wanted it to work... +checkRepTyCon check_tc ty + | Just (tc,_) <- splitTyConApp_maybe (repType ty) = check_tc tc + | otherwise = False + +checkRepTyConKey :: [Unique] -> Type -> Bool +-- Like checkRepTyCon, but just looks at the TyCon key +checkRepTyConKey keys + = checkRepTyCon (\tc -> tyConUnique tc `elem` keys) +\end{code} + +---------------------------------------------- +These chaps do the work; they are not exported +---------------------------------------------- + +\begin{code} +legalFEArgTyCon :: TyCon -> Bool +-- It's illegal to return foreign objects and (mutable) +-- bytearrays from a _ccall_ / foreign declaration +-- (or be passed them as arguments in foreign exported functions). +legalFEArgTyCon tc + | isByteArrayLikeTyCon tc + = False + -- It's also illegal to make foreign exports that take unboxed + -- arguments. The RTS API currently can't invoke such things. 
--SDM 7/2000 + | otherwise + = boxedMarshalableTyCon tc + +legalFIResultTyCon :: DynFlags -> TyCon -> Bool +legalFIResultTyCon dflags tc + | isByteArrayLikeTyCon tc = False + | tc == unitTyCon = True + | otherwise = marshalableTyCon dflags tc + +legalFEResultTyCon :: TyCon -> Bool +legalFEResultTyCon tc + | isByteArrayLikeTyCon tc = False + | tc == unitTyCon = True + | otherwise = boxedMarshalableTyCon tc + +legalOutgoingTyCon :: DynFlags -> Safety -> TyCon -> Bool +-- Checks validity of types going from Haskell -> external world +legalOutgoingTyCon dflags safety tc + | playSafe safety && isByteArrayLikeTyCon tc + = False + | otherwise + = marshalableTyCon dflags tc + +legalFFITyCon :: TyCon -> Bool +-- True for any TyCon that can possibly be an arg or result of an FFI call +legalFFITyCon tc + = isUnLiftedTyCon tc || boxedMarshalableTyCon tc || tc == unitTyCon + +marshalableTyCon dflags tc + = (dopt Opt_GlasgowExts dflags && isUnLiftedTyCon tc) + || boxedMarshalableTyCon tc + +boxedMarshalableTyCon tc + = getUnique tc `elem` [ intTyConKey, int8TyConKey, int16TyConKey + , int32TyConKey, int64TyConKey + , wordTyConKey, word8TyConKey, word16TyConKey + , word32TyConKey, word64TyConKey + , floatTyConKey, doubleTyConKey + , addrTyConKey, ptrTyConKey, funPtrTyConKey + , charTyConKey + , stablePtrTyConKey + , byteArrayTyConKey, mutableByteArrayTyConKey + , boolTyConKey + ] + +isByteArrayLikeTyCon :: TyCon -> Bool +isByteArrayLikeTyCon tc = + getUnique tc `elem` [byteArrayTyConKey, mutableByteArrayTyConKey] +\end{code} diff --git a/compiler/typecheck/TcType.lhs-boot b/compiler/typecheck/TcType.lhs-boot new file mode 100644 index 0000000000..191badd943 --- /dev/null +++ b/compiler/typecheck/TcType.lhs-boot @@ -0,0 +1,7 @@ +\begin{code} +module TcType where +import Outputable( SDoc ) + +data TcTyVarDetails +pprTcTyVarDetails :: TcTyVarDetails -> SDoc +\end{code} diff --git a/compiler/typecheck/TcUnify.hi-boot-5 b/compiler/typecheck/TcUnify.hi-boot-5 new file mode 100644 index 0000000000..b88d3abeb0 --- /dev/null +++ b/compiler/typecheck/TcUnify.hi-boot-5 @@ -0,0 +1,8 @@ +-- This boot file exists only to tie the knot between +-- TcUnify and TcSimplify + +__interface TcUnify 1 0 where +__export TcUnify unifyTauTy ; +1 unifyTauTy :: TcType.TcTauType -> TcType.TcTauType -> TcRnTypes.TcM PrelBase.Z0T ; + + diff --git a/compiler/typecheck/TcUnify.hi-boot-6 b/compiler/typecheck/TcUnify.hi-boot-6 new file mode 100644 index 0000000000..eb286359e2 --- /dev/null +++ b/compiler/typecheck/TcUnify.hi-boot-6 @@ -0,0 +1,7 @@ +module TcUnify where + +-- This boot file exists only to tie the knot between +-- TcUnify and TcSimplify + +unifyType :: TcType.TcTauType -> TcType.TcTauType -> TcRnTypes.TcM () +zapToMonotype :: TcType.BoxyType -> TcRnTypes.TcM TcType.TcTauType diff --git a/compiler/typecheck/TcUnify.lhs b/compiler/typecheck/TcUnify.lhs new file mode 100644 index 0000000000..23cc9e2176 --- /dev/null +++ b/compiler/typecheck/TcUnify.lhs @@ -0,0 +1,1724 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section{Type subsumption and unification} + +\begin{code} +module TcUnify ( + -- Full-blown subsumption + tcSubExp, tcFunResTy, tcGen, + checkSigTyVars, checkSigTyVarsWrt, bleatEscapedTvs, sigCtxt, + + -- Various unifications + unifyType, unifyTypeList, unifyTheta, + unifyKind, unifyKinds, unifyFunKind, + checkExpectedKind, + boxySubMatchType, boxyMatchTypes, + + -------------------------------- + -- Holes + tcInfer, subFunTys, unBox, stripBoxyType, withBox, + boxyUnify, boxyUnifyList, 
zapToMonotype, + boxySplitListTy, boxySplitTyConApp, boxySplitAppTy, + wrapFunResCoercion + ) where + +#include "HsVersions.h" + +import HsSyn ( ExprCoFn(..), idCoercion, isIdCoercion, (<.>) ) +import TypeRep ( Type(..), PredType(..) ) + +import TcMType ( lookupTcTyVar, LookupTyVarResult(..), + tcInstSkolType, newKindVar, newMetaTyVar, + tcInstBoxy, newBoxyTyVar, newBoxyTyVarTys, readFilledBox, + readMetaTyVar, writeMetaTyVar, newFlexiTyVarTy, + tcInstSkolTyVars, + zonkTcKind, zonkType, zonkTcType, zonkTcTyVarsAndFV, + readKindVar, writeKindVar ) +import TcSimplify ( tcSimplifyCheck ) +import TcEnv ( tcGetGlobalTyVars, findGlobals ) +import TcIface ( checkWiredInTyCon ) +import TcRnMonad -- TcType, amongst others +import TcType ( TcKind, TcType, TcTyVar, TcTauType, + BoxySigmaType, BoxyRhoType, BoxyType, + TcTyVarSet, TcThetaType, TcTyVarDetails(..), BoxInfo(..), + SkolemInfo( GenSkol, UnkSkol ), MetaDetails(..), isImmutableTyVar, + pprSkolTvBinding, isTauTy, isTauTyCon, isSigmaTy, + mkFunTy, mkFunTys, mkTyConApp, isMetaTyVar, + tcSplitForAllTys, tcSplitAppTy_maybe, tcSplitFunTys, mkTyVarTys, + tyVarsOfType, mkPhiTy, mkTyVarTy, mkPredTy, + typeKind, mkForAllTys, mkAppTy, isBoxyTyVar, + tidyOpenType, tidyOpenTyVar, tidyOpenTyVars, + pprType, tidyKind, tidySkolemTyVar, isSkolemTyVar, tcView, + TvSubst, mkTvSubst, zipTyEnv, substTy, emptyTvSubst, + lookupTyVar, extendTvSubst ) +import Kind ( Kind(..), SimpleKind, KindVar, isArgTypeKind, + openTypeKind, liftedTypeKind, mkArrowKind, defaultKind, + isOpenTypeKind, argTypeKind, isLiftedTypeKind, isUnliftedTypeKind, + isSubKind, pprKind, splitKindFunTys ) +import TysPrim ( alphaTy, betaTy ) +import Inst ( newDicts, instToId ) +import TyCon ( TyCon, tyConArity, tyConTyVars, isSynTyCon ) +import TysWiredIn ( listTyCon ) +import Id ( Id, mkSysLocal ) +import Var ( Var, varName, tyVarKind, isTcTyVar, tcTyVarDetails ) +import VarSet ( emptyVarSet, mkVarSet, unitVarSet, unionVarSet, elemVarSet, varSetElems, + extendVarSet, intersectsVarSet ) +import VarEnv +import Name ( Name, isSystemName ) +import ErrUtils ( Message ) +import Maybes ( expectJust, isNothing ) +import BasicTypes ( Arity ) +import UniqSupply ( uniqsFromSupply ) +import Util ( notNull, equalLength ) +import Outputable + +-- Assertion imports +#ifdef DEBUG +import TcType ( isBoxyTy, isFlexi ) +#endif +\end{code} + +%************************************************************************ +%* * +\subsection{'hole' type variables} +%* * +%************************************************************************ + +\begin{code} +tcInfer :: (BoxyType -> TcM a) -> TcM (a, TcType) +tcInfer tc_infer + = do { box <- newBoxyTyVar openTypeKind + ; res <- tc_infer (mkTyVarTy box) + ; res_ty <- readFilledBox box -- Guaranteed filled-in by now + ; return (res, res_ty) } +\end{code} + + +%************************************************************************ +%* * + subFunTys +%* * +%************************************************************************ + +\begin{code} +subFunTys :: SDoc -- Somthing like "The function f has 3 arguments" + -- or "The abstraction (\x.e) takes 1 argument" + -> Arity -- Expected # of args + -> BoxyRhoType -- res_ty + -> ([BoxySigmaType] -> BoxyRhoType -> TcM a) + -> TcM (ExprCoFn, a) +-- Attempt to decompse res_ty to have enough top-level arrows to +-- match the number of patterns in the match group +-- +-- If (subFunTys n_args res_ty thing_inside) = (co_fn, res) +-- and the inner call to thing_inside passes args: [a1,...,an], b +-- then co_fn :: (a1 -> ... 
-> an -> b) -> res_ty +-- +-- Note that it takes a BoxyRho type, and guarantees to return a BoxyRhoType + + +{- Error messages from subFunTys + + The abstraction `\Just 1 -> ...' has two arguments + but its type `Maybe a -> a' has only one + + The equation(s) for `f' have two arguments + but its type `Maybe a -> a' has only one + + The section `(f 3)' requires 'f' to take two arguments + but its type `Int -> Int' has only one + + The function 'f' is applied to two arguments + but its type `Int -> Int' has only one +-} + + +subFunTys error_herald n_pats res_ty thing_inside + = loop n_pats [] res_ty + where + -- In 'loop', the parameter 'arg_tys' accumulates + -- the arg types so far, in *reverse order* + loop n args_so_far res_ty + | Just res_ty' <- tcView res_ty = loop n args_so_far res_ty' + + loop n args_so_far res_ty + | isSigmaTy res_ty -- Do this before checking n==0, because we + -- guarantee to return a BoxyRhoType, not a BoxySigmaType + = do { (gen_fn, (co_fn, res)) <- tcGen res_ty emptyVarSet $ \ res_ty' -> + loop n args_so_far res_ty' + ; return (gen_fn <.> co_fn, res) } + + loop 0 args_so_far res_ty + = do { res <- thing_inside (reverse args_so_far) res_ty + ; return (idCoercion, res) } + + loop n args_so_far (FunTy arg_ty res_ty) + = do { (co_fn, res) <- loop (n-1) (arg_ty:args_so_far) res_ty + ; co_fn' <- wrapFunResCoercion [arg_ty] co_fn + ; return (co_fn', res) } + + -- res_ty might have a type variable at the head, such as (a b c), + -- in which case we must fill in with (->). Simplest thing to do + -- is to use boxyUnify, but we catch failure and generate our own + -- error message on failure + loop n args_so_far res_ty@(AppTy _ _) + = do { [arg_ty',res_ty'] <- newBoxyTyVarTys [argTypeKind, openTypeKind] + ; (_, mb_unit) <- tryTcErrs $ boxyUnify res_ty (FunTy arg_ty' res_ty') + ; if isNothing mb_unit then bale_out args_so_far res_ty + else loop n args_so_far (FunTy arg_ty' res_ty') } + + loop n args_so_far (TyVarTy tv) + | not (isImmutableTyVar tv) + = do { cts <- readMetaTyVar tv + ; case cts of + Indirect ty -> loop n args_so_far ty + Flexi -> do { (res_ty:arg_tys) <- withMetaTvs tv kinds mk_res_ty + ; res <- thing_inside (reverse args_so_far ++ arg_tys) res_ty + ; return (idCoercion, res) } } + where + mk_res_ty (res_ty' : arg_tys') = mkFunTys arg_tys' res_ty' + kinds = openTypeKind : take n (repeat argTypeKind) + -- Note argTypeKind: the args can have an unboxed type, + -- but not an unboxed tuple. + + loop n args_so_far res_ty = bale_out args_so_far res_ty + + bale_out args_so_far res_ty + = do { env0 <- tcInitTidyEnv + ; res_ty' <- zonkTcType res_ty + ; let (env1, res_ty'') = tidyOpenType env0 res_ty' + ; failWithTcM (env1, mk_msg res_ty'' (length args_so_far)) } + + mk_msg res_ty n_actual + = error_herald <> comma $$ + sep [ptext SLIT("but its type") <+> quotes (pprType res_ty), + if n_actual == 0 then ptext SLIT("has none") + else ptext SLIT("has only") <+> speakN n_actual] +\end{code} + +\begin{code} +---------------------- +boxySplitTyConApp :: TyCon -- T :: k1 -> ... 
-> kn -> * + -> BoxyRhoType -- Expected type (T a b c) + -> TcM [BoxySigmaType] -- Element types, a b c + -- It's used for wired-in tycons, so we call checkWiredInTyCOn + -- Precondition: never called with FunTyCon + -- Precondition: input type :: * + +boxySplitTyConApp tc orig_ty + = do { checkWiredInTyCon tc + ; loop (tyConArity tc) [] orig_ty } + where + loop n_req args_so_far ty + | Just ty' <- tcView ty = loop n_req args_so_far ty' + + loop n_req args_so_far (TyConApp tycon args) + | tc == tycon + = ASSERT( n_req == length args) -- ty::* + return (args ++ args_so_far) + + loop n_req args_so_far (AppTy fun arg) + = loop (n_req - 1) (arg:args_so_far) fun + + loop n_req args_so_far (TyVarTy tv) + | not (isImmutableTyVar tv) + = do { cts <- readMetaTyVar tv + ; case cts of + Indirect ty -> loop n_req args_so_far ty + Flexi -> do { arg_tys <- withMetaTvs tv arg_kinds mk_res_ty + ; return (arg_tys ++ args_so_far) } + } + where + mk_res_ty arg_tys' = mkTyConApp tc arg_tys' + arg_kinds = map tyVarKind (take n_req (tyConTyVars tc)) + + loop _ _ _ = boxySplitFailure (mkTyConApp tc (mkTyVarTys (tyConTyVars tc))) orig_ty + +---------------------- +boxySplitListTy :: BoxyRhoType -> TcM BoxySigmaType -- Special case for lists +boxySplitListTy exp_ty = do { [elt_ty] <- boxySplitTyConApp listTyCon exp_ty + ; return elt_ty } + + +---------------------- +boxySplitAppTy :: BoxyRhoType -- Type to split: m a + -> TcM (BoxySigmaType, BoxySigmaType) -- Returns m, a +-- Assumes (m: * -> k), where k is the kind of the incoming type +-- If the incoming type is boxy, then so are the result types; and vice versa + +boxySplitAppTy orig_ty + = loop orig_ty + where + loop ty + | Just ty' <- tcView ty = loop ty' + + loop ty + | Just (fun_ty, arg_ty) <- tcSplitAppTy_maybe ty + = return (fun_ty, arg_ty) + + loop (TyVarTy tv) + | not (isImmutableTyVar tv) + = do { cts <- readMetaTyVar tv + ; case cts of + Indirect ty -> loop ty + Flexi -> do { [fun_ty,arg_ty] <- withMetaTvs tv kinds mk_res_ty + ; return (fun_ty, arg_ty) } } + where + mk_res_ty [fun_ty', arg_ty'] = mkAppTy fun_ty' arg_ty' + tv_kind = tyVarKind tv + kinds = [mkArrowKind liftedTypeKind (defaultKind tv_kind), + -- m :: * -> k + liftedTypeKind] -- arg type :: * + -- The defaultKind is a bit smelly. If you remove it, + -- try compiling f x = do { x } + -- and you'll get a kind mis-match. It smells, but + -- not enough to lose sleep over. + + loop _ = boxySplitFailure (mkAppTy alphaTy betaTy) orig_ty + +------------------ +boxySplitFailure actual_ty expected_ty + = unifyMisMatch False False actual_ty expected_ty + -- "outer" is False, so we don't pop the context + -- which is what we want since we have not pushed one! +\end{code} + + +-------------------------------- +-- withBoxes: the key utility function +-------------------------------- + +\begin{code} +withMetaTvs :: TcTyVar -- An unfilled-in, non-skolem, meta type variable + -> [Kind] -- Make fresh boxes (with the same BoxTv/TauTv setting as tv) + -> ([BoxySigmaType] -> BoxySigmaType) + -- Constructs the type to assign + -- to the original var + -> TcM [BoxySigmaType] -- Return the fresh boxes + +-- It's entirely possible for the [kind] to be empty. 
+-- For example, when pattern-matching on True, +-- we call boxySplitTyConApp passing a boolTyCon + +-- Invariant: tv is still Flexi + +withMetaTvs tv kinds mk_res_ty + | isBoxyTyVar tv + = do { box_tvs <- mapM (newMetaTyVar BoxTv) kinds + ; let box_tys = mkTyVarTys box_tvs + ; writeMetaTyVar tv (mk_res_ty box_tys) + ; return box_tys } + + | otherwise -- Non-boxy meta type variable + = do { tau_tys <- mapM newFlexiTyVarTy kinds + ; writeMetaTyVar tv (mk_res_ty tau_tys) -- Write it *first* + -- Sure to be a tau-type + ; return tau_tys } + +withBox :: Kind -> (BoxySigmaType -> TcM a) -> TcM (a, TcType) +-- Allocate a *boxy* tyvar +withBox kind thing_inside + = do { box_tv <- newMetaTyVar BoxTv kind + ; res <- thing_inside (mkTyVarTy box_tv) + ; ty <- readFilledBox box_tv + ; return (res, ty) } +\end{code} + + +%************************************************************************ +%* * + Approximate boxy matching +%* * +%************************************************************************ + +\begin{code} +boxySubMatchType + :: TcTyVarSet -> TcType -- The "template"; the tyvars are skolems + -> BoxyRhoType -- Type to match (note a *Rho* type) + -> TvSubst -- Substitution of the [TcTyVar] to BoxySigmaTypes + +boxyMatchTypes + :: TcTyVarSet -> [TcType] -- The "template"; the tyvars are skolems + -> [BoxySigmaType] -- Type to match + -> TvSubst -- Substitution of the [TcTyVar] to BoxySigmaTypes + +-- Find a *boxy* substitution that makes the template look as much +-- like the BoxySigmaType as possible. +-- It's always ok to return an empty substitution; +-- anything more is jam on the pudding +-- +-- NB1: This is a pure, non-monadic function. +-- It does no unification, and cannot fail +-- +-- Note [Matching kinds] +-- The target type might legitimately not be a sub-kind of template. +-- For example, suppose the target is simply a box with an OpenTypeKind, +-- and the template is a type variable with LiftedTypeKind. +-- Then it's ok (because the target type will later be refined). +-- We simply don't bind the template type variable. +-- +-- It might also be that the kind mis-match is an error. For example, +-- suppose we match the template (a -> Int) against (Int# -> Int), +-- where the template type variable 'a' has LiftedTypeKind. This +-- matching function does not fail; it simply doesn't bind the template. +-- Later stuff will fail. +-- +-- Precondition: the arg lengths are equal +-- Precondition: none of the template type variables appear in the [BoxySigmaType] +-- Precondition: any nested quantifiers in either type differ from +-- the template type variables passed as arguments +-- +-- Note [Sub-match] +-- ~~~~~~~~~~~~~~~~ +-- Consider this +-- head :: [a] -> a +-- |- head xs : <rhobox> +-- We will do a boxySubMatchType between a ~ <rhobox> +-- But we *don't* want to match [a |-> <rhobox>] because +-- (a) The box should be filled in with a rho-type, but +-- but the returned substitution maps TyVars to boxy *sigma* +-- types +-- (b) In any case, the right final answer might be *either* +-- instantiate 'a' with a rho-type or a sigma type +-- head xs : Int vs head xs : forall b. b->b +-- So the matcher MUST NOT make a choice here. In general, we only +-- bind a template type variable in boxyMatchType, not in boxySubMatchType. 
+ +boxySubMatchType tmpl_tvs tmpl_ty boxy_ty + = go tmpl_ty boxy_ty + where + go t_ty b_ty + | Just t_ty' <- tcView t_ty = go t_ty' b_ty + | Just b_ty' <- tcView b_ty = go t_ty b_ty' + + go (FunTy arg1 res1) (FunTy arg2 res2) + = do_match arg1 arg2 (go res1 res2) + -- Match the args, and sub-match the results + + go (TyVarTy _) b_ty = emptyTvSubst -- Do not bind! See Note [Sub-match] + + go t_ty b_ty = do_match t_ty b_ty emptyTvSubst -- Otherwise we are safe to bind + + do_match t_ty b_ty subst = boxy_match tmpl_tvs t_ty emptyVarSet b_ty subst + +------------ +boxyMatchTypes tmpl_tvs tmpl_tys boxy_tys + = ASSERT( length tmpl_tys == length boxy_tys ) + boxy_match_s tmpl_tvs tmpl_tys emptyVarSet boxy_tys emptyTvSubst + -- ToDo: add error context? + +boxy_match_s tmpl_tvs [] boxy_tvs [] subst + = subst +boxy_match_s tmpl_tvs (t_ty:t_tys) boxy_tvs (b_ty:b_tys) subst + = boxy_match_s tmpl_tvs t_tys boxy_tvs b_tys $ + boxy_match tmpl_tvs t_ty boxy_tvs b_ty subst + +------------ +boxy_match :: TcTyVarSet -> TcType -- Template + -> TcTyVarSet -- boxy_tvs: do not bind template tyvars to any of these + -> BoxySigmaType -- Match against this type + -> TvSubst + -> TvSubst + +-- The boxy_tvs argument prevents this match: +-- [a] forall b. a ~ forall b. b +-- We don't want to bind the template variable 'a' +-- to the quantified type variable 'b'! + +boxy_match tmpl_tvs orig_tmpl_ty boxy_tvs orig_boxy_ty subst + = go orig_tmpl_ty orig_boxy_ty + where + go t_ty b_ty + | Just t_ty' <- tcView t_ty = go t_ty' b_ty + | Just b_ty' <- tcView b_ty = go t_ty b_ty' + + go (ForAllTy _ ty1) (ForAllTy tv2 ty2) + = boxy_match tmpl_tvs ty1 (boxy_tvs `extendVarSet` tv2) ty2 subst + + go (TyConApp tc1 tys1) (TyConApp tc2 tys2) + | tc1 == tc2 = go_s tys1 tys2 + + go (FunTy arg1 res1) (FunTy arg2 res2) + = go_s [arg1,res1] [arg2,res2] + + go t_ty b_ty + | Just (s1,t1) <- tcSplitAppTy_maybe t_ty, + Just (s2,t2) <- tcSplitAppTy_maybe b_ty, + typeKind t2 `isSubKind` typeKind t1 -- Maintain invariant + = go_s [s1,t1] [s2,t2] + + go (TyVarTy tv) b_ty + | tv `elemVarSet` tmpl_tvs -- Template type variable in the template + , not (intersectsVarSet boxy_tvs (tyVarsOfType orig_boxy_ty)) + , typeKind b_ty `isSubKind` tyVarKind tv + = extendTvSubst subst tv boxy_ty' + where + boxy_ty' = case lookupTyVar subst tv of + Nothing -> orig_boxy_ty + Just ty -> ty `boxyLub` orig_boxy_ty + + go _ _ = subst -- Always safe + + -------- + go_s tys1 tys2 = boxy_match_s tmpl_tvs tys1 boxy_tvs tys2 subst + + +boxyLub :: BoxySigmaType -> BoxySigmaType -> BoxySigmaType +-- Combine boxy information from the two types +-- If there is a conflict, return the first +boxyLub orig_ty1 orig_ty2 + = go orig_ty1 orig_ty2 + where + go (AppTy f1 a1) (AppTy f2 a2) = AppTy (boxyLub f1 f2) (boxyLub a1 a2) + go (FunTy f1 a1) (FunTy f2 a2) = FunTy (boxyLub f1 f2) (boxyLub a1 a2) + go (TyConApp tc1 ts1) (TyConApp tc2 ts2) + | tc1 == tc2, length ts1 == length ts2 + = TyConApp tc1 (zipWith boxyLub ts1 ts2) + + go (TyVarTy tv1) ty2 -- This is the whole point; + | isTcTyVar tv1, isMetaTyVar tv1 -- choose ty2 if ty2 is a box + = ty2 + + -- Look inside type synonyms, but only if the naive version fails + go ty1 ty2 | Just ty1' <- tcView ty1 = go ty1' ty2 + | Just ty2' <- tcView ty1 = go ty1 ty2' + + -- For now, we don't look inside ForAlls, PredTys + go ty1 ty2 = orig_ty1 -- Default +\end{code} + + +%************************************************************************ +%* * + Subsumption checking +%* * 
+%************************************************************************ + +All the tcSub calls have the form + + tcSub expected_ty offered_ty +which checks + offered_ty <= expected_ty + +That is, that a value of type offered_ty is acceptable in +a place expecting a value of type expected_ty. + +It returns a coercion function + co_fn :: offered_ty -> expected_ty +which takes an HsExpr of type offered_ty into one of type +expected_ty. + +\begin{code} +----------------- +tcSubExp :: BoxySigmaType -> BoxySigmaType -> TcM ExprCoFn -- Locally used only + -- (tcSub act exp) checks that + -- act <= exp +tcSubExp actual_ty expected_ty + = addErrCtxtM (unifyCtxt actual_ty expected_ty) + (tc_sub True actual_ty actual_ty expected_ty expected_ty) + +tcFunResTy :: Name -> BoxySigmaType -> BoxySigmaType -> TcM ExprCoFn -- Locally used only +tcFunResTy fun actual_ty expected_ty + = addErrCtxtM (checkFunResCtxt fun actual_ty expected_ty) $ + (tc_sub True actual_ty actual_ty expected_ty expected_ty) + +----------------- +tc_sub :: Outer -- See comments with uTys + -> BoxySigmaType -- actual_ty, before expanding synonyms + -> BoxySigmaType -- ..and after + -> BoxySigmaType -- expected_ty, before + -> BoxySigmaType -- ..and after + -> TcM ExprCoFn + +tc_sub outer act_sty act_ty exp_sty exp_ty + | Just exp_ty' <- tcView exp_ty = tc_sub False act_sty act_ty exp_sty exp_ty' +tc_sub outer act_sty act_ty exp_sty exp_ty + | Just act_ty' <- tcView act_ty = tc_sub False act_sty act_ty' exp_sty exp_ty + +----------------------------------- +-- Rule SBOXY, plus other cases when act_ty is a type variable +-- Just defer to boxy matching +-- This rule takes precedence over SKOL! +tc_sub outer act_sty (TyVarTy tv) exp_sty exp_ty + = do { uVar outer False tv False exp_sty exp_ty + ; return idCoercion } + +----------------------------------- +-- Skolemisation case (rule SKOL) +-- actual_ty: d:Eq b => b->b +-- expected_ty: forall a. Ord a => a->a +-- co_fn e /\a. \d2:Ord a. let d = eqFromOrd d2 in e + +-- It is essential to do this *before* the specialisation case +-- Example: f :: (Eq a => a->a) -> ... +-- g :: Ord b => b->b +-- Consider f g ! + +tc_sub outer act_sty act_ty exp_sty exp_ty + | isSigmaTy exp_ty + = do { (gen_fn, co_fn) <- tcGen exp_ty act_tvs $ \ body_exp_ty -> + tc_sub False act_sty act_ty body_exp_ty body_exp_ty + ; return (gen_fn <.> co_fn) } + where + act_tvs = tyVarsOfType act_ty + -- It's really important to check for escape wrt the free vars of + -- both expected_ty *and* actual_ty + +----------------------------------- +-- Specialisation case (rule ASPEC): +-- actual_ty: forall a. 
Ord a => a->a +-- expected_ty: Int -> Int +-- co_fn e = e Int dOrdInt + +tc_sub outer act_sty actual_ty exp_sty expected_ty + | isSigmaTy actual_ty + = do { (tyvars, theta, tau) <- tcInstBoxy actual_ty + ; dicts <- newDicts InstSigOrigin theta + ; extendLIEs dicts + ; let inst_fn = CoApps (CoTyApps CoHole (mkTyVarTys tyvars)) + (map instToId dicts) + ; co_fn <- tc_sub False tau tau exp_sty expected_ty + ; return (co_fn <.> inst_fn) } + +----------------------------------- +-- Function case (rule F1) +tc_sub _ _ (FunTy act_arg act_res) _ (FunTy exp_arg exp_res) + = tc_sub_funs act_arg act_res exp_arg exp_res + +-- Function case (rule F2) +tc_sub outer act_sty act_ty@(FunTy act_arg act_res) exp_sty (TyVarTy exp_tv) + | isBoxyTyVar exp_tv + = do { cts <- readMetaTyVar exp_tv + ; case cts of + Indirect ty -> do { u_tys outer False act_sty act_ty True exp_sty ty + ; return idCoercion } + Flexi -> do { [arg_ty,res_ty] <- withMetaTvs exp_tv fun_kinds mk_res_ty + ; tc_sub_funs act_arg act_res arg_ty res_ty } } + where + mk_res_ty [arg_ty', res_ty'] = mkFunTy arg_ty' res_ty' + fun_kinds = [argTypeKind, openTypeKind] + +-- Everything else: defer to boxy matching +tc_sub outer act_sty actual_ty exp_sty expected_ty + = do { u_tys outer False act_sty actual_ty False exp_sty expected_ty + ; return idCoercion } + + +----------------------------------- +tc_sub_funs act_arg act_res exp_arg exp_res + = do { uTys False act_arg False exp_arg + ; co_fn_res <- tc_sub False act_res act_res exp_res exp_res + ; wrapFunResCoercion [exp_arg] co_fn_res } + +----------------------------------- +wrapFunResCoercion + :: [TcType] -- Type of args + -> ExprCoFn -- HsExpr a -> HsExpr b + -> TcM ExprCoFn -- HsExpr (arg_tys -> a) -> HsExpr (arg_tys -> b) +wrapFunResCoercion arg_tys co_fn_res + | isIdCoercion co_fn_res = return idCoercion + | null arg_tys = return co_fn_res + | otherwise + = do { us <- newUniqueSupply + ; let arg_ids = zipWith (mkSysLocal FSLIT("sub")) (uniqsFromSupply us) arg_tys + ; return (CoLams arg_ids (co_fn_res <.> (CoApps CoHole arg_ids))) } +\end{code} + + + +%************************************************************************ +%* * +\subsection{Generalisation} +%* * +%************************************************************************ + +\begin{code} +tcGen :: BoxySigmaType -- expected_ty + -> TcTyVarSet -- Extra tyvars that the universally + -- quantified tyvars of expected_ty + -- must not be unified + -> (BoxyRhoType -> TcM result) -- spec_ty + -> TcM (ExprCoFn, result) + -- The expression has type: spec_ty -> expected_ty + +tcGen expected_ty extra_tvs thing_inside -- We expect expected_ty to be a forall-type + -- If not, the call is a no-op + = do { -- We want the GenSkol info in the skolemised type variables to + -- mention the *instantiated* tyvar names, so that we get a + -- good error message "Rigid variable 'a' is bound by (forall a. 
a->a)" + -- Hence the tiresome but innocuous fixM + ((forall_tvs, theta, rho_ty), skol_info) <- fixM (\ ~(_, skol_info) -> + do { (forall_tvs, theta, rho_ty) <- tcInstSkolType skol_info expected_ty + ; span <- getSrcSpanM + ; let skol_info = GenSkol forall_tvs (mkPhiTy theta rho_ty) span + ; return ((forall_tvs, theta, rho_ty), skol_info) }) + +#ifdef DEBUG + ; traceTc (text "tcGen" <+> vcat [text "extra_tvs" <+> ppr extra_tvs, + text "expected_ty" <+> ppr expected_ty, + text "inst ty" <+> ppr forall_tvs <+> ppr theta <+> ppr rho_ty, + text "free_tvs" <+> ppr free_tvs, + text "forall_tvs" <+> ppr forall_tvs]) +#endif + + -- Type-check the arg and unify with poly type + ; (result, lie) <- getLIE (thing_inside rho_ty) + + -- Check that the "forall_tvs" havn't been constrained + -- The interesting bit here is that we must include the free variables + -- of the expected_ty. Here's an example: + -- runST (newVar True) + -- Here, if we don't make a check, we'll get a type (ST s (MutVar s Bool)) + -- for (newVar True), with s fresh. Then we unify with the runST's arg type + -- forall s'. ST s' a. That unifies s' with s, and a with MutVar s Bool. + -- So now s' isn't unconstrained because it's linked to a. + -- Conclusion: include the free vars of the expected_ty in the + -- list of "free vars" for the signature check. + + ; dicts <- newDicts (SigOrigin skol_info) theta + ; inst_binds <- tcSimplifyCheck sig_msg forall_tvs dicts lie + + ; checkSigTyVarsWrt free_tvs forall_tvs + ; traceTc (text "tcGen:done") + + ; let + -- This HsLet binds any Insts which came out of the simplification. + -- It's a bit out of place here, but using AbsBind involves inventing + -- a couple of new names which seems worse. + dict_ids = map instToId dicts + co_fn = CoTyLams forall_tvs $ CoLams dict_ids $ CoLet inst_binds CoHole + ; returnM (co_fn, result) } + where + free_tvs = tyVarsOfType expected_ty `unionVarSet` extra_tvs + sig_msg = ptext SLIT("expected type of an expression") +\end{code} + + + +%************************************************************************ +%* * + Boxy unification +%* * +%************************************************************************ + +The exported functions are all defined as versions of some +non-exported generic functions. 
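To make that structure concrete before the real definitions, here is a small standalone sketch of
the same shape, written against an invented toy @Ty@ type rather than GHC's @TcType@ and @TcM@
(every name below is illustrative only, not part of this module): one generic worker, a pairwise
wrapper that can complain about mismatched lengths, and a chaining wrapper that unifies a whole
list of types together.

\begin{verbatim}
-- Standalone sketch; toy names, not GHC code.
data Ty = TyVar String | TyCon String [Ty]
  deriving (Eq, Show)

-- Toy worker (standing in for the real uTys): purely syntactic comparison.
unifyTy :: Ty -> Ty -> Either String ()
unifyTy t1 t2
  | t1 == t2  = Right ()
  | otherwise = Left ("Cannot match " ++ show t1 ++ " with " ++ show t2)

-- Pairwise wrapper, analogous to uList: walk both lists together so that
-- a length mismatch can be reported explicitly.
unifyList :: [Ty] -> [Ty] -> Either String ()
unifyList []       []       = Right ()
unifyList (t1:ts1) (t2:ts2) = unifyTy t1 t2 >> unifyList ts1 ts2
unifyList _        _        = Left "Contexts differ in length"

-- Chain a single list together, analogous to unifyTypeList: each element
-- is unified with its neighbour, hence with all the others.
unifyAll :: [Ty] -> Either String ()
unifyAll (t1:ts@(t2:_)) = unifyTy t1 t2 >> unifyAll ts
unifyAll _              = Right ()

main :: IO ()
main = do
  print (unifyAll  [TyCon "Int" [], TyCon "Int" []])        -- Right ()
  print (unifyList [TyVar "a"] [TyVar "a", TyVar "b"])      -- length mismatch
\end{verbatim}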
+
+\begin{code}
+boxyUnify :: BoxyType -> BoxyType -> TcM ()
+-- Actual and expected, respectively
+boxyUnify ty1 ty2
+  = addErrCtxtM (unifyCtxt ty1 ty2) $
+    uTysOuter False ty1 False ty2
+
+---------------
+boxyUnifyList :: [BoxyType] -> [BoxyType] -> TcM ()
+-- Arguments should have equal length
+-- Actual and expected types
+boxyUnifyList tys1 tys2 = uList boxyUnify tys1 tys2
+
+---------------
+unifyType :: TcTauType -> TcTauType -> TcM ()
+-- No boxes expected inside these types
+-- Actual and expected types
+unifyType ty1 ty2      -- ty1 expected, ty2 inferred
+  = ASSERT2( not (isBoxyTy ty1), ppr ty1 )
+    ASSERT2( not (isBoxyTy ty2), ppr ty2 )
+    addErrCtxtM (unifyCtxt ty1 ty2) $
+    uTysOuter True ty1 True ty2
+
+---------------
+unifyPred :: PredType -> PredType -> TcM ()
+-- Actual and expected types
+unifyPred p1 p2 = addErrCtxtM (unifyCtxt (mkPredTy p1) (mkPredTy p2)) $
+                  uPred True True p1 True p2
+
+unifyTheta :: TcThetaType -> TcThetaType -> TcM ()
+-- Actual and expected types
+unifyTheta theta1 theta2
+  = do { checkTc (equalLength theta1 theta2)
+                 (ptext SLIT("Contexts differ in length"))
+       ; uList unifyPred theta1 theta2 }
+
+---------------
+uList :: (a -> a -> TcM ())
+      -> [a] -> [a] -> TcM ()
+-- Unify corresponding elements of two lists of types, which
+-- should be of equal length.  We charge down the list explicitly so that
+-- we can complain if their lengths differ.
+uList unify []         []         = return ()
+uList unify (ty1:tys1) (ty2:tys2) = do { unify ty1 ty2; uList unify tys1 tys2 }
+uList unify ty1s       ty2s       = panic "Unify.uList: mismatched type lists!"
+\end{code}
+
+@unifyTypeList@ takes a single list of @TauType@s and unifies them
+all together.  It is used, for example, when typechecking explicit
+lists, when all the elements should be of the same type.
+
+\begin{code}
+unifyTypeList :: [TcTauType] -> TcM ()
+unifyTypeList []                = returnM ()
+unifyTypeList [ty]              = returnM ()
+unifyTypeList (ty1:tys@(ty2:_)) = do { unifyType ty1 ty2
+                                     ; unifyTypeList tys }
+\end{code}
+
+%************************************************************************
+%*                                                                      *
+\subsection[Unify-uTys]{@uTys@: getting down to business}
+%*                                                                      *
+%************************************************************************
+
+@uTys@ is the heart of the unifier.  Each arg happens twice, because
+we want to report errors in terms of synonyms if possible.  The first of
+the pair is used in error messages only; it is always the same as the
+second, except that if the first is a synonym then the second may be a
+de-synonym'd version.  This way we get better error messages.
+
+We call the first one \tr{ps_ty1}, \tr{ps_ty2} for ``possible synonym''.
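The following standalone sketch illustrates the ``possible synonym'' idea on an invented toy
representation (none of these names come from GHC): the comparison is done on the expanded type,
but the error message is phrased with the un-expanded form the caller wrote.

\begin{verbatim}
-- Toy representation: a synonym carries its printed name and its expansion.
data Ty = TyVar String | TyCon String [Ty] | Syn String Ty
  deriving (Eq, Show)

pprTy :: Ty -> String
pprTy (TyVar v)    = v
pprTy (TyCon c ts) = unwords (c : map pprTy ts)
pprTy (Syn n _)    = n          -- print the synonym name, not its expansion

expand :: Ty -> Ty              -- strip synonyms everywhere before comparing
expand (Syn _ t)    = expand t
expand (TyCon c ts) = TyCon c (map expand ts)
expand t            = t

-- ps_ty1/ps_ty2 play the role of the "possible synonym" arguments: they are
-- used only to build the error message; the comparison uses the expansions.
unifyTy :: Ty -> Ty -> Either String ()
unifyTy ps_ty1 ps_ty2
  | expand ps_ty1 == expand ps_ty2 = Right ()
  | otherwise = Left ("Couldn't match `" ++ pprTy ps_ty1 ++
                      "' against `" ++ pprTy ps_ty2 ++ "'")

main :: IO ()
main = print (unifyTy (Syn "String" (TyCon "List" [TyCon "Char" []]))
                      (TyCon "Int" []))
  -- reports String vs Int, not the expanded list-of-Char type
\end{verbatim}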
+ +\begin{code} +type NoBoxes = Bool -- True <=> definitely no boxes in this type + -- False <=> there might be boxes (always safe) + +type Outer = Bool -- True <=> this is the outer level of a unification + -- so that the types being unified are the + -- very ones we began with, not some sub + -- component or synonym expansion +-- The idea is that if Outer is true then unifyMisMatch should +-- pop the context to remove the "Expected/Acutal" context + +uTysOuter, uTys + :: NoBoxes -> TcType -- ty1 is the *expected* type + -> NoBoxes -> TcType -- ty2 is the *actual* type + -> TcM () +uTysOuter nb1 ty1 nb2 ty2 = u_tys True nb1 ty1 ty1 nb2 ty2 ty2 +uTys nb1 ty1 nb2 ty2 = u_tys False nb1 ty1 ty1 nb2 ty2 ty2 + + +-------------- +uTys_s :: NoBoxes -> [TcType] -- ty1 is the *actual* types + -> NoBoxes -> [TcType] -- ty2 is the *expected* types + -> TcM () +uTys_s nb1 [] nb2 [] = returnM () +uTys_s nb1 (ty1:tys1) nb2 (ty2:tys2) = do { uTys nb1 ty1 nb2 ty2 + ; uTys_s nb1 tys1 nb2 tys2 } +uTys_s nb1 ty1s nb2 ty2s = panic "Unify.uTys_s: mismatched type lists!" + +-------------- +u_tys :: Outer + -> NoBoxes -> TcType -> TcType -- ty1 is the *actual* type + -> NoBoxes -> TcType -> TcType -- ty2 is the *expected* type + -> TcM () + +u_tys outer nb1 orig_ty1 ty1 nb2 orig_ty2 ty2 + = go outer ty1 ty2 + where + + -- Always expand synonyms (see notes at end) + -- (this also throws away FTVs) + go outer ty1 ty2 + | Just ty1' <- tcView ty1 = go False ty1' ty2 + | Just ty2' <- tcView ty2 = go False ty1 ty2' + + -- Variables; go for uVar + go outer (TyVarTy tyvar1) ty2 = uVar outer False tyvar1 nb2 orig_ty2 ty2 + go outer ty1 (TyVarTy tyvar2) = uVar outer True tyvar2 nb1 orig_ty1 ty1 + -- "True" means args swapped + -- Predicates + go outer (PredTy p1) (PredTy p2) = uPred outer nb1 p1 nb2 p2 + + -- Type constructors must match + go _ (TyConApp con1 tys1) (TyConApp con2 tys2) + | con1 == con2 = uTys_s nb1 tys1 nb2 tys2 + -- See Note [TyCon app] + + -- Functions; just check the two parts + go _ (FunTy fun1 arg1) (FunTy fun2 arg2) + = do { uTys nb1 fun1 nb2 fun2 + ; uTys nb1 arg1 nb2 arg2 } + + -- Applications need a bit of care! + -- They can match FunTy and TyConApp, so use splitAppTy_maybe + -- NB: we've already dealt with type variables and Notes, + -- so if one type is an App the other one jolly well better be too + go outer (AppTy s1 t1) ty2 + | Just (s2,t2) <- tcSplitAppTy_maybe ty2 + = do { uTys nb1 s1 nb2 s2; uTys nb1 t1 nb2 t2 } + + -- Now the same, but the other way round + -- Don't swap the types, because the error messages get worse + go outer ty1 (AppTy s2 t2) + | Just (s1,t1) <- tcSplitAppTy_maybe ty1 + = do { uTys nb1 s1 nb2 s2; uTys nb1 t1 nb2 t2 } + + go _ ty1@(ForAllTy _ _) ty2@(ForAllTy _ _) + | length tvs1 == length tvs2 + = do { tvs <- tcInstSkolTyVars UnkSkol tvs1 -- Not a helpful SkolemInfo + ; let tys = mkTyVarTys tvs + in_scope = mkInScopeSet (mkVarSet tvs) + subst1 = mkTvSubst in_scope (zipTyEnv tvs1 tys) + subst2 = mkTvSubst in_scope (zipTyEnv tvs2 tys) + ; uTys nb1 (substTy subst1 body1) nb2 (substTy subst2 body2) + + -- If both sides are inside a box, we should not have + -- a polytype at all. This check comes last, because + -- the error message is extremely unhelpful. 
+ ; ifM (nb1 && nb2) (notMonoType ty1) + } + where + (tvs1, body1) = tcSplitForAllTys ty1 + (tvs2, body2) = tcSplitForAllTys ty2 + + -- Anything else fails + go outer _ _ = unifyMisMatch outer False orig_ty1 orig_ty2 + +---------- +uPred outer nb1 (IParam n1 t1) nb2 (IParam n2 t2) + | n1 == n2 = uTys nb1 t1 nb2 t2 +uPred outer nb1 (ClassP c1 tys1) nb2 (ClassP c2 tys2) + | c1 == c2 = uTys_s nb1 tys1 nb2 tys2 -- Guaranteed equal lengths because the kinds check +uPred outer _ p1 _ p2 = unifyMisMatch outer False (mkPredTy p1) (mkPredTy p2) +\end{code} + +Note [Tycon app] +~~~~~~~~~~~~~~~~ +When we find two TyConApps, the argument lists are guaranteed equal +length. Reason: intially the kinds of the two types to be unified is +the same. The only way it can become not the same is when unifying two +AppTys (f1 a1):=:(f2 a2). In that case there can't be a TyConApp in +the f1,f2 (because it'd absorb the app). If we unify f1:=:f2 first, +which we do, that ensures that f1,f2 have the same kind; and that +means a1,a2 have the same kind. And now the argument repeats. + + +Notes on synonyms +~~~~~~~~~~~~~~~~~ +If you are tempted to make a short cut on synonyms, as in this +pseudocode... + +\begin{verbatim} +-- NO uTys (SynTy con1 args1 ty1) (SynTy con2 args2 ty2) +-- NO = if (con1 == con2) then +-- NO -- Good news! Same synonym constructors, so we can shortcut +-- NO -- by unifying their arguments and ignoring their expansions. +-- NO unifyTypepeLists args1 args2 +-- NO else +-- NO -- Never mind. Just expand them and try again +-- NO uTys ty1 ty2 +\end{verbatim} + +then THINK AGAIN. Here is the whole story, as detected and reported +by Chris Okasaki \tr{<Chris_Okasaki@loch.mess.cs.cmu.edu>}: +\begin{quotation} +Here's a test program that should detect the problem: + +\begin{verbatim} + type Bogus a = Int + x = (1 :: Bogus Char) :: Bogus Bool +\end{verbatim} + +The problem with [the attempted shortcut code] is that +\begin{verbatim} + con1 == con2 +\end{verbatim} +is not a sufficient condition to be able to use the shortcut! +You also need to know that the type synonym actually USES all +its arguments. For example, consider the following type synonym +which does not use all its arguments. +\begin{verbatim} + type Bogus a = Int +\end{verbatim} + +If you ever tried unifying, say, \tr{Bogus Char} with \tr{Bogus Bool}, +the unifier would blithely try to unify \tr{Char} with \tr{Bool} and +would fail, even though the expanded forms (both \tr{Int}) should +match. + +Similarly, unifying \tr{Bogus Char} with \tr{Bogus t} would +unnecessarily bind \tr{t} to \tr{Char}. + +... You could explicitly test for the problem synonyms and mark them +somehow as needing expansion, perhaps also issuing a warning to the +user. +\end{quotation} + + +%************************************************************************ +%* * +\subsection[Unify-uVar]{@uVar@: unifying with a type variable} +%* * +%************************************************************************ + +@uVar@ is called when at least one of the types being unified is a +variable. It does {\em not} assume that the variable is a fixed point +of the substitution; rather, notice that @uVar@ (defined below) nips +back into @uTys@ if it turns out that the variable is already bound. 
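The standalone sketch below (again an invented toy representation, not GHC's mutable type
variables) shows the same control flow in a pure setting: the variable case consults the
substitution first, goes back into the general unifier when the variable is already bound, and
otherwise binds it after an occurs check.

\begin{verbatim}
-- Standalone sketch; toy names, not GHC code.
import qualified Data.Map as M

data Ty = TyVar String | TyCon String [Ty]
  deriving (Eq, Show)

type Subst = M.Map String Ty

occurs :: String -> Ty -> Bool
occurs v (TyVar v')   = v == v'
occurs v (TyCon _ ts) = any (occurs v) ts

unify :: Subst -> Ty -> Ty -> Either String Subst
unify s (TyVar v) t2        = unifyVar s v t2
unify s t1        (TyVar v) = unifyVar s v t1
unify s (TyCon c1 ts1) (TyCon c2 ts2)
  | c1 == c2  = go s ts1 ts2
  | otherwise = Left ("Cannot match " ++ c1 ++ " with " ++ c2)
  where
    go acc []     []     = Right acc
    go acc (a:as) (b:bs) = do { acc' <- unify acc a b; go acc' as bs }
    go _   _      _      = Left ("Arity mismatch in " ++ c1)

-- The analogue of uVar: check for an existing binding before anything else.
unifyVar :: Subst -> String -> Ty -> Either String Subst
unifyVar s v t2 =
  case M.lookup v s of
    Just t1 -> unify s t1 t2                       -- already bound: back into unify
    Nothing
      | TyVar v' <- t2, v == v' -> Right s         -- same variable: nothing to do
      | occurs v t2 -> Left ("Occurs check: " ++ v ++ " occurs in " ++ show t2)
      | otherwise   -> Right (M.insert v t2 s)     -- bind the variable

main :: IO ()
main = do
  print (unify M.empty (TyVar "a") (TyCon "Maybe" [TyVar "b"]))  -- binds a
  print (unify M.empty (TyVar "a") (TyCon "Maybe" [TyVar "a"]))  -- occurs-check failure
\end{verbatim}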
+ +\begin{code} +uVar :: Outer + -> Bool -- False => tyvar is the "expected" + -- True => ty is the "expected" thing + -> TcTyVar + -> NoBoxes -- True <=> definitely no boxes in t2 + -> TcTauType -> TcTauType -- printing and real versions + -> TcM () + +uVar outer swapped tv1 nb2 ps_ty2 ty2 + = do { let expansion | showSDoc (ppr ty2) == showSDoc (ppr ps_ty2) = empty + | otherwise = brackets (equals <+> ppr ty2) + ; traceTc (text "uVar" <+> ppr swapped <+> + sep [ppr tv1 <+> dcolon <+> ppr (tyVarKind tv1 ), + nest 2 (ptext SLIT(" :=: ")), + ppr ps_ty2 <+> dcolon <+> ppr (typeKind ty2) <+> expansion]) + ; details <- lookupTcTyVar tv1 + ; case details of + IndirectTv ty1 + | swapped -> u_tys outer nb2 ps_ty2 ty2 True ty1 ty1 -- Swap back + | otherwise -> u_tys outer True ty1 ty1 nb2 ps_ty2 ty2 -- Same order + -- The 'True' here says that ty1 + -- is definitely box-free + DoneTv details1 -> uUnfilledVar outer swapped tv1 details1 nb2 ps_ty2 ty2 + } + +---------------- +uUnfilledVar :: Outer + -> Bool -- Args are swapped + -> TcTyVar -> TcTyVarDetails -- Tyvar 1 + -> NoBoxes -> TcTauType -> TcTauType -- Type 2 + -> TcM () +-- Invariant: tyvar 1 is not unified with anything + +uUnfilledVar outer swapped tv1 details1 nb2 ps_ty2 ty2 + | Just ty2' <- tcView ty2 + = -- Expand synonyms; ignore FTVs + uUnfilledVar False swapped tv1 details1 nb2 ps_ty2 ty2' + +uUnfilledVar outer swapped tv1 details1 nb2 ps_ty2 ty2@(TyVarTy tv2) + -- Same type variable => no-op + | tv1 == tv2 + = returnM () + + -- Distinct type variables + | otherwise + = do { lookup2 <- lookupTcTyVar tv2 + ; case lookup2 of + IndirectTv ty2' -> uUnfilledVar outer swapped tv1 details1 True ty2' ty2' + DoneTv details2 -> uUnfilledVars outer swapped tv1 details1 tv2 details2 + } + +uUnfilledVar outer swapped tv1 details1 nb2 ps_ty2 non_var_ty2 -- ty2 is not a type variable + = case details1 of + MetaTv (SigTv _) ref1 -> mis_match -- Can't update a skolem with a non-type-variable + MetaTv info ref1 -> uMetaVar swapped tv1 info ref1 nb2 ps_ty2 non_var_ty2 + skolem_details -> mis_match + where + mis_match = unifyMisMatch outer swapped (TyVarTy tv1) ps_ty2 + +---------------- +uMetaVar :: Bool + -> TcTyVar -> BoxInfo -> IORef MetaDetails + -> NoBoxes -> TcType -> TcType + -> TcM () +-- tv1 is an un-filled-in meta type variable (maybe boxy, maybe tau) +-- ty2 is not a type variable + +uMetaVar swapped tv1 info1 ref1 nb2 ps_ty2 non_var_ty2 + = do { final_ty <- case info1 of + BoxTv -> unBox ps_ty2 -- No occurs check + other -> checkTauTvUpdate tv1 ps_ty2 -- Occurs check + monotype check + ; checkUpdateMeta swapped tv1 ref1 final_ty } + +---------------- +uUnfilledVars :: Outer + -> Bool -- Args are swapped + -> TcTyVar -> TcTyVarDetails -- Tyvar 1 + -> TcTyVar -> TcTyVarDetails -- Tyvar 2 + -> TcM () +-- Invarant: The type variables are distinct, +-- Neither is filled in yet +-- They might be boxy or not + +uUnfilledVars outer swapped tv1 (SkolemTv _) tv2 (SkolemTv _) + = unifyMisMatch outer swapped (mkTyVarTy tv1) (mkTyVarTy tv2) + +uUnfilledVars outer swapped tv1 (MetaTv info1 ref1) tv2 (SkolemTv _) + = checkUpdateMeta swapped tv1 ref1 (mkTyVarTy tv2) +uUnfilledVars outer swapped tv1 (SkolemTv _) tv2 (MetaTv info2 ref2) + = checkUpdateMeta (not swapped) tv2 ref2 (mkTyVarTy tv1) + +-- ToDo: this function seems too long for what it acutally does! 
+uUnfilledVars outer swapped tv1 (MetaTv info1 ref1) tv2 (MetaTv info2 ref2) + = case (info1, info2) of + (BoxTv, BoxTv) -> box_meets_box + + -- If a box meets a TauTv, but the fomer has the smaller kind + -- then we must create a fresh TauTv with the smaller kind + (_, BoxTv) | k1_sub_k2 -> update_tv2 + | otherwise -> box_meets_box + (BoxTv, _ ) | k2_sub_k1 -> update_tv1 + | otherwise -> box_meets_box + + -- Avoid SigTvs if poss + (SigTv _, _ ) | k1_sub_k2 -> update_tv2 + (_, SigTv _) | k2_sub_k1 -> update_tv1 + + (_, _) | k1_sub_k2 -> if k2_sub_k1 && nicer_to_update_tv1 + then update_tv1 -- Same kinds + else update_tv2 + | k2_sub_k1 -> update_tv1 + | otherwise -> kind_err + + -- Update the variable with least kind info + -- See notes on type inference in Kind.lhs + -- The "nicer to" part only applies if the two kinds are the same, + -- so we can choose which to do. + where + -- Kinds should be guaranteed ok at this point + update_tv1 = updateMeta tv1 ref1 (mkTyVarTy tv2) + update_tv2 = updateMeta tv2 ref2 (mkTyVarTy tv1) + + box_meets_box | k1_sub_k2 = fill_with k1 + | k2_sub_k1 = fill_with k2 + | otherwise = kind_err + + fill_with kind = do { tau_ty <- newFlexiTyVarTy kind + ; updateMeta tv1 ref1 tau_ty + ; updateMeta tv2 ref2 tau_ty } + + kind_err = addErrCtxtM (unifyKindCtxt swapped tv1 (mkTyVarTy tv2)) $ + unifyKindMisMatch k1 k2 + + k1 = tyVarKind tv1 + k2 = tyVarKind tv2 + k1_sub_k2 = k1 `isSubKind` k2 + k2_sub_k1 = k2 `isSubKind` k1 + + nicer_to_update_tv1 = isSystemName (varName tv1) + -- Try to update sys-y type variables in preference to ones + -- gotten (say) by instantiating a polymorphic function with + -- a user-written type sig + +---------------- +checkUpdateMeta :: Bool -> TcTyVar -> IORef MetaDetails -> TcType -> TcM () +-- Update tv1, which is flexi; occurs check is alrady done +-- The 'check' version does a kind check too +-- We do a sub-kind check here: we might unify (a b) with (c d) +-- where b::*->* and d::*; this should fail + +checkUpdateMeta swapped tv1 ref1 ty2 + = do { checkKinds swapped tv1 ty2 + ; updateMeta tv1 ref1 ty2 } + +updateMeta :: TcTyVar -> IORef MetaDetails -> TcType -> TcM () +updateMeta tv1 ref1 ty2 + = ASSERT( isMetaTyVar tv1 ) + ASSERT( isBoxyTyVar tv1 || isTauTy ty2 ) + do { ASSERTM2( do { details <- readMetaTyVar tv1; return (isFlexi details) }, ppr tv1 ) + ; traceTc (text "updateMeta" <+> ppr tv1 <+> text ":=" <+> ppr ty2) + ; writeMutVar ref1 (Indirect ty2) } + +---------------- +checkKinds swapped tv1 ty2 +-- We're about to unify a type variable tv1 with a non-tyvar-type ty2. +-- ty2 has been zonked at this stage, which ensures that +-- its kind has as much boxity information visible as possible. + | tk2 `isSubKind` tk1 = returnM () + + | otherwise + -- Either the kinds aren't compatible + -- (can happen if we unify (a b) with (c d)) + -- or we are unifying a lifted type variable with an + -- unlifted type: e.g. (id 3#) is illegal + = addErrCtxtM (unifyKindCtxt swapped tv1 ty2) $ + unifyKindMisMatch k1 k2 + where + (k1,k2) | swapped = (tk2,tk1) + | otherwise = (tk1,tk2) + tk1 = tyVarKind tv1 + tk2 = typeKind ty2 + +---------------- +checkTauTvUpdate :: TcTyVar -> TcType -> TcM TcType +-- (checkTauTvUpdate tv ty) +-- We are about to update the TauTv tv with ty. 
+-- Check (a) that tv doesn't occur in ty (occurs check) +-- (b) that ty is a monotype +-- Furthermore, in the interest of (b), if you find an +-- empty box (BoxTv that is Flexi), fill it in with a TauTv +-- +-- Returns the (non-boxy) type to update the type variable with, or fails + +checkTauTvUpdate orig_tv orig_ty + = go orig_ty + where + go (TyConApp tc tys) + | isSynTyCon tc = go_syn tc tys + | otherwise = do { tys' <- mappM go tys; return (TyConApp tc tys') } + go (NoteTy _ ty2) = go ty2 -- Discard free-tyvar annotations + go (PredTy p) = do { p' <- go_pred p; return (PredTy p') } + go (FunTy arg res) = do { arg' <- go arg; res' <- go res; return (FunTy arg' res') } + go (AppTy fun arg) = do { fun' <- go fun; arg' <- go arg; return (mkAppTy fun' arg') } + -- NB the mkAppTy; we might have instantiated a + -- type variable to a type constructor, so we need + -- to pull the TyConApp to the top. + go (ForAllTy tv ty) = notMonoType orig_ty -- (b) + + go (TyVarTy tv) + | orig_tv == tv = occurCheck tv orig_ty -- (a) + | isTcTyVar tv = go_tyvar tv (tcTyVarDetails tv) + | otherwise = return (TyVarTy tv) + -- Ordinary (non Tc) tyvars + -- occur inside quantified types + + go_pred (ClassP c tys) = do { tys' <- mapM go tys; return (ClassP c tys') } + go_pred (IParam n ty) = do { ty' <- go ty; return (IParam n ty') } + + go_tyvar tv (SkolemTv _) = return (TyVarTy tv) + go_tyvar tv (MetaTv box ref) + = do { cts <- readMutVar ref + ; case cts of + Indirect ty -> go ty + Flexi -> case box of + BoxTv -> do { tau <- newFlexiTyVarTy (tyVarKind tv) + ; writeMutVar ref (Indirect tau) + ; return tau } + other -> return (TyVarTy tv) + } + + -- go_syn is called for synonyms only + -- See Note [Type synonyms and the occur check] + go_syn tc tys + | not (isTauTyCon tc) + = notMonoType orig_ty -- (b) again + | otherwise + = do { (msgs, mb_tys') <- tryTc (mapM go tys) + ; case mb_tys' of + Just tys' -> return (TyConApp tc tys') + -- Retain the synonym (the common case) + Nothing -> go (expectJust "checkTauTvUpdate" + (tcView (TyConApp tc tys))) + -- Try again, expanding the synonym + } +\end{code} + +Note [Type synonyms and the occur check] +~~~~~~~~~~~~~~~~~~~~ +Basically we want to update tv1 := ps_ty2 +because ps_ty2 has type-synonym info, which improves later error messages + +But consider + type A a = () + + f :: (A a -> a -> ()) -> () + f = \ _ -> () + + x :: () + x = f (\ x p -> p x) + +In the application (p x), we try to match "t" with "A t". If we go +ahead and bind t to A t (= ps_ty2), we'll lead the type checker into +an infinite loop later. +But we should not reject the program, because A t = (). +Rather, we should bind t to () (= non_var_ty2). + +\begin{code} +stripBoxyType :: BoxyType -> TcM TcType +-- Strip all boxes from the input type, returning a non-boxy type. +-- It's fine for there to be a polytype inside a box (c.f. unBox) +-- All of the boxes should have been filled in by now; +-- hence we return a TcType +stripBoxyType ty = zonkType strip_tv ty + where + strip_tv tv = ASSERT( not (isBoxyTyVar tv) ) return (TyVarTy tv) + -- strip_tv will be called for *Flexi* meta-tyvars + -- There should not be any Boxy ones; hence the ASSERT + +zapToMonotype :: BoxySigmaType -> TcM TcTauType +-- Subtle... we must zap the boxy res_ty +-- to kind * before using it to instantiate a LitInst +-- Calling unBox instead doesn't do the job, because the box +-- often has an openTypeKind, and we don't want to instantiate +-- with that type. 
+zapToMonotype res_ty + = do { res_tau <- newFlexiTyVarTy liftedTypeKind + ; boxyUnify res_tau res_ty + ; return res_tau } + +unBox :: BoxyType -> TcM TcType +-- unBox implements the judgement +-- |- s' ~ box(s) +-- with input s', and result s +-- +-- It remove all boxes from the input type, returning a non-boxy type. +-- A filled box in the type can only contain a monotype; unBox fails if not +-- The type can have empty boxes, which unBox fills with a monotype +-- +-- Compare this wth checkTauTvUpdate +-- +-- For once, it's safe to treat synonyms as opaque! + +unBox (NoteTy n ty) = do { ty' <- unBox ty; return (NoteTy n ty') } +unBox (TyConApp tc tys) = do { tys' <- mapM unBox tys; return (TyConApp tc tys') } +unBox (AppTy f a) = do { f' <- unBox f; a' <- unBox a; return (mkAppTy f' a') } +unBox (FunTy f a) = do { f' <- unBox f; a' <- unBox a; return (FunTy f' a') } +unBox (PredTy p) = do { p' <- unBoxPred p; return (PredTy p') } +unBox (ForAllTy tv ty) = ASSERT( isImmutableTyVar tv ) + do { ty' <- unBox ty; return (ForAllTy tv ty') } +unBox (TyVarTy tv) + | isTcTyVar tv -- It's a boxy type variable + , MetaTv BoxTv ref <- tcTyVarDetails tv -- NB: non-TcTyVars are possible + = do { cts <- readMutVar ref -- under nested quantifiers + ; case cts of + Indirect ty -> do { non_boxy_ty <- unBox ty + ; if isTauTy non_boxy_ty + then return non_boxy_ty + else notMonoType non_boxy_ty } + Flexi -> do { tau <- newFlexiTyVarTy (tyVarKind tv) + ; writeMutVar ref (Indirect tau) + ; return tau } + } + | otherwise -- Skolems, and meta-tau-variables + = return (TyVarTy tv) + +unBoxPred (ClassP cls tys) = do { tys' <- mapM unBox tys; return (ClassP cls tys') } +unBoxPred (IParam ip ty) = do { ty' <- unBox ty; return (IParam ip ty') } +\end{code} + + + +%************************************************************************ +%* * +\subsection[Unify-context]{Errors and contexts} +%* * +%************************************************************************ + +Errors +~~~~~~ + +\begin{code} +unifyCtxt act_ty exp_ty tidy_env + = do { act_ty' <- zonkTcType act_ty + ; exp_ty' <- zonkTcType exp_ty + ; let (env1, exp_ty'') = tidyOpenType tidy_env exp_ty' + (env2, act_ty'') = tidyOpenType env1 act_ty' + ; return (env2, mkExpectedActualMsg act_ty'' exp_ty'') } + +---------------- +mkExpectedActualMsg act_ty exp_ty + = nest 2 (vcat [ text "Expected type" <> colon <+> ppr exp_ty, + text "Inferred type" <> colon <+> ppr act_ty ]) + +---------------- +-- If an error happens we try to figure out whether the function +-- function has been given too many or too few arguments, and say so. 
+checkFunResCtxt fun actual_res_ty expected_res_ty tidy_env + = do { exp_ty' <- zonkTcType expected_res_ty + ; act_ty' <- zonkTcType actual_res_ty + ; let + (env1, exp_ty'') = tidyOpenType tidy_env exp_ty' + (env2, act_ty'') = tidyOpenType env1 act_ty' + (exp_args, _) = tcSplitFunTys exp_ty'' + (act_args, _) = tcSplitFunTys act_ty'' + + len_act_args = length act_args + len_exp_args = length exp_args + + message | len_exp_args < len_act_args = wrongArgsCtxt "too few" fun + | len_exp_args > len_act_args = wrongArgsCtxt "too many" fun + | otherwise = mkExpectedActualMsg act_ty'' exp_ty'' + ; return (env2, message) } + + where + wrongArgsCtxt too_many_or_few fun + = ptext SLIT("Probable cause:") <+> quotes (ppr fun) + <+> ptext SLIT("is applied to") <+> text too_many_or_few + <+> ptext SLIT("arguments") + +------------------ +unifyKindCtxt swapped tv1 ty2 tidy_env -- not swapped => tv1 expected, ty2 inferred + -- tv1 and ty2 are zonked already + = returnM msg + where + msg = (env2, ptext SLIT("When matching the kinds of") <+> + sep [quotes pp_expected <+> ptext SLIT("and"), quotes pp_actual]) + + (pp_expected, pp_actual) | swapped = (pp2, pp1) + | otherwise = (pp1, pp2) + (env1, tv1') = tidyOpenTyVar tidy_env tv1 + (env2, ty2') = tidyOpenType env1 ty2 + pp1 = ppr tv1' <+> dcolon <+> ppr (tyVarKind tv1) + pp2 = ppr ty2' <+> dcolon <+> ppr (typeKind ty2) + +unifyMisMatch outer swapped ty1 ty2 + = do { (env, msg) <- if swapped then misMatchMsg ty1 ty2 + else misMatchMsg ty2 ty1 + + -- This is the whole point of the 'outer' stuff + ; if outer then popErrCtxt (failWithTcM (env, msg)) + else failWithTcM (env, msg) + } + +misMatchMsg ty1 ty2 + = do { env0 <- tcInitTidyEnv + ; (env1, pp1, extra1) <- ppr_ty env0 ty1 + ; (env2, pp2, extra2) <- ppr_ty env1 ty2 + ; return (env2, sep [sep [ptext SLIT("Couldn't match expected type") <+> pp1, + nest 7 (ptext SLIT("against inferred type") <+> pp2)], + nest 2 extra1, nest 2 extra2]) } + +ppr_ty :: TidyEnv -> TcType -> TcM (TidyEnv, SDoc, SDoc) +ppr_ty env ty + = do { ty' <- zonkTcType ty + ; let (env1,tidy_ty) = tidyOpenType env ty' + simple_result = (env1, quotes (ppr tidy_ty), empty) + ; case tidy_ty of + TyVarTy tv + | isSkolemTyVar tv -> return (env2, pp_rigid tv', + pprSkolTvBinding tv') + | otherwise -> return simple_result + where + (env2, tv') = tidySkolemTyVar env1 tv + other -> return simple_result } + where + pp_rigid tv = quotes (ppr tv) <+> parens (ptext SLIT("a rigid variable")) + + +notMonoType ty + = do { ty' <- zonkTcType ty + ; env0 <- tcInitTidyEnv + ; let (env1, tidy_ty) = tidyOpenType env0 ty' + msg = ptext SLIT("Cannot match a monotype with") <+> ppr tidy_ty + ; failWithTcM (env1, msg) } + +occurCheck tyvar ty + = do { env0 <- tcInitTidyEnv + ; ty' <- zonkTcType ty + ; let (env1, tidy_tyvar) = tidyOpenTyVar env0 tyvar + (env2, tidy_ty) = tidyOpenType env1 ty' + extra = sep [ppr tidy_tyvar, char '=', ppr tidy_ty] + ; failWithTcM (env2, hang msg 2 extra) } + where + msg = ptext SLIT("Occurs check: cannot construct the infinite type:") +\end{code} + + +%************************************************************************ +%* * + Kind unification +%* * +%************************************************************************ + +Unifying kinds is much, much simpler than unifying types. 
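+
+As a small illustration of the two things that matter below (the sub-kind
+check, and the contravariant treatment of argument kinds), here is a toy
+sketch; the kind names are simplified stand-ins, not the real TcKind:
+
+    data K = Lifted | Arg | Fun K K  deriving Eq   -- roughly *, ??, k1 -> k2
+
+    subK :: K -> K -> Bool        -- k1 `subK` k2: k1 usable where k2 is expected
+    subK Lifted      Arg         = True                -- * is a sub-kind of ??
+    subK (Fun a1 r1) (Fun a2 r2) = subK a2 a1 && subK r1 r2
+                                                        -- argument kinds compare flipped
+    subK k1          k2          = k1 == k2
+
+For example, (Fun Arg Lifted) `subK` (Fun Lifted Lifted) holds: a function that
+accepts arguments of any argument kind can be used where one expecting '*'
+arguments is wanted.  The real unifyKind below makes the same flip when it
+recurses into FunKind.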
+ +\begin{code} +unifyKind :: TcKind -- Expected + -> TcKind -- Actual + -> TcM () +unifyKind LiftedTypeKind LiftedTypeKind = returnM () +unifyKind UnliftedTypeKind UnliftedTypeKind = returnM () + +unifyKind OpenTypeKind k2 | isOpenTypeKind k2 = returnM () +unifyKind ArgTypeKind k2 | isArgTypeKind k2 = returnM () + -- Respect sub-kinding + +unifyKind (FunKind a1 r1) (FunKind a2 r2) + = do { unifyKind a2 a1; unifyKind r1 r2 } + -- Notice the flip in the argument, + -- so that the sub-kinding works right + +unifyKind (KindVar kv1) k2 = uKVar False kv1 k2 +unifyKind k1 (KindVar kv2) = uKVar True kv2 k1 +unifyKind k1 k2 = unifyKindMisMatch k1 k2 + +unifyKinds :: [TcKind] -> [TcKind] -> TcM () +unifyKinds [] [] = returnM () +unifyKinds (k1:ks1) (k2:ks2) = unifyKind k1 k2 `thenM_` + unifyKinds ks1 ks2 +unifyKinds _ _ = panic "unifyKinds: length mis-match" + +---------------- +uKVar :: Bool -> KindVar -> TcKind -> TcM () +uKVar swapped kv1 k2 + = do { mb_k1 <- readKindVar kv1 + ; case mb_k1 of + Nothing -> uUnboundKVar swapped kv1 k2 + Just k1 | swapped -> unifyKind k2 k1 + | otherwise -> unifyKind k1 k2 } + +---------------- +uUnboundKVar :: Bool -> KindVar -> TcKind -> TcM () +uUnboundKVar swapped kv1 k2@(KindVar kv2) + | kv1 == kv2 = returnM () + | otherwise -- Distinct kind variables + = do { mb_k2 <- readKindVar kv2 + ; case mb_k2 of + Just k2 -> uUnboundKVar swapped kv1 k2 + Nothing -> writeKindVar kv1 k2 } + +uUnboundKVar swapped kv1 non_var_k2 + = do { k2' <- zonkTcKind non_var_k2 + ; kindOccurCheck kv1 k2' + ; k2'' <- kindSimpleKind swapped k2' + -- KindVars must be bound only to simple kinds + -- Polarities: (kindSimpleKind True ?) succeeds + -- returning *, corresponding to unifying + -- expected: ? + -- actual: kind-ver + ; writeKindVar kv1 k2'' } + +---------------- +kindOccurCheck kv1 k2 -- k2 is zonked + = checkTc (not_in k2) (kindOccurCheckErr kv1 k2) + where + not_in (KindVar kv2) = kv1 /= kv2 + not_in (FunKind a2 r2) = not_in a2 && not_in r2 + not_in other = True + +kindSimpleKind :: Bool -> Kind -> TcM SimpleKind +-- (kindSimpleKind True k) returns a simple kind sk such that sk <: k +-- If the flag is False, it requires k <: sk +-- E.g. kindSimpleKind False ?? = * +-- What about (kv -> *) :=: ?? 
-> * +kindSimpleKind orig_swapped orig_kind + = go orig_swapped orig_kind + where + go sw (FunKind k1 k2) = do { k1' <- go (not sw) k1 + ; k2' <- go sw k2 + ; return (FunKind k1' k2') } + go True OpenTypeKind = return liftedTypeKind + go True ArgTypeKind = return liftedTypeKind + go sw LiftedTypeKind = return liftedTypeKind + go sw k@(KindVar _) = return k -- KindVars are always simple + go swapped kind = failWithTc (ptext SLIT("Unexpected kind unification failure:") + <+> ppr orig_swapped <+> ppr orig_kind) + -- I think this can't actually happen + +-- T v = MkT v v must be a type +-- T v w = MkT (v -> w) v must not be an umboxed tuple + +---------------- +kindOccurCheckErr tyvar ty + = hang (ptext SLIT("Occurs check: cannot construct the infinite kind:")) + 2 (sep [ppr tyvar, char '=', ppr ty]) + +unifyKindMisMatch ty1 ty2 + = zonkTcKind ty1 `thenM` \ ty1' -> + zonkTcKind ty2 `thenM` \ ty2' -> + let + msg = hang (ptext SLIT("Couldn't match kind")) + 2 (sep [quotes (ppr ty1'), + ptext SLIT("against"), + quotes (ppr ty2')]) + in + failWithTc msg +\end{code} + +\begin{code} +unifyFunKind :: TcKind -> TcM (Maybe (TcKind, TcKind)) +-- Like unifyFunTy, but does not fail; instead just returns Nothing + +unifyFunKind (KindVar kvar) + = readKindVar kvar `thenM` \ maybe_kind -> + case maybe_kind of + Just fun_kind -> unifyFunKind fun_kind + Nothing -> do { arg_kind <- newKindVar + ; res_kind <- newKindVar + ; writeKindVar kvar (mkArrowKind arg_kind res_kind) + ; returnM (Just (arg_kind,res_kind)) } + +unifyFunKind (FunKind arg_kind res_kind) = returnM (Just (arg_kind,res_kind)) +unifyFunKind other = returnM Nothing +\end{code} + +%************************************************************************ +%* * + Checking kinds +%* * +%************************************************************************ + +--------------------------- +-- We would like to get a decent error message from +-- (a) Under-applied type constructors +-- f :: (Maybe, Maybe) +-- (b) Over-applied type constructors +-- f :: Int x -> Int x +-- + +\begin{code} +checkExpectedKind :: Outputable a => a -> TcKind -> TcKind -> TcM () +-- A fancy wrapper for 'unifyKind', which tries +-- to give decent error messages. +checkExpectedKind ty act_kind exp_kind + | act_kind `isSubKind` exp_kind -- Short cut for a very common case + = returnM () + | otherwise + = tryTc (unifyKind exp_kind act_kind) `thenM` \ (_errs, mb_r) -> + case mb_r of { + Just r -> returnM () ; -- Unification succeeded + Nothing -> + + -- So there's definitely an error + -- Now to find out what sort + zonkTcKind exp_kind `thenM` \ exp_kind -> + zonkTcKind act_kind `thenM` \ act_kind -> + + tcInitTidyEnv `thenM` \ env0 -> + let (exp_as, _) = splitKindFunTys exp_kind + (act_as, _) = splitKindFunTys act_kind + n_exp_as = length exp_as + n_act_as = length act_as + + (env1, tidy_exp_kind) = tidyKind env0 exp_kind + (env2, tidy_act_kind) = tidyKind env1 act_kind + + err | n_exp_as < n_act_as -- E.g. [Maybe] + = quotes (ppr ty) <+> ptext SLIT("is not applied to enough type arguments") + + -- Now n_exp_as >= n_act_as. In the next two cases, + -- n_exp_as == 0, and hence so is n_act_as + | isLiftedTypeKind exp_kind && isUnliftedTypeKind act_kind + = ptext SLIT("Expecting a lifted type, but") <+> quotes (ppr ty) + <+> ptext SLIT("is unlifted") + + | isUnliftedTypeKind exp_kind && isLiftedTypeKind act_kind + = ptext SLIT("Expecting an unlifted type, but") <+> quotes (ppr ty) + <+> ptext SLIT("is lifted") + + | otherwise -- E.g. 
Monad [Int] + = ptext SLIT("Kind mis-match") + + more_info = sep [ ptext SLIT("Expected kind") <+> + quotes (pprKind tidy_exp_kind) <> comma, + ptext SLIT("but") <+> quotes (ppr ty) <+> + ptext SLIT("has kind") <+> quotes (pprKind tidy_act_kind)] + in + failWithTcM (env2, err $$ more_info) + } +\end{code} + +%************************************************************************ +%* * +\subsection{Checking signature type variables} +%* * +%************************************************************************ + +@checkSigTyVars@ checks that a set of universally quantified type varaibles +are not mentioned in the environment. In particular: + + (a) Not mentioned in the type of a variable in the envt + eg the signature for f in this: + + g x = ... where + f :: a->[a] + f y = [x,y] + + Here, f is forced to be monorphic by the free occurence of x. + + (d) Not (unified with another type variable that is) in scope. + eg f x :: (r->r) = (\y->y) :: forall a. a->r + when checking the expression type signature, we find that + even though there is nothing in scope whose type mentions r, + nevertheless the type signature for the expression isn't right. + + Another example is in a class or instance declaration: + class C a where + op :: forall b. a -> b + op x = x + Here, b gets unified with a + +Before doing this, the substitution is applied to the signature type variable. + +\begin{code} +checkSigTyVars :: [TcTyVar] -> TcM () +checkSigTyVars sig_tvs = check_sig_tyvars emptyVarSet sig_tvs + +checkSigTyVarsWrt :: TcTyVarSet -> [TcTyVar] -> TcM () +-- The extra_tvs can include boxy type variables; +-- e.g. TcMatches.tcCheckExistentialPat +checkSigTyVarsWrt extra_tvs sig_tvs + = do { extra_tvs' <- zonkTcTyVarsAndFV (varSetElems extra_tvs) + ; check_sig_tyvars extra_tvs' sig_tvs } + +check_sig_tyvars + :: TcTyVarSet -- Global type variables. The universally quantified + -- tyvars should not mention any of these + -- Guaranteed already zonked. + -> [TcTyVar] -- Universally-quantified type variables in the signature + -- Guaranteed to be skolems + -> TcM () +check_sig_tyvars extra_tvs [] + = returnM () +check_sig_tyvars extra_tvs sig_tvs + = ASSERT( all isSkolemTyVar sig_tvs ) + do { gbl_tvs <- tcGetGlobalTyVars + ; traceTc (text "check_sig_tyvars" <+> (vcat [text "sig_tys" <+> ppr sig_tvs, + text "gbl_tvs" <+> ppr gbl_tvs, + text "extra_tvs" <+> ppr extra_tvs])) + + ; let env_tvs = gbl_tvs `unionVarSet` extra_tvs + ; ifM (any (`elemVarSet` env_tvs) sig_tvs) + (bleatEscapedTvs env_tvs sig_tvs sig_tvs) + } + +bleatEscapedTvs :: TcTyVarSet -- The global tvs + -> [TcTyVar] -- The possibly-escaping type variables + -> [TcTyVar] -- The zonked versions thereof + -> TcM () +-- Complain about escaping type variables +-- We pass a list of type variables, at least one of which +-- escapes. 
The first list contains the original signature type variable, +-- while the second contains the type variable it is unified to (usually itself) +bleatEscapedTvs globals sig_tvs zonked_tvs + = do { env0 <- tcInitTidyEnv + ; let (env1, tidy_tvs) = tidyOpenTyVars env0 sig_tvs + (env2, tidy_zonked_tvs) = tidyOpenTyVars env1 zonked_tvs + + ; (env3, msgs) <- foldlM check (env2, []) (tidy_tvs `zip` tidy_zonked_tvs) + ; failWithTcM (env3, main_msg $$ nest 2 (vcat msgs)) } + where + main_msg = ptext SLIT("Inferred type is less polymorphic than expected") + + check (tidy_env, msgs) (sig_tv, zonked_tv) + | not (zonked_tv `elemVarSet` globals) = return (tidy_env, msgs) + | otherwise + = do { (tidy_env1, globs) <- findGlobals (unitVarSet zonked_tv) tidy_env + ; returnM (tidy_env1, escape_msg sig_tv zonked_tv globs : msgs) } + +----------------------- +escape_msg sig_tv zonked_tv globs + | notNull globs + = vcat [sep [msg, ptext SLIT("is mentioned in the environment:")], + nest 2 (vcat globs)] + | otherwise + = msg <+> ptext SLIT("escapes") + -- Sigh. It's really hard to give a good error message + -- all the time. One bad case is an existential pattern match. + -- We rely on the "When..." context to help. + where + msg = ptext SLIT("Quantified type variable") <+> quotes (ppr sig_tv) <+> is_bound_to + is_bound_to + | sig_tv == zonked_tv = empty + | otherwise = ptext SLIT("is unified with") <+> quotes (ppr zonked_tv) <+> ptext SLIT("which") +\end{code} + +These two context are used with checkSigTyVars + +\begin{code} +sigCtxt :: Id -> [TcTyVar] -> TcThetaType -> TcTauType + -> TidyEnv -> TcM (TidyEnv, Message) +sigCtxt id sig_tvs sig_theta sig_tau tidy_env + = zonkTcType sig_tau `thenM` \ actual_tau -> + let + (env1, tidy_sig_tvs) = tidyOpenTyVars tidy_env sig_tvs + (env2, tidy_sig_rho) = tidyOpenType env1 (mkPhiTy sig_theta sig_tau) + (env3, tidy_actual_tau) = tidyOpenType env2 actual_tau + sub_msg = vcat [ptext SLIT("Signature type: ") <+> pprType (mkForAllTys tidy_sig_tvs tidy_sig_rho), + ptext SLIT("Type to generalise:") <+> pprType tidy_actual_tau + ] + msg = vcat [ptext SLIT("When trying to generalise the type inferred for") <+> quotes (ppr id), + nest 2 sub_msg] + in + returnM (env3, msg) +\end{code} diff --git a/compiler/typecheck/TcUnify.lhs-boot b/compiler/typecheck/TcUnify.lhs-boot new file mode 100644 index 0000000000..8a1847e671 --- /dev/null +++ b/compiler/typecheck/TcUnify.lhs-boot @@ -0,0 +1,11 @@ +\begin{code} +module TcUnify where +import TcType ( TcTauType, BoxyType ) +import TcRnTypes( TcM ) + +-- This boot file exists only to tie the knot between +-- TcUnify and TcSimplify + +unifyType :: TcTauType -> TcTauType -> TcM () +zapToMonotype :: BoxyType -> TcM TcTauType +\end{code} diff --git a/compiler/types/Class.lhs b/compiler/types/Class.lhs new file mode 100644 index 0000000000..016ce1bfbe --- /dev/null +++ b/compiler/types/Class.lhs @@ -0,0 +1,164 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[Class]{The @Class@ datatype} + +\begin{code} +module Class ( + Class, ClassOpItem, FunDep, + DefMeth (..), + + mkClass, classTyVars, classArity, + classKey, className, classSelIds, classTyCon, classMethods, + classBigSig, classExtraBigSig, classTvsFds, classSCTheta + ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} TyCon ( TyCon ) +import {-# SOURCE #-} TypeRep ( PredType ) + +import Var ( Id, TyVar ) +import Name ( NamedThing(..), Name ) +import BasicTypes ( Arity ) +import Unique ( Unique, Uniquable(..) 
) +import Outputable +\end{code} + +%************************************************************************ +%* * +\subsection[Class-basic]{@Class@: basic definition} +%* * +%************************************************************************ + +A @Class@ corresponds to a Greek kappa in the static semantics: + +\begin{code} +data Class + = Class { + classKey :: Unique, -- Key for fast comparison + className :: Name, + + classTyVars :: [TyVar], -- The class type variables + classFunDeps :: [FunDep TyVar], -- The functional dependencies + + classSCTheta :: [PredType], -- Immediate superclasses, and the + classSCSels :: [Id], -- corresponding selector functions to + -- extract them from a dictionary of this + -- class + + classOpStuff :: [ClassOpItem], -- Ordered by tag + + classTyCon :: TyCon -- The data type constructor for dictionaries + } -- of this class + +type FunDep a = ([a],[a]) -- e.g. class C a b c | a b -> c, a c -> b where ... + -- Here fun-deps are [([a,b],[c]), ([a,c],[b])] + +type ClassOpItem = (Id, DefMeth) + -- Selector function; contains unfolding + -- Default-method info + +data DefMeth = NoDefMeth -- No default method + | DefMeth -- A polymorphic default method + | GenDefMeth -- A generic default method + deriving Eq +\end{code} + +The @mkClass@ function fills in the indirect superclasses. + +\begin{code} +mkClass :: Name -> [TyVar] + -> [([TyVar], [TyVar])] + -> [PredType] -> [Id] + -> [ClassOpItem] + -> TyCon + -> Class + +mkClass name tyvars fds super_classes superdict_sels + op_stuff tycon + = Class { classKey = getUnique name, + className = name, + classTyVars = tyvars, + classFunDeps = fds, + classSCTheta = super_classes, + classSCSels = superdict_sels, + classOpStuff = op_stuff, + classTyCon = tycon } +\end{code} + +%************************************************************************ +%* * +\subsection[Class-selectors]{@Class@: simple selectors} +%* * +%************************************************************************ + +The rest of these functions are just simple selectors. + +\begin{code} +classArity :: Class -> Arity +classArity clas = length (classTyVars clas) + -- Could memoise this + +classSelIds :: Class -> [Id] +classSelIds c@(Class {classSCSels = sc_sels}) + = sc_sels ++ classMethods c + +classMethods :: Class -> [Id] +classMethods (Class {classOpStuff = op_stuff}) + = [op_sel | (op_sel, _) <- op_stuff] + +classTvsFds c + = (classTyVars c, classFunDeps c) + +classBigSig (Class {classTyVars = tyvars, classSCTheta = sc_theta, + classSCSels = sc_sels, classOpStuff = op_stuff}) + = (tyvars, sc_theta, sc_sels, op_stuff) +classExtraBigSig (Class {classTyVars = tyvars, classFunDeps = fundeps, + classSCTheta = sc_theta, classSCSels = sc_sels, + classOpStuff = op_stuff}) + = (tyvars, fundeps, sc_theta, sc_sels, op_stuff) +\end{code} + + +%************************************************************************ +%* * +\subsection[Class-instances]{Instance declarations for @Class@} +%* * +%************************************************************************ + +We compare @Classes@ by their keys (which include @Uniques@). 
+ +\begin{code} +instance Eq Class where + c1 == c2 = classKey c1 == classKey c2 + c1 /= c2 = classKey c1 /= classKey c2 + +instance Ord Class where + c1 <= c2 = classKey c1 <= classKey c2 + c1 < c2 = classKey c1 < classKey c2 + c1 >= c2 = classKey c1 >= classKey c2 + c1 > c2 = classKey c1 > classKey c2 + compare c1 c2 = classKey c1 `compare` classKey c2 +\end{code} + +\begin{code} +instance Uniquable Class where + getUnique c = classKey c + +instance NamedThing Class where + getName clas = className clas + +instance Outputable Class where + ppr c = ppr (getName c) + +instance Show Class where + showsPrec p c = showsPrecSDoc p (ppr c) + +instance Outputable DefMeth where + ppr DefMeth = text "{- has default method -}" + ppr GenDefMeth = text "{- has generic method -}" + ppr NoDefMeth = empty -- No default method +\end{code} + + diff --git a/compiler/types/FunDeps.lhs b/compiler/types/FunDeps.lhs new file mode 100644 index 0000000000..9347f5f665 --- /dev/null +++ b/compiler/types/FunDeps.lhs @@ -0,0 +1,500 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 2000 +% +\section[FunDeps]{FunDeps - functional dependencies} + +It's better to read it as: "if we know these, then we're going to know these" + +\begin{code} +module FunDeps ( + Equation, pprEquation, + oclose, grow, improve, + checkInstCoverage, checkFunDeps, + pprFundeps + ) where + +#include "HsVersions.h" + +import Name ( Name, getSrcLoc ) +import Var ( TyVar ) +import Class ( Class, FunDep, classTvsFds ) +import Unify ( tcUnifyTys, BindFlag(..) ) +import Type ( substTys, notElemTvSubst ) +import TcType ( Type, PredType(..), tcEqType, + predTyUnique, mkClassPred, tyVarsOfTypes, tyVarsOfPred ) +import InstEnv ( Instance(..), InstEnv, instanceHead, classInstances, + instanceCantMatch, roughMatchTcs ) +import VarSet +import VarEnv +import Outputable +import Util ( notNull ) +import List ( tails ) +import Maybe ( isJust ) +import ListSetOps ( equivClassesByUniq ) +\end{code} + + +%************************************************************************ +%* * +\subsection{Close type variables} +%* * +%************************************************************************ + +(oclose preds tvs) closes the set of type variables tvs, +wrt functional dependencies in preds. The result is a superset +of the argument set. For example, if we have + class C a b | a->b where ... +then + oclose [C (x,y) z, C (x,p) q] {x,y} = {x,y,z} +because if we know x and y then that fixes z. + +Using oclose +~~~~~~~~~~~~ +oclose is used + +a) When determining ambiguity. The type + forall a,b. C a b => a +is not ambiguous (given the above class decl for C) because +a determines b. + +b) When generalising a type T. Usually we take FV(T) \ FV(Env), +but in fact we need + FV(T) \ (FV(Env)+) +where the '+' is the oclosure operation. Notice that we do not +take FV(T)+. This puzzled me for a bit. Consider + + f = E + +and suppose e have that E :: C a b => a, and suppose that b is +free in the environment. Then we quantify over 'a' only, giving +the type forall a. C a b => a. Since a->b but we don't have b->a, +we might have instance decls like + instance C Bool Int where ... + instance C Char Int where ... +so knowing that b=Int doesn't fix 'a'; so we quantify over it. + + --------------- + A WORRY: ToDo! + --------------- +If we have class C a b => D a b where .... + class D a b | a -> b where ... +and the preds are [C (x,y) z], then we want to see the fd in D, +even though it is not explicit in C, giving [({x,y},{z})] + +Similarly for instance decls? E.g. 
Suppose we have + instance C a b => Eq (T a b) where ... +and we infer a type t with constraints Eq (T a b) for a particular +expression, and suppose that 'a' is free in the environment. +We could generalise to + forall b. Eq (T a b) => t +but if we reduced the constraint, to C a b, we'd see that 'a' determines +b, so that a better type might be + t (with free constraint C a b) +Perhaps it doesn't matter, because we'll still force b to be a +particular type at the call sites. Generalising over too many +variables (provided we don't shadow anything by quantifying over a +variable that is actually free in the envt) may postpone errors; it +won't hide them altogether. + + +\begin{code} +oclose :: [PredType] -> TyVarSet -> TyVarSet +oclose preds fixed_tvs + | null tv_fds = fixed_tvs -- Fast escape hatch for common case + | otherwise = loop fixed_tvs + where + loop fixed_tvs + | new_fixed_tvs `subVarSet` fixed_tvs = fixed_tvs + | otherwise = loop new_fixed_tvs + where + new_fixed_tvs = foldl extend fixed_tvs tv_fds + + extend fixed_tvs (ls,rs) | ls `subVarSet` fixed_tvs = fixed_tvs `unionVarSet` rs + | otherwise = fixed_tvs + + tv_fds :: [(TyVarSet,TyVarSet)] + -- In our example, tv_fds will be [ ({x,y}, {z}), ({x,p},{q}) ] + -- Meaning "knowing x,y fixes z, knowing x,p fixes q" + tv_fds = [ (tyVarsOfTypes xs, tyVarsOfTypes ys) + | ClassP cls tys <- preds, -- Ignore implicit params + let (cls_tvs, cls_fds) = classTvsFds cls, + fd <- cls_fds, + let (xs,ys) = instFD fd cls_tvs tys + ] +\end{code} + +\begin{code} +grow :: [PredType] -> TyVarSet -> TyVarSet +grow preds fixed_tvs + | null preds = fixed_tvs + | otherwise = loop fixed_tvs + where + loop fixed_tvs + | new_fixed_tvs `subVarSet` fixed_tvs = fixed_tvs + | otherwise = loop new_fixed_tvs + where + new_fixed_tvs = foldl extend fixed_tvs pred_sets + + extend fixed_tvs pred_tvs + | fixed_tvs `intersectsVarSet` pred_tvs = fixed_tvs `unionVarSet` pred_tvs + | otherwise = fixed_tvs + + pred_sets = [tyVarsOfPred pred | pred <- preds] +\end{code} + +%************************************************************************ +%* * +\subsection{Generate equations from functional dependencies} +%* * +%************************************************************************ + + +\begin{code} +---------- +type Equation = (TyVarSet, [(Type, Type)]) +-- These pairs of types should be equal, for some +-- substitution of the tyvars in the tyvar set +-- INVARIANT: corresponding types aren't already equal + +-- It's important that we have a *list* of pairs of types. Consider +-- class C a b c | a -> b c where ... +-- instance C Int x x where ... +-- Then, given the constraint (C Int Bool v) we should improve v to Bool, +-- via the equation ({x}, [(Bool,x), (v,x)]) +-- This would not happen if the class had looked like +-- class C a b c | a -> b, a -> c + +-- To "execute" the equation, make fresh type variable for each tyvar in the set, +-- instantiate the two types with these fresh variables, and then unify. +-- +-- For example, ({a,b}, (a,Int,b), (Int,z,Bool)) +-- We unify z with Int, but since a and b are quantified we do nothing to them +-- We usually act on an equation by instantiating the quantified type varaibles +-- to fresh type variables, and then calling the standard unifier. 
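+--
+-- A further minimal sketch (illustrative only; this class and instance are
+-- made up, not taken from real code): given
+--     class C a b | a -> b
+--     instance C Int Bool
+-- and the constraint (C Int z), checkClsFD (below) yields the equation
+--     ({}, [(Bool, z)])
+-- Executing that equation simply unifies z with Bool, which is exactly the
+-- improvement the caller is after.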
+ +pprEquation (qtvs, pairs) + = vcat [ptext SLIT("forall") <+> braces (pprWithCommas ppr (varSetElems qtvs)), + nest 2 (vcat [ ppr t1 <+> ptext SLIT(":=:") <+> ppr t2 | (t1,t2) <- pairs])] + +---------- +type Pred_Loc = (PredType, SDoc) -- SDoc says where the Pred comes from + +improve :: (Class -> [Instance]) -- Gives instances for given class + -> [Pred_Loc] -- Current constraints; + -> [(Equation,Pred_Loc,Pred_Loc)] -- Derived equalities that must also hold + -- (NB the above INVARIANT for type Equation) + -- The Pred_Locs explain which two predicates were + -- combined (for error messages) +\end{code} + +Given a bunch of predicates that must hold, such as + + C Int t1, C Int t2, C Bool t3, ?x::t4, ?x::t5 + +improve figures out what extra equations must hold. +For example, if we have + + class C a b | a->b where ... + +then improve will return + + [(t1,t2), (t4,t5)] + +NOTA BENE: + + * improve does not iterate. It's possible that when we make + t1=t2, for example, that will in turn trigger a new equation. + This would happen if we also had + C t1 t7, C t2 t8 + If t1=t2, we also get t7=t8. + + improve does *not* do this extra step. It relies on the caller + doing so. + + * The equations unify types that are not already equal. So there + is no effect iff the result of improve is empty + + + +\begin{code} +improve inst_env preds + = [ eqn | group <- equivClassesByUniq (predTyUnique . fst) preds, + eqn <- checkGroup inst_env group ] + +---------- +checkGroup :: (Class -> [Instance]) + -> [Pred_Loc] + -> [(Equation, Pred_Loc, Pred_Loc)] + -- The preds are all for the same class or implicit param + +checkGroup inst_env (p1@(IParam _ ty, _) : ips) + = -- For implicit parameters, all the types must match + [ ((emptyVarSet, [(ty,ty')]), p1, p2) + | p2@(IParam _ ty', _) <- ips, not (ty `tcEqType` ty')] + +checkGroup inst_env clss@((ClassP cls _, _) : _) + = -- For classes life is more complicated + -- Suppose the class is like + -- classs C as | (l1 -> r1), (l2 -> r2), ... where ... + -- Then FOR EACH PAIR (ClassP c tys1, ClassP c tys2) in the list clss + -- we check whether + -- U l1[tys1/as] = U l2[tys2/as] + -- (where U is a unifier) + -- + -- If so, we return the pair + -- U r1[tys1/as] = U l2[tys2/as] + -- + -- We need to do something very similar comparing each predicate + -- with relevant instance decls + + instance_eqns ++ pairwise_eqns + -- NB: we put the instance equations first. This biases the + -- order so that we first improve individual constraints against the + -- instances (which are perhaps in a library and less likely to be + -- wrong; and THEN perform the pairwise checks. + -- The other way round, it's possible for the pairwise check to succeed + -- and cause a subsequent, misleading failure of one of the pair with an + -- instance declaration. See tcfail143.hs for an exmample + + where + (cls_tvs, cls_fds) = classTvsFds cls + instances = inst_env cls + + -- NOTE that we iterate over the fds first; they are typically + -- empty, which aborts the rest of the loop. + pairwise_eqns :: [(Equation,Pred_Loc,Pred_Loc)] + pairwise_eqns -- This group comes from pairwise comparison + = [ (eqn, p1, p2) + | fd <- cls_fds, + p1@(ClassP _ tys1, _) : rest <- tails clss, + p2@(ClassP _ tys2, _) <- rest, + eqn <- checkClsFD emptyVarSet fd cls_tvs tys1 tys2 + ] + + instance_eqns :: [(Equation,Pred_Loc,Pred_Loc)] + instance_eqns -- This group comes from comparing with instance decls + = [ (eqn, p1, p2) + | fd <- cls_fds, -- Iterate through the fundeps first, + -- because there often are none! 
+ p2@(ClassP _ tys2, _) <- clss, + let rough_tcs2 = trimRoughMatchTcs cls_tvs fd (roughMatchTcs tys2), + ispec@(Instance { is_tvs = qtvs, is_tys = tys1, + is_tcs = mb_tcs1 }) <- instances, + not (instanceCantMatch mb_tcs1 rough_tcs2), + eqn <- checkClsFD qtvs fd cls_tvs tys1 tys2, + let p1 = (mkClassPred cls tys1, + ptext SLIT("arising from the instance declaration at") <+> + ppr (getSrcLoc ispec)) + ] +---------- +checkClsFD :: TyVarSet -- Quantified type variables; see note below + -> FunDep TyVar -> [TyVar] -- One functional dependency from the class + -> [Type] -> [Type] + -> [Equation] + +checkClsFD qtvs fd clas_tvs tys1 tys2 +-- 'qtvs' are the quantified type variables, the ones which an be instantiated +-- to make the types match. For example, given +-- class C a b | a->b where ... +-- instance C (Maybe x) (Tree x) where .. +-- +-- and an Inst of form (C (Maybe t1) t2), +-- then we will call checkClsFD with +-- +-- qtvs = {x}, tys1 = [Maybe x, Tree x] +-- tys2 = [Maybe t1, t2] +-- +-- We can instantiate x to t1, and then we want to force +-- (Tree x) [t1/x] :=: t2 +-- +-- This function is also used when matching two Insts (rather than an Inst +-- against an instance decl. In that case, qtvs is empty, and we are doing +-- an equality check +-- +-- This function is also used by InstEnv.badFunDeps, which needs to *unify* +-- For the one-sided matching case, the qtvs are just from the template, +-- so we get matching +-- + = ASSERT2( length tys1 == length tys2 && + length tys1 == length clas_tvs + , ppr tys1 <+> ppr tys2 ) + + case tcUnifyTys bind_fn ls1 ls2 of + Nothing -> [] + Just subst | isJust (tcUnifyTys bind_fn rs1' rs2') + -- Don't include any equations that already hold. + -- Reason: then we know if any actual improvement has happened, + -- in which case we need to iterate the solver + -- In making this check we must taking account of the fact that any + -- qtvs that aren't already instantiated can be instantiated to anything + -- at all + -> [] + + | otherwise -- Aha! A useful equation + -> [ (qtvs', zip rs1' rs2')] + -- We could avoid this substTy stuff by producing the eqn + -- (qtvs, ls1++rs1, ls2++rs2) + -- which will re-do the ls1/ls2 unification when the equation is + -- executed. What we're doing instead is recording the partial + -- work of the ls1/ls2 unification leaving a smaller unification problem + where + rs1' = substTys subst rs1 + rs2' = substTys subst rs2 + qtvs' = filterVarSet (`notElemTvSubst` subst) qtvs + -- qtvs' are the quantified type variables + -- that have not been substituted out + -- + -- Eg. 
class C a b | a -> b + -- instance C Int [y] + -- Given constraint C Int z + -- we generate the equation + -- ({y}, [y], z) + where + bind_fn tv | tv `elemVarSet` qtvs = BindMe + | otherwise = Skolem + + (ls1, rs1) = instFD fd clas_tvs tys1 + (ls2, rs2) = instFD fd clas_tvs tys2 + +instFD :: FunDep TyVar -> [TyVar] -> [Type] -> FunDep Type +instFD (ls,rs) tvs tys + = (map lookup ls, map lookup rs) + where + env = zipVarEnv tvs tys + lookup tv = lookupVarEnv_NF env tv +\end{code} + +\begin{code} +checkInstCoverage :: Class -> [Type] -> Bool +-- Check that the Coverage Condition is obeyed in an instance decl +-- For example, if we have +-- class theta => C a b | a -> b +-- instance C t1 t2 +-- Then we require fv(t2) `subset` fv(t1) +-- See Note [Coverage Condition] below + +checkInstCoverage clas inst_taus + = all fundep_ok fds + where + (tyvars, fds) = classTvsFds clas + fundep_ok fd = tyVarsOfTypes rs `subVarSet` tyVarsOfTypes ls + where + (ls,rs) = instFD fd tyvars inst_taus +\end{code} + +Note [Coverage condition] +~~~~~~~~~~~~~~~~~~~~~~~~~ +For the coverage condition, we used to require only that + fv(t2) `subset` oclose(fv(t1), theta) + +Example: + class Mul a b c | a b -> c where + (.*.) :: a -> b -> c + + instance Mul Int Int Int where (.*.) = (*) + instance Mul Int Float Float where x .*. y = fromIntegral x * y + instance Mul a b c => Mul a [b] [c] where x .*. v = map (x.*.) v + +In the third instance, it's not the case that fv([c]) `subset` fv(a,[b]). +But it is the case that fv([c]) `subset` oclose( theta, fv(a,[b]) ) + +But it is a mistake to accept the instance because then this defn: + f = \ b x y -> if b then x .*. [y] else y +makes instance inference go into a loop, because it requires the constraint + Mul a [b] b + + +%************************************************************************ +%* * + Check that a new instance decl is OK wrt fundeps +%* * +%************************************************************************ + +Here is the bad case: + class C a b | a->b where ... + instance C Int Bool where ... + instance C Int Char where ... + +The point is that a->b, so Int in the first parameter must uniquely +determine the second. In general, given the same class decl, and given + + instance C s1 s2 where ... + instance C t1 t2 where ... + +Then the criterion is: if U=unify(s1,t1) then U(s2) = U(t2). + +Matters are a little more complicated if there are free variables in +the s2/t2. + + class D a b c | a -> b + instance D a b => D [(a,a)] [b] Int + instance D a b => D [a] [b] Bool + +The instance decls don't overlap, because the third parameter keeps +them separate. 
But we want to make sure that given any constraint + D s1 s2 s3 +if s1 matches + + +\begin{code} +checkFunDeps :: (InstEnv, InstEnv) -> Instance + -> Maybe [Instance] -- Nothing <=> ok + -- Just dfs <=> conflict with dfs +-- Check wheher adding DFunId would break functional-dependency constraints +-- Used only for instance decls defined in the module being compiled +checkFunDeps inst_envs ispec + | null bad_fundeps = Nothing + | otherwise = Just bad_fundeps + where + (ins_tvs, _, clas, ins_tys) = instanceHead ispec + ins_tv_set = mkVarSet ins_tvs + cls_inst_env = classInstances inst_envs clas + bad_fundeps = badFunDeps cls_inst_env clas ins_tv_set ins_tys + +badFunDeps :: [Instance] -> Class + -> TyVarSet -> [Type] -- Proposed new instance type + -> [Instance] +badFunDeps cls_insts clas ins_tv_set ins_tys + = [ ispec | fd <- fds, -- fds is often empty + let trimmed_tcs = trimRoughMatchTcs clas_tvs fd rough_tcs, + ispec@(Instance { is_tcs = mb_tcs, is_tvs = tvs, + is_tys = tys }) <- cls_insts, + -- Filter out ones that can't possibly match, + -- based on the head of the fundep + not (instanceCantMatch trimmed_tcs mb_tcs), + notNull (checkClsFD (tvs `unionVarSet` ins_tv_set) + fd clas_tvs tys ins_tys) + ] + where + (clas_tvs, fds) = classTvsFds clas + rough_tcs = roughMatchTcs ins_tys + +trimRoughMatchTcs :: [TyVar] -> FunDep TyVar -> [Maybe Name] -> [Maybe Name] +-- Computing rough_tcs for a particular fundep +-- class C a b c | a c -> b where ... +-- For each instance .... => C ta tb tc +-- we want to match only on the types ta, tb; so our +-- rough-match thing must similarly be filtered. +-- Hence, we Nothing-ise the tb type right here +trimRoughMatchTcs clas_tvs (ltvs,_) mb_tcs + = zipWith select clas_tvs mb_tcs + where + select clas_tv mb_tc | clas_tv `elem` ltvs = mb_tc + | otherwise = Nothing +\end{code} + + +%************************************************************************ +%* * +\subsection{Miscellaneous} +%* * +%************************************************************************ + +\begin{code} +pprFundeps :: Outputable a => [FunDep a] -> SDoc +pprFundeps [] = empty +pprFundeps fds = hsep (ptext SLIT("|") : punctuate comma (map ppr_fd fds)) + +ppr_fd (us, vs) = hsep [interppSP us, ptext SLIT("->"), interppSP vs] +\end{code} + diff --git a/compiler/types/Generics.lhs b/compiler/types/Generics.lhs new file mode 100644 index 0000000000..2c973649cf --- /dev/null +++ b/compiler/types/Generics.lhs @@ -0,0 +1,546 @@ +\begin{code} +module Generics ( canDoGenerics, mkTyConGenericBinds, + mkGenericRhs, + validGenericInstanceType, validGenericMethodType + ) where + + +import HsSyn +import Type ( Type, isUnLiftedType, tyVarsOfType, tyVarsOfTypes, + isTyVarTy, getTyVar_maybe, funTyCon + ) +import TcHsSyn ( mkSimpleHsAlt ) +import TcType ( tcSplitTyConApp_maybe, tcSplitSigmaTy, tcSplitPhiTy, applyTy, + isTauTy, mkTyVarTy ) +import DataCon ( DataCon, dataConOrigArgTys, isVanillaDataCon, + dataConSourceArity ) + +import TyCon ( TyCon, tyConName, tyConDataCons, + isBoxedTupleTyCon + ) +import Name ( nameModule, nameOccName, getSrcLoc ) +import OccName ( mkGenOcc1, mkGenOcc2 ) +import RdrName ( RdrName, getRdrName, mkVarUnqual, mkOrig ) +import BasicTypes ( EP(..), Boxity(..) ) +import Var ( TyVar ) +import VarSet ( varSetElems ) +import Id ( Id, idType ) +import TysWiredIn ( listTyCon ) +import PrelNames + +import SrcLoc ( srcLocSpan, noLoc, Located(..) 
) +import Util ( takeList, isSingleton ) +import Bag +import Outputable +import FastString + +#include "HsVersions.h" +\end{code} + +Roadmap of what's where in the Generics work. +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Parser +No real checks. + +RnSource.rnHsType + Checks that HsNumTy has a "1" in it. + +TcInstDcls.mkGenericInstance: + Checks for invalid type patterns, such as f {| Int |} + +TcClassDcl.tcClassSig + Checks for a method type that is too complicated; + e.g. has for-alls or lists in it + We could lift this restriction + +TcClassDecl.mkDefMethRhs + Checks that the instance type is simple, in an instance decl + where we let the compiler fill in a generic method. + e.g. instance C (T Int) + is not valid if C has generic methods. + +TcClassDecl.checkGenericClassIsUnary + Checks that we don't have generic methods in a multi-parameter class + +TcClassDecl.checkDefaultBinds + Checks that all the equations for a method in a class decl + are generic, or all are non-generic + + + +Checking that the type constructors which are present in Generic +patterns (not Unit, this is done differently) is done in mk_inst_info +(TcInstDecls) in a call to tcHsType (TcMonoBinds). This means that +HsOpTy is tied to Generic definitions which is not a very good design +feature, indeed a bug. However, the check is easy to move from +tcHsType back to mk_inst_info and everything will be fine. Also see +bug #5. [I don't think that this is the case anymore after SPJ's latest +changes in that regard. Delete this comment? -=chak/7Jun2] + +Generics.lhs + +Making generic information to put into a tycon. Constructs the +representation type, which, I think, are not used later. Perhaps it is +worth removing them from the GI datatype. Although it does get used in +the construction of conversion functions (internally). + +TyCon.lhs + +Just stores generic information, accessible by tyConGenInfo or tyConGenIds. + +TysWiredIn.lhs + +Defines generic and other type and data constructors. + +This is sadly incomplete, but will be added to. + + +Bugs & shortcomings of existing implementation: +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +2. Another pretty big bug I dscovered at the last minute when I was +testing the code is that at the moment the type variable of the class +is scoped over the entire declaration, including the patterns. For +instance, if I have the following code, + +class Er a where + ... + er {| Plus a b |} (Inl x) (Inl y) = er x y + er {| Plus a b |} (Inr x) (Inr y) = er x y + er {| Plus a b |} _ _ = False + +and I print out the types of the generic patterns, I get the +following. Note that all the variable names for "a" are the same, +while for "b" they are all different. + +check_ty + [std.Generics.Plus{-33u,i-} a{-r6Z-} b{-r7g-}, + std.Generics.Plus{-33u,i-} a{-r6Z-} b{-r7m-}, + std.Generics.Plus{-33u,i-} a{-r6Z-} b{-r7p-}] + +This is a bug as if I change the code to + + er {| Plus c b |} (Inl x) (Inl y) = er x y + +all the names come out to be different. + +Thus, all the types (Plus a b) come out to be different, so I cannot +compare them and test whether they are all the same and thus cannot +return an error if the type variables are different. + +Temporary fix/hack. I am not checking for this, I just assume they are +the same, see line "check_ty = True" in TcInstDecls. When we resolve +the issue with variables, though - I assume that we will make them to +be the same in all the type patterns, jus uncomment the check and +everything should work smoothly. 
+ +Hence, I have also left the rather silly construction of: +* extracting all the type variables from all the types +* putting them *all* into the environment +* typechecking all the types +* selecting one of them and using it as the instance_ty. + +(the alternative is to make sure that all the types are the same, +taking one, extracting its variables, putting them into the environment, +type checking it, using it as the instance_ty) + +6. What happens if we do not supply all of the generic patterns? At +the moment, the compiler crashes with an error message "Non-exhaustive +patterns in a generic declaration" + + +What has not been addressed: +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Contexts. In the generated instance declarations for the 3 primitive +type constructors, we need contexts. It is unclear what those should +be. At the moment, we always say eg. (Eq a, Eq b) => Eq (Plus a b) + +Type application. We have type application in expressions +(essentially) on the lhs of an equation. Do we want to allow it on the +RHS? + +Scoping of type variables in a generic definition. At the moment, (see +TcInstDecls) we extract the type variables inside the type patterns +and add them to the environment. See my bug #2 above. This seems pretty +important. + + + +%************************************************************************ +%* * +\subsection{Getting the representation type out} +%* * +%************************************************************************ + +\begin{code} +validGenericInstanceType :: Type -> Bool + -- Checks for validity of the type pattern in a generic + -- declaration. It's ok to have + -- f {| a + b |} ... + -- but it's not OK to have + -- f {| a + Int |} + +validGenericInstanceType inst_ty + = case tcSplitTyConApp_maybe inst_ty of + Just (tycon, tys) -> all isTyVarTy tys && tyConName tycon `elem` genericTyConNames + Nothing -> False + +validGenericMethodType :: Type -> Bool + -- At the moment we only allow method types built from + -- * type variables + -- * function arrow + -- * boxed tuples + -- * lists + -- * an arbitrary type not involving the class type variables + -- e.g. this is ok: forall b. Ord b => [b] -> a + -- where a is the class variable +validGenericMethodType ty + = valid tau + where + (local_tvs, _, tau) = tcSplitSigmaTy ty + + valid ty + | isTyVarTy ty = True + | no_tyvars_in_ty = True + | otherwise = case tcSplitTyConApp_maybe ty of + Just (tc,tys) -> valid_tycon tc && all valid tys + Nothing -> False + where + no_tyvars_in_ty = all (`elem` local_tvs) (varSetElems (tyVarsOfType ty)) + + valid_tycon tc = tc == funTyCon || tc == listTyCon || isBoxedTupleTyCon tc + -- Compare bimapApp, below +\end{code} + + +%************************************************************************ +%* * +\subsection{Generating representation types} +%* * +%************************************************************************ + +\begin{code} +canDoGenerics :: [DataCon] -> Bool +-- Called on source-code data types, to see if we should generate +-- generic functions for them. (This info is recorded in the interface file for +-- imported data types.) 
+ +canDoGenerics data_cons + = not (any bad_con data_cons) -- See comment below + && not (null data_cons) -- No values of the type + where + bad_con dc = any bad_arg_type (dataConOrigArgTys dc) || not (isVanillaDataCon dc) + -- If any of the constructor has an unboxed type as argument, + -- then we can't build the embedding-projection pair, because + -- it relies on instantiating *polymorphic* sum and product types + -- at the argument types of the constructors + + -- Nor can we do the job if it's an existential data constructor, + + -- Nor if the args are polymorphic types (I don't think) + bad_arg_type ty = isUnLiftedType ty || not (isTauTy ty) +\end{code} + +%************************************************************************ +%* * +\subsection{Generating the RHS of a generic default method} +%* * +%************************************************************************ + +\begin{code} +type US = Int -- Local unique supply, just a plain Int +type FromAlt = (LPat RdrName, LHsExpr RdrName) + +mkTyConGenericBinds :: TyCon -> LHsBinds RdrName +mkTyConGenericBinds tycon + = unitBag (L loc (mkFunBind (L loc from_RDR) from_matches)) + `unionBags` + unitBag (L loc (mkFunBind (L loc to_RDR) to_matches)) + where + from_matches = [mkSimpleHsAlt pat rhs | (pat,rhs) <- from_alts] + to_matches = [mkSimpleHsAlt to_pat to_body] + loc = srcLocSpan (getSrcLoc tycon) + datacons = tyConDataCons tycon + (from_RDR, to_RDR) = mkGenericNames tycon + + -- Recurse over the sum first + from_alts :: [FromAlt] + (from_alts, to_pat, to_body) = mk_sum_stuff init_us datacons + init_us = 1::Int -- Unique supply + +---------------------------------------------------- +-- Dealing with sums +---------------------------------------------------- + +mk_sum_stuff :: US -- Base for generating unique names + -> [DataCon] -- The data constructors + -> ([FromAlt], -- Alternatives for the T->Trep "from" function + InPat RdrName, LHsExpr RdrName) -- Arg and body of the Trep->T "to" function + +-- For example, given +-- data T = C | D Int Int Int +-- +-- mk_sum_stuff v [C,D] = ([C -> Inl Unit, D a b c -> Inr (a :*: (b :*: c))], +-- case cd of { Inl u -> C; +-- Inr abc -> case abc of { a :*: bc -> +-- case bc of { b :*: c -> +-- D a b c }} }, +-- cd) + +mk_sum_stuff us [datacon] + = ([from_alt], to_pat, to_body_fn app_exp) + where + n_args = dataConSourceArity datacon -- Existentials already excluded + + datacon_vars = map mkGenericLocal [us .. 
us+n_args-1] + us' = us + n_args + + datacon_rdr = getRdrName datacon + app_exp = nlHsVarApps datacon_rdr datacon_vars + from_alt = (nlConVarPat datacon_rdr datacon_vars, from_alt_rhs) + + (_, from_alt_rhs, to_pat, to_body_fn) = mk_prod_stuff us' datacon_vars + +mk_sum_stuff us datacons + = (wrap inlDataCon_RDR l_from_alts ++ wrap inrDataCon_RDR r_from_alts, + nlVarPat to_arg, + noLoc (HsCase (nlHsVar to_arg) + (mkMatchGroup [mkSimpleHsAlt (nlConPat inlDataCon_RDR [l_to_pat]) l_to_body, + mkSimpleHsAlt (nlConPat inrDataCon_RDR [r_to_pat]) r_to_body]))) + where + (l_datacons, r_datacons) = splitInHalf datacons + (l_from_alts, l_to_pat, l_to_body) = mk_sum_stuff us' l_datacons + (r_from_alts, r_to_pat, r_to_body) = mk_sum_stuff us' r_datacons + + to_arg = mkGenericLocal us + us' = us+1 + + wrap :: RdrName -> [FromAlt] -> [FromAlt] + -- Wrap an application of the Inl or Inr constructor round each alternative + wrap dc alts = [(pat, noLoc (HsApp (nlHsVar dc) rhs)) | (pat,rhs) <- alts] + + +---------------------------------------------------- +-- Dealing with products +---------------------------------------------------- +mk_prod_stuff :: US -- Base for unique names + -> [RdrName] -- arg-ids; args of the original user-defined constructor + -- They are bound enclosing from_rhs + -- Please bind these in the to_body_fn + -> (US, -- Depleted unique-name supply + LHsExpr RdrName, -- from-rhs: puts together the representation from the arg_ids + InPat RdrName, -- to_pat: + LHsExpr RdrName -> LHsExpr RdrName) -- to_body_fn: takes apart the representation + +-- For example: +-- mk_prod_stuff abc [a,b,c] = ( a :*: (b :*: c), +-- abc, +-- \<body-code> -> case abc of { a :*: bc -> +-- case bc of { b :*: c -> +-- <body-code> ) + +-- We need to use different uniques in the branches +-- because the returned to_body_fns are nested. +-- Hence the returned unqique-name supply + +mk_prod_stuff us [] -- Unit case + = (us+1, + nlHsVar genUnitDataCon_RDR, + noLoc (SigPatIn (nlVarPat (mkGenericLocal us)) + (noLoc (HsTyVar (getRdrName genUnitTyConName)))), + -- Give a signature to the pattern so we get + -- data S a = Nil | S a + -- toS = \x -> case x of { Inl (g :: Unit) -> Nil + -- Inr x -> S x } + -- The (:: Unit) signature ensures that we'll infer the right + -- type for toS. If we leave it out, the type is too polymorphic + + \x -> x) + +mk_prod_stuff us [arg_var] -- Singleton case + = (us, nlHsVar arg_var, nlVarPat arg_var, \x -> x) + +mk_prod_stuff us arg_vars -- Two or more + = (us'', + nlHsApps crossDataCon_RDR [l_alt_rhs, r_alt_rhs], + nlVarPat to_arg, +-- gaw 2004 FIX? 
+ \x -> noLoc (HsCase (nlHsVar to_arg) + (mkMatchGroup [mkSimpleHsAlt pat (l_to_body_fn (r_to_body_fn x))]))) + where + to_arg = mkGenericLocal us + (l_arg_vars, r_arg_vars) = splitInHalf arg_vars + (us', l_alt_rhs, l_to_pat, l_to_body_fn) = mk_prod_stuff (us+1) l_arg_vars + (us'', r_alt_rhs, r_to_pat, r_to_body_fn) = mk_prod_stuff us' r_arg_vars + pat = nlConPat crossDataCon_RDR [l_to_pat, r_to_pat] + +splitInHalf :: [a] -> ([a],[a]) +splitInHalf list = (left, right) + where + half = length list `div` 2 + left = take half list + right = drop half list + +mkGenericLocal :: US -> RdrName +mkGenericLocal u = mkVarUnqual (mkFastString ("g" ++ show u)) + +mkGenericNames tycon + = (from_RDR, to_RDR) + where + tc_name = tyConName tycon + tc_occ = nameOccName tc_name + tc_mod = nameModule tc_name + from_RDR = mkOrig tc_mod (mkGenOcc1 tc_occ) + to_RDR = mkOrig tc_mod (mkGenOcc2 tc_occ) +\end{code} + +%************************************************************************ +%* * +\subsection{Generating the RHS of a generic default method} +%* * +%************************************************************************ + +Generating the Generic default method. Uses the bimaps to generate the +actual method. All of this is rather incomplete, but it would be nice +to make even this work. Example + + class Foo a where + op :: Op a + + instance Foo T + +Then we fill in the RHS for op, RenamedHsExpr, by calling mkGenericRhs: + + instance Foo T where + op = <mkGenericRhs op a T> + +To do this, we generate a pair of RenamedHsExprs (EP toOp fromOp), where + + toOp :: Op Trep -> Op T + fromOp :: Op T -> Op Trep + +(the bimap) and then fill in the RHS with + + instance Foo T where + op = toOp op + +Remember, we're generating a RenamedHsExpr, so the result of all this +will be fed to the type checker. So the 'op' on the RHS will be +at the representation type for T, Trep. + + +Note [Polymorphic methods] +~~~~~~~~~~~~~~~~~~~~~~~~~~ +Suppose the class op is polymorphic: + + class Baz a where + op :: forall b. Ord b => a -> b -> b + +Then we can still generate a bimap with + + toOP :: forall b. (Trep -> b -> b) -> (T -> b -> b) + +and fill in the instance decl thus + + instance Foo T where + op = toOp op + +By the time the type checker has done its stuff we'll get + + instance Foo T where + op = \b. \dict::Ord b. toOp b (op Trep b dict) + +\begin{code} +mkGenericRhs :: Id -> TyVar -> TyCon -> LHsExpr RdrName +mkGenericRhs sel_id tyvar tycon + = ASSERT( isSingleton ctxt ) -- Checks shape of selector-id context +-- pprTrace "mkGenericRhs" (vcat [ppr sel_id, ppr (idType sel_id), ppr tyvar, ppr tycon, ppr local_tvs, ppr final_ty]) $ + mkHsApp (toEP bimap) (nlHsVar (getRdrName sel_id)) + where + -- Initialising the "Environment" with the from/to functions + -- on the datatype (actually tycon) in question + (from_RDR, to_RDR) = mkGenericNames tycon + + -- Instantiate the selector type, and strip off its class context + (ctxt, op_ty) = tcSplitPhiTy (applyTy (idType sel_id) (mkTyVarTy tyvar)) + + -- Do it again! This deals with the case where the method type + -- is polymorphic -- see Note [Polymorphic methods] above + (local_tvs,_,final_ty) = tcSplitSigmaTy op_ty + + -- Now we probably have a tycon in front + -- of us, quite probably a FunTyCon. 
+ ep = EP (nlHsVar from_RDR) (nlHsVar to_RDR) + bimap = generate_bimap (tyvar, ep, local_tvs) final_ty + +type EPEnv = (TyVar, -- The class type variable + EP (LHsExpr RdrName), -- The EP it maps to + [TyVar] -- Other in-scope tyvars; they have an identity EP + ) + +------------------- +generate_bimap :: EPEnv + -> Type + -> EP (LHsExpr RdrName) +-- Top level case - splitting the TyCon. +generate_bimap env@(tv,ep,local_tvs) ty + = case getTyVar_maybe ty of + Just tv1 | tv == tv1 -> ep -- The class tyvar + | otherwise -> ASSERT( tv1 `elem` local_tvs) -- One of the polymorphic tyvars of the method + idEP + Nothing -> bimapApp env (tcSplitTyConApp_maybe ty) + +------------------- +bimapApp :: EPEnv -> Maybe (TyCon, [Type]) -> EP (LHsExpr RdrName) +bimapApp env Nothing = panic "TcClassDecl: Type Application!" +bimapApp env (Just (tycon, ty_args)) + | tycon == funTyCon = bimapArrow arg_eps + | tycon == listTyCon = bimapList arg_eps + | isBoxedTupleTyCon tycon = bimapTuple arg_eps + | otherwise = -- Otherwise validGenericMethodType will + -- have checked that the type is a constant type + ASSERT( all (`elem` local_tvs) (varSetElems (tyVarsOfTypes ty_args)) ) + idEP + where + arg_eps = map (generate_bimap env) ty_args + (_,_,local_tvs) = env + +------------------- +-- bimapArrow :: [EP a a', EP b b'] -> EP (a->b) (a'->b') +bimapArrow [ep1, ep2] + = EP { fromEP = mkHsLam [nlVarPat a_RDR, nlVarPat b_RDR] from_body, + toEP = mkHsLam [nlVarPat a_RDR, nlVarPat b_RDR] to_body } + where + from_body = fromEP ep2 `mkHsApp` (mkHsPar $ nlHsVar a_RDR `mkHsApp` (mkHsPar $ toEP ep1 `mkHsApp` nlHsVar b_RDR)) + to_body = toEP ep2 `mkHsApp` (mkHsPar $ nlHsVar a_RDR `mkHsApp` (mkHsPar $ fromEP ep1 `mkHsApp` nlHsVar b_RDR)) + +------------------- +-- bimapTuple :: [EP a1 b1, ... EP an bn] -> EP (a1,...an) (b1,..bn) +bimapTuple eps + = EP { fromEP = mkHsLam [noLoc tuple_pat] (noLoc from_body), + toEP = mkHsLam [noLoc tuple_pat] (noLoc to_body) } + where + names = takeList eps gs_RDR + tuple_pat = TuplePat (map nlVarPat names) Boxed placeHolderType + eps_w_names = eps `zip` names + to_body = ExplicitTuple [toEP ep `mkHsApp` nlHsVar g | (ep,g) <- eps_w_names] Boxed + from_body = ExplicitTuple [fromEP ep `mkHsApp` nlHsVar g | (ep,g) <- eps_w_names] Boxed + +------------------- +-- bimapList :: EP a b -> EP [a] [b] +bimapList [ep] + = EP { fromEP = nlHsApp (nlHsVar map_RDR) (fromEP ep), + toEP = nlHsApp (nlHsVar map_RDR) (toEP ep) } + +------------------- +a_RDR = mkVarUnqual FSLIT("a") +b_RDR = mkVarUnqual FSLIT("b") +gs_RDR = [ mkVarUnqual (mkFastString ("g"++show i)) | i <- [(1::Int) .. ] ] + +idEP :: EP (LHsExpr RdrName) +idEP = EP idexpr idexpr + where + idexpr = mkHsLam [nlVarPat a_RDR] (nlHsVar a_RDR) +\end{code} diff --git a/compiler/types/InstEnv.lhs b/compiler/types/InstEnv.lhs new file mode 100644 index 0000000000..d4a7b771b7 --- /dev/null +++ b/compiler/types/InstEnv.lhs @@ -0,0 +1,566 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[InstEnv]{Utilities for typechecking instance declarations} + +The bits common to TcInstDcls and TcDeriv. 
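Before the InstEnv code itself, one aside on the generic-deriving code above: the bimap combinators there build the EP conversions as syntax (the compiler's EP is a pair of expressions, EP (LHsExpr RdrName)), which can make them hard to read. Purely as a reading aid, here is a standalone value-level sketch of the same combinators, with a two-parameter EP so the directions show up in the types, and bimapPair standing in for the n-ary bimapTuple. It is invented for illustration only, not compiler source.

\begin{code}
module EPSketch where
-- Value-level embedding-projection pairs; names invented for this sketch.
data EP a b = EP { fromEP :: a -> b     -- from the type to its representation
                 , toEP   :: b -> a }   -- and back again

idEP :: EP a a
idEP = EP id id

-- Arrows: note the contravariant twist, mirroring the swapped use of
-- fromEP/toEP for ep1 in bimapArrow above.
bimapArrow :: EP a a' -> EP b b' -> EP (a -> b) (a' -> b')
bimapArrow ep1 ep2 =
  EP { fromEP = \f -> fromEP ep2 . f . toEP   ep1
     , toEP   = \g -> toEP   ep2 . g . fromEP ep1 }

-- Lists: map the conversion over the elements, as bimapList does with map_RDR.
bimapList :: EP a b -> EP [a] [b]
bimapList ep = EP (map (fromEP ep)) (map (toEP ep))

-- Pairs: convert componentwise (bimapTuple does the same for any width).
bimapPair :: EP a a' -> EP b b' -> EP (a, b) (a', b')
bimapPair ep1 ep2 =
  EP { fromEP = \(x, y) -> (fromEP ep1 x, fromEP ep2 y)
     , toEP   = \(x, y) -> (toEP   ep1 x, toEP   ep2 y) }
\end{code}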
+ +\begin{code} +module InstEnv ( + DFunId, OverlapFlag(..), + Instance(..), pprInstance, pprInstanceHdr, pprInstances, + instanceHead, mkLocalInstance, mkImportedInstance, + instanceDFunId, setInstanceDFunId, instanceRoughTcs, + + InstEnv, emptyInstEnv, extendInstEnv, + extendInstEnvList, lookupInstEnv, instEnvElts, + classInstances, + instanceCantMatch, roughMatchTcs + ) where + +#include "HsVersions.h" + +import Class ( Class ) +import Var ( Id, TyVar, isTcTyVar ) +import VarSet +import Name ( Name, NamedThing(..), getSrcLoc, nameIsLocalOrFrom, nameModule ) +import OccName ( OccName ) +import NameSet ( unionNameSets, unitNameSet, nameSetToList ) +import Type ( TvSubst ) +import TcType ( Type, PredType, tcEqType, + tcSplitDFunTy, tyVarsOfTypes, isExistentialTyVar, + pprThetaArrow, pprClassPred, + tyClsNamesOfType, tcSplitTyConApp_maybe + ) +import TyCon ( tyConName ) +import Unify ( tcMatchTys, tcUnifyTys, BindFlag(..) ) +import Outputable +import UniqFM ( UniqFM, lookupUFM, emptyUFM, addToUFM_C, eltsUFM ) +import Id ( idType, idName ) +import SrcLoc ( pprDefnLoc ) +import Maybe ( isJust, isNothing ) +\end{code} + + +%************************************************************************ +%* * +\subsection{The key types} +%* * +%************************************************************************ + +\begin{code} +type DFunId = Id +data Instance + = Instance { is_cls :: Name -- Class name + + -- Used for "rough matching"; see note below + , is_tcs :: [Maybe Name] -- Top of type args + + -- Used for "proper matching"; see note + , is_tvs :: TyVarSet -- Template tyvars for full match + , is_tys :: [Type] -- Full arg types + + , is_dfun :: DFunId + , is_flag :: OverlapFlag + + , is_orph :: Maybe OccName } + +-- The "rough-match" fields +-- ~~~~~~~~~~~~~~~~~~~~~~~~~ +-- The is_cls, is_args fields allow a "rough match" to be done +-- without poking inside the DFunId. Poking the DFunId forces +-- us to suck in all the type constructors etc it involves, +-- which is a total waste of time if it has no chance of matching +-- So the Name, [Maybe Name] fields allow us to say "definitely +-- does not match", based only on the Name. +-- +-- In is_tcs, +-- Nothing means that this type arg is a type variable +-- +-- (Just n) means that this type arg is a +-- TyConApp with a type constructor of n. +-- This is always a real tycon, never a synonym! +-- (Two different synonyms might match, but two +-- different real tycons can't.) +-- NB: newtypes are not transparent, though! +-- +-- The "proper-match" fields +-- ~~~~~~~~~~~~~~~~~~~~~~~~~ +-- The is_tvs, is_tys fields are simply cahced values, pulled +-- out (lazily) from the dfun id. They are cached here simply so +-- that we don't need to decompose the DFunId each time we want +-- to match it. The hope is that the fast-match fields mean +-- that we often never poke th proper-match fields +-- +-- However, note that: +-- * is_tvs must be a superset of the free vars of is_tys +-- +-- * The is_dfun must itself be quantified over exactly is_tvs +-- (This is so that we can use the matching substitution to +-- instantiate the dfun's context.) +-- +-- The "orphan" field +-- ~~~~~~~~~~~~~~~~~~ +-- An instance is an orphan if its head (after the =>) mentions +-- nothing defined in this module. 
+-- +-- Just n The head mentions n, which is defined in this module +-- This is used for versioning; the instance decl is +-- considered part of the defn of n when computing versions +-- +-- Nothing The head mentions nothing defined in this modle +-- +-- If a module contains any orphans, then its interface file is read +-- regardless, so that its instances are not missed. +-- +-- Functional dependencies worsen the situation a bit. Consider +-- class C a b | a -> b +-- In some other module we might have +-- module M where +-- data T = ... +-- instance C Int T where ... +-- This isn't considered an orphan, so we will only read M's interface +-- if something from M is used (e.g. T). So there's a risk we'll +-- miss the improvement from the instance. Workaround: import M. + +instanceDFunId :: Instance -> DFunId +instanceDFunId = is_dfun + +setInstanceDFunId :: Instance -> DFunId -> Instance +setInstanceDFunId ispec dfun + = ASSERT( idType dfun `tcEqType` idType (is_dfun ispec) ) + -- We need to create the cached fields afresh from + -- the new dfun id. In particular, the is_tvs in + -- the Instance must match those in the dfun! + -- We assume that the only thing that changes is + -- the quantified type variables, so the other fields + -- are ok; hence the assert + ispec { is_dfun = dfun, is_tvs = mkVarSet tvs, is_tys = tys } + where + (tvs, _, _, tys) = tcSplitDFunTy (idType dfun) + +instanceRoughTcs :: Instance -> [Maybe Name] +instanceRoughTcs = is_tcs +\end{code} + +\begin{code} +instance NamedThing Instance where + getName ispec = getName (is_dfun ispec) + +instance Outputable Instance where + ppr = pprInstance + +pprInstance :: Instance -> SDoc +-- Prints the Instance as an instance declaration +pprInstance ispec@(Instance { is_flag = flag }) + = hang (pprInstanceHdr ispec) + 2 (ptext SLIT("--") <+> (pprDefnLoc (getSrcLoc ispec))) + +-- * pprInstanceHdr is used in VStudio to populate the ClassView tree +pprInstanceHdr :: Instance -> SDoc +-- Prints the Instance as an instance declaration +pprInstanceHdr ispec@(Instance { is_flag = flag }) + = ptext SLIT("instance") <+> ppr flag + <+> sep [pprThetaArrow theta, pprClassPred clas tys] + where + (_, theta, clas, tys) = instanceHead ispec + -- Print without the for-all, which the programmer doesn't write + +pprInstances :: [Instance] -> SDoc +pprInstances ispecs = vcat (map pprInstance ispecs) + +instanceHead :: Instance -> ([TyVar], [PredType], Class, [Type]) +instanceHead ispec = tcSplitDFunTy (idType (is_dfun ispec)) + +mkLocalInstance :: DFunId -> OverlapFlag -> Instance +-- Used for local instances, where we can safely pull on the DFunId +mkLocalInstance dfun oflag + = Instance { is_flag = oflag, is_dfun = dfun, + is_tvs = mkVarSet tvs, is_tys = tys, + is_cls = cls_name, is_tcs = roughMatchTcs tys, + is_orph = orph } + where + (tvs, _, cls, tys) = tcSplitDFunTy (idType dfun) + mod = nameModule (idName dfun) + cls_name = getName cls + tycl_names = foldr (unionNameSets . 
tyClsNamesOfType) + (unitNameSet cls_name) tys + orph = case filter (nameIsLocalOrFrom mod) (nameSetToList tycl_names) of + [] -> Nothing + (n:ns) -> Just (getOccName n) + +mkImportedInstance :: Name -> [Maybe Name] -> Maybe OccName + -> DFunId -> OverlapFlag -> Instance +-- Used for imported instances, where we get the rough-match stuff +-- from the interface file +mkImportedInstance cls mb_tcs orph dfun oflag + = Instance { is_flag = oflag, is_dfun = dfun, + is_tvs = mkVarSet tvs, is_tys = tys, + is_cls = cls, is_tcs = mb_tcs, is_orph = orph } + where + (tvs, _, _, tys) = tcSplitDFunTy (idType dfun) + +roughMatchTcs :: [Type] -> [Maybe Name] +roughMatchTcs tys = map rough tys + where + rough ty = case tcSplitTyConApp_maybe ty of + Just (tc,_) -> Just (tyConName tc) + Nothing -> Nothing + +instanceCantMatch :: [Maybe Name] -> [Maybe Name] -> Bool +-- (instanceCantMatch tcs1 tcs2) returns True if tcs1 cannot +-- possibly be instantiated to actual, nor vice versa; +-- False is non-committal +instanceCantMatch (Just t : ts) (Just a : as) = t/=a || instanceCantMatch ts as +instanceCantMatch ts as = False -- Safe + +--------------------------------------------------- +data OverlapFlag + = NoOverlap -- This instance must not overlap another + + | OverlapOk -- Silently ignore this instance if you find a + -- more specific one that matches the constraint + -- you are trying to resolve + -- + -- Example: constraint (Foo [Int]) + -- instances (Foo [Int]) + -- (Foo [a]) OverlapOk + -- Since the second instance has the OverlapOk flag, + -- the first instance will be chosen (otherwise + -- its ambiguous which to choose) + + | Incoherent -- Like OverlapOk, but also ignore this instance + -- if it doesn't match the constraint you are + -- trying to resolve, but could match if the type variables + -- in the constraint were instantiated + -- + -- Example: constraint (Foo [b]) + -- instances (Foo [Int]) Incoherent + -- (Foo [a]) + -- Without the Incoherent flag, we'd complain that + -- instantiating 'b' would change which instance + -- was chosen + +instance Outputable OverlapFlag where + ppr NoOverlap = empty + ppr OverlapOk = ptext SLIT("[overlap ok]") + ppr Incoherent = ptext SLIT("[incoherent]") +\end{code} + + +Note [Overlapping instances] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Overlap is permitted, but only in such a way that one can make +a unique choice when looking up. That is, overlap is only permitted if +one template matches the other, or vice versa. So this is ok: + + [a] [Int] + +but this is not + + (Int,a) (b,Int) + +If overlap is permitted, the list is kept most specific first, so that +the first lookup is the right choice. + + +For now we just use association lists. + +\subsection{Avoiding a problem with overlapping} + +Consider this little program: + +\begin{pseudocode} + class C a where c :: a + class C a => D a where d :: a + + instance C Int where c = 17 + instance D Int where d = 13 + + instance C a => C [a] where c = [c] + instance ({- C [a], -} D a) => D [a] where d = c + + instance C [Int] where c = [37] + + main = print (d :: [Int]) +\end{pseudocode} + +What do you think `main' prints (assuming we have overlapping instances, and +all that turned on)? Well, the instance for `D' at type `[a]' is defined to +be `c' at the same type, and we've got an instance of `C' at `[Int]', so the +answer is `[37]', right? (the generic `C [a]' instance shouldn't apply because +the `C [Int]' instance is more specific). + +Ghc-4.04 gives `[37]', while ghc-4.06 gives `[17]', so 4.06 is wrong. 
That +was easy ;-) Let's just consult hugs for good measure. Wait - if I use old +hugs (pre-September99), I get `[17]', and stranger yet, if I use hugs98, it +doesn't even compile! What's going on!? + +What hugs complains about is the `D [a]' instance decl. + +\begin{pseudocode} + ERROR "mj.hs" (line 10): Cannot build superclass instance + *** Instance : D [a] + *** Context supplied : D a + *** Required superclass : C [a] +\end{pseudocode} + +You might wonder what hugs is complaining about. It's saying that you +need to add `C [a]' to the context of the `D [a]' instance (as appears +in comments). But there's that `C [a]' instance decl one line above +that says that I can reduce the need for a `C [a]' instance to the +need for a `C a' instance, and in this case, I already have the +necessary `C a' instance (since we have `D a' explicitly in the +context, and `C' is a superclass of `D'). + +Unfortunately, the above reasoning indicates a premature commitment to the +generic `C [a]' instance. I.e., it prematurely rules out the more specific +instance `C [Int]'. This is the mistake that ghc-4.06 makes. The fix is to +add the context that hugs suggests (uncomment the `C [a]'), effectively +deferring the decision about which instance to use. + +Now, interestingly enough, 4.04 has this same bug, but it's covered up +in this case by a little known `optimization' that was disabled in +4.06. Ghc-4.04 silently inserts any missing superclass context into +an instance declaration. In this case, it silently inserts the `C +[a]', and everything happens to work out. + +(See `basicTypes/MkId:mkDictFunId' for the code in question. Search for +`Mark Jones', although Mark claims no credit for the `optimization' in +question, and would rather it stopped being called the `Mark Jones +optimization' ;-) + +So, what's the fix? I think hugs has it right. Here's why. Let's try +something else out with ghc-4.04. Let's add the following line: + + d' :: D a => [a] + d' = c + +Everyone raise their hand who thinks that `d :: [Int]' should give a +different answer from `d' :: [Int]'. Well, in ghc-4.04, it does. The +`optimization' only applies to instance decls, not to regular +bindings, giving inconsistent behavior. + +Old hugs had this same bug. Here's how we fixed it: like GHC, the +list of instances for a given class is ordered, so that more specific +instances come before more generic ones. For example, the instance +list for C might contain: + ..., C Int, ..., C a, ... +When we go to look for a `C Int' instance we'll get that one first. +But what if we go looking for a `C b' (`b' is unconstrained)? We'll +pass the `C Int' instance, and keep going. But if `b' is +unconstrained, then we don't know yet if the more specific instance +will eventually apply. GHC keeps going, and matches on the generic `C +a'. The fix is to, at each step, check to see if there's a reverse +match, and if so, abort the search. This prevents hugs from +prematurely chosing a generic instance when a more specific one +exists. + +--Jeff + +BUT NOTE [Nov 2001]: we must actually *unify* not reverse-match in +this test. Suppose the instance envt had + ..., forall a b. C a a b, ..., forall a b c. C a b c, ... +(still most specific first) +Now suppose we are looking for (C x y Int), where x and y are unconstrained. + C x y Int doesn't match the template {a,b} C a a b +but neither does + C a a b match the template {x,y} C x y Int +But still x and y might subsequently be unified so they *do* match. + +Simple story: unify, don't match. 
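To make "unify, don't match" concrete, here is a small self-contained toy (all names invented; not the real Unify module behind tcMatchTys/tcUnifyTys) that flattens instance heads into first-order terms and implements both a one-way matcher and a two-way unifier. On the example above neither head matches the other, but they do unify.

\begin{code}
module MatchVsUnify where

import Control.Monad (foldM)
import qualified Data.Map as M

-- Instance heads as little first-order terms, e.g.
--   C a a b   ==>   App "C" [Var "a", Var "a", Var "b"]
data Term = Var String | App String [Term]
  deriving (Eq, Show)

type Subst = M.Map String Term

-- One-way matching: only variables of the template (first argument) get bound.
match :: Term -> Term -> Maybe Subst
match t0 u0 = go t0 u0 M.empty
  where
    go (Var v) t s = case M.lookup v s of
                       Nothing           -> Just (M.insert v t s)
                       Just t' | t' == t -> Just s
                       _                 -> Nothing
    go (App f ts) (App g us) s
      | f == g && length ts == length us
      = foldM (\s' (a, b) -> go a b s') s (zip ts us)
    go _ _ _ = Nothing

-- Two-way unification: variables on either side may be bound.
unify :: Term -> Term -> Maybe Subst
unify t0 u0 = go t0 u0 M.empty
  where
    go a b s = case (walk s a, walk s b) of
      (Var v, t) -> bind v t s
      (t, Var v) -> bind v t s
      (App f ts, App g us)
        | f == g && length ts == length us
                 -> foldM (\s' (x, y) -> go x y s') s (zip ts us)
      _          -> Nothing

    walk s (Var v) = maybe (Var v) (walk s) (M.lookup v s)
    walk _ t       = t

    bind v t s | t == Var v           = Just s
               | occurs v (apply s t) = Nothing        -- occurs check
               | otherwise            = Just (M.insert v t s)

    occurs v (Var w)    = v == w
    occurs v (App _ ts) = any (occurs v) ts

    apply s (Var v)    = maybe (Var v) (apply s) (M.lookup v s)
    apply s (App f ts) = App f (map (apply s) ts)

-- The two heads from the note:  {a,b} C a a b   and   {x,y} C x y Int
cAAB, cXYInt :: Term
cAAB   = App "C" [Var "a", Var "a", Var "b"]
cXYInt = App "C" [Var "x", Var "y", App "Int" []]
-- match cAAB cXYInt  == Nothing   ('a' would have to be both x and y)
-- match cXYInt cAAB  == Nothing   (Int in the template can't match the variable b)
-- unify cAAB cXYInt  == Just ...  (both can still become, e.g., C y y Int)
\end{code}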
+ + +%************************************************************************ +%* * + InstEnv, ClsInstEnv +%* * +%************************************************************************ + +A @ClsInstEnv@ all the instances of that class. The @Id@ inside a +ClsInstEnv mapping is the dfun for that instance. + +If class C maps to a list containing the item ([a,b], [t1,t2,t3], dfun), then + + forall a b, C t1 t2 t3 can be constructed by dfun + +or, to put it another way, we have + + instance (...) => C t1 t2 t3, witnessed by dfun + +\begin{code} +--------------------------------------------------- +type InstEnv = UniqFM ClsInstEnv -- Maps Class to instances for that class + +data ClsInstEnv + = ClsIE [Instance] -- The instances for a particular class, in any order + Bool -- True <=> there is an instance of form C a b c + -- If *not* then the common case of looking up + -- (C a b c) can fail immediately + +-- INVARIANTS: +-- * The is_tvs are distinct in each Instance +-- of a ClsInstEnv (so we can safely unify them) + +-- Thus, the @ClassInstEnv@ for @Eq@ might contain the following entry: +-- [a] ===> dfun_Eq_List :: forall a. Eq a => Eq [a] +-- The "a" in the pattern must be one of the forall'd variables in +-- the dfun type. + +emptyInstEnv :: InstEnv +emptyInstEnv = emptyUFM + +instEnvElts :: InstEnv -> [Instance] +instEnvElts ie = [elt | ClsIE elts _ <- eltsUFM ie, elt <- elts] + +classInstances :: (InstEnv,InstEnv) -> Class -> [Instance] +classInstances (pkg_ie, home_ie) cls + = get home_ie ++ get pkg_ie + where + get env = case lookupUFM env cls of + Just (ClsIE insts _) -> insts + Nothing -> [] + +extendInstEnvList :: InstEnv -> [Instance] -> InstEnv +extendInstEnvList inst_env ispecs = foldl extendInstEnv inst_env ispecs + +extendInstEnv :: InstEnv -> Instance -> InstEnv +extendInstEnv inst_env ins_item@(Instance { is_cls = cls_nm, is_tcs = mb_tcs }) + = addToUFM_C add inst_env cls_nm (ClsIE [ins_item] ins_tyvar) + where + add (ClsIE cur_insts cur_tyvar) _ = ClsIE (ins_item : cur_insts) + (ins_tyvar || cur_tyvar) + ins_tyvar = not (any isJust mb_tcs) +\end{code} + + +%************************************************************************ +%* * +\subsection{Looking up an instance} +%* * +%************************************************************************ + +@lookupInstEnv@ looks up in a @InstEnv@, using a one-way match. Since +the env is kept ordered, the first match must be the only one. The +thing we are looking up can have an arbitrary "flexi" part. + +\begin{code} +lookupInstEnv :: (InstEnv -- External package inst-env + ,InstEnv) -- Home-package inst-env + -> Class -> [Type] -- What we are looking for + -> ([(TvSubst, Instance)], -- Successful matches + [Instance]) -- These don't match but do unify + -- The second component of the tuple happens when we look up + -- Foo [a] + -- in an InstEnv that has entries for + -- Foo [Int] + -- Foo [b] + -- Then which we choose would depend on the way in which 'a' + -- is instantiated. So we report that Foo [b] is a match (mapping b->a) + -- but Foo [Int] is a unifier. 
This gives the caller a better chance of + -- giving a suitable error messagen + +lookupInstEnv (pkg_ie, home_ie) cls tys + = (pruned_matches, all_unifs) + where + rough_tcs = roughMatchTcs tys + all_tvs = all isNothing rough_tcs + (home_matches, home_unifs) = lookup home_ie + (pkg_matches, pkg_unifs) = lookup pkg_ie + all_matches = home_matches ++ pkg_matches + all_unifs = home_unifs ++ pkg_unifs + pruned_matches + | null all_unifs = foldr insert_overlapping [] all_matches + | otherwise = all_matches -- Non-empty unifs is always an error situation, + -- so don't attempt to pune the matches + + -------------- + lookup env = case lookupUFM env cls of + Nothing -> ([],[]) -- No instances for this class + Just (ClsIE insts has_tv_insts) + | all_tvs && not has_tv_insts + -> ([],[]) -- Short cut for common case + -- The thing we are looking up is of form (C a b c), and + -- the ClsIE has no instances of that form, so don't bother to search + + | otherwise + -> find [] [] insts + + -------------- + find ms us [] = (ms, us) + find ms us (item@(Instance { is_tcs = mb_tcs, is_tvs = tpl_tvs, + is_tys = tpl_tys, is_flag = oflag, + is_dfun = dfun }) : rest) + -- Fast check for no match, uses the "rough match" fields + | instanceCantMatch rough_tcs mb_tcs + = find ms us rest + + | Just subst <- tcMatchTys tpl_tvs tpl_tys tys + = find ((subst,item):ms) us rest + + -- Does not match, so next check whether the things unify + -- See Note [overlapping instances] above + | Incoherent <- oflag + = find ms us rest + + | otherwise + = ASSERT2( not (tyVarsOfTypes tys `intersectsVarSet` tpl_tvs), + (ppr cls <+> ppr tys <+> ppr all_tvs) $$ + (ppr dfun <+> ppr tpl_tvs <+> ppr tpl_tys) + ) + -- Unification will break badly if the variables overlap + -- They shouldn't because we allocate separate uniques for them + case tcUnifyTys bind_fn tpl_tys tys of + Just _ -> find ms (item:us) rest + Nothing -> find ms us rest + +--------------- +bind_fn tv | isTcTyVar tv && isExistentialTyVar tv = Skolem + | otherwise = BindMe + -- The key_tys can contain skolem constants, and we can guarantee that those + -- are never going to be instantiated to anything, so we should not involve + -- them in the unification test. Example: + -- class Foo a where { op :: a -> Int } + -- instance Foo a => Foo [a] -- NB overlap + -- instance Foo [Int] -- NB overlap + -- data T = forall a. Foo a => MkT a + -- f :: T -> Int + -- f (MkT x) = op [x,x] + -- The op [x,x] means we need (Foo [a]). Without the filterVarSet we'd + -- complain, saying that the choice of instance depended on the instantiation + -- of 'a'; but of course it isn't *going* to be instantiated. + -- + -- We do this only for pattern-bound skolems. 
For example we reject + -- g :: forall a => [a] -> Int + -- g x = op x + -- on the grounds that the correct instance depends on the instantiation of 'a' + +--------------- +insert_overlapping :: (TvSubst, Instance) -> [(TvSubst, Instance)] + -> [(TvSubst, Instance)] +-- Add a new solution, knocking out strictly less specific ones +insert_overlapping new_item [] = [new_item] +insert_overlapping new_item (item:items) + | new_beats_old && old_beats_new = item : insert_overlapping new_item items + -- Duplicate => keep both for error report + | new_beats_old = insert_overlapping new_item items + -- Keep new one + | old_beats_new = item : items + -- Keep old one + | otherwise = item : insert_overlapping new_item items + -- Keep both + where + new_beats_old = new_item `beats` item + old_beats_new = item `beats` new_item + + (_, instA) `beats` (_, instB) + = overlap_ok && + isJust (tcMatchTys (is_tvs instB) (is_tys instB) (is_tys instA)) + -- A beats B if A is more specific than B, and B admits overlap + -- I.e. if B can be instantiated to match A + where + overlap_ok = case is_flag instB of + NoOverlap -> False + other -> True +\end{code} + diff --git a/compiler/types/Kind.lhs b/compiler/types/Kind.lhs new file mode 100644 index 0000000000..fa24fec144 --- /dev/null +++ b/compiler/types/Kind.lhs @@ -0,0 +1,228 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1998 +% + +\begin{code} +module Kind ( + Kind(..), SimpleKind, + openTypeKind, liftedTypeKind, unliftedTypeKind, + argTypeKind, ubxTupleKind, + + isLiftedTypeKind, isUnliftedTypeKind, + isArgTypeKind, isOpenTypeKind, + mkArrowKind, mkArrowKinds, + + isSubKind, defaultKind, + kindFunResult, splitKindFunTys, + + KindVar, mkKindVar, kindVarRef, kindVarUniq, + kindVarOcc, setKindVarOcc, + + pprKind, pprParendKind + ) where + +#include "HsVersions.h" + +import Unique ( Unique ) +import OccName ( OccName, mkOccName, tvName ) +import Outputable +import DATA_IOREF +\end{code} + +Kinds +~~~~~ +There's a little subtyping at the kind level: + + ? + / \ + / \ + ?? (#) + / \ + * # + +where * [LiftedTypeKind] means boxed type + # [UnliftedTypeKind] means unboxed type + (#) [UbxTupleKind] means unboxed tuple + ?? [ArgTypeKind] is the lub of *,# + ? [OpenTypeKind] means any type at all + +In particular: + + error :: forall a:?. String -> a + (->) :: ?? -> ? -> * + (\(x::t) -> ...) Here t::?? (i.e. not unboxed tuple) + +\begin{code} +data Kind + = LiftedTypeKind -- * + | OpenTypeKind -- ? + | UnliftedTypeKind -- # + | UbxTupleKind -- (##) + | ArgTypeKind -- ?? + | FunKind Kind Kind -- k1 -> k2 + | KindVar KindVar + deriving( Eq ) + +data KindVar = KVar Unique OccName (IORef (Maybe SimpleKind)) + -- INVARIANT: a KindVar can only be instantiated by a SimpleKind + +type SimpleKind = Kind + -- A SimpleKind has no ? 
or # kinds in it: + -- sk ::= * | sk1 -> sk2 | kvar + +instance Eq KindVar where + (KVar u1 _ _) == (KVar u2 _ _) = u1 == u2 + +mkKindVar :: Unique -> IORef (Maybe Kind) -> KindVar +mkKindVar u r = KVar u kind_var_occ r + +kindVarRef :: KindVar -> IORef (Maybe Kind) +kindVarRef (KVar _ _ ref) = ref + +kindVarUniq :: KindVar -> Unique +kindVarUniq (KVar uniq _ _) = uniq + +kindVarOcc :: KindVar -> OccName +kindVarOcc (KVar _ occ _) = occ + +setKindVarOcc :: KindVar -> OccName -> KindVar +setKindVarOcc (KVar u _ r) occ = KVar u occ r + +kind_var_occ :: OccName -- Just one for all KindVars + -- They may be jiggled by tidying +kind_var_occ = mkOccName tvName "k" +\end{code} + +Kind inference +~~~~~~~~~~~~~~ +During kind inference, a kind variable unifies only with +a "simple kind", sk + sk ::= * | sk1 -> sk2 +For example + data T a = MkT a (T Int#) +fails. We give T the kind (k -> *), and the kind variable k won't unify +with # (the kind of Int#). + +Type inference +~~~~~~~~~~~~~~ +When creating a fresh internal type variable, we give it a kind to express +constraints on it. E.g. in (\x->e) we make up a fresh type variable for x, +with kind ??. + +During unification we only bind an internal type variable to a type +whose kind is lower in the sub-kind hierarchy than the kind of the tyvar. + +When unifying two internal type variables, we collect their kind constraints by +finding the GLB of the two. Since the partial order is a tree, they only +have a glb if one is a sub-kind of the other. In that case, we bind the +less-informative one to the more informative one. Neat, eh? + + +\begin{code} +liftedTypeKind = LiftedTypeKind +unliftedTypeKind = UnliftedTypeKind +openTypeKind = OpenTypeKind +argTypeKind = ArgTypeKind +ubxTupleKind = UbxTupleKind + +mkArrowKind :: Kind -> Kind -> Kind +mkArrowKind k1 k2 = k1 `FunKind` k2 + +mkArrowKinds :: [Kind] -> Kind -> Kind +mkArrowKinds arg_kinds result_kind = foldr mkArrowKind result_kind arg_kinds +\end{code} + +%************************************************************************ +%* * + Functions over Kinds +%* * +%************************************************************************ + +\begin{code} +kindFunResult :: Kind -> Kind +kindFunResult (FunKind _ k) = k +kindFunResult k = pprPanic "kindFunResult" (ppr k) + +splitKindFunTys :: Kind -> ([Kind],Kind) +splitKindFunTys (FunKind k1 k2) = case splitKindFunTys k2 of + (as, r) -> (k1:as, r) +splitKindFunTys k = ([], k) + +isLiftedTypeKind, isUnliftedTypeKind :: Kind -> Bool +isLiftedTypeKind LiftedTypeKind = True +isLiftedTypeKind other = False + +isUnliftedTypeKind UnliftedTypeKind = True +isUnliftedTypeKind other = False + +isArgTypeKind :: Kind -> Bool +-- True of any sub-kind of ArgTypeKind +isArgTypeKind LiftedTypeKind = True +isArgTypeKind UnliftedTypeKind = True +isArgTypeKind ArgTypeKind = True +isArgTypeKind other = False + +isOpenTypeKind :: Kind -> Bool +-- True of any sub-kind of OpenTypeKind (i.e. anything except arrow) +isOpenTypeKind (FunKind _ _) = False +isOpenTypeKind (KindVar _) = False -- This is a conservative answer + -- It matters in the call to isSubKind in + -- checkExpectedKind. 
+isOpenTypeKind other = True + +isSubKind :: Kind -> Kind -> Bool +-- (k1 `isSubKind` k2) checks that k1 <: k2 +isSubKind LiftedTypeKind LiftedTypeKind = True +isSubKind UnliftedTypeKind UnliftedTypeKind = True +isSubKind UbxTupleKind UbxTupleKind = True +isSubKind k1 OpenTypeKind = isOpenTypeKind k1 +isSubKind k1 ArgTypeKind = isArgTypeKind k1 +isSubKind (FunKind a1 r1) (FunKind a2 r2) = (a2 `isSubKind` a1) && (r1 `isSubKind` r2) +isSubKind k1 k2 = False + +defaultKind :: Kind -> Kind +-- Used when generalising: default kind '?' and '??' to '*' +-- +-- When we generalise, we make generic type variables whose kind is +-- simple (* or *->* etc). So generic type variables (other than +-- built-in constants like 'error') always have simple kinds. This is important; +-- consider +-- f x = True +-- We want f to get type +-- f :: forall (a::*). a -> Bool +-- Not +-- f :: forall (a::??). a -> Bool +-- because that would allow a call like (f 3#) as well as (f True), +--and the calling conventions differ. This defaulting is done in TcMType.zonkTcTyVarBndr. +defaultKind OpenTypeKind = LiftedTypeKind +defaultKind ArgTypeKind = LiftedTypeKind +defaultKind kind = kind +\end{code} + + +%************************************************************************ +%* * + Pretty printing +%* * +%************************************************************************ + +\begin{code} +instance Outputable KindVar where + ppr (KVar uniq occ _) = ppr occ <> ifPprDebug (ppr uniq) + +instance Outputable Kind where + ppr k = pprKind k + +pprParendKind :: Kind -> SDoc +pprParendKind k@(FunKind _ _) = parens (pprKind k) +pprParendKind k = pprKind k + +pprKind (KindVar v) = ppr v +pprKind LiftedTypeKind = ptext SLIT("*") +pprKind UnliftedTypeKind = ptext SLIT("#") +pprKind OpenTypeKind = ptext SLIT("?") +pprKind ArgTypeKind = ptext SLIT("??") +pprKind UbxTupleKind = ptext SLIT("(#)") +pprKind (FunKind k1 k2) = sep [ pprParendKind k1, arrow <+> pprKind k2] + +\end{code} diff --git a/compiler/types/TyCon.hi-boot-5 b/compiler/types/TyCon.hi-boot-5 new file mode 100644 index 0000000000..1f040d73e1 --- /dev/null +++ b/compiler/types/TyCon.hi-boot-5 @@ -0,0 +1,6 @@ +__interface TyCon 1 0 where +__export TyCon TyCon isTupleTyCon isUnboxedTupleTyCon isFunTyCon setTyConName ; +1 data TyCon ; +1 isTupleTyCon :: TyCon -> PrelBase.Bool ; +1 isUnboxedTupleTyCon :: TyCon -> PrelBase.Bool ; +1 isFunTyCon :: TyCon -> PrelBase.Bool ; diff --git a/compiler/types/TyCon.hi-boot-6 b/compiler/types/TyCon.hi-boot-6 new file mode 100644 index 0000000000..08975621f0 --- /dev/null +++ b/compiler/types/TyCon.hi-boot-6 @@ -0,0 +1,7 @@ +module TyCon where + +data TyCon + +isTupleTyCon :: TyCon -> GHC.Base.Bool +isUnboxedTupleTyCon :: TyCon -> GHC.Base.Bool +isFunTyCon :: TyCon -> GHC.Base.Bool diff --git a/compiler/types/TyCon.lhs b/compiler/types/TyCon.lhs new file mode 100644 index 0000000000..fcd32c6974 --- /dev/null +++ b/compiler/types/TyCon.lhs @@ -0,0 +1,683 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[TyCon]{The @TyCon@ datatype} + +\begin{code} +module TyCon( + TyCon, ArgVrcs, FieldLabel, + + PrimRep(..), + tyConPrimRep, + + AlgTyConRhs(..), visibleDataCons, + + isFunTyCon, isUnLiftedTyCon, isProductTyCon, + isAlgTyCon, isDataTyCon, isSynTyCon, isNewTyCon, isPrimTyCon, + isEnumerationTyCon, + isTupleTyCon, isUnboxedTupleTyCon, isBoxedTupleTyCon, tupleTyConBoxity, + isRecursiveTyCon, newTyConRep, newTyConRhs, + isHiBootTyCon, + + tcExpandTyCon_maybe, coreExpandTyCon_maybe, + + makeTyConAbstract, 
isAbstractTyCon, + + mkForeignTyCon, isForeignTyCon, + + mkAlgTyCon, + mkClassTyCon, + mkFunTyCon, + mkPrimTyCon, + mkLiftedPrimTyCon, + mkTupleTyCon, + mkSynTyCon, + + tyConName, + tyConKind, + tyConUnique, + tyConTyVars, + tyConArgVrcs, + algTyConRhs, tyConDataCons, tyConDataCons_maybe, tyConFamilySize, + tyConSelIds, + tyConStupidTheta, + tyConArity, + isClassTyCon, tyConClass_maybe, + synTyConDefn, synTyConRhs, + tyConExtName, -- External name for foreign types + + maybeTyConSingleCon, + + -- Generics + tyConHasGenerics +) where + +#include "HsVersions.h" + +import {-# SOURCE #-} TypeRep ( Type, PredType ) + -- Should just be Type(Type), but this fails due to bug present up to + -- and including 4.02 involving slurping of hi-boot files. Bug is now fixed. + +import {-# SOURCE #-} DataCon ( DataCon, isVanillaDataCon ) + + +import Var ( TyVar, Id ) +import Class ( Class ) +import Kind ( Kind ) +import BasicTypes ( Arity, RecFlag(..), Boxity(..), isBoxed ) +import Name ( Name, nameUnique, NamedThing(getName) ) +import PrelNames ( Unique, Uniquable(..) ) +import Maybes ( orElse ) +import Outputable +import FastString +\end{code} + +%************************************************************************ +%* * +\subsection{The data type} +%* * +%************************************************************************ + +\begin{code} +data TyCon + = FunTyCon { + tyConUnique :: Unique, + tyConName :: Name, + tyConKind :: Kind, + tyConArity :: Arity + } + + + | AlgTyCon { -- Data type, and newtype decls. + -- All lifted, all boxed + tyConUnique :: Unique, + tyConName :: Name, + tyConKind :: Kind, + tyConArity :: Arity, + + tyConTyVars :: [TyVar], -- Scopes over (a) the [PredType] in AlgTyConRhs.DataTyCon + -- (b) the cached types in AlgTyConRhs.NewTyCon + -- But not over the data constructors + argVrcs :: ArgVrcs, + + algTcSelIds :: [Id], -- Its record selectors (empty if none): + + algTcStupidTheta :: [PredType], -- The "stupid theta" for the data type + -- (always empty for GADTs) + + algTcRhs :: AlgTyConRhs, -- Data constructors in here + + algTcRec :: RecFlag, -- Tells whether the data type is part of + -- a mutually-recursive group or not + + hasGenerics :: Bool, -- True <=> generic to/from functions are available + -- (in the exports of the data type's source module) + + algTcClass :: Maybe Class + -- Just cl if this tycon came from a class declaration + } + + | PrimTyCon { -- Primitive types; cannot be defined in Haskell + -- Now includes foreign-imported types + tyConUnique :: Unique, + tyConName :: Name, + tyConKind :: Kind, + tyConArity :: Arity, + argVrcs :: ArgVrcs, + + primTyConRep :: PrimRep, + -- Many primitive tycons are unboxed, but some are + -- boxed (represented by pointers). The CgRep tells. + + isUnLifted :: Bool, -- Most primitive tycons are unlifted, + -- but foreign-imported ones may not be + tyConExtName :: Maybe FastString -- Just xx for foreign-imported types + } + + | TupleTyCon { + tyConUnique :: Unique, + tyConName :: Name, + tyConKind :: Kind, + tyConArity :: Arity, + tyConBoxed :: Boxity, + tyConTyVars :: [TyVar], + dataCon :: DataCon, + hasGenerics :: Bool + } + + | SynTyCon { + tyConUnique :: Unique, + tyConName :: Name, + tyConKind :: Kind, + tyConArity :: Arity, + + tyConTyVars :: [TyVar], -- Bound tyvars + synTcRhs :: Type, -- Right-hand side, mentioning these type vars. + -- Acts as a template for the expansion when + -- the tycon is applied to some types. 
+ argVrcs :: ArgVrcs + } + +type FieldLabel = Name + +type ArgVrcs = [(Bool,Bool)] -- Tyvar variance info: [(occPos,occNeg)] + -- [] means "no information, assume the worst" + +data AlgTyConRhs + = AbstractTyCon -- We know nothing about this data type, except + -- that it's represented by a pointer + -- Used when we export a data type abstractly into + -- an hi file + + | DataTyCon { + data_cons :: [DataCon], + -- The constructors; can be empty if the user declares + -- the type to have no constructors + -- INVARIANT: Kept in order of increasing tag + -- (see the tag assignment in DataCon.mkDataCon) + is_enum :: Bool -- Cached: True <=> an enumeration type + } -- Includes data types with no constructors. + + | NewTyCon { + data_con :: DataCon, -- The unique constructor; it has no existentials + + nt_rhs :: Type, -- Cached: the argument type of the constructor + -- = the representation type of the tycon + + nt_etad_rhs :: ([TyVar], Type) , + -- The same again, but this time eta-reduced + -- hence the [TyVar] which may be shorter than the declared + -- arity of the TyCon. See Note [Newtype eta] + + nt_rep :: Type -- Cached: the *ultimate* representation type + -- By 'ultimate' I mean that the top-level constructor + -- of the rep type is not itself a newtype or type synonym. + -- The rep type isn't entirely simple: + -- for a recursive newtype we pick () as the rep type + -- newtype T = MkT T + -- + -- This one does not need to be eta reduced; hence its + -- free type variables are conveniently tyConTyVars + -- Thus: + -- newtype T a = MkT [(a,Int)] + -- The rep type is [(a,Int)] + -- NB: the rep type isn't necessarily the original RHS of the + -- newtype decl, because the rep type looks through other + } -- newtypes. + +visibleDataCons :: AlgTyConRhs -> [DataCon] +visibleDataCons AbstractTyCon = [] +visibleDataCons (DataTyCon{ data_cons = cs }) = cs +visibleDataCons (NewTyCon{ data_con = c }) = [c] +\end{code} + +Note [Newtype eta] +~~~~~~~~~~~~~~~~~~ +Consider + newtype Parser m a = MkParser (Foogle m a) +Are these two types equal (to Core)? + Monad (Parser m) + Monad (Foogle m) +Well, yes. But to see that easily we eta-reduce the RHS type of +Parser, in this case to ([], Froogle), so that even unsaturated applications +of Parser will work right. This eta reduction is done when the type +constructor is built, and cached in NewTyCon. The cached field is +only used in coreExpandTyCon_maybe. + +Here's an example that I think showed up in practice +Source code: + newtype T a = MkT [a] + newtype Foo m = MkFoo (forall a. m a -> Int) + + w1 :: Foo [] + w1 = ... + + w2 :: Foo T + w2 = MkFoo (\(MkT x) -> case w1 of MkFoo f -> f x) + +After desugaring, and discading the data constructors for the newtypes, +we get: + w2 :: Foo T + w2 = w1 +And now Lint complains unless Foo T == Foo [], and that requires T==[] + + +%************************************************************************ +%* * +\subsection{PrimRep} +%* * +%************************************************************************ + +A PrimRep is an abstraction of a type. It contains information that +the code generator needs in order to pass arguments, return results, +and store values of this type. + +A PrimRep is somewhat similar to a CgRep (see codeGen/SMRep) and a +MachRep (see cmm/MachOp), although each of these types has a distinct +and clearly defined purpose: + + - A PrimRep is a CgRep + information about signedness + information + about primitive pointers (AddrRep). 
Signedness and primitive + pointers are required when passing a primitive type to a foreign + function, but aren't needed for call/return conventions of Haskell + functions. + + - A MachRep is a basic machine type (non-void, doesn't contain + information on pointerhood or signedness, but contains some + reps that don't have corresponding Haskell types). + +\begin{code} +data PrimRep + = VoidRep + | PtrRep + | IntRep -- signed, word-sized + | WordRep -- unsinged, word-sized + | Int64Rep -- signed, 64 bit (32-bit words only) + | Word64Rep -- unsigned, 64 bit (32-bit words only) + | AddrRep -- a pointer, but not to a Haskell value + | FloatRep + | DoubleRep +\end{code} + +%************************************************************************ +%* * +\subsection{TyCon Construction} +%* * +%************************************************************************ + +Note: the TyCon constructors all take a Kind as one argument, even though +they could, in principle, work out their Kind from their other arguments. +But to do so they need functions from Types, and that makes a nasty +module mutual-recursion. And they aren't called from many places. +So we compromise, and move their Kind calculation to the call site. + +\begin{code} +mkFunTyCon :: Name -> Kind -> TyCon +mkFunTyCon name kind + = FunTyCon { + tyConUnique = nameUnique name, + tyConName = name, + tyConKind = kind, + tyConArity = 2 + } + +-- This is the making of a TyCon. Just the same as the old mkAlgTyCon, +-- but now you also have to pass in the generic information about the type +-- constructor - you can get hold of it easily (see Generics module) +mkAlgTyCon name kind tyvars argvrcs stupid rhs sel_ids is_rec gen_info + = AlgTyCon { + tyConName = name, + tyConUnique = nameUnique name, + tyConKind = kind, + tyConArity = length tyvars, + tyConTyVars = tyvars, + argVrcs = argvrcs, + algTcStupidTheta = stupid, + algTcRhs = rhs, + algTcSelIds = sel_ids, + algTcClass = Nothing, + algTcRec = is_rec, + hasGenerics = gen_info + } + +mkClassTyCon name kind tyvars argvrcs rhs clas is_rec + = AlgTyCon { + tyConName = name, + tyConUnique = nameUnique name, + tyConKind = kind, + tyConArity = length tyvars, + tyConTyVars = tyvars, + argVrcs = argvrcs, + algTcStupidTheta = [], + algTcRhs = rhs, + algTcSelIds = [], + algTcClass = Just clas, + algTcRec = is_rec, + hasGenerics = False + } + + +mkTupleTyCon name kind arity tyvars con boxed gen_info + = TupleTyCon { + tyConUnique = nameUnique name, + tyConName = name, + tyConKind = kind, + tyConArity = arity, + tyConBoxed = boxed, + tyConTyVars = tyvars, + dataCon = con, + hasGenerics = gen_info + } + +-- Foreign-imported (.NET) type constructors are represented +-- as primitive, but *lifted*, TyCons for now. 
They are lifted +-- because the Haskell type T representing the (foreign) .NET +-- type T is actually implemented (in ILX) as a thunk<T> +mkForeignTyCon name ext_name kind arity arg_vrcs + = PrimTyCon { + tyConName = name, + tyConUnique = nameUnique name, + tyConKind = kind, + tyConArity = arity, + argVrcs = arg_vrcs, + primTyConRep = PtrRep, -- they all do + isUnLifted = False, + tyConExtName = ext_name + } + + +-- most Prim tycons are lifted +mkPrimTyCon name kind arity arg_vrcs rep + = mkPrimTyCon' name kind arity arg_vrcs rep True + +-- but RealWorld is lifted +mkLiftedPrimTyCon name kind arity arg_vrcs rep + = mkPrimTyCon' name kind arity arg_vrcs rep False + +mkPrimTyCon' name kind arity arg_vrcs rep is_unlifted + = PrimTyCon { + tyConName = name, + tyConUnique = nameUnique name, + tyConKind = kind, + tyConArity = arity, + argVrcs = arg_vrcs, + primTyConRep = rep, + isUnLifted = is_unlifted, + tyConExtName = Nothing + } + +mkSynTyCon name kind tyvars rhs argvrcs + = SynTyCon { + tyConName = name, + tyConUnique = nameUnique name, + tyConKind = kind, + tyConArity = length tyvars, + tyConTyVars = tyvars, + synTcRhs = rhs, + argVrcs = argvrcs + } +\end{code} + +\begin{code} +isFunTyCon :: TyCon -> Bool +isFunTyCon (FunTyCon {}) = True +isFunTyCon _ = False + +isAbstractTyCon :: TyCon -> Bool +isAbstractTyCon (AlgTyCon { algTcRhs = AbstractTyCon }) = True +isAbstractTyCon _ = False + +makeTyConAbstract :: TyCon -> TyCon +makeTyConAbstract tc@(AlgTyCon {}) = tc { algTcRhs = AbstractTyCon } +makeTyConAbstract tc = pprPanic "makeTyConAbstract" (ppr tc) + +isPrimTyCon :: TyCon -> Bool +isPrimTyCon (PrimTyCon {}) = True +isPrimTyCon _ = False + +isUnLiftedTyCon :: TyCon -> Bool +isUnLiftedTyCon (PrimTyCon {isUnLifted = is_unlifted}) = is_unlifted +isUnLiftedTyCon (TupleTyCon {tyConBoxed = boxity}) = not (isBoxed boxity) +isUnLiftedTyCon _ = False + +-- isAlgTyCon returns True for both @data@ and @newtype@ +isAlgTyCon :: TyCon -> Bool +isAlgTyCon (AlgTyCon {}) = True +isAlgTyCon (TupleTyCon {}) = True +isAlgTyCon other = False + +isDataTyCon :: TyCon -> Bool +-- isDataTyCon returns True for data types that are represented by +-- heap-allocated constructors. +-- These are srcutinised by Core-level @case@ expressions, and they +-- get info tables allocated for them. 
+-- True for all @data@ types +-- False for newtypes +-- unboxed tuples +isDataTyCon tc@(AlgTyCon {algTcRhs = rhs}) + = case rhs of + DataTyCon {} -> True + NewTyCon {} -> False + AbstractTyCon -> pprPanic "isDataTyCon" (ppr tc) + +isDataTyCon (TupleTyCon {tyConBoxed = boxity}) = isBoxed boxity +isDataTyCon other = False + +isNewTyCon :: TyCon -> Bool +isNewTyCon (AlgTyCon {algTcRhs = NewTyCon {}}) = True +isNewTyCon other = False + +isProductTyCon :: TyCon -> Bool +-- A "product" tycon +-- has *one* constructor, +-- is *not* existential +-- but +-- may be DataType or NewType, +-- may be unboxed or not, +-- may be recursive or not +isProductTyCon tc@(AlgTyCon {}) = case algTcRhs tc of + DataTyCon{ data_cons = [data_con] } + -> isVanillaDataCon data_con + NewTyCon {} -> True + other -> False +isProductTyCon (TupleTyCon {}) = True +isProductTyCon other = False + +isSynTyCon :: TyCon -> Bool +isSynTyCon (SynTyCon {}) = True +isSynTyCon _ = False + +isEnumerationTyCon :: TyCon -> Bool +isEnumerationTyCon (AlgTyCon {algTcRhs = DataTyCon { is_enum = res }}) = res +isEnumerationTyCon other = False + +isTupleTyCon :: TyCon -> Bool +-- The unit tycon didn't used to be classed as a tuple tycon +-- but I thought that was silly so I've undone it +-- If it can't be for some reason, it should be a AlgTyCon +-- +-- NB: when compiling Data.Tuple, the tycons won't reply True to +-- isTupleTyCon, becuase they are built as AlgTyCons. However they +-- get spat into the interface file as tuple tycons, so I don't think +-- it matters. +isTupleTyCon (TupleTyCon {}) = True +isTupleTyCon other = False + +isUnboxedTupleTyCon :: TyCon -> Bool +isUnboxedTupleTyCon (TupleTyCon {tyConBoxed = boxity}) = not (isBoxed boxity) +isUnboxedTupleTyCon other = False + +isBoxedTupleTyCon :: TyCon -> Bool +isBoxedTupleTyCon (TupleTyCon {tyConBoxed = boxity}) = isBoxed boxity +isBoxedTupleTyCon other = False + +tupleTyConBoxity tc = tyConBoxed tc + +isRecursiveTyCon :: TyCon -> Bool +isRecursiveTyCon (AlgTyCon {algTcRec = Recursive}) = True +isRecursiveTyCon other = False + +isHiBootTyCon :: TyCon -> Bool +-- Used for knot-tying in hi-boot files +isHiBootTyCon (AlgTyCon {algTcRhs = AbstractTyCon}) = True +isHiBootTyCon other = False + +isForeignTyCon :: TyCon -> Bool +-- isForeignTyCon identifies foreign-imported type constructors +isForeignTyCon (PrimTyCon {tyConExtName = Just _}) = True +isForeignTyCon other = False +\end{code} + + +----------------------------------------------- +-- Expand type-constructor applications +----------------------------------------------- + +\begin{code} +tcExpandTyCon_maybe, coreExpandTyCon_maybe + :: TyCon + -> [Type] -- Args to tycon + -> Maybe ([(TyVar,Type)], -- Substitution + Type, -- Body type (not yet substituted) + [Type]) -- Leftover args + +-- For the *typechecker* view, we expand synonyms only +tcExpandTyCon_maybe (SynTyCon {tyConTyVars = tvs, synTcRhs = rhs }) tys + = expand tvs rhs tys +tcExpandTyCon_maybe other_tycon tys = Nothing + +--------------- +-- For the *Core* view, we expand synonyms *and* non-recursive newtypes +coreExpandTyCon_maybe (AlgTyCon {algTcRec = NonRecursive, -- Not recursive + algTcRhs = NewTyCon { nt_etad_rhs = etad_rhs }}) tys + = case etad_rhs of -- Don't do this in the pattern match, lest we accidentally + -- match the etad_rhs of a *recursive* newtype + (tvs,rhs) -> expand tvs rhs tys + +coreExpandTyCon_maybe tycon tys = tcExpandTyCon_maybe tycon tys + +---------------- +expand :: [TyVar] -> Type -- Template + -> [Type] -- Args + -> Maybe ([(TyVar,Type)], 
Type, [Type]) -- Expansion +expand tvs rhs tys + = case n_tvs `compare` length tys of + LT -> Just (tvs `zip` tys, rhs, drop n_tvs tys) + EQ -> Just (tvs `zip` tys, rhs, []) + GT -> Nothing + where + n_tvs = length tvs +\end{code} + +\begin{code} +tyConHasGenerics :: TyCon -> Bool +tyConHasGenerics (AlgTyCon {hasGenerics = hg}) = hg +tyConHasGenerics (TupleTyCon {hasGenerics = hg}) = hg +tyConHasGenerics other = False -- Synonyms + +tyConDataCons :: TyCon -> [DataCon] +-- It's convenient for tyConDataCons to return the +-- empty list for type synonyms etc +tyConDataCons tycon = tyConDataCons_maybe tycon `orElse` [] + +tyConDataCons_maybe :: TyCon -> Maybe [DataCon] +tyConDataCons_maybe (AlgTyCon {algTcRhs = DataTyCon { data_cons = cons }}) = Just cons +tyConDataCons_maybe (AlgTyCon {algTcRhs = NewTyCon { data_con = con }}) = Just [con] +tyConDataCons_maybe (TupleTyCon {dataCon = con}) = Just [con] +tyConDataCons_maybe other = Nothing + +tyConFamilySize :: TyCon -> Int +tyConFamilySize (AlgTyCon {algTcRhs = DataTyCon { data_cons = cons }}) = length cons +tyConFamilySize (AlgTyCon {algTcRhs = NewTyCon {}}) = 1 +tyConFamilySize (TupleTyCon {}) = 1 +#ifdef DEBUG +tyConFamilySize other = pprPanic "tyConFamilySize:" (ppr other) +#endif + +tyConSelIds :: TyCon -> [Id] +tyConSelIds (AlgTyCon {algTcSelIds = fs}) = fs +tyConSelIds other_tycon = [] + +algTyConRhs :: TyCon -> AlgTyConRhs +algTyConRhs (AlgTyCon {algTcRhs = rhs}) = rhs +algTyConRhs (TupleTyCon {dataCon = con}) = DataTyCon { data_cons = [con], is_enum = False } +algTyConRhs other = pprPanic "algTyConRhs" (ppr other) +\end{code} + +\begin{code} +newTyConRhs :: TyCon -> ([TyVar], Type) +newTyConRhs (AlgTyCon {tyConTyVars = tvs, algTcRhs = NewTyCon { nt_rhs = rhs }}) = (tvs, rhs) +newTyConRhs tycon = pprPanic "newTyConRhs" (ppr tycon) + +newTyConRep :: TyCon -> ([TyVar], Type) +newTyConRep (AlgTyCon {tyConTyVars = tvs, algTcRhs = NewTyCon { nt_rep = rep }}) = (tvs, rep) +newTyConRep tycon = pprPanic "newTyConRep" (ppr tycon) + +tyConPrimRep :: TyCon -> PrimRep +tyConPrimRep (PrimTyCon {primTyConRep = rep}) = rep +tyConPrimRep tc = ASSERT(not (isUnboxedTupleTyCon tc)) PtrRep +\end{code} + +\begin{code} +tyConStupidTheta :: TyCon -> [PredType] +tyConStupidTheta (AlgTyCon {algTcStupidTheta = stupid}) = stupid +tyConStupidTheta (TupleTyCon {}) = [] +tyConStupidTheta tycon = pprPanic "tyConStupidTheta" (ppr tycon) +\end{code} + +@tyConArgVrcs_maybe@ gives a list of (occPos,occNeg) flags, one for +each tyvar, if available. See @calcAlgTyConArgVrcs@ for how this is +actually computed (in another file). 
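The entries in the code below (e.g. [(False,True),(True,False)] for the function arrow) are easiest to read with a tiny occurrence-polarity calculation in mind. The following is only a rough standalone sketch, not the real calcAlgTyConArgVrcs: it just walks arrows, flipping polarity on the left of (->), and assumes other constructors are covariant in their arguments, which the real analysis does not.

\begin{code}
module VrcSketch where
-- Invented toy AST, for illustration only.
data Ty = TyVar String
        | Fun Ty Ty          -- t1 -> t2
        | Con String [Ty]    -- T t1 .. tn; arguments assumed covariant in this toy

-- (occurs positively, occurs negatively) for one variable
vrcOf :: String -> Ty -> (Bool, Bool)
vrcOf v = go True
  where
    go pos (TyVar w)
      | w == v        = if pos then (True, False) else (False, True)
      | otherwise     = (False, False)
    go pos (Fun a r)  = go (not pos) a `orBoth` go pos r   -- flip to the left of (->)
    go pos (Con _ ts) = foldr (orBoth . go pos) (False, False) ts

    orBoth (p1, n1) (p2, n2) = (p1 || p2, n1 || n2)

-- Reading off the two argument positions of (->) from  a -> b :
--   vrcOf "a" (Fun (TyVar "a") (TyVar "b"))  ==  (False, True)   -- negative only
--   vrcOf "b" (Fun (TyVar "a") (TyVar "b"))  ==  (True, False)   -- positive only
-- which is the [(False,True),(True,False)] entry for FunTyCon below.
\end{code}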
+ +\begin{code} +tyConArgVrcs :: TyCon -> ArgVrcs +tyConArgVrcs (FunTyCon {}) = [(False,True),(True,False)] +tyConArgVrcs (AlgTyCon {argVrcs = oi}) = oi +tyConArgVrcs (PrimTyCon {argVrcs = oi}) = oi +tyConArgVrcs (TupleTyCon {tyConArity = arity}) = (replicate arity (True,False)) +tyConArgVrcs (SynTyCon {argVrcs = oi}) = oi +\end{code} + +\begin{code} +synTyConDefn :: TyCon -> ([TyVar], Type) +synTyConDefn (SynTyCon {tyConTyVars = tyvars, synTcRhs = ty}) = (tyvars,ty) +synTyConDefn tycon = pprPanic "getSynTyConDefn" (ppr tycon) + +synTyConRhs :: TyCon -> Type +synTyConRhs tc = synTcRhs tc +\end{code} + +\begin{code} +maybeTyConSingleCon :: TyCon -> Maybe DataCon +maybeTyConSingleCon (AlgTyCon {algTcRhs = DataTyCon {data_cons = [c] }}) = Just c +maybeTyConSingleCon (AlgTyCon {algTcRhs = NewTyCon { data_con = c }}) = Just c +maybeTyConSingleCon (AlgTyCon {}) = Nothing +maybeTyConSingleCon (TupleTyCon {dataCon = con}) = Just con +maybeTyConSingleCon (PrimTyCon {}) = Nothing +maybeTyConSingleCon (FunTyCon {}) = Nothing -- case at funty +maybeTyConSingleCon tc = pprPanic "maybeTyConSingleCon: unexpected tycon " $ ppr tc +\end{code} + +\begin{code} +isClassTyCon :: TyCon -> Bool +isClassTyCon (AlgTyCon {algTcClass = Just _}) = True +isClassTyCon other_tycon = False + +tyConClass_maybe :: TyCon -> Maybe Class +tyConClass_maybe (AlgTyCon {algTcClass = maybe_clas}) = maybe_clas +tyConClass_maybe ther_tycon = Nothing +\end{code} + + +%************************************************************************ +%* * +\subsection[TyCon-instances]{Instance declarations for @TyCon@} +%* * +%************************************************************************ + +@TyCon@s are compared by comparing their @Unique@s. + +The strictness analyser needs @Ord@. It is a lexicographic order with +the property @(a<=b) || (b<=a)@. 
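The instances below spell this out by hand; the pattern is simply "compare the unique, ignore the rest". As a minimal standalone illustration (invented names, not compiler source):

\begin{code}
module UniqueOrdSketch where

import Data.Function (on)

-- A thing identified by an internal unique; equality and ordering ignore
-- everything else, just as the TyCon instances below do via getUnique.
data Thing = Thing { thingUnique :: Int, thingName :: String }

instance Eq Thing where
  (==) = (==) `on` thingUnique

instance Ord Thing where
  compare = compare `on` thingUnique
\end{code}

Defining compare this way gives a total order, so (a<=b) || (b<=a) always holds, which is the property referred to above.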
+ +\begin{code} +instance Eq TyCon where + a == b = case (a `compare` b) of { EQ -> True; _ -> False } + a /= b = case (a `compare` b) of { EQ -> False; _ -> True } + +instance Ord TyCon where + a <= b = case (a `compare` b) of { LT -> True; EQ -> True; GT -> False } + a < b = case (a `compare` b) of { LT -> True; EQ -> False; GT -> False } + a >= b = case (a `compare` b) of { LT -> False; EQ -> True; GT -> True } + a > b = case (a `compare` b) of { LT -> False; EQ -> False; GT -> True } + compare a b = getUnique a `compare` getUnique b + +instance Uniquable TyCon where + getUnique tc = tyConUnique tc + +instance Outputable TyCon where + ppr tc = ppr (getName tc) + +instance NamedThing TyCon where + getName = tyConName +\end{code} diff --git a/compiler/types/TyCon.lhs-boot b/compiler/types/TyCon.lhs-boot new file mode 100644 index 0000000000..83b4b7d07a --- /dev/null +++ b/compiler/types/TyCon.lhs-boot @@ -0,0 +1,9 @@ +\begin{code} +module TyCon where + +data TyCon + +isTupleTyCon :: TyCon -> Bool +isUnboxedTupleTyCon :: TyCon -> Bool +isFunTyCon :: TyCon -> Bool +\end{code} diff --git a/compiler/types/Type.lhs b/compiler/types/Type.lhs new file mode 100644 index 0000000000..872feb06f5 --- /dev/null +++ b/compiler/types/Type.lhs @@ -0,0 +1,1232 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1998 +% +\section[Type]{Type - public interface} + +\begin{code} +module Type ( + -- re-exports from TypeRep + TyThing(..), Type, PredType(..), ThetaType, + funTyCon, + + -- Re-exports from Kind + module Kind, + + -- Re-exports from TyCon + PrimRep(..), + + mkTyVarTy, mkTyVarTys, getTyVar, getTyVar_maybe, isTyVarTy, + + mkAppTy, mkAppTys, splitAppTy, splitAppTys, splitAppTy_maybe, + + mkFunTy, mkFunTys, splitFunTy, splitFunTy_maybe, + splitFunTys, splitFunTysN, + funResultTy, funArgTy, zipFunTys, isFunTy, + + mkTyConApp, mkTyConTy, + tyConAppTyCon, tyConAppArgs, + splitTyConApp_maybe, splitTyConApp, + + repType, typePrimRep, coreView, tcView, + + mkForAllTy, mkForAllTys, splitForAllTy_maybe, splitForAllTys, + applyTy, applyTys, isForAllTy, dropForAlls, + + -- Source types + predTypeRep, mkPredTy, mkPredTys, + + -- Newtypes + splitRecNewType_maybe, + + -- Lifting and boxity + isUnLiftedType, isUnboxedTupleType, isAlgType, isPrimitiveType, + isStrictType, isStrictPred, + + -- Free variables + tyVarsOfType, tyVarsOfTypes, tyVarsOfPred, tyVarsOfTheta, + typeKind, addFreeTyVars, + + -- Tidying up for printing + tidyType, tidyTypes, + tidyOpenType, tidyOpenTypes, + tidyTyVarBndr, tidyFreeTyVars, + tidyOpenTyVar, tidyOpenTyVars, + tidyTopType, tidyPred, + tidyKind, + + -- Comparison + coreEqType, tcEqType, tcEqTypes, tcCmpType, tcCmpTypes, + tcEqPred, tcCmpPred, tcEqTypeX, + + -- Seq + seqType, seqTypes, + + -- Type substitutions + TvSubstEnv, emptyTvSubstEnv, -- Representation widely visible + TvSubst(..), emptyTvSubst, -- Representation visible to a few friends + mkTvSubst, mkOpenTvSubst, zipOpenTvSubst, zipTopTvSubst, mkTopTvSubst, notElemTvSubst, + getTvSubstEnv, setTvSubstEnv, getTvInScope, extendTvInScope, + extendTvSubst, extendTvSubstList, isInScope, composeTvSubst, zipTyEnv, + + -- Performing substitution on types + substTy, substTys, substTyWith, substTheta, + substPred, substTyVar, substTyVarBndr, deShadowTy, lookupTyVar, + + -- Pretty-printing + pprType, pprParendType, pprTyThingCategory, + pprPred, pprTheta, pprThetaArrow, pprClassPred + ) where + +#include "HsVersions.h" + +-- We import the representation and primitive functions from TypeRep. 
+-- Many things are reexported, but not the representation! + +import TypeRep + +-- friends: +import Kind +import Var ( Var, TyVar, tyVarKind, tyVarName, setTyVarName, mkTyVar ) +import VarEnv +import VarSet + +import OccName ( tidyOccName ) +import Name ( NamedThing(..), mkInternalName, tidyNameOcc ) +import Class ( Class, classTyCon ) +import TyCon ( TyCon, isRecursiveTyCon, isPrimTyCon, + isUnboxedTupleTyCon, isUnLiftedTyCon, + isFunTyCon, isNewTyCon, newTyConRep, newTyConRhs, + isAlgTyCon, tyConArity, + tcExpandTyCon_maybe, coreExpandTyCon_maybe, + tyConKind, PrimRep(..), tyConPrimRep, + ) + +-- others +import StaticFlags ( opt_DictsStrict ) +import SrcLoc ( noSrcLoc ) +import Util ( mapAccumL, seqList, lengthIs, snocView, thenCmp, isEqual, all2 ) +import Outputable +import UniqSet ( sizeUniqSet ) -- Should come via VarSet +import Maybe ( isJust ) +\end{code} + + +%************************************************************************ +%* * + Type representation +%* * +%************************************************************************ + +In Core, we "look through" non-recursive newtypes and PredTypes. + +\begin{code} +{-# INLINE coreView #-} +coreView :: Type -> Maybe Type +-- Srips off the *top layer only* of a type to give +-- its underlying representation type. +-- Returns Nothing if there is nothing to look through. +-- +-- In the case of newtypes, it returns +-- *either* a vanilla TyConApp (recursive newtype, or non-saturated) +-- *or* the newtype representation (otherwise), meaning the +-- type written in the RHS of the newtype decl, +-- which may itself be a newtype +-- +-- Example: newtype R = MkR S +-- newtype S = MkS T +-- newtype T = MkT (T -> T) +-- expandNewTcApp on R gives Just S +-- on S gives Just T +-- on T gives Nothing (no expansion) + +-- By being non-recursive and inlined, this case analysis gets efficiently +-- joined onto the case analysis that the caller is already doing +coreView (NoteTy _ ty) = Just ty +coreView (PredTy p) = Just (predTypeRep p) +coreView (TyConApp tc tys) | Just (tenv, rhs, tys') <- coreExpandTyCon_maybe tc tys + = Just (mkAppTys (substTy (mkTopTvSubst tenv) rhs) tys') + -- Its important to use mkAppTys, rather than (foldl AppTy), + -- because the function part might well return a + -- partially-applied type constructor; indeed, usually will! 
+coreView ty = Nothing + +----------------------------------------------- +{-# INLINE tcView #-} +tcView :: Type -> Maybe Type +-- Same, but for the type checker, which just looks through synonyms +tcView (NoteTy _ ty) = Just ty +tcView (TyConApp tc tys) | Just (tenv, rhs, tys') <- tcExpandTyCon_maybe tc tys + = Just (mkAppTys (substTy (mkTopTvSubst tenv) rhs) tys') +tcView ty = Nothing +\end{code} + + +%************************************************************************ +%* * +\subsection{Constructor-specific functions} +%* * +%************************************************************************ + + +--------------------------------------------------------------------- + TyVarTy + ~~~~~~~ +\begin{code} +mkTyVarTy :: TyVar -> Type +mkTyVarTy = TyVarTy + +mkTyVarTys :: [TyVar] -> [Type] +mkTyVarTys = map mkTyVarTy -- a common use of mkTyVarTy + +getTyVar :: String -> Type -> TyVar +getTyVar msg ty = case getTyVar_maybe ty of + Just tv -> tv + Nothing -> panic ("getTyVar: " ++ msg) + +isTyVarTy :: Type -> Bool +isTyVarTy ty = isJust (getTyVar_maybe ty) + +getTyVar_maybe :: Type -> Maybe TyVar +getTyVar_maybe ty | Just ty' <- coreView ty = getTyVar_maybe ty' +getTyVar_maybe (TyVarTy tv) = Just tv +getTyVar_maybe other = Nothing +\end{code} + + +--------------------------------------------------------------------- + AppTy + ~~~~~ +We need to be pretty careful with AppTy to make sure we obey the +invariant that a TyConApp is always visibly so. mkAppTy maintains the +invariant: use it. + +\begin{code} +mkAppTy orig_ty1 orig_ty2 + = mk_app orig_ty1 + where + mk_app (NoteTy _ ty1) = mk_app ty1 + mk_app (TyConApp tc tys) = mkTyConApp tc (tys ++ [orig_ty2]) + mk_app ty1 = AppTy orig_ty1 orig_ty2 + -- Note that the TyConApp could be an + -- under-saturated type synonym. GHC allows that; e.g. + -- type Foo k = k a -> k a + -- type Id x = x + -- foo :: Foo Id -> Foo Id + -- + -- Here Id is partially applied in the type sig for Foo, + -- but once the type synonyms are expanded all is well + +mkAppTys :: Type -> [Type] -> Type +mkAppTys orig_ty1 [] = orig_ty1 + -- This check for an empty list of type arguments + -- avoids the needless loss of a type synonym constructor. + -- For example: mkAppTys Rational [] + -- returns to (Ratio Integer), which has needlessly lost + -- the Rational part. 
+mkAppTys orig_ty1 orig_tys2 + = mk_app orig_ty1 + where + mk_app (NoteTy _ ty1) = mk_app ty1 + mk_app (TyConApp tc tys) = mkTyConApp tc (tys ++ orig_tys2) + -- mkTyConApp: see notes with mkAppTy + mk_app ty1 = foldl AppTy orig_ty1 orig_tys2 + +splitAppTy_maybe :: Type -> Maybe (Type, Type) +splitAppTy_maybe ty | Just ty' <- coreView ty = splitAppTy_maybe ty' +splitAppTy_maybe (FunTy ty1 ty2) = Just (TyConApp funTyCon [ty1], ty2) +splitAppTy_maybe (AppTy ty1 ty2) = Just (ty1, ty2) +splitAppTy_maybe (TyConApp tc tys) = case snocView tys of + Nothing -> Nothing + Just (tys',ty') -> Just (TyConApp tc tys', ty') +splitAppTy_maybe other = Nothing + +splitAppTy :: Type -> (Type, Type) +splitAppTy ty = case splitAppTy_maybe ty of + Just pr -> pr + Nothing -> panic "splitAppTy" + +splitAppTys :: Type -> (Type, [Type]) +splitAppTys ty = split ty ty [] + where + split orig_ty ty args | Just ty' <- coreView ty = split orig_ty ty' args + split orig_ty (AppTy ty arg) args = split ty ty (arg:args) + split orig_ty (TyConApp tc tc_args) args = (TyConApp tc [], tc_args ++ args) + split orig_ty (FunTy ty1 ty2) args = ASSERT( null args ) + (TyConApp funTyCon [], [ty1,ty2]) + split orig_ty ty args = (orig_ty, args) +\end{code} + + +--------------------------------------------------------------------- + FunTy + ~~~~~ + +\begin{code} +mkFunTy :: Type -> Type -> Type +mkFunTy arg res = FunTy arg res + +mkFunTys :: [Type] -> Type -> Type +mkFunTys tys ty = foldr FunTy ty tys + +isFunTy :: Type -> Bool +isFunTy ty = isJust (splitFunTy_maybe ty) + +splitFunTy :: Type -> (Type, Type) +splitFunTy ty | Just ty' <- coreView ty = splitFunTy ty' +splitFunTy (FunTy arg res) = (arg, res) +splitFunTy other = pprPanic "splitFunTy" (ppr other) + +splitFunTy_maybe :: Type -> Maybe (Type, Type) +splitFunTy_maybe ty | Just ty' <- coreView ty = splitFunTy_maybe ty' +splitFunTy_maybe (FunTy arg res) = Just (arg, res) +splitFunTy_maybe other = Nothing + +splitFunTys :: Type -> ([Type], Type) +splitFunTys ty = split [] ty ty + where + split args orig_ty ty | Just ty' <- coreView ty = split args orig_ty ty' + split args orig_ty (FunTy arg res) = split (arg:args) res res + split args orig_ty ty = (reverse args, orig_ty) + +splitFunTysN :: Int -> Type -> ([Type], Type) +-- Split off exactly n arg tys +splitFunTysN 0 ty = ([], ty) +splitFunTysN n ty = case splitFunTy ty of { (arg, res) -> + case splitFunTysN (n-1) res of { (args, res) -> + (arg:args, res) }} + +zipFunTys :: Outputable a => [a] -> Type -> ([(a,Type)], Type) +zipFunTys orig_xs orig_ty = split [] orig_xs orig_ty orig_ty + where + split acc [] nty ty = (reverse acc, nty) + split acc xs nty ty + | Just ty' <- coreView ty = split acc xs nty ty' + split acc (x:xs) nty (FunTy arg res) = split ((x,arg):acc) xs res res + split acc (x:xs) nty ty = pprPanic "zipFunTys" (ppr orig_xs <+> ppr orig_ty) + +funResultTy :: Type -> Type +funResultTy ty | Just ty' <- coreView ty = funResultTy ty' +funResultTy (FunTy arg res) = res +funResultTy ty = pprPanic "funResultTy" (ppr ty) + +funArgTy :: Type -> Type +funArgTy ty | Just ty' <- coreView ty = funArgTy ty' +funArgTy (FunTy arg res) = arg +funArgTy ty = pprPanic "funArgTy" (ppr ty) +\end{code} + + +--------------------------------------------------------------------- + TyConApp + ~~~~~~~~ +@mkTyConApp@ is a key function, because it builds a TyConApp, FunTy or PredTy, +as apppropriate. 
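+
+For example (an illustrative sketch only; intTy and listTyCon are assumed to
+be the wired-in Int and list type constructors from TysWiredIn):
+
+	mkTyConApp funTyCon [intTy, intTy]	gives	FunTy intTy intTy
+	mkTyConApp listTyCon [intTy]		gives	TyConApp listTyCon [intTy]
+
+so a saturated application of funTyCon is always built in its canonical
+FunTy form, while ordinary type constructors stay as TyConApps.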
+ +\begin{code} +mkTyConApp :: TyCon -> [Type] -> Type +mkTyConApp tycon tys + | isFunTyCon tycon, [ty1,ty2] <- tys + = FunTy ty1 ty2 + + | otherwise + = TyConApp tycon tys + +mkTyConTy :: TyCon -> Type +mkTyConTy tycon = mkTyConApp tycon [] + +-- splitTyConApp "looks through" synonyms, because they don't +-- mean a distinct type, but all other type-constructor applications +-- including functions are returned as Just .. + +tyConAppTyCon :: Type -> TyCon +tyConAppTyCon ty = fst (splitTyConApp ty) + +tyConAppArgs :: Type -> [Type] +tyConAppArgs ty = snd (splitTyConApp ty) + +splitTyConApp :: Type -> (TyCon, [Type]) +splitTyConApp ty = case splitTyConApp_maybe ty of + Just stuff -> stuff + Nothing -> pprPanic "splitTyConApp" (ppr ty) + +splitTyConApp_maybe :: Type -> Maybe (TyCon, [Type]) +splitTyConApp_maybe ty | Just ty' <- coreView ty = splitTyConApp_maybe ty' +splitTyConApp_maybe (TyConApp tc tys) = Just (tc, tys) +splitTyConApp_maybe (FunTy arg res) = Just (funTyCon, [arg,res]) +splitTyConApp_maybe other = Nothing +\end{code} + + +--------------------------------------------------------------------- + SynTy + ~~~~~ + +Notes on type synonyms +~~~~~~~~~~~~~~~~~~~~~~ +The various "split" functions (splitFunTy, splitRhoTy, splitForAllTy) try +to return type synonyms whereever possible. Thus + + type Foo a = a -> a + +we want + splitFunTys (a -> Foo a) = ([a], Foo a) +not ([a], a -> a) + +The reason is that we then get better (shorter) type signatures in +interfaces. Notably this plays a role in tcTySigs in TcBinds.lhs. + + + Representation types + ~~~~~~~~~~~~~~~~~~~~ +repType looks through + (a) for-alls, and + (b) synonyms + (c) predicates + (d) usage annotations + (e) all newtypes, including recursive ones +It's useful in the back end. + +\begin{code} +repType :: Type -> Type +-- Only applied to types of kind *; hence tycons are saturated +repType ty | Just ty' <- coreView ty = repType ty' +repType (ForAllTy _ ty) = repType ty +repType (TyConApp tc tys) + | isNewTyCon tc = -- Recursive newtypes are opaque to coreView + -- but we must expand them here. Sure to + -- be saturated because repType is only applied + -- to types of kind * + ASSERT( isRecursiveTyCon tc && + tys `lengthIs` tyConArity tc ) + repType (new_type_rep tc tys) +repType ty = ty + +-- new_type_rep doesn't ask any questions: +-- it just expands newtype, whether recursive or not +new_type_rep new_tycon tys = ASSERT( tys `lengthIs` tyConArity new_tycon ) + case newTyConRep new_tycon of + (tvs, rep_ty) -> substTyWith tvs tys rep_ty + +-- ToDo: this could be moved to the code generator, using splitTyConApp instead +-- of inspecting the type directly. +typePrimRep :: Type -> PrimRep +typePrimRep ty = case repType ty of + TyConApp tc _ -> tyConPrimRep tc + FunTy _ _ -> PtrRep + AppTy _ _ -> PtrRep -- See note below + TyVarTy _ -> PtrRep + other -> pprPanic "typePrimRep" (ppr ty) + -- Types of the form 'f a' must be of kind *, not *#, so + -- we are guaranteed that they are represented by pointers. + -- The reason is that f must have kind *->*, not *->*#, because + -- (we claim) there is no way to constrain f's kind any other + -- way. 
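+
+-- Illustrative sketch (not exercised by this module): given
+--	newtype Age = MkAge Int
+-- repType Age expands the newtype to Int, and repType (forall a. Int#)
+-- discards the for-all to give Int#.  typePrimRep then returns PtrRep for
+-- boxed results such as Int or any function type, and the unboxed
+-- representation from tyConPrimRep (IntRep, assuming PrimRep has such a
+-- constructor) for Int#.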
+ +\end{code} + + +--------------------------------------------------------------------- + ForAllTy + ~~~~~~~~ + +\begin{code} +mkForAllTy :: TyVar -> Type -> Type +mkForAllTy tyvar ty + = mkForAllTys [tyvar] ty + +mkForAllTys :: [TyVar] -> Type -> Type +mkForAllTys tyvars ty = foldr ForAllTy ty tyvars + +isForAllTy :: Type -> Bool +isForAllTy (NoteTy _ ty) = isForAllTy ty +isForAllTy (ForAllTy _ _) = True +isForAllTy other_ty = False + +splitForAllTy_maybe :: Type -> Maybe (TyVar, Type) +splitForAllTy_maybe ty = splitFAT_m ty + where + splitFAT_m ty | Just ty' <- coreView ty = splitFAT_m ty' + splitFAT_m (ForAllTy tyvar ty) = Just(tyvar, ty) + splitFAT_m _ = Nothing + +splitForAllTys :: Type -> ([TyVar], Type) +splitForAllTys ty = split ty ty [] + where + split orig_ty ty tvs | Just ty' <- coreView ty = split orig_ty ty' tvs + split orig_ty (ForAllTy tv ty) tvs = split ty ty (tv:tvs) + split orig_ty t tvs = (reverse tvs, orig_ty) + +dropForAlls :: Type -> Type +dropForAlls ty = snd (splitForAllTys ty) +\end{code} + +-- (mkPiType now in CoreUtils) + +applyTy, applyTys +~~~~~~~~~~~~~~~~~ +Instantiate a for-all type with one or more type arguments. +Used when we have a polymorphic function applied to type args: + f t1 t2 +Then we use (applyTys type-of-f [t1,t2]) to compute the type of +the expression. + +\begin{code} +applyTy :: Type -> Type -> Type +applyTy ty arg | Just ty' <- coreView ty = applyTy ty' arg +applyTy (ForAllTy tv ty) arg = substTyWith [tv] [arg] ty +applyTy other arg = panic "applyTy" + +applyTys :: Type -> [Type] -> Type +-- This function is interesting because +-- a) the function may have more for-alls than there are args +-- b) less obviously, it may have fewer for-alls +-- For case (b) think of +-- applyTys (forall a.a) [forall b.b, Int] +-- This really can happen, via dressing up polymorphic types with newtype +-- clothing. Here's an example: +-- newtype R = R (forall a. a->a) +-- foo = case undefined :: R of +-- R f -> f () + +applyTys orig_fun_ty [] = orig_fun_ty +applyTys orig_fun_ty arg_tys + | n_tvs == n_args -- The vastly common case + = substTyWith tvs arg_tys rho_ty + | n_tvs > n_args -- Too many for-alls + = substTyWith (take n_args tvs) arg_tys + (mkForAllTys (drop n_args tvs) rho_ty) + | otherwise -- Too many type args + = ASSERT2( n_tvs > 0, ppr orig_fun_ty ) -- Zero case gives infnite loop! + applyTys (substTyWith tvs (take n_tvs arg_tys) rho_ty) + (drop n_tvs arg_tys) + where + (tvs, rho_ty) = splitForAllTys orig_fun_ty + n_tvs = length tvs + n_args = length arg_tys +\end{code} + + +%************************************************************************ +%* * +\subsection{Source types} +%* * +%************************************************************************ + +A "source type" is a type that is a separate type as far as the type checker is +concerned, but which has low-level representation as far as the back end is concerned. + +Source types are always lifted. + +The key function is predTypeRep which gives the representation of a source type: + +\begin{code} +mkPredTy :: PredType -> Type +mkPredTy pred = PredTy pred + +mkPredTys :: ThetaType -> [Type] +mkPredTys preds = map PredTy preds + +predTypeRep :: PredType -> Type +-- Convert a PredType to its "representation type"; +-- the post-type-checking type used by all the Core passes of GHC. 
+-- Unwraps only the outermost level; for example, the result might +-- be a newtype application +predTypeRep (IParam _ ty) = ty +predTypeRep (ClassP clas tys) = mkTyConApp (classTyCon clas) tys + -- Result might be a newtype application, but the consumer will + -- look through that too if necessary +\end{code} + + +%************************************************************************ +%* * + NewTypes +%* * +%************************************************************************ + +\begin{code} +splitRecNewType_maybe :: Type -> Maybe Type +-- Sometimes we want to look through a recursive newtype, and that's what happens here +-- It only strips *one layer* off, so the caller will usually call itself recursively +-- Only applied to types of kind *, hence the newtype is always saturated +splitRecNewType_maybe ty | Just ty' <- coreView ty = splitRecNewType_maybe ty' +splitRecNewType_maybe (TyConApp tc tys) + | isNewTyCon tc + = ASSERT( tys `lengthIs` tyConArity tc ) -- splitRecNewType_maybe only be applied + -- to *types* (of kind *) + ASSERT( isRecursiveTyCon tc ) -- Guaranteed by coreView + case newTyConRhs tc of + (tvs, rep_ty) -> ASSERT( length tvs == length tys ) + Just (substTyWith tvs tys rep_ty) + +splitRecNewType_maybe other = Nothing +\end{code} + + +%************************************************************************ +%* * +\subsection{Kinds and free variables} +%* * +%************************************************************************ + +--------------------------------------------------------------------- + Finding the kind of a type + ~~~~~~~~~~~~~~~~~~~~~~~~~~ +\begin{code} +typeKind :: Type -> Kind + +typeKind (TyVarTy tyvar) = tyVarKind tyvar +typeKind (TyConApp tycon tys) = foldr (\_ k -> kindFunResult k) (tyConKind tycon) tys +typeKind (NoteTy _ ty) = typeKind ty +typeKind (PredTy _) = liftedTypeKind -- Predicates are always + -- represented by lifted types +typeKind (AppTy fun arg) = kindFunResult (typeKind fun) +typeKind (FunTy arg res) = liftedTypeKind +typeKind (ForAllTy tv ty) = typeKind ty +\end{code} + + +--------------------------------------------------------------------- + Free variables of a type + ~~~~~~~~~~~~~~~~~~~~~~~~ +\begin{code} +tyVarsOfType :: Type -> TyVarSet +-- NB: for type synonyms tyVarsOfType does *not* expand the synonym +tyVarsOfType (TyVarTy tv) = unitVarSet tv +tyVarsOfType (TyConApp tycon tys) = tyVarsOfTypes tys +tyVarsOfType (NoteTy (FTVNote tvs) ty2) = tvs +tyVarsOfType (PredTy sty) = tyVarsOfPred sty +tyVarsOfType (FunTy arg res) = tyVarsOfType arg `unionVarSet` tyVarsOfType res +tyVarsOfType (AppTy fun arg) = tyVarsOfType fun `unionVarSet` tyVarsOfType arg +tyVarsOfType (ForAllTy tyvar ty) = delVarSet (tyVarsOfType ty) tyvar + +tyVarsOfTypes :: [Type] -> TyVarSet +tyVarsOfTypes tys = foldr (unionVarSet.tyVarsOfType) emptyVarSet tys + +tyVarsOfPred :: PredType -> TyVarSet +tyVarsOfPred (IParam _ ty) = tyVarsOfType ty +tyVarsOfPred (ClassP _ tys) = tyVarsOfTypes tys + +tyVarsOfTheta :: ThetaType -> TyVarSet +tyVarsOfTheta = foldr (unionVarSet . 
tyVarsOfPred) emptyVarSet + +-- Add a Note with the free tyvars to the top of the type +addFreeTyVars :: Type -> Type +addFreeTyVars ty@(NoteTy (FTVNote _) _) = ty +addFreeTyVars ty = NoteTy (FTVNote (tyVarsOfType ty)) ty +\end{code} + + +%************************************************************************ +%* * +\subsection{TidyType} +%* * +%************************************************************************ + +tidyTy tidies up a type for printing in an error message, or in +an interface file. + +It doesn't change the uniques at all, just the print names. + +\begin{code} +tidyTyVarBndr :: TidyEnv -> TyVar -> (TidyEnv, TyVar) +tidyTyVarBndr (tidy_env, subst) tyvar + = case tidyOccName tidy_env (getOccName name) of + (tidy', occ') -> ((tidy', subst'), tyvar') + where + subst' = extendVarEnv subst tyvar tyvar' + tyvar' = setTyVarName tyvar name' + name' = tidyNameOcc name occ' + where + name = tyVarName tyvar + +tidyFreeTyVars :: TidyEnv -> TyVarSet -> TidyEnv +-- Add the free tyvars to the env in tidy form, +-- so that we can tidy the type they are free in +tidyFreeTyVars env tyvars = fst (tidyOpenTyVars env (varSetElems tyvars)) + +tidyOpenTyVars :: TidyEnv -> [TyVar] -> (TidyEnv, [TyVar]) +tidyOpenTyVars env tyvars = mapAccumL tidyOpenTyVar env tyvars + +tidyOpenTyVar :: TidyEnv -> TyVar -> (TidyEnv, TyVar) +-- Treat a new tyvar as a binder, and give it a fresh tidy name +tidyOpenTyVar env@(tidy_env, subst) tyvar + = case lookupVarEnv subst tyvar of + Just tyvar' -> (env, tyvar') -- Already substituted + Nothing -> tidyTyVarBndr env tyvar -- Treat it as a binder + +tidyType :: TidyEnv -> Type -> Type +tidyType env@(tidy_env, subst) ty + = go ty + where + go (TyVarTy tv) = case lookupVarEnv subst tv of + Nothing -> TyVarTy tv + Just tv' -> TyVarTy tv' + go (TyConApp tycon tys) = let args = map go tys + in args `seqList` TyConApp tycon args + go (NoteTy note ty) = (NoteTy $! (go_note note)) $! (go ty) + go (PredTy sty) = PredTy (tidyPred env sty) + go (AppTy fun arg) = (AppTy $! (go fun)) $! (go arg) + go (FunTy fun arg) = (FunTy $! (go fun)) $! (go arg) + go (ForAllTy tv ty) = ForAllTy tvp $! (tidyType envp ty) + where + (envp, tvp) = tidyTyVarBndr env tv + + go_note note@(FTVNote ftvs) = note -- No need to tidy the free tyvars + +tidyTypes env tys = map (tidyType env) tys + +tidyPred :: TidyEnv -> PredType -> PredType +tidyPred env (IParam n ty) = IParam n (tidyType env ty) +tidyPred env (ClassP clas tys) = ClassP clas (tidyTypes env tys) +\end{code} + + +@tidyOpenType@ grabs the free type variables, tidies them +and then uses @tidyType@ to work over the type itself + +\begin{code} +tidyOpenType :: TidyEnv -> Type -> (TidyEnv, Type) +tidyOpenType env ty + = (env', tidyType env' ty) + where + env' = tidyFreeTyVars env (tyVarsOfType ty) + +tidyOpenTypes :: TidyEnv -> [Type] -> (TidyEnv, [Type]) +tidyOpenTypes env tys = mapAccumL tidyOpenType env tys + +tidyTopType :: Type -> Type +tidyTopType ty = tidyType emptyTidyEnv ty +\end{code} + + +%************************************************************************ +%* * + Tidying Kinds +%* * +%************************************************************************ + +We use a grevious hack for tidying KindVars. A TidyEnv contains +a (VarEnv Var) substitution, to express the renaming; but +KindVars are not Vars. 
The Right Thing ultimately is to make them +into Vars (and perhaps make Kinds into Types), but I just do a hack +here: I make up a TyVar just to remember the new OccName for the +renamed KindVar + +\begin{code} +tidyKind :: TidyEnv -> Kind -> (TidyEnv, Kind) +tidyKind env@(tidy_env, subst) (KindVar kvar) + | Just tv <- lookupVarEnv_Directly subst uniq + = (env, KindVar (setKindVarOcc kvar (getOccName tv))) + | otherwise + = ((tidy', subst'), KindVar kvar') + where + uniq = kindVarUniq kvar + (tidy', occ') = tidyOccName tidy_env (kindVarOcc kvar) + kvar' = setKindVarOcc kvar occ' + fake_tv = mkTyVar tv_name (panic "tidyKind:fake tv kind") + tv_name = mkInternalName uniq occ' noSrcLoc + subst' = extendVarEnv subst fake_tv fake_tv + +tidyKind env (FunKind k1 k2) + = (env2, FunKind k1' k2') + where + (env1, k1') = tidyKind env k1 + (env2, k2') = tidyKind env1 k2 + +tidyKind env k = (env, k) -- Atomic kinds +\end{code} + + +%************************************************************************ +%* * +\subsection{Liftedness} +%* * +%************************************************************************ + +\begin{code} +isUnLiftedType :: Type -> Bool + -- isUnLiftedType returns True for forall'd unlifted types: + -- x :: forall a. Int# + -- I found bindings like these were getting floated to the top level. + -- They are pretty bogus types, mind you. It would be better never to + -- construct them + +isUnLiftedType ty | Just ty' <- coreView ty = isUnLiftedType ty' +isUnLiftedType (ForAllTy tv ty) = isUnLiftedType ty +isUnLiftedType (TyConApp tc _) = isUnLiftedTyCon tc +isUnLiftedType other = False + +isUnboxedTupleType :: Type -> Bool +isUnboxedTupleType ty = case splitTyConApp_maybe ty of + Just (tc, ty_args) -> isUnboxedTupleTyCon tc + other -> False + +-- Should only be applied to *types*; hence the assert +isAlgType :: Type -> Bool +isAlgType ty = case splitTyConApp_maybe ty of + Just (tc, ty_args) -> ASSERT( ty_args `lengthIs` tyConArity tc ) + isAlgTyCon tc + other -> False +\end{code} + +@isStrictType@ computes whether an argument (or let RHS) should +be computed strictly or lazily, based only on its type. +Works just like isUnLiftedType, except that it has a special case +for dictionaries. Since it takes account of ClassP, you might think +this function should be in TcType, but isStrictType is used by DataCon, +which is below TcType in the hierarchy, so it's convenient to put it here. + +\begin{code} +isStrictType (PredTy pred) = isStrictPred pred +isStrictType ty | Just ty' <- coreView ty = isStrictType ty' +isStrictType (ForAllTy tv ty) = isStrictType ty +isStrictType (TyConApp tc _) = isUnLiftedTyCon tc +isStrictType other = False + +isStrictPred (ClassP clas _) = opt_DictsStrict && not (isNewTyCon (classTyCon clas)) +isStrictPred other = False + -- We may be strict in dictionary types, but only if it + -- has more than one component. + -- [Being strict in a single-component dictionary risks + -- poking the dictionary component, which is wrong.] +\end{code} + +\begin{code} +isPrimitiveType :: Type -> Bool +-- Returns types that are opaque to Haskell. 
+-- Most of these are unlifted, but now that we interact with .NET, we +-- may have primtive (foreign-imported) types that are lifted +isPrimitiveType ty = case splitTyConApp_maybe ty of + Just (tc, ty_args) -> ASSERT( ty_args `lengthIs` tyConArity tc ) + isPrimTyCon tc + other -> False +\end{code} + + +%************************************************************************ +%* * +\subsection{Sequencing on types +%* * +%************************************************************************ + +\begin{code} +seqType :: Type -> () +seqType (TyVarTy tv) = tv `seq` () +seqType (AppTy t1 t2) = seqType t1 `seq` seqType t2 +seqType (FunTy t1 t2) = seqType t1 `seq` seqType t2 +seqType (NoteTy note t2) = seqNote note `seq` seqType t2 +seqType (PredTy p) = seqPred p +seqType (TyConApp tc tys) = tc `seq` seqTypes tys +seqType (ForAllTy tv ty) = tv `seq` seqType ty + +seqTypes :: [Type] -> () +seqTypes [] = () +seqTypes (ty:tys) = seqType ty `seq` seqTypes tys + +seqNote :: TyNote -> () +seqNote (FTVNote set) = sizeUniqSet set `seq` () + +seqPred :: PredType -> () +seqPred (ClassP c tys) = c `seq` seqTypes tys +seqPred (IParam n ty) = n `seq` seqType ty +\end{code} + + +%************************************************************************ +%* * + Equality for Core types + (We don't use instances so that we know where it happens) +%* * +%************************************************************************ + +Note that eqType works right even for partial applications of newtypes. +See Note [Newtype eta] in TyCon.lhs + +\begin{code} +coreEqType :: Type -> Type -> Bool +coreEqType t1 t2 + = eq rn_env t1 t2 + where + rn_env = mkRnEnv2 (mkInScopeSet (tyVarsOfType t1 `unionVarSet` tyVarsOfType t2)) + + eq env (TyVarTy tv1) (TyVarTy tv2) = rnOccL env tv1 == rnOccR env tv2 + eq env (ForAllTy tv1 t1) (ForAllTy tv2 t2) = eq (rnBndr2 env tv1 tv2) t1 t2 + eq env (AppTy s1 t1) (AppTy s2 t2) = eq env s1 s2 && eq env t1 t2 + eq env (FunTy s1 t1) (FunTy s2 t2) = eq env s1 s2 && eq env t1 t2 + eq env (TyConApp tc1 tys1) (TyConApp tc2 tys2) + | tc1 == tc2, all2 (eq env) tys1 tys2 = True + -- The lengths should be equal because + -- the two types have the same kind + -- NB: if the type constructors differ that does not + -- necessarily mean that the types aren't equal + -- (synonyms, newtypes) + -- Even if the type constructors are the same, but the arguments + -- differ, the two types could be the same (e.g. if the arg is just + -- ignored in the RHS). In both these cases we fall through to an + -- attempt to expand one side or the other. + + -- Now deal with newtypes, synonyms, pred-tys + eq env t1 t2 | Just t1' <- coreView t1 = eq env t1' t2 + | Just t2' <- coreView t2 = eq env t1 t2' + + -- Fall through case; not equal! 
+ eq env t1 t2 = False +\end{code} + + +%************************************************************************ +%* * + Comparision for source types + (We don't use instances so that we know where it happens) +%* * +%************************************************************************ + +Note that + tcEqType, tcCmpType +do *not* look through newtypes, PredTypes + +\begin{code} +tcEqType :: Type -> Type -> Bool +tcEqType t1 t2 = isEqual $ cmpType t1 t2 + +tcEqTypes :: [Type] -> [Type] -> Bool +tcEqTypes tys1 tys2 = isEqual $ cmpTypes tys1 tys2 + +tcCmpType :: Type -> Type -> Ordering +tcCmpType t1 t2 = cmpType t1 t2 + +tcCmpTypes :: [Type] -> [Type] -> Ordering +tcCmpTypes tys1 tys2 = cmpTypes tys1 tys2 + +tcEqPred :: PredType -> PredType -> Bool +tcEqPred p1 p2 = isEqual $ cmpPred p1 p2 + +tcCmpPred :: PredType -> PredType -> Ordering +tcCmpPred p1 p2 = cmpPred p1 p2 + +tcEqTypeX :: RnEnv2 -> Type -> Type -> Bool +tcEqTypeX env t1 t2 = isEqual $ cmpTypeX env t1 t2 +\end{code} + +Now here comes the real worker + +\begin{code} +cmpType :: Type -> Type -> Ordering +cmpType t1 t2 = cmpTypeX rn_env t1 t2 + where + rn_env = mkRnEnv2 (mkInScopeSet (tyVarsOfType t1 `unionVarSet` tyVarsOfType t2)) + +cmpTypes :: [Type] -> [Type] -> Ordering +cmpTypes ts1 ts2 = cmpTypesX rn_env ts1 ts2 + where + rn_env = mkRnEnv2 (mkInScopeSet (tyVarsOfTypes ts1 `unionVarSet` tyVarsOfTypes ts2)) + +cmpPred :: PredType -> PredType -> Ordering +cmpPred p1 p2 = cmpPredX rn_env p1 p2 + where + rn_env = mkRnEnv2 (mkInScopeSet (tyVarsOfPred p1 `unionVarSet` tyVarsOfPred p2)) + +cmpTypeX :: RnEnv2 -> Type -> Type -> Ordering -- Main workhorse +cmpTypeX env t1 t2 | Just t1' <- tcView t1 = cmpTypeX env t1' t2 + | Just t2' <- tcView t2 = cmpTypeX env t1 t2' + +cmpTypeX env (TyVarTy tv1) (TyVarTy tv2) = rnOccL env tv1 `compare` rnOccR env tv2 +cmpTypeX env (ForAllTy tv1 t1) (ForAllTy tv2 t2) = cmpTypeX (rnBndr2 env tv1 tv2) t1 t2 +cmpTypeX env (AppTy s1 t1) (AppTy s2 t2) = cmpTypeX env s1 s2 `thenCmp` cmpTypeX env t1 t2 +cmpTypeX env (FunTy s1 t1) (FunTy s2 t2) = cmpTypeX env s1 s2 `thenCmp` cmpTypeX env t1 t2 +cmpTypeX env (PredTy p1) (PredTy p2) = cmpPredX env p1 p2 +cmpTypeX env (TyConApp tc1 tys1) (TyConApp tc2 tys2) = (tc1 `compare` tc2) `thenCmp` cmpTypesX env tys1 tys2 +cmpTypeX env t1 (NoteTy _ t2) = cmpTypeX env t1 t2 + + -- Deal with the rest: TyVarTy < AppTy < FunTy < TyConApp < ForAllTy < PredTy +cmpTypeX env (AppTy _ _) (TyVarTy _) = GT + +cmpTypeX env (FunTy _ _) (TyVarTy _) = GT +cmpTypeX env (FunTy _ _) (AppTy _ _) = GT + +cmpTypeX env (TyConApp _ _) (TyVarTy _) = GT +cmpTypeX env (TyConApp _ _) (AppTy _ _) = GT +cmpTypeX env (TyConApp _ _) (FunTy _ _) = GT + +cmpTypeX env (ForAllTy _ _) (TyVarTy _) = GT +cmpTypeX env (ForAllTy _ _) (AppTy _ _) = GT +cmpTypeX env (ForAllTy _ _) (FunTy _ _) = GT +cmpTypeX env (ForAllTy _ _) (TyConApp _ _) = GT + +cmpTypeX env (PredTy _) t2 = GT + +cmpTypeX env _ _ = LT + +------------- +cmpTypesX :: RnEnv2 -> [Type] -> [Type] -> Ordering +cmpTypesX env [] [] = EQ +cmpTypesX env (t1:tys1) (t2:tys2) = cmpTypeX env t1 t2 `thenCmp` cmpTypesX env tys1 tys2 +cmpTypesX env [] tys = LT +cmpTypesX env ty [] = GT + +------------- +cmpPredX :: RnEnv2 -> PredType -> PredType -> Ordering +cmpPredX env (IParam n1 ty1) (IParam n2 ty2) = (n1 `compare` n2) `thenCmp` cmpTypeX env ty1 ty2 + -- Compare types as well as names for implicit parameters + -- This comparison is used exclusively (I think) for the + -- finite map built in TcSimplify +cmpPredX env (ClassP c1 tys1) (ClassP c2 tys2) = 
(c1 `compare` c2) `thenCmp` cmpTypesX env tys1 tys2 +cmpPredX env (IParam _ _) (ClassP _ _) = LT +cmpPredX env (ClassP _ _) (IParam _ _) = GT +\end{code} + +PredTypes are used as a FM key in TcSimplify, +so we take the easy path and make them an instance of Ord + +\begin{code} +instance Eq PredType where { (==) = tcEqPred } +instance Ord PredType where { compare = tcCmpPred } +\end{code} + + +%************************************************************************ +%* * + Type substitutions +%* * +%************************************************************************ + +\begin{code} +data TvSubst + = TvSubst InScopeSet -- The in-scope type variables + TvSubstEnv -- The substitution itself + -- See Note [Apply Once] + +{- ---------------------------------------------------------- + Note [Apply Once] + +We use TvSubsts to instantiate things, and we might instantiate + forall a b. ty +\with the types + [a, b], or [b, a]. +So the substition might go [a->b, b->a]. A similar situation arises in Core +when we find a beta redex like + (/\ a /\ b -> e) b a +Then we also end up with a substition that permutes type variables. Other +variations happen to; for example [a -> (a, b)]. + + *************************************************** + *** So a TvSubst must be applied precisely once *** + *************************************************** + +A TvSubst is not idempotent, but, unlike the non-idempotent substitution +we use during unifications, it must not be repeatedly applied. +-------------------------------------------------------------- -} + + +type TvSubstEnv = TyVarEnv Type + -- A TvSubstEnv is used both inside a TvSubst (with the apply-once + -- invariant discussed in Note [Apply Once]), and also independently + -- in the middle of matching, and unification (see Types.Unify) + -- So you have to look at the context to know if it's idempotent or + -- apply-once or whatever +emptyTvSubstEnv :: TvSubstEnv +emptyTvSubstEnv = emptyVarEnv + +composeTvSubst :: InScopeSet -> TvSubstEnv -> TvSubstEnv -> TvSubstEnv +-- (compose env1 env2)(x) is env1(env2(x)); i.e. 
apply env2 then env1 +-- It assumes that both are idempotent +-- Typically, env1 is the refinement to a base substitution env2 +composeTvSubst in_scope env1 env2 + = env1 `plusVarEnv` mapVarEnv (substTy subst1) env2 + -- First apply env1 to the range of env2 + -- Then combine the two, making sure that env1 loses if + -- both bind the same variable; that's why env1 is the + -- *left* argument to plusVarEnv, because the right arg wins + where + subst1 = TvSubst in_scope env1 + +emptyTvSubst = TvSubst emptyInScopeSet emptyVarEnv + +isEmptyTvSubst :: TvSubst -> Bool +isEmptyTvSubst (TvSubst _ env) = isEmptyVarEnv env + +mkTvSubst :: InScopeSet -> TvSubstEnv -> TvSubst +mkTvSubst = TvSubst + +getTvSubstEnv :: TvSubst -> TvSubstEnv +getTvSubstEnv (TvSubst _ env) = env + +getTvInScope :: TvSubst -> InScopeSet +getTvInScope (TvSubst in_scope _) = in_scope + +isInScope :: Var -> TvSubst -> Bool +isInScope v (TvSubst in_scope _) = v `elemInScopeSet` in_scope + +notElemTvSubst :: TyVar -> TvSubst -> Bool +notElemTvSubst tv (TvSubst _ env) = not (tv `elemVarEnv` env) + +setTvSubstEnv :: TvSubst -> TvSubstEnv -> TvSubst +setTvSubstEnv (TvSubst in_scope _) env = TvSubst in_scope env + +extendTvInScope :: TvSubst -> [Var] -> TvSubst +extendTvInScope (TvSubst in_scope env) vars = TvSubst (extendInScopeSetList in_scope vars) env + +extendTvSubst :: TvSubst -> TyVar -> Type -> TvSubst +extendTvSubst (TvSubst in_scope env) tv ty = TvSubst in_scope (extendVarEnv env tv ty) + +extendTvSubstList :: TvSubst -> [TyVar] -> [Type] -> TvSubst +extendTvSubstList (TvSubst in_scope env) tvs tys + = TvSubst in_scope (extendVarEnvList env (tvs `zip` tys)) + +-- mkOpenTvSubst and zipOpenTvSubst generate the in-scope set from +-- the types given; but it's just a thunk so with a bit of luck +-- it'll never be evaluated + +mkOpenTvSubst :: TvSubstEnv -> TvSubst +mkOpenTvSubst env = TvSubst (mkInScopeSet (tyVarsOfTypes (varEnvElts env))) env + +zipOpenTvSubst :: [TyVar] -> [Type] -> TvSubst +zipOpenTvSubst tyvars tys +#ifdef DEBUG + | length tyvars /= length tys + = pprTrace "zipOpenTvSubst" (ppr tyvars $$ ppr tys) emptyTvSubst + | otherwise +#endif + = TvSubst (mkInScopeSet (tyVarsOfTypes tys)) (zipTyEnv tyvars tys) + +-- mkTopTvSubst is called when doing top-level substitutions. +-- Here we expect that the free vars of the range of the +-- substitution will be empty. +mkTopTvSubst :: [(TyVar, Type)] -> TvSubst +mkTopTvSubst prs = TvSubst emptyInScopeSet (mkVarEnv prs) + +zipTopTvSubst :: [TyVar] -> [Type] -> TvSubst +zipTopTvSubst tyvars tys +#ifdef DEBUG + | length tyvars /= length tys + = pprTrace "zipOpenTvSubst" (ppr tyvars $$ ppr tys) emptyTvSubst + | otherwise +#endif + = TvSubst emptyInScopeSet (zipTyEnv tyvars tys) + +zipTyEnv :: [TyVar] -> [Type] -> TvSubstEnv +zipTyEnv tyvars tys +#ifdef DEBUG + | length tyvars /= length tys + = pprTrace "mkTopTvSubst" (ppr tyvars $$ ppr tys) emptyVarEnv + | otherwise +#endif + = zip_ty_env tyvars tys emptyVarEnv + +-- Later substitutions in the list over-ride earlier ones, +-- but there should be no loops +zip_ty_env [] [] env = env +zip_ty_env (tv:tvs) (ty:tys) env = zip_ty_env tvs tys (extendVarEnv env tv ty) + -- There used to be a special case for when + -- ty == TyVarTy tv + -- (a not-uncommon case) in which case the substitution was dropped. + -- But the type-tidier changes the print-name of a type variable without + -- changing the unique, and that led to a bug. Why? Pre-tidying, we had + -- a type {Foo t}, where Foo is a one-method class. So Foo is really a newtype. 
+ -- And it happened that t was the type variable of the class. Post-tiding, + -- it got turned into {Foo t2}. The ext-core printer expanded this using + -- sourceTypeRep, but that said "Oh, t == t2" because they have the same unique, + -- and so generated a rep type mentioning t not t2. + -- + -- Simplest fix is to nuke the "optimisation" +zip_ty_env tvs tys env = pprTrace "Var/Type length mismatch: " (ppr tvs $$ ppr tys) env +-- zip_ty_env _ _ env = env + +instance Outputable TvSubst where + ppr (TvSubst ins env) + = sep[ ptext SLIT("<TvSubst"), + nest 2 (ptext SLIT("In scope:") <+> ppr ins), + nest 2 (ptext SLIT("Env:") <+> ppr env) ] +\end{code} + +%************************************************************************ +%* * + Performing type substitutions +%* * +%************************************************************************ + +\begin{code} +substTyWith :: [TyVar] -> [Type] -> Type -> Type +substTyWith tvs tys = ASSERT( length tvs == length tys ) + substTy (zipOpenTvSubst tvs tys) + +substTy :: TvSubst -> Type -> Type +substTy subst ty | isEmptyTvSubst subst = ty + | otherwise = subst_ty subst ty + +substTys :: TvSubst -> [Type] -> [Type] +substTys subst tys | isEmptyTvSubst subst = tys + | otherwise = map (subst_ty subst) tys + +substTheta :: TvSubst -> ThetaType -> ThetaType +substTheta subst theta + | isEmptyTvSubst subst = theta + | otherwise = map (substPred subst) theta + +substPred :: TvSubst -> PredType -> PredType +substPred subst (IParam n ty) = IParam n (subst_ty subst ty) +substPred subst (ClassP clas tys) = ClassP clas (map (subst_ty subst) tys) + +deShadowTy :: TyVarSet -> Type -> Type -- Remove any nested binders mentioning tvs +deShadowTy tvs ty + = subst_ty (mkTvSubst in_scope emptyTvSubstEnv) ty + where + in_scope = mkInScopeSet tvs + +-- Note that the in_scope set is poked only if we hit a forall +-- so it may often never be fully computed +subst_ty subst ty + = go ty + where + go (TyVarTy tv) = substTyVar subst tv + go (TyConApp tc tys) = let args = map go tys + in args `seqList` TyConApp tc args + + go (PredTy p) = PredTy $! (substPred subst p) + + go (NoteTy (FTVNote _) ty2) = go ty2 -- Discard the free tyvar note + + go (FunTy arg res) = (FunTy $! (go arg)) $! (go res) + go (AppTy fun arg) = mkAppTy (go fun) $! (go arg) + -- The mkAppTy smart constructor is important + -- we might be replacing (a Int), represented with App + -- by [Int], represented with TyConApp + go (ForAllTy tv ty) = case substTyVarBndr subst tv of + (subst', tv') -> ForAllTy tv' $! (subst_ty subst' ty) + +substTyVar :: TvSubst -> TyVar -> Type +substTyVar subst tv + = case lookupTyVar subst tv of + Nothing -> TyVarTy tv + Just ty' -> ty' -- See Note [Apply Once] + +lookupTyVar :: TvSubst -> TyVar -> Maybe Type +lookupTyVar (TvSubst in_scope env) tv = lookupVarEnv env tv + +substTyVarBndr :: TvSubst -> TyVar -> (TvSubst, TyVar) +substTyVarBndr subst@(TvSubst in_scope env) old_var + | old_var == new_var -- No need to clone + -- But we *must* zap any current substitution for the variable. 
+ -- For example: + -- (\x.e) with id_subst = [x |-> e'] + -- Here we must simply zap the substitution for x + -- + -- The new_id isn't cloned, but it may have a different type + -- etc, so we must return it, not the old id + = (TvSubst (in_scope `extendInScopeSet` new_var) + (delVarEnv env old_var), + new_var) + + | otherwise -- The new binder is in scope so + -- we'd better rename it away from the in-scope variables + -- Extending the substitution to do this renaming also + -- has the (correct) effect of discarding any existing + -- substitution for that variable + = (TvSubst (in_scope `extendInScopeSet` new_var) + (extendVarEnv env old_var (TyVarTy new_var)), + new_var) + where + new_var = uniqAway in_scope old_var + -- The uniqAway part makes sure the new variable is not already in scope +\end{code} diff --git a/compiler/types/TypeRep.hi-boot-5 b/compiler/types/TypeRep.hi-boot-5 new file mode 100644 index 0000000000..80452e4d2f --- /dev/null +++ b/compiler/types/TypeRep.hi-boot-5 @@ -0,0 +1,9 @@ +__interface TypeRep 1 0 where +__export TypeRep Type SourceType PredType Kind SuperKind TyThing ; +1 data Type ; +1 data SourceType ; +1 data TyThing ; +1 type PredType = SourceType ; +1 type Kind = Type ; +1 type SuperKind = Type ; + diff --git a/compiler/types/TypeRep.hi-boot-6 b/compiler/types/TypeRep.hi-boot-6 new file mode 100644 index 0000000000..55d80a6acc --- /dev/null +++ b/compiler/types/TypeRep.hi-boot-6 @@ -0,0 +1,6 @@ +module TypeRep where + +data Type +data PredType +data TyThing + diff --git a/compiler/types/TypeRep.lhs b/compiler/types/TypeRep.lhs new file mode 100644 index 0000000000..7bb863a210 --- /dev/null +++ b/compiler/types/TypeRep.lhs @@ -0,0 +1,409 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1998 +% +\section[TypeRep]{Type - friends' interface} + +\begin{code} +module TypeRep ( + TyThing(..), + Type(..), TyNote(..), -- Representation visible + PredType(..), -- to friends + + Kind, ThetaType, -- Synonyms + + funTyCon, + + -- Pretty-printing + pprType, pprParendType, pprTyThingCategory, + pprPred, pprTheta, pprThetaArrow, pprClassPred, + + -- Re-export fromKind + liftedTypeKind, unliftedTypeKind, openTypeKind, + isLiftedTypeKind, isUnliftedTypeKind, isOpenTypeKind, + mkArrowKind, mkArrowKinds, + pprKind, pprParendKind + ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} DataCon( DataCon, dataConName ) + +-- friends: +import Kind +import Var ( Var, Id, TyVar, tyVarKind ) +import VarSet ( TyVarSet ) +import Name ( Name, NamedThing(..), BuiltInSyntax(..), mkWiredInName ) +import OccName ( mkOccNameFS, tcName, parenSymOcc ) +import BasicTypes ( IPName, tupleParens ) +import TyCon ( TyCon, mkFunTyCon, tyConArity, tupleTyConBoxity, isTupleTyCon, isRecursiveTyCon, isNewTyCon ) +import Class ( Class ) + +-- others +import PrelNames ( gHC_PRIM, funTyConKey, listTyConKey, parrTyConKey, hasKey ) +import Outputable +\end{code} + +%************************************************************************ +%* * +\subsection{Type Classifications} +%* * +%************************************************************************ + +A type is + + *unboxed* iff its representation is other than a pointer + Unboxed types are also unlifted. + + *lifted* A type is lifted iff it has bottom as an element. + Closures always have lifted types: i.e. any + let-bound identifier in Core must have a lifted + type. Operationally, a lifted object is one that + can be entered. + + Only lifted types may be unified with a type variable. 
+ + *algebraic* A type with one or more constructors, whether declared + with "data" or "newtype". + An algebraic type is one that can be deconstructed + with a case expression. + *NOT* the same as lifted types, because we also + include unboxed tuples in this classification. + + *data* A type declared with "data". Also boxed tuples. + + *primitive* iff it is a built-in type that can't be expressed + in Haskell. + +Currently, all primitive types are unlifted, but that's not necessarily +the case. (E.g. Int could be primitive.) + +Some primitive types are unboxed, such as Int#, whereas some are boxed +but unlifted (such as ByteArray#). The only primitive types that we +classify as algebraic are the unboxed tuples. + +examples of type classifications: + +Type primitive boxed lifted algebraic +----------------------------------------------------------------------------- +Int#, Yes No No No +ByteArray# Yes Yes No No +(# a, b #) Yes No No Yes +( a, b ) No Yes Yes Yes +[a] No Yes Yes Yes + + + + ---------------------- + A note about newtypes + ---------------------- + +Consider + newtype N = MkN Int + +Then we want N to be represented as an Int, and that's what we arrange. +The front end of the compiler [TcType.lhs] treats N as opaque, +the back end treats it as transparent [Type.lhs]. + +There's a bit of a problem with recursive newtypes + newtype P = MkP P + newtype Q = MkQ (Q->Q) + +Here the 'implicit expansion' we get from treating P and Q as transparent +would give rise to infinite types, which in turn makes eqType diverge. +Similarly splitForAllTys and splitFunTys can get into a loop. + +Solution: + +* Newtypes are always represented using TyConApp. + +* For non-recursive newtypes, P, treat P just like a type synonym after + type-checking is done; i.e. it's opaque during type checking (functions + from TcType) but transparent afterwards (functions from Type). + "Treat P as a type synonym" means "all functions expand NewTcApps + on the fly". + + Applications of the data constructor P simply vanish: + P x = x + + +* For recursive newtypes Q, treat the Q and its representation as + distinct right through the compiler. Applications of the data consructor + use a coerce: + Q = \(x::Q->Q). coerce Q x + They are rare, so who cares if they are a tiny bit less efficient. + +The typechecker (TcTyDecls) identifies enough type construtors as 'recursive' +to cut all loops. The other members of the loop may be marked 'non-recursive'. + + +%************************************************************************ +%* * +\subsection{The data type} +%* * +%************************************************************************ + + +\begin{code} +data Type + = TyVarTy TyVar + + | AppTy + Type -- Function is *not* a TyConApp + Type -- It must be another AppTy, or TyVarTy + -- (or NoteTy of these) + + | TyConApp -- Application of a TyCon, including newtypes *and* synonyms + TyCon -- *Invariant* saturated appliations of FunTyCon and + -- synonyms have their own constructors, below. + -- However, *unsaturated* FunTyCons do appear as TyConApps. + -- + [Type] -- Might not be saturated. + -- Even type synonyms are not necessarily saturated; + -- for example unsaturated type synonyms can appear as the + -- RHS of a type synonym. + + | FunTy -- Special case of TyConApp: TyConApp FunTyCon [t1,t2] + Type + Type + + | ForAllTy -- A polymorphic type + TyVar + Type + + | PredTy -- A high level source type + PredType -- ...can be expanded to a representation type... 
+ + | NoteTy -- A type with a note attached + TyNote + Type -- The expanded version + +data TyNote = FTVNote TyVarSet -- The free type variables of the noted expression +\end{code} + +------------------------------------- + Source types + +A type of the form + PredTy p +represents a value whose type is the Haskell predicate p, +where a predicate is what occurs before the '=>' in a Haskell type. +It can be expanded into its representation, but: + + * The type checker must treat it as opaque + * The rest of the compiler treats it as transparent + +Consider these examples: + f :: (Eq a) => a -> Int + g :: (?x :: Int -> Int) => a -> Int + h :: (r\l) => {r} => {l::Int | r} + +Here the "Eq a" and "?x :: Int -> Int" and "r\l" are all called *predicates* +Predicates are represented inside GHC by PredType: + +\begin{code} +data PredType + = ClassP Class [Type] -- Class predicate + | IParam (IPName Name) Type -- Implicit parameter + +type ThetaType = [PredType] +\end{code} + +(We don't support TREX records yet, but the setup is designed +to expand to allow them.) + +A Haskell qualified type, such as that for f,g,h above, is +represented using + * a FunTy for the double arrow + * with a PredTy as the function argument + +The predicate really does turn into a real extra argument to the +function. If the argument has type (PredTy p) then the predicate p is +represented by evidence (a dictionary, for example, of type (predRepTy p). + + +%************************************************************************ +%* * + TyThing +%* * +%************************************************************************ + +Despite the fact that DataCon has to be imported via a hi-boot route, +this module seems the right place for TyThing, because it's needed for +funTyCon and all the types in TysPrim. + +\begin{code} +data TyThing = AnId Id + | ADataCon DataCon + | ATyCon TyCon + | AClass Class + +instance Outputable TyThing where + ppr thing = pprTyThingCategory thing <+> quotes (ppr (getName thing)) + +pprTyThingCategory :: TyThing -> SDoc +pprTyThingCategory (ATyCon _) = ptext SLIT("Type constructor") +pprTyThingCategory (AClass _) = ptext SLIT("Class") +pprTyThingCategory (AnId _) = ptext SLIT("Identifier") +pprTyThingCategory (ADataCon _) = ptext SLIT("Data constructor") + +instance NamedThing TyThing where -- Can't put this with the type + getName (AnId id) = getName id -- decl, because the DataCon instance + getName (ATyCon tc) = getName tc -- isn't visible there + getName (AClass cl) = getName cl + getName (ADataCon dc) = dataConName dc +\end{code} + + +%************************************************************************ +%* * +\subsection{Wired-in type constructors +%* * +%************************************************************************ + +We define a few wired-in type constructors here to avoid module knots + +\begin{code} +funTyCon = mkFunTyCon funTyConName (mkArrowKinds [argTypeKind, openTypeKind] liftedTypeKind) + -- You might think that (->) should have type (?? -> ? -> *), and you'd be right + -- But if we do that we get kind errors when saying + -- instance Control.Arrow (->) + -- becuase the expected kind is (*->*->*). The trouble is that the + -- expected/actual stuff in the unifier does not go contra-variant, whereas + -- the kind sub-typing does. Sigh. It really only matters if you use (->) in + -- a prefix way, thus: (->) Int# Int#. And this is unusual. 
+ +funTyConName = mkWiredInName gHC_PRIM + (mkOccNameFS tcName FSLIT("(->)")) + funTyConKey + Nothing -- No parent object + (ATyCon funTyCon) -- Relevant TyCon + BuiltInSyntax +\end{code} + + +%************************************************************************ +%* * +\subsection{The external interface} +%* * +%************************************************************************ + +@pprType@ is the standard @Type@ printer; the overloaded @ppr@ function is +defined to use this. @pprParendType@ is the same, except it puts +parens around the type, except for the atomic cases. @pprParendType@ +works just by setting the initial context precedence very high. + +\begin{code} +data Prec = TopPrec -- No parens + | FunPrec -- Function args; no parens for tycon apps + | TyConPrec -- Tycon args; no parens for atomic + deriving( Eq, Ord ) + +maybeParen :: Prec -> Prec -> SDoc -> SDoc +maybeParen ctxt_prec inner_prec pretty + | ctxt_prec < inner_prec = pretty + | otherwise = parens pretty + +------------------ +pprType, pprParendType :: Type -> SDoc +pprType ty = ppr_type TopPrec ty +pprParendType ty = ppr_type TyConPrec ty + +------------------ +pprPred :: PredType -> SDoc +pprPred (ClassP cls tys) = pprClassPred cls tys +pprPred (IParam ip ty) = ppr ip <> dcolon <> pprType ty + +pprClassPred :: Class -> [Type] -> SDoc +pprClassPred clas tys = parenSymOcc (getOccName clas) (ppr clas) + <+> sep (map pprParendType tys) + +pprTheta :: ThetaType -> SDoc +pprTheta theta = parens (sep (punctuate comma (map pprPred theta))) + +pprThetaArrow :: ThetaType -> SDoc +pprThetaArrow theta + | null theta = empty + | otherwise = parens (sep (punctuate comma (map pprPred theta))) <+> ptext SLIT("=>") + +------------------ +instance Outputable Type where + ppr ty = pprType ty + +instance Outputable PredType where + ppr = pprPred + +instance Outputable name => OutputableBndr (IPName name) where + pprBndr _ n = ppr n -- Simple for now + +------------------ + -- OK, here's the main printer + +ppr_type :: Prec -> Type -> SDoc +ppr_type p (TyVarTy tv) = ppr tv +ppr_type p (PredTy pred) = braces (ppr pred) +ppr_type p (NoteTy other ty2) = ppr_type p ty2 +ppr_type p (TyConApp tc tys) = ppr_tc_app p tc tys + +ppr_type p (AppTy t1 t2) = maybeParen p TyConPrec $ + pprType t1 <+> ppr_type TyConPrec t2 + +ppr_type p ty@(ForAllTy _ _) = ppr_forall_type p ty +ppr_type p ty@(FunTy (PredTy _) _) = ppr_forall_type p ty + +ppr_type p (FunTy ty1 ty2) + = -- We don't want to lose synonyms, so we mustn't use splitFunTys here. 
+ maybeParen p FunPrec $ + sep (ppr_type FunPrec ty1 : ppr_fun_tail ty2) + where + ppr_fun_tail (FunTy ty1 ty2) = (arrow <+> ppr_type FunPrec ty1) : ppr_fun_tail ty2 + ppr_fun_tail other_ty = [arrow <+> pprType other_ty] + +ppr_forall_type :: Prec -> Type -> SDoc +ppr_forall_type p ty + = maybeParen p FunPrec $ + sep [pprForAll tvs, pprThetaArrow ctxt, pprType tau] + where + (tvs, rho) = split1 [] ty + (ctxt, tau) = split2 [] rho + + split1 tvs (ForAllTy tv ty) = split1 (tv:tvs) ty + split1 tvs (NoteTy _ ty) = split1 tvs ty + split1 tvs ty = (reverse tvs, ty) + + split2 ps (NoteTy _ arg -- Rather a disgusting case + `FunTy` res) = split2 ps (arg `FunTy` res) + split2 ps (PredTy p `FunTy` ty) = split2 (p:ps) ty + split2 ps (NoteTy _ ty) = split2 ps ty + split2 ps ty = (reverse ps, ty) + +ppr_tc_app :: Prec -> TyCon -> [Type] -> SDoc +ppr_tc_app p tc [] + = ppr_tc tc +ppr_tc_app p tc [ty] + | tc `hasKey` listTyConKey = brackets (pprType ty) + | tc `hasKey` parrTyConKey = ptext SLIT("[:") <> pprType ty <> ptext SLIT(":]") +ppr_tc_app p tc tys + | isTupleTyCon tc && tyConArity tc == length tys + = tupleParens (tupleTyConBoxity tc) (sep (punctuate comma (map pprType tys))) + | otherwise + = maybeParen p TyConPrec $ + ppr_tc tc <+> sep (map (ppr_type TyConPrec) tys) + +ppr_tc :: TyCon -> SDoc +ppr_tc tc = parenSymOcc (getOccName tc) (pp_nt_debug <> ppr tc) + where + pp_nt_debug | isNewTyCon tc = ifPprDebug (if isRecursiveTyCon tc + then ptext SLIT("<recnt>") + else ptext SLIT("<nt>")) + | otherwise = empty + +------------------- +pprForAll [] = empty +pprForAll tvs = ptext SLIT("forall") <+> sep (map pprTvBndr tvs) <> dot + +pprTvBndr tv | isLiftedTypeKind kind = ppr tv + | otherwise = parens (ppr tv <+> dcolon <+> pprKind kind) + where + kind = tyVarKind tv +\end{code} + diff --git a/compiler/types/TypeRep.lhs-boot b/compiler/types/TypeRep.lhs-boot new file mode 100644 index 0000000000..b99fdd3321 --- /dev/null +++ b/compiler/types/TypeRep.lhs-boot @@ -0,0 +1,8 @@ +\begin{code} +module TypeRep where + +data Type +data PredType +data TyThing +\end{code} + diff --git a/compiler/types/Unify.lhs b/compiler/types/Unify.lhs new file mode 100644 index 0000000000..f60c7bee61 --- /dev/null +++ b/compiler/types/Unify.lhs @@ -0,0 +1,536 @@ +\begin{code} +module Unify ( + -- Matching and unification + tcMatchTys, tcMatchTyX, ruleMatchTyX, tcMatchPreds, MatchEnv(..), + + tcUnifyTys, + + gadtRefineTys, BindFlag(..), + + coreRefineTys, TypeRefinement, + + -- Re-export + MaybeErr(..) + ) where + +#include "HsVersions.h" + +import Var ( Var, TyVar, tyVarKind ) +import VarEnv +import VarSet +import Kind ( isSubKind ) +import Type ( typeKind, tyVarsOfType, tyVarsOfTypes, tyVarsOfTheta, mkTyVarTys, + TvSubstEnv, emptyTvSubstEnv, TvSubst(..), substTy, tcEqTypeX, + mkOpenTvSubst, tcView ) +import TypeRep ( Type(..), PredType(..), funTyCon ) +import DataCon ( DataCon, dataConInstResTy ) +import Util ( snocView ) +import ErrUtils ( Message ) +import Outputable +import Maybes +\end{code} + + +%************************************************************************ +%* * + Matching +%* * +%************************************************************************ + + +Matching is much tricker than you might think. + +1. The substitution we generate binds the *template type variables* + which are given to us explicitly. + +2. We want to match in the presence of foralls; + e.g (forall a. t1) ~ (forall b. t2) + + That is what the RnEnv2 is for; it does the alpha-renaming + that makes it as if a and b were the same variable. 
+ Initialising the RnEnv2, so that it can generate a fresh + binder when necessary, entails knowing the free variables of + both types. + +3. We must be careful not to bind a template type variable to a + locally bound variable. E.g. + (forall a. x) ~ (forall b. b) + where x is the template type variable. Then we do not want to + bind x to a/b! This is a kind of occurs check. + The necessary locals accumulate in the RnEnv2. + + +\begin{code} +data MatchEnv + = ME { me_tmpls :: VarSet -- Template tyvars + , me_env :: RnEnv2 -- Renaming envt for nested foralls + } -- In-scope set includes template tyvars + +tcMatchTys :: TyVarSet -- Template tyvars + -> [Type] -- Template + -> [Type] -- Target + -> Maybe TvSubst -- One-shot; in principle the template + -- variables could be free in the target + +tcMatchTys tmpls tys1 tys2 + = case match_tys menv emptyTvSubstEnv tys1 tys2 of + Just subst_env -> Just (TvSubst in_scope subst_env) + Nothing -> Nothing + where + menv = ME { me_tmpls = tmpls, me_env = mkRnEnv2 in_scope } + in_scope = mkInScopeSet (tmpls `unionVarSet` tyVarsOfTypes tys2) + -- We're assuming that all the interesting + -- tyvars in tys1 are in tmpls + +-- This is similar, but extends a substitution +tcMatchTyX :: TyVarSet -- Template tyvars + -> TvSubst -- Substitution to extend + -> Type -- Template + -> Type -- Target + -> Maybe TvSubst +tcMatchTyX tmpls (TvSubst in_scope subst_env) ty1 ty2 + = case match menv subst_env ty1 ty2 of + Just subst_env -> Just (TvSubst in_scope subst_env) + Nothing -> Nothing + where + menv = ME {me_tmpls = tmpls, me_env = mkRnEnv2 in_scope} + +tcMatchPreds + :: [TyVar] -- Bind these + -> [PredType] -> [PredType] + -> Maybe TvSubstEnv +tcMatchPreds tmpls ps1 ps2 + = match_list (match_pred menv) emptyTvSubstEnv ps1 ps2 + where + menv = ME { me_tmpls = mkVarSet tmpls, me_env = mkRnEnv2 in_scope_tyvars } + in_scope_tyvars = mkInScopeSet (tyVarsOfTheta ps1 `unionVarSet` tyVarsOfTheta ps2) + +-- This one is called from the expression matcher, which already has a MatchEnv in hand +ruleMatchTyX :: MatchEnv + -> TvSubstEnv -- Substitution to extend + -> Type -- Template + -> Type -- Target + -> Maybe TvSubstEnv + +ruleMatchTyX menv subst ty1 ty2 = match menv subst ty1 ty2 -- Rename for export +\end{code} + +Now the internals of matching + +\begin{code} +match :: MatchEnv -- For the most part this is pushed downwards + -> TvSubstEnv -- Substitution so far: + -- Domain is subset of template tyvars + -- Free vars of range is subset of + -- in-scope set of the RnEnv2 + -> Type -> Type -- Template and target respectively + -> Maybe TvSubstEnv +-- This matcher works on source types; that is, +-- it respects NewTypes and PredType + +match menv subst ty1 ty2 | Just ty1' <- tcView ty1 = match menv subst ty1' ty2 +match menv subst ty1 ty2 | Just ty2' <- tcView ty2 = match menv subst ty1 ty2' + +match menv subst (TyVarTy tv1) ty2 + | tv1 `elemVarSet` me_tmpls menv + = case lookupVarEnv subst tv1' of + Nothing | any (inRnEnvR rn_env) (varSetElems (tyVarsOfType ty2)) + -> Nothing -- Occurs check + | not (typeKind ty2 `isSubKind` tyVarKind tv1) + -> Nothing -- Kind mis-match + | otherwise + -> Just (extendVarEnv subst tv1 ty2) + + Just ty1' | tcEqTypeX (nukeRnEnvL rn_env) ty1' ty2 + -- ty1 has no locally-bound variables, hence nukeRnEnvL + -- Note tcEqType...we are doing source-type matching here + -> Just subst + + other -> Nothing + + | otherwise -- tv1 is not a template tyvar + = case ty2 of + TyVarTy tv2 | tv1' == rnOccR rn_env tv2 -> Just subst + other -> Nothing + 
where + rn_env = me_env menv + tv1' = rnOccL rn_env tv1 + +match menv subst (ForAllTy tv1 ty1) (ForAllTy tv2 ty2) + = match menv' subst ty1 ty2 + where -- Use the magic of rnBndr2 to go under the binders + menv' = menv { me_env = rnBndr2 (me_env menv) tv1 tv2 } + +match menv subst (PredTy p1) (PredTy p2) + = match_pred menv subst p1 p2 +match menv subst (TyConApp tc1 tys1) (TyConApp tc2 tys2) + | tc1 == tc2 = match_tys menv subst tys1 tys2 +match menv subst (FunTy ty1a ty1b) (FunTy ty2a ty2b) + = do { subst' <- match menv subst ty1a ty2a + ; match menv subst' ty1b ty2b } +match menv subst (AppTy ty1a ty1b) ty2 + | Just (ty2a, ty2b) <- repSplitAppTy_maybe ty2 + = do { subst' <- match menv subst ty1a ty2a + ; match menv subst' ty1b ty2b } + +match menv subst ty1 ty2 + = Nothing + +-------------- +match_tys menv subst tys1 tys2 = match_list (match menv) subst tys1 tys2 + +-------------- +match_list :: (TvSubstEnv -> a -> a -> Maybe TvSubstEnv) + -> TvSubstEnv -> [a] -> [a] -> Maybe TvSubstEnv +match_list fn subst [] [] = Just subst +match_list fn subst (ty1:tys1) (ty2:tys2) = do { subst' <- fn subst ty1 ty2 + ; match_list fn subst' tys1 tys2 } +match_list fn subst tys1 tys2 = Nothing + +-------------- +match_pred menv subst (ClassP c1 tys1) (ClassP c2 tys2) + | c1 == c2 = match_tys menv subst tys1 tys2 +match_pred menv subst (IParam n1 t1) (IParam n2 t2) + | n1 == n2 = match menv subst t1 t2 +match_pred menv subst p1 p2 = Nothing +\end{code} + + +%************************************************************************ +%* * + Unification +%* * +%************************************************************************ + +\begin{code} +tcUnifyTys :: (TyVar -> BindFlag) + -> [Type] -> [Type] + -> Maybe TvSubst -- A regular one-shot substitution +-- The two types may have common type variables, and indeed do so in the +-- second call to tcUnifyTys in FunDeps.checkClsFD +tcUnifyTys bind_fn tys1 tys2 + = maybeErrToMaybe $ initUM bind_fn $ + do { subst_env <- unify_tys emptyTvSubstEnv tys1 tys2 + + -- Find the fixed point of the resulting non-idempotent substitution + ; let in_scope = mkInScopeSet (tvs1 `unionVarSet` tvs2) + subst = TvSubst in_scope subst_env_fixpt + subst_env_fixpt = mapVarEnv (substTy subst) subst_env + ; return subst } + where + tvs1 = tyVarsOfTypes tys1 + tvs2 = tyVarsOfTypes tys2 + +---------------------------- +coreRefineTys :: DataCon -> [TyVar] -- Case pattern (con tv1 .. tvn ...) + -> Type -- Type of scrutinee + -> Maybe TypeRefinement + +type TypeRefinement = (TvSubstEnv, Bool) + -- The Bool is True iff all the bindings in the + -- env are for the pattern type variables + -- In this case, there is no type refinement + -- for already-in-scope type variables + +-- Used by Core Lint and the simplifier. 
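+-- Illustrative sketch (hypothetical GADT, not part of this module): with
+--	data T a where { TInt :: T Int }
+-- matching a scrutinee of type (T b) against the pattern (TInt) unifies the
+-- constructor's result type (T Int) with (T b), so the refinement maps b to
+-- Int; the Bool component is False, because b is an already-in-scope tyvar
+-- rather than one bound by the pattern.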
+coreRefineTys con tvs scrut_ty + = maybeErrToMaybe $ initUM (tryToBind tv_set) $ + do { -- Run the unifier, starting with an empty env + ; subst_env <- unify emptyTvSubstEnv pat_res_ty scrut_ty + + -- Find the fixed point of the resulting non-idempotent substitution + ; let subst = mkOpenTvSubst subst_env_fixpt + subst_env_fixpt = mapVarEnv (substTy subst) subst_env + + ; return (subst_env_fixpt, all_bound_here subst_env) } + where + pat_res_ty = dataConInstResTy con (mkTyVarTys tvs) + + -- 'tvs' are the tyvars bound by the pattern + tv_set = mkVarSet tvs + all_bound_here env = all bound_here (varEnvKeys env) + bound_here uniq = elemVarSetByKey uniq tv_set + +-- This version is used by the type checker +gadtRefineTys :: TvSubst + -> DataCon -> [TyVar] + -> [Type] -> [Type] + -> MaybeErr Message (TvSubst, Bool) +-- The bool is True <=> the only *new* bindings are for pat_tvs + +gadtRefineTys (TvSubst in_scope env1) con pat_tvs pat_tys ctxt_tys + = initUM (tryToBind tv_set) $ + do { -- Run the unifier, starting with an empty env + ; env2 <- unify_tys env1 pat_tys ctxt_tys + + -- Find the fixed point of the resulting non-idempotent substitution + ; let subst2 = TvSubst in_scope subst_env_fixpt + subst_env_fixpt = mapVarEnv (substTy subst2) env2 + + ; return (subst2, all_bound_here env2) } + where + -- 'tvs' are the tyvars bound by the pattern + tv_set = mkVarSet pat_tvs + all_bound_here env = all bound_here (varEnvKeys env) + bound_here uniq = elemVarEnvByKey uniq env1 || elemVarSetByKey uniq tv_set + -- The bool is True <=> the only *new* bindings are for pat_tvs + +---------------------------- +tryToBind :: TyVarSet -> TyVar -> BindFlag +tryToBind tv_set tv | tv `elemVarSet` tv_set = BindMe + | otherwise = AvoidMe +\end{code} + + +%************************************************************************ +%* * + The workhorse +%* * +%************************************************************************ + +\begin{code} +unify :: TvSubstEnv -- An existing substitution to extend + -> Type -> Type -- Types to be unified + -> UM TvSubstEnv -- Just the extended substitution, + -- Nothing if unification failed +-- We do not require the incoming substitution to be idempotent, +-- nor guarantee that the outgoing one is. That's fixed up by +-- the wrappers. + +-- Respects newtypes, PredTypes + +unify subst ty1 ty2 = -- pprTrace "unify" (ppr subst <+> pprParendType ty1 <+> pprParendType ty2) $ + unify_ subst ty1 ty2 + +-- in unify_, any NewTcApps/Preds should be taken at face value +unify_ subst (TyVarTy tv1) ty2 = uVar False subst tv1 ty2 +unify_ subst ty1 (TyVarTy tv2) = uVar True subst tv2 ty1 + +unify_ subst ty1 ty2 | Just ty1' <- tcView ty1 = unify subst ty1' ty2 +unify_ subst ty1 ty2 | Just ty2' <- tcView ty2 = unify subst ty1 ty2' + +unify_ subst (PredTy p1) (PredTy p2) = unify_pred subst p1 p2 + +unify_ subst t1@(TyConApp tyc1 tys1) t2@(TyConApp tyc2 tys2) + | tyc1 == tyc2 = unify_tys subst tys1 tys2 + +unify_ subst (FunTy ty1a ty1b) (FunTy ty2a ty2b) + = do { subst' <- unify subst ty1a ty2a + ; unify subst' ty1b ty2b } + + -- Applications need a bit of care! 
+ -- They can match FunTy and TyConApp, so use splitAppTy_maybe + -- NB: we've already dealt with type variables and Notes, + -- so if one type is an App the other one jolly well better be too +unify_ subst (AppTy ty1a ty1b) ty2 + | Just (ty2a, ty2b) <- repSplitAppTy_maybe ty2 + = do { subst' <- unify subst ty1a ty2a + ; unify subst' ty1b ty2b } + +unify_ subst ty1 (AppTy ty2a ty2b) + | Just (ty1a, ty1b) <- repSplitAppTy_maybe ty1 + = do { subst' <- unify subst ty1a ty2a + ; unify subst' ty1b ty2b } + +unify_ subst ty1 ty2 = failWith (misMatch ty1 ty2) + +------------------------------ +unify_pred subst (ClassP c1 tys1) (ClassP c2 tys2) + | c1 == c2 = unify_tys subst tys1 tys2 +unify_pred subst (IParam n1 t1) (IParam n2 t2) + | n1 == n2 = unify subst t1 t2 +unify_pred subst p1 p2 = failWith (misMatch (PredTy p1) (PredTy p2)) + +------------------------------ +unify_tys = unifyList unify + +unifyList :: Outputable a + => (TvSubstEnv -> a -> a -> UM TvSubstEnv) + -> TvSubstEnv -> [a] -> [a] -> UM TvSubstEnv +unifyList unifier subst orig_xs orig_ys + = go subst orig_xs orig_ys + where + go subst [] [] = return subst + go subst (x:xs) (y:ys) = do { subst' <- unifier subst x y + ; go subst' xs ys } + go subst _ _ = failWith (lengthMisMatch orig_xs orig_ys) + +------------------------------ +uVar :: Bool -- Swapped + -> TvSubstEnv -- An existing substitution to extend + -> TyVar -- Type variable to be unified + -> Type -- with this type + -> UM TvSubstEnv + +uVar swap subst tv1 ty + = -- Check to see whether tv1 is refined by the substitution + case (lookupVarEnv subst tv1) of + -- Yes, call back into unify' + Just ty' | swap -> unify subst ty ty' + | otherwise -> unify subst ty' ty + -- No, continue + Nothing -> uUnrefined subst tv1 ty ty + + +uUnrefined :: TvSubstEnv -- An existing substitution to extend + -> TyVar -- Type variable to be unified + -> Type -- with this type + -> Type -- (de-noted version) + -> UM TvSubstEnv + +-- We know that tv1 isn't refined + +uUnrefined subst tv1 ty2 ty2' + | Just ty2'' <- tcView ty2' + = uUnrefined subst tv1 ty2 ty2'' -- Unwrap synonyms + -- This is essential, in case we have + -- type Foo a = a + -- and then unify a :=: Foo a + +uUnrefined subst tv1 ty2 (TyVarTy tv2) + | tv1 == tv2 -- Same type variable + = return subst + + -- Check to see whether tv2 is refined + | Just ty' <- lookupVarEnv subst tv2 + = uUnrefined subst tv1 ty' ty' + + -- So both are unrefined; next, see if the kinds force the direction + | k1 == k2 -- Can update either; so check the bind-flags + = do { b1 <- tvBindFlag tv1 + ; b2 <- tvBindFlag tv2 + ; case (b1,b2) of + (BindMe, _) -> bind tv1 ty2 + + (AvoidMe, BindMe) -> bind tv2 ty1 + (AvoidMe, _) -> bind tv1 ty2 + + (WildCard, WildCard) -> return subst + (WildCard, Skolem) -> return subst + (WildCard, _) -> bind tv2 ty1 + + (Skolem, WildCard) -> return subst + (Skolem, Skolem) -> failWith (misMatch ty1 ty2) + (Skolem, _) -> bind tv2 ty1 + } + + | k1 `isSubKind` k2 = bindTv subst tv2 ty1 -- Must update tv2 + | k2 `isSubKind` k1 = bindTv subst tv1 ty2 -- Must update tv1 + + | otherwise = failWith (kindMisMatch tv1 ty2) + where + ty1 = TyVarTy tv1 + k1 = tyVarKind tv1 + k2 = tyVarKind tv2 + bind tv ty = return (extendVarEnv subst tv ty) + +uUnrefined subst tv1 ty2 ty2' -- ty2 is not a type variable + | tv1 `elemVarSet` substTvSet subst (tyVarsOfType ty2') + = failWith (occursCheck tv1 ty2) -- Occurs check + | not (k2 `isSubKind` k1) + = failWith (kindMisMatch tv1 ty2) -- Kind check + | otherwise + = bindTv subst tv1 ty2 -- Bind tyvar to 
the synonym if poss + where + k1 = tyVarKind tv1 + k2 = typeKind ty2' + +substTvSet :: TvSubstEnv -> TyVarSet -> TyVarSet +-- Apply the non-idempotent substitution to a set of type variables, +-- remembering that the substitution isn't necessarily idempotent +substTvSet subst tvs + = foldVarSet (unionVarSet . get) emptyVarSet tvs + where + get tv = case lookupVarEnv subst tv of + Nothing -> unitVarSet tv + Just ty -> substTvSet subst (tyVarsOfType ty) + +bindTv subst tv ty -- ty is not a type variable + = do { b <- tvBindFlag tv + ; case b of + Skolem -> failWith (misMatch (TyVarTy tv) ty) + WildCard -> return subst + other -> return (extendVarEnv subst tv ty) + } +\end{code} + +%************************************************************************ +%* * + Unification monad +%* * +%************************************************************************ + +\begin{code} +data BindFlag + = BindMe -- A regular type variable + | AvoidMe -- Like BindMe but, given the choice, avoid binding it + + | Skolem -- This type variable is a skolem constant + -- Don't bind it; it only matches itself + + | WildCard -- This type variable matches anything, + -- and does not affect the substitution + +newtype UM a = UM { unUM :: (TyVar -> BindFlag) + -> MaybeErr Message a } + +instance Monad UM where + return a = UM (\tvs -> Succeeded a) + fail s = UM (\tvs -> Failed (text s)) + m >>= k = UM (\tvs -> case unUM m tvs of + Failed err -> Failed err + Succeeded v -> unUM (k v) tvs) + +initUM :: (TyVar -> BindFlag) -> UM a -> MaybeErr Message a +initUM badtvs um = unUM um badtvs + +tvBindFlag :: TyVar -> UM BindFlag +tvBindFlag tv = UM (\tv_fn -> Succeeded (tv_fn tv)) + +failWith :: Message -> UM a +failWith msg = UM (\tv_fn -> Failed msg) + +maybeErrToMaybe :: MaybeErr fail succ -> Maybe succ +maybeErrToMaybe (Succeeded a) = Just a +maybeErrToMaybe (Failed m) = Nothing + +------------------------------ +repSplitAppTy_maybe :: Type -> Maybe (Type,Type) +-- Like Type.splitAppTy_maybe, but any coreView stuff is already done +repSplitAppTy_maybe (FunTy ty1 ty2) = Just (TyConApp funTyCon [ty1], ty2) +repSplitAppTy_maybe (AppTy ty1 ty2) = Just (ty1, ty2) +repSplitAppTy_maybe (TyConApp tc tys) = case snocView tys of + Just (tys', ty') -> Just (TyConApp tc tys', ty') + Nothing -> Nothing +repSplitAppTy_maybe other = Nothing +\end{code} + + +%************************************************************************ +%* * + Error reporting + We go to a lot more trouble to tidy the types + in TcUnify. Maybe we'll end up having to do that + here too, but I'll leave it for now. +%* * +%************************************************************************ + +\begin{code} +misMatch t1 t2 + = ptext SLIT("Can't match types") <+> quotes (ppr t1) <+> + ptext SLIT("and") <+> quotes (ppr t2) + +lengthMisMatch tys1 tys2 + = sep [ptext SLIT("Can't match unequal length lists"), + nest 2 (ppr tys1), nest 2 (ppr tys2) ] + +kindMisMatch tv1 t2 + = vcat [ptext SLIT("Can't match kinds") <+> quotes (ppr (tyVarKind tv1)) <+> + ptext SLIT("and") <+> quotes (ppr (typeKind t2)), + ptext SLIT("when matching") <+> quotes (ppr tv1) <+> + ptext SLIT("with") <+> quotes (ppr t2)] + +occursCheck tv ty + = hang (ptext SLIT("Can't construct the infinite type")) + 2 (ppr tv <+> equals <+> ppr ty) +\end{code}
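
The matcher above is one-way: only the template type variables may be bound, whereas tcUnifyTys may bind variables on either side. As a purely editorial illustration (not part of this patch), the standalone sketch below shows the essence of one-way matching on a toy type representation; the names Ty, Subst, matchTy and matchTys are invented for the example, and foralls, kinds and type-synonym expansion are all omitted.

\begin{code}
-- Editorial sketch only: one-way matching over a toy type language.
data Ty = V String          -- type variable
        | App String [Ty]   -- constructor applied to arguments
        deriving (Eq, Show)

type Subst = [(String, Ty)]  -- association list of bindings

-- Bind only variables in the template list; a variable that is already
-- bound must map to exactly the same target type.
matchTy :: [String] -> Subst -> Ty -> Ty -> Maybe Subst
matchTy tmpls subst (V v) t2
  | v `elem` tmpls
  = case lookup v subst of
      Nothing              -> Just ((v, t2) : subst)
      Just t1' | t1' == t2 -> Just subst
      _                    -> Nothing
  | otherwise
  = case t2 of
      V v2 | v == v2 -> Just subst   -- non-template variables match only themselves
      _              -> Nothing
matchTy tmpls subst (App c1 ts1) (App c2 ts2)
  | c1 == c2 = matchTys tmpls subst ts1 ts2
matchTy _ _ _ _ = Nothing

matchTys :: [String] -> Subst -> [Ty] -> [Ty] -> Maybe Subst
matchTys _     subst []       []       = Just subst
matchTys tmpls subst (t1:ts1) (t2:ts2) = do { subst' <- matchTy tmpls subst t1 t2
                                            ; matchTys tmpls subst' ts1 ts2 }
matchTys _     _     _        _        = Nothing

-- For example, matching the template (Maybe a) against (Maybe Int):
--   matchTy ["a"] [] (App "Maybe" [V "a"]) (App "Maybe" [App "Int" []])
-- yields Just [("a", App "Int" [])].
\end{code}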
\ No newline at end of file diff --git a/compiler/utils/Bag.lhs b/compiler/utils/Bag.lhs new file mode 100644 index 0000000000..b107f84a3a --- /dev/null +++ b/compiler/utils/Bag.lhs @@ -0,0 +1,177 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[Bags]{@Bag@: an unordered collection with duplicates} + +\begin{code} +module Bag ( + Bag, -- abstract type + + emptyBag, unitBag, unionBags, unionManyBags, + mapBag, + elemBag, + filterBag, partitionBag, concatBag, foldBag, foldrBag, foldlBag, + isEmptyBag, isSingletonBag, consBag, snocBag, anyBag, + listToBag, bagToList, + mapBagM, mapAndUnzipBagM + ) where + +#include "HsVersions.h" + +import Outputable +import Util ( isSingleton ) +import List ( partition ) +\end{code} + + +\begin{code} +data Bag a + = EmptyBag + | UnitBag a + | TwoBags (Bag a) (Bag a) -- INVARIANT: neither branch is empty + | ListBag [a] -- INVARIANT: the list is non-empty + +emptyBag = EmptyBag +unitBag = UnitBag + +elemBag :: Eq a => a -> Bag a -> Bool + +elemBag x EmptyBag = False +elemBag x (UnitBag y) = x==y +elemBag x (TwoBags b1 b2) = x `elemBag` b1 || x `elemBag` b2 +elemBag x (ListBag ys) = any (x ==) ys + +unionManyBags :: [Bag a] -> Bag a +unionManyBags xs = foldr unionBags EmptyBag xs + +-- This one is a bit stricter! The bag will get completely evaluated. + +unionBags :: Bag a -> Bag a -> Bag a +unionBags EmptyBag b = b +unionBags b EmptyBag = b +unionBags b1 b2 = TwoBags b1 b2 + +consBag :: a -> Bag a -> Bag a +snocBag :: Bag a -> a -> Bag a + +consBag elt bag = (unitBag elt) `unionBags` bag +snocBag bag elt = bag `unionBags` (unitBag elt) + +isEmptyBag EmptyBag = True +isEmptyBag other = False -- NB invariants + +isSingletonBag :: Bag a -> Bool +isSingletonBag EmptyBag = False +isSingletonBag (UnitBag x) = True +isSingletonBag (TwoBags b1 b2) = False -- Neither is empty +isSingletonBag (ListBag xs) = isSingleton xs + +filterBag :: (a -> Bool) -> Bag a -> Bag a +filterBag pred EmptyBag = EmptyBag +filterBag pred b@(UnitBag val) = if pred val then b else EmptyBag +filterBag pred (TwoBags b1 b2) = sat1 `unionBags` sat2 + where + sat1 = filterBag pred b1 + sat2 = filterBag pred b2 +filterBag pred (ListBag vs) = listToBag (filter pred vs) + +anyBag :: (a -> Bool) -> Bag a -> Bool +anyBag p EmptyBag = False +anyBag p (UnitBag v) = p v +anyBag p (TwoBags b1 b2) = anyBag p b1 || anyBag p b2 +anyBag p (ListBag xs) = any p xs + +concatBag :: Bag (Bag a) -> Bag a +concatBag EmptyBag = EmptyBag +concatBag (UnitBag b) = b +concatBag (TwoBags b1 b2) = concatBag b1 `unionBags` concatBag b2 +concatBag (ListBag bs) = unionManyBags bs + +partitionBag :: (a -> Bool) -> Bag a -> (Bag a {- Satisfy predictate -}, + Bag a {- Don't -}) +partitionBag pred EmptyBag = (EmptyBag, EmptyBag) +partitionBag pred b@(UnitBag val) = if pred val then (b, EmptyBag) else (EmptyBag, b) +partitionBag pred (TwoBags b1 b2) = (sat1 `unionBags` sat2, fail1 `unionBags` fail2) + where + (sat1,fail1) = partitionBag pred b1 + (sat2,fail2) = partitionBag pred b2 +partitionBag pred (ListBag vs) = (listToBag sats, listToBag fails) + where + (sats,fails) = partition pred vs + + +foldBag :: (r -> r -> r) -- Replace TwoBags with this; should be associative + -> (a -> r) -- Replace UnitBag with this + -> r -- Replace EmptyBag with this + -> Bag a + -> r + +{- Standard definition +foldBag t u e EmptyBag = e +foldBag t u e (UnitBag x) = u x +foldBag t u e (TwoBags b1 b2) = (foldBag t u e b1) `t` (foldBag t u e b2) +foldBag t u e (ListBag xs) = foldr (t.u) e xs +-} + +-- More 
tail-recursive definition, exploiting associativity of "t" +foldBag t u e EmptyBag = e +foldBag t u e (UnitBag x) = u x `t` e +foldBag t u e (TwoBags b1 b2) = foldBag t u (foldBag t u e b2) b1 +foldBag t u e (ListBag xs) = foldr (t.u) e xs + +foldrBag :: (a -> r -> r) -> r + -> Bag a + -> r + +foldrBag k z EmptyBag = z +foldrBag k z (UnitBag x) = k x z +foldrBag k z (TwoBags b1 b2) = foldrBag k (foldrBag k z b2) b1 +foldrBag k z (ListBag xs) = foldr k z xs + +foldlBag :: (r -> a -> r) -> r + -> Bag a + -> r + +foldlBag k z EmptyBag = z +foldlBag k z (UnitBag x) = k z x +foldlBag k z (TwoBags b1 b2) = foldlBag k (foldlBag k z b1) b2 +foldlBag k z (ListBag xs) = foldl k z xs + + +mapBag :: (a -> b) -> Bag a -> Bag b +mapBag f EmptyBag = EmptyBag +mapBag f (UnitBag x) = UnitBag (f x) +mapBag f (TwoBags b1 b2) = TwoBags (mapBag f b1) (mapBag f b2) +mapBag f (ListBag xs) = ListBag (map f xs) + +mapBagM :: Monad m => (a -> m b) -> Bag a -> m (Bag b) +mapBagM f EmptyBag = return EmptyBag +mapBagM f (UnitBag x) = do { r <- f x; return (UnitBag r) } +mapBagM f (TwoBags b1 b2) = do { r1 <- mapBagM f b1; r2 <- mapBagM f b2; return (TwoBags r1 r2) } +mapBagM f (ListBag xs) = do { rs <- mapM f xs; return (ListBag rs) } + +mapAndUnzipBagM :: Monad m => (a -> m (b,c)) -> Bag a -> m (Bag b, Bag c) +mapAndUnzipBagM f EmptyBag = return (EmptyBag, EmptyBag) +mapAndUnzipBagM f (UnitBag x) = do { (r,s) <- f x; return (UnitBag r, UnitBag s) } +mapAndUnzipBagM f (TwoBags b1 b2) = do { (r1,s1) <- mapAndUnzipBagM f b1 + ; (r2,s2) <- mapAndUnzipBagM f b2 + ; return (TwoBags r1 r2, TwoBags s1 s2) } +mapAndUnzipBagM f (ListBag xs) = do { ts <- mapM f xs + ; let (rs,ss) = unzip ts + ; return (ListBag rs, ListBag ss) } + +listToBag :: [a] -> Bag a +listToBag [] = EmptyBag +listToBag vs = ListBag vs + +bagToList :: Bag a -> [a] +bagToList b = foldrBag (:) [] b +\end{code} + +\begin{code} +instance (Outputable a) => Outputable (Bag a) where + ppr EmptyBag = ptext SLIT("emptyBag") + ppr (UnitBag a) = ppr a + ppr (TwoBags b1 b2) = hsep [ppr b1 <> comma, ppr b2] + ppr (ListBag as) = interpp'SP as +\end{code} diff --git a/compiler/utils/Binary.hs b/compiler/utils/Binary.hs new file mode 100644 index 0000000000..7a1ca515b7 --- /dev/null +++ b/compiler/utils/Binary.hs @@ -0,0 +1,756 @@ +{-# OPTIONS -cpp #-} +-- +-- (c) The University of Glasgow 2002 +-- +-- Binary I/O library, with special tweaks for GHC +-- +-- Based on the nhc98 Binary library, which is copyright +-- (c) Malcolm Wallace and Colin Runciman, University of York, 1998. 
+-- Under the terms of the license for that software, we must tell you +-- where you can obtain the original version of the Binary library, namely +-- http://www.cs.york.ac.uk/fp/nhc98/ + +module Binary + ( {-type-} Bin, + {-class-} Binary(..), + {-type-} BinHandle, + + openBinIO, openBinIO_, + openBinMem, +-- closeBin, + + seekBin, + tellBin, + castBin, + + writeBinMem, + readBinMem, + + isEOFBin, + + -- for writing instances: + putByte, + getByte, + + -- lazy Bin I/O + lazyGet, + lazyPut, + + -- GHC only: + ByteArray(..), + getByteArray, + putByteArray, + + getBinFileWithDict, -- :: Binary a => FilePath -> IO a + putBinFileWithDict, -- :: Binary a => FilePath -> ModuleName -> a -> IO () + + ) where + +#include "HsVersions.h" + +-- The *host* architecture version: +#include "MachDeps.h" + +import FastString +import Unique +import Panic +import UniqFM +import FastMutInt +import PackageConfig ( PackageId, packageIdFS, fsToPackageId ) + +import Foreign +import Data.Array.IO +import Data.Array +import Data.Bits +import Data.Int +import Data.Word +import Data.IORef +import Data.Char ( ord, chr ) +import Data.Array.Base ( unsafeRead, unsafeWrite ) +import Control.Monad ( when ) +import Control.Exception ( throwDyn ) +import System.IO as IO +import System.IO.Unsafe ( unsafeInterleaveIO ) +import System.IO.Error ( mkIOError, eofErrorType ) +import GHC.Real ( Ratio(..) ) +import GHC.Exts +import GHC.IOBase ( IO(..) ) +import GHC.Word ( Word8(..) ) +#if __GLASGOW_HASKELL__ < 601 +-- openFileEx is available from the lang package, but we want to +-- be independent of hslibs libraries. +import GHC.Handle ( openFileEx, IOModeEx(..) ) +#else +import System.IO ( openBinaryFile ) +#endif + +#if __GLASGOW_HASKELL__ < 601 +openBinaryFile f mode = openFileEx f (BinaryMode mode) +#endif + +type BinArray = IOUArray Int Word8 + +--------------------------------------------------------------- +-- BinHandle +--------------------------------------------------------------- + +data BinHandle + = BinMem { -- binary data stored in an unboxed array + bh_usr :: UserData, -- sigh, need parameterized modules :-) + off_r :: !FastMutInt, -- the current offset + sz_r :: !FastMutInt, -- size of the array (cached) + arr_r :: !(IORef BinArray) -- the array (bounds: (0,size-1)) + } + -- XXX: should really store a "high water mark" for dumping out + -- the binary data to a file. + + | BinIO { -- binary data stored in a file + bh_usr :: UserData, + off_r :: !FastMutInt, -- the current offset (cached) + hdl :: !IO.Handle -- the file handle (must be seekable) + } + -- cache the file ptr in BinIO; using hTell is too expensive + -- to call repeatedly. If anyone else is modifying this Handle + -- at the same time, we'll be screwed. + +getUserData :: BinHandle -> UserData +getUserData bh = bh_usr bh + +setUserData :: BinHandle -> UserData -> BinHandle +setUserData bh us = bh { bh_usr = us } + + +--------------------------------------------------------------- +-- Bin +--------------------------------------------------------------- + +newtype Bin a = BinPtr Int + deriving (Eq, Ord, Show, Bounded) + +castBin :: Bin a -> Bin b +castBin (BinPtr i) = BinPtr i + +--------------------------------------------------------------- +-- class Binary +--------------------------------------------------------------- + +class Binary a where + put_ :: BinHandle -> a -> IO () + put :: BinHandle -> a -> IO (Bin a) + get :: BinHandle -> IO a + + -- define one of put_, put. 
Use of put_ is recommended because it + -- is more likely that tail-calls can kick in, and we rarely need the + -- position return value. + put_ bh a = do put bh a; return () + put bh a = do p <- tellBin bh; put_ bh a; return p + +putAt :: Binary a => BinHandle -> Bin a -> a -> IO () +putAt bh p x = do seekBin bh p; put bh x; return () + +getAt :: Binary a => BinHandle -> Bin a -> IO a +getAt bh p = do seekBin bh p; get bh + +openBinIO_ :: IO.Handle -> IO BinHandle +openBinIO_ h = openBinIO h + +openBinIO :: IO.Handle -> IO BinHandle +openBinIO h = do + r <- newFastMutInt + writeFastMutInt r 0 + return (BinIO noUserData r h) + +openBinMem :: Int -> IO BinHandle +openBinMem size + | size <= 0 = error "Data.Binary.openBinMem: size must be >= 0" + | otherwise = do + arr <- newArray_ (0,size-1) + arr_r <- newIORef arr + ix_r <- newFastMutInt + writeFastMutInt ix_r 0 + sz_r <- newFastMutInt + writeFastMutInt sz_r size + return (BinMem noUserData ix_r sz_r arr_r) + +tellBin :: BinHandle -> IO (Bin a) +tellBin (BinIO _ r _) = do ix <- readFastMutInt r; return (BinPtr ix) +tellBin (BinMem _ r _ _) = do ix <- readFastMutInt r; return (BinPtr ix) + +seekBin :: BinHandle -> Bin a -> IO () +seekBin (BinIO _ ix_r h) (BinPtr p) = do + writeFastMutInt ix_r p + hSeek h AbsoluteSeek (fromIntegral p) +seekBin h@(BinMem _ ix_r sz_r a) (BinPtr p) = do + sz <- readFastMutInt sz_r + if (p >= sz) + then do expandBin h p; writeFastMutInt ix_r p + else writeFastMutInt ix_r p + +isEOFBin :: BinHandle -> IO Bool +isEOFBin (BinMem _ ix_r sz_r a) = do + ix <- readFastMutInt ix_r + sz <- readFastMutInt sz_r + return (ix >= sz) +isEOFBin (BinIO _ ix_r h) = hIsEOF h + +writeBinMem :: BinHandle -> FilePath -> IO () +writeBinMem (BinIO _ _ _) _ = error "Data.Binary.writeBinMem: not a memory handle" +writeBinMem (BinMem _ ix_r sz_r arr_r) fn = do + h <- openBinaryFile fn WriteMode + arr <- readIORef arr_r + ix <- readFastMutInt ix_r + hPutArray h arr ix +#if __GLASGOW_HASKELL__ <= 500 + -- workaround a bug in old implementation of hPutBuf (it doesn't + -- set the FILEOBJ_RW_WRITTEN flag on the file object, so the file doens't + -- get flushed properly). Adding an extra '\0' doens't do any harm. + hPutChar h '\0' +#endif + hClose h + +readBinMem :: FilePath -> IO BinHandle +-- Return a BinHandle with a totally undefined State +readBinMem filename = do + h <- openBinaryFile filename ReadMode + filesize' <- hFileSize h + let filesize = fromIntegral filesize' + arr <- newArray_ (0,filesize-1) + count <- hGetArray h arr filesize + when (count /= filesize) + (error ("Binary.readBinMem: only read " ++ show count ++ " bytes")) + hClose h + arr_r <- newIORef arr + ix_r <- newFastMutInt + writeFastMutInt ix_r 0 + sz_r <- newFastMutInt + writeFastMutInt sz_r filesize + return (BinMem noUserData ix_r sz_r arr_r) + +-- expand the size of the array to include a specified offset +expandBin :: BinHandle -> Int -> IO () +expandBin (BinMem _ ix_r sz_r arr_r) off = do + sz <- readFastMutInt sz_r + let sz' = head (dropWhile (<= off) (iterate (* 2) sz)) + arr <- readIORef arr_r + arr' <- newArray_ (0,sz'-1) + sequence_ [ unsafeRead arr i >>= unsafeWrite arr' i + | i <- [ 0 .. sz-1 ] ] + writeFastMutInt sz_r sz' + writeIORef arr_r arr' +#ifdef DEBUG + hPutStrLn stderr ("Binary: expanding to size: " ++ show sz') +#endif + return () +expandBin (BinIO _ _ _) _ = return () + -- no need to expand a file, we'll assume they expand by themselves. 
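+
+-- -----------------------------------------------------------------------------
+-- Editorial sketch, not part of this patch: a hand-written Binary instance
+-- for a small illustrative type.  'Colour' is hypothetical and exists only
+-- for this example; it uses putByte/getByte as the export list above
+-- recommends for instance writers, defining put_ and get and letting the
+-- default definition of put fill in the rest.
+
+data Colour = Red | Green | Blue
+
+instance Binary Colour where
+  put_ bh Red   = putByte bh 0
+  put_ bh Green = putByte bh 1
+  put_ bh Blue  = putByte bh 2
+  get bh = do tag <- getByte bh
+              case tag of
+                0 -> return Red
+                1 -> return Green
+                _ -> return Blue
+
+-- A round trip through a memory buffer would then look roughly like:
+--
+--   do bh <- openBinMem 1024
+--      put_ bh Blue
+--      writeBinMem bh "colour.bin"
+--      bh' <- readBinMem "colour.bin"
+--      c <- get bh'           -- c :: Colour
+--      ...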
+ +-- ----------------------------------------------------------------------------- +-- Low-level reading/writing of bytes + +putWord8 :: BinHandle -> Word8 -> IO () +putWord8 h@(BinMem _ ix_r sz_r arr_r) w = do + ix <- readFastMutInt ix_r + sz <- readFastMutInt sz_r + -- double the size of the array if it overflows + if (ix >= sz) + then do expandBin h ix + putWord8 h w + else do arr <- readIORef arr_r + unsafeWrite arr ix w + writeFastMutInt ix_r (ix+1) + return () +putWord8 (BinIO _ ix_r h) w = do + ix <- readFastMutInt ix_r + hPutChar h (chr (fromIntegral w)) -- XXX not really correct + writeFastMutInt ix_r (ix+1) + return () + +getWord8 :: BinHandle -> IO Word8 +getWord8 (BinMem _ ix_r sz_r arr_r) = do + ix <- readFastMutInt ix_r + sz <- readFastMutInt sz_r + when (ix >= sz) $ +#if __GLASGOW_HASKELL__ <= 408 + throw (mkIOError eofErrorType "Data.Binary.getWord8" Nothing Nothing) +#else + ioError (mkIOError eofErrorType "Data.Binary.getWord8" Nothing Nothing) +#endif + arr <- readIORef arr_r + w <- unsafeRead arr ix + writeFastMutInt ix_r (ix+1) + return w +getWord8 (BinIO _ ix_r h) = do + ix <- readFastMutInt ix_r + c <- hGetChar h + writeFastMutInt ix_r (ix+1) + return $! (fromIntegral (ord c)) -- XXX not really correct + +putByte :: BinHandle -> Word8 -> IO () +putByte bh w = put_ bh w + +getByte :: BinHandle -> IO Word8 +getByte = getWord8 + +-- ----------------------------------------------------------------------------- +-- Primitve Word writes + +instance Binary Word8 where + put_ = putWord8 + get = getWord8 + +instance Binary Word16 where + put_ h w = do -- XXX too slow.. inline putWord8? + putByte h (fromIntegral (w `shiftR` 8)) + putByte h (fromIntegral (w .&. 0xff)) + get h = do + w1 <- getWord8 h + w2 <- getWord8 h + return $! ((fromIntegral w1 `shiftL` 8) .|. fromIntegral w2) + + +instance Binary Word32 where + put_ h w = do + putByte h (fromIntegral (w `shiftR` 24)) + putByte h (fromIntegral ((w `shiftR` 16) .&. 0xff)) + putByte h (fromIntegral ((w `shiftR` 8) .&. 0xff)) + putByte h (fromIntegral (w .&. 0xff)) + get h = do + w1 <- getWord8 h + w2 <- getWord8 h + w3 <- getWord8 h + w4 <- getWord8 h + return $! ((fromIntegral w1 `shiftL` 24) .|. + (fromIntegral w2 `shiftL` 16) .|. + (fromIntegral w3 `shiftL` 8) .|. + (fromIntegral w4)) + + +instance Binary Word64 where + put_ h w = do + putByte h (fromIntegral (w `shiftR` 56)) + putByte h (fromIntegral ((w `shiftR` 48) .&. 0xff)) + putByte h (fromIntegral ((w `shiftR` 40) .&. 0xff)) + putByte h (fromIntegral ((w `shiftR` 32) .&. 0xff)) + putByte h (fromIntegral ((w `shiftR` 24) .&. 0xff)) + putByte h (fromIntegral ((w `shiftR` 16) .&. 0xff)) + putByte h (fromIntegral ((w `shiftR` 8) .&. 0xff)) + putByte h (fromIntegral (w .&. 0xff)) + get h = do + w1 <- getWord8 h + w2 <- getWord8 h + w3 <- getWord8 h + w4 <- getWord8 h + w5 <- getWord8 h + w6 <- getWord8 h + w7 <- getWord8 h + w8 <- getWord8 h + return $! ((fromIntegral w1 `shiftL` 56) .|. + (fromIntegral w2 `shiftL` 48) .|. + (fromIntegral w3 `shiftL` 40) .|. + (fromIntegral w4 `shiftL` 32) .|. + (fromIntegral w5 `shiftL` 24) .|. + (fromIntegral w6 `shiftL` 16) .|. + (fromIntegral w7 `shiftL` 8) .|. + (fromIntegral w8)) + +-- ----------------------------------------------------------------------------- +-- Primitve Int writes + +instance Binary Int8 where + put_ h w = put_ h (fromIntegral w :: Word8) + get h = do w <- get h; return $! (fromIntegral (w::Word8)) + +instance Binary Int16 where + put_ h w = put_ h (fromIntegral w :: Word16) + get h = do w <- get h; return $! 
(fromIntegral (w::Word16)) + +instance Binary Int32 where + put_ h w = put_ h (fromIntegral w :: Word32) + get h = do w <- get h; return $! (fromIntegral (w::Word32)) + +instance Binary Int64 where + put_ h w = put_ h (fromIntegral w :: Word64) + get h = do w <- get h; return $! (fromIntegral (w::Word64)) + +-- ----------------------------------------------------------------------------- +-- Instances for standard types + +instance Binary () where + put_ bh () = return () + get _ = return () +-- getF bh p = case getBitsF bh 0 p of (_,b) -> ((),b) + +instance Binary Bool where + put_ bh b = putByte bh (fromIntegral (fromEnum b)) + get bh = do x <- getWord8 bh; return $! (toEnum (fromIntegral x)) +-- getF bh p = case getBitsF bh 1 p of (x,b) -> (toEnum x,b) + +instance Binary Char where + put_ bh c = put_ bh (fromIntegral (ord c) :: Word32) + get bh = do x <- get bh; return $! (chr (fromIntegral (x :: Word32))) +-- getF bh p = case getBitsF bh 8 p of (x,b) -> (toEnum x,b) + +instance Binary Int where +#if SIZEOF_HSINT == 4 + put_ bh i = put_ bh (fromIntegral i :: Int32) + get bh = do + x <- get bh + return $! (fromIntegral (x :: Int32)) +#elif SIZEOF_HSINT == 8 + put_ bh i = put_ bh (fromIntegral i :: Int64) + get bh = do + x <- get bh + return $! (fromIntegral (x :: Int64)) +#else +#error "unsupported sizeof(HsInt)" +#endif +-- getF bh = getBitsF bh 32 + +instance Binary a => Binary [a] where + put_ bh l = do + let len = length l + if (len < 0xff) + then putByte bh (fromIntegral len :: Word8) + else do putByte bh 0xff; put_ bh (fromIntegral len :: Word32) + mapM_ (put_ bh) l + get bh = do + b <- getByte bh + len <- if b == 0xff + then get bh + else return (fromIntegral b :: Word32) + let loop 0 = return [] + loop n = do a <- get bh; as <- loop (n-1); return (a:as) + loop len + +instance (Binary a, Binary b) => Binary (a,b) where + put_ bh (a,b) = do put_ bh a; put_ bh b + get bh = do a <- get bh + b <- get bh + return (a,b) + +instance (Binary a, Binary b, Binary c) => Binary (a,b,c) where + put_ bh (a,b,c) = do put_ bh a; put_ bh b; put_ bh c + get bh = do a <- get bh + b <- get bh + c <- get bh + return (a,b,c) + +instance (Binary a, Binary b, Binary c, Binary d) => Binary (a,b,c,d) where + put_ bh (a,b,c,d) = do put_ bh a; put_ bh b; put_ bh c; put_ bh d + get bh = do a <- get bh + b <- get bh + c <- get bh + d <- get bh + return (a,b,c,d) + +instance Binary a => Binary (Maybe a) where + put_ bh Nothing = putByte bh 0 + put_ bh (Just a) = do putByte bh 1; put_ bh a + get bh = do h <- getWord8 bh + case h of + 0 -> return Nothing + _ -> do x <- get bh; return (Just x) + +instance (Binary a, Binary b) => Binary (Either a b) where + put_ bh (Left a) = do putByte bh 0; put_ bh a + put_ bh (Right b) = do putByte bh 1; put_ bh b + get bh = do h <- getWord8 bh + case h of + 0 -> do a <- get bh ; return (Left a) + _ -> do b <- get bh ; return (Right b) + +#ifdef __GLASGOW_HASKELL__ +instance Binary Integer where + put_ bh (S# i#) = do putByte bh 0; put_ bh (I# i#) + put_ bh (J# s# a#) = do + p <- putByte bh 1; + put_ bh (I# s#) + let sz# = sizeofByteArray# a# -- in *bytes* + put_ bh (I# sz#) -- in *bytes* + putByteArray bh a# sz# + + get bh = do + b <- getByte bh + case b of + 0 -> do (I# i#) <- get bh + return (S# i#) + _ -> do (I# s#) <- get bh + sz <- get bh + (BA a#) <- getByteArray bh sz + return (J# s# a#) + +putByteArray :: BinHandle -> ByteArray# -> Int# -> IO () +putByteArray bh a s# = loop 0# + where loop n# + | n# ==# s# = return () + | otherwise = do + putByte bh (indexByteArray a 
n#) + loop (n# +# 1#) + +getByteArray :: BinHandle -> Int -> IO ByteArray +getByteArray bh (I# sz) = do + (MBA arr) <- newByteArray sz + let loop n + | n ==# sz = return () + | otherwise = do + w <- getByte bh + writeByteArray arr n w + loop (n +# 1#) + loop 0# + freezeByteArray arr + + +data ByteArray = BA ByteArray# +data MBA = MBA (MutableByteArray# RealWorld) + +newByteArray :: Int# -> IO MBA +newByteArray sz = IO $ \s -> + case newByteArray# sz s of { (# s, arr #) -> + (# s, MBA arr #) } + +freezeByteArray :: MutableByteArray# RealWorld -> IO ByteArray +freezeByteArray arr = IO $ \s -> + case unsafeFreezeByteArray# arr s of { (# s, arr #) -> + (# s, BA arr #) } + +writeByteArray :: MutableByteArray# RealWorld -> Int# -> Word8 -> IO () + +#if __GLASGOW_HASKELL__ < 503 +writeByteArray arr i w8 = IO $ \s -> + case word8ToWord w8 of { W# w# -> + case writeCharArray# arr i (chr# (word2Int# w#)) s of { s -> + (# s , () #) }} +#else +writeByteArray arr i (W8# w) = IO $ \s -> + case writeWord8Array# arr i w s of { s -> + (# s, () #) } +#endif + +#if __GLASGOW_HASKELL__ < 503 +indexByteArray a# n# = fromIntegral (I# (ord# (indexCharArray# a# n#))) +#else +indexByteArray a# n# = W8# (indexWord8Array# a# n#) +#endif + +instance (Integral a, Binary a) => Binary (Ratio a) where + put_ bh (a :% b) = do put_ bh a; put_ bh b + get bh = do a <- get bh; b <- get bh; return (a :% b) +#endif + +instance Binary (Bin a) where + put_ bh (BinPtr i) = put_ bh i + get bh = do i <- get bh; return (BinPtr i) + +-- ----------------------------------------------------------------------------- +-- Lazy reading/writing + +lazyPut :: Binary a => BinHandle -> a -> IO () +lazyPut bh a = do + -- output the obj with a ptr to skip over it: + pre_a <- tellBin bh + put_ bh pre_a -- save a slot for the ptr + put_ bh a -- dump the object + q <- tellBin bh -- q = ptr to after object + putAt bh pre_a q -- fill in slot before a with ptr to q + seekBin bh q -- finally carry on writing at q + +lazyGet :: Binary a => BinHandle -> IO a +lazyGet bh = do + p <- get bh -- a BinPtr + p_a <- tellBin bh + a <- unsafeInterleaveIO (getAt bh p_a) + seekBin bh p -- skip over the object for now + return a + +-- -------------------------------------------------------------- +-- Main wrappers: getBinFileWithDict, putBinFileWithDict +-- +-- This layer is built on top of the stuff above, +-- and should not know anything about BinHandles +-- -------------------------------------------------------------- + +initBinMemSize = (1024*1024) :: Int + +#if WORD_SIZE_IN_BITS == 32 +binaryInterfaceMagic = 0x1face :: Word32 +#elif WORD_SIZE_IN_BITS == 64 +binaryInterfaceMagic = 0x1face64 :: Word32 +#endif + +getBinFileWithDict :: Binary a => FilePath -> IO a +getBinFileWithDict file_path = do + bh <- Binary.readBinMem file_path + + -- Read the magic number to check that this really is a GHC .hi file + -- (This magic number does not change when we change + -- GHC interface file format) + magic <- get bh + when (magic /= binaryInterfaceMagic) $ + throwDyn (ProgramError ( + "magic number mismatch: old/corrupt interface file?")) + + -- Read the dictionary + -- The next word in the file is a pointer to where the dictionary is + -- (probably at the end of the file) + dict_p <- Binary.get bh -- Get the dictionary ptr + data_p <- tellBin bh -- Remember where we are now + seekBin bh dict_p + dict <- getDictionary bh + seekBin bh data_p -- Back to where we were before + + -- Initialise the user-data field of bh + let bh' = setUserData bh (initReadState dict) + + -- At 
last, get the thing + get bh' + +putBinFileWithDict :: Binary a => FilePath -> a -> IO () +putBinFileWithDict file_path the_thing = do + bh <- openBinMem initBinMemSize + put_ bh binaryInterfaceMagic + + -- Remember where the dictionary pointer will go + dict_p_p <- tellBin bh + put_ bh dict_p_p -- Placeholder for ptr to dictionary + + -- Make some intial state + usr_state <- newWriteState + + -- Put the main thing, + put_ (setUserData bh usr_state) the_thing + + -- Get the final-state + j <- readIORef (ud_next usr_state) + fm <- readIORef (ud_map usr_state) + dict_p <- tellBin bh -- This is where the dictionary will start + + -- Write the dictionary pointer at the fornt of the file + putAt bh dict_p_p dict_p -- Fill in the placeholder + seekBin bh dict_p -- Seek back to the end of the file + + -- Write the dictionary itself + putDictionary bh j (constructDictionary j fm) + + -- And send the result to the file + writeBinMem bh file_path + +-- ----------------------------------------------------------------------------- +-- UserData +-- ----------------------------------------------------------------------------- + +data UserData = + UserData { -- This field is used only when reading + ud_dict :: Dictionary, + + -- The next two fields are only used when writing + ud_next :: IORef Int, -- The next index to use + ud_map :: IORef (UniqFM (Int,FastString)) + } + +noUserData = error "Binary.UserData: no user data" + +initReadState :: Dictionary -> UserData +initReadState dict = UserData{ ud_dict = dict, + ud_next = undef "next", + ud_map = undef "map" } + +newWriteState :: IO UserData +newWriteState = do + j_r <- newIORef 0 + out_r <- newIORef emptyUFM + return (UserData { ud_dict = panic "dict", + ud_next = j_r, + ud_map = out_r }) + + +undef s = panic ("Binary.UserData: no " ++ s) + +--------------------------------------------------------- +-- The Dictionary +--------------------------------------------------------- + +type Dictionary = Array Int FastString -- The dictionary + -- Should be 0-indexed + +putDictionary :: BinHandle -> Int -> Dictionary -> IO () +putDictionary bh sz dict = do + put_ bh sz + mapM_ (putFS bh) (elems dict) + +getDictionary :: BinHandle -> IO Dictionary +getDictionary bh = do + sz <- get bh + elems <- sequence (take sz (repeat (getFS bh))) + return (listArray (0,sz-1) elems) + +constructDictionary :: Int -> UniqFM (Int,FastString) -> Dictionary +constructDictionary j fm = array (0,j-1) (eltsUFM fm) + +--------------------------------------------------------- +-- Reading and writing FastStrings +--------------------------------------------------------- + +putFS bh (FastString id l _ buf _) = do + put_ bh l + withForeignPtr buf $ \ptr -> + let + go n | n == l = return () + | otherwise = do + b <- peekElemOff ptr n + putByte bh b + go (n+1) + in + go 0 + +{- -- possible faster version, not quite there yet: +getFS bh@BinMem{} = do + (I# l) <- get bh + arr <- readIORef (arr_r bh) + off <- readFastMutInt (off_r bh) + return $! 
(mkFastSubStringBA# arr off l) +-} +getFS bh = do + l <- get bh + fp <- mallocForeignPtrBytes l + withForeignPtr fp $ \ptr -> do + let + go n | n == l = mkFastStringForeignPtr ptr fp l + | otherwise = do + b <- getByte bh + pokeElemOff ptr n b + go (n+1) + -- + go 0 + +#if __GLASGOW_HASKELL__ < 600 +mallocForeignPtrBytes :: Int -> IO (ForeignPtr a) +mallocForeignPtrBytes n = do + r <- mallocBytes n + newForeignPtr r (finalizerFree r) + +foreign import ccall unsafe "stdlib.h free" + finalizerFree :: Ptr a -> IO () +#endif + +instance Binary PackageId where + put_ bh pid = put_ bh (packageIdFS pid) + get bh = do { fs <- get bh; return (fsToPackageId fs) } + +instance Binary FastString where + put_ bh f@(FastString id l _ fp _) = + case getUserData bh of { + UserData { ud_next = j_r, ud_map = out_r, ud_dict = dict} -> do + out <- readIORef out_r + let uniq = getUnique f + case lookupUFM out uniq of + Just (j,f) -> put_ bh j + Nothing -> do + j <- readIORef j_r + put_ bh j + writeIORef j_r (j+1) + writeIORef out_r (addToUFM out uniq (j,f)) + } + + get bh = do + j <- get bh + return $! (ud_dict (getUserData bh) ! j) diff --git a/compiler/utils/BitSet.lhs b/compiler/utils/BitSet.lhs new file mode 100644 index 0000000000..a108136af3 --- /dev/null +++ b/compiler/utils/BitSet.lhs @@ -0,0 +1,205 @@ +% +% (c) The GRASP Project, Glasgow University, 1994-1998 +% +\section[BitSet]{An implementation of very small sets} + +Bit sets are a fast implementation of sets of integers ranging from 0 +to one less than the number of bits in a machine word (typically 31). +If any element exceeds the maximum value for a particular machine +architecture, the results of these operations are undefined. You have +been warned. If you put any safety checks in this code, I will have +to kill you. + +Note: the Yale Haskell implementation won't provide a full 32 bits. +However, if you can handle the performance loss, you could change to +Integer and get virtually unlimited sets. + +\begin{code} + +module BitSet ( + BitSet, -- abstract type + mkBS, listBS, emptyBS, unitBS, + unionBS, minusBS, intBS + ) where + +#include "HsVersions.h" + +#ifdef __GLASGOW_HASKELL__ +import GLAEXTS +-- nothing to import +#elif defined(__YALE_HASKELL__) +{-hide import from mkdependHS-} +import + LogOpPrims +#else +{-hide import from mkdependHS-} +import + Word +#endif + +#ifdef __GLASGOW_HASKELL__ + +data BitSet = MkBS Word# + +emptyBS :: BitSet +emptyBS = MkBS (int2Word# 0#) + +mkBS :: [Int] -> BitSet +mkBS xs = foldr (unionBS . 
unitBS) emptyBS xs + +unitBS :: Int -> BitSet +unitBS x = case x of +#if __GLASGOW_HASKELL__ >= 503 + I# i# -> MkBS ((int2Word# 1#) `uncheckedShiftL#` i#) +#else + I# i# -> MkBS ((int2Word# 1#) `shiftL#` i#) +#endif + +unionBS :: BitSet -> BitSet -> BitSet +unionBS (MkBS x#) (MkBS y#) = MkBS (x# `or#` y#) + +minusBS :: BitSet -> BitSet -> BitSet +minusBS (MkBS x#) (MkBS y#) = MkBS (x# `and#` (not# y#)) + +#if 0 +-- not used in GHC +isEmptyBS :: BitSet -> Bool +isEmptyBS (MkBS s#) + = case word2Int# s# of + 0# -> True + _ -> False + +intersectBS :: BitSet -> BitSet -> BitSet +intersectBS (MkBS x#) (MkBS y#) = MkBS (x# `and#` y#) + +elementBS :: Int -> BitSet -> Bool +elementBS x (MkBS s#) = case x of + I# i# -> case word2Int# (((int2Word# 1#) `shiftL#` i#) `and#` s#) of + 0# -> False + _ -> True +#endif + +listBS :: BitSet -> [Int] +listBS s = listify s 0 + where listify (MkBS s#) n = + case word2Int# s# of + 0# -> [] + _ -> let s' = (MkBS (s# `shiftr` 1#)) + more = listify s' (n + 1) + in case word2Int# (s# `and#` (int2Word# 1#)) of + 0# -> more + _ -> n : more +#if __GLASGOW_HASKELL__ >= 503 + shiftr x y = uncheckedShiftRL# x y +#else + shiftr x y = shiftRL# x y +#endif + +-- intBS is a bit naughty. +intBS :: BitSet -> Int +intBS (MkBS w#) = I# (word2Int# w#) + +#elif defined(__YALE_HASKELL__) + +data BitSet = MkBS Int + +emptyBS :: BitSet +emptyBS = MkBS 0 + +mkBS :: [Int] -> BitSet +mkBS xs = foldr (unionBS . unitBS) emptyBS xs + +unitBS :: Int -> BitSet +unitBS x = MkBS (1 `ashInt` x) + +unionBS :: BitSet -> BitSet -> BitSet +unionBS (MkBS x) (MkBS y) = MkBS (x `logiorInt` y) + +#if 0 +-- not used in GHC +isEmptyBS :: BitSet -> Bool +isEmptyBS (MkBS s) + = case s of + 0 -> True + _ -> False + +intersectBS :: BitSet -> BitSet -> BitSet +intersectBS (MkBS x) (MkBS y) = MkBS (x `logandInt` y) + +elementBS :: Int -> BitSet -> Bool +elementBS x (MkBS s) + = case logbitpInt x s of + 0 -> False + _ -> True +#endif + +minusBS :: BitSet -> BitSet -> BitSet +minusBS (MkBS x) (MkBS y) = MkBS (x `logandc2Int` y) + +-- rewritten to avoid right shifts (which would give nonsense on negative +-- values. +listBS :: BitSet -> [Int] +listBS (MkBS s) = listify s 0 1 + where listify s n m = + case s of + 0 -> [] + _ -> let n' = n+1; m' = m+m in + case logbitpInt s m of + 0 -> listify s n' m' + _ -> n : listify (s `logandc2Int` m) n' m' + +#else /* HBC, perhaps? */ + +data BitSet = MkBS Word + +emptyBS :: BitSet +emptyBS = MkBS 0 + +mkBS :: [Int] -> BitSet +mkBS xs = foldr (unionBS . 
unitBS) emptyBS xs + +unitBS :: Int -> BitSet +unitBS x = MkBS (1 `bitLsh` x) + +unionBS :: BitSet -> BitSet -> BitSet +unionBS (MkBS x) (MkBS y) = MkBS (x `bitOr` y) + +#if 0 +-- not used in GHC +isEmptyBS :: BitSet -> Bool +isEmptyBS (MkBS s) + = case s of + 0 -> True + _ -> False + +intersectBS :: BitSet -> BitSet -> BitSet +intersectBS (MkBS x) (MkBS y) = MkBS (x `bitAnd` y) + +elementBS :: Int -> BitSet -> Bool +elementBS x (MkBS s) + = case (1 `bitLsh` x) `bitAnd` s of + 0 -> False + _ -> True +#endif + +minusBS :: BitSet -> BitSet -> BitSet +minusBS (MkBS x) (MkBS y) = MkBS (x `bitAnd` (bitCompl y)) + +listBS :: BitSet -> [Int] +listBS (MkBS s) = listify s 0 + where listify s n = + case s of + 0 -> [] + _ -> let s' = s `bitRsh` 1 + more = listify s' (n + 1) + in case (s `bitAnd` 1) of + 0 -> more + _ -> n : more + +#endif + +\end{code} + + + + diff --git a/compiler/utils/BufWrite.hs b/compiler/utils/BufWrite.hs new file mode 100644 index 0000000000..a03db3d084 --- /dev/null +++ b/compiler/utils/BufWrite.hs @@ -0,0 +1,124 @@ +----------------------------------------------------------------------------- +-- +-- Fast write-buffered Handles +-- +-- (c) The University of Glasgow 2005 +-- +-- This is a simple abstraction over Handles that offers very fast write +-- buffering, but without the thread safety that Handles provide. It's used +-- to save time in Pretty.printDoc. +-- +----------------------------------------------------------------------------- + +module BufWrite ( + BufHandle(..), + newBufHandle, + bPutChar, + bPutStr, + bPutFS, + bPutLitString, + bFlush, + ) where + +#include "HsVersions.h" + +import FastString +import FastMutInt +import Panic ( panic ) + +import Monad ( when ) +import Char ( ord ) +import Foreign +import IO + +import GHC.IOBase ( IO(..) ) +import System.IO ( hPutBuf ) +import GHC.Ptr ( Ptr(..) 
) + +import GLAEXTS ( Int(..), Int#, Addr# ) + +-- ----------------------------------------------------------------------------- + +data BufHandle = BufHandle {-#UNPACK#-}!(Ptr Word8) + {-#UNPACK#-}!FastMutInt + Handle + +newBufHandle :: Handle -> IO BufHandle +newBufHandle hdl = do + ptr <- mallocBytes buf_size + r <- newFastMutInt + writeFastMutInt r 0 + return (BufHandle ptr r hdl) + +buf_size = 8192 :: Int + +#define STRICT2(f) f a b | a `seq` b `seq` False = undefined +#define STRICT3(f) f a b c | a `seq` b `seq` c `seq` False = undefined + +bPutChar :: BufHandle -> Char -> IO () +STRICT2(bPutChar) +bPutChar b@(BufHandle buf r hdl) c = do + i <- readFastMutInt r + if (i >= buf_size) + then do hPutBuf hdl buf buf_size + writeFastMutInt r 0 + bPutChar b c + else do pokeElemOff buf i (fromIntegral (ord c) :: Word8) + writeFastMutInt r (i+1) + +bPutStr :: BufHandle -> String -> IO () +STRICT2(bPutStr) +bPutStr b@(BufHandle buf r hdl) str = do + i <- readFastMutInt r + loop str i + where loop _ i | i `seq` False = undefined + loop "" i = do writeFastMutInt r i; return () + loop (c:cs) i + | i >= buf_size = do + hPutBuf hdl buf buf_size + loop (c:cs) 0 + | otherwise = do + pokeElemOff buf i (fromIntegral (ord c)) + loop cs (i+1) + +bPutFS :: BufHandle -> FastString -> IO () +bPutFS b@(BufHandle buf r hdl) fs@(FastString _ len _ fp _) = + withForeignPtr fp $ \ptr -> do + i <- readFastMutInt r + if (i + len) >= buf_size + then do hPutBuf hdl buf i + writeFastMutInt r 0 + if (len >= buf_size) + then hPutBuf hdl ptr len + else bPutFS b fs + else do + copyBytes (buf `plusPtr` i) ptr len + writeFastMutInt r (i+len) + +bPutLitString :: BufHandle -> Addr# -> Int# -> IO () +bPutLitString b@(BufHandle buf r hdl) a# len# = do + let len = I# len# + i <- readFastMutInt r + if (i+len) >= buf_size + then do hPutBuf hdl buf i + writeFastMutInt r 0 + if (len >= buf_size) + then hPutBuf hdl (Ptr a#) len + else bPutLitString b a# len# + else do + copyBytes (buf `plusPtr` i) (Ptr a#) len + writeFastMutInt r (i+len) + +bFlush :: BufHandle -> IO () +bFlush b@(BufHandle buf r hdl) = do + i <- readFastMutInt r + when (i > 0) $ hPutBuf hdl buf i + free buf + return () + +#if 0 +myPutBuf s hdl buf i = + modifyIOError (\e -> ioeSetErrorString e (ioeGetErrorString e ++ ':':s ++ " (" ++ show buf ++ "," ++ show i ++ ")")) $ + + hPutBuf hdl buf i +#endif diff --git a/compiler/utils/Digraph.lhs b/compiler/utils/Digraph.lhs new file mode 100644 index 0000000000..c49087c8f3 --- /dev/null +++ b/compiler/utils/Digraph.lhs @@ -0,0 +1,426 @@ +\begin{code} +module Digraph( + + -- At present the only one with a "nice" external interface + stronglyConnComp, stronglyConnCompR, SCC(..), flattenSCC, flattenSCCs, + + Graph, Vertex, + graphFromEdges, graphFromEdges', + buildG, transposeG, reverseE, outdegree, indegree, + + Tree(..), Forest, + showTree, showForest, + + dfs, dff, + topSort, + components, + scc, + back, cross, forward, + reachable, path, + bcc + + ) where + +# include "HsVersions.h" + +------------------------------------------------------------------------------ +-- A version of the graph algorithms described in: +-- +-- ``Lazy Depth-First Search and Linear Graph Algorithms in Haskell'' +-- by David King and John Launchbury +-- +-- Also included is some additional code for printing tree structures ... 
+------------------------------------------------------------------------------ + + +import Util ( sortLe ) + +-- Extensions +import MONAD_ST + +-- std interfaces +import Maybe +import Array +import List +import Outputable + +#if __GLASGOW_HASKELL__ >= 504 +import Data.Array.ST hiding ( indices, bounds ) +#else +import ST +#endif +\end{code} + + +%************************************************************************ +%* * +%* External interface +%* * +%************************************************************************ + +\begin{code} +data SCC vertex = AcyclicSCC vertex + | CyclicSCC [vertex] + +flattenSCCs :: [SCC a] -> [a] +flattenSCCs = concatMap flattenSCC + +flattenSCC (AcyclicSCC v) = [v] +flattenSCC (CyclicSCC vs) = vs + +instance Outputable a => Outputable (SCC a) where + ppr (AcyclicSCC v) = text "NONREC" $$ (nest 3 (ppr v)) + ppr (CyclicSCC vs) = text "REC" $$ (nest 3 (vcat (map ppr vs))) +\end{code} + +\begin{code} +stronglyConnComp + :: Ord key + => [(node, key, [key])] -- The graph; its ok for the + -- out-list to contain keys which arent + -- a vertex key, they are ignored + -> [SCC node] -- Returned in topologically sorted order + -- Later components depend on earlier ones, but not vice versa + +stronglyConnComp edges + = map get_node (stronglyConnCompR edges) + where + get_node (AcyclicSCC (n, _, _)) = AcyclicSCC n + get_node (CyclicSCC triples) = CyclicSCC [n | (n,_,_) <- triples] + +-- The "R" interface is used when you expect to apply SCC to +-- the (some of) the result of SCC, so you dont want to lose the dependency info +stronglyConnCompR + :: Ord key + => [(node, key, [key])] -- The graph; its ok for the + -- out-list to contain keys which arent + -- a vertex key, they are ignored + -> [SCC (node, key, [key])] -- Topologically sorted + +stronglyConnCompR [] = [] -- added to avoid creating empty array in graphFromEdges -- SOF +stronglyConnCompR edges + = map decode forest + where + (graph, vertex_fn) = _scc_ "graphFromEdges" graphFromEdges edges + forest = _scc_ "Digraph.scc" scc graph + decode (Node v []) | mentions_itself v = CyclicSCC [vertex_fn v] + | otherwise = AcyclicSCC (vertex_fn v) + decode other = CyclicSCC (dec other []) + where + dec (Node v ts) vs = vertex_fn v : foldr dec vs ts + mentions_itself v = v `elem` (graph ! v) +\end{code} + +%************************************************************************ +%* * +%* Graphs +%* * +%************************************************************************ + + +\begin{code} +type Vertex = Int +type Table a = Array Vertex a +type Graph = Table [Vertex] +type Bounds = (Vertex, Vertex) +type Edge = (Vertex, Vertex) +\end{code} + +\begin{code} +vertices :: Graph -> [Vertex] +vertices = indices + +edges :: Graph -> [Edge] +edges g = [ (v, w) | v <- vertices g, w <- g!v ] + +mapT :: (Vertex -> a -> b) -> Table a -> Table b +mapT f t = array (bounds t) [ (,) v (f v (t!v)) | v <- indices t ] + +buildG :: Bounds -> [Edge] -> Graph +buildG bounds edges = accumArray (flip (:)) [] bounds edges + +transposeG :: Graph -> Graph +transposeG g = buildG (bounds g) (reverseE g) + +reverseE :: Graph -> [Edge] +reverseE g = [ (w, v) | (v, w) <- edges g ] + +outdegree :: Graph -> Table Int +outdegree = mapT numEdges + where numEdges v ws = length ws + +indegree :: Graph -> Table Int +indegree = outdegree . 
transposeG +\end{code} + + +\begin{code} +graphFromEdges + :: Ord key + => [(node, key, [key])] + -> (Graph, Vertex -> (node, key, [key])) +graphFromEdges edges = + case graphFromEdges' edges of (graph, vertex_fn, _) -> (graph, vertex_fn) + +graphFromEdges' + :: Ord key + => [(node, key, [key])] + -> (Graph, Vertex -> (node, key, [key]), key -> Maybe Vertex) +graphFromEdges' edges + = (graph, \v -> vertex_map ! v, key_vertex) + where + max_v = length edges - 1 + bounds = (0,max_v) :: (Vertex, Vertex) + sorted_edges = let + (_,k1,_) `le` (_,k2,_) = case k1 `compare` k2 of { GT -> False; other -> True } + in + sortLe le edges + edges1 = zipWith (,) [0..] sorted_edges + + graph = array bounds [(,) v (mapMaybe key_vertex ks) | (,) v (_, _, ks) <- edges1] + key_map = array bounds [(,) v k | (,) v (_, k, _ ) <- edges1] + vertex_map = array bounds edges1 + + + -- key_vertex :: key -> Maybe Vertex + -- returns Nothing for non-interesting vertices + key_vertex k = find 0 max_v + where + find a b | a > b + = Nothing + find a b = case compare k (key_map ! mid) of + LT -> find a (mid-1) + EQ -> Just mid + GT -> find (mid+1) b + where + mid = (a + b) `div` 2 +\end{code} + +%************************************************************************ +%* * +%* Trees and forests +%* * +%************************************************************************ + +\begin{code} +data Tree a = Node a (Forest a) +type Forest a = [Tree a] + +mapTree :: (a -> b) -> (Tree a -> Tree b) +mapTree f (Node x ts) = Node (f x) (map (mapTree f) ts) +\end{code} + +\begin{code} +instance Show a => Show (Tree a) where + showsPrec p t s = showTree t ++ s + +showTree :: Show a => Tree a -> String +showTree = drawTree . mapTree show + +showForest :: Show a => Forest a -> String +showForest = unlines . map showTree + +drawTree :: Tree String -> String +drawTree = unlines . 
draw + +draw (Node x ts) = grp this (space (length this)) (stLoop ts) + where this = s1 ++ x ++ " " + + space n = replicate n ' ' + + stLoop [] = [""] + stLoop [t] = grp s2 " " (draw t) + stLoop (t:ts) = grp s3 s4 (draw t) ++ [s4] ++ rsLoop ts + + rsLoop [t] = grp s5 " " (draw t) + rsLoop (t:ts) = grp s6 s4 (draw t) ++ [s4] ++ rsLoop ts + + grp fst rst = zipWith (++) (fst:repeat rst) + + [s1,s2,s3,s4,s5,s6] = ["- ", "--", "-+", " |", " `", " +"] +\end{code} + + +%************************************************************************ +%* * +%* Depth first search +%* * +%************************************************************************ + +\begin{code} +#if __GLASGOW_HASKELL__ >= 504 +newSTArray :: Ix i => (i,i) -> e -> ST s (STArray s i e) +newSTArray = newArray + +readSTArray :: Ix i => STArray s i e -> i -> ST s e +readSTArray = readArray + +writeSTArray :: Ix i => STArray s i e -> i -> e -> ST s () +writeSTArray = writeArray +#endif + +type Set s = STArray s Vertex Bool + +mkEmpty :: Bounds -> ST s (Set s) +mkEmpty bnds = newSTArray bnds False + +contains :: Set s -> Vertex -> ST s Bool +contains m v = readSTArray m v + +include :: Set s -> Vertex -> ST s () +include m v = writeSTArray m v True +\end{code} + +\begin{code} +dff :: Graph -> Forest Vertex +dff g = dfs g (vertices g) + +dfs :: Graph -> [Vertex] -> Forest Vertex +dfs g vs = prune (bounds g) (map (generate g) vs) + +generate :: Graph -> Vertex -> Tree Vertex +generate g v = Node v (map (generate g) (g!v)) + +prune :: Bounds -> Forest Vertex -> Forest Vertex +prune bnds ts = runST (mkEmpty bnds >>= \m -> + chop m ts) + +chop :: Set s -> Forest Vertex -> ST s (Forest Vertex) +chop m [] = return [] +chop m (Node v ts : us) + = contains m v >>= \visited -> + if visited then + chop m us + else + include m v >>= \_ -> + chop m ts >>= \as -> + chop m us >>= \bs -> + return (Node v as : bs) +\end{code} + + +%************************************************************************ +%* * +%* Algorithms +%* * +%************************************************************************ + +------------------------------------------------------------ +-- Algorithm 1: depth first search numbering +------------------------------------------------------------ + +\begin{code} +--preorder :: Tree a -> [a] +preorder (Node a ts) = a : preorderF ts + +preorderF :: Forest a -> [a] +preorderF ts = concat (map preorder ts) + +tabulate :: Bounds -> [Vertex] -> Table Int +tabulate bnds vs = array bnds (zipWith (,) vs [1..]) + +preArr :: Bounds -> Forest Vertex -> Table Int +preArr bnds = tabulate bnds . preorderF +\end{code} + + +------------------------------------------------------------ +-- Algorithm 2: topological sorting +------------------------------------------------------------ + +\begin{code} +--postorder :: Tree a -> [a] +postorder (Node a ts) = postorderF ts ++ [a] + +postorderF :: Forest a -> [a] +postorderF ts = concat (map postorder ts) + +postOrd :: Graph -> [Vertex] +postOrd = postorderF . dff + +topSort :: Graph -> [Vertex] +topSort = reverse . postOrd +\end{code} + + +------------------------------------------------------------ +-- Algorithm 3: connected components +------------------------------------------------------------ + +\begin{code} +components :: Graph -> Forest Vertex +components = dff . 
undirected + +undirected :: Graph -> Graph +undirected g = buildG (bounds g) (edges g ++ reverseE g) +\end{code} + + +-- Algorithm 4: strongly connected components + +\begin{code} +scc :: Graph -> Forest Vertex +scc g = dfs g (reverse (postOrd (transposeG g))) +\end{code} + + +------------------------------------------------------------ +-- Algorithm 5: Classifying edges +------------------------------------------------------------ + +\begin{code} +back :: Graph -> Table Int -> Graph +back g post = mapT select g + where select v ws = [ w | w <- ws, post!v < post!w ] + +cross :: Graph -> Table Int -> Table Int -> Graph +cross g pre post = mapT select g + where select v ws = [ w | w <- ws, post!v > post!w, pre!v > pre!w ] + +forward :: Graph -> Graph -> Table Int -> Graph +forward g tree pre = mapT select g + where select v ws = [ w | w <- ws, pre!v < pre!w ] \\ tree!v +\end{code} + + +------------------------------------------------------------ +-- Algorithm 6: Finding reachable vertices +------------------------------------------------------------ + +\begin{code} +reachable :: Graph -> Vertex -> [Vertex] +reachable g v = preorderF (dfs g [v]) + +path :: Graph -> Vertex -> Vertex -> Bool +path g v w = w `elem` (reachable g v) +\end{code} + + +------------------------------------------------------------ +-- Algorithm 7: Biconnected components +------------------------------------------------------------ + +\begin{code} +bcc :: Graph -> Forest [Vertex] +bcc g = (concat . map bicomps . map (do_label g dnum)) forest + where forest = dff g + dnum = preArr (bounds g) forest + +do_label :: Graph -> Table Int -> Tree Vertex -> Tree (Vertex,Int,Int) +do_label g dnum (Node v ts) = Node (v,dnum!v,lv) us + where us = map (do_label g dnum) ts + lv = minimum ([dnum!v] ++ [dnum!w | w <- g!v] + ++ [lu | Node (u,du,lu) xs <- us]) + +bicomps :: Tree (Vertex,Int,Int) -> Forest [Vertex] +bicomps (Node (v,dv,lv) ts) + = [ Node (v:vs) us | (l,Node vs us) <- map collect ts] + +collect :: Tree (Vertex,Int,Int) -> (Int, Tree [Vertex]) +collect (Node (v,dv,lv) ts) = (lv, Node (v:vs) cs) + where collected = map collect ts + vs = concat [ ws | (lw, Node ws us) <- collected, lw<dv] + cs = concat [ if lw<dv then us else [Node (v:ws) us] + | (lw, Node ws us) <- collected ] +\end{code} + diff --git a/compiler/utils/Encoding.hs b/compiler/utils/Encoding.hs new file mode 100644 index 0000000000..152bf3c60e --- /dev/null +++ b/compiler/utils/Encoding.hs @@ -0,0 +1,373 @@ +-- ----------------------------------------------------------------------------- +-- +-- (c) The University of Glasgow, 1997-2006 +-- +-- Character encodings +-- +-- ----------------------------------------------------------------------------- + +module Encoding ( + -- * UTF-8 + utf8DecodeChar#, + utf8PrevChar, + utf8CharStart, + utf8DecodeChar, + utf8DecodeString, + utf8EncodeChar, + utf8EncodeString, + utf8EncodedLength, + countUTF8Chars, + + -- * Z-encoding + zEncodeString, + zDecodeString + ) where + +#define COMPILING_FAST_STRING +#include "HsVersions.h" +import Foreign +import Data.Char ( ord, chr, isDigit, digitToInt, isHexDigit ) +import Numeric ( showHex ) + +import Data.Bits +import GHC.Ptr ( Ptr(..) ) +import GHC.Base + +-- ----------------------------------------------------------------------------- +-- UTF-8 + +-- We can't write the decoder as efficiently as we'd like without +-- resorting to unboxed extensions, unfortunately. 
I tried to write +-- an IO version of this function, but GHC can't eliminate boxed +-- results from an IO-returning function. +-- +-- We assume we can ignore overflow when parsing a multibyte character here. +-- To make this safe, we add extra sentinel bytes to unparsed UTF-8 sequences +-- before decoding them (see StringBuffer.hs). + +{-# INLINE utf8DecodeChar# #-} +utf8DecodeChar# :: Addr# -> (# Char#, Addr# #) +utf8DecodeChar# a# = + let ch0 = word2Int# (indexWord8OffAddr# a# 0#) in + case () of + _ | ch0 <=# 0x7F# -> (# chr# ch0, a# `plusAddr#` 1# #) + + | ch0 >=# 0xC0# && ch0 <=# 0xDF# -> + let ch1 = word2Int# (indexWord8OffAddr# a# 1#) in + if ch1 <# 0x80# || ch1 >=# 0xC0# then fail 1# else + (# chr# (((ch0 -# 0xC0#) `uncheckedIShiftL#` 6#) +# + (ch1 -# 0x80#)), + a# `plusAddr#` 2# #) + + | ch0 >=# 0xE0# && ch0 <=# 0xEF# -> + let ch1 = word2Int# (indexWord8OffAddr# a# 1#) in + if ch1 <# 0x80# || ch1 >=# 0xC0# then fail 1# else + let ch2 = word2Int# (indexWord8OffAddr# a# 2#) in + if ch2 <# 0x80# || ch2 >=# 0xC0# then fail 2# else + (# chr# (((ch0 -# 0xE0#) `uncheckedIShiftL#` 12#) +# + ((ch1 -# 0x80#) `uncheckedIShiftL#` 6#) +# + (ch2 -# 0x80#)), + a# `plusAddr#` 3# #) + + | ch0 >=# 0xF0# && ch0 <=# 0xF8# -> + let ch1 = word2Int# (indexWord8OffAddr# a# 1#) in + if ch1 <# 0x80# || ch1 >=# 0xC0# then fail 1# else + let ch2 = word2Int# (indexWord8OffAddr# a# 2#) in + if ch2 <# 0x80# || ch2 >=# 0xC0# then fail 2# else + let ch3 = word2Int# (indexWord8OffAddr# a# 3#) in + if ch3 <# 0x80# || ch3 >=# 0xC0# then fail 3# else + (# chr# (((ch0 -# 0xF0#) `uncheckedIShiftL#` 18#) +# + ((ch1 -# 0x80#) `uncheckedIShiftL#` 12#) +# + ((ch2 -# 0x80#) `uncheckedIShiftL#` 6#) +# + (ch3 -# 0x80#)), + a# `plusAddr#` 4# #) + + | otherwise -> fail 1# + where + -- all invalid sequences end up here: + fail n = (# '\0'#, a# `plusAddr#` n #) + -- '\xFFFD' would be the usual replacement character, but + -- that's a valid symbol in Haskell, so will result in a + -- confusing parse error later on. Instead we use '\0' which + -- will signal a lexer error immediately. + +utf8DecodeChar :: Ptr Word8 -> (Char, Ptr Word8) +utf8DecodeChar (Ptr a#) = + case utf8DecodeChar# a# of (# c#, b# #) -> ( C# c#, Ptr b# ) + +-- UTF-8 is cleverly designed so that we can always figure out where +-- the start of the current character is, given any position in a +-- stream. This function finds the start of the previous character, +-- assuming there *is* a previous character. +utf8PrevChar :: Ptr Word8 -> IO (Ptr Word8) +utf8PrevChar p = utf8CharStart (p `plusPtr` (-1)) + +utf8CharStart :: Ptr Word8 -> IO (Ptr Word8) +utf8CharStart p = go p + where go p = do w <- peek p + if w >= 0x80 && w < 0xC0 + then go (p `plusPtr` (-1)) + else return p + +utf8DecodeString :: Ptr Word8 -> Int -> IO [Char] +STRICT2(utf8DecodeString) +utf8DecodeString (Ptr a#) (I# len#) + = unpack a# + where + end# = addr2Int# (a# `plusAddr#` len#) + + unpack p# + | addr2Int# p# >=# end# = return [] + | otherwise = + case utf8DecodeChar# p# of + (# c#, q# #) -> do + chs <- unpack q# + return (C# c# : chs) + +countUTF8Chars :: Ptr Word8 -> Int -> IO Int +countUTF8Chars ptr bytes = go ptr 0 + where + end = ptr `plusPtr` bytes + + STRICT2(go) + go ptr n + | ptr >= end = return n + | otherwise = do + case utf8DecodeChar# (unPtr ptr) of + (# c, a #) -> go (Ptr a) (n+1) + +unPtr (Ptr a) = a + +utf8EncodeChar c ptr = + let x = ord c in + case () of + _ | x > 0 && x <= 0x007f -> do + poke ptr (fromIntegral x) + return (ptr `plusPtr` 1) + -- NB. 
'\0' is encoded as '\xC0\x80', not '\0'. This is so that we + -- can have 0-terminated UTF-8 strings (see GHC.Base.unpackCStringUtf8). + | x <= 0x07ff -> do + poke ptr (fromIntegral (0xC0 .|. ((x `shiftR` 6) .&. 0x1F))) + pokeElemOff ptr 1 (fromIntegral (0x80 .|. (x .&. 0x3F))) + return (ptr `plusPtr` 2) + | x <= 0xffff -> do + poke ptr (fromIntegral (0xE0 .|. (x `shiftR` 12) .&. 0x0F)) + pokeElemOff ptr 1 (fromIntegral (0x80 .|. (x `shiftR` 6) .&. 0x3F)) + pokeElemOff ptr 2 (fromIntegral (0x80 .|. (x .&. 0x3F))) + return (ptr `plusPtr` 3) + | otherwise -> do + poke ptr (fromIntegral (0xF0 .|. (x `shiftR` 18))) + pokeElemOff ptr 1 (fromIntegral (0x80 .|. ((x `shiftR` 12) .&. 0x3F))) + pokeElemOff ptr 2 (fromIntegral (0x80 .|. ((x `shiftR` 6) .&. 0x3F))) + pokeElemOff ptr 3 (fromIntegral (0x80 .|. (x .&. 0x3F))) + return (ptr `plusPtr` 4) + +utf8EncodeString :: Ptr Word8 -> String -> IO () +utf8EncodeString ptr str = go ptr str + where STRICT2(go) + go ptr [] = return () + go ptr (c:cs) = do + ptr' <- utf8EncodeChar c ptr + go ptr' cs + +utf8EncodedLength :: String -> Int +utf8EncodedLength str = go 0 str + where STRICT2(go) + go n [] = n + go n (c:cs) + | ord c > 0 && ord c <= 0x007f = go (n+1) cs + | ord c <= 0x07ff = go (n+2) cs + | ord c <= 0xffff = go (n+3) cs + | otherwise = go (n+4) cs + +-- ----------------------------------------------------------------------------- +-- The Z-encoding + +{- +This is the main name-encoding and decoding function. It encodes any +string into a string that is acceptable as a C name. This is done +right before we emit a symbol name into the compiled C or asm code. +Z-encoding of strings is cached in the FastString interface, so we +never encode the same string more than once. + +The basic encoding scheme is this. + +* Tuples (,,,) are coded as Z3T + +* Alphabetic characters (upper and lower) and digits + all translate to themselves; + except 'Z', which translates to 'ZZ' + and 'z', which translates to 'zz' + We need both so that we can preserve the variable/tycon distinction + +* Most other printable characters translate to 'zx' or 'Zx' for some + alphabetic character x + +* The others translate as 'znnnU' where 'nnn' is the decimal number + of the character + + Before After + -------------------------- + Trak Trak + foo_wib foozuwib + > zg + >1 zg1 + foo# foozh + foo## foozhzh + foo##1 foozhzh1 + fooZ fooZZ + :+ ZCzp + () Z0T 0-tuple + (,,,,) Z5T 5-tuple + (# #) Z1H unboxed 1-tuple (note the space) + (#,,,,#) Z5H unboxed 5-tuple + (NB: There is no Z1T nor Z0H.) 
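+
+  (Note: in the code below the number is in fact written in hexadecimal,
+  and a leading '0' is inserted when the hex form would otherwise start
+  with a letter, so that decoding can tell a numeric escape from the
+  one-letter codes above.  For example '?' encodes as "z3fU" and '\224'
+  as "z0e0U"; see encode_ch and decode_num_esc.)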
+-} + +type UserString = String -- As the user typed it +type EncodedString = String -- Encoded form + + +zEncodeString :: UserString -> EncodedString +zEncodeString cs = case maybe_tuple cs of + Just n -> n -- Tuples go to Z2T etc + Nothing -> go cs + where + go [] = [] + go (c:cs) = encode_ch c ++ go cs + +unencodedChar :: Char -> Bool -- True for chars that don't need encoding +unencodedChar 'Z' = False +unencodedChar 'z' = False +unencodedChar c = c >= 'a' && c <= 'z' + || c >= 'A' && c <= 'Z' + || c >= '0' && c <= '9' + +encode_ch :: Char -> EncodedString +encode_ch c | unencodedChar c = [c] -- Common case first + +-- Constructors +encode_ch '(' = "ZL" -- Needed for things like (,), and (->) +encode_ch ')' = "ZR" -- For symmetry with ( +encode_ch '[' = "ZM" +encode_ch ']' = "ZN" +encode_ch ':' = "ZC" +encode_ch 'Z' = "ZZ" + +-- Variables +encode_ch 'z' = "zz" +encode_ch '&' = "za" +encode_ch '|' = "zb" +encode_ch '^' = "zc" +encode_ch '$' = "zd" +encode_ch '=' = "ze" +encode_ch '>' = "zg" +encode_ch '#' = "zh" +encode_ch '.' = "zi" +encode_ch '<' = "zl" +encode_ch '-' = "zm" +encode_ch '!' = "zn" +encode_ch '+' = "zp" +encode_ch '\'' = "zq" +encode_ch '\\' = "zr" +encode_ch '/' = "zs" +encode_ch '*' = "zt" +encode_ch '_' = "zu" +encode_ch '%' = "zv" +encode_ch c = 'z' : if isDigit (head hex_str) then hex_str + else '0':hex_str + where hex_str = showHex (ord c) "U" + -- ToDo: we could improve the encoding here in various ways. + -- eg. strings of unicode characters come out as 'z1234Uz5678U', we + -- could remove the 'U' in the middle (the 'z' works as a separator). + +zDecodeString :: EncodedString -> UserString +zDecodeString [] = [] +zDecodeString ('Z' : d : rest) + | isDigit d = decode_tuple d rest + | otherwise = decode_upper d : zDecodeString rest +zDecodeString ('z' : d : rest) + | isDigit d = decode_num_esc d rest + | otherwise = decode_lower d : zDecodeString rest +zDecodeString (c : rest) = c : zDecodeString rest + +decode_upper, decode_lower :: Char -> Char + +decode_upper 'L' = '(' +decode_upper 'R' = ')' +decode_upper 'M' = '[' +decode_upper 'N' = ']' +decode_upper 'C' = ':' +decode_upper 'Z' = 'Z' +decode_upper ch = {-pprTrace "decode_upper" (char ch)-} ch + +decode_lower 'z' = 'z' +decode_lower 'a' = '&' +decode_lower 'b' = '|' +decode_lower 'c' = '^' +decode_lower 'd' = '$' +decode_lower 'e' = '=' +decode_lower 'g' = '>' +decode_lower 'h' = '#' +decode_lower 'i' = '.' +decode_lower 'l' = '<' +decode_lower 'm' = '-' +decode_lower 'n' = '!' +decode_lower 'p' = '+' +decode_lower 'q' = '\'' +decode_lower 'r' = '\\' +decode_lower 's' = '/' +decode_lower 't' = '*' +decode_lower 'u' = '_' +decode_lower 'v' = '%' +decode_lower ch = {-pprTrace "decode_lower" (char ch)-} ch + +-- Characters not having a specific code are coded as z224U (in hex) +decode_num_esc d rest + = go (digitToInt d) rest + where + go n (c : rest) | isHexDigit c = go (16*n + digitToInt c) rest + go n ('U' : rest) = chr n : zDecodeString rest + go n other = error ("decode_num_esc: " ++ show n ++ ' ':other) + +decode_tuple :: Char -> EncodedString -> UserString +decode_tuple d rest + = go (digitToInt d) rest + where + -- NB. recurse back to zDecodeString after decoding the tuple, because + -- the tuple might be embedded in a longer name. 
+ go n (c : rest) | isDigit c = go (10*n + digitToInt c) rest + go 0 ('T':rest) = "()" ++ zDecodeString rest + go n ('T':rest) = '(' : replicate (n-1) ',' ++ ")" ++ zDecodeString rest + go 1 ('H':rest) = "(# #)" ++ zDecodeString rest + go n ('H':rest) = '(' : '#' : replicate (n-1) ',' ++ "#)" ++ zDecodeString rest + go n other = error ("decode_tuple: " ++ show n ++ ' ':other) + +{- +Tuples are encoded as + Z3T or Z3H +for 3-tuples or unboxed 3-tuples respectively. No other encoding starts + Z<digit> + +* "(# #)" is the tycon for an unboxed 1-tuple (not 0-tuple) + There are no unboxed 0-tuples. + +* "()" is the tycon for a boxed 0-tuple. + There are no boxed 1-tuples. +-} + +maybe_tuple :: UserString -> Maybe EncodedString + +maybe_tuple "(# #)" = Just("Z1H") +maybe_tuple ('(' : '#' : cs) = case count_commas (0::Int) cs of + (n, '#' : ')' : cs) -> Just ('Z' : shows (n+1) "H") + other -> Nothing +maybe_tuple "()" = Just("Z0T") +maybe_tuple ('(' : cs) = case count_commas (0::Int) cs of + (n, ')' : cs) -> Just ('Z' : shows (n+1) "T") + other -> Nothing +maybe_tuple other = Nothing + +count_commas :: Int -> String -> (Int, String) +count_commas n (',' : cs) = count_commas (n+1) cs +count_commas n cs = (n,cs) diff --git a/compiler/utils/FastMutInt.lhs b/compiler/utils/FastMutInt.lhs new file mode 100644 index 0000000000..b483a1428e --- /dev/null +++ b/compiler/utils/FastMutInt.lhs @@ -0,0 +1,54 @@ +{-# OPTIONS -cpp #-} +-- +-- (c) The University of Glasgow 2002 +-- +-- Unboxed mutable Ints + +\begin{code} +module FastMutInt( + FastMutInt, newFastMutInt, + readFastMutInt, writeFastMutInt + ) where + +#include "MachDeps.h" + +#ifndef SIZEOF_HSINT +#define SIZEOF_HSINT INT_SIZE_IN_BYTES +#endif + + +#if __GLASGOW_HASKELL__ < 503 +import GlaExts +import PrelIOBase +#else +import GHC.Base +import GHC.IOBase +#endif + +#if __GLASGOW_HASKELL__ < 411 +newByteArray# = newCharArray# +#endif +\end{code} + +\begin{code} +#ifdef __GLASGOW_HASKELL__ +data FastMutInt = FastMutInt (MutableByteArray# RealWorld) + +newFastMutInt :: IO FastMutInt +newFastMutInt = IO $ \s -> + case newByteArray# size s of { (# s, arr #) -> + (# s, FastMutInt arr #) } + where I# size = SIZEOF_HSINT + +readFastMutInt :: FastMutInt -> IO Int +readFastMutInt (FastMutInt arr) = IO $ \s -> + case readIntArray# arr 0# s of { (# s, i #) -> + (# s, I# i #) } + +writeFastMutInt :: FastMutInt -> Int -> IO () +writeFastMutInt (FastMutInt arr) (I# i) = IO $ \s -> + case writeIntArray# arr 0# i s of { s -> + (# s, () #) } +\end{code} +#endif + diff --git a/compiler/utils/FastString.lhs b/compiler/utils/FastString.lhs new file mode 100644 index 0000000000..ea307799c4 --- /dev/null +++ b/compiler/utils/FastString.lhs @@ -0,0 +1,499 @@ +% +% (c) The University of Glasgow, 1997-2006 +% +\begin{code} +{- +FastString: A compact, hash-consed, representation of character strings. + Comparison is O(1), and you can get a Unique from them. + Generated by the FSLIT macro + Turn into SDoc with Outputable.ftext + +LitString: Just a wrapper for the Addr# of a C string (Ptr CChar). + Practically no operations + Outputing them is fast + Generated by the SLIT macro + Turn into SDoc with Outputable.ptext + +Use LitString unless you want the facilities of FastString +-} +module FastString + ( + -- * FastStrings + FastString(..), -- not abstract, for now. 
+ + -- ** Construction + mkFastString, + mkFastStringBytes, + mkFastStringForeignPtr, + mkFastString#, + mkZFastString, + mkZFastStringBytes, + + -- ** Deconstruction + unpackFS, -- :: FastString -> String + bytesFS, -- :: FastString -> [Word8] + + -- ** Encoding + isZEncoded, + zEncodeFS, + + -- ** Operations + uniqueOfFS, + lengthFS, + nullFS, + appendFS, + headFS, + tailFS, + concatFS, + consFS, + nilFS, + + -- ** Outputing + hPutFS, + + -- ** Internal + getFastStringTable, + hasZEncoding, + + -- * LitStrings + LitString, + mkLitString#, + strLength + ) where + +-- This #define suppresses the "import FastString" that +-- HsVersions otherwise produces +#define COMPILING_FAST_STRING +#include "HsVersions.h" + +import Encoding + +import Foreign +import Foreign.C +import GHC.Exts +import System.IO.Unsafe ( unsafePerformIO ) +import Control.Monad.ST ( stToIO ) +import Data.IORef ( IORef, newIORef, readIORef, writeIORef ) +import System.IO ( hPutBuf ) +import Data.Maybe ( isJust ) + +import GHC.Arr ( STArray(..), newSTArray ) +import GHC.IOBase ( IO(..) ) +import GHC.Ptr ( Ptr(..) ) + +#define hASH_TBL_SIZE 4091 + + +{-| +A 'FastString' is an array of bytes, hashed to support fast O(1) +comparison. It is also associated with a character encoding, so that +we know how to convert a 'FastString' to the local encoding, or to the +Z-encoding used by the compiler internally. + +'FastString's support a memoized conversion to the Z-encoding via zEncodeFS. +-} + +data FastString = FastString { + uniq :: {-# UNPACK #-} !Int, -- unique id + n_bytes :: {-# UNPACK #-} !Int, -- number of bytes + n_chars :: {-# UNPACK #-} !Int, -- number of chars + buf :: {-# UNPACK #-} !(ForeignPtr Word8), + enc :: FSEncoding + } + +data FSEncoding + = ZEncoded + -- including strings that don't need any encoding + | UTF8Encoded {-# UNPACK #-} !(IORef (Maybe FastString)) + -- A UTF-8 string with a memoized Z-encoding + +instance Eq FastString where + f1 == f2 = uniq f1 == uniq f2 + +instance Ord FastString where + -- Compares lexicographically, not by unique + a <= b = case cmpFS a b of { LT -> True; EQ -> True; GT -> False } + a < b = case cmpFS a b of { LT -> True; EQ -> False; GT -> False } + a >= b = case cmpFS a b of { LT -> False; EQ -> True; GT -> True } + a > b = case cmpFS a b of { LT -> False; EQ -> False; GT -> True } + max x y | x >= y = x + | otherwise = y + min x y | x <= y = x + | otherwise = y + compare a b = cmpFS a b + +instance Show FastString where + show fs = show (unpackFS fs) + +cmpFS :: FastString -> FastString -> Ordering +cmpFS (FastString u1 l1 _ buf1 _) (FastString u2 l2 _ buf2 _) = + if u1 == u2 then EQ else + let l = if l1 <= l2 then l1 else l2 in + inlinePerformIO $ + withForeignPtr buf1 $ \p1 -> + withForeignPtr buf2 $ \p2 -> do + res <- memcmp p1 p2 l + case () of + _ | res < 0 -> return LT + | res == 0 -> if l1 == l2 then return EQ + else if l1 < l2 then return LT + else return GT + | otherwise -> return GT + +#ifndef __HADDOCK__ +foreign import ccall unsafe "ghc_memcmp" + memcmp :: Ptr a -> Ptr b -> Int -> IO Int +#endif + +-- ----------------------------------------------------------------------------- +-- Construction + +{- +Internally, the compiler will maintain a fast string symbol +table, providing sharing and fast comparison. Creation of +new @FastString@s then covertly does a lookup, re-using the +@FastString@ if there was a hit. 
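+
+The table is a mutable array of hASH_TBL_SIZE buckets; each bucket holds
+the list of @FastString@s whose bytes hash to that bucket (see hashStr
+and bucket_match below).  The Int field of the table counts the strings
+interned so far, and is also the source of the unique ids.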
+-} + +data FastStringTable = + FastStringTable + {-# UNPACK #-} !Int + (MutableArray# RealWorld [FastString]) + +string_table :: IORef FastStringTable +string_table = + unsafePerformIO $ do + (STArray _ _ arr#) <- stToIO (newSTArray (0::Int,hASH_TBL_SIZE) []) + newIORef (FastStringTable 0 arr#) + +lookupTbl :: FastStringTable -> Int -> IO [FastString] +lookupTbl (FastStringTable _ arr#) (I# i#) = + IO $ \ s# -> readArray# arr# i# s# + +updTbl :: IORef FastStringTable -> FastStringTable -> Int -> [FastString] -> IO () +updTbl fs_table_var (FastStringTable uid arr#) (I# i#) ls = do + (IO $ \ s# -> case writeArray# arr# i# ls s# of { s2# -> (# s2#, () #) }) + writeIORef fs_table_var (FastStringTable (uid+1) arr#) + +mkFastString# :: Addr# -> FastString +mkFastString# a# = mkFastStringBytes ptr (strLength ptr) + where ptr = Ptr a# + +mkFastStringBytes :: Ptr Word8 -> Int -> FastString +mkFastStringBytes ptr len = unsafePerformIO $ do + ft@(FastStringTable uid tbl#) <- readIORef string_table + let + h = hashStr ptr len + add_it ls = do + fs <- copyNewFastString uid ptr len + updTbl string_table ft h (fs:ls) + {- _trace ("new: " ++ show f_str) $ -} + return fs + -- + lookup_result <- lookupTbl ft h + case lookup_result of + [] -> add_it [] + ls -> do + b <- bucket_match ls len ptr + case b of + Nothing -> add_it ls + Just v -> {- _trace ("re-use: "++show v) $ -} return v + +mkZFastStringBytes :: Ptr Word8 -> Int -> FastString +mkZFastStringBytes ptr len = unsafePerformIO $ do + ft@(FastStringTable uid tbl#) <- readIORef string_table + let + h = hashStr ptr len + add_it ls = do + fs <- copyNewZFastString uid ptr len + updTbl string_table ft h (fs:ls) + {- _trace ("new: " ++ show f_str) $ -} + return fs + -- + lookup_result <- lookupTbl ft h + case lookup_result of + [] -> add_it [] + ls -> do + b <- bucket_match ls len ptr + case b of + Nothing -> add_it ls + Just v -> {- _trace ("re-use: "++show v) $ -} return v + +-- | Create a 'FastString' from an existing 'ForeignPtr'; the difference +-- between this and 'mkFastStringBytes' is that we don't have to copy +-- the bytes if the string is new to the table. 
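+-- The 'Ptr' is assumed to point into the block owned by the 'ForeignPtr'
+-- (that is how 'mkFastString' below calls it); it is the 'ForeignPtr'
+-- that ends up stored in the resulting 'FastString'.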
+mkFastStringForeignPtr :: Ptr Word8 -> ForeignPtr Word8 -> Int -> IO FastString +mkFastStringForeignPtr ptr fp len = do + ft@(FastStringTable uid tbl#) <- readIORef string_table +-- _trace ("hashed: "++show (I# h)) $ + let + h = hashStr ptr len + add_it ls = do + fs <- mkNewFastString uid ptr fp len + updTbl string_table ft h (fs:ls) + {- _trace ("new: " ++ show f_str) $ -} + return fs + -- + lookup_result <- lookupTbl ft h + case lookup_result of + [] -> add_it [] + ls -> do + b <- bucket_match ls len ptr + case b of + Nothing -> add_it ls + Just v -> {- _trace ("re-use: "++show v) $ -} return v + +mkZFastStringForeignPtr :: Ptr Word8 -> ForeignPtr Word8 -> Int -> IO FastString +mkZFastStringForeignPtr ptr fp len = do + ft@(FastStringTable uid tbl#) <- readIORef string_table +-- _trace ("hashed: "++show (I# h)) $ + let + h = hashStr ptr len + add_it ls = do + fs <- mkNewZFastString uid ptr fp len + updTbl string_table ft h (fs:ls) + {- _trace ("new: " ++ show f_str) $ -} + return fs + -- + lookup_result <- lookupTbl ft h + case lookup_result of + [] -> add_it [] + ls -> do + b <- bucket_match ls len ptr + case b of + Nothing -> add_it ls + Just v -> {- _trace ("re-use: "++show v) $ -} return v + + +-- | Creates a UTF-8 encoded 'FastString' from a 'String' +mkFastString :: String -> FastString +mkFastString str = + inlinePerformIO $ do + let l = utf8EncodedLength str + buf <- mallocForeignPtrBytes l + withForeignPtr buf $ \ptr -> do + utf8EncodeString ptr str + mkFastStringForeignPtr ptr buf l + + +-- | Creates a Z-encoded 'FastString' from a 'String' +mkZFastString :: String -> FastString +mkZFastString str = + inlinePerformIO $ do + let l = Prelude.length str + buf <- mallocForeignPtrBytes l + withForeignPtr buf $ \ptr -> do + pokeCAString (castPtr ptr) str + mkZFastStringForeignPtr ptr buf l + +bucket_match [] _ _ = return Nothing +bucket_match (v@(FastString _ l _ buf _):ls) len ptr + | len == l = do + b <- cmpStringPrefix ptr buf len + if b then return (Just v) + else bucket_match ls len ptr + | otherwise = + bucket_match ls len ptr + +mkNewFastString uid ptr fp len = do + ref <- newIORef Nothing + n_chars <- countUTF8Chars ptr len + return (FastString uid len n_chars fp (UTF8Encoded ref)) + +mkNewZFastString uid ptr fp len = do + return (FastString uid len len fp ZEncoded) + + +copyNewFastString uid ptr len = do + fp <- copyBytesToForeignPtr ptr len + ref <- newIORef Nothing + n_chars <- countUTF8Chars ptr len + return (FastString uid len n_chars fp (UTF8Encoded ref)) + +copyNewZFastString uid ptr len = do + fp <- copyBytesToForeignPtr ptr len + return (FastString uid len len fp ZEncoded) + + +copyBytesToForeignPtr ptr len = do + fp <- mallocForeignPtrBytes len + withForeignPtr fp $ \ptr' -> copyBytes ptr' ptr len + return fp + +cmpStringPrefix :: Ptr Word8 -> ForeignPtr Word8 -> Int -> IO Bool +cmpStringPrefix ptr fp len = + withForeignPtr fp $ \ptr' -> do + r <- memcmp ptr ptr' len + return (r == 0) + + +hashStr :: Ptr Word8 -> Int -> Int + -- use the Addr to produce a hash value between 0 & m (inclusive) +hashStr (Ptr a#) (I# len#) = loop 0# 0# + where + loop h n | n ==# len# = I# h + | otherwise = loop h2 (n +# 1#) + where c = ord# (indexCharOffAddr# a# n) + h2 = (c +# (h *# 128#)) `remInt#` hASH_TBL_SIZE# + +-- ----------------------------------------------------------------------------- +-- Operations + +-- | Returns the length of the 'FastString' in characters +lengthFS :: FastString -> Int +lengthFS f = n_chars f + +-- | Returns 'True' if the 'FastString' is Z-encoded 
+isZEncoded :: FastString -> Bool +isZEncoded fs | ZEncoded <- enc fs = True + | otherwise = False + +-- | Returns 'True' if this 'FastString' is not Z-encoded but already has +-- a Z-encoding cached (used in producing stats). +hasZEncoding :: FastString -> Bool +hasZEncoding fs@(FastString uid n_bytes _ fp enc) = + case enc of + ZEncoded -> False + UTF8Encoded ref -> + inlinePerformIO $ do + m <- readIORef ref + return (isJust m) + +-- | Returns 'True' if the 'FastString' is empty +nullFS :: FastString -> Bool +nullFS f = n_bytes f == 0 + +-- | unpacks and decodes the FastString +unpackFS :: FastString -> String +unpackFS (FastString _ n_bytes _ buf enc) = + inlinePerformIO $ withForeignPtr buf $ \ptr -> + case enc of + ZEncoded -> peekCAStringLen (castPtr ptr,n_bytes) + UTF8Encoded _ -> utf8DecodeString ptr n_bytes + +bytesFS :: FastString -> [Word8] +bytesFS (FastString _ n_bytes _ buf enc) = + inlinePerformIO $ withForeignPtr buf $ \ptr -> + peekArray n_bytes ptr + +-- | returns a Z-encoded version of a 'FastString'. This might be the +-- original, if it was already Z-encoded. The first time this +-- function is applied to a particular 'FastString', the results are +-- memoized. +-- +zEncodeFS :: FastString -> FastString +zEncodeFS fs@(FastString uid n_bytes _ fp enc) = + case enc of + ZEncoded -> fs + UTF8Encoded ref -> + inlinePerformIO $ do + m <- readIORef ref + case m of + Just fs -> return fs + Nothing -> do + let efs = mkZFastString (zEncodeString (unpackFS fs)) + writeIORef ref (Just efs) + return efs + +appendFS :: FastString -> FastString -> FastString +appendFS fs1 fs2 = mkFastString (unpackFS fs1 ++ unpackFS fs2) + +concatFS :: [FastString] -> FastString +concatFS ls = mkFastString (Prelude.concat (map unpackFS ls)) -- ToDo: do better + +headFS :: FastString -> Char +headFS (FastString _ n_bytes _ buf enc) = + inlinePerformIO $ withForeignPtr buf $ \ptr -> do + case enc of + ZEncoded -> do + w <- peek (castPtr ptr) + return (castCCharToChar w) + UTF8Encoded _ -> + return (fst (utf8DecodeChar ptr)) + +tailFS :: FastString -> FastString +tailFS (FastString _ n_bytes _ buf enc) = + inlinePerformIO $ withForeignPtr buf $ \ptr -> do + case enc of + ZEncoded -> do + return $! mkZFastStringBytes (ptr `plusPtr` 1) (n_bytes - 1) + UTF8Encoded _ -> do + let (_,ptr') = utf8DecodeChar ptr + let off = ptr' `minusPtr` ptr + return $! mkFastStringBytes (ptr `plusPtr` off) (n_bytes - off) + +consFS :: Char -> FastString -> FastString +consFS c fs = mkFastString (c : unpackFS fs) + +uniqueOfFS :: FastString -> Int# +uniqueOfFS (FastString (I# u#) _ _ _ _) = u# + +nilFS = mkFastString "" + +-- ----------------------------------------------------------------------------- +-- Stats + +getFastStringTable :: IO [[FastString]] +getFastStringTable = do + tbl <- readIORef string_table + buckets <- mapM (lookupTbl tbl) [0 .. hASH_TBL_SIZE] + return buckets + +-- ----------------------------------------------------------------------------- +-- Outputting 'FastString's + +-- |Outputs a 'FastString' with /no decoding at all/, that is, you +-- get the actual bytes in the 'FastString' written to the 'Handle'. +hPutFS handle (FastString _ len _ fp _) + | len == 0 = return () + | otherwise = do withForeignPtr fp $ \ptr -> hPutBuf handle ptr len + +-- ToDo: we'll probably want an hPutFSLocal, or something, to output +-- in the current locale's encoding (for error messages and suchlike). 
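+
+-- A small usage sketch (illustrative only; fs1 and fs2 are made-up names,
+-- not part of this module):
+--
+--      let fs1 = mkFastString "wibble"     -- UTF-8 encoded and interned
+--          fs2 = mkFastString "wibble"
+--      in (fs1 == fs2, lengthFS fs1)
+--
+-- evaluates to (True, 6): equality is an O(1) comparison of the uniques
+-- handed out by the string table, and the length is counted in characters.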
+ +-- ----------------------------------------------------------------------------- +-- LitStrings, here for convenience only. + +type LitString = Ptr () + +mkLitString# :: Addr# -> LitString +mkLitString# a# = Ptr a# + +foreign import ccall unsafe "ghc_strlen" + strLength :: Ptr () -> Int + +-- ----------------------------------------------------------------------------- +-- under the carpet + +-- Just like unsafePerformIO, but we inline it. +{-# INLINE inlinePerformIO #-} +inlinePerformIO :: IO a -> a +inlinePerformIO (IO m) = case m realWorld# of (# _, r #) -> r + +-- NB. does *not* add a '\0'-terminator. +pokeCAString :: Ptr CChar -> String -> IO () +pokeCAString ptr str = + let + go [] n = return () + go (c:cs) n = do pokeElemOff ptr n (castCharToCChar c); go cs (n+1) + in + go str 0 + +#if __GLASGOW_HASKELL__ < 600 + +mallocForeignPtrBytes :: Int -> IO (ForeignPtr a) +mallocForeignPtrBytes n = do + r <- mallocBytes n + newForeignPtr r (finalizerFree r) + +foreign import ccall unsafe "stdlib.h free" + finalizerFree :: Ptr a -> IO () + +peekCAStringLen = peekCStringLen + +#elif __GLASGOW_HASKELL__ <= 602 + +peekCAStringLen = peekCStringLen + +#endif +\end{code} diff --git a/compiler/utils/FastTypes.lhs b/compiler/utils/FastTypes.lhs new file mode 100644 index 0000000000..bb92c8c02f --- /dev/null +++ b/compiler/utils/FastTypes.lhs @@ -0,0 +1,65 @@ +% +% (c) The University of Glasgow, 2000 +% +\section{Fast integers and booleans} + +\begin{code} +module FastTypes ( + FastInt, _ILIT, iBox, iUnbox, + (+#), (-#), (*#), quotFastInt, negateFastInt, + (==#), (<#), (<=#), (>=#), (>#), + + FastBool, fastBool, isFastTrue, fastOr, fastAnd + ) where + +#include "HsVersions.h" + +#if defined(__GLASGOW_HASKELL__) + +-- Import the beggars +import GLAEXTS + ( Int(..), Int#, (+#), (-#), (*#), + quotInt#, negateInt#, (==#), (<#), (<=#), (>=#), (>#) + ) + +type FastInt = Int# +_ILIT (I# x) = x +iBox x = I# x +iUnbox (I# x) = x +quotFastInt = quotInt# +negateFastInt = negateInt# + +type FastBool = Int# +fastBool True = 1# +fastBool False = 0# +isFastTrue x = x ==# 1# + +fastOr 1# _ = 1# +fastOr 0# x = x + +fastAnd 0# x = 0# +fastAnd 1# x = x + +#else /* ! __GLASGOW_HASKELL__ */ + +type FastInt = Int +_ILIT x = x +iBox x = x +iUnbox x = x +(+#) = (+) +(-#) = (-) +(*#) = (*) +quotFastInt = quot +negateFastInt = negate +(==#) = (==) +(<#) = (<) +(<=#) = (<=) +(>=#) = (>=) +(>#) = (>) + +type FastBool = Bool +fastBool x = x +_IS_TRUE_ x = x + +#endif /* ! __GLASGOW_HASKELL__ */ +\end{code} diff --git a/compiler/utils/FiniteMap.lhs b/compiler/utils/FiniteMap.lhs new file mode 100644 index 0000000000..9168d3656f --- /dev/null +++ b/compiler/utils/FiniteMap.lhs @@ -0,0 +1,749 @@ + +% (c) The AQUA Project, Glasgow University, 1994-1998 +% +\section[FiniteMap]{An implementation of finite maps} + +``Finite maps'' are the heart of the compiler's +lookup-tables/environments and its implementation of sets. Important +stuff! + +This code is derived from that in the paper: +\begin{display} + S Adams + "Efficient sets: a balancing act" + Journal of functional programming 3(4) Oct 1993, pp553-562 +\end{display} + +The code is SPECIALIZEd to various highly-desirable types (e.g., Id) +near the end. 
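+
+A small example of the interface (illustrative only):
+\begin{display}
+        let fm = listToFM [(1,'a'), (2,'b'), (1,'c')]
+        in  (lookupFM fm 1, lookupFM fm 3)    ==>   (Just 'c', Nothing)
+\end{display}
+The second binding for the key 1 wins, as promised by the comment on
+@listToFM@ below.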
+ +\begin{code} + +module FiniteMap ( + FiniteMap, -- abstract type + + emptyFM, unitFM, listToFM, + + addToFM, + addToFM_C, + addListToFM, + addListToFM_C, + delFromFM, + delListFromFM, + + plusFM, + plusFM_C, + minusFM, + foldFM, + + intersectFM, + intersectFM_C, + mapFM, filterFM, + + sizeFM, isEmptyFM, elemFM, lookupFM, lookupWithDefaultFM, + + fmToList, keysFM, eltsFM + + , bagToFM + + ) where + +#include "HsVersions.h" +#define IF_NOT_GHC(a) {--} + +#if defined(DEBUG_FINITEMAPS)/* NB NB NB */ +#define OUTPUTABLE_key , Outputable key +#else +#define OUTPUTABLE_key {--} +#endif + +import Maybes +import Bag ( Bag, foldrBag ) +import Util +import Outputable + +import GLAEXTS + +#if ! OMIT_NATIVE_CODEGEN +# define IF_NCG(a) a +#else +# define IF_NCG(a) {--} +#endif + + +-- SIGH: but we use unboxed "sizes"... +#if __GLASGOW_HASKELL__ +#define IF_GHC(a,b) a +#else /* not GHC */ +#define IF_GHC(a,b) b +#endif /* not GHC */ +\end{code} + + +%************************************************************************ +%* * +\subsection{The signature of the module} +%* * +%************************************************************************ + +\begin{code} +-- BUILDING +emptyFM :: FiniteMap key elt +unitFM :: key -> elt -> FiniteMap key elt +listToFM :: (Ord key OUTPUTABLE_key) => [(key,elt)] -> FiniteMap key elt + -- In the case of duplicates, the last is taken +bagToFM :: (Ord key OUTPUTABLE_key) => Bag (key,elt) -> FiniteMap key elt + -- In the case of duplicates, who knows which is taken + +-- ADDING AND DELETING + -- Throws away any previous binding + -- In the list case, the items are added starting with the + -- first one in the list +addToFM :: (Ord key OUTPUTABLE_key) => FiniteMap key elt -> key -> elt -> FiniteMap key elt +addListToFM :: (Ord key OUTPUTABLE_key) => FiniteMap key elt -> [(key,elt)] -> FiniteMap key elt + + -- Combines with previous binding + -- The combining fn goes (old -> new -> new) +addToFM_C :: (Ord key OUTPUTABLE_key) => (elt -> elt -> elt) + -> FiniteMap key elt -> key -> elt + -> FiniteMap key elt +addListToFM_C :: (Ord key OUTPUTABLE_key) => (elt -> elt -> elt) + -> FiniteMap key elt -> [(key,elt)] + -> FiniteMap key elt + + -- Deletion doesn't complain if you try to delete something + -- which isn't there +delFromFM :: (Ord key OUTPUTABLE_key) => FiniteMap key elt -> key -> FiniteMap key elt +delListFromFM :: (Ord key OUTPUTABLE_key) => FiniteMap key elt -> [key] -> FiniteMap key elt + +-- COMBINING + -- Bindings in right argument shadow those in the left +plusFM :: (Ord key OUTPUTABLE_key) => FiniteMap key elt -> FiniteMap key elt + -> FiniteMap key elt + + -- Combines bindings for the same thing with the given function +plusFM_C :: (Ord key OUTPUTABLE_key) => (elt -> elt -> elt) + -> FiniteMap key elt -> FiniteMap key elt -> FiniteMap key elt + +minusFM :: (Ord key OUTPUTABLE_key) => FiniteMap key elt -> FiniteMap key elt -> FiniteMap key elt + -- (minusFM a1 a2) deletes from a1 any bindings which are bound in a2 + +intersectFM :: (Ord key OUTPUTABLE_key) => FiniteMap key elt -> FiniteMap key elt -> FiniteMap key elt +intersectFM_C :: (Ord key OUTPUTABLE_key) => (elt1 -> elt2 -> elt3) + -> FiniteMap key elt1 -> FiniteMap key elt2 -> FiniteMap key elt3 + +-- MAPPING, FOLDING, FILTERING +foldFM :: (key -> elt -> a -> a) -> a -> FiniteMap key elt -> a +mapFM :: (key -> elt1 -> elt2) -> FiniteMap key elt1 -> FiniteMap key elt2 +filterFM :: (Ord key OUTPUTABLE_key) => (key -> elt -> Bool) + -> FiniteMap key elt -> FiniteMap key elt + + +-- INTERROGATING +sizeFM 
:: FiniteMap key elt -> Int +isEmptyFM :: FiniteMap key elt -> Bool + +elemFM :: (Ord key OUTPUTABLE_key) => key -> FiniteMap key elt -> Bool +lookupFM :: (Ord key OUTPUTABLE_key) => FiniteMap key elt -> key -> Maybe elt +lookupWithDefaultFM + :: (Ord key OUTPUTABLE_key) => FiniteMap key elt -> elt -> key -> elt + -- lookupWithDefaultFM supplies a "default" elt + -- to return for an unmapped key + +-- LISTIFYING +fmToList :: FiniteMap key elt -> [(key,elt)] +keysFM :: FiniteMap key elt -> [key] +eltsFM :: FiniteMap key elt -> [elt] +\end{code} + +%************************************************************************ +%* * +\subsection{The @FiniteMap@ data type, and building of same} +%* * +%************************************************************************ + +Invariants about @FiniteMap@: +\begin{enumerate} +\item +all keys in a FiniteMap are distinct +\item +all keys in left subtree are $<$ key in Branch and +all keys in right subtree are $>$ key in Branch +\item +size field of a Branch gives number of Branch nodes in the tree +\item +size of left subtree is differs from size of right subtree by a +factor of at most \tr{sIZE_RATIO} +\end{enumerate} + +\begin{code} +data FiniteMap key elt + = EmptyFM + | Branch key elt -- Key and elt stored here + IF_GHC(Int#,Int{-STRICT-}) -- Size >= 1 + (FiniteMap key elt) -- Children + (FiniteMap key elt) +\end{code} + +\begin{code} +emptyFM = EmptyFM +{- +emptyFM + = Branch bottom bottom IF_GHC(0#,0) bottom bottom + where + bottom = panic "emptyFM" +-} + +-- #define EmptyFM (Branch _ _ IF_GHC(0#,0) _ _) + +unitFM key elt = Branch key elt IF_GHC(1#,1) emptyFM emptyFM + +listToFM = addListToFM emptyFM + +bagToFM = foldrBag (\(k,v) fm -> addToFM fm k v) emptyFM +\end{code} + +%************************************************************************ +%* * +\subsection{Adding to and deleting from @FiniteMaps@} +%* * +%************************************************************************ + +\begin{code} +addToFM fm key elt = addToFM_C (\ old new -> new) fm key elt + +addToFM_C combiner EmptyFM key elt = unitFM key elt +addToFM_C combiner (Branch key elt size fm_l fm_r) new_key new_elt + = case compare new_key key of + LT -> mkBalBranch key elt (addToFM_C combiner fm_l new_key new_elt) fm_r + GT -> mkBalBranch key elt fm_l (addToFM_C combiner fm_r new_key new_elt) + EQ -> Branch new_key (combiner elt new_elt) size fm_l fm_r + +addListToFM fm key_elt_pairs = addListToFM_C (\ old new -> new) fm key_elt_pairs + +addListToFM_C combiner fm key_elt_pairs + = foldl' add fm key_elt_pairs -- foldl adds from the left + where + add fmap (key,elt) = addToFM_C combiner fmap key elt +\end{code} + +\begin{code} +delFromFM EmptyFM del_key = emptyFM +delFromFM (Branch key elt size fm_l fm_r) del_key + = case compare del_key key of + GT -> mkBalBranch key elt fm_l (delFromFM fm_r del_key) + LT -> mkBalBranch key elt (delFromFM fm_l del_key) fm_r + EQ -> glueBal fm_l fm_r + +delListFromFM fm keys = foldl' delFromFM fm keys +\end{code} + +%************************************************************************ +%* * +\subsection{Combining @FiniteMaps@} +%* * +%************************************************************************ + +\begin{code} +plusFM_C combiner EmptyFM fm2 = fm2 +plusFM_C combiner fm1 EmptyFM = fm1 +plusFM_C combiner fm1 (Branch split_key elt2 _ left right) + = mkVBalBranch split_key new_elt + (plusFM_C combiner lts left) + (plusFM_C combiner gts right) + where + lts = splitLT fm1 split_key + gts = splitGT fm1 split_key + new_elt = case 
lookupFM fm1 split_key of + Nothing -> elt2 + Just elt1 -> combiner elt1 elt2 + +-- It's worth doing plusFM specially, because we don't need +-- to do the lookup in fm1. +-- FM2 over-rides FM1. + +plusFM EmptyFM fm2 = fm2 +plusFM fm1 EmptyFM = fm1 +plusFM fm1 (Branch split_key elt1 _ left right) + = mkVBalBranch split_key elt1 (plusFM lts left) (plusFM gts right) + where + lts = splitLT fm1 split_key + gts = splitGT fm1 split_key + +minusFM EmptyFM fm2 = emptyFM +minusFM fm1 EmptyFM = fm1 +minusFM fm1 (Branch split_key elt _ left right) + = glueVBal (minusFM lts left) (minusFM gts right) + -- The two can be way different, so we need glueVBal + where + lts = splitLT fm1 split_key -- NB gt and lt, so the equal ones + gts = splitGT fm1 split_key -- are not in either. + +intersectFM fm1 fm2 = intersectFM_C (\ left right -> right) fm1 fm2 + +intersectFM_C combiner fm1 EmptyFM = emptyFM +intersectFM_C combiner EmptyFM fm2 = emptyFM +intersectFM_C combiner fm1 (Branch split_key elt2 _ left right) + + | maybeToBool maybe_elt1 -- split_elt *is* in intersection + = mkVBalBranch split_key (combiner elt1 elt2) (intersectFM_C combiner lts left) + (intersectFM_C combiner gts right) + + | otherwise -- split_elt is *not* in intersection + = glueVBal (intersectFM_C combiner lts left) (intersectFM_C combiner gts right) + + where + lts = splitLT fm1 split_key -- NB gt and lt, so the equal ones + gts = splitGT fm1 split_key -- are not in either. + + maybe_elt1 = lookupFM fm1 split_key + Just elt1 = maybe_elt1 +\end{code} + +%************************************************************************ +%* * +\subsection{Mapping, folding, and filtering with @FiniteMaps@} +%* * +%************************************************************************ + +\begin{code} +foldFM k z EmptyFM = z +foldFM k z (Branch key elt _ fm_l fm_r) + = foldFM k (k key elt (foldFM k z fm_r)) fm_l + +mapFM f EmptyFM = emptyFM +mapFM f (Branch key elt size fm_l fm_r) + = Branch key (f key elt) size (mapFM f fm_l) (mapFM f fm_r) + +filterFM p EmptyFM = emptyFM +filterFM p (Branch key elt _ fm_l fm_r) + | p key elt -- Keep the item + = mkVBalBranch key elt (filterFM p fm_l) (filterFM p fm_r) + + | otherwise -- Drop the item + = glueVBal (filterFM p fm_l) (filterFM p fm_r) +\end{code} + +%************************************************************************ +%* * +\subsection{Interrogating @FiniteMaps@} +%* * +%************************************************************************ + +\begin{code} +--{-# INLINE sizeFM #-} +sizeFM EmptyFM = 0 +sizeFM (Branch _ _ size _ _) = IF_GHC(I# size, size) + +isEmptyFM fm = sizeFM fm == 0 + +lookupFM EmptyFM key = Nothing +lookupFM (Branch key elt _ fm_l fm_r) key_to_find + = case compare key_to_find key of + LT -> lookupFM fm_l key_to_find + GT -> lookupFM fm_r key_to_find + EQ -> Just elt + +key `elemFM` fm + = case (lookupFM fm key) of { Nothing -> False; Just elt -> True } + +lookupWithDefaultFM fm deflt key + = case (lookupFM fm key) of { Nothing -> deflt; Just elt -> elt } +\end{code} + +%************************************************************************ +%* * +\subsection{Listifying @FiniteMaps@} +%* * +%************************************************************************ + +\begin{code} +fmToList fm = foldFM (\ key elt rest -> (key,elt) : rest) [] fm +keysFM fm = foldFM (\ key elt rest -> key : rest) [] fm +eltsFM fm = foldFM (\ key elt rest -> elt : rest) [] fm +\end{code} + + +%************************************************************************ +%* * +\subsection{The 
implementation of balancing} +%* * +%************************************************************************ + +%************************************************************************ +%* * +\subsubsection{Basic construction of a @FiniteMap@} +%* * +%************************************************************************ + +@mkBranch@ simply gets the size component right. This is the ONLY +(non-trivial) place the Branch object is built, so the ASSERTion +recursively checks consistency. (The trivial use of Branch is in +@unitFM@.) + +\begin{code} +sIZE_RATIO :: Int +sIZE_RATIO = 5 + +mkBranch :: (Ord key OUTPUTABLE_key) -- Used for the assertion checking only + => Int + -> key -> elt + -> FiniteMap key elt -> FiniteMap key elt + -> FiniteMap key elt + +mkBranch which key elt fm_l fm_r + = --ASSERT( left_ok && right_ok && balance_ok ) +#if defined(DEBUG_FINITEMAPS) + if not ( left_ok && right_ok && balance_ok ) then + pprPanic ("mkBranch:"++show which) (vcat [ppr [left_ok, right_ok, balance_ok], + ppr key, + ppr fm_l, + ppr fm_r]) + else +#endif + let + result = Branch key elt (unbox (1 + left_size + right_size)) fm_l fm_r + in +-- if sizeFM result <= 8 then + result +-- else +-- pprTrace ("mkBranch:"++(show which)) (ppr result) ( +-- result +-- ) + where + left_ok = case fm_l of + EmptyFM -> True + Branch left_key _ _ _ _ -> let + biggest_left_key = fst (findMax fm_l) + in + biggest_left_key < key + right_ok = case fm_r of + EmptyFM -> True + Branch right_key _ _ _ _ -> let + smallest_right_key = fst (findMin fm_r) + in + key < smallest_right_key + balance_ok = True -- sigh +{- LATER: + balance_ok + = -- Both subtrees have one or no elements... + (left_size + right_size <= 1) +-- NO || left_size == 0 -- ??? +-- NO || right_size == 0 -- ??? + -- ... or the number of elements in a subtree does not exceed + -- sIZE_RATIO times the number of elements in the other subtree + || (left_size * sIZE_RATIO >= right_size && + right_size * sIZE_RATIO >= left_size) +-} + + left_size = sizeFM fm_l + right_size = sizeFM fm_r + +#ifdef __GLASGOW_HASKELL__ + unbox :: Int -> Int# + unbox (I# size) = size +#else + unbox :: Int -> Int + unbox x = x +#endif +\end{code} + +%************************************************************************ +%* * +\subsubsection{{\em Balanced} construction of a @FiniteMap@} +%* * +%************************************************************************ + +@mkBalBranch@ rebalances, assuming that the subtrees aren't too far +out of whack. 
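+
+For orientation, this is the shape of the single left rotation performed
+by @single_L@ below (@single_R@ is its mirror image, and the double
+rotations are the two-step versions):
+\begin{verbatim}
+          key                            key_r
+         /   \                          /     \
+      fm_l   key_r         ==>       key      fm_rr
+            /     \                 /   \
+         fm_rl   fm_rr           fm_l   fm_rl
+\end{verbatim}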
+ +\begin{code} +mkBalBranch :: (Ord key OUTPUTABLE_key) + => key -> elt + -> FiniteMap key elt -> FiniteMap key elt + -> FiniteMap key elt + +mkBalBranch key elt fm_L fm_R + + | size_l + size_r < 2 + = mkBranch 1{-which-} key elt fm_L fm_R + + | size_r > sIZE_RATIO * size_l -- Right tree too big + = case fm_R of + Branch _ _ _ fm_rl fm_rr + | sizeFM fm_rl < 2 * sizeFM fm_rr -> single_L fm_L fm_R + | otherwise -> double_L fm_L fm_R + -- Other case impossible + + | size_l > sIZE_RATIO * size_r -- Left tree too big + = case fm_L of + Branch _ _ _ fm_ll fm_lr + | sizeFM fm_lr < 2 * sizeFM fm_ll -> single_R fm_L fm_R + | otherwise -> double_R fm_L fm_R + -- Other case impossible + + | otherwise -- No imbalance + = mkBranch 2{-which-} key elt fm_L fm_R + + where + size_l = sizeFM fm_L + size_r = sizeFM fm_R + + single_L fm_l (Branch key_r elt_r _ fm_rl fm_rr) + = mkBranch 3{-which-} key_r elt_r (mkBranch 4{-which-} key elt fm_l fm_rl) fm_rr + + double_L fm_l (Branch key_r elt_r _ (Branch key_rl elt_rl _ fm_rll fm_rlr) fm_rr) + = mkBranch 5{-which-} key_rl elt_rl (mkBranch 6{-which-} key elt fm_l fm_rll) + (mkBranch 7{-which-} key_r elt_r fm_rlr fm_rr) + + single_R (Branch key_l elt_l _ fm_ll fm_lr) fm_r + = mkBranch 8{-which-} key_l elt_l fm_ll (mkBranch 9{-which-} key elt fm_lr fm_r) + + double_R (Branch key_l elt_l _ fm_ll (Branch key_lr elt_lr _ fm_lrl fm_lrr)) fm_r + = mkBranch 10{-which-} key_lr elt_lr (mkBranch 11{-which-} key_l elt_l fm_ll fm_lrl) + (mkBranch 12{-which-} key elt fm_lrr fm_r) +\end{code} + + +\begin{code} +mkVBalBranch :: (Ord key OUTPUTABLE_key) + => key -> elt + -> FiniteMap key elt -> FiniteMap key elt + -> FiniteMap key elt + +-- Assert: in any call to (mkVBalBranch_C comb key elt l r), +-- (a) all keys in l are < all keys in r +-- (b) all keys in l are < key +-- (c) all keys in r are > key + +mkVBalBranch key elt EmptyFM fm_r = addToFM fm_r key elt +mkVBalBranch key elt fm_l EmptyFM = addToFM fm_l key elt + +mkVBalBranch key elt fm_l@(Branch key_l elt_l _ fm_ll fm_lr) + fm_r@(Branch key_r elt_r _ fm_rl fm_rr) + | sIZE_RATIO * size_l < size_r + = mkBalBranch key_r elt_r (mkVBalBranch key elt fm_l fm_rl) fm_rr + + | sIZE_RATIO * size_r < size_l + = mkBalBranch key_l elt_l fm_ll (mkVBalBranch key elt fm_lr fm_r) + + | otherwise + = mkBranch 13{-which-} key elt fm_l fm_r + + where + size_l = sizeFM fm_l + size_r = sizeFM fm_r +\end{code} + +%************************************************************************ +%* * +\subsubsection{Gluing two trees together} +%* * +%************************************************************************ + +@glueBal@ assumes its two arguments aren't too far out of whack, just +like @mkBalBranch@. But: all keys in first arg are $<$ all keys in +second. + +\begin{code} +glueBal :: (Ord key OUTPUTABLE_key) + => FiniteMap key elt -> FiniteMap key elt + -> FiniteMap key elt + +glueBal EmptyFM fm2 = fm2 +glueBal fm1 EmptyFM = fm1 +glueBal fm1 fm2 + -- The case analysis here (absent in Adams' program) is really to deal + -- with the case where fm2 is a singleton. Then deleting the minimum means + -- we pass an empty tree to mkBalBranch, which breaks its invariant. + | sizeFM fm2 > sizeFM fm1 + = mkBalBranch mid_key2 mid_elt2 fm1 (deleteMin fm2) + + | otherwise + = mkBalBranch mid_key1 mid_elt1 (deleteMax fm1) fm2 + where + (mid_key1, mid_elt1) = findMax fm1 + (mid_key2, mid_elt2) = findMin fm2 +\end{code} + +@glueVBal@ copes with arguments which can be of any size. +But: all keys in first arg are $<$ all keys in second. 
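+Like @mkVBalBranch@, it recurses down the spine of the larger argument
+until the two pieces are of comparable size, and then glues them with
+@glueBal@.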
+ +\begin{code} +glueVBal :: (Ord key OUTPUTABLE_key) + => FiniteMap key elt -> FiniteMap key elt + -> FiniteMap key elt + +glueVBal EmptyFM fm2 = fm2 +glueVBal fm1 EmptyFM = fm1 +glueVBal fm_l@(Branch key_l elt_l _ fm_ll fm_lr) + fm_r@(Branch key_r elt_r _ fm_rl fm_rr) + | sIZE_RATIO * size_l < size_r + = mkBalBranch key_r elt_r (glueVBal fm_l fm_rl) fm_rr + + | sIZE_RATIO * size_r < size_l + = mkBalBranch key_l elt_l fm_ll (glueVBal fm_lr fm_r) + + | otherwise -- We now need the same two cases as in glueBal above. + = glueBal fm_l fm_r + where + size_l = sizeFM fm_l + size_r = sizeFM fm_r +\end{code} + +%************************************************************************ +%* * +\subsection{Local utilities} +%* * +%************************************************************************ + +\begin{code} +splitLT, splitGT :: (Ord key OUTPUTABLE_key) => FiniteMap key elt -> key -> FiniteMap key elt + +-- splitLT fm split_key = fm restricted to keys < split_key +-- splitGT fm split_key = fm restricted to keys > split_key + +splitLT EmptyFM split_key = emptyFM +splitLT (Branch key elt _ fm_l fm_r) split_key + = case compare split_key key of + LT -> splitLT fm_l split_key + GT -> mkVBalBranch key elt fm_l (splitLT fm_r split_key) + EQ -> fm_l + +splitGT EmptyFM split_key = emptyFM +splitGT (Branch key elt _ fm_l fm_r) split_key + = case compare split_key key of + GT -> splitGT fm_r split_key + LT -> mkVBalBranch key elt (splitGT fm_l split_key) fm_r + EQ -> fm_r + +findMin :: FiniteMap key elt -> (key,elt) +findMin (Branch key elt _ EmptyFM _) = (key,elt) +findMin (Branch key elt _ fm_l _) = findMin fm_l + +deleteMin :: (Ord key OUTPUTABLE_key) => FiniteMap key elt -> FiniteMap key elt +deleteMin (Branch key elt _ EmptyFM fm_r) = fm_r +deleteMin (Branch key elt _ fm_l fm_r) = mkBalBranch key elt (deleteMin fm_l) fm_r + +findMax :: FiniteMap key elt -> (key,elt) +findMax (Branch key elt _ _ EmptyFM) = (key,elt) +findMax (Branch key elt _ _ fm_r) = findMax fm_r + +deleteMax :: (Ord key OUTPUTABLE_key) => FiniteMap key elt -> FiniteMap key elt +deleteMax (Branch key elt _ fm_l EmptyFM) = fm_l +deleteMax (Branch key elt _ fm_l fm_r) = mkBalBranch key elt fm_l (deleteMax fm_r) +\end{code} + +%************************************************************************ +%* * +\subsection{Output-ery} +%* * +%************************************************************************ + +\begin{code} +#if defined(DEBUG_FINITEMAPS) + +instance (Outputable key) => Outputable (FiniteMap key elt) where + ppr fm = pprX fm + +pprX EmptyFM = char '!' +pprX (Branch key elt sz fm_l fm_r) + = parens (hcat [pprX fm_l, space, + ppr key, space, int (IF_GHC(I# sz, sz)), space, + pprX fm_r]) +#else +-- and when not debugging the package itself... 
+instance (Outputable key, Outputable elt) => Outputable (FiniteMap key elt) where + ppr fm = ppr (fmToList fm) +#endif + +#if 0 +instance (Eq key, Eq elt) => Eq (FiniteMap key elt) where + fm_1 == fm_2 = (sizeFM fm_1 == sizeFM fm_2) && -- quick test + (fmToList fm_1 == fmToList fm_2) + +{- NO: not clear what The Right Thing to do is: +instance (Ord key, Ord elt) => Ord (FiniteMap key elt) where + fm_1 <= fm_2 = (sizeFM fm_1 <= sizeFM fm_2) && -- quick test + (fmToList fm_1 <= fmToList fm_2) +-} +#endif +\end{code} + +%************************************************************************ +%* * +\subsection{Efficiency pragmas for GHC} +%* * +%************************************************************************ + +When the FiniteMap module is used in GHC, we specialise it for +\tr{Uniques}, for dastardly efficiency reasons. + +\begin{code} +#if 0 + +#if __GLASGOW_HASKELL__ + +{-# SPECIALIZE addListToFM + :: FiniteMap (FastString, FAST_STRING) elt -> [((FAST_STRING, FAST_STRING),elt)] -> FiniteMap (FAST_STRING, FAST_STRING) elt + , FiniteMap RdrName elt -> [(RdrName,elt)] -> FiniteMap RdrName elt + IF_NCG(COMMA FiniteMap Reg elt -> [(Reg COMMA elt)] -> FiniteMap Reg elt) + #-} +{-# SPECIALIZE addListToFM_C + :: (elt -> elt -> elt) -> FiniteMap TyCon elt -> [(TyCon,elt)] -> FiniteMap TyCon elt + , (elt -> elt -> elt) -> FiniteMap FastString elt -> [(FAST_STRING,elt)] -> FiniteMap FAST_STRING elt + IF_NCG(COMMA (elt -> elt -> elt) -> FiniteMap Reg elt -> [(Reg COMMA elt)] -> FiniteMap Reg elt) + #-} +{-# SPECIALIZE addToFM + :: FiniteMap CLabel elt -> CLabel -> elt -> FiniteMap CLabel elt + , FiniteMap FastString elt -> FAST_STRING -> elt -> FiniteMap FAST_STRING elt + , FiniteMap (FastString, FAST_STRING) elt -> (FAST_STRING, FAST_STRING) -> elt -> FiniteMap (FAST_STRING, FAST_STRING) elt + , FiniteMap RdrName elt -> RdrName -> elt -> FiniteMap RdrName elt + IF_NCG(COMMA FiniteMap Reg elt -> Reg -> elt -> FiniteMap Reg elt) + #-} +{-# SPECIALIZE addToFM_C + :: (elt -> elt -> elt) -> FiniteMap (RdrName, RdrName) elt -> (RdrName, RdrName) -> elt -> FiniteMap (RdrName, RdrName) elt + , (elt -> elt -> elt) -> FiniteMap FastString elt -> FAST_STRING -> elt -> FiniteMap FAST_STRING elt + IF_NCG(COMMA (elt -> elt -> elt) -> FiniteMap Reg elt -> Reg -> elt -> FiniteMap Reg elt) + #-} +{-# SPECIALIZE bagToFM + :: Bag (FastString,elt) -> FiniteMap FAST_STRING elt + #-} +{-# SPECIALIZE delListFromFM + :: FiniteMap RdrName elt -> [RdrName] -> FiniteMap RdrName elt + , FiniteMap FastString elt -> [FAST_STRING] -> FiniteMap FAST_STRING elt + IF_NCG(COMMA FiniteMap Reg elt -> [Reg] -> FiniteMap Reg elt) + #-} +{-# SPECIALIZE listToFM + :: [([Char],elt)] -> FiniteMap [Char] elt + , [(FastString,elt)] -> FiniteMap FAST_STRING elt + , [((FastString,FAST_STRING),elt)] -> FiniteMap (FAST_STRING, FAST_STRING) elt + IF_NCG(COMMA [(Reg COMMA elt)] -> FiniteMap Reg elt) + #-} +{-# SPECIALIZE lookupFM + :: FiniteMap CLabel elt -> CLabel -> Maybe elt + , FiniteMap [Char] elt -> [Char] -> Maybe elt + , FiniteMap FastString elt -> FAST_STRING -> Maybe elt + , FiniteMap (FastString,FAST_STRING) elt -> (FAST_STRING,FAST_STRING) -> Maybe elt + , FiniteMap RdrName elt -> RdrName -> Maybe elt + , FiniteMap (RdrName,RdrName) elt -> (RdrName,RdrName) -> Maybe elt + IF_NCG(COMMA FiniteMap Reg elt -> Reg -> Maybe elt) + #-} +{-# SPECIALIZE lookupWithDefaultFM + :: FiniteMap FastString elt -> elt -> FAST_STRING -> elt + IF_NCG(COMMA FiniteMap Reg elt -> elt -> Reg -> elt) + #-} +{-# SPECIALIZE plusFM + :: FiniteMap RdrName 
elt -> FiniteMap RdrName elt -> FiniteMap RdrName elt + , FiniteMap FastString elt -> FiniteMap FAST_STRING elt -> FiniteMap FAST_STRING elt + IF_NCG(COMMA FiniteMap Reg elt -> FiniteMap Reg elt -> FiniteMap Reg elt) + #-} +{-# SPECIALIZE plusFM_C + :: (elt -> elt -> elt) -> FiniteMap FastString elt -> FiniteMap FAST_STRING elt -> FiniteMap FAST_STRING elt + IF_NCG(COMMA (elt -> elt -> elt) -> FiniteMap Reg elt -> FiniteMap Reg elt -> FiniteMap Reg elt) + #-} + +#endif /* compiling with ghc and have specialiser */ + +#endif /* 0 */ +\end{code} diff --git a/compiler/utils/IOEnv.hs b/compiler/utils/IOEnv.hs new file mode 100644 index 0000000000..e1dfdb400b --- /dev/null +++ b/compiler/utils/IOEnv.hs @@ -0,0 +1,208 @@ +-- (c) The University of Glasgow 2002 +-- +-- The IO Monad with an environment +-- + +module IOEnv ( + IOEnv, -- Instance of Monad + + -- Standard combinators, specialised + returnM, thenM, thenM_, failM, failWithM, + mappM, mappM_, mapSndM, sequenceM, sequenceM_, + foldlM, foldrM, + mapAndUnzipM, mapAndUnzip3M, + checkM, ifM, zipWithM, zipWithM_, + + -- Getting at the environment + getEnv, setEnv, updEnv, + + runIOEnv, unsafeInterleaveM, + tryM, tryAllM, fixM, + + -- I/O operations + ioToIOEnv, + IORef, newMutVar, readMutVar, writeMutVar, updMutVar + ) where +#include "HsVersions.h" + +import Panic ( try, tryUser, Exception(..) ) +import DATA_IOREF ( IORef, newIORef, readIORef, writeIORef ) +import UNSAFE_IO ( unsafeInterleaveIO ) +import FIX_IO ( fixIO ) + + +---------------------------------------------------------------------- +-- Defining the monad type +---------------------------------------------------------------------- + + +newtype IOEnv env a = IOEnv (env -> IO a) +unIOEnv (IOEnv m) = m + +instance Monad (IOEnv m) where + (>>=) = thenM + (>>) = thenM_ + return = returnM + fail s = failM -- Ignore the string + +returnM :: a -> IOEnv env a +returnM a = IOEnv (\ env -> return a) + +thenM :: IOEnv env a -> (a -> IOEnv env b) -> IOEnv env b +thenM (IOEnv m) f = IOEnv (\ env -> do { r <- m env ; + unIOEnv (f r) env }) + +thenM_ :: IOEnv env a -> IOEnv env b -> IOEnv env b +thenM_ (IOEnv m) f = IOEnv (\ env -> do { m env ; unIOEnv f env }) + +failM :: IOEnv env a +failM = IOEnv (\ env -> ioError (userError "IOEnv failure")) + +failWithM :: String -> IOEnv env a +failWithM s = IOEnv (\ env -> ioError (userError s)) + + + +---------------------------------------------------------------------- +-- Fundmantal combinators specific to the monad +---------------------------------------------------------------------- + + +--------------------------- +runIOEnv :: env -> IOEnv env a -> IO a +runIOEnv env (IOEnv m) = m env + + +--------------------------- +{-# NOINLINE fixM #-} + -- Aargh! Not inlining fixTc alleviates a space leak problem. + -- Normally fixTc is used with a lazy tuple match: if the optimiser is + -- shown the definition of fixTc, it occasionally transforms the code + -- in such a way that the code generator doesn't spot the selector + -- thunks. Sigh. + +fixM :: (a -> IOEnv env a) -> IOEnv env a +fixM f = IOEnv (\ env -> fixIO (\ r -> unIOEnv (f r) env)) + + +--------------------------- +tryM :: IOEnv env r -> IOEnv env (Either Exception r) +-- Reflect UserError exceptions into IOEnv monad +-- The idea is that errors in the program being compiled will give rise +-- to UserErrors. But, say, pattern-match failures in GHC itself should +-- not be caught here, else they'll be reported as errors in the program +-- begin compiled! 
+tryM (IOEnv thing) = IOEnv (\ env -> tryUser (thing env)) + +tryAllM :: IOEnv env r -> IOEnv env (Either Exception r) +-- Catch *all* exceptions +-- This is used when running a Template-Haskell splice, when +-- even a pattern-match failure is a programmer error +tryAllM (IOEnv thing) = IOEnv (\ env -> try (thing env)) + +--------------------------- +unsafeInterleaveM :: IOEnv env a -> IOEnv env a +unsafeInterleaveM (IOEnv m) = IOEnv (\ env -> unsafeInterleaveIO (m env)) + + +---------------------------------------------------------------------- +-- Accessing input/output +---------------------------------------------------------------------- + +ioToIOEnv :: IO a -> IOEnv env a +ioToIOEnv io = IOEnv (\ env -> io) + +newMutVar :: a -> IOEnv env (IORef a) +newMutVar val = IOEnv (\ env -> newIORef val) + +writeMutVar :: IORef a -> a -> IOEnv env () +writeMutVar var val = IOEnv (\ env -> writeIORef var val) + +readMutVar :: IORef a -> IOEnv env a +readMutVar var = IOEnv (\ env -> readIORef var) + +updMutVar :: IORef a -> (a->a) -> IOEnv env () +updMutVar var upd_fn = IOEnv (\ env -> do { v <- readIORef var; writeIORef var (upd_fn v) }) + + +---------------------------------------------------------------------- +-- Accessing the environment +---------------------------------------------------------------------- + +getEnv :: IOEnv env env +{-# INLINE getEnv #-} +getEnv = IOEnv (\ env -> return env) + +setEnv :: env' -> IOEnv env' a -> IOEnv env a +{-# INLINE setEnv #-} +setEnv new_env (IOEnv m) = IOEnv (\ env -> m new_env) + +updEnv :: (env -> env') -> IOEnv env' a -> IOEnv env a +{-# INLINE updEnv #-} +updEnv upd (IOEnv m) = IOEnv (\ env -> m (upd env)) + + +---------------------------------------------------------------------- +-- Standard combinators, but specialised for this monad +-- (for efficiency) +---------------------------------------------------------------------- + +mappM :: (a -> IOEnv env b) -> [a] -> IOEnv env [b] +mappM_ :: (a -> IOEnv env b) -> [a] -> IOEnv env () +mapSndM :: (b -> IOEnv env c) -> [(a,b)] -> IOEnv env [(a,c)] + -- Funny names to avoid clash with Prelude +sequenceM :: [IOEnv env a] -> IOEnv env [a] +sequenceM_ :: [IOEnv env a] -> IOEnv env () +foldlM :: (a -> b -> IOEnv env a) -> a -> [b] -> IOEnv env a +foldrM :: (b -> a -> IOEnv env a) -> a -> [b] -> IOEnv env a +mapAndUnzipM :: (a -> IOEnv env (b,c)) -> [a] -> IOEnv env ([b],[c]) +mapAndUnzip3M :: (a -> IOEnv env (b,c,d)) -> [a] -> IOEnv env ([b],[c],[d]) +checkM :: Bool -> IOEnv env a -> IOEnv env () -- Perform arg if bool is False +ifM :: Bool -> IOEnv env a -> IOEnv env () -- Perform arg if bool is True + +mappM f [] = return [] +mappM f (x:xs) = do { r <- f x; rs <- mappM f xs; return (r:rs) } + +mapSndM f [] = return [] +mapSndM f ((a,b):xs) = do { c <- f b; rs <- mapSndM f xs; return ((a,c):rs) } + +mappM_ f [] = return () +mappM_ f (x:xs) = f x >> mappM_ f xs + +zipWithM :: (a -> b -> IOEnv env c) -> [a] -> [b] -> IOEnv env [c] +zipWithM f [] bs = return [] +zipWithM f as [] = return [] +zipWithM f (a:as) (b:bs) = do { r <- f a b; rs <- zipWithM f as bs; return (r:rs) } + +zipWithM_ :: (a -> b -> IOEnv env c) -> [a] -> [b] -> IOEnv env () +zipWithM_ f [] bs = return () +zipWithM_ f as [] = return () +zipWithM_ f (a:as) (b:bs) = do { f a b; zipWithM_ f as bs } + +sequenceM [] = return [] +sequenceM (x:xs) = do { r <- x; rs <- sequenceM xs; return (r:rs) } + +sequenceM_ [] = return () +sequenceM_ (x:xs) = do { x; sequenceM_ xs } + +foldlM k z [] = return z +foldlM k z (x:xs) = do { r <- k z x; foldlM k r 
xs } + +foldrM k z [] = return z +foldrM k z (x:xs) = do { r <- foldrM k z xs; k x r } + +mapAndUnzipM f [] = return ([],[]) +mapAndUnzipM f (x:xs) = do { (r,s) <- f x; + (rs,ss) <- mapAndUnzipM f xs; + return (r:rs, s:ss) } + +mapAndUnzip3M f [] = return ([],[], []) +mapAndUnzip3M f (x:xs) = do { (r,s,t) <- f x; + (rs,ss,ts) <- mapAndUnzip3M f xs; + return (r:rs, s:ss, t:ts) } + +checkM True err = return () +checkM False err = do { err; return () } + +ifM True do_it = do { do_it; return () } +ifM False do_it = return () diff --git a/compiler/utils/ListSetOps.lhs b/compiler/utils/ListSetOps.lhs new file mode 100644 index 0000000000..02950722a2 --- /dev/null +++ b/compiler/utils/ListSetOps.lhs @@ -0,0 +1,227 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[ListSetOps]{Set-like operations on lists} + +\begin{code} +module ListSetOps ( + unionLists, minusList, insertList, + + -- Association lists + Assoc, assoc, assocMaybe, assocUsing, assocDefault, assocDefaultUsing, + emptyAssoc, unitAssoc, mapAssoc, plusAssoc_C, extendAssoc_C, + mkLookupFun, findInList, assocElts, + + -- Duplicate handling + hasNoDups, runs, removeDups, findDupsEq, + equivClasses, equivClassesByUniq + + ) where + +#include "HsVersions.h" + +import Outputable +import Unique ( Unique ) +import UniqFM ( eltsUFM, emptyUFM, addToUFM_C ) +import Util ( isn'tIn, isIn, mapAccumR, sortLe ) +import List ( partition ) +\end{code} + + +%************************************************************************ +%* * + Treating lists as sets + Assumes the lists contain no duplicates, but are unordered +%* * +%************************************************************************ + +\begin{code} +insertList :: Eq a => a -> [a] -> [a] +-- Assumes the arg list contains no dups; guarantees the result has no dups +insertList x xs | isIn "insert" x xs = xs + | otherwise = x : xs + +unionLists :: (Eq a) => [a] -> [a] -> [a] +-- Assumes that the arguments contain no duplicates +unionLists xs ys = [x | x <- xs, isn'tIn "unionLists" x ys] ++ ys + +minusList :: (Eq a) => [a] -> [a] -> [a] +-- Everything in the first list that is not in the second list: +minusList xs ys = [ x | x <- xs, isn'tIn "minusList" x ys] +\end{code} + + +%************************************************************************ +%* * +\subsection[Utils-assoc]{Association lists} +%* * +%************************************************************************ + +Inefficient finite maps based on association lists and equality. 
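+
+For illustration only, here is that idea in miniature (the name
+lookupEq below is invented for this example and is not used elsewhere;
+assocMaybe, defined below, plays this role for real):
+
+\begin{code}
+-- Illustrative sketch, not part of the module's interface:
+-- an equality-based finite map is just a list of pairs,
+-- searched linearly from the front.
+lookupEq :: Eq k => k -> [(k, v)] -> Maybe v
+lookupEq _ []               = Nothing
+lookupEq k ((k', v) : rest)
+  | k == k'                 = Just v
+  | otherwise               = lookupEq k rest
+
+-- lookupEq 2 [(1,"one"), (2,"two")]  ==  Just "two"
+-- lookupEq 3 [(1,"one"), (2,"two")]  ==  Nothing
+\end{code}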
+ +\begin{code} +type Assoc a b = [(a,b)] -- A finite mapping based on equality and association lists + +emptyAssoc :: Assoc a b +unitAssoc :: a -> b -> Assoc a b +assocElts :: Assoc a b -> [(a,b)] +assoc :: (Eq a) => String -> Assoc a b -> a -> b +assocDefault :: (Eq a) => b -> Assoc a b -> a -> b +assocUsing :: (a -> a -> Bool) -> String -> Assoc a b -> a -> b +assocMaybe :: (Eq a) => Assoc a b -> a -> Maybe b +assocDefaultUsing :: (a -> a -> Bool) -> b -> Assoc a b -> a -> b +mapAssoc :: (b -> c) -> Assoc a b -> Assoc a c +extendAssoc_C :: (Eq a) => (b -> b -> b) -> Assoc a b -> (a,b) -> Assoc a b +plusAssoc_C :: (Eq a) => (b -> b -> b) -> Assoc a b -> Assoc a b -> Assoc a b + -- combining fn takes (old->new->result) + +emptyAssoc = [] +unitAssoc a b = [(a,b)] +assocElts xs = xs + +assocDefaultUsing eq deflt ((k,v) : rest) key + | k `eq` key = v + | otherwise = assocDefaultUsing eq deflt rest key + +assocDefaultUsing eq deflt [] key = deflt + +assoc crash_msg list key = assocDefaultUsing (==) (panic ("Failed in assoc: " ++ crash_msg)) list key +assocDefault deflt list key = assocDefaultUsing (==) deflt list key +assocUsing eq crash_msg list key = assocDefaultUsing eq (panic ("Failed in assoc: " ++ crash_msg)) list key + +assocMaybe alist key + = lookup alist + where + lookup [] = Nothing + lookup ((tv,ty):rest) = if key == tv then Just ty else lookup rest + +mapAssoc f alist = [(key, f val) | (key,val) <- alist] + +plusAssoc_C combine [] new = new -- Shortcut for common case +plusAssoc_C combine old new = foldl (extendAssoc_C combine) old new + +extendAssoc_C combine old_list (new_key, new_val) + = go old_list + where + go [] = [(new_key, new_val)] + go ((old_key, old_val) : old_list) + | new_key == old_key = ((old_key, old_val `combine` new_val) : old_list) + | otherwise = (old_key, old_val) : go old_list +\end{code} + + +@mkLookupFun eq alist@ is a function which looks up +its argument in the association list @alist@, returning a Maybe type. +@mkLookupFunDef@ is similar except that it is given a value to return +on failure. + +\begin{code} +mkLookupFun :: (key -> key -> Bool) -- Equality predicate + -> [(key,val)] -- The assoc list + -> key -- The key + -> Maybe val -- The corresponding value + +mkLookupFun eq alist s + = case [a | (s',a) <- alist, s' `eq` s] of + [] -> Nothing + (a:_) -> Just a + +findInList :: (a -> Bool) -> [a] -> Maybe a +findInList p [] = Nothing +findInList p (x:xs) | p x = Just x + | otherwise = findInList p xs +\end{code} + + +%************************************************************************ +%* * +\subsection[Utils-dups]{Duplicate-handling} +%* * +%************************************************************************ + +\begin{code} +hasNoDups :: (Eq a) => [a] -> Bool + +hasNoDups xs = f [] xs + where + f seen_so_far [] = True + f seen_so_far (x:xs) = if x `is_elem` seen_so_far then + False + else + f (x:seen_so_far) xs + + is_elem = isIn "hasNoDups" +\end{code} + +\begin{code} +equivClasses :: (a -> a -> Ordering) -- Comparison + -> [a] + -> [[a]] + +equivClasses cmp stuff@[] = [] +equivClasses cmp stuff@[item] = [stuff] +equivClasses cmp items + = runs eq (sortLe le items) + where + eq a b = case cmp a b of { EQ -> True; _ -> False } + le a b = case cmp a b of { LT -> True; EQ -> True; GT -> False } +\end{code} + +The first cases in @equivClasses@ above are just to cut to the point +more quickly... + +@runs@ groups a list into a list of lists, each sublist being a run of +identical elements of the input list. 
It is passed a predicate @p@ which +tells when two elements are equal. + +\begin{code} +runs :: (a -> a -> Bool) -- Equality + -> [a] + -> [[a]] + +runs p [] = [] +runs p (x:xs) = case (span (p x) xs) of + (first, rest) -> (x:first) : (runs p rest) +\end{code} + +\begin{code} +removeDups :: (a -> a -> Ordering) -- Comparison function + -> [a] + -> ([a], -- List with no duplicates + [[a]]) -- List of duplicate groups. One representative from + -- each group appears in the first result + +removeDups cmp [] = ([], []) +removeDups cmp [x] = ([x],[]) +removeDups cmp xs + = case (mapAccumR collect_dups [] (equivClasses cmp xs)) of { (dups, xs') -> + (xs', dups) } + where + collect_dups dups_so_far [x] = (dups_so_far, x) + collect_dups dups_so_far dups@(x:xs) = (dups:dups_so_far, x) + +findDupsEq :: (a->a->Bool) -> [a] -> [[a]] +findDupsEq eq [] = [] +findDupsEq eq (x:xs) | null eq_xs = findDupsEq eq xs + | otherwise = (x:eq_xs) : findDupsEq eq neq_xs + where + (eq_xs, neq_xs) = partition (eq x) xs +\end{code} + + +\begin{code} +equivClassesByUniq :: (a -> Unique) -> [a] -> [[a]] + -- NB: it's *very* important that if we have the input list [a,b,c], + -- where a,b,c all have the same unique, then we get back the list + -- [a,b,c] + -- not + -- [c,b,a] + -- Hence the use of foldr, plus the reversed-args tack_on below +equivClassesByUniq get_uniq xs + = eltsUFM (foldr add emptyUFM xs) + where + add a ufm = addToUFM_C tack_on ufm (get_uniq a) [a] + tack_on old new = new++old +\end{code} + + diff --git a/compiler/utils/Maybes.lhs b/compiler/utils/Maybes.lhs new file mode 100644 index 0000000000..3c9bd693e6 --- /dev/null +++ b/compiler/utils/Maybes.lhs @@ -0,0 +1,123 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 +% +\section[Maybes]{The `Maybe' types and associated utility functions} + +\begin{code} +module Maybes ( + module Maybe, -- Re-export all of Maybe + + MaybeErr(..), -- Instance of Monad + failME, + + orElse, + mapCatMaybes, + allMaybes, + firstJust, + expectJust, + maybeToBool, + + thenMaybe, seqMaybe, returnMaybe, failMaybe + ) where + +#include "HsVersions.h" + +import Maybe + + +infixr 4 `orElse` +\end{code} + +%************************************************************************ +%* * +\subsection[Maybe type]{The @Maybe@ type} +%* * +%************************************************************************ + +\begin{code} +maybeToBool :: Maybe a -> Bool +maybeToBool Nothing = False +maybeToBool (Just x) = True +\end{code} + +@catMaybes@ takes a list of @Maybe@s and returns a list of +the contents of all the @Just@s in it. @allMaybes@ collects +a list of @Justs@ into a single @Just@, returning @Nothing@ if there +are any @Nothings@. + +\begin{code} +allMaybes :: [Maybe a] -> Maybe [a] +allMaybes [] = Just [] +allMaybes (Nothing : ms) = Nothing +allMaybes (Just x : ms) = case (allMaybes ms) of + Nothing -> Nothing + Just xs -> Just (x:xs) + +\end{code} + +@firstJust@ takes a list of @Maybes@ and returns the +first @Just@ if there is one, or @Nothing@ otherwise. 
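+
+For illustration only (firstJustSketch is invented for this example and
+not used elsewhere), the same behaviour can be written with catMaybes
+and listToMaybe from the Maybe library:
+
+\begin{code}
+-- Illustrative sketch, equivalent to firstJust below:
+-- keep the contents of the Justs and take the first one, if any.
+firstJustSketch :: [Maybe a] -> Maybe a
+firstJustSketch = listToMaybe . catMaybes
+
+-- firstJustSketch [Nothing, Just 'x', Just 'y']  ==  Just 'x'
+-- firstJustSketch [Nothing, Nothing]             ==  Nothing
+\end{code}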
+ +\begin{code} +firstJust :: [Maybe a] -> Maybe a +firstJust [] = Nothing +firstJust (Just x : ms) = Just x +firstJust (Nothing : ms) = firstJust ms +\end{code} + +\begin{code} +expectJust :: String -> Maybe a -> a +{-# INLINE expectJust #-} +expectJust err (Just x) = x +expectJust err Nothing = error ("expectJust " ++ err) +\end{code} + +\begin{code} +mapCatMaybes :: (a -> Maybe b) -> [a] -> [b] +mapCatMaybes f [] = [] +mapCatMaybes f (x:xs) = case f x of + Just y -> y : mapCatMaybes f xs + Nothing -> mapCatMaybes f xs +\end{code} + +The Maybe monad +~~~~~~~~~~~~~~~ +\begin{code} +seqMaybe :: Maybe a -> Maybe a -> Maybe a +seqMaybe (Just x) _ = Just x +seqMaybe Nothing my = my + +thenMaybe :: Maybe a -> (a -> Maybe b) -> Maybe b +thenMaybe ma mb = case ma of + Just x -> mb x + Nothing -> Nothing + +returnMaybe :: a -> Maybe a +returnMaybe = Just + +failMaybe :: Maybe a +failMaybe = Nothing + +orElse :: Maybe a -> a -> a +(Just x) `orElse` y = x +Nothing `orElse` y = y +\end{code} + + +%************************************************************************ +%* * +\subsection[MaybeErr type]{The @MaybeErr@ type} +%* * +%************************************************************************ + +\begin{code} +data MaybeErr err val = Succeeded val | Failed err + +instance Monad (MaybeErr err) where + return v = Succeeded v + Succeeded v >>= k = k v + Failed e >>= k = Failed e + +failME :: err -> MaybeErr err val +failME e = Failed e +\end{code} diff --git a/compiler/utils/OrdList.lhs b/compiler/utils/OrdList.lhs new file mode 100644 index 0000000000..7f22b38e49 --- /dev/null +++ b/compiler/utils/OrdList.lhs @@ -0,0 +1,83 @@ +% +% (c) The AQUA Project, Glasgow University, 1993-1998 +% + +This is useful, general stuff for the Native Code Generator. + +Provide trees (of instructions), so that lists of instructions +can be appended in linear time. 
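+
+For illustration only, the essential trick in miniature (AppendTree,
+appendT and flattenT are invented for this example; the module's real
+API follows below):
+
+\begin{code}
+-- Illustrative sketch: each append allocates a single node in O(1),
+-- and the whole structure is flattened once at the end, using an
+-- accumulating parameter so no (++) chain is left-nested.
+data AppendTree a = Leaf [a] | Branch (AppendTree a) (AppendTree a)
+
+appendT :: AppendTree a -> AppendTree a -> AppendTree a
+appendT = Branch
+
+flattenT :: AppendTree a -> [a]
+flattenT t = go t []
+  where
+    go (Leaf xs)    rest = xs ++ rest
+    go (Branch l r) rest = go l (go r rest)
+\end{code}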
+ +\begin{code} +module OrdList ( + OrdList, + nilOL, isNilOL, unitOL, appOL, consOL, snocOL, concatOL, + fromOL, toOL, foldrOL, foldlOL +) where + +infixl 5 `appOL` +infixl 5 `snocOL` +infixr 5 `consOL` + +data OrdList a + = Many [a] + | Two (OrdList a) (OrdList a) + | One a + | None + +nilOL :: OrdList a +isNilOL :: OrdList a -> Bool + +unitOL :: a -> OrdList a +snocOL :: OrdList a -> a -> OrdList a +consOL :: a -> OrdList a -> OrdList a +appOL :: OrdList a -> OrdList a -> OrdList a +concatOL :: [OrdList a] -> OrdList a + +nilOL = None +unitOL as = One as +snocOL as b = Two as (One b) +consOL a bs = Two (One a) bs +concatOL aas = foldr Two None aas + +isNilOL None = True +isNilOL (One _) = False +isNilOL (Two as bs) = isNilOL as && isNilOL bs +isNilOL (Many xs) = null xs + +appOL None bs = bs +appOL as None = as +appOL as bs = Two as bs + +mapOL :: (a -> b) -> OrdList a -> OrdList b +mapOL f None = None +mapOL f (One x) = One (f x) +mapOL f (Two x y) = Two (mapOL f x) (mapOL f y) +mapOL f (Many xs) = Many (map f xs) + +instance Functor OrdList where + fmap = mapOL + +foldrOL :: (a->b->b) -> b -> OrdList a -> b +foldrOL k z None = z +foldrOL k z (One x) = k x z +foldrOL k z (Two b1 b2) = foldrOL k (foldrOL k z b2) b1 +foldrOL k z (Many xs) = foldr k z xs + +foldlOL :: (b->a->b) -> b -> OrdList a -> b +foldlOL k z None = z +foldlOL k z (One x) = k z x +foldlOL k z (Two b1 b2) = foldlOL k (foldlOL k z b1) b2 +foldlOL k z (Many xs) = foldl k z xs + +fromOL :: OrdList a -> [a] +fromOL ol + = flat ol [] + where + flat None rest = rest + flat (One x) rest = x:rest + flat (Two a b) rest = flat a (flat b rest) + flat (Many xs) rest = xs ++ rest + +toOL :: [a] -> OrdList a +toOL xs = Many xs +\end{code} diff --git a/compiler/utils/Outputable.lhs b/compiler/utils/Outputable.lhs new file mode 100644 index 0000000000..cf99e12bcf --- /dev/null +++ b/compiler/utils/Outputable.lhs @@ -0,0 +1,540 @@ +% +% (c) The GRASP Project, Glasgow University, 1992-1998 +% +\section[Outputable]{Classes for pretty-printing} + +Defines classes for pretty-printing and forcing, both forms of +``output.'' + +\begin{code} + +module Outputable ( + Outputable(..), OutputableBndr(..), -- Class + + BindingSite(..), + + PprStyle, CodeStyle(..), PrintUnqualified, alwaysQualify, + getPprStyle, withPprStyle, withPprStyleDoc, pprDeeper, pprSetDepth, + codeStyle, userStyle, debugStyle, dumpStyle, asmStyle, + ifPprDebug, unqualStyle, + mkErrStyle, defaultErrStyle, defaultDumpStyle, defaultUserStyle, + + SDoc, -- Abstract + docToSDoc, + interppSP, interpp'SP, pprQuotedList, pprWithCommas, + empty, nest, + text, char, ftext, ptext, + int, integer, float, double, rational, + parens, brackets, braces, quotes, doubleQuotes, angleBrackets, + semi, comma, colon, dcolon, space, equals, dot, arrow, + lparen, rparen, lbrack, rbrack, lbrace, rbrace, underscore, + (<>), (<+>), hcat, hsep, + ($$), ($+$), vcat, + sep, cat, + fsep, fcat, + hang, punctuate, + speakNth, speakNTimes, speakN, speakNOf, plural, + + printSDoc, printErrs, printDump, + printForC, printForAsm, printForUser, + pprCode, mkCodeStyle, + showSDoc, showSDocForUser, showSDocDebug, showSDocDump, + showSDocUnqual, showsPrecSDoc, + pprHsChar, pprHsString, + + -- error handling + pprPanic, assertPprPanic, pprPanic#, pprPgmError, + pprTrace, warnPprTrace, + trace, pgmError, panic, panic#, assertPanic + ) where + +#include "HsVersions.h" + + +import {-# SOURCE #-} Module( Module ) +import {-# SOURCE #-} OccName( OccName ) + +import StaticFlags ( opt_PprStyle_Debug, opt_PprUserLength ) 
+import PackageConfig ( PackageId, packageIdString ) +import FastString +import qualified Pretty +import Pretty ( Doc, Mode(..) ) +import Panic + +import DATA_WORD ( Word32 ) + +import IO ( Handle, stderr, stdout, hFlush ) +import Char ( ord ) +\end{code} + + +%************************************************************************ +%* * +\subsection{The @PprStyle@ data type} +%* * +%************************************************************************ + +\begin{code} +data PprStyle + = PprUser PrintUnqualified Depth + -- Pretty-print in a way that will make sense to the + -- ordinary user; must be very close to Haskell + -- syntax, etc. + -- Assumes printing tidied code: non-system names are + -- printed without uniques. + + | PprCode CodeStyle + -- Print code; either C or assembler + + | PprDump -- For -ddump-foo; less verbose than PprDebug. + -- Does not assume tidied code: non-external names + -- are printed with uniques. + + | PprDebug -- Full debugging output + +data CodeStyle = CStyle -- The format of labels differs for C and assembler + | AsmStyle + +data Depth = AllTheWay + | PartWay Int -- 0 => stop + + +type PrintUnqualified = Module -> OccName -> Bool + -- This function tells when it's ok to print + -- a (Global) name unqualified + +alwaysQualify,neverQualify :: PrintUnqualified +alwaysQualify m n = False +neverQualify m n = True + +defaultUserStyle = mkUserStyle alwaysQualify AllTheWay + +defaultDumpStyle | opt_PprStyle_Debug = PprDebug + | otherwise = PprDump + +mkErrStyle :: PrintUnqualified -> PprStyle +-- Style for printing error messages +mkErrStyle print_unqual = mkUserStyle print_unqual (PartWay opt_PprUserLength) + +defaultErrStyle :: PprStyle +-- Default style for error messages +-- It's a bit of a hack because it doesn't take into account what's in scope +-- Only used for desugarer warnings, and typechecker errors in interface sigs +defaultErrStyle + | opt_PprStyle_Debug = mkUserStyle alwaysQualify AllTheWay + | otherwise = mkUserStyle alwaysQualify (PartWay opt_PprUserLength) + +mkUserStyle unqual depth | opt_PprStyle_Debug = PprDebug + | otherwise = PprUser unqual depth +\end{code} + +Orthogonal to the above printing styles are (possibly) some +command-line flags that affect printing (often carried with the +style). The most likely ones are variations on how much type info is +shown. + +The following test decides whether or not we are actually generating +code (either C or assembly), or generating interface files. + +%************************************************************************ +%* * +\subsection{The @SDoc@ data type} +%* * +%************************************************************************ + +\begin{code} +type SDoc = PprStyle -> Doc + +withPprStyle :: PprStyle -> SDoc -> SDoc +withPprStyle sty d sty' = d sty + +withPprStyleDoc :: PprStyle -> SDoc -> Doc +withPprStyleDoc sty d = d sty + +pprDeeper :: SDoc -> SDoc +pprDeeper d (PprUser unqual (PartWay 0)) = Pretty.text "..." 
+pprDeeper d (PprUser unqual (PartWay n)) = d (PprUser unqual (PartWay (n-1))) +pprDeeper d other_sty = d other_sty + +pprSetDepth :: Int -> SDoc -> SDoc +pprSetDepth n d (PprUser unqual _) = d (PprUser unqual (PartWay n)) +pprSetDepth n d other_sty = d other_sty + +getPprStyle :: (PprStyle -> SDoc) -> SDoc +getPprStyle df sty = df sty sty +\end{code} + +\begin{code} +unqualStyle :: PprStyle -> PrintUnqualified +unqualStyle (PprUser unqual _) m n = unqual m n +unqualStyle other m n = False + +codeStyle :: PprStyle -> Bool +codeStyle (PprCode _) = True +codeStyle _ = False + +asmStyle :: PprStyle -> Bool +asmStyle (PprCode AsmStyle) = True +asmStyle other = False + +dumpStyle :: PprStyle -> Bool +dumpStyle PprDump = True +dumpStyle other = False + +debugStyle :: PprStyle -> Bool +debugStyle PprDebug = True +debugStyle other = False + +userStyle :: PprStyle -> Bool +userStyle (PprUser _ _) = True +userStyle other = False + +ifPprDebug :: SDoc -> SDoc -- Empty for non-debug style +ifPprDebug d sty@PprDebug = d sty +ifPprDebug d sty = Pretty.empty +\end{code} + +\begin{code} +-- Unused [7/02 sof] +printSDoc :: SDoc -> PprStyle -> IO () +printSDoc d sty = do + Pretty.printDoc PageMode stdout (d sty) + hFlush stdout + +-- I'm not sure whether the direct-IO approach of Pretty.printDoc +-- above is better or worse than the put-big-string approach here +printErrs :: Doc -> IO () +printErrs doc = do Pretty.printDoc PageMode stderr doc + hFlush stderr + +printDump :: SDoc -> IO () +printDump doc = do + Pretty.printDoc PageMode stdout (better_doc defaultDumpStyle) + hFlush stdout + where + better_doc = doc $$ text "" + +printForUser :: Handle -> PrintUnqualified -> SDoc -> IO () +printForUser handle unqual doc + = Pretty.printDoc PageMode handle (doc (mkUserStyle unqual AllTheWay)) + +-- printForC, printForAsm do what they sound like +printForC :: Handle -> SDoc -> IO () +printForC handle doc = Pretty.printDoc LeftMode handle (doc (PprCode CStyle)) + +printForAsm :: Handle -> SDoc -> IO () +printForAsm handle doc = Pretty.printDoc LeftMode handle (doc (PprCode AsmStyle)) + +pprCode :: CodeStyle -> SDoc -> SDoc +pprCode cs d = withPprStyle (PprCode cs) d + +mkCodeStyle :: CodeStyle -> PprStyle +mkCodeStyle = PprCode + +-- Can't make SDoc an instance of Show because SDoc is just a function type +-- However, Doc *is* an instance of Show +-- showSDoc just blasts it out as a string +showSDoc :: SDoc -> String +showSDoc d = show (d defaultUserStyle) + +showSDocForUser :: PrintUnqualified -> SDoc -> String +showSDocForUser unqual doc = show (doc (mkUserStyle unqual AllTheWay)) + +showSDocUnqual :: SDoc -> String +-- Only used in the gruesome HsExpr.isOperator +showSDocUnqual d = show (d (mkUserStyle neverQualify AllTheWay)) + +showsPrecSDoc :: Int -> SDoc -> ShowS +showsPrecSDoc p d = showsPrec p (d defaultUserStyle) + +showSDocDump :: SDoc -> String +showSDocDump d = show (d PprDump) + +showSDocDebug :: SDoc -> String +showSDocDebug d = show (d PprDebug) +\end{code} + +\begin{code} +docToSDoc :: Doc -> SDoc +docToSDoc d = \_ -> d + +empty sty = Pretty.empty +text s sty = Pretty.text s +char c sty = Pretty.char c +ftext s sty = Pretty.ftext s +ptext s sty = Pretty.ptext s +int n sty = Pretty.int n +integer n sty = Pretty.integer n +float n sty = Pretty.float n +double n sty = Pretty.double n +rational n sty = Pretty.rational n + +parens d sty = Pretty.parens (d sty) +braces d sty = Pretty.braces (d sty) +brackets d sty = Pretty.brackets (d sty) +doubleQuotes d sty = Pretty.doubleQuotes (d sty) +angleBrackets 
d = char '<' <> d <> char '>' + +-- quotes encloses something in single quotes... +-- but it omits them if the thing ends in a single quote +-- so that we don't get `foo''. Instead we just have foo'. +quotes d sty = case show pp_d of + ('\'' : _) -> pp_d + other -> Pretty.quotes pp_d + where + pp_d = d sty + +semi sty = Pretty.semi +comma sty = Pretty.comma +colon sty = Pretty.colon +equals sty = Pretty.equals +space sty = Pretty.space +lparen sty = Pretty.lparen +rparen sty = Pretty.rparen +lbrack sty = Pretty.lbrack +rbrack sty = Pretty.rbrack +lbrace sty = Pretty.lbrace +rbrace sty = Pretty.rbrace +dcolon sty = Pretty.ptext SLIT("::") +arrow sty = Pretty.ptext SLIT("->") +underscore = char '_' +dot = char '.' + +nest n d sty = Pretty.nest n (d sty) +(<>) d1 d2 sty = (Pretty.<>) (d1 sty) (d2 sty) +(<+>) d1 d2 sty = (Pretty.<+>) (d1 sty) (d2 sty) +($$) d1 d2 sty = (Pretty.$$) (d1 sty) (d2 sty) +($+$) d1 d2 sty = (Pretty.$+$) (d1 sty) (d2 sty) + +hcat ds sty = Pretty.hcat [d sty | d <- ds] +hsep ds sty = Pretty.hsep [d sty | d <- ds] +vcat ds sty = Pretty.vcat [d sty | d <- ds] +sep ds sty = Pretty.sep [d sty | d <- ds] +cat ds sty = Pretty.cat [d sty | d <- ds] +fsep ds sty = Pretty.fsep [d sty | d <- ds] +fcat ds sty = Pretty.fcat [d sty | d <- ds] + +hang d1 n d2 sty = Pretty.hang (d1 sty) n (d2 sty) + +punctuate :: SDoc -> [SDoc] -> [SDoc] +punctuate p [] = [] +punctuate p (d:ds) = go d ds + where + go d [] = [d] + go d (e:es) = (d <> p) : go e es +\end{code} + + +%************************************************************************ +%* * +\subsection[Outputable-class]{The @Outputable@ class} +%* * +%************************************************************************ + +\begin{code} +class Outputable a where + ppr :: a -> SDoc +\end{code} + +\begin{code} +instance Outputable Bool where + ppr True = ptext SLIT("True") + ppr False = ptext SLIT("False") + +instance Outputable Int where + ppr n = int n + +instance Outputable () where + ppr _ = text "()" + +instance (Outputable a) => Outputable [a] where + ppr xs = brackets (fsep (punctuate comma (map ppr xs))) + +instance (Outputable a, Outputable b) => Outputable (a, b) where + ppr (x,y) = parens (sep [ppr x <> comma, ppr y]) + +instance Outputable a => Outputable (Maybe a) where + ppr Nothing = ptext SLIT("Nothing") + ppr (Just x) = ptext SLIT("Just") <+> ppr x + +-- ToDo: may not be used +instance (Outputable a, Outputable b, Outputable c) => Outputable (a, b, c) where + ppr (x,y,z) = + parens (sep [ppr x <> comma, + ppr y <> comma, + ppr z ]) + +instance (Outputable a, Outputable b, Outputable c, Outputable d) => + Outputable (a, b, c, d) where + ppr (x,y,z,w) = + parens (sep [ppr x <> comma, + ppr y <> comma, + ppr z <> comma, + ppr w]) + +instance Outputable FastString where + ppr fs = ftext fs -- Prints an unadorned string, + -- no double quotes or anything + +instance Outputable PackageId where + ppr pid = text (packageIdString pid) +\end{code} + + +%************************************************************************ +%* * +\subsection{The @OutputableBndr@ class} +%* * +%************************************************************************ + +When we print a binder, we often want to print its type too. +The @OutputableBndr@ class encapsulates this idea. + +@BindingSite@ is used to tell the thing that prints binder what +language construct is binding the identifier. This can be used +to decide how much info to print. 
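+
+For illustration only, a hypothetical instance (DemoBndr is invented
+for this example and does not exist in GHC): occurrences print as a
+bare name, but binding sites also show the type.
+
+\begin{code}
+-- Illustrative sketch only; DemoBndr is not a real GHC type.
+data DemoBndr = DemoBndr String String     -- binder name, rendered type
+
+instance Outputable DemoBndr where
+  ppr (DemoBndr name _) = text name        -- occurrences: just the name
+
+instance OutputableBndr DemoBndr where
+  -- at a binding site, also show the type, whatever the site is
+  pprBndr _site (DemoBndr name ty) = text name <+> dcolon <+> text ty
+\end{code}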
+ +\begin{code} +data BindingSite = LambdaBind | CaseBind | LetBind + +class Outputable a => OutputableBndr a where + pprBndr :: BindingSite -> a -> SDoc + pprBndr b x = ppr x +\end{code} + + + +%************************************************************************ +%* * +\subsection{Random printing helpers} +%* * +%************************************************************************ + +\begin{code} +-- We have 31-bit Chars and will simply use Show instances +-- of Char and String. + +pprHsChar :: Char -> SDoc +pprHsChar c | c > '\x10ffff' = char '\\' <> text (show (fromIntegral (ord c) :: Word32)) + | otherwise = text (show c) + +pprHsString :: FastString -> SDoc +pprHsString fs = text (show (unpackFS fs)) +\end{code} + + +%************************************************************************ +%* * +\subsection{Other helper functions} +%* * +%************************************************************************ + +\begin{code} +pprWithCommas :: (a -> SDoc) -> [a] -> SDoc +pprWithCommas pp xs = fsep (punctuate comma (map pp xs)) + +interppSP :: Outputable a => [a] -> SDoc +interppSP xs = sep (map ppr xs) + +interpp'SP :: Outputable a => [a] -> SDoc +interpp'SP xs = sep (punctuate comma (map ppr xs)) + +pprQuotedList :: Outputable a => [a] -> SDoc +-- [x,y,z] ==> `x', `y', `z' +pprQuotedList xs = hsep (punctuate comma (map (quotes . ppr) xs)) +\end{code} + + +%************************************************************************ +%* * +\subsection{Printing numbers verbally} +%* * +%************************************************************************ + +@speakNth@ converts an integer to a verbal index; eg 1 maps to +``first'' etc. + +\begin{code} +speakNth :: Int -> SDoc +speakNth 1 = ptext SLIT("first") +speakNth 2 = ptext SLIT("second") +speakNth 3 = ptext SLIT("third") +speakNth 4 = ptext SLIT("fourth") +speakNth 5 = ptext SLIT("fifth") +speakNth 6 = ptext SLIT("sixth") +speakNth n = hcat [ int n, text suffix ] + where + suffix | n <= 20 = "th" -- 11,12,13 are non-std + | last_dig == 1 = "st" + | last_dig == 2 = "nd" + | last_dig == 3 = "rd" + | otherwise = "th" + + last_dig = n `rem` 10 + +speakN :: Int -> SDoc +speakN 0 = ptext SLIT("none") -- E.g. "he has none" +speakN 1 = ptext SLIT("one") -- E.g. "he has one" +speakN 2 = ptext SLIT("two") +speakN 3 = ptext SLIT("three") +speakN 4 = ptext SLIT("four") +speakN 5 = ptext SLIT("five") +speakN 6 = ptext SLIT("six") +speakN n = int n + +speakNOf :: Int -> SDoc -> SDoc +speakNOf 0 d = ptext SLIT("no") <+> d <> char 's' -- E.g. "no arguments" +speakNOf 1 d = ptext SLIT("one") <+> d -- E.g. "one argument" +speakNOf n d = speakN n <+> d <> char 's' -- E.g. 
"three arguments" + +speakNTimes :: Int {- >=1 -} -> SDoc +speakNTimes t | t == 1 = ptext SLIT("once") + | t == 2 = ptext SLIT("twice") + | otherwise = speakN t <+> ptext SLIT("times") + +plural [x] = empty +plural xs = char 's' +\end{code} + + +%************************************************************************ +%* * +\subsection{Error handling} +%* * +%************************************************************************ + +\begin{code} +pprPanic, pprPgmError :: String -> SDoc -> a +pprTrace :: String -> SDoc -> a -> a +pprPanic = pprAndThen panic -- Throw an exn saying "bug in GHC" + +pprPgmError = pprAndThen pgmError -- Throw an exn saying "bug in pgm being compiled" + -- (used for unusual pgm errors) +pprTrace = pprAndThen trace + +pprPanic# heading pretty_msg = panic# (show (doc PprDebug)) + where + doc = text heading <+> pretty_msg + +pprAndThen :: (String -> a) -> String -> SDoc -> a +pprAndThen cont heading pretty_msg = cont (show (doc PprDebug)) + where + doc = sep [text heading, nest 4 pretty_msg] + +assertPprPanic :: String -> Int -> SDoc -> a +assertPprPanic file line msg + = panic (show (doc PprDebug)) + where + doc = sep [hsep[text "ASSERT failed! file", + text file, + text "line", int line], + msg] + +warnPprTrace :: Bool -> String -> Int -> SDoc -> a -> a +warnPprTrace False file line msg x = x +warnPprTrace True file line msg x + = trace (show (doc PprDebug)) x + where + doc = sep [hsep [text "WARNING: file", text file, text "line", int line], + msg] +\end{code} diff --git a/compiler/utils/Panic.lhs b/compiler/utils/Panic.lhs new file mode 100644 index 0000000000..1a74d5db32 --- /dev/null +++ b/compiler/utils/Panic.lhs @@ -0,0 +1,250 @@ +% +% (c) The GRASP Project, Glasgow University, 1992-2000 +% +\section{Panic error messages} + +Defines basic funtions for printing error messages. + +It's hard to put these functions anywhere else without causing +some unnecessary loops in the module dependency graph. + +\begin{code} +module Panic + ( + GhcException(..), showGhcException, ghcError, progName, + pgmError, + + panic, panic#, assertPanic, trace, + + Exception.Exception(..), showException, try, tryJust, tryMost, tryUser, + catchJust, ioErrors, throwTo, + + installSignalHandlers, interruptTargetThread + ) where + +#include "HsVersions.h" + +import Config +import FastTypes + +#ifndef mingw32_HOST_OS +# if __GLASGOW_HASKELL__ > 504 +import System.Posix.Signals +# else +import Posix ( Handler(Catch), installHandler, sigINT, sigQUIT ) +# endif /* GHC > 504 */ +#endif /* mingw32_HOST_OS */ + +#if defined(mingw32_HOST_OS) && __GLASGOW_HASKELL__ >= 603 +import GHC.ConsoleHandler +#endif + +# if __GLASGOW_HASKELL__ < 500 +import EXCEPTION ( raiseInThread ) +# else +import EXCEPTION ( throwTo ) +# endif /* GHC < 500 */ + +#if __GLASGOW_HASKELL__ > 408 +import EXCEPTION ( catchJust, tryJust, ioErrors ) +#endif + +import CONCURRENT ( myThreadId, MVar, ThreadId, withMVar, newEmptyMVar ) +import DYNAMIC +import qualified EXCEPTION as Exception +import TRACE ( trace ) +import UNSAFE_IO ( unsafePerformIO ) +import IO ( isUserError ) + +import System +\end{code} + +GHC's own exception type. + +\begin{code} +ghcError :: GhcException -> a +ghcError e = Exception.throwDyn e + +-- error messages all take the form +-- +-- <location>: <error> +-- +-- If the location is on the command line, or in GHC itself, then +-- <location>="ghc". 
All of the error types below correspond to +-- a <location> of "ghc", except for ProgramError (where the string is +-- assumed to contain a location already, so we don't print one). + +data GhcException + = PhaseFailed String -- name of phase + ExitCode -- an external phase (eg. cpp) failed + | Interrupted -- someone pressed ^C + | UsageError String -- prints the short usage msg after the error + | CmdLineError String -- cmdline prob, but doesn't print usage + | Panic String -- the `impossible' happened + | InstallationError String -- an installation problem + | ProgramError String -- error in the user's code, probably + deriving Eq + +progName = unsafePerformIO (getProgName) +{-# NOINLINE progName #-} + +short_usage = "Usage: For basic information, try the `--help' option." + +showException :: Exception.Exception -> String +-- Show expected dynamic exceptions specially +showException (Exception.DynException d) | Just e <- fromDynamic d + = show (e::GhcException) +showException other_exn = show other_exn + +instance Show GhcException where + showsPrec _ e@(ProgramError _) = showGhcException e + showsPrec _ e = showString progName . showString ": " . showGhcException e + +showGhcException (UsageError str) + = showString str . showChar '\n' . showString short_usage +showGhcException (PhaseFailed phase code) + = showString "phase `" . showString phase . + showString "' failed (exitcode = " . shows int_code . + showString ")" + where + int_code = + case code of + ExitSuccess -> (0::Int) + ExitFailure x -> x +showGhcException (CmdLineError str) + = showString str +showGhcException (ProgramError str) + = showString str +showGhcException (InstallationError str) + = showString str +showGhcException (Interrupted) + = showString "interrupted" +showGhcException (Panic s) + = showString ("panic! (the 'impossible' happened)\n" + ++ " (GHC version " ++ cProjectVersion ++ " for " ++ TargetPlatform_NAME ++ "):\n\t" + ++ s ++ "\n\n" + ++ "Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug\n") + +#if __GLASGOW_HASKELL__ < 603 +myMkTyConApp = mkAppTy +#else +myMkTyConApp = mkTyConApp +#endif + +ghcExceptionTc = mkTyCon "GhcException" +{-# NOINLINE ghcExceptionTc #-} +instance Typeable GhcException where + typeOf _ = myMkTyConApp ghcExceptionTc [] +\end{code} + +Panics and asserts. + +\begin{code} +panic, pgmError :: String -> a +panic x = Exception.throwDyn (Panic x) +pgmError x = Exception.throwDyn (ProgramError x) + +-- #-versions because panic can't return an unboxed int, and that's +-- what TAG_ is with GHC at the moment. Ugh. (Simon) +-- No, man -- Too Beautiful! (Will) + +panic# :: String -> FastInt +panic# s = case (panic s) of () -> _ILIT 0 + +assertPanic :: String -> Int -> a +assertPanic file line = + Exception.throw (Exception.AssertionFailed + ("ASSERT failed! file " ++ file ++ ", line " ++ show line)) +\end{code} + +\begin{code} +-- | tryMost is like try, but passes through Interrupted and Panic +-- exceptions. Used when we want soft failures when reading interface +-- files, for example. + +tryMost :: IO a -> IO (Either Exception.Exception a) +tryMost action = do r <- try action; filter r + where + filter (Left e@(Exception.DynException d)) + | Just ghc_ex <- fromDynamic d + = case ghc_ex of + Interrupted -> Exception.throw e + Panic _ -> Exception.throw e + _other -> return (Left e) + filter other + = return other + +-- | tryUser is like try, but catches only UserErrors. 
+-- These are the ones that are thrown by the TcRn monad +-- to signal an error in the program being compiled +tryUser :: IO a -> IO (Either Exception.Exception a) +tryUser action = tryJust tc_errors action + where +#if __GLASGOW_HASKELL__ > 504 || __GLASGOW_HASKELL__ < 500 + tc_errors e@(Exception.IOException ioe) | isUserError ioe = Just e +#elif __GLASGOW_HASKELL__ == 502 + tc_errors e@(UserError _) = Just e +#else + tc_errors e@(Exception.IOException ioe) | isUserError e = Just e +#endif + tc_errors _other = Nothing +\end{code} + +Compatibility stuff: + +\begin{code} +#if __GLASGOW_HASKELL__ <= 408 +try = Exception.tryAllIO +#else +try = Exception.try +#endif + +#if __GLASGOW_HASKELL__ <= 408 +catchJust = Exception.catchIO +tryJust = Exception.tryIO +ioErrors = Exception.justIoErrors +throwTo = Exception.raiseInThread +#endif +\end{code} + +Standard signal handlers for catching ^C, which just throw an +exception in the target thread. The current target thread is +the thread at the head of the list in the MVar passed to +installSignalHandlers. + +\begin{code} +installSignalHandlers :: IO () +installSignalHandlers = do + let + interrupt_exn = Exception.DynException (toDyn Interrupted) + + interrupt = do + withMVar interruptTargetThread $ \targets -> + case targets of + [] -> return () + (thread:_) -> throwTo thread interrupt_exn + -- +#if !defined(mingw32_HOST_OS) + installHandler sigQUIT (Catch interrupt) Nothing + installHandler sigINT (Catch interrupt) Nothing + return () +#elif __GLASGOW_HASKELL__ >= 603 + -- GHC 6.3+ has support for console events on Windows + -- NOTE: running GHCi under a bash shell for some reason requires + -- you to press Ctrl-Break rather than Ctrl-C to provoke + -- an interrupt. Ctrl-C is getting blocked somewhere, I don't know + -- why --SDM 17/12/2004 + let sig_handler ControlC = interrupt + sig_handler Break = interrupt + sig_handler _ = return () + + installHandler (Catch sig_handler) + return () +#else + return () -- nothing +#endif + +{-# NOINLINE interruptTargetThread #-} +interruptTargetThread :: MVar [ThreadId] +interruptTargetThread = unsafePerformIO newEmptyMVar +\end{code} diff --git a/compiler/utils/Pretty.lhs b/compiler/utils/Pretty.lhs new file mode 100644 index 0000000000..ec8f1e75ad --- /dev/null +++ b/compiler/utils/Pretty.lhs @@ -0,0 +1,1075 @@ +********************************************************************************* +* * +* John Hughes's and Simon Peyton Jones's Pretty Printer Combinators * +* * +* based on "The Design of a Pretty-printing Library" * +* in Advanced Functional Programming, * +* Johan Jeuring and Erik Meijer (eds), LNCS 925 * +* http://www.cs.chalmers.se/~rjmh/Papers/pretty.ps * +* * +* Heavily modified by Simon Peyton Jones, Dec 96 * +* * +********************************************************************************* + +Version 3.0 28 May 1997 + * Cured massive performance bug. If you write + + foldl <> empty (map (text.show) [1..10000]) + + you get quadratic behaviour with V2.0. Why? For just the same reason as you get + quadratic behaviour with left-associated (++) chains. + + This is really bad news. One thing a pretty-printer abstraction should + certainly guarantee is insensivity to associativity. It matters: suddenly + GHC's compilation times went up by a factor of 100 when I switched to the + new pretty printer. + + I fixed it with a bit of a hack (because I wanted to get GHC back on the + road). 
I added two new constructors to the Doc type, Above and Beside: + + <> = Beside + $$ = Above + + Then, where I need to get to a "TextBeside" or "NilAbove" form I "force" + the Doc to squeeze out these suspended calls to Beside and Above; but in so + doing I re-associate. It's quite simple, but I'm not satisfied that I've done + the best possible job. I'll send you the code if you are interested. + + * Added new exports: + punctuate, hang + int, integer, float, double, rational, + lparen, rparen, lbrack, rbrack, lbrace, rbrace, + + * fullRender's type signature has changed. Rather than producing a string it + now takes an extra couple of arguments that tells it how to glue fragments + of output together: + + fullRender :: Mode + -> Int -- Line length + -> Float -- Ribbons per line + -> (TextDetails -> a -> a) -- What to do with text + -> a -- What to do at the end + -> Doc + -> a -- Result + + The "fragments" are encapsulated in the TextDetails data type: + data TextDetails = Chr Char + | Str String + | PStr FastString + + The Chr and Str constructors are obvious enough. The PStr constructor has a packed + string (FastString) inside it. It's generated by using the new "ptext" export. + + An advantage of this new setup is that you can get the renderer to do output + directly (by passing in a function of type (TextDetails -> IO () -> IO ()), + rather than producing a string that you then print. + + +Version 2.0 24 April 1997 + * Made empty into a left unit for <> as well as a right unit; + it is also now true that + nest k empty = empty + which wasn't true before. + + * Fixed an obscure bug in sep that occassionally gave very wierd behaviour + + * Added $+$ + + * Corrected and tidied up the laws and invariants + +====================================================================== +Relative to John's original paper, there are the following new features: + +1. There's an empty document, "empty". It's a left and right unit for + both <> and $$, and anywhere in the argument list for + sep, hcat, hsep, vcat, fcat etc. + + It is Really Useful in practice. + +2. There is a paragraph-fill combinator, fsep, that's much like sep, + only it keeps fitting things on one line until itc can't fit any more. + +3. Some random useful extra combinators are provided. + <+> puts its arguments beside each other with a space between them, + unless either argument is empty in which case it returns the other + + + hcat is a list version of <> + hsep is a list version of <+> + vcat is a list version of $$ + + sep (separate) is either like hsep or like vcat, depending on what fits + + cat is behaves like sep, but it uses <> for horizontal conposition + fcat is behaves like fsep, but it uses <> for horizontal conposition + + These new ones do the obvious things: + char, semi, comma, colon, space, + parens, brackets, braces, + quotes, doubleQuotes + +4. The "above" combinator, $$, now overlaps its two arguments if the + last line of the top argument stops before the first line of the second begins. + For example: text "hi" $$ nest 5 "there" + lays out as + hi there + rather than + hi + there + + There are two places this is really useful + + a) When making labelled blocks, like this: + Left -> code for left + Right -> code for right + LongLongLongLabel -> + code for longlonglonglabel + The block is on the same line as the label if the label is + short, but on the next line otherwise. + + b) When laying out lists like this: + [ first + , second + , third + ] + which some people like. 
But if the list fits on one line + you want [first, second, third]. You can't do this with + John's original combinators, but it's quite easy with the + new $$. + + The combinator $+$ gives the original "never-overlap" behaviour. + +5. Several different renderers are provided: + * a standard one + * one that uses cut-marks to avoid deeply-nested documents + simply piling up in the right-hand margin + * one that ignores indentation (fewer chars output; good for machines) + * one that ignores indentation and newlines (ditto, only more so) + +6. Numerous implementation tidy-ups + Use of unboxed data types to speed up the implementation + + + +\begin{code} +module Pretty ( + Doc, -- Abstract + Mode(..), TextDetails(..), + + empty, isEmpty, nest, + + text, char, ftext, ptext, + int, integer, float, double, rational, + parens, brackets, braces, quotes, doubleQuotes, + semi, comma, colon, space, equals, + lparen, rparen, lbrack, rbrack, lbrace, rbrace, + + (<>), (<+>), hcat, hsep, + ($$), ($+$), vcat, + sep, cat, + fsep, fcat, + + hang, punctuate, + +-- renderStyle, -- Haskell 1.3 only + render, fullRender, printDoc, showDocWith + ) where + +#include "HsVersions.h" + +import BufWrite +import FastString + +import GLAEXTS + +import Numeric (fromRat) +import IO + +import System.IO ( hPutBuf ) + +import GHC.Base ( unpackCString# ) +import GHC.Ptr ( Ptr(..) ) + +-- Don't import Util( assertPanic ) because it makes a loop in the module structure + +infixl 6 <> +infixl 6 <+> +infixl 5 $$, $+$ +\end{code} + + + +********************************************************* +* * +\subsection{CPP magic so that we can compile with both GHC and Hugs} +* * +********************************************************* + +The library uses unboxed types to get a bit more speed, but these CPP macros +allow you to use either GHC or Hugs. To get GHC, just set the CPP variable + __GLASGOW_HASKELL__ + +\begin{code} + +#if defined(__GLASGOW_HASKELL__) + +-- Glasgow Haskell + +-- Disable ASSERT checks; they are expensive! 
+#define LOCAL_ASSERT(x) + +#define ILIT(x) (x#) +#define IBOX(x) (I# (x)) +#define INT Int# +#define MINUS -# +#define NEGATE negateInt# +#define PLUS +# +#define GR ># +#define GREQ >=# +#define LT <# +#define DIV `quotInt#` + + +#define SHOW Show +#define MAXINT maxBound + +#else + +-- Standard Haskell + +#define LOCAL_ASSERT(x) + +#define INT Int +#define IBOX(x) x +#define MINUS - +#define NEGATE negate +#define PLUS + +#define GR > +#define GREQ >= +#define LT < +#define DIV `quot` +#define ILIT(x) x + +#define SHOW Show +#define MAXINT maxBound + +#endif + +\end{code} + + +********************************************************* +* * +\subsection{The interface} +* * +********************************************************* + +The primitive @Doc@ values + +\begin{code} +empty :: Doc +isEmpty :: Doc -> Bool +text :: String -> Doc +char :: Char -> Doc + +semi, comma, colon, space, equals :: Doc +lparen, rparen, lbrack, rbrack, lbrace, rbrace :: Doc + +parens, brackets, braces :: Doc -> Doc +quotes, doubleQuotes :: Doc -> Doc + +int :: Int -> Doc +integer :: Integer -> Doc +float :: Float -> Doc +double :: Double -> Doc +rational :: Rational -> Doc +\end{code} + +Combining @Doc@ values + +\begin{code} +(<>) :: Doc -> Doc -> Doc -- Beside +hcat :: [Doc] -> Doc -- List version of <> +(<+>) :: Doc -> Doc -> Doc -- Beside, separated by space +hsep :: [Doc] -> Doc -- List version of <+> + +($$) :: Doc -> Doc -> Doc -- Above; if there is no + -- overlap it "dovetails" the two +vcat :: [Doc] -> Doc -- List version of $$ + +cat :: [Doc] -> Doc -- Either hcat or vcat +sep :: [Doc] -> Doc -- Either hsep or vcat +fcat :: [Doc] -> Doc -- ``Paragraph fill'' version of cat +fsep :: [Doc] -> Doc -- ``Paragraph fill'' version of sep + +nest :: Int -> Doc -> Doc -- Nested +\end{code} + +GHC-specific ones. + +\begin{code} +hang :: Doc -> Int -> Doc -> Doc +punctuate :: Doc -> [Doc] -> [Doc] -- punctuate p [d1, ... dn] = [d1 <> p, d2 <> p, ... dn-1 <> p, dn] +\end{code} + +Displaying @Doc@ values. + +\begin{code} +instance SHOW Doc where + showsPrec prec doc cont = showDoc doc cont + +render :: Doc -> String -- Uses default style +fullRender :: Mode + -> Int -- Line length + -> Float -- Ribbons per line + -> (TextDetails -> a -> a) -- What to do with text + -> a -- What to do at the end + -> Doc + -> a -- Result + +{- When we start using 1.3 +renderStyle :: Style -> Doc -> String +data Style = Style { lineLength :: Int, -- In chars + ribbonsPerLine :: Float, -- Ratio of ribbon length to line length + mode :: Mode + } +style :: Style -- The default style +style = Style { lineLength = 100, ribbonsPerLine = 2.5, mode = PageMode } +-} + +data Mode = PageMode -- Normal + | ZigZagMode -- With zig-zag cuts + | LeftMode -- No indentation, infinitely long lines + | OneLineMode -- All on one line + +\end{code} + + +********************************************************* +* * +\subsection{The @Doc@ calculus} +* * +********************************************************* + +The @Doc@ combinators satisfy the following laws: +\begin{verbatim} +Laws for $$ +~~~~~~~~~~~ +<a1> (x $$ y) $$ z = x $$ (y $$ z) +<a2> empty $$ x = x +<a3> x $$ empty = x + + ...ditto $+$... + +Laws for <> +~~~~~~~~~~~ +<b1> (x <> y) <> z = x <> (y <> z) +<b2> empty <> x = empty +<b3> x <> empty = x + + ...ditto <+>... 
+ +Laws for text +~~~~~~~~~~~~~ +<t1> text s <> text t = text (s++t) +<t2> text "" <> x = x, if x non-empty + +Laws for nest +~~~~~~~~~~~~~ +<n1> nest 0 x = x +<n2> nest k (nest k' x) = nest (k+k') x +<n3> nest k (x <> y) = nest k z <> nest k y +<n4> nest k (x $$ y) = nest k x $$ nest k y +<n5> nest k empty = empty +<n6> x <> nest k y = x <> y, if x non-empty + +** Note the side condition on <n6>! It is this that +** makes it OK for empty to be a left unit for <>. + +Miscellaneous +~~~~~~~~~~~~~ +<m1> (text s <> x) $$ y = text s <> ((text "" <> x)) $$ + nest (-length s) y) + +<m2> (x $$ y) <> z = x $$ (y <> z) + if y non-empty + + +Laws for list versions +~~~~~~~~~~~~~~~~~~~~~~ +<l1> sep (ps++[empty]++qs) = sep (ps ++ qs) + ...ditto hsep, hcat, vcat, fill... + +<l2> nest k (sep ps) = sep (map (nest k) ps) + ...ditto hsep, hcat, vcat, fill... + +Laws for oneLiner +~~~~~~~~~~~~~~~~~ +<o1> oneLiner (nest k p) = nest k (oneLiner p) +<o2> oneLiner (x <> y) = oneLiner x <> oneLiner y +\end{verbatim} + + +You might think that the following verion of <m1> would +be neater: +\begin{verbatim} +<3 NO> (text s <> x) $$ y = text s <> ((empty <> x)) $$ + nest (-length s) y) +\end{verbatim} +But it doesn't work, for if x=empty, we would have +\begin{verbatim} + text s $$ y = text s <> (empty $$ nest (-length s) y) + = text s <> nest (-length s) y +\end{verbatim} + + + +********************************************************* +* * +\subsection{Simple derived definitions} +* * +********************************************************* + +\begin{code} +semi = char ';' +colon = char ':' +comma = char ',' +space = char ' ' +equals = char '=' +lparen = char '(' +rparen = char ')' +lbrack = char '[' +rbrack = char ']' +lbrace = char '{' +rbrace = char '}' + +int n = text (show n) +integer n = text (show n) +float n = text (show n) +double n = text (show n) +rational n = text (show (fromRat n)) +--rational n = text (show (fromRationalX n)) -- _showRational 30 n) + +quotes p = char '`' <> p <> char '\'' +doubleQuotes p = char '"' <> p <> char '"' +parens p = char '(' <> p <> char ')' +brackets p = char '[' <> p <> char ']' +braces p = char '{' <> p <> char '}' + + +hcat = foldr (<>) empty +hsep = foldr (<+>) empty +vcat = foldr ($$) empty + +hang d1 n d2 = sep [d1, nest n d2] + +punctuate p [] = [] +punctuate p (d:ds) = go d ds + where + go d [] = [d] + go d (e:es) = (d <> p) : go e es +\end{code} + + +********************************************************* +* * +\subsection{The @Doc@ data type} +* * +********************************************************* + +A @Doc@ represents a {\em set} of layouts. A @Doc@ with +no occurrences of @Union@ or @NoDoc@ represents just one layout. 
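+
+For illustration only (demoWide and demoNarrow are invented for this
+example and not used elsewhere): a document built with sep denotes both
+a one-line and a vertical layout, and the renderer picks whichever
+member of that set fits the width it is given.
+
+\begin{code}
+-- Illustrative sketch: the same Doc rendered at two widths.
+-- At width 80 both words fit on one line; at width 8 they do not,
+-- so the vertical member of the layout set is chosen instead.
+demoWide, demoNarrow :: String
+demoWide   = fullRender PageMode 80 1.5 string_txt "" (sep [text "hello", text "world"])
+demoNarrow = fullRender PageMode  8 1.5 string_txt "" (sep [text "hello", text "world"])
+\end{code}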
+\begin{code} +data Doc + = Empty -- empty + | NilAbove Doc -- text "" $$ x + | TextBeside !TextDetails INT Doc -- text s <> x + | Nest INT Doc -- nest k x + | Union Doc Doc -- ul `union` ur + | NoDoc -- The empty set of documents + | Beside Doc Bool Doc -- True <=> space between + | Above Doc Bool Doc -- True <=> never overlap + +type RDoc = Doc -- RDoc is a "reduced Doc", guaranteed not to have a top-level Above or Beside + + +reduceDoc :: Doc -> RDoc +reduceDoc (Beside p g q) = beside p g (reduceDoc q) +reduceDoc (Above p g q) = above p g (reduceDoc q) +reduceDoc p = p + + +data TextDetails = Chr {-#UNPACK#-}!Char + | Str String + | PStr FastString -- a hashed string + | LStr Addr# Int# -- a '\0'-terminated array of bytes + +space_text = Chr ' ' +nl_text = Chr '\n' +\end{code} + +Here are the invariants: +\begin{itemize} +\item +The argument of @NilAbove@ is never @Empty@. Therefore +a @NilAbove@ occupies at least two lines. + +\item +The arugment of @TextBeside@ is never @Nest@. + +\item +The layouts of the two arguments of @Union@ both flatten to the same string. + +\item +The arguments of @Union@ are either @TextBeside@, or @NilAbove@. + +\item +The right argument of a union cannot be equivalent to the empty set (@NoDoc@). +If the left argument of a union is equivalent to the empty set (@NoDoc@), +then the @NoDoc@ appears in the first line. + +\item +An empty document is always represented by @Empty@. +It can't be hidden inside a @Nest@, or a @Union@ of two @Empty@s. + +\item +The first line of every layout in the left argument of @Union@ +is longer than the first line of any layout in the right argument. +(1) ensures that the left argument has a first line. In view of (3), +this invariant means that the right argument must have at least two +lines. +\end{itemize} + +\begin{code} + -- Arg of a NilAbove is always an RDoc +nilAbove_ p = LOCAL_ASSERT( ok p ) NilAbove p + where + ok Empty = False + ok other = True + + -- Arg of a TextBeside is always an RDoc +textBeside_ s sl p = TextBeside s sl (LOCAL_ASSERT( ok p ) p) + where + ok (Nest _ _) = False + ok other = True + + -- Arg of Nest is always an RDoc +nest_ k p = Nest k (LOCAL_ASSERT( ok p ) p) + where + ok Empty = False + ok other = True + + -- Args of union are always RDocs +union_ p q = Union (LOCAL_ASSERT( ok p ) p) (LOCAL_ASSERT( ok q ) q) + where + ok (TextBeside _ _ _) = True + ok (NilAbove _) = True + ok (Union _ _) = True + ok other = False +\end{code} + + +Notice the difference between + * NoDoc (no documents) + * Empty (one empty document; no height and no width) + * text "" (a document containing the empty string; + one line high, but has no width) + + + +********************************************************* +* * +\subsection{@empty@, @text@, @nest@, @union@} +* * +********************************************************* + +\begin{code} +empty = Empty + +isEmpty Empty = True +isEmpty _ = False + +char c = textBeside_ (Chr c) 1# Empty +text s = case length s of {IBOX(sl) -> textBeside_ (Str s) sl Empty} +ftext s = case lengthFS s of {IBOX(sl) -> textBeside_ (PStr s) sl Empty} +ptext (Ptr s) = case strLength (Ptr s) of {IBOX(sl) -> textBeside_ (LStr s sl) sl Empty} + +-- RULE that turns (text "abc") into (ptext (A# "abc"#)) to avoid the +-- intermediate packing/unpacking of the string. +{-# RULES + "text/str" forall a. 
text (unpackCString# a) = ptext (Ptr a) + #-} + +nest IBOX(k) p = mkNest k (reduceDoc p) -- Externally callable version + +-- mkNest checks for Nest's invariant that it doesn't have an Empty inside it +mkNest k (Nest k1 p) = mkNest (k PLUS k1) p +mkNest k NoDoc = NoDoc +mkNest k Empty = Empty +mkNest ILIT(0) p = p -- Worth a try! +mkNest k p = nest_ k p + +-- mkUnion checks for an empty document +mkUnion Empty q = Empty +mkUnion p q = p `union_` q +\end{code} + +********************************************************* +* * +\subsection{Vertical composition @$$@} +* * +********************************************************* + + +\begin{code} +p $$ q = Above p False q +p $+$ q = Above p True q + +above :: Doc -> Bool -> RDoc -> RDoc +above (Above p g1 q1) g2 q2 = above p g1 (above q1 g2 q2) +above p@(Beside _ _ _) g q = aboveNest (reduceDoc p) g ILIT(0) (reduceDoc q) +above p g q = aboveNest p g ILIT(0) (reduceDoc q) + +aboveNest :: RDoc -> Bool -> INT -> RDoc -> RDoc +-- Specfication: aboveNest p g k q = p $g$ (nest k q) + +aboveNest NoDoc g k q = NoDoc +aboveNest (p1 `Union` p2) g k q = aboveNest p1 g k q `union_` + aboveNest p2 g k q + +aboveNest Empty g k q = mkNest k q +aboveNest (Nest k1 p) g k q = nest_ k1 (aboveNest p g (k MINUS k1) q) + -- p can't be Empty, so no need for mkNest + +aboveNest (NilAbove p) g k q = nilAbove_ (aboveNest p g k q) +aboveNest (TextBeside s sl p) g k q = textBeside_ s sl rest + where + k1 = k MINUS sl + rest = case p of + Empty -> nilAboveNest g k1 q + other -> aboveNest p g k1 q +\end{code} + +\begin{code} +nilAboveNest :: Bool -> INT -> RDoc -> RDoc +-- Specification: text s <> nilaboveNest g k q +-- = text s <> (text "" $g$ nest k q) + +nilAboveNest g k Empty = Empty -- Here's why the "text s <>" is in the spec! +nilAboveNest g k (Nest k1 q) = nilAboveNest g (k PLUS k1) q + +nilAboveNest g k q | (not g) && (k GR ILIT(0)) -- No newline if no overlap + = textBeside_ (Str (spaces k)) k q + | otherwise -- Put them really above + = nilAbove_ (mkNest k q) +\end{code} + + +********************************************************* +* * +\subsection{Horizontal composition @<>@} +* * +********************************************************* + +\begin{code} +p <> q = Beside p False q +p <+> q = Beside p True q + +beside :: Doc -> Bool -> RDoc -> RDoc +-- Specification: beside g p q = p <g> q + +beside NoDoc g q = NoDoc +beside (p1 `Union` p2) g q = (beside p1 g q) `union_` (beside p2 g q) +beside Empty g q = q +beside (Nest k p) g q = nest_ k $! beside p g q -- p non-empty +beside p@(Beside p1 g1 q1) g2 q2 + {- (A `op1` B) `op2` C == A `op1` (B `op2` C) iff op1 == op2 + [ && (op1 == <> || op1 == <+>) ] -} + | g1 == g2 = beside p1 g1 $! beside q1 g2 q2 + | otherwise = beside (reduceDoc p) g2 q2 +beside p@(Above _ _ _) g q = let d = reduceDoc p in d `seq` beside d g q +beside (NilAbove p) g q = nilAbove_ $! beside p g q +beside (TextBeside s sl p) g q = textBeside_ s sl $! 
rest + where + rest = case p of + Empty -> nilBeside g q + other -> beside p g q +\end{code} + +\begin{code} +nilBeside :: Bool -> RDoc -> RDoc +-- Specification: text "" <> nilBeside g p +-- = text "" <g> p + +nilBeside g Empty = Empty -- Hence the text "" in the spec +nilBeside g (Nest _ p) = nilBeside g p +nilBeside g p | g = textBeside_ space_text ILIT(1) p + | otherwise = p +\end{code} + +********************************************************* +* * +\subsection{Separate, @sep@, Hughes version} +* * +********************************************************* + +\begin{code} +-- Specification: sep ps = oneLiner (hsep ps) +-- `union` +-- vcat ps + +sep = sepX True -- Separate with spaces +cat = sepX False -- Don't + +sepX x [] = empty +sepX x (p:ps) = sep1 x (reduceDoc p) ILIT(0) ps + + +-- Specification: sep1 g k ys = sep (x : map (nest k) ys) +-- = oneLiner (x <g> nest k (hsep ys)) +-- `union` x $$ nest k (vcat ys) + +sep1 :: Bool -> RDoc -> INT -> [Doc] -> RDoc +sep1 g NoDoc k ys = NoDoc +sep1 g (p `Union` q) k ys = sep1 g p k ys + `union_` + (aboveNest q False k (reduceDoc (vcat ys))) + +sep1 g Empty k ys = mkNest k (sepX g ys) +sep1 g (Nest n p) k ys = nest_ n (sep1 g p (k MINUS n) ys) + +sep1 g (NilAbove p) k ys = nilAbove_ (aboveNest p False k (reduceDoc (vcat ys))) +sep1 g (TextBeside s sl p) k ys = textBeside_ s sl (sepNB g p (k MINUS sl) ys) + +-- Specification: sepNB p k ys = sep1 (text "" <> p) k ys +-- Called when we have already found some text in the first item +-- We have to eat up nests + +sepNB g (Nest _ p) k ys = sepNB g p k ys + +sepNB g Empty k ys = oneLiner (nilBeside g (reduceDoc rest)) + `mkUnion` + nilAboveNest False k (reduceDoc (vcat ys)) + where + rest | g = hsep ys + | otherwise = hcat ys + +sepNB g p k ys = sep1 g p k ys +\end{code} + +********************************************************* +* * +\subsection{@fill@} +* * +********************************************************* + +\begin{code} +fsep = fill True +fcat = fill False + +-- Specification: +-- fill [] = empty +-- fill [p] = p +-- fill (p1:p2:ps) = oneLiner p1 <#> nest (length p1) +-- (fill (oneLiner p2 : ps)) +-- `union` +-- p1 $$ fill ps + +fill g [] = empty +fill g (p:ps) = fill1 g (reduceDoc p) ILIT(0) ps + + +fill1 :: Bool -> RDoc -> INT -> [Doc] -> Doc +fill1 g NoDoc k ys = NoDoc +fill1 g (p `Union` q) k ys = fill1 g p k ys + `union_` + (aboveNest q False k (fill g ys)) + +fill1 g Empty k ys = mkNest k (fill g ys) +fill1 g (Nest n p) k ys = nest_ n (fill1 g p (k MINUS n) ys) + +fill1 g (NilAbove p) k ys = nilAbove_ (aboveNest p False k (fill g ys)) +fill1 g (TextBeside s sl p) k ys = textBeside_ s sl (fillNB g p (k MINUS sl) ys) + +fillNB g (Nest _ p) k ys = fillNB g p k ys +fillNB g Empty k [] = Empty +fillNB g Empty k (y:ys) = nilBeside g (fill1 g (oneLiner (reduceDoc y)) k1 ys) + `mkUnion` + nilAboveNest False k (fill g (y:ys)) + where + k1 | g = k MINUS ILIT(1) + | otherwise = k + +fillNB g p k ys = fill1 g p k ys +\end{code} + + +********************************************************* +* * +\subsection{Selecting the best layout} +* * +********************************************************* + +\begin{code} +best :: Int -- Line length + -> Int -- Ribbon length + -> RDoc + -> RDoc -- No unions in here! 
+ +best IBOX(w) IBOX(r) p + = get w p + where + get :: INT -- (Remaining) width of line + -> Doc -> Doc + get w Empty = Empty + get w NoDoc = NoDoc + get w (NilAbove p) = nilAbove_ (get w p) + get w (TextBeside s sl p) = textBeside_ s sl (get1 w sl p) + get w (Nest k p) = nest_ k (get (w MINUS k) p) + get w (p `Union` q) = nicest w r (get w p) (get w q) + + get1 :: INT -- (Remaining) width of line + -> INT -- Amount of first line already eaten up + -> Doc -- This is an argument to TextBeside => eat Nests + -> Doc -- No unions in here! + + get1 w sl Empty = Empty + get1 w sl NoDoc = NoDoc + get1 w sl (NilAbove p) = nilAbove_ (get (w MINUS sl) p) + get1 w sl (TextBeside t tl p) = textBeside_ t tl (get1 w (sl PLUS tl) p) + get1 w sl (Nest k p) = get1 w sl p + get1 w sl (p `Union` q) = nicest1 w r sl (get1 w sl p) + (get1 w sl q) + +nicest w r p q = nicest1 w r ILIT(0) p q +nicest1 w r sl p q | fits ((w `minn` r) MINUS sl) p = p + | otherwise = q + +fits :: INT -- Space available + -> Doc + -> Bool -- True if *first line* of Doc fits in space available + +fits n p | n LT ILIT(0) = False +fits n NoDoc = False +fits n Empty = True +fits n (NilAbove _) = True +fits n (TextBeside _ sl p) = fits (n MINUS sl) p + +minn x y | x LT y = x + | otherwise = y +\end{code} + +@first@ and @nonEmptySet@ are similar to @nicest@ and @fits@, only simpler. +@first@ returns its first argument if it is non-empty, otherwise its second. + +\begin{code} +first p q | nonEmptySet p = p + | otherwise = q + +nonEmptySet NoDoc = False +nonEmptySet (p `Union` q) = True +nonEmptySet Empty = True +nonEmptySet (NilAbove p) = True -- NoDoc always in first line +nonEmptySet (TextBeside _ _ p) = nonEmptySet p +nonEmptySet (Nest _ p) = nonEmptySet p +\end{code} + +@oneLiner@ returns the one-line members of the given set of @Doc@s. 
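A short editorial illustration (not part of the patch), assuming the @text@, @sep@ and @render@ combinators defined in this file are in scope: @sep@ builds a union of a one-line layout and a vertical one, and @best@ (via @render@) keeps the member whose first line fits.

\begin{code}
-- Editorial sketch only; 'exampleSep' is a hypothetical name.
exampleSep :: String
exampleSep = render (sep [text "hello", text "world"])
-- With render's default 100-column page this yields "hello world";
-- on a page too narrow for both words, 'best' would keep the vertical
-- member of the union instead, giving "hello\nworld".
\end{code}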
+ +\begin{code} +oneLiner :: Doc -> Doc +oneLiner NoDoc = NoDoc +oneLiner Empty = Empty +oneLiner (NilAbove p) = NoDoc +oneLiner (TextBeside s sl p) = textBeside_ s sl (oneLiner p) +oneLiner (Nest k p) = nest_ k (oneLiner p) +oneLiner (p `Union` q) = oneLiner p +\end{code} + + + +********************************************************* +* * +\subsection{Displaying the best layout} +* * +********************************************************* + + +\begin{code} +{- +renderStyle Style{mode, lineLength, ribbonsPerLine} doc + = fullRender mode lineLength ribbonsPerLine doc "" +-} + +render doc = showDocWith PageMode doc +showDoc doc rest = showDocWithAppend PageMode doc rest + +showDocWithAppend :: Mode -> Doc -> String -> String +showDocWithAppend mode doc rest = fullRender mode 100 1.5 string_txt rest doc + +showDocWith :: Mode -> Doc -> String +showDocWith mode doc = showDocWithAppend mode doc "" + +string_txt (Chr c) s = c:s +string_txt (Str s1) s2 = s1 ++ s2 +string_txt (PStr s1) s2 = unpackFS s1 ++ s2 +string_txt (LStr s1 _) s2 = unpackLitString s1 ++ s2 + +unpackLitString addr = + unpack 0# + where + unpack nh + | ch `eqChar#` '\0'# = [] + | otherwise = C# ch : unpack (nh +# 1#) + where + ch = indexCharOffAddr# addr nh +\end{code} + +\begin{code} + +fullRender OneLineMode _ _ txt end doc + = lay (reduceDoc doc) + where + lay NoDoc = cant_fail + lay (Union p q) = (lay q) -- Second arg can't be NoDoc + lay (Nest k p) = lay p + lay Empty = end + lay (NilAbove p) = space_text `txt` lay p -- NoDoc always on first line + lay (TextBeside s sl p) = s `txt` lay p + +fullRender LeftMode _ _ txt end doc + = lay (reduceDoc doc) + where + lay NoDoc = cant_fail + lay (Union p q) = lay (first p q) + lay (Nest k p) = lay p + lay Empty = end + lay (NilAbove p) = nl_text `txt` lay p -- NoDoc always on first line + lay (TextBeside s sl p) = s `txt` lay p + +fullRender mode line_length ribbons_per_line txt end doc + = display mode line_length ribbon_length txt end best_doc + where + best_doc = best hacked_line_length ribbon_length (reduceDoc doc) + + hacked_line_length, ribbon_length :: Int + ribbon_length = round (fromIntegral line_length / ribbons_per_line) + hacked_line_length = case mode of { ZigZagMode -> MAXINT; other -> line_length } + +display mode IBOX(page_width) IBOX(ribbon_width) txt end doc + = case page_width MINUS ribbon_width of { gap_width -> + case gap_width DIV ILIT(2) of { shift -> + let + lay k (Nest k1 p) = lay (k PLUS k1) p + lay k Empty = end + + lay k (NilAbove p) = nl_text `txt` lay k p + + lay k (TextBeside s sl p) + = case mode of + ZigZagMode | k GREQ gap_width + -> nl_text `txt` ( + Str (multi_ch shift '/') `txt` ( + nl_text `txt` ( + lay1 (k MINUS shift) s sl p))) + + | k LT ILIT(0) + -> nl_text `txt` ( + Str (multi_ch shift '\\') `txt` ( + nl_text `txt` ( + lay1 (k PLUS shift) s sl p ))) + + other -> lay1 k s sl p + + lay1 k s sl p = Str (indent k) `txt` (s `txt` lay2 (k PLUS sl) p) + + lay2 k (NilAbove p) = nl_text `txt` lay k p + lay2 k (TextBeside s sl p) = s `txt` (lay2 (k PLUS sl) p) + lay2 k (Nest _ p) = lay2 k p + lay2 k Empty = end + in + lay ILIT(0) doc + }} + +cant_fail = error "easy_display: NoDoc" + +indent n | n GREQ ILIT(8) = '\t' : indent (n MINUS ILIT(8)) + | otherwise = spaces n + +multi_ch ILIT(0) ch = "" +multi_ch n ch = ch : multi_ch (n MINUS ILIT(1)) ch + +spaces ILIT(0) = "" +spaces n = ' ' : spaces (n MINUS ILIT(1)) +\end{code} + +\begin{code} +pprCols = (120 :: Int) -- could make configurable + +printDoc :: Mode -> Handle -> Doc -> IO () +printDoc 
LeftMode hdl doc + = do { printLeftRender hdl doc; hFlush hdl } +printDoc mode hdl doc + = do { fullRender mode pprCols 1.5 put done doc ; + hFlush hdl } + where + put (Chr c) next = hPutChar hdl c >> next + put (Str s) next = hPutStr hdl s >> next + put (PStr s) next = hPutFS hdl s >> next + put (LStr s l) next = hPutLitString hdl s l >> next + + done = hPutChar hdl '\n' + + -- some versions of hPutBuf will barf if the length is zero +hPutLitString handle a# 0# = return () +hPutLitString handle a# l# +#if __GLASGOW_HASKELL__ < 411 + = hPutBuf handle (A# a#) (I# l#) +#else + = hPutBuf handle (Ptr a#) (I# l#) +#endif + +-- Printing output in LeftMode is performance critical: it's used when +-- dumping C and assembly output, so we allow ourselves a few dirty +-- hacks: +-- +-- (1) we specialise fullRender for LeftMode with IO output. +-- +-- (2) we add a layer of buffering on top of Handles. Handles +-- don't perform well with lots of hPutChars, which is mostly +-- what we're doing here, because Handles have to be thread-safe +-- and async exception-safe. We only have a single thread and don't +-- care about exceptions, so we add a layer of fast buffering +-- over the Handle interface. +-- +-- (3) a few hacks in layLeft below to convince GHC to generate the right +-- code. + +printLeftRender :: Handle -> Doc -> IO () +printLeftRender hdl doc = do + b <- newBufHandle hdl + layLeft b (reduceDoc doc) + bFlush b + +-- HACK ALERT! the "return () >>" below convinces GHC to eta-expand +-- this function with the IO state lambda. Otherwise we end up with +-- closures in all the case branches. +layLeft b _ | b `seq` False = undefined -- make it strict in b +layLeft b NoDoc = cant_fail +layLeft b (Union p q) = return () >> layLeft b (first p q) +layLeft b (Nest k p) = return () >> layLeft b p +layLeft b Empty = bPutChar b '\n' +layLeft b (NilAbove p) = bPutChar b '\n' >> layLeft b p +layLeft b (TextBeside s sl p) = put b s >> layLeft b p + where + put b _ | b `seq` False = undefined + put b (Chr c) = bPutChar b c + put b (Str s) = bPutStr b s + put b (PStr s) = bPutFS b s + put b (LStr s l) = bPutLitString b s l + +#if __GLASGOW_HASKELL__ < 503 +hPutBuf = hPutBufFull +#endif + +\end{code} diff --git a/compiler/utils/StringBuffer.lhs b/compiler/utils/StringBuffer.lhs new file mode 100644 index 0000000000..e52e7e78da --- /dev/null +++ b/compiler/utils/StringBuffer.lhs @@ -0,0 +1,240 @@ +% +% (c) The University of Glasgow, 1997-2006 +% +\section{String buffers} + +Buffers for scanning string input stored in external arrays. + +\begin{code} +module StringBuffer + ( + StringBuffer(..), + -- non-abstract for vs\/HaskellService + + -- * Creation\/destruction + hGetStringBuffer, + hGetStringBufferBlock, + appendStringBuffers, + stringToStringBuffer, + + -- * Inspection + nextChar, + currentChar, + prevChar, + atEnd, + + -- * Moving and comparison + stepOn, + offsetBytes, + byteDiff, + + -- * Conversion + lexemeToString, + lexemeToFastString, + + -- * Parsing integers + parseInteger, + ) where + +#include "HsVersions.h" + +import Encoding +import FastString ( FastString,mkFastString,mkFastStringBytes ) + +import Foreign +import System.IO ( hGetBuf, hFileSize,IOMode(ReadMode), hClose + , Handle, hTell ) + +import GHC.Ptr ( Ptr(..) ) +import GHC.Exts +import GHC.IOBase ( IO(..) ) +import GHC.Base ( unsafeChr ) + +#if __GLASGOW_HASKELL__ >= 601 +import System.IO ( openBinaryFile ) +#else +import IOExts ( openFileEx, IOModeEx(..) 
) +#endif + +#if __GLASGOW_HASKELL__ < 601 +openBinaryFile fp mode = openFileEx fp (BinaryMode mode) +#endif + +-- ----------------------------------------------------------------------------- +-- The StringBuffer type + +-- |A StringBuffer is an internal pointer to a sized chunk of bytes. +-- The bytes are intended to be *immutable*. There are pure +-- operations to read the contents of a StringBuffer. +-- +-- A StringBuffer may have a finalizer, depending on how it was +-- obtained. +-- +data StringBuffer + = StringBuffer { + buf :: {-# UNPACK #-} !(ForeignPtr Word8), + len :: {-# UNPACK #-} !Int, -- length + cur :: {-# UNPACK #-} !Int -- current pos + } + -- The buffer is assumed to be UTF-8 encoded, and furthermore + -- we add three '\0' bytes to the end as sentinels so that the + -- decoder doesn't have to check for overflow at every single byte + -- of a multibyte sequence. + +instance Show StringBuffer where + showsPrec _ s = showString "<stringbuffer(" + . shows (len s) . showString "," . shows (cur s) + . showString ">" + +-- ----------------------------------------------------------------------------- +-- Creation / Destruction + +hGetStringBuffer :: FilePath -> IO StringBuffer +hGetStringBuffer fname = do + h <- openBinaryFile fname ReadMode + size_i <- hFileSize h + let size = fromIntegral size_i + buf <- mallocForeignPtrArray (size+3) + withForeignPtr buf $ \ptr -> do + r <- if size == 0 then return 0 else hGetBuf h ptr size + hClose h + if (r /= size) + then ioError (userError "short read of file") + else do + pokeArray (ptr `plusPtr` size :: Ptr Word8) [0,0,0] + -- sentinels for UTF-8 decoding + return (StringBuffer buf size 0) + +hGetStringBufferBlock :: Handle -> Int -> IO StringBuffer +hGetStringBufferBlock handle wanted + = do size_i <- hFileSize handle + offset_i <- hTell handle + let size = min wanted (fromIntegral $ size_i-offset_i) + buf <- mallocForeignPtrArray (size+3) + withForeignPtr buf $ \ptr -> + do r <- if size == 0 then return 0 else hGetBuf handle ptr size + if r /= size + then ioError (userError $ "short read of file: "++show(r,size,fromIntegral size_i,handle)) + else do pokeArray (ptr `plusPtr` size :: Ptr Word8) [0,0,0] + return (StringBuffer buf size 0) + +appendStringBuffers :: StringBuffer -> StringBuffer -> IO StringBuffer +appendStringBuffers sb1 sb2 + = do newBuf <- mallocForeignPtrArray (size+3) + withForeignPtr newBuf $ \ptr -> + withForeignPtr (buf sb1) $ \sb1Ptr -> + withForeignPtr (buf sb2) $ \sb2Ptr -> + do copyArray (sb1Ptr `advancePtr` cur sb1) ptr (calcLen sb1) + copyArray (sb2Ptr `advancePtr` cur sb2) (ptr `advancePtr` cur sb1) (calcLen sb2) + pokeArray (ptr `advancePtr` size) [0,0,0] + return (StringBuffer newBuf size 0) + where calcLen sb = len sb - cur sb + size = calcLen sb1 + calcLen sb2 + +stringToStringBuffer :: String -> IO StringBuffer +stringToStringBuffer str = do + let size = utf8EncodedLength str + buf <- mallocForeignPtrArray (size+3) + withForeignPtr buf $ \ptr -> do + utf8EncodeString ptr str + pokeArray (ptr `plusPtr` size :: Ptr Word8) [0,0,0] + -- sentinels for UTF-8 decoding + return (StringBuffer buf size 0) + +-- ----------------------------------------------------------------------------- +-- Grab a character + +-- Getting our fingers dirty a little here, but this is performance-critical +{-# INLINE nextChar #-} +nextChar :: StringBuffer -> (Char,StringBuffer) +nextChar (StringBuffer buf len (I# cur#)) = + inlinePerformIO $ do + withForeignPtr buf $ \(Ptr a#) -> do + case utf8DecodeChar# (a# `plusAddr#` cur#) of + 
(# c#, b# #) -> + let cur' = I# (b# `minusAddr#` a#) in + return (C# c#, StringBuffer buf len cur') + +currentChar :: StringBuffer -> Char +currentChar = fst . nextChar + +prevChar :: StringBuffer -> Char -> Char +prevChar (StringBuffer buf len 0) deflt = deflt +prevChar (StringBuffer buf len cur) deflt = + inlinePerformIO $ do + withForeignPtr buf $ \p -> do + p' <- utf8PrevChar (p `plusPtr` cur) + return (fst (utf8DecodeChar p')) + +-- ----------------------------------------------------------------------------- +-- Moving + +stepOn :: StringBuffer -> StringBuffer +stepOn s = snd (nextChar s) + +offsetBytes :: Int -> StringBuffer -> StringBuffer +offsetBytes i s = s { cur = cur s + i } + +byteDiff :: StringBuffer -> StringBuffer -> Int +byteDiff s1 s2 = cur s2 - cur s1 + +atEnd :: StringBuffer -> Bool +atEnd (StringBuffer _ l c) = l == c + +-- ----------------------------------------------------------------------------- +-- Conversion + +lexemeToString :: StringBuffer -> Int {-bytes-} -> String +lexemeToString _ 0 = "" +lexemeToString (StringBuffer buf _ cur) bytes = + inlinePerformIO $ + withForeignPtr buf $ \ptr -> + utf8DecodeString (ptr `plusPtr` cur) bytes + +lexemeToFastString :: StringBuffer -> Int {-bytes-} -> FastString +lexemeToFastString _ 0 = mkFastString "" +lexemeToFastString (StringBuffer buf _ cur) len = + inlinePerformIO $ + withForeignPtr buf $ \ptr -> + return $! mkFastStringBytes (ptr `plusPtr` cur) len + +-- ----------------------------------------------------------------------------- +-- Parsing integer strings in various bases + +byteOff :: StringBuffer -> Int -> Char +byteOff (StringBuffer buf _ cur) i = + inlinePerformIO $ withForeignPtr buf $ \ptr -> do + w <- peek (ptr `plusPtr` (cur+i)) + return (unsafeChr (fromIntegral (w::Word8))) + +-- | XXX assumes ASCII digits only +parseInteger :: StringBuffer -> Int -> Integer -> (Char->Int) -> Integer +parseInteger buf len radix to_int + = go 0 0 + where go i x | i == len = x + | otherwise = go (i+1) (x * radix + toInteger (to_int (byteOff buf i))) + +-- ----------------------------------------------------------------------------- +-- under the carpet + +-- Just like unsafePerformIO, but we inline it. +{-# INLINE inlinePerformIO #-} +inlinePerformIO :: IO a -> a +inlinePerformIO (IO m) = case m realWorld# of (# _, r #) -> r + +#if __GLASGOW_HASKELL__ < 600 +mallocForeignPtrArray :: Storable a => Int -> IO (ForeignPtr a) +mallocForeignPtrArray = doMalloc undefined + where + doMalloc :: Storable b => b -> Int -> IO (ForeignPtr b) + doMalloc dummy size = mallocForeignPtrBytes (size * sizeOf dummy) + +mallocForeignPtrBytes :: Int -> IO (ForeignPtr a) +mallocForeignPtrBytes n = do + r <- mallocBytes n + newForeignPtr r (finalizerFree r) + +foreign import ccall unsafe "stdlib.h free" + finalizerFree :: Ptr a -> IO () +#endif +\end{code} diff --git a/compiler/utils/UniqFM.lhs b/compiler/utils/UniqFM.lhs new file mode 100644 index 0000000000..84294aae0d --- /dev/null +++ b/compiler/utils/UniqFM.lhs @@ -0,0 +1,847 @@ +%ilter +% (c) The AQUA Project, Glasgow University, 1994-1998 +% +\section[UniqFM]{Specialised finite maps, for things with @Uniques@} + +Based on @FiniteMaps@ (as you would expect). + +Basically, the things need to be in class @Uniquable@, and we use the +@getUnique@ method to grab their @Uniques@. + +(A similar thing to @UniqSet@, as opposed to @Set@.) 
+ +\begin{code} +module UniqFM ( + UniqFM, -- abstract type + + emptyUFM, + unitUFM, + unitDirectlyUFM, + listToUFM, + listToUFM_Directly, + addToUFM,addToUFM_C,addToUFM_Acc, + addListToUFM,addListToUFM_C, + addToUFM_Directly, + addListToUFM_Directly, + delFromUFM, + delFromUFM_Directly, + delListFromUFM, + plusUFM, + plusUFM_C, + minusUFM, + intersectUFM, + intersectUFM_C, + foldUFM, + mapUFM, + elemUFM, elemUFM_Directly, + filterUFM, filterUFM_Directly, + sizeUFM, + hashUFM, + isNullUFM, + lookupUFM, lookupUFM_Directly, + lookupWithDefaultUFM, lookupWithDefaultUFM_Directly, + eltsUFM, keysUFM, + ufmToList + ) where + +#include "HsVersions.h" + +import Unique ( Uniquable(..), Unique, getKey#, mkUniqueGrimily ) +import Maybes ( maybeToBool ) +import FastTypes +import Outputable + +import GLAEXTS -- Lots of Int# operations +\end{code} + +%************************************************************************ +%* * +\subsection{The @UniqFM@ type, and signatures for the functions} +%* * +%************************************************************************ + +We use @FiniteMaps@, with a (@getUnique@-able) @Unique@ as ``key''. + +\begin{code} +emptyUFM :: UniqFM elt +isNullUFM :: UniqFM elt -> Bool +unitUFM :: Uniquable key => key -> elt -> UniqFM elt +unitDirectlyUFM -- got the Unique already + :: Unique -> elt -> UniqFM elt +listToUFM :: Uniquable key => [(key,elt)] -> UniqFM elt +listToUFM_Directly + :: [(Unique, elt)] -> UniqFM elt + +addToUFM :: Uniquable key => UniqFM elt -> key -> elt -> UniqFM elt +addListToUFM :: Uniquable key => UniqFM elt -> [(key,elt)] -> UniqFM elt +addToUFM_Directly + :: UniqFM elt -> Unique -> elt -> UniqFM elt + +addToUFM_C :: Uniquable key => (elt -> elt -> elt) -- old -> new -> result + -> UniqFM elt -- old + -> key -> elt -- new + -> UniqFM elt -- result + +addToUFM_Acc :: Uniquable key => + (elt -> elts -> elts) -- Add to existing + -> (elt -> elts) -- New element + -> UniqFM elts -- old + -> key -> elt -- new + -> UniqFM elts -- result + +addListToUFM_C :: Uniquable key => (elt -> elt -> elt) + -> UniqFM elt -> [(key,elt)] + -> UniqFM elt + +delFromUFM :: Uniquable key => UniqFM elt -> key -> UniqFM elt +delListFromUFM :: Uniquable key => UniqFM elt -> [key] -> UniqFM elt +delFromUFM_Directly :: UniqFM elt -> Unique -> UniqFM elt + +plusUFM :: UniqFM elt -> UniqFM elt -> UniqFM elt + +plusUFM_C :: (elt -> elt -> elt) + -> UniqFM elt -> UniqFM elt -> UniqFM elt + +minusUFM :: UniqFM elt1 -> UniqFM elt2 -> UniqFM elt1 + +intersectUFM :: UniqFM elt -> UniqFM elt -> UniqFM elt +intersectUFM_C :: (elt1 -> elt2 -> elt3) + -> UniqFM elt1 -> UniqFM elt2 -> UniqFM elt3 +foldUFM :: (elt -> a -> a) -> a -> UniqFM elt -> a +mapUFM :: (elt1 -> elt2) -> UniqFM elt1 -> UniqFM elt2 +filterUFM :: (elt -> Bool) -> UniqFM elt -> UniqFM elt +filterUFM_Directly :: (Unique -> elt -> Bool) -> UniqFM elt -> UniqFM elt + +sizeUFM :: UniqFM elt -> Int +hashUFM :: UniqFM elt -> Int +elemUFM :: Uniquable key => key -> UniqFM elt -> Bool +elemUFM_Directly:: Unique -> UniqFM elt -> Bool + +lookupUFM :: Uniquable key => UniqFM elt -> key -> Maybe elt +lookupUFM_Directly -- when you've got the Unique already + :: UniqFM elt -> Unique -> Maybe elt +lookupWithDefaultUFM + :: Uniquable key => UniqFM elt -> elt -> key -> elt +lookupWithDefaultUFM_Directly + :: UniqFM elt -> elt -> Unique -> elt + +keysUFM :: UniqFM elt -> [Unique] -- Get the keys +eltsUFM :: UniqFM elt -> [elt] +ufmToList :: UniqFM elt -> [(Unique, elt)] +\end{code} + 
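A brief editorial example (not part of the patch) of how this interface is typically used; @countOf@ is a hypothetical helper, and @key@ stands for any type in class @Uniquable@.

\begin{code}
-- Editorial sketch only: build a map from (key, count) pairs, adding
-- the counts of clashing keys, then look a key up with a default of 0.
countOf :: Uniquable key => [(key, Int)] -> key -> Int
countOf pairs k = lookupWithDefaultUFM env 0 k
  where
    env = addListToUFM_C (+) emptyUFM pairs
\end{code}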
+%************************************************************************ +%* * +\subsection{The @IdFinMap@ and @TyVarFinMap@ specialisations for Ids/TyVars} +%* * +%************************************************************************ + +\begin{code} +-- Turn off for now, these need to be updated (SDM 4/98) + +#if 0 +#ifdef __GLASGOW_HASKELL__ +-- I don't think HBC was too happy about this (WDP 94/10) + +{-# SPECIALIZE + addListToUFM :: UniqFM elt -> [(Name, elt)] -> UniqFM elt + #-} +{-# SPECIALIZE + addListToUFM_C :: (elt -> elt -> elt) -> UniqFM elt -> [(Name, elt)] -> UniqFM elt + #-} +{-# SPECIALIZE + addToUFM :: UniqFM elt -> Unique -> elt -> UniqFM elt + #-} +{-# SPECIALIZE + listToUFM :: [(Unique, elt)] -> UniqFM elt + #-} +{-# SPECIALIZE + lookupUFM :: UniqFM elt -> Name -> Maybe elt + , UniqFM elt -> Unique -> Maybe elt + #-} + +#endif /* __GLASGOW_HASKELL__ */ +#endif +\end{code} + +%************************************************************************ +%* * +\subsection{Andy Gill's underlying @UniqFM@ machinery} +%* * +%************************************************************************ + +``Uniq Finite maps'' are the heart and soul of the compiler's +lookup-tables/environments. Important stuff! It works well with +Dense and Sparse ranges. +Both @Uq@ Finite maps and @Hash@ Finite Maps +are built ontop of Int Finite Maps. + +This code is explained in the paper: +\begin{display} + A Gill, S Peyton Jones, B O'Sullivan, W Partain and Aqua Friends + "A Cheap balancing act that grows on a tree" + Glasgow FP Workshop, Sep 1994, pp??-?? +\end{display} + +%************************************************************************ +%* * +\subsubsection{The @UniqFM@ type, and signatures for the functions} +%* * +%************************************************************************ + +@UniqFM a@ is a mapping from Unique to a. + +First, the DataType itself; which is either a Node, a Leaf, or an Empty. + +\begin{code} +data UniqFM ele + = EmptyUFM + | LeafUFM FastInt ele + | NodeUFM FastInt -- the switching + FastInt -- the delta + (UniqFM ele) + (UniqFM ele) +-- INVARIANT: the children of a NodeUFM are never EmptyUFMs + +{- +-- for debugging only :-) +instance Outputable (UniqFM a) where + ppr(NodeUFM a b t1 t2) = + sep [text "NodeUFM " <+> int IBOX(a) <+> int IBOX(b), + nest 1 (parens (ppr t1)), + nest 1 (parens (ppr t2))] + ppr (LeafUFM x a) = text "LeafUFM " <+> int IBOX(x) + ppr (EmptyUFM) = empty +-} +-- and when not debugging the package itself... +instance Outputable a => Outputable (UniqFM a) where + ppr ufm = ppr (ufmToList ufm) +\end{code} + +%************************************************************************ +%* * +\subsubsection{The @UniqFM@ functions} +%* * +%************************************************************************ + +First the ways of building a UniqFM. + +\begin{code} +emptyUFM = EmptyUFM +unitUFM key elt = mkLeafUFM (getKey# (getUnique key)) elt +unitDirectlyUFM key elt = mkLeafUFM (getKey# key) elt + +listToUFM key_elt_pairs + = addListToUFM_C use_snd EmptyUFM key_elt_pairs + +listToUFM_Directly uniq_elt_pairs + = addListToUFM_directly_C use_snd EmptyUFM uniq_elt_pairs +\end{code} + +Now ways of adding things to UniqFMs. + +There is an alternative version of @addListToUFM_C@, that uses @plusUFM@, +but the semantics of this operation demands a linear insertion; +perhaps the version without the combinator function +could be optimised using it. 
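An editorial sketch (not part of the patch) of @addToUFM_Acc@, the variant to reach for when several values must be accumulated per key; @multiMap@ is a hypothetical helper.

\begin{code}
-- Editorial sketch only: collect every value seen for a key into a list.
multiMap :: Uniquable key => [(key, elt)] -> UniqFM [elt]
multiMap = foldl add emptyUFM
  where
    add fm (k, v) = addToUFM_Acc (:) (\x -> [x]) fm k v
\end{code}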
+ +\begin{code} +addToUFM fm key elt = addToUFM_C use_snd fm key elt + +addToUFM_Directly fm u elt = insert_ele use_snd fm (getKey# u) elt + +addToUFM_C combiner fm key elt + = insert_ele combiner fm (getKey# (getUnique key)) elt + +addToUFM_Acc add unit fm key item + = insert_ele combiner fm (getKey# (getUnique key)) (unit item) + where + combiner old _unit_item = add item old + +addListToUFM fm key_elt_pairs = addListToUFM_C use_snd fm key_elt_pairs +addListToUFM_Directly fm uniq_elt_pairs = addListToUFM_directly_C use_snd fm uniq_elt_pairs + +addListToUFM_C combiner fm key_elt_pairs + = foldl (\ fm (k, e) -> insert_ele combiner fm (getKey# (getUnique k)) e) + fm key_elt_pairs + +addListToUFM_directly_C combiner fm uniq_elt_pairs + = foldl (\ fm (k, e) -> insert_ele combiner fm (getKey# k) e) + fm uniq_elt_pairs +\end{code} + +Now ways of removing things from UniqFM. + +\begin{code} +delListFromUFM fm lst = foldl delFromUFM fm lst + +delFromUFM fm key = delete fm (getKey# (getUnique key)) +delFromUFM_Directly fm u = delete fm (getKey# u) + +delete EmptyUFM _ = EmptyUFM +delete fm key = del_ele fm + where + del_ele :: UniqFM a -> UniqFM a + + del_ele lf@(LeafUFM j _) + | j ==# key = EmptyUFM + | otherwise = lf -- no delete! + + del_ele nd@(NodeUFM j p t1 t2) + | j ># key + = mkSLNodeUFM (NodeUFMData j p) (del_ele t1) t2 + | otherwise + = mkLSNodeUFM (NodeUFMData j p) t1 (del_ele t2) + + del_ele _ = panic "Found EmptyUFM FM when rec-deleting" +\end{code} + +Now ways of adding two UniqFM's together. + +\begin{code} +plusUFM tr1 tr2 = plusUFM_C use_snd tr1 tr2 + +plusUFM_C f EmptyUFM tr = tr +plusUFM_C f tr EmptyUFM = tr +plusUFM_C f fm1 fm2 = mix_trees fm1 fm2 + where + mix_trees (LeafUFM i a) t2 = insert_ele (flip f) t2 i a + mix_trees t1 (LeafUFM i a) = insert_ele f t1 i a + + mix_trees left_t@(NodeUFM j p t1 t2) right_t@(NodeUFM j' p' t1' t2') + = mix_branches + (ask_about_common_ancestor + (NodeUFMData j p) + (NodeUFMData j' p')) + where + -- Given a disjoint j,j' (p >^ p' && p' >^ p): + -- + -- j j' (C j j') + -- / \ + / \ ==> / \ + -- t1 t2 t1' t2' j j' + -- / \ / \ + -- t1 t2 t1' t2' + -- Fast, Ehh ! + -- + mix_branches (NewRoot nd False) + = mkLLNodeUFM nd left_t right_t + mix_branches (NewRoot nd True) + = mkLLNodeUFM nd right_t left_t + + -- Now, if j == j': + -- + -- j j' j + -- / \ + / \ ==> / \ + -- t1 t2 t1' t2' t1 + t1' t2 + t2' + -- + mix_branches (SameRoot) + = mkSSNodeUFM (NodeUFMData j p) + (mix_trees t1 t1') + (mix_trees t2 t2') + -- Now the 4 different other ways; all like this: + -- + -- Given j >^ j' (and, say, j > j') + -- + -- j j' j + -- / \ + / \ ==> / \ + -- t1 t2 t1' t2' t1 t2 + j' + -- / \ + -- t1' t2' + mix_branches (LeftRoot Leftt) -- | trace "LL" True + = mkSLNodeUFM + (NodeUFMData j p) + (mix_trees t1 right_t) + t2 + + mix_branches (LeftRoot Rightt) -- | trace "LR" True + = mkLSNodeUFM + (NodeUFMData j p) + t1 + (mix_trees t2 right_t) + + mix_branches (RightRoot Leftt) -- | trace "RL" True + = mkSLNodeUFM + (NodeUFMData j' p') + (mix_trees left_t t1') + t2' + + mix_branches (RightRoot Rightt) -- | trace "RR" True + = mkLSNodeUFM + (NodeUFMData j' p') + t1' + (mix_trees left_t t2') + + mix_trees _ _ = panic "EmptyUFM found when inserting into plusInt" +\end{code} + +And ways of subtracting them. First the base cases, +then the full D&C approach. 
+ +\begin{code} +minusUFM EmptyUFM _ = EmptyUFM +minusUFM t1 EmptyUFM = t1 +minusUFM fm1 fm2 = minus_trees fm1 fm2 + where + -- + -- Notice the asymetry of subtraction + -- + minus_trees lf@(LeafUFM i a) t2 = + case lookUp t2 i of + Nothing -> lf + Just b -> EmptyUFM + + minus_trees t1 (LeafUFM i _) = delete t1 i + + minus_trees left_t@(NodeUFM j p t1 t2) right_t@(NodeUFM j' p' t1' t2') + = minus_branches + (ask_about_common_ancestor + (NodeUFMData j p) + (NodeUFMData j' p')) + where + -- Given a disjoint j,j' (p >^ p' && p' >^ p): + -- + -- j j' j + -- / \ + / \ ==> / \ + -- t1 t2 t1' t2' t1 t2 + -- + -- + -- Fast, Ehh ! + -- + minus_branches (NewRoot nd _) = left_t + + -- Now, if j == j': + -- + -- j j' j + -- / \ + / \ ==> / \ + -- t1 t2 t1' t2' t1 + t1' t2 + t2' + -- + minus_branches (SameRoot) + = mkSSNodeUFM (NodeUFMData j p) + (minus_trees t1 t1') + (minus_trees t2 t2') + -- Now the 4 different other ways; all like this: + -- again, with asymatry + + -- + -- The left is above the right + -- + minus_branches (LeftRoot Leftt) + = mkSLNodeUFM + (NodeUFMData j p) + (minus_trees t1 right_t) + t2 + minus_branches (LeftRoot Rightt) + = mkLSNodeUFM + (NodeUFMData j p) + t1 + (minus_trees t2 right_t) + + -- + -- The right is above the left + -- + minus_branches (RightRoot Leftt) + = minus_trees left_t t1' + minus_branches (RightRoot Rightt) + = minus_trees left_t t2' + + minus_trees _ _ = panic "EmptyUFM found when insering into plusInt" +\end{code} + +And taking the intersection of two UniqFM's. + +\begin{code} +intersectUFM t1 t2 = intersectUFM_C use_snd t1 t2 + +intersectUFM_C f EmptyUFM _ = EmptyUFM +intersectUFM_C f _ EmptyUFM = EmptyUFM +intersectUFM_C f fm1 fm2 = intersect_trees fm1 fm2 + where + intersect_trees (LeafUFM i a) t2 = + case lookUp t2 i of + Nothing -> EmptyUFM + Just b -> mkLeafUFM i (f a b) + + intersect_trees t1 (LeafUFM i a) = + case lookUp t1 i of + Nothing -> EmptyUFM + Just b -> mkLeafUFM i (f b a) + + intersect_trees left_t@(NodeUFM j p t1 t2) right_t@(NodeUFM j' p' t1' t2') + = intersect_branches + (ask_about_common_ancestor + (NodeUFMData j p) + (NodeUFMData j' p')) + where + -- Given a disjoint j,j' (p >^ p' && p' >^ p): + -- + -- j j' + -- / \ + / \ ==> EmptyUFM + -- t1 t2 t1' t2' + -- + -- Fast, Ehh ! + -- + intersect_branches (NewRoot nd _) = EmptyUFM + + -- Now, if j == j': + -- + -- j j' j + -- / \ + / \ ==> / \ + -- t1 t2 t1' t2' t1 x t1' t2 x t2' + -- + intersect_branches (SameRoot) + = mkSSNodeUFM (NodeUFMData j p) + (intersect_trees t1 t1') + (intersect_trees t2 t2') + -- Now the 4 different other ways; all like this: + -- + -- Given j >^ j' (and, say, j > j') + -- + -- j j' t2 + j' + -- / \ + / \ ==> / \ + -- t1 t2 t1' t2' t1' t2' + -- + -- This does cut down the search space quite a bit. + + intersect_branches (LeftRoot Leftt) + = intersect_trees t1 right_t + intersect_branches (LeftRoot Rightt) + = intersect_trees t2 right_t + intersect_branches (RightRoot Leftt) + = intersect_trees left_t t1' + intersect_branches (RightRoot Rightt) + = intersect_trees left_t t2' + + intersect_trees x y = panic ("EmptyUFM found when intersecting trees") +\end{code} + +Now the usual set of `collection' operators, like map, fold, etc. 
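For instance (editorial sketch, not part of the patch; @sumUFM@ and @keepPositive@ are hypothetical helpers), they behave like their list counterparts:

\begin{code}
-- Editorial sketch only.
sumUFM :: UniqFM Int -> Int
sumUFM = foldUFM (+) 0

keepPositive :: UniqFM Int -> UniqFM Int
keepPositive = filterUFM (> 0)
\end{code}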
+ +\begin{code} +foldUFM f a (NodeUFM _ _ t1 t2) = foldUFM f (foldUFM f a t2) t1 +foldUFM f a (LeafUFM _ obj) = f obj a +foldUFM f a EmptyUFM = a +\end{code} + +\begin{code} +mapUFM fn EmptyUFM = EmptyUFM +mapUFM fn fm = map_tree fn fm + +filterUFM fn EmptyUFM = EmptyUFM +filterUFM fn fm = filter_tree pred fm + where + pred (i::FastInt) e = fn e + +filterUFM_Directly fn EmptyUFM = EmptyUFM +filterUFM_Directly fn fm = filter_tree pred fm + where + pred i e = fn (mkUniqueGrimily (iBox i)) e +\end{code} + +Note, this takes a long time, O(n), but +because we dont want to do this very often, we put up with this. +O'rable, but how often do we look at the size of +a finite map? + +\begin{code} +sizeUFM EmptyUFM = 0 +sizeUFM (NodeUFM _ _ t1 t2) = sizeUFM t1 + sizeUFM t2 +sizeUFM (LeafUFM _ _) = 1 + +isNullUFM EmptyUFM = True +isNullUFM _ = False + +-- hashing is used in VarSet.uniqAway, and should be fast +-- We use a cheap and cheerful method for now +hashUFM EmptyUFM = 0 +hashUFM (NodeUFM n _ _ _) = iBox n +hashUFM (LeafUFM n _) = iBox n +\end{code} + +looking up in a hurry is the {\em whole point} of this binary tree lark. +Lookup up a binary tree is easy (and fast). + +\begin{code} +elemUFM key fm = maybeToBool (lookupUFM fm key) +elemUFM_Directly key fm = maybeToBool (lookupUFM_Directly fm key) + +lookupUFM fm key = lookUp fm (getKey# (getUnique key)) +lookupUFM_Directly fm key = lookUp fm (getKey# key) + +lookupWithDefaultUFM fm deflt key + = case lookUp fm (getKey# (getUnique key)) of + Nothing -> deflt + Just elt -> elt + +lookupWithDefaultUFM_Directly fm deflt key + = case lookUp fm (getKey# key) of + Nothing -> deflt + Just elt -> elt + +lookUp EmptyUFM _ = Nothing +lookUp fm i = lookup_tree fm + where + lookup_tree :: UniqFM a -> Maybe a + + lookup_tree (LeafUFM j b) + | j ==# i = Just b + | otherwise = Nothing + lookup_tree (NodeUFM j p t1 t2) + | j ># i = lookup_tree t1 + | otherwise = lookup_tree t2 + + lookup_tree EmptyUFM = panic "lookup Failed" +\end{code} + +folds are *wonderful* things. + +\begin{code} +eltsUFM fm = foldUFM (:) [] fm + +ufmToList fm = fold_tree (\ iu elt rest -> (mkUniqueGrimily (iBox iu), elt) : rest) [] fm + +keysUFM fm = fold_tree (\ iu elt rest -> mkUniqueGrimily (iBox iu) : rest) [] fm + +fold_tree f a (NodeUFM _ _ t1 t2) = fold_tree f (fold_tree f a t2) t1 +fold_tree f a (LeafUFM iu obj) = f iu obj a +fold_tree f a EmptyUFM = a +\end{code} + +%************************************************************************ +%* * +\subsubsection{The @UniqFM@ type, and its functions} +%* * +%************************************************************************ + +You should always use these to build the tree. +There are 4 versions of mkNodeUFM, depending on +the strictness of the two sub-tree arguments. +The strictness is used *both* to prune out +empty trees, *and* to improve performance, +stoping needless thunks lying around. +The rule of thumb (from experence with these trees) +is make thunks strict, but data structures lazy. +If in doubt, use mkSSNodeUFM, which has the `strongest' +functionality, but may do a few needless evaluations. + +\begin{code} +mkLeafUFM :: FastInt -> a -> UniqFM a +mkLeafUFM i a = LeafUFM i a + +-- The *ONLY* ways of building a NodeUFM. 
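-- (Editorial note, not part of the patch.)  The name encodes how each
-- subtree argument is treated: an 'S' position is inspected, so an
-- EmptyUFM child is pruned away there, while an 'L' position is taken
-- on trust.  Hence mkSSNodeUFM checks both children, mkSLNodeUFM only
-- the left, mkLSNodeUFM only the right, and mkLLNodeUFM neither.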
+ +mkSSNodeUFM (NodeUFMData j p) EmptyUFM t2 = t2 +mkSSNodeUFM (NodeUFMData j p) t1 EmptyUFM = t1 +mkSSNodeUFM (NodeUFMData j p) t1 t2 + = ASSERT(correctNodeUFM (iBox j) (iBox p) t1 t2) + NodeUFM j p t1 t2 + +mkSLNodeUFM (NodeUFMData j p) EmptyUFM t2 = t2 +mkSLNodeUFM (NodeUFMData j p) t1 t2 + = ASSERT(correctNodeUFM (iBox j) (iBox p) t1 t2) + NodeUFM j p t1 t2 + +mkLSNodeUFM (NodeUFMData j p) t1 EmptyUFM = t1 +mkLSNodeUFM (NodeUFMData j p) t1 t2 + = ASSERT(correctNodeUFM (iBox j) (iBox p) t1 t2) + NodeUFM j p t1 t2 + +mkLLNodeUFM (NodeUFMData j p) t1 t2 + = ASSERT(correctNodeUFM (iBox j) (iBox p) t1 t2) + NodeUFM j p t1 t2 + +correctNodeUFM + :: Int + -> Int + -> UniqFM a + -> UniqFM a + -> Bool + +correctNodeUFM j p t1 t2 + = correct (j-p) (j-1) p t1 && correct j ((j-1)+p) p t2 + where + correct low high _ (LeafUFM i _) + = low <= iBox i && iBox i <= high + correct low high above_p (NodeUFM j p _ _) + = low <= iBox j && iBox j <= high && above_p > iBox p + correct _ _ _ EmptyUFM = panic "EmptyUFM stored inside a tree" +\end{code} + +Note: doing SAT on this by hand seems to make it worse. Todo: Investigate, +and if necessary do $\lambda$ lifting on our functions that are bound. + +\begin{code} +insert_ele + :: (a -> a -> a) -- old -> new -> result + -> UniqFM a + -> FastInt + -> a + -> UniqFM a + +insert_ele f EmptyUFM i new = mkLeafUFM i new + +insert_ele f (LeafUFM j old) i new + | j ># i = + mkLLNodeUFM (getCommonNodeUFMData + (indexToRoot i) + (indexToRoot j)) + (mkLeafUFM i new) + (mkLeafUFM j old) + | j ==# i = mkLeafUFM j (f old new) + | otherwise = + mkLLNodeUFM (getCommonNodeUFMData + (indexToRoot i) + (indexToRoot j)) + (mkLeafUFM j old) + (mkLeafUFM i new) + +insert_ele f n@(NodeUFM j p t1 t2) i a + | i <# j + = if (i >=# (j -# p)) + then mkSLNodeUFM (NodeUFMData j p) (insert_ele f t1 i a) t2 + else mkLLNodeUFM (getCommonNodeUFMData + (indexToRoot i) + ((NodeUFMData j p))) + (mkLeafUFM i a) + n + | otherwise + = if (i <=# ((j -# _ILIT(1)) +# p)) + then mkLSNodeUFM (NodeUFMData j p) t1 (insert_ele f t2 i a) + else mkLLNodeUFM (getCommonNodeUFMData + (indexToRoot i) + ((NodeUFMData j p))) + n + (mkLeafUFM i a) +\end{code} + + + +\begin{code} +map_tree f (NodeUFM j p t1 t2) + = mkLLNodeUFM (NodeUFMData j p) (map_tree f t1) (map_tree f t2) + -- NB. lazy! we know the tree is well-formed. +map_tree f (LeafUFM i obj) + = mkLeafUFM i (f obj) +map_tree f _ = panic "map_tree failed" +\end{code} + +\begin{code} +filter_tree :: (FastInt -> a -> Bool) -> UniqFM a -> UniqFM a +filter_tree f nd@(NodeUFM j p t1 t2) + = mkSSNodeUFM (NodeUFMData j p) (filter_tree f t1) (filter_tree f t2) + +filter_tree f lf@(LeafUFM i obj) + | f i obj = lf + | otherwise = EmptyUFM +filter_tree f _ = panic "filter_tree failed" +\end{code} + +%************************************************************************ +%* * +\subsubsection{The @UniqFM@ type, and signatures for the functions} +%* * +%************************************************************************ + +Now some Utilities; + +This is the information that is held inside a NodeUFM, packaged up for +consumer use. + +\begin{code} +data NodeUFMData + = NodeUFMData FastInt + FastInt +\end{code} + +This is the information used when computing new NodeUFMs. + +\begin{code} +data Side = Leftt | Rightt -- NB: avoid 1.3 names "Left" and "Right" +data CommonRoot + = LeftRoot Side -- which side is the right down ? + | RightRoot Side -- which side is the left down ? + | SameRoot -- they are the same ! 
+ | NewRoot NodeUFMData -- here's the new, common, root + Bool -- do you need to swap left and right ? +\end{code} + +This specifies the relationship between NodeUFMData and CalcNodeUFMData. + +\begin{code} +indexToRoot :: FastInt -> NodeUFMData + +indexToRoot i + = let + l = (_ILIT(1) :: FastInt) + in + NodeUFMData (((i `shiftR_` l) `shiftL_` l) +# _ILIT(1)) l + +getCommonNodeUFMData :: NodeUFMData -> NodeUFMData -> NodeUFMData + +getCommonNodeUFMData (NodeUFMData i p) (NodeUFMData i2 p2) + | p ==# p2 = getCommonNodeUFMData_ p j j2 + | p <# p2 = getCommonNodeUFMData_ p2 (j `quotFastInt` (p2 `quotFastInt` p)) j2 + | otherwise = getCommonNodeUFMData_ p j (j2 `quotFastInt` (p `quotFastInt` p2)) + where + l = (_ILIT(1) :: FastInt) + j = i `quotFastInt` (p `shiftL_` l) + j2 = i2 `quotFastInt` (p2 `shiftL_` l) + + getCommonNodeUFMData_ :: FastInt -> FastInt -> FastInt -> NodeUFMData + + getCommonNodeUFMData_ p j j_ + | j ==# j_ + = NodeUFMData (((j `shiftL_` l) +# l) *# p) p + | otherwise + = getCommonNodeUFMData_ (p `shiftL_` l) (j `shiftR_` l) (j_ `shiftR_` l) + +ask_about_common_ancestor :: NodeUFMData -> NodeUFMData -> CommonRoot + +ask_about_common_ancestor x@(NodeUFMData j p) y@(NodeUFMData j2 p2) + | j ==# j2 = SameRoot + | otherwise + = case getCommonNodeUFMData x y of + nd@(NodeUFMData j3 p3) + | j3 ==# j -> LeftRoot (decideSide (j ># j2)) + | j3 ==# j2 -> RightRoot (decideSide (j <# j2)) + | otherwise -> NewRoot nd (j ># j2) + where + decideSide :: Bool -> Side + decideSide True = Leftt + decideSide False = Rightt +\end{code} + +This might be better in Util.lhs ? + + +Now the bit twiddling functions. +\begin{code} +shiftL_ :: FastInt -> FastInt -> FastInt +shiftR_ :: FastInt -> FastInt -> FastInt + +#if __GLASGOW_HASKELL__ +{-# INLINE shiftL_ #-} +{-# INLINE shiftR_ #-} +#if __GLASGOW_HASKELL__ >= 503 +shiftL_ n p = word2Int#((int2Word# n) `uncheckedShiftL#` p) +#else +shiftL_ n p = word2Int#((int2Word# n) `shiftL#` p) +#endif +shiftR_ n p = word2Int#((int2Word# n) `shiftr` p) + where +#if __GLASGOW_HASKELL__ >= 503 + shiftr x y = uncheckedShiftRL# x y +#else + shiftr x y = shiftRL# x y +#endif + +#else /* not GHC */ +shiftL_ n p = n * (2 ^ p) +shiftR_ n p = n `quot` (2 ^ p) + +#endif /* not GHC */ +\end{code} + +\begin{code} +use_snd :: a -> b -> b +use_snd a b = b +\end{code} diff --git a/compiler/utils/UniqSet.lhs b/compiler/utils/UniqSet.lhs new file mode 100644 index 0000000000..129e333eb5 --- /dev/null +++ b/compiler/utils/UniqSet.lhs @@ -0,0 +1,138 @@ +% +% (c) The AQUA Project, Glasgow University, 1994-1998 +% +\section[UniqSet]{Specialised sets, for things with @Uniques@} + +Based on @UniqFMs@ (as you would expect). + +Basically, the things need to be in class @Uniquable@. + +\begin{code} +module UniqSet ( + UniqSet, -- abstract type: NOT + + mkUniqSet, uniqSetToList, emptyUniqSet, unitUniqSet, + addOneToUniqSet, addListToUniqSet, delOneFromUniqSet, delListFromUniqSet, + unionUniqSets, unionManyUniqSets, minusUniqSet, + elementOfUniqSet, mapUniqSet, intersectUniqSets, + isEmptyUniqSet, filterUniqSet, sizeUniqSet, foldUniqSet, + elemUniqSet_Directly, lookupUniqSet, hashUniqSet + ) where + +#include "HsVersions.h" + +import {-# SOURCE #-} Name ( Name ) + +import Maybes ( maybeToBool ) +import UniqFM +import Unique ( Unique, Uniquable(..) ) + +#if ! 
OMIT_NATIVE_CODEGEN +#define IF_NCG(a) a +#else +#define IF_NCG(a) {--} +#endif +\end{code} + +%************************************************************************ +%* * +\subsection{The @UniqSet@ type} +%* * +%************************************************************************ + +We use @UniqFM@, with a (@getUnique@-able) @Unique@ as ``key'' +and the thing itself as the ``value'' (for later retrieval). + +\begin{code} +--data UniqSet a = MkUniqSet (FiniteMap Unique a) : NOT + +type UniqSet a = UniqFM a +#define MkUniqSet {--} + +emptyUniqSet :: UniqSet a +emptyUniqSet = MkUniqSet emptyUFM + +unitUniqSet :: Uniquable a => a -> UniqSet a +unitUniqSet x = MkUniqSet (unitUFM x x) + +uniqSetToList :: UniqSet a -> [a] +uniqSetToList (MkUniqSet set) = eltsUFM set + +foldUniqSet :: (a -> b -> b) -> b -> UniqSet a -> b +foldUniqSet k z (MkUniqSet set) = foldUFM k z set + +mkUniqSet :: Uniquable a => [a] -> UniqSet a +mkUniqSet xs = MkUniqSet (listToUFM [ (x, x) | x <- xs]) + +addOneToUniqSet :: Uniquable a => UniqSet a -> a -> UniqSet a +addOneToUniqSet (MkUniqSet set) x = MkUniqSet (addToUFM set x x) + +delOneFromUniqSet :: Uniquable a => UniqSet a -> a -> UniqSet a +delOneFromUniqSet (MkUniqSet set) x = MkUniqSet (delFromUFM set x) + +delListFromUniqSet :: Uniquable a => UniqSet a -> [a] -> UniqSet a +delListFromUniqSet (MkUniqSet set) xs = MkUniqSet (delListFromUFM set xs) + +addListToUniqSet :: Uniquable a => UniqSet a -> [a] -> UniqSet a +addListToUniqSet (MkUniqSet set) xs = MkUniqSet (addListToUFM set [(x,x) | x<-xs]) + +unionUniqSets :: UniqSet a -> UniqSet a -> UniqSet a +unionUniqSets (MkUniqSet set1) (MkUniqSet set2) = MkUniqSet (plusUFM set1 set2) + +unionManyUniqSets :: [UniqSet a] -> UniqSet a + -- = foldr unionUniqSets emptyUniqSet ss +unionManyUniqSets [] = emptyUniqSet +unionManyUniqSets [s] = s +unionManyUniqSets (s:ss) = s `unionUniqSets` unionManyUniqSets ss + +minusUniqSet :: UniqSet a -> UniqSet a -> UniqSet a +minusUniqSet (MkUniqSet set1) (MkUniqSet set2) = MkUniqSet (minusUFM set1 set2) + +filterUniqSet :: (a -> Bool) -> UniqSet a -> UniqSet a +filterUniqSet pred (MkUniqSet set) = MkUniqSet (filterUFM pred set) + +intersectUniqSets :: UniqSet a -> UniqSet a -> UniqSet a +intersectUniqSets (MkUniqSet set1) (MkUniqSet set2) = MkUniqSet (intersectUFM set1 set2) + +elementOfUniqSet :: Uniquable a => a -> UniqSet a -> Bool +elementOfUniqSet x (MkUniqSet set) = maybeToBool (lookupUFM set x) + +lookupUniqSet :: Uniquable a => UniqSet a -> a -> Maybe a +lookupUniqSet (MkUniqSet set) x = lookupUFM set x + +elemUniqSet_Directly :: Unique -> UniqSet a -> Bool +elemUniqSet_Directly x (MkUniqSet set) = maybeToBool (lookupUFM_Directly set x) + +sizeUniqSet :: UniqSet a -> Int +sizeUniqSet (MkUniqSet set) = sizeUFM set + +hashUniqSet :: UniqSet a -> Int +hashUniqSet (MkUniqSet set) = hashUFM set + +isEmptyUniqSet :: UniqSet a -> Bool +isEmptyUniqSet (MkUniqSet set) = isNullUFM set {-SLOW: sizeUFM set == 0-} + +mapUniqSet :: (a -> a) -> UniqSet a -> UniqSet a + -- VERY IMPORTANT: *assumes* that the function doesn't change the unique +mapUniqSet f (MkUniqSet set) = MkUniqSet (mapUFM f set) +\end{code} + +\begin{code} +#if __GLASGOW_HASKELL__ +{-# SPECIALIZE + addOneToUniqSet :: UniqSet Unique -> Unique -> UniqSet Unique + #-} +{- SPECIALIZE + elementOfUniqSet :: Name -> UniqSet Name -> Bool + , Unique -> UniqSet Unique -> Bool + -} +{- SPECIALIZE + mkUniqSet :: [Name] -> UniqSet Name + -} + +{- SPECIALIZE + unitUniqSet :: Name -> UniqSet Name + , Unique -> UniqSet Unique + -} 
+#endif +\end{code} diff --git a/compiler/utils/Util.lhs b/compiler/utils/Util.lhs new file mode 100644 index 0000000000..e692ff1aa3 --- /dev/null +++ b/compiler/utils/Util.lhs @@ -0,0 +1,1029 @@ +% +% (c) The University of Glasgow 1992-2002 +% +\section[Util]{Highly random utility functions} + +\begin{code} +module Util ( + + -- general list processing + zipEqual, zipWithEqual, zipWith3Equal, zipWith4Equal, + zipLazy, stretchZipWith, + mapFst, mapSnd, + mapAndUnzip, mapAndUnzip3, + nOfThem, filterOut, + lengthExceeds, lengthIs, lengthAtLeast, listLengthCmp, atLength, + isSingleton, only, singleton, + notNull, snocView, + + isIn, isn'tIn, + + -- for-loop + nTimes, + + -- sorting + sortLe, sortWith, + + -- transitive closures + transitiveClosure, + + -- accumulating + mapAccumL, mapAccumR, mapAccumB, + foldl2, count, all2, + + takeList, dropList, splitAtList, split, + + -- comparisons + isEqual, eqListBy, equalLength, compareLength, + thenCmp, cmpList, prefixMatch, suffixMatch, maybePrefixMatch, + removeSpaces, + + -- strictness + foldl', seqList, + + -- pairs + unzipWith, + + global, consIORef, + + -- module names + looksLikeModuleName, + + toArgs, + + -- Floating point stuff + readRational, + + -- IO-ish utilities + createDirectoryHierarchy, + doesDirNameExist, + modificationTimeIfExists, + + later, handleDyn, handle, + + -- Filename utils + Suffix, + splitFilename, suffixOf, basenameOf, joinFileExt, + splitFilenameDir, joinFileName, + splitFilename3, + splitLongestPrefix, + replaceFilenameSuffix, directoryOf, filenameOf, + replaceFilenameDirectory, + escapeSpaces, isPathSeparator, + parseSearchPath, + normalisePath, platformPath, pgmPath, + ) where + +#include "HsVersions.h" + +import Panic ( panic, trace ) +import FastTypes + +import EXCEPTION ( Exception(..), finally, throwDyn, catchDyn, throw ) +import qualified EXCEPTION as Exception +import DYNAMIC ( Typeable ) +import DATA_IOREF ( IORef, newIORef ) +import UNSAFE_IO ( unsafePerformIO ) +import DATA_IOREF ( readIORef, writeIORef ) + +import qualified List ( elem, notElem ) + +#ifndef DEBUG +import List ( zipWith4 ) +#endif + +import Monad ( when ) +import IO ( catch, isDoesNotExistError ) +import Directory ( doesDirectoryExist, createDirectory ) +import Char ( isUpper, isAlphaNum, isSpace, ord, isDigit ) +import Ratio ( (%) ) +import Time ( ClockTime ) +import Directory ( getModificationTime ) + +infixr 9 `thenCmp` +\end{code} + +%************************************************************************ +%* * +\subsection{The Eager monad} +%* * +%************************************************************************ + +The @Eager@ monad is just an encoding of continuation-passing style, +used to allow you to express "do this and then that", mainly to avoid +space leaks. It's done with a type synonym to save bureaucracy. 
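A tiny editorial illustration (not part of the patch) of the CPS reading, written against the combinators below (note they are compiled out under NOT_USED); @eagerExample@ is a hypothetical name.

\begin{code}
-- Editorial sketch only: chain two "then"s and extract the result.
eagerExample :: Int
eagerExample =
  runEager (returnEager 1 `thenEager` \ x ->
            returnEager 2 `thenEager` \ y ->
            returnEager (x + y))
-- eagerExample == 3
\end{code}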
+ +\begin{code} +#if NOT_USED + +type Eager ans a = (a -> ans) -> ans + +runEager :: Eager a a -> a +runEager m = m (\x -> x) + +appEager :: Eager ans a -> (a -> ans) -> ans +appEager m cont = m cont + +thenEager :: Eager ans a -> (a -> Eager ans b) -> Eager ans b +thenEager m k cont = m (\r -> k r cont) + +returnEager :: a -> Eager ans a +returnEager v cont = cont v + +mapEager :: (a -> Eager ans b) -> [a] -> Eager ans [b] +mapEager f [] = returnEager [] +mapEager f (x:xs) = f x `thenEager` \ y -> + mapEager f xs `thenEager` \ ys -> + returnEager (y:ys) +#endif +\end{code} + +%************************************************************************ +%* * +\subsection{A for loop} +%* * +%************************************************************************ + +\begin{code} +-- Compose a function with itself n times. (nth rather than twice) +nTimes :: Int -> (a -> a) -> (a -> a) +nTimes 0 _ = id +nTimes 1 f = f +nTimes n f = f . nTimes (n-1) f +\end{code} + +%************************************************************************ +%* * +\subsection[Utils-lists]{General list processing} +%* * +%************************************************************************ + +\begin{code} +filterOut :: (a->Bool) -> [a] -> [a] +-- Like filter, only reverses the sense of the test +filterOut p [] = [] +filterOut p (x:xs) | p x = filterOut p xs + | otherwise = x : filterOut p xs +\end{code} + +A paranoid @zip@ (and some @zipWith@ friends) that checks the lists +are of equal length. Alastair Reid thinks this should only happen if +DEBUGging on; hey, why not? + +\begin{code} +zipEqual :: String -> [a] -> [b] -> [(a,b)] +zipWithEqual :: String -> (a->b->c) -> [a]->[b]->[c] +zipWith3Equal :: String -> (a->b->c->d) -> [a]->[b]->[c]->[d] +zipWith4Equal :: String -> (a->b->c->d->e) -> [a]->[b]->[c]->[d]->[e] + +#ifndef DEBUG +zipEqual _ = zip +zipWithEqual _ = zipWith +zipWith3Equal _ = zipWith3 +zipWith4Equal _ = zipWith4 +#else +zipEqual msg [] [] = [] +zipEqual msg (a:as) (b:bs) = (a,b) : zipEqual msg as bs +zipEqual msg as bs = panic ("zipEqual: unequal lists:"++msg) + +zipWithEqual msg z (a:as) (b:bs)= z a b : zipWithEqual msg z as bs +zipWithEqual msg _ [] [] = [] +zipWithEqual msg _ _ _ = panic ("zipWithEqual: unequal lists:"++msg) + +zipWith3Equal msg z (a:as) (b:bs) (c:cs) + = z a b c : zipWith3Equal msg z as bs cs +zipWith3Equal msg _ [] [] [] = [] +zipWith3Equal msg _ _ _ _ = panic ("zipWith3Equal: unequal lists:"++msg) + +zipWith4Equal msg z (a:as) (b:bs) (c:cs) (d:ds) + = z a b c d : zipWith4Equal msg z as bs cs ds +zipWith4Equal msg _ [] [] [] [] = [] +zipWith4Equal msg _ _ _ _ _ = panic ("zipWith4Equal: unequal lists:"++msg) +#endif +\end{code} + +\begin{code} +-- zipLazy is lazy in the second list (observe the ~) + +zipLazy :: [a] -> [b] -> [(a,b)] +zipLazy [] ys = [] +zipLazy (x:xs) ~(y:ys) = (x,y) : zipLazy xs ys +\end{code} + + +\begin{code} +stretchZipWith :: (a -> Bool) -> b -> (a->b->c) -> [a] -> [b] -> [c] +-- (stretchZipWith p z f xs ys) stretches ys by inserting z in +-- the places where p returns *True* + +stretchZipWith p z f [] ys = [] +stretchZipWith p z f (x:xs) ys + | p x = f x z : stretchZipWith p z f xs ys + | otherwise = case ys of + [] -> [] + (y:ys) -> f x y : stretchZipWith p z f xs ys +\end{code} + + +\begin{code} +mapFst :: (a->c) -> [(a,b)] -> [(c,b)] +mapSnd :: (b->c) -> [(a,b)] -> [(a,c)] + +mapFst f xys = [(f x, y) | (x,y) <- xys] +mapSnd f xys = [(x, f y) | (x,y) <- xys] + +mapAndUnzip :: (a -> (b, c)) -> [a] -> ([b], [c]) + +mapAndUnzip f [] = ([],[]) 
+mapAndUnzip f (x:xs) + = let + (r1, r2) = f x + (rs1, rs2) = mapAndUnzip f xs + in + (r1:rs1, r2:rs2) + +mapAndUnzip3 :: (a -> (b, c, d)) -> [a] -> ([b], [c], [d]) + +mapAndUnzip3 f [] = ([],[],[]) +mapAndUnzip3 f (x:xs) + = let + (r1, r2, r3) = f x + (rs1, rs2, rs3) = mapAndUnzip3 f xs + in + (r1:rs1, r2:rs2, r3:rs3) +\end{code} + +\begin{code} +nOfThem :: Int -> a -> [a] +nOfThem n thing = replicate n thing + +-- 'atLength atLen atEnd ls n' unravels list 'ls' to position 'n'; +-- specification: +-- +-- atLength atLenPred atEndPred ls n +-- | n < 0 = atLenPred n +-- | length ls < n = atEndPred (n - length ls) +-- | otherwise = atLenPred (drop n ls) +-- +atLength :: ([a] -> b) + -> (Int -> b) + -> [a] + -> Int + -> b +atLength atLenPred atEndPred ls n + | n < 0 = atEndPred n + | otherwise = go n ls + where + go n [] = atEndPred n + go 0 ls = atLenPred ls + go n (_:xs) = go (n-1) xs + +-- special cases. +lengthExceeds :: [a] -> Int -> Bool +-- (lengthExceeds xs n) = (length xs > n) +lengthExceeds = atLength notNull (const False) + +lengthAtLeast :: [a] -> Int -> Bool +lengthAtLeast = atLength notNull (== 0) + +lengthIs :: [a] -> Int -> Bool +lengthIs = atLength null (==0) + +listLengthCmp :: [a] -> Int -> Ordering +listLengthCmp = atLength atLen atEnd + where + atEnd 0 = EQ + atEnd x + | x > 0 = LT -- not yet seen 'n' elts, so list length is < n. + | otherwise = GT + + atLen [] = EQ + atLen _ = GT + +singleton :: a -> [a] +singleton x = [x] + +isSingleton :: [a] -> Bool +isSingleton [x] = True +isSingleton _ = False + +notNull :: [a] -> Bool +notNull [] = False +notNull _ = True + +snocView :: [a] -> Maybe ([a],a) + -- Split off the last element +snocView [] = Nothing +snocView xs = go [] xs + where + -- Invariant: second arg is non-empty + go acc [x] = Just (reverse acc, x) + go acc (x:xs) = go (x:acc) xs + +only :: [a] -> a +#ifdef DEBUG +only [a] = a +#else +only (a:_) = a +#endif +\end{code} + +Debugging/specialising versions of \tr{elem} and \tr{notElem} + +\begin{code} +isIn, isn'tIn :: (Eq a) => String -> a -> [a] -> Bool + +# ifndef DEBUG +isIn msg x ys = elem__ x ys +isn'tIn msg x ys = notElem__ x ys + +--these are here to be SPECIALIZEd (automagically) +elem__ _ [] = False +elem__ x (y:ys) = x==y || elem__ x ys + +notElem__ x [] = True +notElem__ x (y:ys) = x /= y && notElem__ x ys + +# else /* DEBUG */ +isIn msg x ys + = elem (_ILIT 0) x ys + where + elem i _ [] = False + elem i x (y:ys) + | i ># _ILIT 100 = trace ("Over-long elem in " ++ msg) $ + x `List.elem` (y:ys) + | otherwise = x == y || elem (i +# _ILIT(1)) x ys + +isn'tIn msg x ys + = notElem (_ILIT 0) x ys + where + notElem i x [] = True + notElem i x (y:ys) + | i ># _ILIT 100 = trace ("Over-long notElem in " ++ msg) $ + x `List.notElem` (y:ys) + | otherwise = x /= y && notElem (i +# _ILIT(1)) x ys +# endif /* DEBUG */ +\end{code} + +%************************************************************************ +%* * +\subsubsection[Utils-Carsten-mergesort]{A mergesort from Carsten} +%* * +%************************************************************************ + +\begin{display} +Date: Mon, 3 May 93 20:45:23 +0200 +From: Carsten Kehler Holst <kehler@cs.chalmers.se> +To: partain@dcs.gla.ac.uk +Subject: natural merge sort beats quick sort [ and it is prettier ] + +Here is a piece of Haskell code that I'm rather fond of. See it as an +attempt to get rid of the ridiculous quick-sort routine. 
group is +quite useful by itself I think it was John's idea originally though I +believe the lazy version is due to me [surprisingly complicated]. +gamma [used to be called] is called gamma because I got inspired by +the Gamma calculus. It is not very close to the calculus but does +behave less sequentially than both foldr and foldl. One could imagine +a version of gamma that took a unit element as well thereby avoiding +the problem with empty lists. + +I've tried this code against + + 1) insertion sort - as provided by haskell + 2) the normal implementation of quick sort + 3) a deforested version of quick sort due to Jan Sparud + 4) a super-optimized-quick-sort of Lennart's + +If the list is partially sorted both merge sort and in particular +natural merge sort wins. If the list is random [ average length of +rising subsequences = approx 2 ] mergesort still wins and natural +merge sort is marginally beaten by Lennart's soqs. The space +consumption of merge sort is a bit worse than Lennart's quick sort +approx a factor of 2. And a lot worse if Sparud's bug-fix [see his +fpca article ] isn't used because of group. + +have fun +Carsten +\end{display} + +\begin{code} +group :: (a -> a -> Bool) -> [a] -> [[a]] +-- Given a <= function, group finds maximal contiguous up-runs +-- or down-runs in the input list. +-- It's stable, in the sense that it never re-orders equal elements +-- +-- Date: Mon, 12 Feb 1996 15:09:41 +0000 +-- From: Andy Gill <andy@dcs.gla.ac.uk> +-- Here is a `better' definition of group. + +group p [] = [] +group p (x:xs) = group' xs x x (x :) + where + group' [] _ _ s = [s []] + group' (x:xs) x_min x_max s + | x_max `p` x = group' xs x_min x (s . (x :)) + | not (x_min `p` x) = group' xs x x_max ((x :) . s) + | otherwise = s [] : group' xs x x (x :) + -- NB: the 'not' is essential for stablity + -- x `p` x_min would reverse equal elements + +generalMerge :: (a -> a -> Bool) -> [a] -> [a] -> [a] +generalMerge p xs [] = xs +generalMerge p [] ys = ys +generalMerge p (x:xs) (y:ys) | x `p` y = x : generalMerge p xs (y:ys) + | otherwise = y : generalMerge p (x:xs) ys + +-- gamma is now called balancedFold + +balancedFold :: (a -> a -> a) -> [a] -> a +balancedFold f [] = error "can't reduce an empty list using balancedFold" +balancedFold f [x] = x +balancedFold f l = balancedFold f (balancedFold' f l) + +balancedFold' :: (a -> a -> a) -> [a] -> [a] +balancedFold' f (x:y:xs) = f x y : balancedFold' f xs +balancedFold' f xs = xs + +generalNaturalMergeSort p [] = [] +generalNaturalMergeSort p xs = (balancedFold (generalMerge p) . group p) xs + +#if NOT_USED +generalMergeSort p [] = [] +generalMergeSort p xs = (balancedFold (generalMerge p) . map (: [])) xs + +mergeSort, naturalMergeSort :: Ord a => [a] -> [a] + +mergeSort = generalMergeSort (<=) +naturalMergeSort = generalNaturalMergeSort (<=) + +mergeSortLe le = generalMergeSort le +#endif + +sortLe :: (a->a->Bool) -> [a] -> [a] +sortLe le = generalNaturalMergeSort le + +sortWith :: Ord b => (a->b) -> [a] -> [a] +sortWith get_key xs = sortLe le xs + where + x `le` y = get_key x < get_key y +\end{code} + +%************************************************************************ +%* * +\subsection[Utils-transitive-closure]{Transitive closure} +%* * +%************************************************************************ + +This algorithm for transitive closure is straightforward, albeit quadratic. 
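A small editorial example (not part of the patch) of the function defined below, computing reachability in a tiny successor graph; @reachableFrom@ is a hypothetical helper.

\begin{code}
-- Editorial sketch only.
reachableFrom :: Int -> [Int]
reachableFrom x = transitiveClosure succs (==) [x]
  where
    succs 1 = [2, 3]
    succs 2 = [3]
    succs _ = []
-- reachableFrom 1 == [3,2,1]  (most recently discovered node first)
\end{code}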
+ +\begin{code} +transitiveClosure :: (a -> [a]) -- Successor function + -> (a -> a -> Bool) -- Equality predicate + -> [a] + -> [a] -- The transitive closure + +transitiveClosure succ eq xs + = go [] xs + where + go done [] = done + go done (x:xs) | x `is_in` done = go done xs + | otherwise = go (x:done) (succ x ++ xs) + + x `is_in` [] = False + x `is_in` (y:ys) | eq x y = True + | otherwise = x `is_in` ys +\end{code} + +%************************************************************************ +%* * +\subsection[Utils-accum]{Accumulating} +%* * +%************************************************************************ + +@mapAccumL@ behaves like a combination +of @map@ and @foldl@; +it applies a function to each element of a list, passing an accumulating +parameter from left to right, and returning a final value of this +accumulator together with the new list. + +\begin{code} +mapAccumL :: (acc -> x -> (acc, y)) -- Function of elt of input list + -- and accumulator, returning new + -- accumulator and elt of result list + -> acc -- Initial accumulator + -> [x] -- Input list + -> (acc, [y]) -- Final accumulator and result list + +mapAccumL f b [] = (b, []) +mapAccumL f b (x:xs) = (b'', x':xs') where + (b', x') = f b x + (b'', xs') = mapAccumL f b' xs +\end{code} + +@mapAccumR@ does the same, but working from right to left instead. Its type is +the same as @mapAccumL@, though. + +\begin{code} +mapAccumR :: (acc -> x -> (acc, y)) -- Function of elt of input list + -- and accumulator, returning new + -- accumulator and elt of result list + -> acc -- Initial accumulator + -> [x] -- Input list + -> (acc, [y]) -- Final accumulator and result list + +mapAccumR f b [] = (b, []) +mapAccumR f b (x:xs) = (b'', x':xs') where + (b'', x') = f b' x + (b', xs') = mapAccumR f b xs +\end{code} + +Here is the bi-directional version, that works from both left and right. + +\begin{code} +mapAccumB :: (accl -> accr -> x -> (accl, accr,y)) + -- Function of elt of input list + -- and accumulator, returning new + -- accumulator and elt of result list + -> accl -- Initial accumulator from left + -> accr -- Initial accumulator from right + -> [x] -- Input list + -> (accl, accr, [y]) -- Final accumulators and result list + +mapAccumB f a b [] = (a,b,[]) +mapAccumB f a b (x:xs) = (a'',b'',y:ys) + where + (a',b'',y) = f a b' x + (a'',b',ys) = mapAccumB f a' b xs +\end{code} + +A strict version of foldl. + +\begin{code} +foldl' :: (a -> b -> a) -> a -> [b] -> a +foldl' f z xs = lgo z xs + where + lgo z [] = z + lgo z (x:xs) = (lgo $! (f z x)) xs +\end{code} + +A combination of foldl with zip. It works with equal length lists. 
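Two brief editorial examples (not part of the patch): one of @mapAccumL@ above, one of the @foldl2@ defined below; @numberElems@ and @dotProduct@ are hypothetical helpers.

\begin{code}
-- Editorial sketch only.
-- Number the elements of a list, threading a counter left to right:
numberElems :: [a] -> (Int, [(Int, a)])
numberElems = mapAccumL (\ n x -> (n + 1, (n, x))) 0
-- numberElems "ab" == (2, [(0,'a'), (1,'b')])

-- A dot product, folding over two equal-length lists at once:
dotProduct :: [Int] -> [Int] -> Int
dotProduct = foldl2 (\ acc x y -> acc + x * y) 0
-- dotProduct [1,2,3] [4,5,6] == 32
\end{code}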
+ +\begin{code} +foldl2 :: (acc -> a -> b -> acc) -> acc -> [a] -> [b] -> acc +foldl2 k z [] [] = z +foldl2 k z (a:as) (b:bs) = foldl2 k (k z a b) as bs + +all2 :: (a -> b -> Bool) -> [a] -> [b] -> Bool +-- True if the lists are the same length, and +-- all corresponding elements satisfy the predicate +all2 p [] [] = True +all2 p (x:xs) (y:ys) = p x y && all2 p xs ys +all2 p xs ys = False +\end{code} + +Count the number of times a predicate is true + +\begin{code} +count :: (a -> Bool) -> [a] -> Int +count p [] = 0 +count p (x:xs) | p x = 1 + count p xs + | otherwise = count p xs +\end{code} + +@splitAt@, @take@, and @drop@ but with length of another +list giving the break-off point: + +\begin{code} +takeList :: [b] -> [a] -> [a] +takeList [] _ = [] +takeList (_:xs) ls = + case ls of + [] -> [] + (y:ys) -> y : takeList xs ys + +dropList :: [b] -> [a] -> [a] +dropList [] xs = xs +dropList _ xs@[] = xs +dropList (_:xs) (_:ys) = dropList xs ys + + +splitAtList :: [b] -> [a] -> ([a], [a]) +splitAtList [] xs = ([], xs) +splitAtList _ xs@[] = (xs, xs) +splitAtList (_:xs) (y:ys) = (y:ys', ys'') + where + (ys', ys'') = splitAtList xs ys + +split :: Char -> String -> [String] +split c s = case rest of + [] -> [chunk] + _:rest -> chunk : split c rest + where (chunk, rest) = break (==c) s +\end{code} + + +%************************************************************************ +%* * +\subsection[Utils-comparison]{Comparisons} +%* * +%************************************************************************ + +\begin{code} +isEqual :: Ordering -> Bool +-- Often used in (isEqual (a `compare` b)) +isEqual GT = False +isEqual EQ = True +isEqual LT = False + +thenCmp :: Ordering -> Ordering -> Ordering +{-# INLINE thenCmp #-} +thenCmp EQ any = any +thenCmp other any = other + +eqListBy :: (a->a->Bool) -> [a] -> [a] -> Bool +eqListBy eq [] [] = True +eqListBy eq (x:xs) (y:ys) = eq x y && eqListBy eq xs ys +eqListBy eq xs ys = False + +equalLength :: [a] -> [b] -> Bool +equalLength [] [] = True +equalLength (_:xs) (_:ys) = equalLength xs ys +equalLength xs ys = False + +compareLength :: [a] -> [b] -> Ordering +compareLength [] [] = EQ +compareLength (_:xs) (_:ys) = compareLength xs ys +compareLength [] _ys = LT +compareLength _xs [] = GT + +cmpList :: (a -> a -> Ordering) -> [a] -> [a] -> Ordering + -- `cmpList' uses a user-specified comparer + +cmpList cmp [] [] = EQ +cmpList cmp [] _ = LT +cmpList cmp _ [] = GT +cmpList cmp (a:as) (b:bs) + = case cmp a b of { EQ -> cmpList cmp as bs; xxx -> xxx } +\end{code} + +\begin{code} +prefixMatch :: Eq a => [a] -> [a] -> Bool +prefixMatch [] _str = True +prefixMatch _pat [] = False +prefixMatch (p:ps) (s:ss) | p == s = prefixMatch ps ss + | otherwise = False + +maybePrefixMatch :: String -> String -> Maybe String +maybePrefixMatch [] rest = Just rest +maybePrefixMatch (_:_) [] = Nothing +maybePrefixMatch (p:pat) (r:rest) + | p == r = maybePrefixMatch pat rest + | otherwise = Nothing + +suffixMatch :: Eq a => [a] -> [a] -> Bool +suffixMatch pat str = prefixMatch (reverse pat) (reverse str) + +removeSpaces :: String -> String +removeSpaces = reverse . dropWhile isSpace . reverse . dropWhile isSpace +\end{code} + +%************************************************************************ +%* * +\subsection[Utils-pairs]{Pairs} +%* * +%************************************************************************ + +The following are curried versions of @fst@ and @snd@. 
+ +\begin{code} +#if NOT_USED +cfst :: a -> b -> a -- stranal-sem only (Note) +cfst x y = x +#endif +\end{code} + +The following provide us higher order functions that, when applied +to a function, operate on pairs. + +\begin{code} +#if NOT_USED +applyToPair :: ((a -> c),(b -> d)) -> (a,b) -> (c,d) +applyToPair (f,g) (x,y) = (f x, g y) + +applyToFst :: (a -> c) -> (a,b)-> (c,b) +applyToFst f (x,y) = (f x,y) + +applyToSnd :: (b -> d) -> (a,b) -> (a,d) +applyToSnd f (x,y) = (x,f y) +#endif +\end{code} + +\begin{code} +unzipWith :: (a -> b -> c) -> [(a, b)] -> [c] +unzipWith f pairs = map ( \ (a, b) -> f a b ) pairs +\end{code} + +\begin{code} +seqList :: [a] -> b -> b +seqList [] b = b +seqList (x:xs) b = x `seq` seqList xs b +\end{code} + +Global variables: + +\begin{code} +global :: a -> IORef a +global a = unsafePerformIO (newIORef a) +\end{code} + +\begin{code} +consIORef :: IORef [a] -> a -> IO () +consIORef var x = do + xs <- readIORef var + writeIORef var (x:xs) +\end{code} + +Module names: + +\begin{code} +looksLikeModuleName [] = False +looksLikeModuleName (c:cs) = isUpper c && go cs + where go [] = True + go ('.':cs) = looksLikeModuleName cs + go (c:cs) = (isAlphaNum c || c == '_') && go cs +\end{code} + +Akin to @Prelude.words@, but sensitive to dquoted entities treating +them as single words. + +\begin{code} +toArgs :: String -> [String] +toArgs "" = [] +toArgs s = + case break (\ ch -> isSpace ch || ch == '"') (dropWhile isSpace s) of -- " + (w,aft) -> + (\ ws -> if null w then ws else w : ws) $ + case aft of + [] -> [] + (x:xs) + | x /= '"' -> toArgs xs + | otherwise -> + case lex aft of + ((str,rs):_) -> stripQuotes str : toArgs rs + _ -> [aft] + where + -- strip away dquotes; assume first and last chars contain quotes. + stripQuotes :: String -> String + stripQuotes ('"':xs) = init xs + stripQuotes xs = xs +\end{code} + +-- ----------------------------------------------------------------------------- +-- Floats + +\begin{code} +readRational__ :: ReadS Rational -- NB: doesn't handle leading "-" +readRational__ r = do + (n,d,s) <- readFix r + (k,t) <- readExp s + return ((n%1)*10^^(k-d), t) + where + readFix r = do + (ds,s) <- lexDecDigits r + (ds',t) <- lexDotDigits s + return (read (ds++ds'), length ds', t) + + readExp (e:s) | e `elem` "eE" = readExp' s + readExp s = return (0,s) + + readExp' ('+':s) = readDec s + readExp' ('-':s) = do + (k,t) <- readDec s + return (-k,t) + readExp' s = readDec s + + readDec s = do + (ds,r) <- nonnull isDigit s + return (foldl1 (\n d -> n * 10 + d) [ ord d - ord '0' | d <- ds ], + r) + + lexDecDigits = nonnull isDigit + + lexDotDigits ('.':s) = return (span isDigit s) + lexDotDigits s = return ("",s) + + nonnull p s = do (cs@(_:_),t) <- return (span p s) + return (cs,t) + +readRational :: String -> Rational -- NB: *does* handle a leading "-" +readRational top_s + = case top_s of + '-' : xs -> - (read_me xs) + xs -> read_me xs + where + read_me s + = case (do { (x,"") <- readRational__ s ; return x }) of + [x] -> x + [] -> error ("readRational: no parse:" ++ top_s) + _ -> error ("readRational: ambiguous parse:" ++ top_s) + + +----------------------------------------------------------------------------- +-- Create a hierarchy of directories + +createDirectoryHierarchy :: FilePath -> IO () +createDirectoryHierarchy dir = do + b <- doesDirectoryExist dir + when (not b) $ do + createDirectoryHierarchy (directoryOf dir) + createDirectory dir + +----------------------------------------------------------------------------- +-- Verify that the 
'dirname' portion of a FilePath exists.
+--
+doesDirNameExist :: FilePath -> IO Bool
+doesDirNameExist fpath = doesDirectoryExist (directoryOf fpath)
+
+-- -----------------------------------------------------------------------------
+-- Exception utils
+
+-- 'later' is 'finally' with the arguments flipped
+later = flip finally
+
+handleDyn :: Typeable ex => (ex -> IO a) -> IO a -> IO a
+handleDyn = flip catchDyn
+
+handle :: (Exception -> IO a) -> IO a -> IO a
+#if __GLASGOW_HASKELL__ < 501
+handle = flip Exception.catchAllIO
+#else
+handle h f = f `Exception.catch` \e -> case e of
+    ExitException _ -> throw e
+    _               -> h e
+#endif
+
+-- --------------------------------------------------------------
+-- check existence & modification time at the same time
+
+modificationTimeIfExists :: FilePath -> IO (Maybe ClockTime)
+modificationTimeIfExists f = do
+    (do t <- getModificationTime f; return (Just t))
+        `IO.catch` \e -> if isDoesNotExistError e
+                         then return Nothing
+                         else ioError e
+
+-- --------------------------------------------------------------
+-- Filename manipulation
+
+-- Filenames are kept "normalised" inside GHC, using '/' as the path
+-- separator. On Windows these functions will also recognise '\\' as
+-- the path separator, but will generally construct paths using '/'.
+
+type Suffix = String
+
+splitFilename :: String -> (String,Suffix)
+splitFilename f = splitLongestPrefix f (=='.')
+
+basenameOf :: FilePath -> String
+basenameOf = fst . splitFilename
+
+suffixOf :: FilePath -> Suffix
+suffixOf = snd . splitFilename
+
+joinFileExt :: String -> String -> FilePath
+joinFileExt path "" = path
+joinFileExt path ext = path ++ '.':ext
+
+-- "foo/bar/xyzzy.ext" -> ("foo/bar", "xyzzy.ext")
+splitFilenameDir :: String -> (String,String)
+splitFilenameDir str
+  = let (dir, rest) = splitLongestPrefix str isPathSeparator
+        (dir', rest') | null rest = (".", dir)
+                      | otherwise = (dir, rest)
+    in (dir', rest')
+
+-- "foo/bar/xyzzy.ext" -> ("foo/bar", "xyzzy", "ext")
+-- (note that the suffix comes back without its leading '.';
+-- joinFileExt puts the '.' back on)
+splitFilename3 :: String -> (String,String,Suffix)
+splitFilename3 str
+  = let (dir, rest) = splitFilenameDir str
+        (name, ext) = splitFilename rest
+    in (dir, name, ext)
+
+joinFileName :: String -> String -> FilePath
+joinFileName "" fname = fname
+joinFileName "." fname = fname
+joinFileName dir "" = dir
+joinFileName dir fname = dir ++ '/':fname
+
+-- split a string at the last character where 'pred' is True,
+-- returning a pair of strings. The first component holds the string
+-- up to (but not including) the last character for which 'pred' returned
+-- True; the second holds whatever comes after (but also not including
+-- that last character).
+--
+-- If 'pred' returns False for all characters in the string, the original
+-- string is returned in the first component (and the second one is just
+-- empty).
+splitLongestPrefix :: String -> (Char -> Bool) -> (String,String)
+splitLongestPrefix str pred
+  | null r_pre = (str, [])
+  | otherwise  = (reverse (tail r_pre), reverse r_suf)
+        -- 'tail' drops the char satisfying 'pred'
+  where
+    (r_suf, r_pre) = break pred (reverse str)
+
+replaceFilenameSuffix :: FilePath -> Suffix -> FilePath
+replaceFilenameSuffix file suf = basenameOf file `joinFileExt` suf
+
+-- directoryOf strips the filename off the input string, returning
+-- the directory.
+directoryOf :: FilePath -> String
+directoryOf = fst . splitFilenameDir
+
+-- filenameOf strips the directory off the input string, returning
+-- the filename.
+filenameOf :: FilePath -> String
+filenameOf = snd . splitFilenameDir
+
+replaceFilenameDirectory :: FilePath -> String -> FilePath
+replaceFilenameDirectory path dir = dir `joinFileName` filenameOf path
+
+escapeSpaces :: String -> String
+escapeSpaces = foldr (\c s -> if isSpace c then '\\':c:s else c:s) ""
+
+isPathSeparator :: Char -> Bool
+isPathSeparator ch =
+#ifdef mingw32_TARGET_OS
+  ch == '/' || ch == '\\'
+#else
+  ch == '/'
+#endif
+
+--------------------------------------------------------------
+-- * Search path
+--------------------------------------------------------------
+
+-- | Splits the given string into substrings, using 'searchPathSeparator'
+-- as the delimiter.
+parseSearchPath :: String -> [FilePath]
+parseSearchPath path = split path
+  where
+    split :: String -> [String]
+    split s =
+      case rest' of
+        []     -> [chunk]
+        _:rest -> chunk : split rest
+      where
+        chunk =
+          case chunk' of
+#ifdef mingw32_HOST_OS
+            ('\"':xs@(_:_)) | last xs == '\"' -> init xs
+#endif
+            _ -> chunk'
+
+        (chunk', rest') = break (==searchPathSeparator) s
+
+-- | A platform-specific character used to separate search path strings in
+-- environment variables. The separator is a colon (\":\") on Unix and
+-- Macintosh, and a semicolon (\";\") on the Windows operating system.
+searchPathSeparator :: Char
+#if mingw32_HOST_OS || mingw32_TARGET_OS
+searchPathSeparator = ';'
+#else
+searchPathSeparator = ':'
+#endif
+
+-----------------------------------------------------------------------------
+-- Convert a filepath into platform / MSDOS form.
+
+-- We maintain path names in Unix form ('/'-separated) right until
+-- the last moment. On Windows we dos-ify them just before passing them
+-- to the Windows command.
+--
+-- The alternative, of using '/' consistently on Unix and '\' on Windows,
+-- proved quite awkward. There were a lot more calls to platformPath,
+-- and even on Windows we might invoke a unix-like utility (eg 'sh'), which
+-- interpreted a command line 'foo\baz' as 'foobaz'.
+
+normalisePath :: String -> String
+-- Just changes '\' to '/'
+
+pgmPath :: String -- Directory string in Unix format
+        -> String -- Program name with no directory separators
+                  --  (e.g. copy /y)
+        -> String -- Program invocation string in native format
+
+#if defined(mingw32_HOST_OS)
+--------------------- Windows version ------------------
+normalisePath xs = subst '\\' '/' xs
+pgmPath dir pgm = platformPath dir ++ '\\' : pgm
+platformPath p = subst '/' '\\' p
+
+subst a b ls = map (\ x -> if x == a then b else x) ls
+#else
+--------------------- Non-Windows version --------------
+normalisePath xs = xs
+pgmPath dir pgm = dir ++ '/' : pgm
+platformPath stuff = stuff
+--------------------------------------------------------
+#endif
+\end{code}
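+
+To round off the filename utilities, here is a hypothetical sketch of the
+behaviour one can expect from them (guarded with NOT_USED, so it is not
+compiled; the name @filenameExamples@ is invented, and the expected results
+assume the Unix settings above):
+
+\begin{code}
+#if NOT_USED
+-- Illustrative only:
+--   splitFilename3 "foo/bar/xyzzy.ext"      == ("foo/bar", "xyzzy", "ext")
+--   replaceFilenameSuffix "foo/bar.hs" "hi" == "foo/bar.hi"
+--   parseSearchPath "dir1:dir2:dir3"        == ["dir1", "dir2", "dir3"]
+filenameExamples :: Bool
+filenameExamples =
+       splitFilename3 "foo/bar/xyzzy.ext" == ("foo/bar", "xyzzy", "ext")
+    && replaceFilenameSuffix "foo/bar.hs" "hi" == "foo/bar.hi"
+    && parseSearchPath "dir1:dir2:dir3" == ["dir1", "dir2", "dir3"]
+#endif
+\end{code}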