Backport commit from upstream LLVM:

r219009 [ISel] Keep matching state consistent when folding during X86 address match from Brad (maintainer)
2015-08-24 07:45:56 +00:00 · 2015-08-24 07:45:56 +00:00 · 4413e0aa9e
commit 4413e0aa9e
parent 086b280429
4 changed files with 228 additions and 2 deletions
--- a/devel/llvm/Makefile
+++ b/devel/llvm/Makefile
@ -1,4 +1,4 @@
-# $OpenBSD: Makefile,v 1.107 2015/06/15 06:20:48 ajacoutot Exp $
+# $OpenBSD: Makefile,v 1.108 2015/08/24 07:45:56 ajacoutot Exp $

 # XXX: Remember to bump MODCLANG_VERSION in lang/clang/clang.port.mk when
 # updating this port.
@ -10,7 +10,7 @@ COMMENT =	modular, fast C/C++/ObjC compiler, static analyzer and tools

 LLVM_V =	3.5
 DISTNAME =	llvm-${LLVM_V}.20140228
-REVISION =	34
+REVISION =	35
 CATEGORIES =	devel
 MASTER_SITES =	http://comstyle.com/source/
 EXTRACT_SUFX =	.tar.xz
--- a/devel/llvm/patches/patch-include_llvm_CodeGen_SelectionDAGISel_h
+++ b/devel/llvm/patches/patch-include_llvm_CodeGen_SelectionDAGISel_h
@ -0,0 +1,59 @@
+$OpenBSD: patch-include_llvm_CodeGen_SelectionDAGISel_h,v 1.1 2015/08/24 07:45:56 ajacoutot Exp $
+
+r219009
+[ISel] Keep matching state consistent when folding during X86 address match
+
+In the X86 backend, matching an address is initiated by the 'addr' complex
+pattern and its friends.  During this process we may reassociate and-of-shift
+into shift-of-and (FoldMaskedShiftToScaledMask) to allow folding of the
+shift into the scale of the address.
+
+However as demonstrated by the testcase, this can trigger CSE of not only the
+shift and the AND which the code is prepared for but also the underlying load
+node.  In the testcase this node is sitting in the RecordedNode and MatchScope
+data structures of the matcher and becomes a deleted node upon CSE.  Returning
+from the complex pattern function, we try to access it again hitting an assert
+because the node is no longer a load even though this was checked before.
+
+Now obviously changing the DAG this late is bending the rules but I think it
+makes sense somewhat.  Outside of addresses we prefer and-of-shift because it
+may lead to smaller immediates (FoldMaskAndShiftToScale is an even better
+example because it create a non-canonical node).  We currently don't recognize
+addresses during DAGCombiner where arguably this canonicalization should be
+performed.  On the other hand, having this in the matcher allows us to cover
+all the cases where an address can be used in an instruction.
+
+I've also talked a little bit to Dan Gohman on llvm-dev who added the RAUW for
+the new shift node in FoldMaskedShiftToScaledMask.  This RAUW is responsible
+for initiating the recursive CSE on users
+(http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/076903.html) but it
+is not strictly necessary since the shift is hooked into the visited user.  Of
+course it's safer to keep the DAG consistent at all times (e.g. for accurate
+number of uses, etc.).
+
+So rather than changing the fundamentals, I've decided to continue along the
+previous patches and detect the CSE.  This patch installs a very targeted
+DAGUpdateListener for the duration of a complex-pattern match and updates the
+matching state accordingly.  (Previous patches used HandleSDNode to detect the
+CSE but that's not practical here).  The listener is only installed on X86.
+
+I tested that there is no measurable overhead due to this while running
+through the spec2k BC files with llc.  The only thing we pay for is the
+creation of the listener.  The callback never ever triggers in spec2k since
+this is a corner case.
+
+--- include/llvm/CodeGen/SelectionDAGISel.h.orig	Tue Aug  4 22:44:44 2015
+++ include/llvm/CodeGen/SelectionDAGISel.h	Tue Aug  4 22:46:22 2015
+@@ -238,6 +238,12 @@ class SelectionDAGISel : public MachineFunctionPass { 
+                            const unsigned char *MatcherTable,
+                            unsigned TableSize);
+ 
+  /// \brief Return true if complex patterns for this target can mutate the
+  /// DAG.
+  virtual bool ComplexPatternFuncMutatesDAG() const {
+    return false;
+  }
+
+ private:
+ 
+   // Calls to these functions are generated by tblgen.
--- a/devel/llvm/patches/patch-lib_CodeGen_SelectionDAG_SelectionDAGISel_cpp
+++ b/devel/llvm/patches/patch-lib_CodeGen_SelectionDAG_SelectionDAGISel_cpp
@ -0,0 +1,107 @@
+$OpenBSD: patch-lib_CodeGen_SelectionDAG_SelectionDAGISel_cpp,v 1.1 2015/08/24 07:45:56 ajacoutot Exp $
+
+r219009
+[ISel] Keep matching state consistent when folding during X86 address match
+
+In the X86 backend, matching an address is initiated by the 'addr' complex
+pattern and its friends.  During this process we may reassociate and-of-shift
+into shift-of-and (FoldMaskedShiftToScaledMask) to allow folding of the
+shift into the scale of the address.
+
+However as demonstrated by the testcase, this can trigger CSE of not only the
+shift and the AND which the code is prepared for but also the underlying load
+node.  In the testcase this node is sitting in the RecordedNode and MatchScope
+data structures of the matcher and becomes a deleted node upon CSE.  Returning
+from the complex pattern function, we try to access it again hitting an assert
+because the node is no longer a load even though this was checked before.
+
+Now obviously changing the DAG this late is bending the rules but I think it
+makes sense somewhat.  Outside of addresses we prefer and-of-shift because it
+may lead to smaller immediates (FoldMaskAndShiftToScale is an even better
+example because it create a non-canonical node).  We currently don't recognize
+addresses during DAGCombiner where arguably this canonicalization should be
+performed.  On the other hand, having this in the matcher allows us to cover
+all the cases where an address can be used in an instruction.
+
+I've also talked a little bit to Dan Gohman on llvm-dev who added the RAUW for
+the new shift node in FoldMaskedShiftToScaledMask.  This RAUW is responsible
+for initiating the recursive CSE on users
+(http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/076903.html) but it
+is not strictly necessary since the shift is hooked into the visited user.  Of
+course it's safer to keep the DAG consistent at all times (e.g. for accurate
+number of uses, etc.).
+
+So rather than changing the fundamentals, I've decided to continue along the
+previous patches and detect the CSE.  This patch installs a very targeted
+DAGUpdateListener for the duration of a complex-pattern match and updates the
+matching state accordingly.  (Previous patches used HandleSDNode to detect the
+CSE but that's not practical here).  The listener is only installed on X86.
+
+I tested that there is no measurable overhead due to this while running
+through the spec2k BC files with llc.  The only thing we pay for is the
+creation of the listener.  The callback never ever triggers in spec2k since
+this is a corner case.
+
+--- lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp.orig	Tue Aug  4 22:47:10 2015
+++ lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp	Tue Aug  4 22:52:46 2015
+@@ -2363,6 +2363,45 @@ struct MatchScope {
+   bool HasChainNodesMatched, HasGlueResultNodesMatched;
+ };
+ 
+/// \\brief A DAG update listener to keep the matching state
+/// (i.e. RecordedNodes and MatchScope) uptodate if the target is allowed to
+/// change the DAG while matching.  X86 addressing mode matcher is an example
+/// for this.
+class MatchStateUpdater : public SelectionDAG::DAGUpdateListener
+{
+      SmallVectorImpl<std::pair<SDValue, SDNode*> > &RecordedNodes;
+      SmallVectorImpl<MatchScope> &MatchScopes;
+public:
+  MatchStateUpdater(SelectionDAG &DAG,
+                    SmallVectorImpl<std::pair<SDValue, SDNode*> > &RN,
+                    SmallVectorImpl<MatchScope> &MS) :
+    SelectionDAG::DAGUpdateListener(DAG),
+    RecordedNodes(RN), MatchScopes(MS) { }
+
+  void NodeDeleted(SDNode *N, SDNode *E) {
+    // Some early-returns here to avoid the search if we deleted the node or
+    // if the update comes from MorphNodeTo (MorphNodeTo is the last thing we
+    // do, so it's unnecessary to update matching state at that point).
+    // Neither of these can occur currently because we only install this
+    // update listener during matching a complex patterns.
+    if (!E || E->isMachineOpcode())
+      return;
+    // Performing linear search here does not matter because we almost never
+    // run this code.  You'd have to have a CSE during complex pattern
+    // matching.
+    for (SmallVectorImpl<std::pair<SDValue, SDNode*> >::iterator I =
+         RecordedNodes.begin(), IE = RecordedNodes.end(); I != IE; ++I)
+      if (I->first.getNode() == N)
+        I->first.setNode(E);
+
+    for (SmallVectorImpl<MatchScope>::iterator I = MatchScopes.begin(),
+         IE = MatchScopes.end(); I != IE; ++I)
+      for (SmallVector<SDValue, 4>::iterator J = I->NodeStack.begin(),
+           JE = I->NodeStack.end(); J != JE; ++J)
+        if (J->getNode() == N)
+          J->setNode(E);
+  }
+};
+ }
+ 
+ SDNode *SelectionDAGISel::
+@@ -2617,6 +2656,14 @@ SelectCodeCommon(SDNode *NodeToMatch, const unsigned c
+       unsigned CPNum = MatcherTable[MatcherIndex++];
+       unsigned RecNo = MatcherTable[MatcherIndex++];
+       assert(RecNo < RecordedNodes.size() && "Invalid CheckComplexPat");
+
+      // If target can modify DAG during matching, keep the matching state
+      // consistent.
+      OwningPtr<MatchStateUpdater> MSU;
+      if (ComplexPatternFuncMutatesDAG())
+        MSU.reset(new MatchStateUpdater(*CurDAG, RecordedNodes,
+                                        MatchScopes));
+
+       if (!CheckComplexPattern(NodeToMatch, RecordedNodes[RecNo].second,
+                                RecordedNodes[RecNo].first, CPNum,
+                                RecordedNodes))
--- a/devel/llvm/patches/patch-lib_Target_X86_X86ISelDAGToDAG_cpp
+++ b/devel/llvm/patches/patch-lib_Target_X86_X86ISelDAGToDAG_cpp
@ -0,0 +1,60 @@
+$OpenBSD: patch-lib_Target_X86_X86ISelDAGToDAG_cpp,v 1.1 2015/08/24 07:45:56 ajacoutot Exp $
+
+r219009
+[ISel] Keep matching state consistent when folding during X86 address match
+
+In the X86 backend, matching an address is initiated by the 'addr' complex
+pattern and its friends.  During this process we may reassociate and-of-shift
+into shift-of-and (FoldMaskedShiftToScaledMask) to allow folding of the
+shift into the scale of the address.
+
+However as demonstrated by the testcase, this can trigger CSE of not only the
+shift and the AND which the code is prepared for but also the underlying load
+node.  In the testcase this node is sitting in the RecordedNode and MatchScope
+data structures of the matcher and becomes a deleted node upon CSE.  Returning
+from the complex pattern function, we try to access it again hitting an assert
+because the node is no longer a load even though this was checked before.
+
+Now obviously changing the DAG this late is bending the rules but I think it
+makes sense somewhat.  Outside of addresses we prefer and-of-shift because it
+may lead to smaller immediates (FoldMaskAndShiftToScale is an even better
+example because it create a non-canonical node).  We currently don't recognize
+addresses during DAGCombiner where arguably this canonicalization should be
+performed.  On the other hand, having this in the matcher allows us to cover
+all the cases where an address can be used in an instruction.
+
+I've also talked a little bit to Dan Gohman on llvm-dev who added the RAUW for
+the new shift node in FoldMaskedShiftToScaledMask.  This RAUW is responsible
+for initiating the recursive CSE on users
+(http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/076903.html) but it
+is not strictly necessary since the shift is hooked into the visited user.  Of
+course it's safer to keep the DAG consistent at all times (e.g. for accurate
+number of uses, etc.).
+
+So rather than changing the fundamentals, I've decided to continue along the
+previous patches and detect the CSE.  This patch installs a very targeted
+DAGUpdateListener for the duration of a complex-pattern match and updates the
+matching state accordingly.  (Previous patches used HandleSDNode to detect the
+CSE but that's not practical here).  The listener is only installed on X86.
+
+I tested that there is no measurable overhead due to this while running
+through the spec2k BC files with llc.  The only thing we pay for is the
+creation of the listener.  The callback never ever triggers in spec2k since
+this is a corner case.
+
+--- lib/Target/X86/X86ISelDAGToDAG.cpp.orig	Tue Aug  4 22:53:05 2015
+++ lib/Target/X86/X86ISelDAGToDAG.cpp	Tue Aug  4 22:53:59 2015
+@@ -290,6 +290,13 @@ namespace {
+     const X86InstrInfo *getInstrInfo() const {
+       return getTargetMachine().getInstrInfo();
+     }
+
+    /// \brief Address-mode matching performs shift-of-and to and-of-shift
+    /// reassociation in order to expose more scaled addressing
+    /// opportunities.
+    bool ComplexPatternFuncMutatesDAG() const {
+      return true;
+    }
+   };
+ }
+