Backport commit from upstream LLVM:
r219009 [ISel] Keep matching state consistent when folding during X86 address match from Brad (maintainer)
This commit is contained in:
parent
086b280429
commit
4413e0aa9e
@ -1,4 +1,4 @@
|
||||
# $OpenBSD: Makefile,v 1.107 2015/06/15 06:20:48 ajacoutot Exp $
|
||||
# $OpenBSD: Makefile,v 1.108 2015/08/24 07:45:56 ajacoutot Exp $
|
||||
|
||||
# XXX: Remember to bump MODCLANG_VERSION in lang/clang/clang.port.mk when
|
||||
# updating this port.
|
||||
@ -10,7 +10,7 @@ COMMENT = modular, fast C/C++/ObjC compiler, static analyzer and tools
|
||||
|
||||
LLVM_V = 3.5
|
||||
DISTNAME = llvm-${LLVM_V}.20140228
|
||||
REVISION = 34
|
||||
REVISION = 35
|
||||
CATEGORIES = devel
|
||||
MASTER_SITES = http://comstyle.com/source/
|
||||
EXTRACT_SUFX = .tar.xz
|
||||
|
@ -0,0 +1,59 @@
|
||||
$OpenBSD: patch-include_llvm_CodeGen_SelectionDAGISel_h,v 1.1 2015/08/24 07:45:56 ajacoutot Exp $
|
||||
|
||||
r219009
|
||||
[ISel] Keep matching state consistent when folding during X86 address match
|
||||
|
||||
In the X86 backend, matching an address is initiated by the 'addr' complex
|
||||
pattern and its friends. During this process we may reassociate and-of-shift
|
||||
into shift-of-and (FoldMaskedShiftToScaledMask) to allow folding of the
|
||||
shift into the scale of the address.
|
||||
|
||||
However as demonstrated by the testcase, this can trigger CSE of not only the
|
||||
shift and the AND which the code is prepared for but also the underlying load
|
||||
node. In the testcase this node is sitting in the RecordedNode and MatchScope
|
||||
data structures of the matcher and becomes a deleted node upon CSE. Returning
|
||||
from the complex pattern function, we try to access it again hitting an assert
|
||||
because the node is no longer a load even though this was checked before.
|
||||
|
||||
Now obviously changing the DAG this late is bending the rules but I think it
|
||||
makes sense somewhat. Outside of addresses we prefer and-of-shift because it
|
||||
may lead to smaller immediates (FoldMaskAndShiftToScale is an even better
|
||||
example because it create a non-canonical node). We currently don't recognize
|
||||
addresses during DAGCombiner where arguably this canonicalization should be
|
||||
performed. On the other hand, having this in the matcher allows us to cover
|
||||
all the cases where an address can be used in an instruction.
|
||||
|
||||
I've also talked a little bit to Dan Gohman on llvm-dev who added the RAUW for
|
||||
the new shift node in FoldMaskedShiftToScaledMask. This RAUW is responsible
|
||||
for initiating the recursive CSE on users
|
||||
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/076903.html) but it
|
||||
is not strictly necessary since the shift is hooked into the visited user. Of
|
||||
course it's safer to keep the DAG consistent at all times (e.g. for accurate
|
||||
number of uses, etc.).
|
||||
|
||||
So rather than changing the fundamentals, I've decided to continue along the
|
||||
previous patches and detect the CSE. This patch installs a very targeted
|
||||
DAGUpdateListener for the duration of a complex-pattern match and updates the
|
||||
matching state accordingly. (Previous patches used HandleSDNode to detect the
|
||||
CSE but that's not practical here). The listener is only installed on X86.
|
||||
|
||||
I tested that there is no measurable overhead due to this while running
|
||||
through the spec2k BC files with llc. The only thing we pay for is the
|
||||
creation of the listener. The callback never ever triggers in spec2k since
|
||||
this is a corner case.
|
||||
|
||||
--- include/llvm/CodeGen/SelectionDAGISel.h.orig Tue Aug 4 22:44:44 2015
|
||||
+++ include/llvm/CodeGen/SelectionDAGISel.h Tue Aug 4 22:46:22 2015
|
||||
@@ -238,6 +238,12 @@ class SelectionDAGISel : public MachineFunctionPass {
|
||||
const unsigned char *MatcherTable,
|
||||
unsigned TableSize);
|
||||
|
||||
+ /// \brief Return true if complex patterns for this target can mutate the
|
||||
+ /// DAG.
|
||||
+ virtual bool ComplexPatternFuncMutatesDAG() const {
|
||||
+ return false;
|
||||
+ }
|
||||
+
|
||||
private:
|
||||
|
||||
// Calls to these functions are generated by tblgen.
|
@ -0,0 +1,107 @@
|
||||
$OpenBSD: patch-lib_CodeGen_SelectionDAG_SelectionDAGISel_cpp,v 1.1 2015/08/24 07:45:56 ajacoutot Exp $
|
||||
|
||||
r219009
|
||||
[ISel] Keep matching state consistent when folding during X86 address match
|
||||
|
||||
In the X86 backend, matching an address is initiated by the 'addr' complex
|
||||
pattern and its friends. During this process we may reassociate and-of-shift
|
||||
into shift-of-and (FoldMaskedShiftToScaledMask) to allow folding of the
|
||||
shift into the scale of the address.
|
||||
|
||||
However as demonstrated by the testcase, this can trigger CSE of not only the
|
||||
shift and the AND which the code is prepared for but also the underlying load
|
||||
node. In the testcase this node is sitting in the RecordedNode and MatchScope
|
||||
data structures of the matcher and becomes a deleted node upon CSE. Returning
|
||||
from the complex pattern function, we try to access it again hitting an assert
|
||||
because the node is no longer a load even though this was checked before.
|
||||
|
||||
Now obviously changing the DAG this late is bending the rules but I think it
|
||||
makes sense somewhat. Outside of addresses we prefer and-of-shift because it
|
||||
may lead to smaller immediates (FoldMaskAndShiftToScale is an even better
|
||||
example because it create a non-canonical node). We currently don't recognize
|
||||
addresses during DAGCombiner where arguably this canonicalization should be
|
||||
performed. On the other hand, having this in the matcher allows us to cover
|
||||
all the cases where an address can be used in an instruction.
|
||||
|
||||
I've also talked a little bit to Dan Gohman on llvm-dev who added the RAUW for
|
||||
the new shift node in FoldMaskedShiftToScaledMask. This RAUW is responsible
|
||||
for initiating the recursive CSE on users
|
||||
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/076903.html) but it
|
||||
is not strictly necessary since the shift is hooked into the visited user. Of
|
||||
course it's safer to keep the DAG consistent at all times (e.g. for accurate
|
||||
number of uses, etc.).
|
||||
|
||||
So rather than changing the fundamentals, I've decided to continue along the
|
||||
previous patches and detect the CSE. This patch installs a very targeted
|
||||
DAGUpdateListener for the duration of a complex-pattern match and updates the
|
||||
matching state accordingly. (Previous patches used HandleSDNode to detect the
|
||||
CSE but that's not practical here). The listener is only installed on X86.
|
||||
|
||||
I tested that there is no measurable overhead due to this while running
|
||||
through the spec2k BC files with llc. The only thing we pay for is the
|
||||
creation of the listener. The callback never ever triggers in spec2k since
|
||||
this is a corner case.
|
||||
|
||||
--- lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp.orig Tue Aug 4 22:47:10 2015
|
||||
+++ lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp Tue Aug 4 22:52:46 2015
|
||||
@@ -2363,6 +2363,45 @@ struct MatchScope {
|
||||
bool HasChainNodesMatched, HasGlueResultNodesMatched;
|
||||
};
|
||||
|
||||
+/// \\brief A DAG update listener to keep the matching state
|
||||
+/// (i.e. RecordedNodes and MatchScope) uptodate if the target is allowed to
|
||||
+/// change the DAG while matching. X86 addressing mode matcher is an example
|
||||
+/// for this.
|
||||
+class MatchStateUpdater : public SelectionDAG::DAGUpdateListener
|
||||
+{
|
||||
+ SmallVectorImpl<std::pair<SDValue, SDNode*> > &RecordedNodes;
|
||||
+ SmallVectorImpl<MatchScope> &MatchScopes;
|
||||
+public:
|
||||
+ MatchStateUpdater(SelectionDAG &DAG,
|
||||
+ SmallVectorImpl<std::pair<SDValue, SDNode*> > &RN,
|
||||
+ SmallVectorImpl<MatchScope> &MS) :
|
||||
+ SelectionDAG::DAGUpdateListener(DAG),
|
||||
+ RecordedNodes(RN), MatchScopes(MS) { }
|
||||
+
|
||||
+ void NodeDeleted(SDNode *N, SDNode *E) {
|
||||
+ // Some early-returns here to avoid the search if we deleted the node or
|
||||
+ // if the update comes from MorphNodeTo (MorphNodeTo is the last thing we
|
||||
+ // do, so it's unnecessary to update matching state at that point).
|
||||
+ // Neither of these can occur currently because we only install this
|
||||
+ // update listener during matching a complex patterns.
|
||||
+ if (!E || E->isMachineOpcode())
|
||||
+ return;
|
||||
+ // Performing linear search here does not matter because we almost never
|
||||
+ // run this code. You'd have to have a CSE during complex pattern
|
||||
+ // matching.
|
||||
+ for (SmallVectorImpl<std::pair<SDValue, SDNode*> >::iterator I =
|
||||
+ RecordedNodes.begin(), IE = RecordedNodes.end(); I != IE; ++I)
|
||||
+ if (I->first.getNode() == N)
|
||||
+ I->first.setNode(E);
|
||||
+
|
||||
+ for (SmallVectorImpl<MatchScope>::iterator I = MatchScopes.begin(),
|
||||
+ IE = MatchScopes.end(); I != IE; ++I)
|
||||
+ for (SmallVector<SDValue, 4>::iterator J = I->NodeStack.begin(),
|
||||
+ JE = I->NodeStack.end(); J != JE; ++J)
|
||||
+ if (J->getNode() == N)
|
||||
+ J->setNode(E);
|
||||
+ }
|
||||
+};
|
||||
}
|
||||
|
||||
SDNode *SelectionDAGISel::
|
||||
@@ -2617,6 +2656,14 @@ SelectCodeCommon(SDNode *NodeToMatch, const unsigned c
|
||||
unsigned CPNum = MatcherTable[MatcherIndex++];
|
||||
unsigned RecNo = MatcherTable[MatcherIndex++];
|
||||
assert(RecNo < RecordedNodes.size() && "Invalid CheckComplexPat");
|
||||
+
|
||||
+ // If target can modify DAG during matching, keep the matching state
|
||||
+ // consistent.
|
||||
+ OwningPtr<MatchStateUpdater> MSU;
|
||||
+ if (ComplexPatternFuncMutatesDAG())
|
||||
+ MSU.reset(new MatchStateUpdater(*CurDAG, RecordedNodes,
|
||||
+ MatchScopes));
|
||||
+
|
||||
if (!CheckComplexPattern(NodeToMatch, RecordedNodes[RecNo].second,
|
||||
RecordedNodes[RecNo].first, CPNum,
|
||||
RecordedNodes))
|
60
devel/llvm/patches/patch-lib_Target_X86_X86ISelDAGToDAG_cpp
Normal file
60
devel/llvm/patches/patch-lib_Target_X86_X86ISelDAGToDAG_cpp
Normal file
@ -0,0 +1,60 @@
|
||||
$OpenBSD: patch-lib_Target_X86_X86ISelDAGToDAG_cpp,v 1.1 2015/08/24 07:45:56 ajacoutot Exp $
|
||||
|
||||
r219009
|
||||
[ISel] Keep matching state consistent when folding during X86 address match
|
||||
|
||||
In the X86 backend, matching an address is initiated by the 'addr' complex
|
||||
pattern and its friends. During this process we may reassociate and-of-shift
|
||||
into shift-of-and (FoldMaskedShiftToScaledMask) to allow folding of the
|
||||
shift into the scale of the address.
|
||||
|
||||
However as demonstrated by the testcase, this can trigger CSE of not only the
|
||||
shift and the AND which the code is prepared for but also the underlying load
|
||||
node. In the testcase this node is sitting in the RecordedNode and MatchScope
|
||||
data structures of the matcher and becomes a deleted node upon CSE. Returning
|
||||
from the complex pattern function, we try to access it again hitting an assert
|
||||
because the node is no longer a load even though this was checked before.
|
||||
|
||||
Now obviously changing the DAG this late is bending the rules but I think it
|
||||
makes sense somewhat. Outside of addresses we prefer and-of-shift because it
|
||||
may lead to smaller immediates (FoldMaskAndShiftToScale is an even better
|
||||
example because it create a non-canonical node). We currently don't recognize
|
||||
addresses during DAGCombiner where arguably this canonicalization should be
|
||||
performed. On the other hand, having this in the matcher allows us to cover
|
||||
all the cases where an address can be used in an instruction.
|
||||
|
||||
I've also talked a little bit to Dan Gohman on llvm-dev who added the RAUW for
|
||||
the new shift node in FoldMaskedShiftToScaledMask. This RAUW is responsible
|
||||
for initiating the recursive CSE on users
|
||||
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/076903.html) but it
|
||||
is not strictly necessary since the shift is hooked into the visited user. Of
|
||||
course it's safer to keep the DAG consistent at all times (e.g. for accurate
|
||||
number of uses, etc.).
|
||||
|
||||
So rather than changing the fundamentals, I've decided to continue along the
|
||||
previous patches and detect the CSE. This patch installs a very targeted
|
||||
DAGUpdateListener for the duration of a complex-pattern match and updates the
|
||||
matching state accordingly. (Previous patches used HandleSDNode to detect the
|
||||
CSE but that's not practical here). The listener is only installed on X86.
|
||||
|
||||
I tested that there is no measurable overhead due to this while running
|
||||
through the spec2k BC files with llc. The only thing we pay for is the
|
||||
creation of the listener. The callback never ever triggers in spec2k since
|
||||
this is a corner case.
|
||||
|
||||
--- lib/Target/X86/X86ISelDAGToDAG.cpp.orig Tue Aug 4 22:53:05 2015
|
||||
+++ lib/Target/X86/X86ISelDAGToDAG.cpp Tue Aug 4 22:53:59 2015
|
||||
@@ -290,6 +290,13 @@ namespace {
|
||||
const X86InstrInfo *getInstrInfo() const {
|
||||
return getTargetMachine().getInstrInfo();
|
||||
}
|
||||
+
|
||||
+ /// \brief Address-mode matching performs shift-of-and to and-of-shift
|
||||
+ /// reassociation in order to expose more scaled addressing
|
||||
+ /// opportunities.
|
||||
+ bool ComplexPatternFuncMutatesDAG() const {
|
||||
+ return true;
|
||||
+ }
|
||||
};
|
||||
}
|
||||
|
Loading…
x
Reference in New Issue
Block a user