
Introduce infrastructure for an incremental port of SelectionDAG atomic load/store handling

This is the first patch in a large sequence. The eventual goal is to have unordered atomic loads and stores - and possibly ordered atomics as well - handled through the normal ISEL codepaths for loads and stores. Today, they're handled with instances of AtomicSDNode, which means every transform has to be duplicated to work for unordered atomics. The benefit of the current design is that it's harder to introduce a silent miscompile by adding a transform which forgets about atomicity. See the thread on llvm-dev titled "FYI: proposed changes to atomic load/store in SelectionDAG" for further context.
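For concreteness, here is the kind of IR in question (a minimal sketch; the function name is illustrative). Today the unordered load below is built as an ATOMIC_LOAD AtomicSDNode; the goal is for it to be built as an ordinary LoadSDNode whose MachineMemOperand is flagged atomic:

    define i64 @load_unordered(i64* %p) {
      %v = load atomic i64, i64* %p unordered, align 8
      ret i64 %v
    }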

Note that this patch is NFC unless the experimental flag is set.

The basic strategy I plan on taking is:

    1) Introduce infrastructure and a flag for testing (this patch).
    2) Audit uses of isVolatile, and apply isAtomic conservatively*.
    3) Piecemeal, conservatively* update generic code and x86 backend code in individual reviews, with tests for cases which didn't check volatile but can be found by inspection.
    4) Flip the flag at the end (with minimal diffs).
    5) Work through the todo list identified in (2) and (3), exposing the performance opportunities.

(*) The "conservative" bit here is aimed at minimizing the number of diffs involved in (4). Ideally, there'd be none. In practice, getting it down to something reviewable by a human is the actual goal. Note that there are (currently) no paths which produce LoadSDNode or StoreSDNode with atomic MMOs, so we don't need to worry about preserving any behaviour there.

We've taken a very similar strategy twice before, with success - once at the IR level, and once at the MI level (post-ISEL).

Differential Revision: https://reviews.llvm.org/D66309



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371441 91177308-0d34-0410-b5e6-96231b3b80d8
Philip Reames, 6 years ago
Commit 4113e2297d

+ 0 - 2
include/llvm/CodeGen/SelectionDAGNodes.h

@@ -2197,8 +2197,6 @@ public:
       : MemSDNode(NodeTy, Order, dl, VTs, MemVT, MMO) {
     LSBaseSDNodeBits.AddressingMode = AM;
     assert(getAddressingMode() == AM && "Value truncated");
-    assert((!MMO->isAtomic() || MMO->isVolatile()) &&
-           "use an AtomicSDNode instead for non-volatile atomics");
   }
 
   const SDValue &getOffset() const {

+ 19 - 0
include/llvm/CodeGen/TargetLowering.h

@@ -3716,6 +3716,25 @@ public:
     return MachineMemOperand::MONone;
   }
 
+  /// Should SelectionDAG lower an atomic store of the given kind as a normal
+  /// StoreSDNode (as opposed to an AtomicSDNode)?  NOTE: The intention is to
+  /// eventually migrate all targets to using StoreSDNodes, but porting is
+  /// being done one target at a time.
+  virtual bool lowerAtomicStoreAsStoreSDNode(const StoreInst &SI) const {
+    assert(SI.isAtomic() && "violated precondition");
+    return false;
+  }
+
+  /// Should SelectionDAG lower an atomic load of the given kind as a normal
+  /// LoadSDNode (as opposed to an AtomicSDNode)?  NOTE: The intention is to
+  /// eventually migrate all targets to using LoadSDNodes, but porting is
+  /// being done one target at a time.
+  virtual bool lowerAtomicLoadAsLoadSDNode(const LoadInst &LI) const {
+    assert(LI.isAtomic() && "violated precondition");
+    return false;
+  }
+
+
   /// This callback is invoked by the type legalizer to legalize nodes with an
   /// illegal operand type but legal result types.  It replaces the
   /// LowerOperation callback in the type Legalizer.  The reason we can not do

+ 29 - 4
lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp

@@ -4658,9 +4658,26 @@ void SelectionDAGBuilder::visitAtomicLoad(const LoadInst &I) {
                            AAMDNodes(), nullptr, SSID, Order);
 
   InChain = TLI.prepareVolatileOrAtomicLoad(InChain, dl, DAG);
-  SDValue L =
-      DAG.getAtomic(ISD::ATOMIC_LOAD, dl, MemVT, MemVT, InChain,
-                    getValue(I.getPointerOperand()), MMO);
+
+  SDValue Ptr = getValue(I.getPointerOperand());
+
+  if (TLI.lowerAtomicLoadAsLoadSDNode(I)) {
+    // TODO: Once this is better exercised by tests, it should be merged with
+    // the normal path for loads to prevent future divergence.
+    SDValue L = DAG.getLoad(MemVT, dl, InChain, Ptr, MMO);
+    if (MemVT != VT)
+      L = DAG.getPtrExtOrTrunc(L, dl, VT);
+
+    setValue(&I, L);
+    if (!I.isUnordered()) {
+      SDValue OutChain = L.getValue(1);
+      DAG.setRoot(OutChain);
+    }
+    return;
+  }
+
+  SDValue L = DAG.getAtomic(ISD::ATOMIC_LOAD, dl, MemVT, MemVT, InChain,
+                            Ptr, MMO);
 
   SDValue OutChain = L.getValue(1);
   if (MemVT != VT)
@@ -4699,9 +4716,17 @@ void SelectionDAGBuilder::visitAtomicStore(const StoreInst &I) {
   SDValue Val = getValue(I.getValueOperand());
   if (Val.getValueType() != MemVT)
     Val = DAG.getPtrExtOrTrunc(Val, dl, MemVT);
+  SDValue Ptr = getValue(I.getPointerOperand());
 
+  if (TLI.lowerAtomicStoreAsStoreSDNode(I)) {
+    // TODO: Once this is better exercised by tests, it should be merged with
+    // the normal path for stores to prevent future divergence.
+    SDValue S = DAG.getStore(InChain, dl, Val, Ptr, MMO);
+    DAG.setRoot(S);
+    return;
+  }
   SDValue OutChain = DAG.getAtomic(ISD::ATOMIC_STORE, dl, MemVT, InChain,
-                                   getValue(I.getPointerOperand()), Val, MMO);
+                                   Ptr, Val, MMO);
 
   DAG.setRoot(OutChain);
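With the new hook returning true, visitAtomicStore routes an unordered atomic store like the following through DAG.getStore, producing a plain StoreSDNode (a minimal IR sketch; the function name is illustrative):

    define void @store_unordered(i64* %p, i64 %v) {
      store atomic i64 %v, i64* %p unordered, align 8
      ret void
    }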

+ 19 - 0
lib/Target/X86/X86ISelLowering.cpp

@@ -92,6 +92,13 @@ static cl::opt<bool> MulConstantOptimization(
              "SHIFT, LEA, etc."),
     cl::Hidden);
 
+static cl::opt<bool> ExperimentalUnorderedISEL(
+    "x86-experimental-unordered-atomic-isel", cl::init(false),
+    cl::desc("Use LoadSDNode and StoreSDNode instead of "
+             "AtomicSDNode for unordered atomic loads and "
+             "stores respectively."),
+    cl::Hidden);
+
 /// Call this when the user attempts to do something unsupported, like
 /// returning a double without SSE2 enabled on x86_64. This is not fatal, unlike
 /// report_fatal_error, so calling code should attempt to recover without
@@ -26493,6 +26500,18 @@ X86TargetLowering::lowerIdempotentRMWIntoFencedLoad(AtomicRMWInst *AI) const {
   return Loaded;
 }
 
+bool X86TargetLowering::lowerAtomicStoreAsStoreSDNode(const StoreInst &SI) const {
+  if (!SI.isUnordered())
+    return false;
+  return ExperimentalUnorderedISEL;
+}
+bool X86TargetLowering::lowerAtomicLoadAsLoadSDNode(const LoadInst &LI) const {
+  if (!LI.isUnordered())
+    return false;
+  return ExperimentalUnorderedISEL;
+}
+
+
 /// Emit a locked operation on a stack location which does not change any
 /// memory location, but does involve a lock prefix.  Location is chosen to be
 /// a) very likely accessed only by a single thread to minimize cache traffic,
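Note that the X86 overrides return true only for unordered accesses, so even with the flag set, anything stronger keeps the AtomicSDNode path. For example, a monotonic load such as this sketch is unaffected:

    define i64 @load_monotonic(i64* %p) {
      %v = load atomic i64, i64* %p monotonic, align 8
      ret i64 %v
    }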

+ 3 - 0
lib/Target/X86/X86ISelLowering.h

@@ -1388,6 +1388,9 @@ namespace llvm {
     LoadInst *
     lowerIdempotentRMWIntoFencedLoad(AtomicRMWInst *AI) const override;
 
+    bool lowerAtomicStoreAsStoreSDNode(const StoreInst &SI) const override;
+    bool lowerAtomicLoadAsLoadSDNode(const LoadInst &LI) const override;
+
     bool needsCmpXchgNb(Type *MemType) const;
 
     void SetupEntryBlockForSjLj(MachineInstr &MI, MachineBasicBlock *MBB,

+ 571 - 38
test/CodeGen/X86/atomic-unordered.ll

(Diff too large to display.)
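Since the test diff is too large to render here, a representative case in the style of that file might look like the following (a sketch, not copied from the test; the CHECK lines assume the usual x86-64 lowering of an unordered i64 load to a plain movq):

    ; RUN: llc -mtriple=x86_64-unknown-linux-gnu -x86-experimental-unordered-atomic-isel < %s | FileCheck %s
    define i64 @load_i64(i64* %p) {
    ; CHECK-LABEL: load_i64:
    ; CHECK: movq (%rdi), %rax
      %v = load atomic i64, i64* %p unordered, align 8
      ret i64 %v
    }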


(Some files were omitted because too many files changed in this diff.)