aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/i915/intel_ringbuffer.c (follow)
AgeCommit message (Collapse)AuthorFilesLines
2015-02-25drm/i915: Cache ringbuf pointer in request structureJohn Harrison1-0/+1
In execlist mode, the ringbuf is a function of the ring and context whereas in legacy mode, it is derived from the ring alone. Thus the calculation required to determine the ringbuf pointer from the ring (and context) also needs to test execlist mode or not. This is messy. Further, the request structure holds a pointer to both the ring and the context for which it was created. Thus, given a request, it is possible to derive the ringbuf in either legacy or execlist mode. Hence it is necessary to pass just the request in to all the low level functions rather than some combination of request, ring, context and ringbuf. However, rather than recalculating it each time, it is much simpler to just cache the ringbuf pointer in the request structure itself. Caching the pointer means the calculation is done once at request creation time and all further code and simply read it directly from the request structure. OTC-Jira: VIZ-5115 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> [danvet: Drop contentless comment in lrc alloc request entirely. And spelling fix in the commit message.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-25drm/i915: Rename 'flags' to 'dispatch_flags' for better code readingJohn Harrison1-15/+20
There is a flags word that is passed through the execbuffer code path all the way from initial decoding of the user parameters down to the very final dispatch buffer call. It is simply called 'flags'. Unfortuantely, there are many other flags words floating around in the same blocks of code. Even more once the GPU scheduler arrives. This patch makes it more obvious exactly which flags word is which by renaming 'flags' to 'dispatch_flags'. Note that the bit definitions for this flags word already have an 'I915_DISPATCH_' prefix on them and so are not quite so ambiguous. OTC-Jira: VIZ-1587 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> [danvet: Resolve conflict with Chris' rework of the bb parsing.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-24drm/i915/skl: Tune IZ hashing when subslices are unbalancedDamien Lespiau1-1/+44
When one EU is disabled in a particular subslice, we can tune how the work is spread between subslices to improve EU utilization. v2: - Use a bitfield to record which subslice(s) has(have) 7 EUs. That will also make the machinery work if several sublices have 7 EUs. (Jeff Mcgee) - Only apply the different hashing algorithm if the slice is effectively unbalanced by checking there's a single subslice with 7 EUs. (Jeff Mcgee) v3: Fix typo in comment (Jeff Mcgee) Issue: VIZ-3845 Cc: Jeff Mcgee <jeff.mcgee@intel.com> Reviewed-by: Jeff Mcgee <jeff.mcgee@intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-23drm/i915/skl: Implement WaDisablePowerCompilerClockGatingDamien Lespiau1-0/+8
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-13drm/i915/skl: Use a LRI for WaDisableDgMirrorFixInHalfSliceChicken5Damien Lespiau1-7/+3
I have no idea how that crept in, but we need to do the write from the ring and this is a masked register. Two fixes in 1! Cc: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-13drm/i915/skl: Fix always true comparison in a revision id checkDamien Lespiau1-2/+2
It's always a good idea to keep static analysis happy (also because it prompts doing the check like I proposed :), this time smatch complains: drivers/gpu/drm/i915/intel_ringbuffer.c:891 gen9_init_workarounds() warn: always true condition '((->dev->pdev->revision) >= (0)) => (0-255 >= 0)' That's because revision is a u8. Tweak a bit the condition then. Cc: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-13drm/i915/skl: Implement WaSetDisablePixMaskCammingAndRhwoInCommonSliceChickenDamien Lespiau1-0/+8
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-13drm/i915/skl: Implement WaBarrierPerformanceFixDisableDamien Lespiau1-0/+7
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-13drm/i915/skl: Implement WaCcsTlbPrefetchDisable:sklDamien Lespiau1-0/+4
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-13drm/i915/skl: Implement WaDisablePartialResolveInVcDamien Lespiau1-0/+3
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-13drm/i915/skl: Introduce a SKL specific init_workarounds()Damien Lespiau1-1/+10
This function will host SKL-only W/As. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-13drm/i915: Make intel_ring_setup_status_page() staticDamien Lespiau1-62/+62
This function is only used in intel_ringbuffer.c, so restrict it to that file. The function was moved around to avoid a forward declaration and group it with its user. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> [danvet: Squash in fixup from Wu Fengguang.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-13drm/i915/bdw: Implement WaForceContextSaveRestoreNonCoherentDamien Lespiau1-3/+5
v2: Reorder defines (Ben) v3: More bikesheds, this time re-ordering comments! (Chris) Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> [danvet: Resolve conflict.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-13drm/i915: gen 9 h/w w/a Fix stepping checkNick Hoath1-1/+2
Fixed the stepping check on WaDisableDgMirrorFixInHalfSliceChicken5 to be for the correct SOC (Skylake) Signed-off-by: Nick Hoath <nicholas.hoath@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-13drm/i915/gen9: Implement WaForceEnableNonCoherentHoath, Nicholas1-0/+11
v2: Don't add WaHdcDisableFetchWhenMasked. Add stepping check for WaForceEnableNonCoherent Signed-off-by: Nick Hoath <nicholas.hoath@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-13drm/i915/gen9: Implement Wa4x4STCOptimizationDisableHoath, Nicholas1-0/+3
Move Wa4x4STCOptimizationDisable to gen9_init_workarounds v2: rebase Signed-off-by: Nick Hoath <nicholas.hoath@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-13drm/i915/gen9: Implement WaEnableYV12BugFixInHalfSliceChicken7Nick Hoath1-0/+6
Move WaEnableYV12BugFixInHalfSliceChicken7 to gen9_init_workarounds v2: Add stepping check. Signed-off-by: Nick Hoath <nicholas.hoath@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-13drm/i915/gen9: h/w w/a: syncing dependencies between camera and graphicsNick Hoath1-0/+4
This one doesn't have one of these nice cryptic names unfortunately. v2: Added missing register bitmap Signed-off-by: Nick Hoath <nicholas.hoath@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-13drm/i915/gen9: Implement WaDisableDgMirrorFixInHalfSliceChicken5Nick Hoath1-0/+10
Move WaDisableDgMirrorFixInHalfSliceChicken5 to gen9_init_workarounds v2: Added stepping check v3: Removed unused register bitmap Signed-off-by: Nick Hoath <nicholas.hoath@intel.com> [danvet: Bikesheds.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-13drm/i915/gen9: Implement WaDisablePartialInstShootdownHoath, Nicholas1-0/+7
v2: Dont add WaDisableThreadStallDopClockGating as not SKL WA. (Found by Damien Lespiau) Signed-off-by: Nick Hoath <nicholas.hoath@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> [danvet: Bikeshed commit message a bit as per Damien's suggestions.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-13drm/i915: ring w/a initialisation for gen 9Hoath, Nicholas1-0/+8
Add framework for gen 9 HW WAs v1: Changed SOC specific WA function to gen 9 common function (Req: Damien Lespiau) Signed-off-by: Nick Hoath <nicholas.hoath@intel.com> Reviewed-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-13drm/i915/skl: Remove the check enforcing VCS2 to be gen8 onlyDamien Lespiau1-7/+1
We already track this in the intel_info struct. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [danvet: Make the commit message a bit less terse.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-02-09drm/i915: Insert a command barrier on BLT/BSD cache flushesChris Wilson1-4/+19
This looked like an odd regression from commit ec5cc0f9b019af95e4571a9fa162d94294c8d90b Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu Jun 12 10:28:55 2014 +0100 drm/i915: Restrict GPU boost to the RCS engine but in reality it undercovered a much older coherency bug. The issue that boosting the GPU frequency on the BCS ring was masking was that we could wake the CPU up after completion of a BCS batch and inspect memory prior to the write cache being fully evicted. In order to serialise the breadcrumb interrupt (and so ensure that the CPU's view of memory is coherent) we need to perform a post-sync operation in the MI_FLUSH_DW. v2: Fix all the MI_FLUSH_DW (bsd plus the duplication in execlists). Also fix the invalidate_domains mask in gen8_emit_flush() for ring != VCS. Testcase: gpuX-rcs-gpu-read-after-write Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Acked-by: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2015-01-27drm/i915: Change CHV WIZ hashing mode to 16x4Ville Syrjälä1-0/+12
I ran a few tests with xonotic and synmark2 trying out the different WIZ hashing modes on CHV. The results seem to match the results I got with IVB/HSW when I did the similar tests on them in the past. That is 16x4 is generally the fastest mode, 8x8 comes next and finally 8x4. On CHV the difference between the modes is at most ~1% in most tests. IIRC on IVB/HSW the difference was a little bigger, but as there doesn't seem to be any real downside to 16x4 let's use it by default. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-27drm/i915: Implement Wa4x4STCOptimizationDisable:chvVille Syrjälä1-0/+4
Wa4x4STCOptimizationDisable got only implemented for BDW, but according to the w/a database CHV needs it too, so add it. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-27drm/i915: Rename the forcewake get/put functionsMika Kuoppala1-2/+2
We have multiple forcewake domains now on recent gens. Change the function naming to reflect this. v2: More verbose names (Chris) v3: Rebase v4: Rebase v5: Add documentation for forcewake_get/put Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Deepak S <deepak.s@linux.intel.com> (v2) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-27drm/i915: Removed duplicate members from submit_requestNick Hoath1-1/+1
Where there were duplicate variables for the tail, context and ring (engine) in the gem request and the execlist queue item, use the one from the request and remove the duplicate from the execlist queue item. Issue: VIZ-4274 v1: Rebase v2: Fixed build issues. Keep separate postfix & tail pointers as these are used in different ways. Reinserted missing full tail pointer update. Signed-off-by: Nick Hoath <nicholas.hoath@intel.com> Reviewed-by: Thomas Daniel <thomas.daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-17drm/i915: Ensure the HiZ RAW Stall Optimization is on for Cherryview.Kenneth Graunke1-0/+5
This is an important optimization for avoiding read-after-write (RAW) stalls in the HiZ buffer. Certain workloads would run very slowly with HiZ enabled, but run much faster with the "hiz=false" driconf option. With this patch, they run at full speed even with HiZ. Increases performance in OglVSInstancing by about 2.7x on Braswell. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-17drm/i915: Enable the HiZ RAW Stall Optimization on Broadwell.Kenneth Graunke1-0/+10
This is an important optimization for avoiding read-after-write (RAW) stalls in the HiZ buffer. Certain workloads would run very slowly with HiZ enabled, but run much faster with the "hiz=false" driconf option. With this patch, they run at full speed even with HiZ. Improves performance in OglVSInstancing by 3.2x on Broadwell GT3e (Iris Pro 6200). Thanks to Jesse Barnes and Ben Widawsky for their help in tracking this down. Thanks to Chris Wilson for showing me the new workarounds system. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-13drm/i915: Improve HiZ throughput on Cherryview.Kenneth Graunke1-0/+3
Found by reading the HIZ_CHICKEN documentation. Improves performance in a HiZ microbenchmark by around 50%. Improves performance in OglZBuffer by around 18%. Thanks to Chris Wilson for helping me figure out where to put this. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-01-12Merge tag 'topic/i915-hda-componentized-2015-01-12' into drm-intel-next-queuedDaniel Vetter1-10/+17
Conflicts: drivers/gpu/drm/i915/intel_runtime_pm.c Separate branch so that Takashi can also pull just this refactoring into sound-next. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2014-12-16drm/i915: Force the CS stall for invalidate flushesChris Wilson1-0/+2
In order to act as a full command barrier by itself, we need to tell the pipecontrol to actually stall the command streamer while the flush runs. We require the full command barrier before operations like MI_SET_CONTEXT, which currently rely on a prior invalidate flush. References: https://bugs.freedesktop.org/show_bug.cgi?id=83677 Cc: Simon Farnsworth <simon@farnz.org.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2014-12-16drm/i915: Invalidate media caches on gen7Chris Wilson1-0/+1
In the gen7 pipe control there is an extra bit to flush the media caches, so let's set it during cache invalidation flushes. v2: Rename to MEDIA_STATE_CLEAR to be more inline with spec. Cc: Simon Farnsworth <simon@farnz.org.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: stable@vger.kernel.org Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2014-12-16drm/i915: Warn about missing context state workarounds only onceMichel Thierry1-1/+1
Otherwise, new platforms without workarounds will hit this warning for every new context created. Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-10drm/i915/bdw: Add WaForceEnableNonCoherent labelMichel Thierry1-0/+1
We already implement this workaround, but it was missing its name. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-10drm/i915: Remove '& 0xffff' from the mask given to WA_REG()Damien Lespiau1-2/+2
We may be hidding bugs by doing that, so let remove it and have the actual mask value shine through, for better or worse. Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2014-12-10drm/i915: Invert the mask and val arguments in wa_add() and WA_REG()Damien Lespiau1-9/+9
While trying to unify the order of those arguments throughout the driver, Daniel noticed what we were inverting them in this part of the code. Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2014-12-10drm/i915/bdw: Fix the write setting up the WIZ hashing modeDamien Lespiau1-2/+6
I was playing with clang and oh surprise! a warning trigerred by -Wshift-overflow (gcc doesn't have this one): WA_SET_BIT_MASKED(GEN7_GT_MODE, GEN6_WIZ_HASHING_MASK | GEN6_WIZ_HASHING_16x4); drivers/gpu/drm/i915/intel_ringbuffer.c:786:2: warning: signed shift result (0x28002000000) requires 43 bits to represent, but 'int' only has 32 bits [-Wshift-overflow] WA_SET_BIT_MASKED(GEN7_GT_MODE, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/i915/intel_ringbuffer.c:737:15: note: expanded from macro 'WA_SET_BIT_MASKED' WA_REG(addr, _MASKED_BIT_ENABLE(mask), (mask) & 0xffff) Turned out GEN6_WIZ_HASHING_MASK was already shifted by 16, and we were trying to shift it a bit more. The other thing is that it's not the usual case of setting WA bits here, we need to have separate mask and value. To fix this, I've introduced a new _MASKED_FIELD() macro that takes both the (unshifted) mask and the desired value and the rest of the patch ripples through from it. This bug was introduced when reworking the WA emission in: Commit 7225342ab501befdb64bcec76ded41f5897c0855 Author: Mika Kuoppala <mika.kuoppala@linux.intel.com> Date: Tue Oct 7 17:21:26 2014 +0300 drm/i915: Build workaround list in ring initialization v2: Invert the order of the mask and value arguments (Daniel Vetter) Rewrite _MASKED_BIT_ENABLE() and _MASKED_BIT_DISABLE() with _MASKED_FIELD() (Jani Nikula) Make sure we only evaluate 'a' once in _MASKED_BIT_ENABLE() (Dave Gordon) Add check to ensure the value is within the mask boundaries (Chris Wilson) v3: Ensure the the value and mask are 16 bits (Dave Gordon) Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2014-12-08drm/i915: Move golden context init into ->init_contextDaniel Vetter1-1/+17
Similar to a patch from Thomas Daniel for lrc contexts. This keeps both sides somewhat in sync and should make Dave Gordon happy. Note that both the wa and the golden context init code suffer a bit from an inssuficient split into driver load and hw init code. Which means we have a bunch of tests all over the place to check whether the one-time initialization has been done already or not. All that one-tim code should be moved into the one-time ring setup code, but that's work for later. Cc: Dave Gordon <david.s.gordon@intel.com> Cc: Thomas Daniel <thomas.daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-06drm/i915: Add unique id to the request structure for debuggingJohn Harrison1-0/+2
For debugging purposes, it is useful to be able to uniquely identify a given request structure as it works its way through the system. This becomes especially tricky once the seqno value is lazily allocated as then the request has nothing but its pointer to identify it for much of its life. Change-Id: Ie76b2268b940467f4cdf5a4ba6f5a54cbb96445d For: VIZ-4377 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-06drm/i915: Zero fill the request structureJohn Harrison1-1/+1
There is a general theory that kzmalloc is better/safer than kmalloc, especially for interesting data structures. This change updates the request structure allocation to be zero filled. This also fixes crashes in the reset code. Quoting Mika's patch: "Clean the request structure on alloc. Otherwise we might end up referencing uninitialized fields. This is apparent when we try to cleanup the preallocated request on ring reset, before any request has been submitted to the ring. The request->ctx is foobar and we end up freeing the foobarness." Note that this fixes a regression introduced in commit 9eba5d4a1d79d5094321469479b4dbe418f60110 Author: John Harrison <John.C.Harrison@Intel.com> Date: Mon Nov 24 18:49:23 2014 +0000 drm/i915: Ensure OLS & PLR are always in sync References: https://bugs.freedesktop.org/show_bug.cgi?id=86959 References: https://bugs.freedesktop.org/show_bug.cgi?id=86962 References: https://bugs.freedesktop.org/show_bug.cgi?id=86992 Change-Id: I68715ef758025fab8db763941ef63bf60d7031e2 For: VIZ-4377 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-05drm/i915/bdw: Add WaHdcDisableFetchWhenMaskedMichel Thierry1-0/+2
We already have it for chv, but was missing for bdw. Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-03drm/i915: Flatten engine init control flowDaniel Vetter1-23/+22
Now that sanity prevails and we have the clean split between software init and starting the engines we can drop all the "have we allocate this struct already?" nonsense. Execlist code could benefit quite a bit more still, but that's for another patch. Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2014-12-03drm/i915: Only init engines onceDaniel Vetter1-4/+0
We can do this. And now there's finally the clean split between software setup and hardware setup I kinda wanted since multi-ring support was merged aeons ago. It only took almost 5 years. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-03drm/i915: Move intel_init_pipe_control out of engine->init_hwDaniel Vetter1-7/+11
With this all the ->init_hw hooks really only set up hw state needed to start the ring, all the software state setup and memory/buffer allocations happen beforehand. v2: We need to call intel_init_pipe_control after the ring init since otherwise engine->dev is NULL and it falls over. Currently that's now after the hw ring is enabled but a) we'll be fine as long as no one submits a batch b) this will change soon. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-03drm/i915: s/init()/init_hw()/ in intel_engine_csDaniel Vetter1-6/+6
This is (mostly, some exceptions that need fixing) the hw setup function which starts the ring. And not the function which allocates all the resources. Make this clear by giving it a better name. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-03drm/i915: Consolidate ring freespace calculationsDave Gordon1-20/+22
There are numerous places in the code where the driver's idea of how much space is left in a ring is updated using the driver's latest notions of the positions of 'head' and 'tail' for the ring. Among them are some that update one or both of these values before (re)doing the calculation. In particular, there are four different places in the code where 'last_retired_head' is copied to 'head' and then set to -1; and two of these do not have a guard to check that it has actually been updated since last time it was consumed, leaving the possibility that the dummy -1 can be transferred from 'last_retired_head' to 'head', causing the space calculation to produce 'impossible' results (previously seen on Android/VLV). This code therefore consolidates all the calculation and updating of these values, such that there is only one place where the ring space is updated, and it ALWAYS uses (and consumes) 'last_retired_head' if (and ONLY if) it has been updated since the last call. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-03drm/i915: Make ring freespace calculation more robustDave Gordon1-3/+3
The used space in a ring is given by the cyclic distance from the consumer (HEAD) to the producer (TAIL), i.e. ((tail-head) MOD size); conversely, the available space in a ring is the cyclic distance from the producer to the consumer, MINUS the amount reserved for a "gap" that is supposed to guarantee that the producer never catches up with or overruns the consumer. Note that some GEN h/w requires that TAIL never approach to within one cacheline of HEAD, so the gap is usually set to twice the cacheline size to ensure this. While the existing code gives the correct answer for correct inputs, if the producer HAS overrun into the reserved space, the result can be a value larger than the maximum valid value (size-reserved). We can improve this by reorganising the calculation, so that in the event of overrun the result will be negative rather than over-large. This means that the commonly-used test (available >= required) will then reject further writes into the ring after an overrun, giving some chance that we can recover from or at least diagnose the original problem; whereas allowing more writes would likely both confuse the h/w and destroy the evidence of what went wrong. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-03drm/i915: Connect requests to rings at creation not submissionJohn Harrison1-0/+1
It makes a lot more sense (and makes future seqno -> request conversion patches simpler) to fill in the 'ring' field of the request structure at the point of creation rather than submission. Given that the request structure is assigned by ring specific code and thus is locked to a ring from the start, there really is no reason to defer this assignment. For: VIZ-4377 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-03drm/i915: Remove obsolete seqno parameter from 'i915_add_request'John Harrison1-1/+1
There is no longer any need to retrieve a seqno value from an i915_add_request() call. The calling code already knows which request structure is being processed (it can only be ring->OLR). And as the request itself is now used in preference to the basic seqno value, the latter is now redundant in this situation. For: VIZ-4377 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>