| |
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| |
The decompression is done in-place and only the compressed tiles are
decompressed. Note: R6xx-R7xx can do that only with Z16 and Z32F.
The texture unit is programmed to use non-displayable tiling and depth
ordering of samples, so that it can fetch the texture in the native DB format.
The latest version of the libdrm surface allocator is required for stencil
texturing to work. The old one didn't create the mipmap tree correctly.
We need a separate mipmap tree for stencil, because the stencil mipmap
offsets are not really depth offsets/4.
There are still some known bugs, but this should save some memory and it also
improves performance a little bit in Lightsmark (especially with low
resolutions; tested with Radeon HD 5000).
The DB->CB copy is still used for transfers.
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
| |
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
| |
It seems to work for me now. Even the graphics corruption is gone.
This also boosts performance in Reaction Quake.
| |
The problem was we set VRAM|GTT for relocations of STATIC resources.
Setting just VRAM increases the framerate 4 times on my machine.
I rewrote the switch statement and adjusted the domains for window
framebuffers too.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
| |
On r6xx/r7xx, shader resource management needs to make sure that the
shader does not go over the GPR register limit. Each specific
ASIC has a maximum number of registers that can be split between shader
stages. For each stage, the shader must not use more registers than the
limit programmed.
v2: Print an error message when discarding draw. Don't add another
boolean to context structure, but rather propagate the discard
boolean through the call chain.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
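The per-stage check and the propagated discard flag described above could look roughly like the following C sketch. Every name here (the `mock_` functions, the stage enum, the limit values) is a hypothetical stand-in, not the r600g code; the sketch only shows the flag travelling through the call chain rather than living in the context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stage list; the real driver tracks its own set of stages. */
enum mock_stage { MOCK_STAGE_PS, MOCK_STAGE_VS, MOCK_NUM_STAGES };

/* Returns true when the draw must be discarded because some stage needs
 * more GPRs than the limit programmed for it. */
static bool mock_gprs_exceeded(const unsigned used[MOCK_NUM_STAGES],
                               const unsigned limit[MOCK_NUM_STAGES])
{
    assert(used != NULL && limit != NULL);
    for (int i = 0; i < MOCK_NUM_STAGES; i++) {
        if (used[i] > limit[i]) {
            fprintf(stderr, "stage %d uses %u GPRs, limit is %u: discarding draw\n",
                    i, used[i], limit[i]);
            return true;
        }
    }
    return false;
}

/* The discard flag is passed as an argument instead of being stored in
 * the context. Returns true if the draw was emitted. */
static bool mock_emit_draw(bool discard)
{
    return !discard;
}

/* Entry point of the sketch: compute the flag once, then hand it down. */
static bool mock_draw(const unsigned used[MOCK_NUM_STAGES],
                      const unsigned limit[MOCK_NUM_STAGES])
{
    bool discard = mock_gprs_exceeded(used, limit);
    return mock_emit_draw(discard);
}
```

A caller would invoke `mock_draw(used, limit)` once per draw; a `false` return means the draw was dropped and a message was printed to stderr.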
| |
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
| |
Reviewed-by: Brian Paul <brianp@vmware.com>
v2: update relnotes-9.1
v3: use align_malloc and align_free for malloced buffers in r300g
v4: document the new CAP in the docs
| |
Taken from the intel driver. The sample positions are actually a solution
to the 8 queens puzzle. It gives more accurate and smoother AA.
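The 8 queens property means that no two samples share a row, a column, or a diagonal of the 8x8 sample grid. A minimal C sketch of that check follows; the function name and the `col[row]` encoding are assumptions made for illustration, and the sketch does not reproduce the positions the driver actually programs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: col[row] is the column of the sample placed in
 * that row, so rows are distinct by construction. */
static bool is_queens_placement(const int *col, int n)
{
    assert(col != NULL && n > 0);
    for (int a = 0; a < n; a++) {
        for (int b = a + 1; b < n; b++) {
            int dc = col[a] - col[b];
            int dr = a - b;
            /* Same column, or same diagonal in either direction. */
            if (dc == 0 || dc == dr || dc == -dr)
                return false;
        }
    }
    return true;
}
```

For instance, the placement `{0, 4, 7, 5, 2, 6, 1, 3}` passes the check, while putting every sample on the main diagonal does not.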
| |
This allows updating only a subrange of buffer bindings.
set_vertex_buffers(pipe, start_slot, count, NULL) unbinds buffers in that
range. Binding NULL resources unbinds buffers too (both buffer and user_buffer
must be NULL).
The meta ops are adapted to only save, change, and restore the single slot
they use. The cso_context can save and restore only one vertex buffer slot.
The clients can query which one it is using cso_get_aux_vertex_buffer_slot.
It's currently set to 0. (the Draw module breaks if it's set to non-zero)
It should decrease the CPU overhead when using a lot of meta ops, but
the drivers must be able to treat each vertex buffer slot as a separate
state (only r600g does so at the moment).
I can imagine this also being useful for optimizing some OpenGL use cases.
Reviewed-by: Brian Paul <brianp@vmware.com>
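The subrange semantics described above can be sketched in C with a simplified mock. The `mock_` types and functions below are hypothetical and are not the real Gallium `pipe_context` interface; they only model the three rules from the message: a range update touches only `[start_slot, start_slot + count)`, a NULL array unbinds the range, and an entry with both pointers NULL unbinds that slot.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_SLOTS 16

/* Simplified stand-in for a vertex buffer binding. */
struct mock_vertex_buffer {
    const void *buffer;      /* resource-backed buffer, or NULL */
    const void *user_buffer; /* user-memory buffer, or NULL */
};

struct mock_context {
    struct mock_vertex_buffer vb[MOCK_MAX_SLOTS];
};

/* Update only the slots in [start_slot, start_slot + count).
 * buffers == NULL unbinds every slot in that range. */
static void mock_set_vertex_buffers(struct mock_context *ctx,
                                    unsigned start_slot, unsigned count,
                                    const struct mock_vertex_buffer *buffers)
{
    assert(ctx != NULL);
    assert(start_slot <= MOCK_MAX_SLOTS && count <= MOCK_MAX_SLOTS - start_slot);
    for (unsigned i = 0; i < count; i++) {
        struct mock_vertex_buffer *dst = &ctx->vb[start_slot + i];
        if (buffers) {
            *dst = buffers[i];
        } else {
            dst->buffer = NULL;
            dst->user_buffer = NULL;
        }
    }
}

/* A slot counts as bound if either pointer is set. */
static int mock_slot_is_bound(const struct mock_context *ctx, unsigned slot)
{
    assert(ctx != NULL && slot < MOCK_MAX_SLOTS);
    return ctx->vb[slot].buffer != NULL || ctx->vb[slot].user_buffer != NULL;
}
```

A meta op in this model would save one slot, call `mock_set_vertex_buffers(ctx, slot, 1, &vb)`, draw, and then restore that single slot, leaving all other bindings untouched.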
| |
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
| |
The 2x and 4x MSAA cases are completely broken. The lfdptr instruction returns
garbage there.
The 8x MSAA case is broken on Cayman, though at least the result looks somewhat
correct.
Only the 8x MSAA case works on Evergreen and is enabled.
| |
to match the varying limit.
| |
And use it for compute. This should improve compute support
on cayman.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
| |
These are common to both evergreen and cayman, but were
not emitted on cayman.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
| |
we were previously only setting 8 of them.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
| |
Move gfx specific bits out as the code is shared with
compute.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
| |
It's required. The CP uses this to properly allocate new
contexts. Also do a CS partial flush since we are updating
CONFIG regs which are single state.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
| |
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
| |
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
| |
The 3.2 version of the backend now sets all the correct fields for
PRED_SET* instructions.
| |
Reviewed-by: Brian Paul <brianp@vmware.com>
| |
This segfault was caused by commit
369e46888904c6d379b8b477d9242cff1608e30e, however it is my fault for not
testing the patch while it was on the list.
| |
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
| |
That depends on the texture wrap modes and filtering.
| |
- stopped using util_color
- reformatted to occupy fewer characters per line.
- used memcpy for the border color
- used pipe_color_union in the state structure
| |
by changing the format to NORM.
| |
This improves performance a little bit if there are lots of small indexed
draw commands.
| |
"get_transfer + transfer_map" becomes "transfer_map".
"transfer_unmap + transfer_destroy" becomes "transfer_unmap".
transfer_map must create and return the transfer object and transfer_unmap
must destroy it.
transfer_map is successful if the returned buffer pointer is not NULL.
If transfer_map fails, the pointer to the transfer object remains unchanged
(i.e. doesn't have to be NULL).
Acked-by: Brian Paul <brianp@vmware.com>
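The merged map/unmap contract can be sketched in C as follows. This is a hypothetical mock rather than the Gallium interface: `mock_transfer_map` both creates the transfer object and returns the mapped pointer, `mock_transfer_unmap` both unmaps and destroys it, and on failure the caller's transfer pointer is left untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for a transfer object. */
struct mock_transfer {
    void *map;
    size_t size;
};

/* Creates the transfer and maps it in one step. Returns the mapped
 * pointer, or NULL on failure. On failure *out_transfer is not written,
 * so callers must test the return value, not the transfer pointer. */
static void *mock_transfer_map(size_t size, struct mock_transfer **out_transfer)
{
    struct mock_transfer *t;

    assert(out_transfer != NULL);
    if (size == 0)
        return NULL; /* modeled failure case */

    t = malloc(sizeof(*t));
    if (!t)
        return NULL;

    t->map = calloc(1, size);
    if (!t->map) {
        free(t);
        return NULL;
    }
    t->size = size;
    *out_transfer = t;
    return t->map;
}

/* Unmaps and destroys the transfer in one step. */
static void mock_transfer_unmap(struct mock_transfer *t)
{
    assert(t != NULL);
    free(t->map);
    free(t);
}
```

The point of the design is that success is judged only by the returned buffer pointer, which is why a failed call is allowed to leave the transfer pointer holding whatever value it had before.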
| |
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
| |
Also update the register value in more appropriate places
than r600_update_derived_state.
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
| |
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
| |
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
| |
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
| |
Some variables have been removed from there too.
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
| |
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
| |
The workaround for R600 lacking VPORT_SCISSOR_ENABLE has also been simplified.
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
| |
POLY_OFFSET_DB_FMT_CNTL is moved to the framebuffer state, because it only
depends on the zbuffer format.
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
| |
The state object is actually a buffer; it's literally a buffer containing
the shader code.
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
| |
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
| |
This is not so trivial, because we disable blending if the dual src
blending is turned on and the number of color outputs is less than 2.
I decided to create 2 command buffers in the blend state object and just
switch between them when needed, because there are other states unrelated
to blending (like the color mask) and those shouldn't be changed
(the old code had it wrong).
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
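The two-command-buffer approach could be sketched in C as below. The struct layout and the `mock_` names are assumptions for illustration only and do not mirror the driver's blend state object; the sketch shows one buffer with blending as requested and a second with blending off, with unrelated state such as the color mask kept identical in both.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a pre-built command buffer. */
struct mock_command_buffer {
    unsigned color_mask;   /* unrelated to blending, same in both buffers */
    unsigned blend_enable; /* the only field that differs */
};

struct mock_blend_state {
    struct mock_command_buffer buffer;          /* blending as requested */
    struct mock_command_buffer buffer_no_blend; /* same state, blending off */
    bool dual_src_blend;
};

/* Build both buffers once, when the state object is created. */
static void mock_blend_state_init(struct mock_blend_state *s,
                                  unsigned color_mask, bool blend_enable,
                                  bool dual_src_blend)
{
    assert(s != NULL);
    s->buffer.color_mask = color_mask;
    s->buffer.blend_enable = blend_enable ? 1u : 0u;
    s->buffer_no_blend.color_mask = color_mask;
    s->buffer_no_blend.blend_enable = 0u;
    s->dual_src_blend = dual_src_blend;
}

/* Pick which buffer to emit: blending is disabled when dual source
 * blending is on but the shader writes fewer than 2 color outputs. */
static const struct mock_command_buffer *
mock_blend_select(const struct mock_blend_state *s, unsigned num_color_outputs)
{
    assert(s != NULL);
    if (s->dual_src_blend && num_color_outputs < 2)
        return &s->buffer_no_blend;
    return &s->buffer;
}
```

Switching buffers at emit time, rather than rewriting the state, is what keeps the color mask from being changed by the dual source workaround.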
| |
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
| |
r600_command_buffer is not an atom.
The "atoms" have evolved into state slots (or groups of state slots) where
you can bind states. There is a fixed number of atoms (state slots)
in the context.
The command buffers are nothing like that. They represent states, not state
slots.
We could probably give r600_atom a better name someday.
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
| |
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
| |
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>