Using accelerator and accelerator_view Objects
You can use the accelerator and accelerator_view classes to specify the device or emulator to run your C++ AMP code on. A system might have several devices or emulators that differ by amount of memory, debugging support, or double-precision support. C++ Accelerated Massive Parallelism (C++ AMP) provides APIs that you can use to examine the available accelerators, set one as the default, specify multiple accelerators for multiple calls to parallel_for_each, and perform special debugging tasks.
Using the Default Accelerator
The C++ AMP runtime picks a default accelerator, unless you write code to pick a specific one. The runtime chooses the default accelerator as follows:
If the app is running in debug mode, an accelerator that supports debugging.
Otherwise, the accelerator that's specified by the CPPAMP_DEFAULT_ACCELERATOR environment variable, if it's set.
Otherwise, a non-emulated device.
Otherwise, the device that has the greatest amount of available memory.
Otherwise, a device that's not attached to the display.
You can determine the properties of the default accelerator by constructing the default accelerator and examining its properties. The following code example prints the path, memory, and double-precision support of the default accelerator.
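#include <amp.h>
#include <iostream>
using namespace concurrency;

// Prints the path, dedicated memory, and double-precision support
// of the default accelerator.
void default_properties() {
    accelerator default_acc;  // the default constructor yields the default accelerator
    std::wcout << default_acc.device_path << "\n";
    std::wcout << default_acc.dedicated_memory << "\n";  // in KB
    std::wcout << (default_acc.supports_double_precision ?
        "double precision: true" : "double precision: false") << "\n";
}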
CPPAMP_DEFAULT_ACCELERATOR Environment Variable
You can set the CPPAMP_DEFAULT_ACCELERATOR environment variable to specify the accelerator::device_path of the default accelerator. The path is hardware-dependent. The following code uses the accelerator::get_all function to retrieve a list of the available accelerators and then displays the path, memory, and double-precision support of each accelerator.
// Lists the path, dedicated memory, and double-precision support
// of every available accelerator.
void list_all_accelerators()
{
    std::vector<accelerator> accs = accelerator::get_all();
    for (size_t i = 0; i < accs.size(); i++) {
        std::wcout << accs[i].device_path << "\n";
        std::wcout << accs[i].dedicated_memory << "\n";
        std::wcout << (accs[i].supports_double_precision ?
            "double precision: true" : "double precision: false") << "\n";
    }
}
Selecting an Accelerator
To select an accelerator, use the accelerator::get_all method to retrieve a list of the available accelerators and then select one based on its properties. This example shows how to pick the accelerator that has the most memory:
void pick_with_most_memory()
{
    std::vector<accelerator> accs = accelerator::get_all();
    accelerator acc_chosen = accs[0];
    // Keep whichever accelerator reports the most dedicated memory.
    for (size_t i = 1; i < accs.size(); i++) {
        if (accs[i].dedicated_memory > acc_chosen.dedicated_memory) {
            acc_chosen = accs[i];
        }
    }
    std::wcout << "The accelerator with the most memory is "
        << acc_chosen.device_path << "\n"
        << acc_chosen.dedicated_memory << ".\n";
}
Note
One of the accelerators that accelerator::get_all returns is the CPU accelerator. You cannot execute code on the CPU accelerator. To filter it out, compare the device_path property of each accelerator that accelerator::get_all returns with the value of accelerator::cpu_accelerator. For more information, see the "Special Accelerators" section in this article.
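The comparison described in this note can be sketched as a small helper. This is only an illustration: the name without_cpu is hypothetical and not part of C++ AMP. It is written as a template so that it compiles without amp.h, and it assumes only that each element exposes a device_path member that can be compared with a std::wstring, as the accelerator class does.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of C++ AMP): returns a copy of `accs`
// without any entry whose device_path equals `cpu_path`.
// Assumption: Acc exposes a `device_path` member comparable to std::wstring.
template <typename Acc>
std::vector<Acc> without_cpu(const std::vector<Acc>& accs,
                             const std::wstring& cpu_path)
{
    std::vector<Acc> result;
    for (const Acc& acc : accs) {
        if (acc.device_path != cpu_path) {
            result.push_back(acc);  // keep everything except the CPU accelerator
        }
    }
    return result;
}
```

With C++ AMP, the call would presumably look like without_cpu(accelerator::get_all(), accelerator::cpu_accelerator).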
Changing the Default Accelerator
You can change the default accelerator by calling the accelerator::set_default method. You can change the default accelerator only once per app execution and you must change it before any code is executed on the GPU. Any subsequent function calls to change the accelerator return false. If you want to use a different accelerator in a call to parallel_for_each, read the "Using Multiple Accelerators" section in this article. The following code example sets the default accelerator to one that is not emulated, is not connected to a display, and supports double-precision.
// Requires <algorithm> for std::find_if.
bool pick_accelerator()
{
    std::vector<accelerator> accs = accelerator::get_all();
    accelerator chosen_one;  // starts out as the current default accelerator

    auto result = std::find_if(accs.begin(), accs.end(),
        [] (const accelerator& acc)
        {
            return !acc.is_emulated &&
                acc.supports_double_precision &&
                !acc.has_display;
        });

    // If no accelerator matches, chosen_one stays the current default.
    if (result != accs.end()) {
        chosen_one = *result;
    }

    std::wcout << chosen_one.description << std::endl;
    return accelerator::set_default(chosen_one.device_path);
}
Using Multiple Accelerators
There are two ways to use multiple accelerators in your app:
You can pass accelerator_view objects to the calls to the parallel_for_each method.
You can construct an array object using a specific accelerator_view object. The C++ AMP runtime will pick up the accelerator_view object from the captured array object in the lambda expression.
Special Accelerators
The device paths of three special accelerators are available as properties of the accelerator class:
accelerator::direct3d_ref Data Member: This single-threaded accelerator uses software on the CPU to emulate a generic graphics card. It's used by default for debugging, but it's not useful in production because it's slower than the hardware accelerators. Additionally, it's available only in the DirectX SDK and the Windows SDK, and it's unlikely to be installed on your customers' computers. For more information, see Debugging GPU Code.
accelerator::direct3d_warp Data Member: This accelerator provides a fallback solution for executing C++ AMP code on multi-core CPUs that use Streaming SIMD Extensions (SSE).
accelerator::cpu_accelerator Data Member: You can use this accelerator for setting up staging arrays. It cannot execute C++ AMP code. For more information, see the Staging Arrays in C++ AMP post on the Parallel Programming in Native Code blog.
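One way to act on these three paths in code is a small classifier, sketched below. The names accelerator_kind and classify_path are hypothetical and not part of C++ AMP. The three special paths are passed in as parameters rather than hard-coded, so the block compiles without amp.h; with C++ AMP, the arguments would presumably be accelerator::direct3d_ref, accelerator::direct3d_warp, and accelerator::cpu_accelerator.

```cpp
#include <string>

// Hypothetical categories for illustration; not part of C++ AMP.
enum class accelerator_kind { reference, warp, cpu, other };

// Hypothetical helper: compares a device path against the three special
// paths supplied by the caller and reports which one it matches, if any.
inline accelerator_kind classify_path(const std::wstring& path,
                                      const std::wstring& ref_path,
                                      const std::wstring& warp_path,
                                      const std::wstring& cpu_path)
{
    if (path == ref_path)  return accelerator_kind::reference;
    if (path == warp_path) return accelerator_kind::warp;
    if (path == cpu_path)  return accelerator_kind::cpu;
    return accelerator_kind::other;  // none of the three special accelerators
}
```

A caller could use the result to skip the reference and CPU accelerators when choosing a device for production runs.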
Interoperability
The C++ AMP runtime supports interoperability between the accelerator_view class and the Direct3D ID3D11Device interface. The create_accelerator_view function takes an IUnknown interface pointer and returns an accelerator_view object. The get_device function takes an accelerator_view object and returns an IUnknown interface pointer. Both functions are declared in the concurrency::direct3d namespace.