It's UWAweek 47

help5507

This forum is provided to promote discussion amongst students enrolled in CITS5507 High Performance Computing.

Please consider offering answers and suggestions to help other students! And if you fix a problem by following a suggestion here, it would be great if other interested students could see a short "Great, fixed it!" follow-up message.



 UWA week 33 (2nd semester, week 4) ↓

2:03pm Fri 16th Aug, ANONYMOUS

In Lab 3, after following the program from the slides, I observed that increasing the number of threads led to an increase in the total running time, which was contrary to my expectations. Could you help me understand if this behavior is normal or if there might be an issue with the program?


 UWA week 34 (2nd semester, week 5) ↓

1:41pm Wed 21st Aug, Abdul M.

Hi, can you give me more details about your implementation? It would be better if you could attach your code. I suspect that if you are using a reduction and the number of threads is quite large, the reduction steps are adding time overhead.


6:48pm Fri 23rd Aug, ANONYMOUS

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>
#include <omp.h>

#define ARRAY_LEN 1000000

int main(int argc, char* argv[])
{
    unsigned int seed = 12345;
    srand(seed);

    int arr[ARRAY_LEN];
    for (int i = 0; i < ARRAY_LEN; ++i) {
        arr[i] = rand();
    }

    int realTotalSum = 0;
    for (int i = 0; i < ARRAY_LEN; ++i) {
        realTotalSum += arr[i];
    }

    clock_t start = clock();

    const int n = 128;
    omp_set_num_threads(n);
    printf("OpenMP running with %d threads\n", omp_get_max_threads());

    int localSums[128] = {0};
    int localSum = 0;
    long i = 0;

    #pragma omp parallel private(localSum)
    {
        int id = omp_get_thread_num();
        printf("Hello world from thread %d\n", id);

        #pragma omp for
        for (long i = 0; i < ARRAY_LEN; ++i) {
            localSum += arr[i];
        }

        localSums[id] = localSum;
        printf("local sum in thread: %d\n", localSum);
    }

    int totalSum = 0;
    for (int i = 0; i < n; ++i) {
        totalSum += localSums[i];
    }

    printf("realTotalSum:%d\n", realTotalSum);
    printf("totalSum:%d\n", totalSum);

    clock_t end = clock();
    double timeSpent = (double)(end - start) / CLOCKS_PER_SEC;
    printf("time spent = %10.6f\n", timeSpent);

    return 0;
}

// This is my code. I just change const int n = 128 to other values for testing.


 UWA week 35 (2nd semester, week 6) ↓

5:44pm Mon 26th Aug, Abdul M.

I just ran your code and can confirm that I get the same result. Yes, this behavior is quite normal.

With more threads, we must account for the cost of creating and destroying threads, distributing data, memory access, and so on. The more threads there are, the more time this overhead takes.
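
A minimal sketch of how the per-thread sum in the posted code could be expressed with an OpenMP reduction, offered as an illustration rather than the lab's intended solution. Two observations about the posted code motivate it: `private(localSum)` gives each thread an uninitialised copy of `localSum`, and adding 1,000,000 values from `rand()` can overflow an `int` where `RAND_MAX` is large. The function name `parallel_sum` and the `long long` accumulator are assumptions made for this example.

```c
#include <stddef.h>

/* Hypothetical helper: sum an int array with an OpenMP reduction.
 * The reduction clause gives every thread its own copy of `total`,
 * initialised to 0, and adds the copies together when the loop ends. */
long long parallel_sum(const int *arr, size_t len)
{
    /* 64-bit accumulator (an assumption of this sketch): the sum of many
     * large int values may not fit in an int. */
    long long total = 0;

    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < (long)len; ++i) {
        total += arr[i];
    }

    return total;
}
```

Compiled with `gcc -fopenmp`, a call such as `parallel_sum(arr, ARRAY_LEN)` would stand in for the `localSums` array and the final serial loop. Without `-fopenmp` the pragma is ignored and the loop runs serially, giving the same sum.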


1:13pm Wed 28th Aug, ANONYMOUS

Um... so what is a suitable number of threads to achieve a speedup? I can't seem to find one.


 UWA week 36 (2nd semester, mid-semester break) ↓

2:13pm Wed 4th Sep, ANONYMOUS

To clarify, I mean no matter how many threads I set, e.g. from 1 to 128, the running time is always longer than the running time of 1 thread.
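
One possible contributor to the timings described in this thread, stated as an assumption rather than a diagnosis: the posted program times itself with `clock()`, which reports processor time and, on systems such as Linux, adds up the time used by every thread. That figure can grow with the thread count even when the elapsed time does not. The timed region also includes a `printf` from every thread, and summing 1,000,000 integers is a small amount of work compared with starting 128 threads. The sketch below records both timers so they can be compared; `wall_seconds` and `timed_sum` are hypothetical helper names, and `omp_get_wtime()` is used for elapsed time when OpenMP is enabled.

```c
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Elapsed (wall-clock) seconds. Falls back to clock() when the file is
 * compiled without OpenMP, in which case the code is single-threaded. */
static double wall_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Hypothetical helper: sum `arr` using `threads` threads and report both
 * the elapsed time (*wall_out) and the processor time (*cpu_out). */
long long timed_sum(const int *arr, long len, int threads,
                    double *wall_out, double *cpu_out)
{
    long long total = 0;
    (void)threads;  /* unused when compiled without OpenMP */

    clock_t cpu_start = clock();
    double wall_start = wall_seconds();

    /* Only the summation is timed: no printing inside the region. */
    #pragma omp parallel for num_threads(threads) reduction(+:total)
    for (long i = 0; i < len; ++i) {
        total += arr[i];
    }

    *wall_out = wall_seconds() - wall_start;
    *cpu_out = (double)(clock() - cpu_start) / CLOCKS_PER_SEC;
    return total;
}
```

Calling `timed_sum` on the same array with 1, 2, 4 and 8 threads and printing `*wall_out` next to `*cpu_out` gives a direct comparison of the two measurements. A larger array, or repeating the loop several times, may be needed before any speedup is visible.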

The University of Western Australia

Computer Science and Software Engineering

CRICOS Code: 00126G
Last modified  8:08AM Aug 25 2024