Prefix Sum Definitions
Prefix sum
Given an array ... The prefix sum is the sum of all the elements in the array from the beginning to the position, including the value at the position.
The sequential algorithm is fairly straightforward: start at the front of the array and go to the back. It maintains a running sum and adds each value to it. It does not need to allocate additional space to perform the sum.
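A minimal sketch of the sequential algorithm described above, assuming the input is a Python list of numbers. The function name `prefix_sum_in_place` is chosen here for illustration and does not come from the notes.

```python
def prefix_sum_in_place(a):
    """Overwrite a[i] with a[0] + ... + a[i] (an inclusive prefix sum)."""
    running = 0                    # the current sum carried from front to back
    for i in range(len(a)):
        running += a[i]
        a[i] = running             # no additional array is allocated
    return a


values = [3, 1, 4, 1, 5]
prefix_sum_in_place(values)
print(values)  # [3, 4, 8, 9, 14]
```

Each iteration depends on the running sum from the previous one, which is why this loop cannot simply be turned into a parallel loop.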
Scans
Prefix operations include more than sums; for example, a prefix max computation is also possible.
The generalization of a prefix operation is called a scan.
For example: sum scan, product scan, max scan, etc.
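The three scans named above can be illustrated with Python's standard `itertools.accumulate`, which applies a two-argument function cumulatively along a sequence. The sample data is made up for this sketch.

```python
from itertools import accumulate
import operator

data = [2, 5, 1, 7, 3]

sum_scan = list(accumulate(data))                    # default operation is addition
product_scan = list(accumulate(data, operator.mul))  # running product
max_scan = list(accumulate(data, max))               # running maximum

print(sum_scan)      # [2, 7, 8, 15, 18]
print(product_scan)  # [2, 10, 10, 70, 210]
print(max_scan)      # [2, 5, 5, 7, 7]
```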
Parallel Scans
Parallel scans require the iterations of the loop to be independent of one another.
A Naive Parallel Scan
To parallelize a scan, do n independent parallel reductions.
The span for this parallelized scan is O(log n) because both parfor and reduce have logarithmic span.
The work is O(n^2): each reduce costs O(i) work, so the total work is the sum of all i from 1 to n.
This is worse than the sequential operation (which has O(n) operations).
Therefore doing n independent parallel reductions is not a good idea (it's lame).
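To make the quadratic work concrete, here is a small sketch that runs the naive scheme sequentially and counts additions. The name `naive_scan` and the counting convention (reducing i values takes i - 1 additions) are assumptions of this sketch.

```python
from functools import reduce
import operator


def naive_scan(a):
    """Compute every prefix with its own independent reduction.

    Returns (prefix sums, total number of additions performed).
    """
    out = []
    additions = 0
    for i in range(1, len(a) + 1):   # iterations are independent: a parfor candidate
        out.append(reduce(operator.add, a[:i]))
        additions += i - 1           # reducing i values takes i - 1 additions
    return out, additions


result, additions = naive_scan(list(range(1, 9)))
print(result)     # [1, 3, 6, 10, 15, 21, 28, 36]
print(additions)  # 28, i.e. n(n-1)/2 for n = 8
```

The sequential scan of the same 8 values needs only 7 additions.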
Parallel Scans
Parallelize the prefix scan problem; the same concept can be used for parallelizing other scans.
Prefix Sum Parallelization steps:
1. If the prefix sum can be reduced to a single value and associativity can be assumed, then the partial sums can be rearranged.
2. Combine each element with its neighbor. This will be a new set of partial consecutive sums. The first element of the partial sums is the first element of the array (not the sum).
3. Do a scan on the partial consecutive sums. This will give you all of the even results.
4. To get the odd results, take the partial sum and add the odd value to it. For example: to get 1:3, add 1:2 plus 3.
Work for this algorithm: O(n)
Span for this algorithm: O(log^2 n)
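The steps above can be sketched as a recursive function. This is a sequential sketch under stated assumptions: it uses 0-based indexing (so the "even results" of the notes land at indices 1, 3, 5, ...), the name `add_scan` is chosen here, and the loops marked "parfor" are the ones a parallel runtime would run concurrently.

```python
def add_scan(a):
    """Inclusive sum scan using the pair-and-recurse scheme (0-based indices)."""
    n = len(a)
    if n <= 1:
        return list(a)

    # Step 2: combine each element with its neighbor (parfor candidate).
    pairs = [a[i] + a[i + 1] for i in range(0, n - 1, 2)]

    # Step 3: scan the pair sums; scanned[k] is the sum of a[0 .. 2k+1].
    scanned = add_scan(pairs)

    out = [0] * n
    out[0] = a[0]                      # the first result is the first element itself
    for k, s in enumerate(scanned):    # parfor candidate
        out[2 * k + 1] = s
    # Step 4: remaining positions add their own value to the preceding result.
    for i in range(2, n, 2):           # parfor candidate
        out[i] = scanned[i // 2 - 1] + a[i]
    return out


print(add_scan([3, 1, 4, 1, 5]))  # [3, 4, 8, 9, 14]
```

Each level does a linear amount of work on its input and halves the problem, which is where the O(n) work bound comes from.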
Parallel Scan Analysis
Work
Note the number of additions the algorithm does at different stages. addScan operates on a problem that is half the size.
W(n) ~ 2n. The factor of 2 is hidden in the O(n) work bound, and this count includes only addition operations.
The sequential version of the algorithm has W(n) = n.
The parallel version does twice the amount of work.
Asymptotically this is not important, but it means there is a cost to be paid for parallelism.
Span
addScan operates on a subproblem of half the size. The other operations can be implemented using parfor. The recursion in addScan is O(log n) levels deep, and the other operations at each level have O(log n) span. When combined, the total span is O(log^2 n).
The master theorem can be used to solve this quickly and easily.
Master Theorem
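As a worked sketch, the work and span bounds quoted for addScan correspond to the following recurrences. The recurrences themselves are reconstructed here from the description (one recursive call on half the input, plus linear work and logarithmic span for the parfor steps); they are not written out in the notes.

```latex
\begin{align*}
W(n) &= W(n/2) + O(n)      &&\Rightarrow\quad W(n) = O(n), \\
D(n) &= D(n/2) + O(\log n) &&\Rightarrow\quad D(n) = O(\log^2 n).
\end{align*}
```

Unrolling gives the same answers without the theorem: the work is a geometric series n + n/2 + n/4 + ... which is at most 2n, and the span adds an O(log n) term at each of the O(log n) recursion levels.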
Parallel Quicksort
Sequential Quicksort:
1. Choose a pivot element; select one uniformly at random.
2. Partition the input around the pivot value. Elements that are less than or equal to the pivot go on one side; elements greater than the pivot go on the other side.
3. Now the two partitions can be independently sorted.
Parallel Quicksort:
1. Spawn the two partitions as independent tasks.
2. Repeat: choose a pivot, partition, and spawn.
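A minimal sequential sketch of these steps, assuming the input is a Python list. The comments mark where a task-parallel runtime would spawn; no actual spawning happens here. The pivot is removed before partitioning so that inputs with many equal elements still shrink at every call, which is a choice made for this sketch.

```python
import random


def quicksort(items):
    """Return a sorted copy of items using a uniformly random pivot."""
    if len(items) <= 1:
        return list(items)

    # Step 1: choose a pivot uniformly at random.
    pivot_index = random.randrange(len(items))
    pivot = items[pivot_index]
    rest = items[:pivot_index] + items[pivot_index + 1:]

    # Step 2: partition around the pivot value.
    smaller_or_equal = [x for x in rest if x <= pivot]
    greater = [x for x in rest if x > pivot]

    # Step 3: the two partitions are independent, so a parallel version
    # would spawn these two calls as separate tasks and then wait for both.
    left = quicksort(smaller_or_equal)
    right = quicksort(greater)
    return left + [pivot] + right


print(quicksort([5, 2, 8, 3, 9, 1, 5]))  # [1, 2, 3, 5, 5, 8, 9]
```

Spawning the recursive calls alone leaves the partition step sequential, so the partition itself also needs a parallel formulation.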
The Parallel Algorithm for Quicksort
Parallel Partition:
You cannot just substitute a parfor for the for loop. You need a lock on K; without it, the parfor loop will give incorrect values.
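The loop being discussed is not reproduced in these notes, so the following is only a guess at its shape: a sequential partition in which `k` (standing in for K) is the next free position in the output. The function name and the exact loop body are assumptions of this sketch.

```python
def get_smaller_or_equal(a, pivot):
    """Sequentially collect the elements of a that are <= pivot."""
    out = [None] * len(a)
    k = 0                      # next free output slot, shared by every iteration
    for x in a:
        if x <= pivot:
            out[k] = x         # the write position depends on all earlier iterations
            k += 1             # unsynchronized concurrent updates here would race
    return out[:k]


print(get_smaller_or_equal([5, 2, 8, 3, 9, 1], 4))  # [2, 3, 1]
```

Because every iteration reads and updates the same counter, running the iterations concurrently without synchronization could overwrite slots or skip them, and locking the counter serializes the loop again.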
Conditional Gathers Using Scan
To do the quicksort in parallel:
1. Select a pivot.
2. In parallel, compare each element to the pivot. If the element is less than or equal to the pivot, put a 1 in the auxiliary array. If the element is greater than the pivot, put a 0 in the auxiliary array.
This will result in an array consisting of 1s and 0s. This array can be scanned.
The following information can be obtained from the scan:
1. The last element in the array is the total number of elements less than or equal to the pivot. This means we can allocate an output array of the appropriate size.
2. Every time there is an increase in the scan, this corresponds to an element in the array that is less than or equal to the pivot. This makes the desired elements easy to find in a scan.
3. Within the scan, the incremental values are both unique and consecutive. These values can be used as indices into the output array and can be written in parallel.
This is the pseudocode for the conditional gathers using scans.
W(n) = O(n)
D(n) = O(log n)
A flag scan followed by a write can be combined into an algorithmic primitive called gatherIf.
gatherIf(A[:], F[:]) -> A[F[:]]
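A sketch of a conditional gather built from a flag scan, written sequentially in Python. The names `gather_if` and `smaller_or_equal` are illustrative, the flags are assumed to be 1s and 0s, and `itertools.accumulate` stands in for the parallel scan.

```python
from itertools import accumulate


def gather_if(a, flags):
    """Return the elements of a whose flag is 1, in their original order."""
    scan = list(accumulate(flags))               # inclusive add scan of the flags
    out = [None] * (scan[-1] if scan else 0)     # last scan value = output size
    for i in range(len(a)):                      # parfor candidate: targets are distinct
        if flags[i]:
            out[scan[i] - 1] = a[i]              # scan value gives a unique output slot
    return out


def smaller_or_equal(a, pivot):
    """Partition step: keep the elements that are <= pivot."""
    flags = [1 if x <= pivot else 0 for x in a]  # parfor candidate
    return gather_if(a, flags)


print(smaller_or_equal([5, 2, 8, 3, 9, 1], 4))  # [2, 3, 1]
```

For this input the flags are [0, 1, 0, 1, 0, 1] and their scan is [0, 1, 1, 2, 2, 3], so the three flagged elements are written to slots 0, 1, and 2 with no shared counter.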
Segmented Scans
A segmented scan does independent scans on segments of the array; the segments do not have to be the same size.
1. Begin with an array of flags, consisting of True and False. A True value is placed wherever a segment begins.
2. Then an algorithm looks at the flag array. If it sees a True flag, it does nothing, because the element starts a new segment. If it sees a False flag, it does a scan.
Use the op primitive to replace the conditional statement. Then we can use it to parallelize a segmented scan.
The op primitive needs to be associative.
op(op(a, b), c) = op(a, op(b, c))
The cost of an operation affects its work and span, but not its correctness.
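One common way to build such an op, shown here as a sketch rather than as the definition used in the lecture, is to scan (flag, value) pairs: a True flag on the right operand discards whatever came before it. The names `seg_op` and `segmented_add_scan` are assumptions.

```python
from itertools import accumulate


def seg_op(left, right):
    """Associative combine on (flag, value) pairs for a segmented add scan."""
    flag_l, val_l = left
    flag_r, val_r = right
    if flag_r:                       # right operand starts a segment: restart the sum
        return (True, val_r)
    return (flag_l, val_l + val_r)   # otherwise keep accumulating


def segmented_add_scan(values, flags):
    """Scan each segment independently; flags[i] is True where a segment begins."""
    pairs = zip(flags, values)
    return [value for _, value in accumulate(pairs, seg_op)]


values = [1, 2, 3, 4, 5, 6]
flags = [True, False, False, True, False, True]
print(segmented_add_scan(values, flags))  # [1, 3, 6, 4, 9, 6]
```

Because `seg_op` has no data-dependent control flow outside the operator itself and is associative, any scan algorithm that only relies on associativity can use it unchanged.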
List Ranking Definitions
List ranking is hard to speed up.
List ranking asks: given a linked list, what is the distance of each node from the head?
Sequentially the problem is easy.
Given a linked list, assign a 1 to every node except the head node. Then an add scan along the list gives each node's rank.
Linked Lists as Array Pools
A linked list has only one entry point; to make it more accessible, use two arrays. One array holds the value of each node; the second array holds the index of the next node (the next pointer).
The array representation must be at least as large as the linked list.
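A small sketch of the two-array representation together with the easy sequential ranking. The sentinel `NIL = -1`, the sample list, and the name `rank_sequential` are assumptions made for illustration.

```python
NIL = -1  # assumed sentinel meaning "no next node"

# The list a -> b -> c -> d stored in arbitrary pool slots.
values = ["c", "a", "d", "b"]
next_index = [2, 3, NIL, 0]   # slot 1 (a) -> slot 3 (b) -> slot 0 (c) -> slot 2 (d)
head = 1


def rank_sequential(next_index, head):
    """Walk the list from the head, recording each node's distance from it."""
    rank = [None] * len(next_index)
    node, distance = head, 0
    while node != NIL:
        rank[node] = distance
        distance += 1
        node = next_index[node]
    return rank


print(rank_sequential(next_index, head))  # [2, 0, 3, 1]
```

With the pool, any slot can be addressed directly by index, which is what lets a parallel algorithm work on every node at once instead of walking from the head.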
A Parallel List Ranker
To make this parallel, use the jump primitive.
The jump primitive takes a node's pointer (which was pointing to its neighbor) and points it to the neighbor's neighbor.
If the jumps occur at the same time, the list is split into two sublists. Each sublist has every other node in the list.
The simultaneous jumps can be done repeatedly, creating many sublists of smaller and smaller size.
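The splitting effect of a simultaneous jump can be seen on a six-node chain. This sketch assumes a next-index array with `NIL = -1` as the end marker; `jump` and `walk` are illustrative helper names. Building a new array from the old one models all jumps happening at the same time.

```python
NIL = -1  # assumed sentinel meaning "no next node"


def jump(next_index):
    """Point every node at its neighbor's neighbor, reading only the old array."""
    return [next_index[j] if j != NIL else NIL for j in next_index]


def walk(next_index, start):
    """Follow next pointers from start and return the nodes visited."""
    order = []
    node = start
    while node != NIL:
        order.append(node)
        node = next_index[node]
    return order


chain = [1, 2, 3, 4, 5, NIL]      # 0 -> 1 -> 2 -> 3 -> 4 -> 5
jumped = jump(chain)

print(jumped)           # [2, 3, 4, 5, -1, -1]
print(walk(jumped, 0))  # [0, 2, 4]
print(walk(jumped, 1))  # [1, 3, 5]
```

Applying `jump` again to `jumped` leaves node 0 pointing at node 4, so the sublists keep halving in length.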
The parallel list ranker:
1. Store the list as an array pool.
2. List ranking is basically an add scan.
3. Use the jump primitive to divide and conquer.
Candidate invariant: if 1s are used to denote rank, the jumps will disturb this method. So the ranking should be changed to be an add scan: add the ranks and store the result in the array.
UpdateRanks is a primitive that pushes the rank value onto the receiving node (it adds the two ranks). This preserves the rank of each node. These updates can be done in parallel.
If the array pool is of size n, the maximum number of jump steps is O(log n).
W(n) = O(m log m)
D(n) = O(log^2 m)
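Putting the pieces together, here is a sequential sketch of a pointer-jumping list ranker. It assumes `NIL = -1` as the end marker and that every pool slot belongs to the list; the function names are illustrative. Each helper builds a fresh output array from the old one, which models the parallel version where all nodes update at once.

```python
import math

NIL = -1  # assumed sentinel meaning "no next node"


def update_ranks(rank_in, next_index):
    """Push each node's rank onto the node it points to (adds the two ranks)."""
    rank_out = list(rank_in)                 # nodes that receive nothing keep their rank
    for i in range(len(next_index)):         # parfor candidate: each target has one source
        j = next_index[i]
        if j != NIL:
            rank_out[j] = rank_in[i] + rank_in[j]
    return rank_out


def jump_list(next_index):
    """Point every node at its neighbor's neighbor."""
    return [next_index[j] if j != NIL else NIL for j in next_index]


def rank_list(next_index, head):
    """Return the distance of every node from the head."""
    m = len(next_index)
    rank = [0 if i == head else 1 for i in range(m)]   # 1 per node, 0 for the head
    steps = math.ceil(math.log2(m)) if m > 1 else 0
    for _ in range(steps):
        rank = update_ranks(rank, next_index)
        next_index = jump_list(next_index)
    return rank


# The list visits pool slots 0 -> 3 -> 2 -> 1.
print(rank_list([3, NIL, 1, 2], head=0))  # [0, 3, 2, 1]
```

Every step touches all nodes, and there are a logarithmic number of steps, which is consistent with the work bound quoted above being a log factor worse than the sequential walk.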