
Lesson 14

Prefix Sum Definitions

Prefix sum
Given an array ... The prefix sum at a position is the sum of all the elements in the array from the beginning to that position, including the value at that position.

The sequential algorithm is fairly straightforward: start at the front of the array and go to the back. It maintains a running sum and adds each value to it. It does not need to allocate additional space to perform the sum.
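A minimal sketch of the sequential algorithm in Python, assuming the input is a mutable list of numbers. The function name `prefix_sum_in_place` is illustrative and does not come from the lesson.

```python
def prefix_sum_in_place(a):
    """Overwrite each a[i] with a[0] + ... + a[i] and return the same list."""
    running = 0                 # the current sum carried from front to back
    for i in range(len(a)):
        running += a[i]
        a[i] = running          # no extra array is allocated
    return a


print(prefix_sum_in_place([3, 1, 4, 1, 5]))  # [3, 4, 8, 9, 14]
```

Each iteration depends on the running sum left by the previous one, which is why this loop cannot simply be turned into a parallel loop.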

Scans

Prefix operations include more than sums; for example, a prefix max computation is also possible.
The generalization of a prefix operation is called a scan.
For example: sum scan, product scan, max scan, etc.
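As an illustration of the generalization, the sketch below uses Python's standard `itertools.accumulate` to run the same scan with different operators. The wrapper name `scan` is an assumption made for this example.

```python
from itertools import accumulate
import operator


def scan(values, op):
    """Inclusive scan: element i of the result is op applied over values[0..i]."""
    return list(accumulate(values, op))


data = [3, 1, 4, 1, 5]
print(scan(data, operator.add))  # sum scan:     [3, 4, 8, 9, 14]
print(scan(data, operator.mul))  # product scan: [3, 3, 12, 12, 60]
print(scan(data, max))           # max scan:     [3, 3, 4, 4, 5]
```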

Parallel Scans

Parallel scans require iterations of the loop to be independent of one another.

A Naive Parallel Scan

To parallelize a scan, do n independent parallel reductions.
The span for a scan parallelized this way is O(log n) because both par-for and reduce have logarithmic span.

The work is O(n^2): each reduce costs O(i) work, so the total work is the sum of all i from 1 to n.

This is worse than the sequential operation (which has O(n) operations).

Therefore doing independent parallel reductions is not a good idea (it's lame).
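To make the cost concrete, here is a small sketch of the naive approach. It runs sequentially; the outer comprehension stands in for the par-for, and Python's built-in `sum` stands in for the reduce. The names `naive_scan` and `naive_work` are illustrative only.

```python
def naive_scan(a):
    """One independent reduction per output position (conceptually a par-for)."""
    return [sum(a[: i + 1]) for i in range(len(a))]


def naive_work(n):
    """Total cost when the reduce for position i costs i units: 1 + 2 + ... + n."""
    return sum(range(1, n + 1))


print(naive_scan([3, 1, 4, 1, 5]))  # [3, 4, 8, 9, 14]
print(naive_work(100))              # 5050, i.e. n(n + 1)/2, which grows as n^2
```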

Parallel Scans

This section parallelizes the prefix sum problem; the same concept can be used for parallelizing other scans.

Prefix Sum Parallelization steps:
1. If the prefix sum can be reduced to a single value and associativity can be assumed, then the partial sums can be rearranged.
2. Combine an element with its neighbor. This will be a new set of partial consecutive sums. The first element of the partial sums is the first element of the array (not the sum).
3. Do a scan on the partial consecutive sums. This will give you all of the even results.
4. To get the odd results, take the partial sum and add the odd value to it. For example: to get 1:3, add 1:2 plus 3.

Work for this algorithm: O(n)
Span for this algorithm: O(log^2 n)
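The pair-and-recurse steps can be sketched as follows. This is a sequential Python rendering under a few assumptions: it uses 0-based indexing (so the lesson's "even" positions are the odd indices here), it returns a new list rather than updating in place, and the name `add_scan` is chosen for the example. The two loops marked as par-for candidates have independent iterations.

```python
def add_scan(a):
    """Inclusive prefix sum by combining neighbours, recursing, then fixing up."""
    n = len(a)
    if n <= 1:
        return list(a)

    # Combine each element with its neighbour (par-for candidate).
    pair_sums = [a[i] + a[i + 1] for i in range(0, n - 1, 2)]

    # Scan the pair sums; this yields the results at positions 2, 4, 6, ... (1-based).
    scanned = add_scan(pair_sums)

    out = [0] * n
    out[0] = a[0]                      # first result is just the first element
    for k, s in enumerate(scanned):
        out[2 * k + 1] = s             # the "even" results in 1-based terms

    # Remaining results: previous prefix plus the element itself,
    # e.g. 1:3 = 1:2 + 3 (par-for candidate).
    for i in range(2, n, 2):
        out[i] = out[i - 1] + a[i]
    return out


print(add_scan([1, 2, 3, 4, 5]))  # [1, 3, 6, 10, 15]
```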

Parallel Scan Analysis

Work
Note the number of additions the algorithm does at different stages. addScan operates on a problem that is half the size.

W(n) ~ 2n. The factor of 2 is hidden in the O(n) work bound, and this count includes only addition operations.
The sequential version of the algorithm has W(n) = n.
The parallel version does twice the amount of work.
Theoretically this is not important, but it means there is a cost to be paid for parallelism.
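As a rough check on the 2n figure, the sketch below counts additions under one assumed formulation of the scan: pair up neighbours (about n/2 additions), recurse on the n/2 pair sums, then fix up the remaining positions with one addition each. The function name `scan_additions` is hypothetical, and exact counts depend on how the edge cases are handled.

```python
def scan_additions(n):
    """Number of additions done by a pair-and-recurse scan on n elements."""
    if n <= 1:
        return 0
    pairing = n // 2            # one addition per neighbouring pair
    fix_up = (n - 1) // 2       # one addition per remaining position after the first
    return pairing + fix_up + scan_additions(n // 2)


n = 1024
print(scan_additions(n))   # 2036, close to 2n = 2048
print(n - 1)               # 1023 additions for the sequential running sum
```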


Span
addScan recurses on a subproblem of half the size, so the recursion is O(log n) levels deep. The other operations can be implemented using par-for, which has O(log n) span at each level. Combined, the total span is O(log^2 n).

The master theorem can be used to solve this quickly and easily.

Master Theorem
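Assuming the recurrences implied by the analysis (one recursive call on half the input, plus linear work and logarithmic span for the par-for loops at each level), the bounds work out as sketched below. This is an outline of the reasoning, not a statement of the theorem itself.

```latex
\begin{align*}
W(n) &= W(n/2) + O(n)      &&\Rightarrow\quad W(n) = O(n) \\
D(n) &= D(n/2) + O(\log n) &&\Rightarrow\quad D(n) = O(\log^2 n)
\end{align*}
```

For the work, the per-level cost shrinks geometrically, so the top level dominates. For the span, there are O(log n) levels that each contribute O(log n).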

Parallel Quicksort

Sequential Quicksort:
1. Choose a pivot element; select one uniformly at random.
2. Partition the input around the pivot value. Elements that are less than or equal to the pivot go on one side; elements greater than the pivot go on the other side.
3. Now the two partitions can be independently sorted.

Parallel Quicksort:
1. Spawn the two partitions as independent tasks.
2. Repeat: choose a pivot, partition, and spawn.
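A sketch of the spawn structure using Python threads. Several things here are assumptions made for the example: the `spawn_depth` parameter that caps how many levels spawn a thread, removing the pivot before partitioning so the recursion always shrinks, and the sequential partition itself. Python threads will not make this faster because of the interpreter lock, so the sketch only shows where the independent tasks arise.

```python
import random
import threading


def quicksort(a, spawn_depth=2):
    """Return a sorted copy of list a; the two partitions are independent tasks."""
    if len(a) <= 1:
        return list(a)

    p = random.randrange(len(a))          # pivot chosen uniformly at random
    pivot = a[p]
    rest = a[:p] + a[p + 1:]
    le = [x for x in rest if x <= pivot]  # less than or equal to the pivot
    gt = [x for x in rest if x > pivot]   # greater than the pivot

    if spawn_depth > 0:
        result = {}

        def sort_left():
            result["le"] = quicksort(le, spawn_depth - 1)

        task = threading.Thread(target=sort_left)
        task.start()                                   # spawn one partition
        sorted_gt = quicksort(gt, spawn_depth - 1)     # sort the other here
        task.join()                                    # wait for the spawned task
        sorted_le = result["le"]
    else:
        sorted_le = quicksort(le, 0)
        sorted_gt = quicksort(gt, 0)

    return sorted_le + [pivot] + sorted_gt


print(quicksort([5, 2, 9, 1, 7, 2]))  # [1, 2, 2, 5, 7, 9]
```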

The Parallel Algorithm for Quicksort

Parallel Partition:

You cannot just substitute a par-for for the for loop. You need a lock on K; without it, the par-for loop will give incorrect values.
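The hazard can be illustrated with a hedged sketch. It assumes K is a shared counter marking the next free output slot, which is how `k` is used below; the function name `partition_le` and the use of a thread pool as a stand-in for par-for are choices made for this example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def partition_le(a, pivot, workers=4):
    """Collect the elements <= pivot, with a lock guarding the shared counter."""
    out = [None] * len(a)
    k = 0                        # shared write position (plays the role of K)
    lock = threading.Lock()

    def visit(x):
        nonlocal k
        if x <= pivot:
            with lock:           # without this, two tasks could claim the same slot
                out[k] = x
                k += 1

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(visit, a))  # stand-in for the par-for over all elements
    return out[:k]


print(sorted(partition_le([5, 2, 9, 1, 7], pivot=5)))  # [1, 2, 5]
```

The lock makes the result correct, but every write now waits its turn and the output order depends on scheduling, which is what motivates a scan-based approach.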

Conditional Gathers Using Scan

To do the quicksort in parallel:
1. Select a pivot.
2. In parallel, compare each element to the pivot. If the element is less than or equal to the pivot, put a 1 in the auxiliary array. If the element is greater than the pivot, put a 0 in the auxiliary array.

This will result in an array consisting of 1s and 0s. This array can be scanned.
The following information can be obtained from the scan:
1. The last element in the scanned array is the total number of elements less than or equal to the pivot. This means we can allocate an output array of the appropriate size.
2. Every time there is an increase in the scan, this corresponds to an element in the array that is less than or equal to the pivot. This makes the desired elements easy to find in a scan.
3. Within the scan, the incremented values are both unique and consecutive. These values can be used as indices into the output array, and the elements can be written in parallel.

This is the pseudocode for the conditional gathers using scans.

W(n) = O(n)
D(n) = O(log n)

A flag scan followed by a write can be combined into an algorithmic primitive called gatherIf.

gatherIf(A[:], F[:]) → A[F[:]]
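A minimal Python sketch of the flag, scan, and write sequence. It runs sequentially; `itertools.accumulate` stands in for the parallel add scan, and the final loop has independent iterations that write to distinct slots. The snake-case name `gather_if` is used for the example.

```python
from itertools import accumulate


def gather_if(a, flags):
    """Return the elements of a whose flag is 1, keeping their original order."""
    scanned = list(accumulate(flags))           # add scan of the 0/1 flags
    count = scanned[-1] if scanned else 0       # last entry = number of kept elements
    out = [None] * count                        # output array of exactly the right size

    for i in range(len(a)):                     # par-for candidate: no two writes collide
        if flags[i]:
            out[scanned[i] - 1] = a[i]          # scan value gives a unique output index
    return out


a = [5, 2, 9, 1, 7]
pivot = 5
flags = [1 if x <= pivot else 0 for x in a]     # [1, 1, 0, 1, 0]
print(gather_if(a, flags))                      # [5, 2, 1]
```

No lock is needed because the scan assigns each kept element its own index before any write happens.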

Segmented Scans

A segmented scan does independent scans on segments of the array. The segments do not have to be the same size.

1. Begin with an array of flags, consisting of True and False. A True value is placed wherever a segment begins.
2. Then an algorithm looks at the flag array. If it sees a True flag, it does nothing. If it sees a False flag, it does a scan.

Use the op primitive to replace the conditional statement. Then we can use it to parallelize a segmented scan.

The op primitive needs to be associative:
op(op(a, b), c) = op(a, op(b, c))

The cost of an operation affects its work and span, but not its correctness.
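One common way to fold the conditional into an associative operator is to scan (flag, value) pairs instead of bare values. The sketch below assumes that formulation; the names `seg_op` and `segmented_add_scan` are illustrative, and `itertools.accumulate` stands in for a parallel scan.

```python
from itertools import accumulate


def seg_op(left, right):
    """Associative operator on (flag, value) pairs for a segmented add scan."""
    left_flag, left_val = left
    right_flag, right_val = right
    # A True flag on the right starts a new segment, so the left sum is dropped.
    value = right_val if right_flag else left_val + right_val
    return (left_flag or right_flag, value)


def segmented_add_scan(values, flags):
    pairs = zip(flags, values)
    return [value for _, value in accumulate(pairs, seg_op)]


values = [1, 2, 3, 4, 5, 6]
flags = [True, False, False, True, False, True]   # segments: [1,2,3] [4,5] [6]
print(segmented_add_scan(values, flags))          # [1, 3, 6, 4, 9, 6]
```

Because `seg_op` is associative, any scan algorithm that only relies on associativity can be reused unchanged with it.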

List Ranking Definitions

List ranking is hard to speed up.

List ranking asks:
Given a linked list, what is the distance of each node from the head?

Sequentially the problem is easy.

Given a linked list, assign a 1 to each node except the head node, which gets 0. An add scan over the list then gives the rank of every node.

Linked Lists as Array Pools

A linked list has only one entry point. To make it more accessible, use two arrays. One array holds the value of each node; the second array holds the index of the next node (the next pointer).

The array representation must be at least as large as the linked list.
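A small sketch of the two-array representation together with the easy sequential ranking. The sentinel `NIL = -1` for "no next node", the explicit `head` index, and the function name are assumptions made for this example.

```python
NIL = -1  # assumed sentinel meaning "no next node"

# The list a -> b -> c -> d stored in arbitrary pool slots.
values = ["c", "a", "d", "b"]     # values[i] is the value held in slot i
next_idx = [2, 3, NIL, 0]         # next_idx[i] is the slot of the node after slot i
head = 1                          # slot holding the first node


def sequential_rank(next_idx, head):
    """Walk the list from the head and record each node's distance from it."""
    rank = [None] * len(next_idx)  # slots not on the list keep None
    node, distance = head, 0
    while node != NIL:
        rank[node] = distance
        distance += 1
        node = next_idx[node]
    return rank


print(sequential_rank(next_idx, head))  # [2, 0, 3, 1]
```

With the arrays in place, every node can be reached directly by its slot index instead of only through the head.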

A Parallel List Ranker

To make this parallel, use the jump primitive.
The jump primitive takes a node's pointer (that was pointing to the neighbor) and points it to the neighbor's neighbor.


If the jumps occur at the same time, the list is split into two sublists. Each sublist has every other node in the list.

The simultaneous jumps can be done repeatedly, creating many sublists of smaller and smaller size.
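The effect of simultaneous jumps can be seen in a short sketch. It assumes the next pointers are stored in an index array with `NIL = -1` marking the end of a list; building a fresh array mimics all nodes reading the old pointers before any are overwritten. The function name `jump` follows the text, while its signature is made up for the example.

```python
NIL = -1  # assumed sentinel meaning "no next node"


def jump(next_idx):
    """Point every node at its neighbour's neighbour, all at once."""
    return [NIL if j == NIL else next_idx[j] for j in next_idx]


chain = [1, 2, 3, 4, 5, NIL]   # 0 -> 1 -> 2 -> 3 -> 4 -> 5
once = jump(chain)
print(once)                    # [2, 3, 4, 5, -1, -1]: 0 -> 2 -> 4 and 1 -> 3 -> 5
print(jump(once))              # [4, 5, -1, -1, -1, -1]: four sublists
```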

The parallel list ranker:
1. Store the list as an array pool.
2. List ranking is basically an add scan.
3. Use the jump primitive to divide and conquer.

Candidate invariant: if 1s are used to denote rank, the jumps will disturb this method. So the ranking should be changed to be an add scan: add the ranks and store the result in the array.

UpdateRanks is a primitive that pushes the rank value onto the receiving node by adding the two ranks. This preserves the rank of each node. These updates can be done in parallel.

If the array pool is of size n, the maximum number of jump steps is O(log n).

W(n) = O(m log m)

D(n) = O(log^2 m)
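Putting the pieces together, the sketch below ranks a list by repeated rank updates and jumps. It is a sequential simulation under several assumptions: `NIL = -1` marks the end of the list, every pool slot belongs to the list, and fresh copies of both arrays are made each round so that all reads see the previous round's values, as they would if the updates were simultaneous. The names `rank_list`, `new_rank`, and `new_next` are illustrative.

```python
NIL = -1  # assumed sentinel meaning "no next node"


def rank_list(next_idx, head):
    """Rank every node by pointer jumping; assumes all slots are on the list."""
    m = len(next_idx)
    if m == 0:
        return []
    rank = [1] * m
    rank[head] = 0                      # head has distance 0, every other node starts at 1
    nxt = list(next_idx)

    rounds = (m - 1).bit_length()       # ceil(log2(m)) jump steps are enough
    for _ in range(rounds):
        new_rank = list(rank)
        new_next = list(nxt)
        for i in range(m):              # par-for candidate: each node has one receiver
            j = nxt[i]
            if j != NIL:
                new_rank[j] = rank[i] + rank[j]   # update ranks: push onto the receiver
                new_next[i] = nxt[j]              # jump over the neighbour
        rank, nxt = new_rank, new_next
    return rank


print(rank_list([2, 3, NIL, 0], head=1))  # [2, 0, 3, 1]
```

Each round touches all m slots and there are about log m rounds, which matches the O(m log m) work. The extra log factor in the span comes from the par-for in each round.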
