
Acct 7397 Data Analytics 1 & 2

Work with competence and integrity: In the end, we seek evidence based
decision making - not decision based evidence making.
My Summer Vacation
current trends

...the transformation of Goldman Sachs, and increasingly other Wall Street firms, that began with the rise in computerized trading but has accelerated over the past five years, moving into more fields of finance that humans once dominated...

In addition to back-office workers, machines are replacing a lot of highly paid people, too.

Average compensation for staff in sales, trading, and research at the 12 largest global investment banks, of which Goldman is one, is $500,000 in salary and bonus.

For the highly paid who remain, the pay of the average managing director at Goldman will probably get even bigger, as there are fewer lower-level people to share the profits with, he says.

His expertise makes him suited to the task of CFO, a role more typically held by accountants: "Everything we do is underpinned by math and a lot of software."

Goldman's new consumer lending platform is entirely run by software, with no human intervention.
current trends

If machines can do accounting, can they do auditing and tax?


current trends

EY
What's the likelihood that the machines will replace accounting and audit work? If they can do accounting work, what can't they do?

Answer: We don't know yet, but so far, there seems to be no ceiling.

2020 will be a different world.


...but as jobs are eliminated, others are created.

Professional Competency (EY)
CPA skills are not enough anymore, and the workforce is becoming more and more competitive.

Professional Opportunity
The job market is changing rapidly and most of the lower-level admin jobs are disappearing; at the same time, analytics jobs are exploding.

McKinsey
Analytics job market today - trends
[Chart: growth in analytics roles; the Analytics Lead / Business Translator role is our target, and this position will increase faster than the others]

...Analytics job market today
Scope and Direction of Analytics
[Timeline figure]
Major disruptions: 2000 Internet, 2010 Mobility, 2020 Artificial Intelligence
Eras: 1996 The Data Warehouse Toolkit (Ralph Kimball); 2000-2010 Business Intelligence / Big Data; 2015- Learning Systems
Tools: IDEA, ACL; Visualization (Tableau, Power BI); Data Warehouses (SSAS); ETL - Extract, Transform, Load (SSIS, Informatica)
Scope of Data Analytics Courses
[Same timeline as above, annotated for course coverage with the labels "Tomorrow's Analyst Tools", "Covered in multiple courses", "Obsolete", and "IT Focused (re: DISC data track)" applied across IDEA / ACL, Visualization (Tableau, Power BI), Data Warehouses (SSAS), and ETL - Extract, Transform, Load (SSIS, Informatica)]
Learning Systems
Two Broad Applications: Deep Learning and Process Automation

Deep Learning Thought Exercise:


You're auditing a large manufacturing company. You're doing preliminary analysis, and you can see that the operating margin has decreased. Possible explanations?

Revenue: Change of volume or price? product mix? product technology and function? markets?
customers? new products, deprecated products? distribution channels? capacity, competitors, perceived
customer value, delivery costs

Costs: Change in capacity, transportation costs, inventory turn, manufacturing strategy / logistics,
materials market, procurement strategy, suppliers, product design and BOM

How many potential dimensions for analysis?

How many interrelationships?

Can you analyze this with a spreadsheet or visualization tool?

If you're in charge of the audit, do you think you need to understand this? What about managers in the company?

Can the machines analyze this? If machines can identify hidden drivers, do you think the client would
consider that valuable?
Learning Systems
Two Broad Applications: Deep Learning and Process Automation

Process Automation Thought Exercise:


You're auditing a large manufacturing company and you need to vouch revenues.

Data: What data needs to be considered? Do you need the entire Order-to-Cash process? Contracts -
T&Cs, Sales Order transactions, Shipping Transactions, Customs transactions, Payments.

How extensive and complex is that data? Where is it? How do you match and what about partial
matching? Will you need to read descriptions and documents? Make judgements?

Can machines do this work? Better than people?


Prerequisite Assumptions DA2
R Language (working knowledge or DA1)
Data Acquisition and Description
Analysis of Variance / Covariance, Correlation, Principal Components
Basic Calculus and Linear Algebra (cheat sheets on Blackboard)
Basic Statistics (cheat sheets on Blackboard)

Attitude
It's a competitive world. It takes hundreds of things done right to get a promotion, and one thing done wrong to get fired. Take your job seriously (and this class is your job right now):
Come to class prepared (there will be pop quizzes)
Apply what you learn: ask yourself "what if" and test your knowledge
Take responsibility: this is graduate school
Professor
Ellen Terry

http://econolytics.org

ewterry@bauer.uh.edu
MH 360K
713-743-4820

Background:

JP Morgan - Vice President Data Science


General Electric - Director Planning and Programs
Microsoft Corporation - Industry Solution Architect
+ Research at the Santa Fe Institute and United Nations
Deloitte => Polaris Consulting - Principal Consultant
Syllabus DA2 (tentative)
Introduction - Statistical Learning Theory
Ch 1 (Intro) & 2 (statistical learning) ISL + Class Material
Regression
Ch 3 ISL (linear regression) + Class Material
Ch 6 ISL (linear model selection and regularization) + Class Material
Ch 7 ISL (moving beyond linearity) + Class Material
Classification
Ch 4 ISL (classification) + Class Material
Support Vector Machines
Ch 9 ISL (support vector machines) + Class Material
Resampling
Ch 5 ISL (resampling methods) + Class Material
Projects
We'll learn in R Studio and do projects in R Studio; we may use AML (TBD depending on project scope and budget).
Data Analytics 2
Grading:
1 Mid-Term Exam - MC + Problems (R Code files) (40%)
Project Review - Team Score %*% Ranking by Leader (30%)
n Pop Quizzes (20%) - MC and/or Problems drawn from ISL and class material
Do not miss class and don't be late (no makeups)!
n Homework assignments (10%)
Resources:
Required: Introduction to Statistical Learning (James, Witten, Hastie, Tibshirani)
free download: http://www-bcf.usc.edu/~gareth/ISL/ (exam questions will be pulled from this book, whether they're covered in class or not)
R Studio (on your computer or Lab Server)
Azure Machine Learning (team allocations; you can sign up with an individual account for free, with limits)
Optional Reads:
Elements of Statistical Learning (Hastie, Tibshirani, Friedman)
All of Statistics (Larry Wasserman)
All of Nonparametric Statistics (Larry Wasserman)
Statistical Learning Body of Knowledge

The body of knowledge is VAST, COMPLEX, and growing at a BREATHTAKING PACE. But that's too bad; you have to know it anyway. You can learn as you go, but be aware of the implications: not doing the research and applying best practices is incompetence, which is unethical and introduces legal exposure to you and your firm.

Analytics Leads (business translators) take responsibility for projects, which means you have to know enough to set direction, garner respect (respect is earned through competency, not title), make decisions, and fill gaps (resources are scarce and you'll have to pick up the slack; you're still responsible to deliver on time!).
Modeling

[Diagram: Model, Parameters, Hyperparameters, and Data - all interconnected]

Modeling is iterative and intuitive - all the elements are interconnected in a complex (and sometimes perplexing) network. Discovery and changes to models, parameters, hyperparameters, and data impact all the elements, and finding the sweet spot is both art and science. That's why the modeling process is called an experiment. Good data scientists are algorithm whisperers.

And that's why we spend a lot of time on theory and intuition.


Modeling

Model Development and Model Evaluation

In search of f̂

Why estimate f?
Prediction
Estimation of a value ŷ (note: prediction is not just about the future; time is just another dimension - e.g., you might estimate a past ŷ to compare with actual y in assurance)
Inference
Description of the underlying data and relationships

How do we estimate f?
Parametric Methods. Fit f̂ to the data based on an assumption about the form of f, using statistical methods to determine the model parameters
Non-Parametric Methods. Develop the form of f̂ based on the data (within broad groups: regression (continuous y), classification (discrete y), and subgroups (e.g., support vector machines, decision trees, regression))
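A quick R sketch of the two approaches (my illustration, using the mtcars dataset that ships with R, not an example from ISL): lm() assumes a form for f and estimates its parameters; loess() lets the data determine the form locally.

# parametric: assume f is linear in hp, then estimate the parameters
fit_param <- lm(mpg ~ hp, data = mtcars)

# non-parametric: no fixed form for f; a local fit is built from the data
fit_nonparam <- loess(mpg ~ hp, data = mtcars)

new_x <- data.frame(hp = c(100, 200))
predict(fit_param, new_x)     # predictions from the assumed linear form
predict(fit_nonparam, new_x)  # predictions from the locally fitted form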
George Box

PhD from the University of London in 1953, under the supervision of Egon Pearson (you should know Karl Pearson)
Created the Department of Statistics at the University of Wisconsin
Married Joan Fisher, Ronald Fisher's daughter (you should know Ronald Fisher)

All models are wrong but some are useful

Ellen's Opinion: It's mostly important that you have applied the existing body of knowledge in a competent manner. If the model is the best you have (not so obvious), then go with it (just be transparent and always keep project sponsors in the loop). Telling managers and clients that the business model is invalid because of some theoretical issue doesn't inspire them to write checks and continue your employment.

Work with competence and integrity: In the end, we seek evidence based decision making - not decision
based evidence making.
Model Flexibility vs. Interpretability

[Figure: interpretability vs. flexibility trade-off, from most interpretable / least flexible to least interpretable / most flexible - Linear Regression, Polynomial and Non-Linear Least Squares, Local Models (Splines and LOESS), Support Vector Machines]

In general, more restrictive models will be better for inference (understanding relationships between predictors and response variables) and ensembling (consolidating algorithms within larger models, or piping data between models).

More flexible models are usually more complex and involve more parameters, leading to issues with overfitting and requiring more training data and longer processing times.
Central Concept: Bias vs Variance

Don't get these definitions confused with data description and distributions; we're talking about models here:

Bias. The difference between f̂(x) and f(x) due to the model parameters (testing)
Variance. The difference between the f̂(x)'s due to training data samples (training)

Mathematically:

Err(x) = (E[f̂(x)] - f(x))² + E[(f̂(x) - E[f̂(x)])²] + σ²ε
Test MSE = Bias² (model parameters) + Variance (training samples) + irreducible random error
The bias and variance terms are the reducible error; σ²ε is irreducible.
General rule: as you move to more complex models, bias decreases and variance increases
Bias vs Variance

[Figure: Test MSE and Training MSE vs. model flexibility, with the minimum possible Test MSE marked]
Comparing different f̂ models to the true f: non-linear data (composite for all models)

bias vs variance

Comparing different f̂ models to the true f: ~linear data

bias vs variance

Fitting f̂ models to the true f with highly non-linear data
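A minimal simulation sketch of the pattern in these figures (my own example with a made-up true f, not taken from ISL): as polynomial degree (flexibility) increases, training MSE keeps falling while test MSE eventually turns back up.

# simulate noisy data from a known non-linear f, then fit polynomials of degree 1..10
set.seed(1)
f <- function(x) sin(2 * x)   # the "true" f
x_train <- runif(100, 0, 3); y_train <- f(x_train) + rnorm(100, sd = 0.3)
x_test  <- runif(100, 0, 3); y_test  <- f(x_test)  + rnorm(100, sd = 0.3)

mse <- function(y, yhat) mean((y - yhat)^2)
results <- data.frame(degree = 1:10, train_mse = NA, test_mse = NA)
for (d in 1:10) {
  fit <- lm(y_train ~ poly(x_train, d))
  results$train_mse[d] <- mse(y_train, fitted(fit))
  results$test_mse[d]  <- mse(y_test, predict(fit, newdata = data.frame(x_train = x_test)))
}
results   # training MSE keeps falling; test MSE falls, then rises (overfitting)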
DA1 Review: R Matrix Creation
A matrix is m x n (rows, columns), so A[2,1] is the second row, first column element in a matrix

#create a matrix and a vector (several ways to do it)


A <- matrix ( c(2, 2)) # this creates a 2 x 1 matrix (a single-column vector)
A <- cbind(A, c(3,1)) # then combine vectors with cbind or
A <- matrix (c(2,3), nrow =1)
A <- rbind(A, c(2,1)) # then combine rows
A <- matrix( c(2, 2, 3, 1), nrow=2, ncol = 2) # or just create the matrix at once.
write.csv(A, file = "Class 1 - Foundations/A.csv", row.names = FALSE)
A <- read.csv(file = "Class 1 - Foundations/A.csv", header=TRUE, stringsAsFactors=FALSE)

# read.csv creates a dataframe (covered later)


str(A) #you always need to check on the data types that dataframes create
A <- as.matrix(A)

# OK, let's create a couple of vectors and move on with matrix operations
B <- c(3,2) # vector 1
C <- c(0,1) # Vector 2
D <- A #easy to duplicate data structures
D <- A[,1] # or parts - notice that D becomes a vector
D <- cbind(D, A[,2]) # and back again
D <- t(D) #transpose a matrix

A matrix transposition rotates the matrix on the main diagonal (from 1,1)
DA1 Review : Basic Matrix Operations

Operator or Function          Description
A+B, A-B                      Addition / subtraction (must be same structure; always an element operation)
t(A)                          Transpose
A*B                           Element multiplication (the product of vectors or matrices, e.g., product * price)
A %*% B                       Matrix multiplication (important)
A %o% B                       Outer product, AB'
crossprod(A,B), crossprod(A)  A'B and A'A, respectively
DA1 Review : Matrix Addition and Multiplication

D <- A+D # has to be the same structure; addition / subtraction is always an element operation
E <- B+C
E <- B-C

E <- A*B # element multiplication, e.g., E[1,1] = A[1,1] * B[1]

G <- A %*% B # matrix multiplication (dot product), e.g., G[1,1] = A[1,1]*B[1] + A[1,2]*B[2]

t(A)
H <- crossprod(A,B) # equivalent to t(A) %*% B
DA1 Review: Diagonals and Determinants
diag(x) creates a diagonal matrix with the elements of x in the principal diagonal

E <- diag(A) # if you feed it a matrix, it gives you back a vector of the diagonal
I <- diag(2) # if you feed it a number n, it creates an n x n identity matrix

det(x) returns the determinant of x

J <- det(A) # (2*1) - (3*2) = -4

K <- solve(A) # the inverse; solve() is used to solve linear systems of equations and to achieve matrix division, among other things

To get the inverse A^-1 of a 2 x 2 matrix: swap the positions of a and d, put negatives in front of b and c, and divide everything by the determinant (that's why you have solve()). A matrix times its inverse equals the identity (the definition of an inverse): A^-1 A = I.
Check L below: element-wise multiplication K*A does not give the identity matrix (that requires %*%, shown later).

L <- K*A
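A short numeric check of the 2 x 2 inverse rule above (a sketch reusing the A created earlier): swap a and d, negate b and c, divide by the determinant, and compare with solve().

A <- matrix(c(2, 2, 3, 1), nrow = 2, ncol = 2)  # the same A as before
A_inv_manual <- matrix(c(A[2,2], -A[2,1], -A[1,2], A[1,1]), nrow = 2) / det(A)
A_inv_manual - solve(A)    # all zeros: the manual inverse matches solve(A)
round(A_inv_manual %*% A)  # the identity matrix: A^-1 %*% A = I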
DA1 Review: Eigenvector transformation
Eigenvectors & Eigenvalues
Originally utilized to study principal axes of the rotational motion of rigid bodies, eigenvalues and eigenvectors have a wide range of applications, for example in stability analysis, vibration analysis, atomic orbitals, facial recognition, and matrix diagonalization. In essence, an eigenvector v of a linear transformation T is a non-zero vector that, when T is applied to it, does not change direction. Applying T to the eigenvector only scales the eigenvector by the scalar value λ, called an eigenvalue. This condition can be written as the equation:

Av = λv, or (A - λI)v = 0, which has a non-zero solution v only if det(A - λI) = 0

λ is a scalar eigenvalue associated with an eigenvector v that can be used for transformation of a matrix.
Eigenvalues and eigenvectors:

The eigenvector v is non-zero
Defined for n x n (square) matrices only (where the matrix is diagonalizable)
From a geometrical perspective, the transformation does not change the direction of the eigenvector v (next slide)
There always exists at least one eigenvalue / eigenvector pair

When eigenvectors are applied to a linear transformation, the matrix just gets scaled, and the transformation still tells us what we need to know about the original matrix.

It breaks the linear transformation down into simple operations.

To solve for eigenvalues: det(A - λI) = 0
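A quick numeric check of Av = λv (my sketch, reusing the 2 x 2 matrix from the earlier review): eigen() returns the eigenvalues and eigenvectors, and applying A to an eigenvector only scales it.

A <- matrix(c(2, 2, 3, 1), nrow = 2, ncol = 2)
e <- eigen(A)
lambda1 <- e$values[1]   # first eigenvalue
v1 <- e$vectors[, 1]     # its eigenvector
A %*% v1                 # equals...
lambda1 * v1             # ...lambda1 * v1: the direction is unchanged, only the scale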
DA1 Review: Dot Product Transformation
A matrix can be thought of as defining a linear transformation in space

Z <- A%*%B

When dot products are applied to a linear transformation, the transformation still tells us what we need to know about the original matrix.

Also called the scalar product.


DA1 Review: Eigenvectors as scalars
library(ggplot2)
library(grid)   # for arrow() and unit()

V1 <- data.frame(x=c(0, 3), y=c(0, 4))

p <- ggplot(V1, aes(x=x, y=y)) + geom_point(color="black")
p <- p + geom_segment(aes(x = V1[1,1], y = V1[1,2], xend = V1[2,1], yend = V1[2,2]), arrow = arrow(length = unit(0.5, "cm")))
p <- p + xlim(0, 20) + ylim(0, 20)
p

# create and draw the eigenvector (transpose first)
tV1 <- t(V1)
eV1 <- cbind(eigen(tV1)$values, eigen(tV1)$vectors)
ev2 <- eigen(V1)$values

V2 <- data.frame(X=c(0, (eV1[1,1]*eV1[1,2])), Y=c(0, (eV1[1,1]*eV1[2,2])))
p <- p + geom_segment(aes(x = V2[1,1], y = V2[1,2], xend = V2[2,1], yend = V2[2,2]), col="blue", linetype="dashed", arrow = arrow(length = unit(0.5, "cm")))
p

# create and draw the dot product vector
V3 <- as.matrix(V1)
dpV3 <- V3 %*% V3
p <- p + geom_segment(aes(x = dpV3[1,1], y = dpV3[1,2], xend = dpV3[2,1], yend = dpV3[2,2]), col="red", linetype="dashed", arrow = arrow(length = unit(0.5, "cm")))
p

Term: scalar

Note that the eigenvector and the dot product vector scale the original vector, but the direction doesn't change (i.e., it gives us a mechanism to transform (scale) data without changing direction).
DA1 Review: More Definitions
Vector Norm (sometimes called the magnitude)

|x|2 = sqrt( Σi xi² ); e.g., x = c(1,2,3), then |x|2 = sqrt(14)

Note: the norm of a matrix is often written ||x||
The L2 norm (or Euclidean norm) is the most common (there are different ways to calculate a norm and they give different answers). More on this later.
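Two equivalent ways to get the L2 norm in R (a small sketch; base norm() expects a matrix, so the vector is coerced first):

x <- c(1, 2, 3)
sqrt(sum(x^2))           # sqrt(14), straight from the definition above
norm(as.matrix(x), "2")  # same value from the built-in 2-norm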
Get clear on Transpose (a few examples)

A <- matrix(c(1, 2), nrow = 1, ncol = 2)


tA <- t(A)

B <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2)


tB <- t(B)

C <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3)


tC <- t(C)
DA1 Review: More Definitions
Determinants

Vectors don't have determinants, but they have norms |x|
Matrices have norms ||x|| and determinants |x| (I know, confusing)

Recall earlier:

# check to make sure it's right - should be the identity matrix
L <- K %*% A
DA1 Review: matrix decomposition

Remember factoring in HS algebra?

x^2 + 4x + 3 = (x + 3)(x + 1)

You can factor matrices too, which turns out to be very useful.

# singular value decomposition: X = U D V'
X <- matrix( c(1, 1, 1, 1, 1, 2, 3, 4), nrow=4, ncol = 2)

SingVal <- svd(X)
U <- SingVal$u
D <- diag(SingVal$d)
V <- t(SingVal$v)

X2 <- U %*% D %*% V   # X2 reconstructs X = U D V'
DA1 Review: Correlation

library(RODBC)
library(tidyverse)

# NOTE: SERVER and DATABASE CHANGE!!!!!!!!!!!
myServer <- "tcp:analyticslab.database.windows.net,1433"
myUser <- "Student"
myPassword <- "Acct7397"
myDatabase <- "Accounting"
myDriver <- "ODBC Driver 13 for SQL Server" # must correspond to an entry in the Drivers tab of "ODBC Data Sources"

connectionString <- str_c(
  "Driver=", myDriver,
  ";Server=", myServer,
  ";Database=", myDatabase,
  ";Uid=", myUser,
  ";Pwd=", myPassword)

sq <- function (myQuery){
  conn <- odbcDriverConnect(connectionString)
  tQuery <- (sqlQuery(conn, myQuery))
  close(conn)
  return (tQuery)
}

myQuery <- "
SELECT
  [Obs]
  ,[TV]
  ,[Radio]
  ,[Newspaper]
  ,[Sales]
FROM [dbo].[Advertising]
"

Advertising <- sq(myQuery)
Ad <- dplyr::select(Advertising, Sales, TV, Radio, Newspaper)
cor(Ad, method = 'pearson', use = 'pairwise')
Normal equations

Solving the derivatives directly

Setting the derivatives to zero gives:
0 = 8β₁ + 20β₂ - 56
0 = 20β₁ + 60β₂ - 154

# i.e., solve the linear system X b = B
X <- matrix( c(8, 20, 20, 60), nrow=2, ncol = 2)
B <- matrix( c(56, 154), nrow=2, ncol = 1)
solve(X, B)
# [1,] 3.5
# [2,] 1.4

Solving the equations using matrix algebra

β̂ = (XᵀX)⁻¹ (Xᵀy)

X <- cbind(1, mydata$X)
y <- mydata$Y
# we can solve this from the raw data by using a transpose
betaHat <- solve(t(X) %*% X) %*% t(X) %*% y
print(betaHat)
# [1,] 3.5
# [2,] 1.4
normal equations

Solving using singular value decomposition

# now solving using SVD
x <- t(X) %*% X
duv <- svd(x)
x.inv <- duv$v %*% diag(1 / duv$d) %*% t(duv$u)
x.pseudo.inv <- x.inv %*% t(X)
w <- x.pseudo.inv %*% y
w
# [1,] 3.5
# [2,] 1.4

# note: we can also use SVD for dimension reduction (like PCA)
# it's also used in advanced numerical solutions (we won't be doing that here)
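On the dimension-reduction note above, a small sketch (my example, using the built-in mtcars data): the right singular vectors of a centered data matrix match the principal component loadings from prcomp().

Xc <- scale(as.matrix(mtcars[, c("mpg", "hp", "wt")]), center = TRUE, scale = FALSE)
svd(Xc)$v                            # right singular vectors
prcomp(Xc, center = FALSE)$rotation  # PCA loadings: identical up to column signs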
fun with vectors
# exercise for fun

library(ggplot2)
library(grid)   # for arrow() and unit()

# get the vector magnitude (Euclidean norm)
norm_vec <- function(x) sqrt(sum(x^2))

V1 <- data.frame(x=c(0, 3), y=c(0, 4))
mV1 <- norm_vec(V1)

p <- ggplot(V1, aes(x=x, y=y)) + geom_point(color="black")
p <- p + geom_segment(aes(x = V1[1,1], y = V1[1,2], xend = V1[2,1], yend = V1[2,2]), arrow = arrow(length = unit(0.5, "cm")))
p <- p + xlim(0, 20) + ylim(0, 20)
p

# create and draw the eigenvector (transpose first)
tV1 <- t(V1)
eV1 <- cbind(eigen(tV1)$values, eigen(tV1)$vectors)
ev2 <- eigen(V1)$values

V2 <- data.frame(X=c(0, (eV1[1,1]*eV1[1,2])), Y=c(0, (eV1[1,1]*eV1[2,2])))
p <- p + geom_segment(aes(x = V2[1,1], y = V2[1,2], xend = V2[2,1], yend = V2[2,2]), col="blue", linetype="dashed", arrow = arrow(length = unit(0.5, "cm")))
p

# create and draw the dot product vector
V3 <- as.matrix(V1)
dpV3 <- V3 %*% V3
p <- p + geom_segment(aes(x = dpV3[1,1], y = dpV3[1,2], xend = dpV3[2,1], yend = dpV3[2,2]), col="red", linetype="dashed", arrow = arrow(length = unit(0.5, "cm")))
p

# calculate the direction vector and show that the norm = 1
cosX <- V1[2,1]/mV1
cosY <- V1[2,2]/mV1
V4 <- data.frame(X=c(0, cosX), Y=c(0, cosY))
p <- p + geom_segment(aes(x = V4[1,1], y = V4[1,2], xend = V4[2,1], yend = V4[2,2]), col="red", linetype="dashed", arrow = arrow(length = unit(0.5, "cm")))
p
norm_vec(V4)
mV4 <- as.matrix(V4)
norm(mV4, type = '2')

# draw a right triangle (just for visual reference)
p <- p + geom_segment(aes(x = V1[2,1], y = V1[1,1], xend = V1[2,1], yend = V1[2,2]), col="blue", linetype="dashed")
p
fun with vectors
[Timeline: Euclid, ~300 B.C., Alexandria, Ptolemaic Egypt; Non-Euclidean Geometry, ~1800, Gauss; extended (kernel functions), ~1900, Hilbert; Einstein]

Many of the underlying principles that form the basis of statistical learning theory are still based on Euclid's axioms; we just extend them into infinite dimensions using the work of Gauss and Hilbert (and many, many others).
