
Announcing support for complex-domain linear programs in Convex.jl

Introduction

In recent years, programmatic advertising has been taking over the online advertising industry. Many companies, referred to as DSPs (Demand-Side Platforms), compete for the same ad slot on the internet. A DSP's success in delivering value to advertisers is evaluated on the following two criteria:

  1. High Click-Through Rate (CTR) = number of clicks / number of ads shown
  2. High Conversion Rate = number of conversions (such as purchases) / number of ads shown

To achieve high CTRs and conversion rates, DSPs rely heavily on Artificial Intelligence techniques and develop their own in-house Machine Learning algorithms. The problem of applying Machine Learning in adtech differs from the standard setting in many ways. Google's paper Ad Click Prediction: a View from the Trenches and Facebook's paper Practical Lessons from Predicting Clicks on Ads at Facebook discuss in detail the lessons learned while building an AI for the adtech industry. In the rest of this blog post, I will try to summarize the intricacies of applying ML in adtech and how they are tackled in general:

  • Large size of the training vector: Every feature in the ML model is categorical, and encoding categorical features numerically explodes the size of the training vector to the order of billions. For example, one of the most important features is the publisher website where the ad would be displayed; there are millions of publishers, so one-hot encoding this single feature already yields a training vector with millions of entries.

  • Skewness of the training data: Typically, CTRs are much lower than 50% (generally around 1-2%), which means that positive examples (clicks) are relatively rare; this introduces skewness into the training data.

  • Rapid changes in the online ad landscape: The adtech domain is a very dynamic environment where the data distribution changes over time. Facebook conducted an experiment in which they trained a model on one day of data and evaluated it on each of the six subsequent days. The results showed that the performance of the model decreases as the delay between the training and test sets increases. Thus, it is very important to update the model every few hours to keep it current.

  • Per-coordinate learning rate: In most standard ML problems, the learning rate is a constant. In adtech, there is a huge imbalance in the number of training instances per feature. For example, a famous publisher such as cnn.com has more users, and thus more ad slots, than an obscure one, so our training data contains a huge number of training instances for cnn.com. Therefore, we want to decrease the learning rate for a coordinate as its frequency in the training data increases (see the first sketch after this list).

  • Progressive validation rather than cross-validation: Validating the model on a data set that lags the training set by hours or days is not a good idea because, as discussed above, the nature of the data changes with time. The online log loss is instead a good proxy for model performance because it measures performance only on the most recent data, just before we train on it; this is exactly analogous to what happens when the model is in production. It also ensures that we can use 100% of our data for both training and testing (see the second sketch after this list).

  • Relative changes in metrics rather than absolute metrics: Click rates vary from country to country and from ad slot to ad slot, and therefore the metrics change over the course of a single day. Google's experiments indicate that relative changes (compared to a baseline model) are much more stable over time, so a relative change in log loss is a better metric than the average log loss. We also take care to compare only metrics computed from exactly the same data.

  • Segmented performance metrics instead of aggregated metrics: One thing we have to take care of while analyzing model performance in adtech is that aggregated performance metrics may hide effects that are specific to certain sub-populations of the data. For example, a high CTR may in fact be caused by a mix of low and high CTRs from different ad exchanges. This makes it critical to visualize performance metrics not only on the aggregate data but also on various slices of the data, such as per ad exchange, per ad group, per device type, and per publisher.
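
To make the per-coordinate learning rate concrete, here is a minimal sketch in Julia. It assumes a generic online gradient learner; alpha and beta are hypothetical tuning constants, and the rule follows the common pattern of scaling the rate by each coordinate's accumulated squared gradients (not the exact scheme from either paper):

    # Sketch: per-coordinate learning rate for an online gradient update.
    # gsq accumulates squared gradients per coordinate, so frequently
    # observed features (e.g. cnn.com) receive smaller and smaller steps.
    function percoordinate_update!(w, g, gsq; alpha=0.1, beta=1.0)
        for i in eachindex(w)
            gsq[i] += g[i]^2                     # per-coordinate gradient history
            eta = alpha / (beta + sqrt(gsq[i]))  # rate decays with frequency
            w[i] -= eta * g[i]
        end
        return w
    end

Progressive validation can be sketched in the same spirit; predict and update! below are hypothetical stand-ins for whatever online learner is in use:

    # Sketch: progressive (online) validation. Each example is scored
    # before the model trains on it, so 100% of the data serves as both
    # test and training data.
    function progressive_logloss(model, stream)
        total, n = 0.0, 0
        for (x, y) in stream                        # y = 1 for a click, else 0
            p = predict(model, x)                   # evaluate first
            total -= y*log(p) + (1 - y)*log(1 - p)  # online log loss
            n += 1
            update!(model, x, y)                    # then train
        end
        return total / n
    end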

This post is an aggregation of the lessons I learnt while working as a Data Scientist at an adtech company. Do let me know if you have any additional comments.

Do you have any questions?

Ask your questions in the comments below and I will do my best to answer.

A Month of Excitement!

Every day is a new experience, and it opens up new opportunities. Life is awesome.

That is how I would describe my first month as a GSoCer. From meeting like-minded people from all across the world, to having intense discussions about our projects in different groups, to getting excited about receiving Google goodies, and now finally getting invited to JuliaCon'16 at MIT in Cambridge, Massachusetts: every experience has been worth the work put into getting selected for GSoC'16.

I got into coding my project and community bonding as soon as my exams ended on 29th April. Before I get into the details of my project, I would like to take this opportunity to explain why my project matters to the optimization community in particular. The reason is as follows:

Many problems in applied sciences are posed as optimization problems over the complex field, such as:

  • Phase retrieval from sparse signals
  • Designing an FIR filter given a desired frequency response
  • Optimization problems in AC power systems
  • Frequency-domain analysis in signal processing and control theory

The present approach is to manually convert a complex-domain problem to a real-domain problem (example) and pass it to the solvers. This process can be time-consuming and unintuitive. The better approach would be to make existing packages handle complex-domain optimization directly, making it easier for the optimization community to work with such problems.
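
To make the conversion concrete, here is the standard identity behind it (independent of any particular package): writing a complex variable as z = x + iy and the problem data as A = A_r + i*A_i and b = b_r + i*b_i, the complex linear constraint A*z = b is equivalent to the pair of real constraints A_r*x - A_i*y = b_r and A_i*x + A_r*y = b_i. Every complex-domain linear program can thus be solved as a real one, at the cost of doubling the variables and constraints and of the bookkeeping needed to translate back and forth.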

In the first hangout call with my mentors, we decided to implement support for linear complex-domain optimization problems in Convex.jl. To support this functionality, I needed to make the following changes:

  1. Add support for complex variables, Hermitian semidefinite matrices, and complex constants. This was done by introducing a new sign, ComplexSign, as a subtype of the abstract type Sign in the dcp.jl file. I also defined new rules for how ComplexSign interacts with the other signs, monotonicities, and vexities of mathematical expressions:

    # The sign hierarchy in dcp.jl, with the new ComplexSign subtype.
    abstract Sign
    type Positive <: Sign                   end
    type Negative <: Sign                   end
    type NoSign <: Sign                     end
    type ComplexSign <: Sign                end
    
    # Negating or adding complex expressions stays complex.
    -(s::ComplexSign) = ComplexSign()
    +(s::ComplexSign, t::ComplexSign) = s
    +(s::Sign, t::ComplexSign) = t
    +(t::ComplexSign, s::Sign) = s+t
    # Multiplying by a complex expression yields a complex expression.
    *(t::ComplexSign, s::ComplexSign) = t
    *(t::ComplexSign, s::Sign) = t
    *(s::Sign, t::ComplexSign) = t
    # A complex sign destroys monotonicity, and only constant or affine
    # expressions remain DCP-compliant when scaled by a complex sign.
    *(s::ComplexSign, m::Monotonicity) = NoMonotonicity()
    *(m::Monotonicity, s::Sign) = s * m
    *(s::ComplexSign, v::ConstVexity) = v
    *(s::ComplexSign, v::AffineVexity) = v
    *(s::ComplexSign, v::ConvexVexity) = NotDcp()
    *(s::ComplexSign, v::ConcaveVexity) = NotDcp()
    *(s::ComplexSign, v::NotDcp) = v
    
  2. With the changes from step 1 in place, I modified the constant.jl and variable.jl files accordingly:

    # Constructors for complex variables of various shapes, plus Hermitian
    # semidefinite matrix variables (which must be square).
    ComplexVariable(m::Int, n::Int, sets::Symbol...) = Variable((m,n), ComplexSign(), sets...)
    ComplexVariable(sets::Symbol...) = Variable((1, 1), ComplexSign(), sets...)
    ComplexVariable(size::Tuple{Int, Int}, sets::Symbol...) = Variable(size, ComplexSign(), sets...)
    ComplexVariable(size::Int, sets::Symbol...) = Variable((size, 1), ComplexSign(), sets...)
    HermitianSemidefinite(m::Integer) = ComplexVariable((m,m), :Semidefinite)
    function HermitianSemidefinite(m::Integer, n::Integer)
      if m==n
        return ComplexVariable((m,m), :Semidefinite)
      else
        error("HermitianSemidefinite matrices must be square")
      end
    end
    

    Now the user can create complex variables and Hermitian semidefinite matrix variables in Convex.jl as follows:

    julia> y = ComplexVariable()
    Variable of
    size: (1, 1)
    sign: Convex.ComplexSign()
    vexity: Convex.AffineVexity()
    
    julia> z = HermitianSemidefinite(4)
    Variable of
    size: (4, 4)
    sign: Convex.ComplexSign()
    vexity: Convex.AffineVexity()
    
  3. The third step was to redefine the affine atoms in Convex.jl to accommodate complex variables as well. Interestingly, by virtue of the rules defined in step 1 in dcp.jl, we didn't need to redefine any affine atom: Convex.jl is smart enough to pick the sign of an expression using those rules (see the session sketched below).
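
    For instance, here is a hypothetical REPL session (on a branch with the changes above; the exact expression syntax is my assumption) showing the sign of an affine expression being inferred purely from the sign algebra of step 1:

    julia> z = ComplexVariable(2)
    julia> e = 2 * z + (1 + 2im)   # affine expression with a complex constant
    julia> sign(e)                 # sign inferred from the dcp.jl rules
    Convex.ComplexSign()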

So this is a summary of what I have been doing on the coding side of GSoC. On the personal side, I moved to Bangalore (India's Silicon Valley), where I will be spending the next two months, and I am very excited to meet the other JuliaSoCers at the Julia Computing Bangalore office in the days to come.

The coming week is crucial for us, as we have decided to fully implement complex-domain Linear Programming in Convex.jl. Last night we had our third hangout meeting, where we discussed the nitty-gritty of Convex.jl, especially the conic-form functionality, and how to proceed so as to meet the deadline we have set for ourselves. I look forward to another week of coding, making new friends, and getting involved in the open-source community.

The bragging rights: Convex.jl would be the second optimization package, after Matlab's CVX, to support complex-domain optimization problems, so yes, it excites me to be part of the team developing such a useful package.

Also, I am very excited to attend JuliaCon'16 at the Massachusetts Institute of Technology and give a talk on what I have been doing this summer. I hope I get my visa on time; wish me luck, everyone :)

See you next time. Till then, keep laughing, spread happiness, and, most importantly, keep living this journey called life!

Exciting Summers Ahead!

Fall seven times, stand up eight!

Google Summer of Code is something I had wanted to be a part of ever since I heard of it in my first year of college. It took four years, but I have finally been selected for this prestigious open-source program. Harsh Gupta, no words can express my gratitude to you; thank you very much for always being there. Coincidentally, the results were announced in the middle of my end-semester examinations, and I must admit my selection was one more reason for not studying for the exams. I am glad that my exams are over, and as always, I am looking forward to a summer where I apply the theory I learnt during the semester to the practical world.

I should make it apparent that applying to Julia and getting selected was a well-thought-out process; a series of events helped me get selected for GSoC under The Julia Language. It all started when I was finishing a web-development internship during the winter vacations and realized that I needed to pursue applied mathematics more seriously, as that is where I can excel. Operations Research and Optimization were an obvious choice, since I enjoyed the introductory course in Linear Programming far more than the other CS and mathematics courses I took during my stay at IIT Kharagpur. I decided to take an advanced course in Optimization (Non-Linear Programming); sadly, the course was open only to final-year students, but where there is a will, there is a way, and I was able to enroll after consulting with the professor. Interestingly, I was the only pre-final-year student in that class, yet I tried to attend every class in spite of not having familiar faces to sit with.

During the semester, the 4th Inter-IIT Meet was held at IIT Mandi. Initially, I was not part of IIT Kharagpur's contingent, as I didn't find myself competent enough to apply for the selections. But as soon as the problem statement for the Portfolio Optimization event was out, my friend Raja Ambrish contacted me, because the problem was related to Convex Optimization and he thought I could help. My eyes lit up as soon as I saw the problem statement: it was an application of what I had been reading throughout the semester. I worked with the other team members and was finally selected to present our solution to the jury at IIT Mandi. Yay! IIT Kharagpur were the overall champions, and our team also managed to win the gold medal in Portfolio Optimization.

A click during the Inter-IIT presentation

The Inter-IIT experience was awesome, and it made me realize that I indeed have a special interest in the field of Optimization. I wanted to explore the field further, and Google Summer of Code was an exciting opportunity. I told my friend Harsh about my plans to participate in GSoC with an organization that has optimization-related projects, and his first suggestion was Julia. I came back to my room, browsed the projects, and found that Julia had a project on Convex Optimization, "Support for Complex SemiDefinite Programming within Convex.jl", that I fell in love with. The concepts I learnt in the Non-Linear Programming classroom were of immense help in getting started with Convex.jl; special thanks to Prof. Geetanjali Panda for teaching the course in a very intuitive way.

Finally, after extensive discussions with my mentors Madeleine Udell and Dvijotham Krishnamurthy, I submitted my proposal under both The Julia Language and NumFOCUS and was selected under the former. I want to express my gratitude to everyone who helped me before and throughout the GSoC application period.

I am officially in the first week of the Community Bonding Period. My week started with a video call with my mentors, where we discussed my proposal in more detail and went through some of the existing codebase to understand what specific changes need to be made to accomplish our task. We have started by first tackling complex-domain Linear Programming problems. We decided that we would need a few utility functions, such as:

  • A function that recurses on the expression tree to check whether an expression is real or complex (sketched below)
  • A function conj(A)
  • A function to check that the LHS and RHS of each inequality constraint are real
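
As a first cut, the recursion in the first bullet might look something like the sketch below; the name iscomplex and the use of the children field are my assumptions, not settled Convex.jl API:

    # Sketch: an expression is complex if its own sign is ComplexSign,
    # its value is non-real, or any sub-expression underneath it is complex.
    iscomplex(x::Variable) = isa(x.sign, ComplexSign)
    iscomplex(x::Constant) = !isreal(x.value)
    iscomplex(x::AbstractExpr) = any(iscomplex, x.children)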

I will also need to think through the rules for determining whether a combination of two or more expressions is real or complex, along the lines of how we determine the DCP compliance of a problem. The next step would be to rewrite all the atoms to accept complex parameters.

The next assignment was to set up a blog to write about and publish my experience as a GSoCer, or rather a JuliaSoCer, which I just completed :D During this week, I was also introduced to the Julia community and added to their Slack channel. I believe I will have an awesome time with the community and hope to see them at JuliaCon 2016.

Interestingly, my birthday is on 4th May, which I plan to celebrate with my family, so I will be traveling home from my university. Don't forget to wish me ;)

This is all for now. Thank you very much for making it this far :)