We implement our operator in the Entity Framework Core

Once on a cloudy Saturday morning in March, I decided to see how Microsoft was doing in the good work of transforming the Entity Framework mastodon into the Entity Framework Core. Exactly a year ago, when our team started a new project and selected ORM, our hands itched to use everything as stylish and youthful as possible. However, looking at the EFC, we realized that it is still very far away production. There are a lot of problems with N + 1 queries (greatly improved in the 2nd version), crooked nested selections (fixed in 2.1.0-preview1 ), there is no Many-to-Many support (still not) and a cherry on the cake - lack of DbGeometry support, which was very critical in our project. It is noteworthy that the last feature is in the road map the project since 2015 in the list of high priority. Our team even has a joke on this topic: "This task will be added to the list of high priority." And now, one year has passed since the last revision of the EFC, the second version of this product was already released and I decided to check how things were going.


In my opinion, one of the best ways to test a product is to try to expand it with some custom feature. This immediately sheds light on: a) the quality of architecture; b) the quality of the documentation; c) community support.


A quick look at the first page of Google’s results showed that full-text search in EFC is not yet supported, but there are plans. Well, that’s what we need, we can try to implement the predicate CONTAINS from T-SQL ourselves.


We come up with an API


I did not bother with complicated methods and simply declared an extension method for strings:


public static class StringExt
{
    public static bool ContainsText(this string text, string sub)
    {
        throw new NotImplementedException("This method is not supposed to run on client");
    }
}

In the body of the method, we simply throw an exception, because it is just a marker, not intended to be launched on the client. In custom code, it should look something like this:


dbContext.Posts.Where(x => x.Content.ContainsText("egg"));

it remains to figure out how to implement this.


Search for extension points


Things are more complicated with this. Google at the request of "ef core create custom operator" provides only a link to the topic from the project’s github, ending with a message like "hey, any updates on that?". It is also suggested that you run the SQL query by hand, which would certainly work, but this is not our option.


The best way to do something new is to do it by analogy. What is the closest operator that we want to implement? Correctly LIKE . The operator is LIKE translated from the method String.Contains . All we need to do is peek at how this is done by the EFC developers.


Download the repository, open it in Visual Studio 2017 and ... Visual Studio goes into a dead corkscrew. Well, ok, fat IDEs for amateurs, we take Visual Studio Code, everything flies there. Moreover, Code Lens works out of the box, just amazing.


We find files containing Contains in the name, SqlServerContainsOptimizedTranslator.cs - our candidate. I wonder what is so optimized about it? It turns out that EFC, in contrast to EF, uses CHARINDEX > 0 instead LIKE '%pattern%' .


Strong statement

image


This SO post casts doubt on the decision of the EFC team.


Code Lens tells us that it is SqlServerContainsOptimizedTranslator used only in one place - SqlServerCompositeMethodCallTranslator.cs . Bingo! This class inherits from RelationalCompositeMethodCallTranslator and, judging by the name, translates the call of .NET methods to an SQL query, which is what we need! You just need to expand this class and add one more of our custom translators to its list.


Writing your translator


The translator must implement the interface IMethodCallTranslator . The contract that it must execute in the method Expression Translate(MethodCallExpression methodCallExpression) is quite simple: if the input expression is not known, return null, in the other case, convert the expression to Sql.
This is what the class looks like:


public class FreeTextTranslator : IMethodCallTranslator
{
    private static readonly MethodInfo _methodInfo
        = typeof(StringExt).GetRuntimeMethod(nameof(StringExt.ContainsText), new[] {typeof(string), typeof(string)});

    public Expression Translate(MethodCallExpression methodCallExpression)
    {
        if (methodCallExpression.Method != _methodInfo) return null;

        var patternExpression = methodCallExpression.Arguments[1];
        var objectExpression = (ColumnExpression) methodCallExpression.Arguments[0];

        var sqlExpression =
            new SqlFunctionExpression("CONTAINS", typeof(bool),
                new[] { objectExpression, patternExpression });
        return sqlExpression;
    }
}

It remains only to connect it using CustomSqlMethodCallTranslator:


public class CustomSqlMethodCallTranslator : SqlServerCompositeMethodCallTranslator
{
    public CustomSqlMethodCallTranslator(RelationalCompositeMethodCallTranslatorDependencies dependencies) : base(dependencies)
    {
        // ReSharper disable once VirtualMemberCallInConstructor
        AddTranslators(new [] {new FreeTextTranslator() });
    }
}

DI in EFC


EFC uses the DI pattern to the fullest, I would even say too much. The influence of the Kestrel team (or vice versa) is felt. If you are already working with ASP.NET Core, then you will not have problems understanding the implementation and resolution of curls in EFC. The extension method UseSqlServer installs a couple of dozen dependencies necessary for the library to work. Sources can be found here . There is ours ICompositeMethodCallTranslator , which we will rewrite using the helper ReplaceService


optionsBuilder.ReplaceService<ICompositeMethodCallTranslator, CustomSqlMethodCallTranslator>();

Install and run.


var textContains = dbContext.Posts.Where(x => x.Content.ContainsText("egg")).ToArray();

SQL Generation Issues


After launch, we find 2 news: good and not very. The good thing is that our custom translator was successfully picked up by EFC. Bad - the request was incorrect.


SELECT [x].[Id], [x].[AuthorId], [x].[BlogId], [x].[Content], [x].[Created], [x].[Rating], [x].[Title]
      FROM [Posts] AS [x]
      WHERE CONTAINS([x].[Content], N'egg') = 1

Obviously, the final SQL generator that converts the intermediate tree of expressions into a ready-made query expects some value from the SQL function. But CONTAINS is a predicate that returns bool, which the SQL generator does not pay attention to. After googling, many unsuccessful attempts to create a crutch, I gave up. I even tried using SqlFragmentExpression one that inserts the SQL string into the final query as is. The generator added inappropriately = 1 . Before going to sleep, I left a bug report on the project # 11316 github . And, lo and behold, they pointed out to me the problem and the demand for its solution within 24 hours.


Problem and Solution


My guess is that the SQL generator wants the return value to be true. To solve this problem, it was necessary to replace VisitBinary with VisitUnary in SqlVisitor'e, because CONTAINS is a unary operator. Here there have realized the idea. We act by analogy, create our custom generator, connect it in the container and start again.


public class FreeTextSqlGenerator : DefaultQuerySqlGenerator
{
    internal FreeTextSqlGenerator(QuerySqlGeneratorDependencies dependencies, SelectExpression selectExpression) : base(dependencies, selectExpression)
    {
    }

    protected override Expression VisitBinary(BinaryExpression binaryExpression)
    {
        if (binaryExpression.Left is SqlFunctionExpression sqlFunctionExpression
            && sqlFunctionExpression.FunctionName == "CONTAINS")
        {
            Visit(binaryExpression.Left);

            return binaryExpression;
        }

        return base.VisitBinary(binaryExpression);
    }
}

Everything worked, the correct SQL is generated. The method ContainsText can participate in various expressions, in general, is a full member of the EFC.


conclusions


Architecturally, the EFC has gone far ahead of the classic EF. Expanding it is not a problem, but be prepared to look for solutions in the source. For me, this is one of the main ways to learn something new, although it takes a lot of time.


Project maintainers are ready to give a detailed answer to your question. I noticed that 4 days after I reported my bug, another ~ 20 issues were opened. Most of them were answered.


The finished code is here . To run it, you need the latest VS and docker on linux containers, or SQL Server with Full-Text Search. Unfortunately, localdb comes without linguistic services and it is not possible to connect them. I used the docker file from the Internet. The assembly and launch of the docker image is in the database-create.ps1 file.


Also do not forget to start migrations using cmdlet update-database .