December » 2019 » JasonLe's TechBlog

Archive for December, 2019

Regex Summary

December 20th, 2019

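If the string you want to match contains special characters, escape them with \. For example, to match a&b you can write the regex a\&b. (Strictly speaking, & is not a metacharacter outside a character class, so this escape is harmless but not required; characters such as . * + ? are the ones that really need it.) Because \ is itself a special character inside a Java string literal, it has to be escaped too, so the regex a\&b is written as the Java string a\\&b, which matches a&b.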

System.out.println("a&b".matches("a\\&b")); // prints true
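
When the text to match comes from somewhere else (user input, for example), escaping by hand is easy to get wrong. The following is a minimal sketch, not part of the original post, that uses the standard Pattern.quote to treat a whole string as a literal. The class name EscapeDemo, the helper matchesLiteral, and the sample strings are illustrative assumptions.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // Returns true when the whole input equals the literal text; every regex
    // metacharacter inside the literal is treated as an ordinary character.
    static boolean matchesLiteral(String input, String literal) {
        return input.matches(Pattern.quote(literal));
    }

    public static void main(String[] args) {
        System.out.println("axb".matches("a.b"));          // true: an unescaped . matches any character
        System.out.println("axb".matches("a\\.b"));        // false: \. matches only a literal dot
        System.out.println(matchesLiteral("a.b", "a.b"));  // true
        System.out.println(matchesLiteral("axb", "a.b"));  // false
    }
}
```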

\d\d matches two digits and \d\d\d matches three; write \d once for each digit you need to match.

System.out.println("1".matches("\\d\\d")); // prints false
System.out.println("11".matches("\\d\\d")); // prints true
System.out.println("111".matches("\\d\\d")); // prints false

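Put curly braces {} after \d: {n} means match exactly n times, so \d{10000} matches 10000 digits. To match between n and m times, use {n,m}; to match at least n times, use {n,}. Note that there must be no space after the comma.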

System.out.println("1".matches("\\d{1,2}")); // prints true
System.out.println("12".matches("\\d{1,2}")); // prints true
System.out.println("123".matches("\\d{1,2}")); // prints false
System.out.println("123".matches("\\d{2,}")); // prints true

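Besides \d, the basic regex rules include \w and \s. w is short for word and matches one word character: a letter, a digit, or an underscore. s is short for space and matches one whitespace character, such as a space, a Tab, or a newline.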

System.out.println("LeetCode_666".matches("\\w{12}")); // prints true
System.out.println("\t \n".matches("\\s{3}")); // prints true
System.out.println("Leet\tCode 666".matches("\\w{4}\\s\\w{4}\\s\\d{3}")); // prints true

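Uppercasing the letter gives the opposite meaning. \d matches a digit, while \D matches a non-digit. Likewise, \W matches any character that \w does not, and \S matches any character that \s does not.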

System.out.println("a".matches("\\d")); // prints false
System.out.println("1".matches("\\d")); // prints true
System.out.println("a".matches("\\D")); // prints true
System.out.println("1".matches("\\D")); // prints false

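Sometimes we have no requirement on the character at a given position and only need something to occupy it; for that, use the . character, which matches any single character (other than a line terminator, by default). Sometimes we have no requirement on the number of repetitions; for that, use the * character. * matches any number of times, including 0 times. In other words, * is equivalent to {0,}.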

System.out.println("1".matches("\\d*")); // prints true
System.out.println("123".matches("\\d*")); // prints true
System.out.println("".matches("\\d*")); // prints true

Use + when the element must occur at least once. It is equivalent to {1,}.

System.out.println("1".matches("\\d+")); // prints true
System.out.println("123".matches("\\d+")); // prints true
System.out.println("".matches("\\d+")); // prints false

If a character should match either 0 times or 1 time, use ?. It is equivalent to {0,1}.
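
The ? quantifier is the only one in this list without an example, so here is a small sketch. The scenario (an optional leading minus sign before some digits) and the names OptionalQuantifierDemo and isInteger are assumptions made for illustration.

```java
class OptionalQuantifierDemo {
    // An optional minus sign (-?) followed by one or more digits (\d+).
    static boolean isInteger(String s) {
        return s.matches("-?\\d+");
    }

    public static void main(String[] args) {
        System.out.println(isInteger("12"));   // true: the minus sign occurs 0 times
        System.out.println(isInteger("-12"));  // true: the minus sign occurs 1 time
        System.out.println(isInteger("--12")); // false: ? allows at most 1 occurrence
        System.out.println(isInteger("-"));    // false: \d+ still needs at least one digit
    }
}
```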

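Suppose phone numbers must not start with 0. The rule is then [123456789]\d{10}: square brackets match any one of the listed characters. A range can be abbreviated with -, so [1-9] means the same as [123456789], and several ranges can share one pair of brackets, as in [1-9a-gU-Z].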

System.out.println("1".matches("[1-9a-gU-Z]")); // prints true
System.out.println("b".matches("[1-9a-gU-Z]")); // prints true
System.out.println("X".matches("[1-9a-gU-Z]")); // prints true
System.out.println("A".matches("[1-9a-gU-Z]")); // prints false
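
The phone number rule itself can be tried the same way. A minimal sketch, assuming the rule as stated (11 digits, the first one not 0); the class name PhoneNumberDemo and the sample numbers are made up.

```java
class PhoneNumberDemo {
    // One digit from 1-9 followed by exactly ten more digits.
    static boolean isValidPhone(String s) {
        return s.matches("[1-9]\\d{10}");
    }

    public static void main(String[] args) {
        System.out.println(isValidPhone("13812345678")); // true
        System.out.println(isValidPhone("03812345678")); // false: starts with 0
        System.out.println(isValidPhone("1381234567"));  // false: only 10 digits
    }
}
```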

Consider a practical requirement: you have a large number of strings in the following format, and you need a regular expression to pull out the name and the age.
Name:Aurora   Age:18
some irrelevant data is mixed in as well
Name:Bob      Age:20
bad records come in all sorts of wrong formats
Name:Cassin   Age:22

Looking at the pattern in these strings, Name:\w+\s*Age:\d{1,3} is enough to match them.

System.out.println("Name:Aurora   Age:18".matches("Name:\\w+\\s*Age:\\d{1,3}")); // prints true
System.out.println("some irrelevant data is mixed in as well".matches("Name:\\w+\\s*Age:\\d{1,3}")); // prints false
System.out.println("Name:Bob      Age:20".matches("Name:\\w+\\s*Age:\\d{1,3}")); // prints true
System.out.println("bad records come in all sorts of wrong formats".matches("Name:\\w+\\s*Age:\\d{1,3}")); // prints false
System.out.println("Name:Cassin   Age:22".matches("Name:\\w+\\s*Age:\\d{1,3}")); // prints true

Pattern pattern = Pattern.compile("Name:(\\w+)\\s*Age:(\\d{1,3})");
Matcher matcher = pattern.matcher("Name:Aurora   Age:18");
if(matcher.matches()) {
    String group1 = matcher.group(1);
    String group2 = matcher.group(2);
    System.out.println(group1);   // prints Aurora
    System.out.println(group2);   // prints 18
}

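Wrap each part whose value you want in (), compile the regex into a Pattern, and read the values from the Matcher that the Pattern returns for the input. Each captured value is stored, in order, in the Matcher's groups. Here \\w+ and \\d{1,3} are each wrapped in (). Matcher.matches() tells you whether the pattern matches the whole string: it returns true on success and false on failure.

Group numbering starts at 1 because group(0) holds the entire matched string.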
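
matches() only succeeds when the whole input matches. When the records are buried inside a larger text, as in the scenario above, Matcher.find() can walk through the matches one by one. The following is a sketch under that assumption; the class name ExtractDemo, the helper names, and the sample text are illustrative and not from the original post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExtractDemo {
    private static final Pattern RECORD =
            Pattern.compile("Name:(\\w+)\\s*Age:(\\d{1,3})");

    // Scans the whole text and returns "name=age" for every record found.
    static List<String> extract(String text) {
        List<String> result = new ArrayList<>();
        Matcher matcher = RECORD.matcher(text);
        while (matcher.find()) { // find() moves to the next match anywhere in the text
            result.add(matcher.group(1) + "=" + matcher.group(2));
        }
        return result;
    }

    // Returns group(0) of the first match, i.e. the entire matched substring.
    static String firstWholeMatch(String text) {
        Matcher matcher = RECORD.matcher(text);
        return matcher.find() ? matcher.group(0) : null;
    }

    public static void main(String[] args) {
        String text = "Name:Aurora   Age:18\n"
                + "some unrelated data\n"
                + "Name:Bob      Age:20\n";
        System.out.println(extract(text));                            // [Aurora=18, Bob=20]
        System.out.println(firstWholeMatch("xx Name:Bob Age:20 yy")); // Name:Bob Age:20
    }
}
```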

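Consider a practical scenario: you have an input box where users enter tags, and a user may enter several of them. However, you never told the users which separator to put between tags, so the input may look like any of the following (the sample tags are Chinese algorithm terms; only the separators matter):
二分,回溯,递归,分治
搜索;查找;旋转;遍历
数论 图论 逻辑 概率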

System.out.println(Arrays.toString("二分,回溯,递归,分治".split("[,;\\s]+")));
System.out.println(Arrays.toString("搜索;查找;旋转;遍历".split("[,;\\s]+")));
System.out.println(Arrays.toString("数论 图论 逻辑 概率".split("[,;\\s]+")));

System.out.println("二分,回溯,递归,分治".replaceAll("[,;\\s]+", ";"));
System.out.println("搜索;查找;旋转;遍历".replaceAll("[,;\\s]+", ";"));
System.out.println("数论 图论 逻辑 概率".replaceAll("[,;\\s]+", ";"));

In the second argument of replaceAll, $1, $2, … refer back to the captured substrings. Just wrap the part you want to reference in ().

System.out.println("二分,回溯,递归,分治".replaceAll("([,;\\s]+)", "---$1---"));
System.out.println("搜索;查找;旋转;遍历".replaceAll("([,;\\s]+)", "---$1---"));
System.out.println("数论 图论 逻辑 概率".replaceAll("([,;\\s]+)", "---$1---"));

Output:

二分---,---回溯---,---递归---,---分治
搜索---;---查找---;---旋转---;---遍历
数论--- ---图论--- ---逻辑--- ---概率

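Greedy matching works on the same principle as a greedy algorithm: a quantifier consumes as many characters as it can. Its counterpart is non-greedy matching, which consumes as few characters as possible while still allowing the target string to match.
To make a quantifier non-greedy, add a ? right after it.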

Pattern pattern = Pattern.compile("(\\w+?)(e*)");
Matcher matcher = pattern.matcher("LeetCode");
if (matcher.matches()) {
    String group1 = matcher.group(1);
    String group2 = matcher.group(2);
    System.out.println("group1 = " + group1 + ", length = " + group1.length()); // prints group1 = LeetCod, length = 7
    System.out.println("group2 = " + group2 + ", length = " + group2.length()); // prints group2 = e, length = 1
}
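
To see what the ? changes, the same input can be run through the greedy and the non-greedy pattern side by side. This is a small sketch; the class name GreedyDemo and the helper groups are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // Returns the two captured groups joined by "|", or null when the
    // regex does not match the whole input.
    static String groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) + "|" + m.group(2) : null;
    }

    public static void main(String[] args) {
        // Greedy: \w+ takes the whole word, leaving nothing for e*.
        System.out.println(groups("(\\w+)(e*)", "LeetCode"));  // LeetCode|
        // Non-greedy: \w+? takes as little as it can, so e* gets the final e.
        System.out.println(groups("(\\w+?)(e*)", "LeetCode")); // LeetCod|e
    }
}
```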

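str.replaceAll("[\\.。]+", "") removes every . and 。 (the ASCII and the CJK full stop).

str.replaceAll("[^0-9a-zA-Z]", "") removes every illegal character, that is, anything other than an upper- or lowercase letter or a digit.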
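
Both clean-up calls can be wrapped in small helpers and tried on sample input. A minimal sketch; the class name CleanupDemo, the helper names, and the sample strings are assumptions. Inside a character class the dot needs no escape, so [.。] behaves the same as [\\.。].

```java
class CleanupDemo {
    // Removes every run of ASCII or CJK full stops.
    static String stripFullStops(String s) {
        return s.replaceAll("[.。]+", "");
    }

    // Removes everything except ASCII letters and digits.
    static String keepAlphanumerics(String s) {
        return s.replaceAll("[^0-9a-zA-Z]", "");
    }

    public static void main(String[] args) {
        System.out.println(stripFullStops("a.b。。c..."));        // abc
        System.out.println(keepAlphanumerics("Leet-Code_666!")); // LeetCode666
    }
}
```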


Posted in Code杂谈

Tags: regex

A First Taste of Stream in Java 8

December 16th, 2019

Introduction to Stream

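Java 8 introduced a brand-new Stream API. A Stream here is not an I/O stream; it is closer to an Iterable collection class, although it behaves differently from one.
A stream enhances what you can do with a collection: it focuses on convenient, efficient aggregate operations and bulk data operations on collection objects.
You only state what should be done to the elements, such as "filter out strings longer than 10 characters" or "get the first letter of each string", and the Stream iterates internally and performs the corresponding transformations.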
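
The two operations quoted above can be written out as follows. This is a minimal sketch; the class name StreamIntroDemo, the helper names, and the sample words are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class StreamIntroDemo {
    // "Filter out strings longer than 10 characters": keep those of length <= 10.
    static List<String> dropLong(List<String> words) {
        return words.stream()
                .filter(w -> w.length() <= 10)
                .collect(Collectors.toList());
    }

    // "Get the first letter of each string" (assumes no string is empty).
    static List<Character> firstLetters(List<String> words) {
        return words.stream()
                .map(w -> w.charAt(0))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("stream", "collection", "intermediate", "lambda");
        System.out.println(dropLong(words));     // [stream, collection, lambda]
        System.out.println(firstLetters(words)); // [s, c, i, l]
    }
}
```

Neither method contains a loop; the iteration happens inside the stream.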

Why Use Stream

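The benefits of functional programming are especially clear here. Code written this way expresses the intent of the business logic rather than its mechanics. Readable code is also easier to maintain, more reliable, and less error-prone.
It also looks classy.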

Filter

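Use filter when you need to walk through the data and check each element.
filter takes a function as its argument (a predicate), usually written as a lambda expression.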

    /**
     * Keep only the males ("男" means "male").
     * Assumes: import static java.util.stream.Collectors.toList;
     */
    public static void filterSex(){
        List<PersonModel> data = Data.getData();

        //old
        List<PersonModel> temp=new ArrayList<>();
        for (PersonModel person:data) {
            if ("男".equals(person.getSex())){
                temp.add(person);
            }
        }
        System.out.println(temp);
        //new
        List<PersonModel> collect = data
                .stream()
                .filter(person -> "男".equals(person.getSex()))
                .collect(toList());
        System.out.println(collect);
    }

    /**
     * Keep only the males younger than 20.
     */
    public static void filterSexAndAge(){
        List<PersonModel> data = Data.getData();

        //old
        List<PersonModel> temp=new ArrayList<>();
        for (PersonModel person:data) {
            if ("男".equals(person.getSex())&&person.getAge()<20){
                temp.add(person);
            }
        }

        //new 1
        List<PersonModel> collect = data
                .stream()
                .filter(person -> {
                    if ("男".equals(person.getSex())&&person.getAge()<20){
                        return true;
                    }
                    return false;
                })
                .collect(toList());
        //new 2
        List<PersonModel> collect1 = data
                .stream()
                .filter(person -> ("男".equals(person.getSex())&&person.getAge()<20))
                .collect(toList());

    }
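
The snippets above rely on PersonModel and Data.getData(), which the post does not show. The following self-contained sketch uses a hypothetical stand-in class Person with made-up sample data, and English labels instead of "男", so that the filter can be run as-is.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FilterDemo {
    // Hypothetical stand-in for the post's PersonModel; only the fields used here.
    static final class Person {
        private final String name;
        private final String sex;
        private final int age;

        Person(String name, String sex, int age) {
            this.name = name;
            this.sex = sex;
            this.age = age;
        }

        String getName() { return name; }
        String getSex() { return sex; }
        int getAge() { return age; }
    }

    // Made-up sample data standing in for Data.getData().
    static List<Person> sample() {
        return Arrays.asList(
                new Person("Tom", "male", 18),
                new Person("Ann", "female", 19),
                new Person("Bob", "male", 25));
    }

    // Names of the people who are male and younger than 20.
    static List<String> youngMaleNames(List<Person> people) {
        return people.stream()
                .filter(p -> "male".equals(p.getSex()))
                .filter(p -> p.getAge() < 20)
                .map(Person::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(youngMaleNames(sample())); // [Tom]
    }
}
```

Chaining two filter calls is equivalent to joining the conditions with && in a single lambda; which one reads better is a matter of taste.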

Map

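map produces a one-to-one mapping, doing the job a for loop would do when transforming each element.
It is used a lot, and it is very simple.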

    /**
     * Collect every user's name.
     */
    public static void getUserNameList(){
        List<PersonModel> data = Data.getData();

        //old
        List<String> list=new ArrayList<>();
        for (PersonModel person:data) {
            list.add(person.getName());
        }
        System.out.println(list);

        //new 1
        List<String> collect = data.stream().map(person -> person.getName()).collect(toList());
        System.out.println(collect);

        //new 2
        List<String> collect1 = data.stream().map(PersonModel::getName).collect(toList());
        System.out.println(collect1);

        //new 3
        List<String> collect2 = data.stream().map(person -> {
            System.out.println(person.getName());
            return person.getName();
        }).collect(toList());
    }

FlatMap

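As the name suggests, flatMap is similar to map, but it works one level deeper.
There are still differences, starting with the return value of the mapping function.
With map, each input element is converted into exactly one output element according to the rule you supply.
Some scenarios are one-to-many mappings, and those call for flatMap.
map: one-to-one
flatMap: one-to-many
The method declarations of map and flatMap differ:
- <R> Stream<R> map(Function<? super T, ? extends R> mapper);
- <R> Stream<R> flatMap(Function<? super T, ? extends Stream<? extends R>> mapper);
The difference between map and flatMap, in my personal view: flatMap can handle more deeply nested data. Given several lists as input, it can return a single flattened list, whereas map is one-to-one, so several lists in means several lists out. Put simply, if the input elements are objects, flatMap can operate on the objects inside them, while map only works on the first level.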

public static void flatMapString() {
        List<PersonModel> data = Data.getData();
        // the result types differ
        List<String> collect = data.stream()
                .flatMap(person -> Arrays.stream(person.getName().split(" "))).collect(toList());

        List<Stream<String>> collect1 = data.stream()
                .map(person -> Arrays.stream(person.getName().split(" "))).collect(toList());

        // map to arrays first, then flatten
        List<String> collect2 = data.stream()
                .map(person -> person.getName().split(" "))
                .flatMap(Arrays::stream).collect(toList());
        // another way
        List<String> collect3 = data.stream()
                .map(person -> person.getName().split(" "))
                .flatMap(str -> Arrays.asList(str).stream()).collect(toList());
}
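
The one-to-one versus one-to-many distinction is easier to see with plain strings. A minimal sketch, independent of PersonModel; the class name FlatMapDemo, the helper names, and the sample names are assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FlatMapDemo {
    // flatMap: each full name yields several words, flattened into one list.
    static List<String> words(List<String> fullNames) {
        return fullNames.stream()
                .flatMap(name -> Arrays.stream(name.split(" ")))
                .collect(Collectors.toList());
    }

    // map: each full name yields exactly one value (its word count).
    static List<Integer> wordCounts(List<String> fullNames) {
        return fullNames.stream()
                .map(name -> name.split(" ").length)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Ada Lovelace", "Alan Mathison Turing");
        System.out.println(words(names));      // [Ada, Lovelace, Alan, Mathison, Turing]
        System.out.println(wordCounts(names)); // [2, 3]
    }
}
```

Two input elements give two outputs with map and five with flatMap.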

Collect

collect turns a stream into common data structures such as lists and maps:
toList()
toSet()
toMap()
custom collectors

/**
     * toList
     */
    public static void toListTest(){
        List<PersonModel> data = Data.getData();
        List<String> collect = data.stream()
                .map(PersonModel::getName)
                .collect(Collectors.toList());
    }

    /**
     * toSet
     */
    public static void toSetTest(){
        List<PersonModel> data = Data.getData();
        Set<String> collect = data.stream()
                .map(PersonModel::getName)
                .collect(Collectors.toSet());
    }

    /**
     * toMap
     */
    public static void toMapTest(){
        List<PersonModel> data = Data.getData();
        Map<String, Integer> collect = data.stream()
                .collect(
                        Collectors.toMap(PersonModel::getName, PersonModel::getAge)
                );

        data.stream()
                .collect(Collectors.toMap(per->per.getName(), value->{
            return value+"1";
        }));
    }

    /**
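     * Collect into a specific collection type.
     * Assumes PersonModel implements Comparable; otherwise TreeSet throws ClassCastException.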
     */
    public static void toTreeSetTest(){
        List<PersonModel> data = Data.getData();
        TreeSet<PersonModel> collect = data.stream()
                .collect(Collectors.toCollection(TreeSet::new));
        System.out.println(collect);
    }

    /**
     * Grouping
     */
    public static void toGroupTest(){
        List<PersonModel> data = Data.getData();
        Map<Boolean, List<PersonModel>> collect = data.stream()
                .collect(Collectors.groupingBy(per -> "男".equals(per.getSex())));
        System.out.println(collect);
    }

    /**
     * Joining with a delimiter, a prefix and a suffix
     */
    public static void toJoiningTest(){
        List<PersonModel> data = Data.getData();
        String collect = data.stream()
                .map(personModel -> personModel.getName())
                .collect(Collectors.joining(",", "{", "}"));
        System.out.println(collect);
    }

    /**
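     * Custom reduction.
     * Note: the lambda mutates the identity list, so this is only safe
     * for a sequential stream; it would break on a parallel one.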
     */
    public static void reduce(){
        List<String> collect = Stream.of("1", "2", "3").collect(
                Collectors.reducing(new ArrayList<String>(), x -> Arrays.asList(x), (y, z) -> {
                    y.addAll(z);
                    return y;
                }));
        System.out.println(collect);
    }
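
A few related collectors are worth knowing alongside the ones above. The sketch below is not from the original post: it shows partitioningBy for a boolean split, toMap with a merge function so that duplicate keys do not throw IllegalStateException, and the three-argument collect as a way to build a list by hand without mutating a shared identity value. The class name CollectDemo, the helper names, and the data are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CollectDemo {
    // Splits numbers into two lists keyed by true (even) and false (odd).
    static Map<Boolean, List<Integer>> partitionEven(List<Integer> numbers) {
        return numbers.stream()
                .collect(Collectors.partitioningBy(n -> n % 2 == 0));
    }

    // Keys words by length; the third argument merges values on duplicate keys.
    static Map<Integer, String> wordsByLength(List<String> words) {
        return words.stream()
                .collect(Collectors.toMap(String::length, w -> w, (a, b) -> a + "," + b));
    }

    // Builds the list via supplier, accumulator and combiner; each call
    // creates its own container, so no shared state is mutated.
    static List<String> toListByHand(Stream<String> stream) {
        return stream.collect(
                () -> new ArrayList<String>(),
                (list, item) -> list.add(item),
                (left, right) -> left.addAll(right));
    }

    public static void main(String[] args) {
        System.out.println(partitionEven(Arrays.asList(1, 2, 3, 4)));  // {false=[1, 3], true=[2, 4]}
        System.out.println(wordsByLength(Arrays.asList("a", "bb", "c")).get(1)); // a,c
        System.out.println(toListByHand(Stream.of("1", "2", "3")));    // [1, 2, 3]
    }
}
```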

Debugging

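A pipeline such as list.map.filter.map.xx is a chain of calls; the final collect(xx) call returns the result.
Operations are either lazily or eagerly evaluated.
Telling the two apart is simple: look at the return type. If an operation returns a Stream, it is lazy; if it returns some other value or nothing (void), it is eager. The ideal way to use these operations is to build a chain of lazy operations and finish with a single eager operation that produces the result you want.
peek lets you inspect each value while the stream keeps flowing.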

private static void peekTest() {
        List<PersonModel> data = Data.getData();

        // peek prints each name as it passes through the pipeline
        data.stream().map(per->per.getName()).peek(p->{
            System.out.println(p);
        }).collect(toList());
}
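
Lazy evaluation can be observed directly: without an eager (terminal) operation, peek never runs. A minimal sketch; the class name LazyDemo and its helper names are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyDemo {
    // Builds a pipeline but never runs a terminal operation;
    // returns how many elements peek has seen.
    static int seenWithoutTerminal() {
        List<String> seen = new ArrayList<>();
        Stream<String> pipeline = Stream.of("a", "b", "c")
                .peek(seen::add)
                .map(String::toUpperCase);
        return seen.size(); // nothing has been evaluated yet
    }

    // Same pipeline, finished with collect; returns what peek has seen.
    static List<String> seenWithTerminal() {
        List<String> seen = new ArrayList<>();
        Stream.of("a", "b", "c")
                .peek(seen::add)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(seenWithoutTerminal()); // 0
        System.out.println(seenWithTerminal());    // [a, b, c]
    }
}
```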


Posted in Java

Tags: stream